ai-se/Semi-Supervised

Semi-Supervised

Figure: Taxonomy of semi-supervised learning, from van Engelen and Hoos [1].

Self-Training: (Widely used in SE)

  1. Different supervised algorithms in a semi-supervised setting
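As a rough illustration of the self-training loop, here is a minimal sketch using only the Python standard library. It assumes a nearest-centroid classifier as a stand-in base learner; any supervised algorithm that reports a confidence could take its place. The function names, the `threshold` parameter, and the toy data are illustrative assumptions, not code from this repository.

```python
import math

def fit_centroids(X, y):
    """Stand-in supervised learner (assumption): one mean feature vector per class."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}

def predict(model, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    weights = {c: math.exp(-math.dist(x, mu)) for c, mu in model.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())

def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Repeatedly fit, pseudo-label the confident pool items, and refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        model = fit_centroids(X_lab, y_lab)
        scored = [(x, *predict(model, x)) for x in pool]
        confident = [(x, label) for x, label, conf in scored if conf >= threshold]
        if not confident:
            break  # nothing left that the model is sure about
        for x, label in confident:
            X_lab.append(x)
            y_lab.append(label)
        # Low-confidence items stay unlabeled for the next round.
        pool = [x for x, _, conf in scored if conf < threshold]
    return fit_centroids(X_lab, y_lab)

# Toy data (hypothetical): two labeled points, five unlabeled ones.
model = self_train(
    X_lab=[(0.0, 0.0), (4.0, 4.0)], y_lab=[0, 1],
    X_unlab=[(0.5, 0.0), (1.0, 0.5), (3.5, 4.0), (3.0, 3.5), (1.8, 1.8)],
)
print(predict(model, (0.2, 0.1))[0], predict(model, (3.9, 4.2))[0])
```

Swapping the base learner only requires replacing `fit_centroids` and `predict`; the loop itself does not depend on the model.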

Co-Training: (Widely used in SE)

  1. Multi-view co-training: Different combinations of supervised algorithms
  2. Single-view co-training: Different combinations of supervised algorithms
  3. Co-Forest
  4. Effort-aware tri-training
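The multi-view variant can be sketched as follows: one learner per feature view, each adding its most confident pseudo-labels to a shared training set. This is a simplified sketch under stated assumptions (nearest-centroid learners, a distance-margin confidence score, hypothetical names and toy data); it is not the Co-Forest or tri-training procedure.

```python
import math

def fit_centroids(X, y):
    """Stand-in supervised learner (assumption): one mean vector per class."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}

def predict(model, x):
    """Return (label, margin); margin is the gap to the runner-up centroid."""
    dists = {c: math.dist(x, mu) for c, mu in model.items()}
    label = min(dists, key=dists.get)
    others = [d for c, d in dists.items() if c != label]
    margin = min(others) - dists[label] if others else 0.0
    return label, margin

def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: [((view_a, view_b), label)]; unlabeled: [(view_a, view_b)]."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                return labeled
            model = fit_centroids([x[view] for x, _ in labeled],
                                  [y for _, y in labeled])
            # This view's learner picks the pool items it is most sure about
            # and adds them, pseudo-labeled, to the shared training set.
            ranked = sorted(pool, key=lambda x: predict(model, x[view])[1],
                            reverse=True)
            for x in ranked[:per_round]:
                labeled.append((x, predict(model, x[view])[0]))
                pool.remove(x)
    return labeled

# Toy data (hypothetical): each sample has two one-dimensional views.
seed = [(((0.0,), (0.0,)), 0), (((5.0,), (5.0,)), 1)]
pool = [((0.5,), (2.4,)), ((2.6,), (4.8,)), ((1.0,), (0.2,)), ((4.6,), (4.0,))]
result = dict(co_train(seed, pool))
```

In this toy data the sample `((0.5,), (2.4,))` sits near the class boundary in the second view but is clear in the first, which is the situation co-training is meant to exploit.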

Boosting:

  1. SemiBoost

Feature extraction:

  1. Principal component analysis (PCA is unsupervised, so it may not strictly count as semi-supervised)
  2. FTcF.MDS

Cluster-then-label:

  1. Cluster the data with a clustering algorithm (e.g. EM), use the available labels to assign a label to each cluster, then predict.
  2. Semi-supervised GMM
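The cluster-then-label idea can be sketched as below. Note the swap: this sketch uses plain k-means (a hard-assignment relative of EM) rather than EM on a mixture model, to stay within the standard library. Function names, the initialisation from the first `k` points, and the toy data are assumptions for illustration.

```python
import math
from collections import Counter

def nearest(centers, x):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))

def kmeans(X, k, iters=20):
    """Plain k-means (Lloyd's algorithm), initialised from the first k points."""
    centers = [list(x) for x in X[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in X:
            clusters[nearest(centers, x)].append(x)
        new_centers = [
            [sum(col) / len(pts) for col in zip(*pts)] if pts else centers[i]
            for i, pts in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments have stabilised
        centers = new_centers
    return centers

def cluster_then_label(X_lab, y_lab, X_unlab, k):
    """Cluster all points, then name each cluster by majority vote of its labels."""
    centers = kmeans(list(X_lab) + list(X_unlab), k)
    votes = [Counter() for _ in range(k)]
    for x, y in zip(X_lab, y_lab):
        votes[nearest(centers, x)][y] += 1
    # A cluster that contains no labeled point gets None.
    cluster_label = [v.most_common(1)[0][0] if v else None for v in votes]
    return lambda x: cluster_label[nearest(centers, x)]

# Toy data (hypothetical): one labeled point per group, four unlabeled points.
classify = cluster_then_label(
    X_lab=[(0.0, 0.0), (5.0, 5.0)], y_lab=[0, 1],
    X_unlab=[(0.2, 0.1), (0.1, 0.3), (5.1, 4.9), (4.8, 5.2)], k=2,
)
print(classify((0.3, 0.3)), classify((4.5, 4.5)))
```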

Pre-training:

  1. DNN - Not using it

Maximum-margin methods:

  1. S3VM

Perturbation-based methods:

  1. DNN - Not using it

Manifolds:

  1. LapSVMp

Generative Models:

  1. DNN - Not using it

Graph Based:

  1. LabelPropagation
  2. LabelSpreading
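To show what graph-based propagation does, here is a minimal standard-library sketch of iterative label propagation over an RBF similarity graph with the known labels clamped. It is an illustrative reimplementation under assumptions (the `gamma` kernel width, a fixed iteration count, and the toy data are hypothetical); it does not call any library implementation and may differ from it in details.

```python
import math

def label_propagation(X, y, gamma=1.0, iters=100):
    """X: list of points; y: labels, with None marking unlabeled points.
    Returns a predicted label for every point."""
    classes = sorted({c for c in y if c is not None})
    n, m = len(X), len(classes)
    # RBF affinities, row-normalised into a transition matrix.
    W = [[math.exp(-gamma * math.dist(a, b) ** 2) for b in X] for a in X]
    T = [[w / sum(row) for w in row] for row in W]

    def clamp(F):
        """Reset the rows of labeled points to their known one-hot labels."""
        for i, c in enumerate(y):
            if c is not None:
                F[i] = [1.0 if cls == c else 0.0 for cls in classes]
        return F

    # Unlabeled points start with a uniform class distribution.
    F = clamp([[1.0 / m] * m for _ in range(n)])
    for _ in range(iters):
        # Each point takes the weighted average of its neighbours' scores.
        F = clamp([[sum(T[i][j] * F[j][k] for j in range(n)) for k in range(m)]
                   for i in range(n)])
    return [classes[max(range(m), key=row.__getitem__)] for row in F]

# Toy data (hypothetical): two groups on a line, one labeled point in each.
X = [(0.0,), (0.5,), (1.0,), (4.0,), (4.5,), (5.0,)]
y = [0, None, None, None, None, 1]
predicted = label_propagation(X, y)
print(predicted)
```

Labels spread from each labeled point to nearby unlabeled ones because nearby points have much larger affinities than distant ones.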

Data Location:

Part of the data is in the data folder; the remaining data is on Zenodo at https://zenodo.org/records/10022678.

[1] Jesper E. van Engelen and Holger H. Hoos. A survey on semi-supervised learning. Machine Learning, 109(2):373–440, 2020.