Add Evidence Accumulation Clustering #134

Open
thomasjpfan opened this issue Nov 23, 2021 · 0 comments

Issue to keep track of scikit-learn/scikit-learn#1830:

Evidence accumulation clustering (EAC), an ensemble-based clustering framework:
Fred, Ana L. N., and Anil K. Jain. "Data clustering using evidence accumulation." Proceedings of the 16th International Conference on Pattern Recognition, Vol. 4. IEEE, 2002.

Basic overview of algorithm:

  1. Cluster the data many times using a clustering algorithm with randomly (within reason) selected parameters.
  2. Create a co-association matrix, which records the number of times each pair of instances was clustered together.
  3. Cluster this matrix.
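
Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `co_association_matrix` and its parameters `n_runs` and `k_range` are hypothetical, and the counts are divided by the number of runs so that entries fall in [0, 1].

```python
import numpy as np
from sklearn.cluster import KMeans


def co_association_matrix(X, n_runs=20, k_range=(10, 30), random_state=0):
    """Fraction of runs in which each pair of samples shared a cluster.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = np.random.RandomState(random_state)
    n_samples = X.shape[0]
    coassoc = np.zeros((n_samples, n_samples))
    for _ in range(n_runs):
        # Step 1: one base clustering with a randomly drawn k (inclusive range).
        k = rng.randint(k_range[0], k_range[1] + 1)
        labels = KMeans(
            n_clusters=k, n_init=1, random_state=rng.randint(2**31 - 1)
        ).fit_predict(X)
        # Step 2: add 1 for every pair that landed in the same cluster.
        coassoc += labels[:, None] == labels[None, :]
    # Normalize counts to co-occurrence frequencies.
    return coassoc / n_runs
```

The result is symmetric with ones on the diagonal, so it can be treated as a similarity matrix (or as a distance after taking `1 - coassoc`) by whatever algorithm handles step 3.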

This seems to work really well. Much like a kernel method, it makes the clustering "easier" than it was on the original dataset.

The defaults of the algorithm are set up to follow those used by Fred and Jain (2002), whereby the clustering in step 1 is k-means with k selected randomly between 10 and 30. The clustering in step 3 is the MST algorithm, which I have yet to implement (will do in this PR).
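
One possible shape for the MST step, sketched with SciPy's graph routines, is shown below. It is an assumption about how the final clustering might look, not the implementation planned for the PR: the function name `mst_cut_labels` and the `threshold` parameter (with its default of 0.5) are hypothetical. The idea is to build a minimum spanning tree over the distances `1 - coassoc`, remove the edges linking weakly associated pairs, and read the clusters off the remaining connected components.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree


def mst_cut_labels(coassoc, threshold=0.5):
    """Cluster a co-association matrix by cutting weak MST edges.

    Hypothetical helper; `threshold` is an assumed parameter giving the
    minimum co-association an MST edge needs in order to be kept.
    """
    # Convert similarity to distance; always co-clustered pairs get 0.
    distance = 1.0 - np.asarray(coassoc, dtype=float)
    # SciPy treats zero entries of a dense graph as "no edge", so shift the
    # distances by a tiny epsilon to keep zero-distance pairs connected.
    eps = 1e-9
    distance = distance + eps
    np.fill_diagonal(distance, 0.0)  # no self-loops
    mst = minimum_spanning_tree(distance).toarray()
    # Cut MST edges whose co-association falls below the threshold.
    mst[mst > (1.0 - threshold) + eps] = 0.0
    # The surviving tree fragments are the final clusters.
    _, labels = connected_components(mst, directed=False)
    return labels
```

Cutting MST edges at a fixed distance gives the same partition as single-linkage clustering cut at that distance, so a hierarchical clustering routine would be an alternative way to run this step.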
