Add Evidence Accumulation Clustering #134

thomasjpfan · 2021-11-23T17:25:44Z

Issue to keep track of scikit-learn/scikit-learn#1830:

Evidence accumulation clustering (EAC) is an ensemble-based clustering framework:
Fred, Ana L. N., and Anil K. Jain. "Data clustering using evidence
accumulation." Proceedings of the 16th International Conference on
Pattern Recognition, Vol. 4. IEEE, 2002.

Basic overview of the algorithm:

1. Cluster the data many times using a clustering algorithm with randomly (within reason) selected parameters.
2. Create a co-association matrix, which records the number of times each pair of instances was clustered together.
3. Cluster this matrix.
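As a rough illustration of the co-association step, here is a minimal pure-Python sketch. The function name `co_association` and the toy label lists are made up for this example and are not taken from the PR; the diagonal is left at zero since only distinct pairs are counted.

```python
from itertools import combinations


def co_association(labelings):
    """Count how often each pair of samples shares a cluster.

    `labelings` is a list of label sequences, one per clustering run;
    every sequence holds one label per sample.
    """
    n_samples = len(labelings[0])
    counts = [[0] * n_samples for _ in range(n_samples)]
    for labels in labelings:
        for i, j in combinations(range(n_samples), 2):
            if labels[i] == labels[j]:
                # Keep the matrix symmetric.
                counts[i][j] += 1
                counts[j][i] += 1
    return counts


# Three hypothetical clustering runs over four samples.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
matrix = co_association(runs)
```

Label values only need to be consistent within a run: the first two runs use swapped labels but describe the same partition, so they contribute identical counts.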

This seems to work really well. Much like a kernel method, it makes the clustering "easier" than it was for the original dataset.

The defaults of the algorithm are set up to follow those used by Fred and Jain (2002): the clustering in step 1 is k-means with k selected randomly between 10 and 30. The clustering in step 3 is the MST (minimum spanning tree) algorithm, which I have yet to implement (will do in this PR).
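The three steps with these defaults could be sketched roughly as follows. This is not the code from the PR: the function name `eac_labels`, the `n_runs` and `threshold` parameters, and the toy data are assumptions for illustration. For step 3 it uses SciPy's single-linkage clustering cut at a distance threshold as a stand-in for the MST step, since single linkage can be derived from the minimum spanning tree.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def eac_labels(X, n_runs=30, k_range=(10, 30), threshold=0.5, seed=0):
    """Sketch of evidence accumulation clustering (names are hypothetical)."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    coassoc = np.zeros((n_samples, n_samples))

    # Step 1: run k-means many times with k drawn at random from k_range.
    for _ in range(n_runs):
        k = int(rng.integers(k_range[0], k_range[1] + 1))
        run_seed = int(rng.integers(2**31 - 1))
        labels = KMeans(n_clusters=k, n_init=1, random_state=run_seed).fit_predict(X)
        # Step 2: accumulate evidence for every pair that shares a cluster.
        coassoc += labels[:, None] == labels[None, :]

    # Turn the co-association fraction into a distance in [0, 1].
    distance = 1.0 - coassoc / n_runs
    np.fill_diagonal(distance, 0.0)

    # Step 3: single-linkage clustering on the co-association distances
    # (assumed stand-in for the MST step), cut at `threshold`.
    tree = linkage(squareform(distance, checks=False), method="single")
    return fcluster(tree, t=threshold, criterion="distance")


# Toy data: three well-separated blobs.
X, _ = make_blobs(
    n_samples=150,
    centers=[(0, 0), (20, 20), (-20, 20)],
    cluster_std=1.0,
    random_state=0,
)
labels = eac_labels(X)
```

With `threshold=0.5`, two samples end up in the same final cluster when they are linked by a chain of pairs that were each clustered together in at least half of the runs. The number of final clusters is not fixed in advance; it falls out of the threshold.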

thomasjpfan mentioned this issue Nov 23, 2021

MRG: Evidence Accumulation Clustering scikit-learn/scikit-learn#1830

Closed

6 tasks
