Expose Distance Information in KMedoids #168
Open
Description
This pull request enhances the KMedoids implementation by exposing the distances of data points to their respective medoids. Previously, this information was internally computed during the clustering process but not exposed to users. The addition of the distances_ attribute allows users to access these distances without the need for additional pairwise distance calculations, which can be computationally expensive.
Changes Made
Addition of distances_ Attribute: A new attribute, distances_, has been introduced to store the distances of each data point to its assigned medoid.
Modification of fit Method:
The distances are now computed using the existing transform method and stored in the distances_ attribute.
The self.inertia_ attribute is updated to use the distances directly, avoiding redundant pairwise distance calculations.
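The fit changes above can be illustrated with a small plain-Python sketch. This is not the code from the pull request. It assumes Euclidean distance, takes the medoids as already chosen, and assumes the stored distances form one row per data point with one entry per medoid. The names fit_distances, points and medoids are hypothetical and exist only for this illustration.

```python
import math


def fit_distances(points, medoids):
    """Compute distances once, then derive labels and inertia from them.

    Hypothetical helper: it stands in for the part of fit that the
    description says now stores distances_ and reuses it for inertia_.
    """
    # distances[i][j] is the Euclidean distance from point i to medoid j
    # (assumed layout, one row per point and one column per medoid).
    distances = [[math.dist(p, m) for m in medoids] for p in points]
    # Each point is assigned to the medoid with the smallest distance.
    labels = [row.index(min(row)) for row in distances]
    # Inertia is the sum of each point's distance to its closest medoid,
    # read directly from the stored rows without a second distance pass.
    inertia = sum(min(row) for row in distances)
    return distances, labels, inertia


# Toy input chosen so that every distance is a whole number.
points = [(0.0, 0.0), (0.0, 3.0), (4.0, 0.0), (4.0, 3.0)]
medoids = [(0.0, 0.0), (4.0, 3.0)]
distances, labels, inertia = fit_distances(points, medoids)
```

The point of the sketch is that labels and inertia both come from the same stored rows, so nothing has to be recomputed after the first distance pass.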
Motivation
The motivation behind this enhancement is to provide users with direct access to the distances between data points and their respective medoids. This information can be valuable for users who wish to perform additional statistical analyses, such as identifying the closest data points to medoids, without incurring the cost of recomputing pairwise distances.
Example Usage
Users can now access the distances using the distances_ attribute after fitting the model:
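As a hedged illustration of what can be done with such distances, the sketch below works on a hand-written toy matrix rather than on a fitted model, because the exact shape of distances_ is not stated here. It assumes one row per data point and one column per medoid. The function closest_samples and the variable toy_distances are hypothetical names used only for this example.

```python
def closest_samples(distances, n_closest=2):
    """Return, for each medoid, the indices of its n_closest data points.

    Assumes distances[i][j] is the distance from point i to medoid j.
    """
    n_medoids = len(distances[0])
    result = []
    for j in range(n_medoids):
        # Sort point indices by their distance to medoid j.
        order = sorted(range(len(distances)), key=lambda i: distances[i][j])
        result.append(order[:n_closest])
    return result


# Toy stand-in for a stored distance matrix: 4 points, 2 medoids.
toy_distances = [
    [0.0, 5.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 0.0],
]
nearest = closest_samples(toy_distances)
```

With a fitted model, the toy matrix would be replaced by the model's distances_ attribute, provided it has the layout assumed here.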
Because the distances are stored at fit time, follow-up analyses such as finding the points closest to each medoid need no further pairwise distance computation.