Cluster points
Clusters objects (based on centroid locations) using K-Means++ and/or DBSCAN algorithms.
Description
Clusters objects (based on centroid locations) using K-Means++ and/or DBSCAN algorithms. In K-Means++ [1], an optimisation of the standard K-Means algorithm, the points are assigned to a pre-determined number of clusters such that each point is assigned to its closest cluster mean position (this process is repeated until the cluster assignments stabilise or a maximum number of iterations is reached). For DBSCAN [2], points are clustered based on a minimum number of neighbours within a specified spatial range. As such, this algorithm doesn't require prior knowledge of the number of clusters. Both algorithms use their respective Apache Commons Math implementations.
References:
[1] Arthur, D.; Vassilvitskii, S. (2007). "k-means++: the advantages of careful seeding." Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics Philadelphia, PA, USA. pp. 1027–1035
[2] Ester, M.; Kriegel, H.-P.; Sander, J.; Xu, X. (1996). "A density-based algorithm for discovering clusters in large spatial databases with noise." Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press. pp. 226–231
Parameters
Parameter | Description |
---|---|
Input image | |
Output objects | |
Binary logic | |
Clustering algorithm | The clustering algorithm to use:
|
Number of clusters | If "Clustering algorithm" is set to "KMeans++", this is the number of clusters the points will be assigned to. |
Maximum number of iterations | If "Clustering algorithm" is set to "KMeans++", this is the maximum number of optimisation iterations that will be performed. If cluster assignment has stabilised prior to reaching this number of iterations the algorithm will terminate early. |
Neighbourhood for clustering (epsilon) | If "Clustering algorithm" is set to "DBSCAN", this is the minimum distance to neighbour points that must be satisfied for a point to be added to a cluster. This distance is specified in pixel units. |
Minimum number of points per cluster | If "Clustering algorithm" is set to "DBSCAN", this is the minimum number of neighbour points which must be within a specified distance ("Neighbourhood for clustering (epsilon)") of a point for that point to be included in the cluster. |