fowlkes_mallows_score
- sklearn.metrics.fowlkes_mallows_score(labels_true, labels_pred, *, sparse=False)
Measure the similarity of two clusterings of a set of points.
Added in version 0.18.
The Fowlkes-Mallows index (FMI) is defined as the geometric mean of the precision and recall:
FMI = TP / sqrt((TP + FP) * (TP + FN))
where TP is the number of True Positives (i.e. the number of pairs of points that belong to the same cluster in both labels_true and labels_pred), FP is the number of False Positives (i.e. the number of pairs of points that belong to the same cluster in labels_pred but not in labels_true), and FN is the number of False Negatives (i.e. the number of pairs of points that belong to the same cluster in labels_true but not in labels_pred).
The score ranges from 0 to 1. A high value indicates a good similarity between the two clusterings.
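The pair-counting definition above can be checked by brute force. The following is a minimal sketch, not scikit-learn's implementation: the helper name pairwise_fmi is hypothetical, and it enumerates every pair of points, so it is only practical for small inputs.

```python
from itertools import combinations
from math import sqrt


def pairwise_fmi(labels_true, labels_pred):
    """Compute the FMI by classifying every pair of points (O(n^2))."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only in labels_pred
        elif same_true:
            fn += 1  # grouped together only in labels_true
    if tp == 0:
        # No shared pair: the score is 0 (also avoids division by zero).
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# One shared pair, two false positives, one false negative: 1 / sqrt(3 * 2)
print(pairwise_fmi([0, 0, 1, 1], [0, 0, 0, 1]))
```

Because only pair co-membership is compared, renaming the cluster ids (for example swapping 0 and 1 in labels_pred) leaves the score unchanged.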
Read more in the User Guide.
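- Parameters:
- labels_true : array-like of shape (n_samples,), dtype=int
A clustering of the data into disjoint subsets.
- labels_pred : array-like of shape (n_samples,), dtype=int
A clustering of the data into disjoint subsets.
- sparse : bool, default=False
Compute the contingency matrix internally with a sparse matrix.
- Returns:
- score : float
The resulting Fowlkes-Mallows score.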
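The contingency matrix mentioned for the sparse parameter counts how many samples fall in each (true class, predicted cluster) cell. The pair counts can be derived from those cell counts without enumerating pairs. The sketch below illustrates that idea with plain Python counters; the function name contingency_fmi is hypothetical, and this is an illustration under that assumption rather than the library's actual code.

```python
from collections import Counter
from math import sqrt


def contingency_fmi(labels_true, labels_pred):
    """Compute the FMI from contingency counts instead of explicit pairs."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))  # contingency cells n_ij
    true_sizes = Counter(labels_true)               # row sums (class sizes)
    pred_sizes = Counter(labels_pred)               # column sums (cluster sizes)

    # sum(c^2) - n == sum(c * (c - 1)), i.e. twice the number of pairs.
    tk = sum(c * c for c in cells.values()) - n       # 2 * TP
    pk = sum(c * c for c in pred_sizes.values()) - n  # 2 * (TP + FP)
    qk = sum(c * c for c in true_sizes.values()) - n  # 2 * (TP + FN)

    if tk == 0:
        return 0.0
    # The factors of 2 cancel: TP / sqrt((TP + FP) * (TP + FN)).
    return sqrt(tk / pk) * sqrt(tk / qk)


print(contingency_fmi([0, 0, 1, 1], [0, 0, 0, 1]))
```

This form needs only the cell counts, so its cost depends on the number of non-empty cells rather than on the number of pairs, which is why a sparse representation can help when there are many clusters.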
References
[1] E. B. Fowlkes and C. L. Mallows, 1983. "A method for comparing two hierarchical clusterings". Journal of the American Statistical Association. <https://www.tandfonline.com/doi/abs/10.1080/01621459.1983.10478008>
[2] Wikipedia entry for the Fowlkes-Mallows Index <https://en.wikipedia.org/wiki/Fowlkes-Mallows_index>
Examples
Perfect labelings are both homogeneous and complete, hence have a score of 1.0::
>>> from sklearn.metrics.cluster import fowlkes_mallows_score
>>> fowlkes_mallows_score([0, 0, 1, 1], [0, 0, 1, 1])
np.float64(1.0)
>>> fowlkes_mallows_score([0, 0, 1, 1], [1, 1, 0, 0])
np.float64(1.0)
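If class members are completely split across different clusters, no pair of points is grouped together in both labelings, hence the FMI is 0::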
>>> fowlkes_mallows_score([0, 0, 0, 0], [0, 1, 2, 3])
0.0