make_biclusters

sklearn.datasets.make_biclusters(shape, n_clusters, *, noise=0.0, minval=10, maxval=100, shuffle=True, random_state=None)

Generate a constant block diagonal structure array for biclustering.

Read more in the User Guide.

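Parameters:

shape (tuple of shape (n_rows, n_cols)): The shape of the result.
n_clusters (int): The number of biclusters.
noise (float, default=0.0): The standard deviation of the Gaussian noise.
minval (float, default=10): Minimum value of a bicluster.
maxval (float, default=100): Maximum value of a bicluster.
shuffle (bool, default=True): Shuffle the samples.
random_state (int, RandomState instance or None, default=None): Determines random number generation for dataset creation. Pass an int for reproducible output across multiple function calls. See Glossary.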
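
The effect of noise, minval, maxval, and shuffle can be seen on a small array. The following is a minimal sketch, not part of the reference itself. It assumes that with noise=0.0 the cells outside every bicluster are exactly zero, which is how the implementation appears to behave but is not stated in the parameter list. The names in_block, outside_is_zero, and inside_in_range are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_biclusters

# noise=0.0 keeps the values exact; shuffle=False keeps each block contiguous.
X, rows, cols = make_biclusters(
    shape=(6, 8),
    n_clusters=2,
    noise=0.0,
    minval=10,
    maxval=100,
    shuffle=False,
    random_state=0,
)

# Mark every cell that belongs to some bicluster: the outer product of a
# row indicator and a column indicator selects one block.
in_block = np.zeros(X.shape, dtype=bool)
for row_mask, col_mask in zip(rows, cols):
    in_block |= np.outer(row_mask, col_mask)

# Assumption being checked: background cells are zero, block cells lie
# between minval and maxval.
outside_is_zero = bool(np.all(X[~in_block] == 0))
inside_in_range = bool(np.all((X[in_block] >= 10) & (X[in_block] <= 100)))

print(np.round(X, 1))
print(outside_is_zero, inside_in_range)
```

Setting noise to a positive value adds Gaussian noise to every cell, so neither check would be expected to hold exactly in that case.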

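Returns:

X (ndarray of shape `shape`): The generated array.
rows (ndarray of shape (n_clusters, X.shape[0])): The indicators for cluster membership of each row.
cols (ndarray of shape (n_clusters, X.shape[1])): The indicators for cluster membership of each column.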
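
The indicator arrays can be used as boolean masks to pull each bicluster out of X. The sketch below is illustrative rather than authoritative: it assumes rows and cols are boolean arrays and that every row and column is assigned to exactly one bicluster, which matches the observed behavior but is not guaranteed by the description above. The names blocks, rows_partition, and cols_partition are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_biclusters

X, rows, cols = make_biclusters(shape=(10, 20), n_clusters=2, random_state=42)

# rows[i] and cols[i] are masks over the rows and columns of X;
# np.ix_ combines them to select the submatrix of bicluster i.
blocks = []
for i in range(rows.shape[0]):
    block = X[np.ix_(rows[i], cols[i])]
    blocks.append(block)
    print(i, block.shape)

# Assumption being checked: each row and each column belongs to exactly
# one bicluster, so the indicators sum to 1 along the cluster axis.
rows_partition = bool(np.all(rows.sum(axis=0) == 1))
cols_partition = bool(np.all(cols.sum(axis=0) == 1))
print(rows_partition, cols_partition)
```

Because noise defaults to 0.0, each extracted block here should hold a single constant value.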

See also

make_checkerboard: Generate an array with block checkerboard structure for biclustering.

References

[1]

Dhillon, I. S. (2001, August). Co-clustering documents and words using bipartite spectral graph partitioning. In Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 269-274). ACM.

Examples

>>> from sklearn.datasets import make_biclusters
>>> data, rows, cols = make_biclusters(
...     shape=(10, 20), n_clusters=2, random_state=42
... )
>>> data.shape
(10, 20)
>>> rows.shape
(2, 10)
>>> cols.shape
(2, 20)
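
A typical use of the returned indicators is as ground truth when evaluating a biclustering algorithm. The following sketch assumes scikit-learn's SpectralCoclustering estimator and consensus_score metric as the algorithm and the score. The shape, noise level, and seeds are arbitrary choices made for illustration, not values taken from this page.

```python
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters
from sklearn.metrics import consensus_score

# Generate shuffled, noisy data together with the true bicluster indicators.
X, rows, cols = make_biclusters(
    shape=(300, 300), n_clusters=5, noise=5, shuffle=True, random_state=0
)

# Fit a co-clustering model with the same number of clusters.
model = SpectralCoclustering(n_clusters=5, random_state=0)
model.fit(X)

# Compare the recovered biclusters with the generated ones.
# The score lies in [0, 1]; 1 means a perfect match.
score = consensus_score(model.biclusters_, (rows, cols))
print(f"consensus score: {score:.3f}")
```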