Label Propagation circles: Learning a complex structure

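Example of label propagation (here via LabelSpreading) learning a complex internal structure to demonstrate "manifold learning". The outer circle should be labeled "red" and the inner circle "blue" (drawn in navy and cyan in the plots below). Because each label group lies inside its own distinct shape, we can see that the labels propagate correctly around each circle.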

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

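We generate a dataset with two concentric circles. In addition, each sample of the dataset is associated with a label: 0 (belonging to the outer circle), 1 (belonging to the inner circle), or -1 (unknown). Here, all labels but two are marked as unknown.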

import numpy as np

from sklearn.datasets import make_circles

n_samples = 200
X, y = make_circles(n_samples=n_samples, shuffle=False)
outer, inner = 0, 1
labels = np.full(n_samples, -1.0)
labels[0] = outer
labels[-1] = inner
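As a quick sanity check on the setup above, the following sketch rebuilds the same dataset and inspects it. It relies on `make_circles` with `shuffle=False` placing the outer-circle samples first and the inner-circle samples last, and on the default `factor=0.8` for the inner radius; the variable names `radii` and `n_unlabeled` are introduced here purely for illustration.

```python
import numpy as np
from sklearn.datasets import make_circles

n_samples = 200
X, y = make_circles(n_samples=n_samples, shuffle=False)

# Mark every sample as unknown (-1), then reveal one label per circle.
labels = np.full(n_samples, -1.0)
labels[0] = 0   # first sample lies on the outer circle
labels[-1] = 1  # last sample lies on the inner circle

# Distance of each sample from the origin (no noise is added by default).
radii = np.linalg.norm(X, axis=1)
n_unlabeled = int(np.sum(labels == -1))

print("unlabeled samples:", n_unlabeled)
print("true class of first / last sample:", y[0], y[-1])
print("radius of first / last sample:", radii[0], radii[-1])
```

The true labels `y` are not passed to the estimator; they are only useful for checking the result afterwards.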

Plot raw data

import matplotlib.pyplot as plt

plt.figure(figsize=(4, 4))
plt.scatter(
    X[labels == outer, 0],
    X[labels == outer, 1],
    color="navy",
    marker="s",
    lw=0,
    label="outer labeled",
    s=10,
)
plt.scatter(
    X[labels == inner, 0],
    X[labels == inner, 1],
    color="c",
    marker="s",
    lw=0,
    label="inner labeled",
    s=10,
)
plt.scatter(
    X[labels == -1, 0],
    X[labels == -1, 1],
    color="darkorange",
    marker=".",
    label="unlabeled",
)
plt.legend(scatterpoints=1, shadow=False, loc="center")
_ = plt.title("Raw data (2 classes=outer and inner)")
[Figure: Raw data (2 classes=outer and inner)]

The aim of LabelSpreading is to associate a label with each sample whose label is initially unknown.

from sklearn.semi_supervised import LabelSpreading

label_spread = LabelSpreading(kernel="knn", alpha=0.8)
label_spread.fit(X, labels)
LabelSpreading(alpha=0.8, kernel='knn')
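Because `make_circles` also returns the true class of every sample, the fitted model can be scored on the samples whose labels were hidden. The sketch below is one possible check, not part of the original example; `y_true`, `model`, `unlabeled`, and `accuracy` are names chosen here for illustration.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.semi_supervised import LabelSpreading

n_samples = 200
X, y_true = make_circles(n_samples=n_samples, shuffle=False)

# Same labeling scheme as above: only the first and last samples are known.
labels = np.full(n_samples, -1.0)
labels[0], labels[-1] = 0, 1

model = LabelSpreading(kernel="knn", alpha=0.8).fit(X, labels)

# Compare the transduced labels with the ground truth on unlabeled samples only.
unlabeled = labels == -1
accuracy = np.mean(model.transduction_[unlabeled] == y_true[unlabeled])
print(f"accuracy on unlabeled samples: {accuracy:.3f}")

# One row per sample, one column per class.
print("label_distributions_ shape:", model.label_distributions_.shape)
```

`transduction_` holds the hard label assigned to each sample, while `label_distributions_` holds the per-class scores those labels are derived from.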


Now we can check which label has been assigned to each sample whose label was initially unknown.

output_labels = label_spread.transduction_
output_label_array = np.asarray(output_labels)
outer_numbers = (output_label_array == outer).nonzero()[0]
inner_numbers = (output_label_array == inner).nonzero()[0]

plt.figure(figsize=(4, 4))
plt.scatter(
    X[outer_numbers, 0],
    X[outer_numbers, 1],
    color="navy",
    marker="s",
    lw=0,
    s=10,
    label="outer learned",
)
plt.scatter(
    X[inner_numbers, 0],
    X[inner_numbers, 1],
    color="c",
    marker="s",
    lw=0,
    s=10,
    label="inner learned",
)
plt.legend(scatterpoints=1, shadow=False, loc="center")
plt.title("Labels learned with Label Spreading (KNN)")
plt.show()
[Figure: Labels learned with Label Spreading (KNN)]
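One way to see why the labels stay on their own circle is to look at a k-nearest-neighbors graph of the data. The sketch below builds such a graph with `kneighbors_graph` and counts its connected components. Using `n_neighbors=7` (the documented default of LabelSpreading) with `include_self=True` is an assumption meant to approximate the graph the estimator builds internally, not a reproduction of it.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from sklearn.datasets import make_circles
from sklearn.neighbors import kneighbors_graph

X, y = make_circles(n_samples=200, shuffle=False)

# Assumption: 7 neighbors counting the point itself, as an approximation
# of the kNN kernel used by LabelSpreading(kernel="knn").
graph = kneighbors_graph(X, n_neighbors=7, mode="connectivity", include_self=True)

# Treat edges as undirected and find groups of mutually reachable samples.
n_components, component_ids = connected_components(graph, directed=False)
print("connected components:", n_components)

# For each component, list the true classes it contains.
for c in range(n_components):
    print(c, np.unique(y[component_ids == c]))
```

If every component contains samples from a single circle, a label can only travel along the circle it started on, which matches the plot above. With noisier data or a larger neighborhood the two circles may become connected, and this argument no longer holds.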

Total running time of the script: (0 minutes 0.131 seconds)
