Confusion matrix
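Example of using a confusion matrix to evaluate the quality of a classifier's output on the iris data set. The diagonal elements give the number of points for which the predicted label equals the true label, while the off-diagonal elements are the points the classifier mislabeled. The higher the diagonal values of the confusion matrix, the better, since they indicate many correct predictions.
The figures show the confusion matrix with and without normalization by class support size (the number of elements in each class). This kind of normalization is useful when classes are imbalanced, because it gives a more visual interpretation of which class is being misclassified.
The results here are not as good as they could be, because our choice of the regularization parameter C was not the best. In real-life applications, this parameter is usually chosen using the approach described in "Tuning the hyper-parameters of an estimator".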
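As a rough illustration of how C could be chosen rather than fixed by hand, the sketch below runs a cross-validated grid search over a handful of candidate values. The grid `[0.01, 0.1, 1, 10]` and the choice of 5 folds are assumptions made for this illustration only; they are not part of the original example.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, train_test_split

# Same data and split as in the example script.
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0
)

# Hypothetical candidate values for C, chosen only for illustration.
param_grid = {"C": [0.01, 0.1, 1, 10]}

# 5-fold cross-validation on the training set for each candidate C.
search = GridSearchCV(svm.SVC(kernel="linear"), param_grid, cv=5)
search.fit(X_train, y_train)

print("Best C:", search.best_params_["C"])
print("Mean cross-validated accuracy:", round(search.best_score_, 3))
print("Test accuracy:", round(search.score(X_test, y_test), 3))
```

The search only ever sees the training split, so the held-out test set remains available for an unbiased confusion matrix afterwards.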
Confusion matrix, without normalization
[[13  0  0]
 [ 0 10  6]
 [ 0  0  9]]
Normalized confusion matrix
[[1.   0.   0.  ]
 [0.   0.62 0.38]
 [0.   0.   1.  ]]
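The normalized matrix in the output above can be reproduced by hand from the raw counts: each row (one true class) is divided by its row sum, which is that class's support. The following is a minimal sketch using only NumPy, with the counts copied from the printed output; the variable names are chosen for illustration.

```python
import numpy as np

# Raw counts copied from the printed output
# (rows: true class, columns: predicted class).
cm = np.array([
    [13, 0, 0],
    [0, 10, 6],
    [0, 0, 9],
])

# Support of each true class is the sum of its row.
support = cm.sum(axis=1, keepdims=True)

# Row-wise normalization: every row now sums to 1, and the
# diagonal holds the fraction of each class predicted correctly.
cm_normalized = cm / support

# Overall accuracy is the diagonal total over all samples.
accuracy = np.trace(cm) / cm.sum()

print(np.round(cm_normalized, 2))
print("Accuracy:", round(accuracy, 3))
```

For the second class, 10 of 16 samples are correct, so its row becomes 0.625 and 0.375, which print as 0.62 and 0.38 at two decimals.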
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, svm
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split
# import some data to play with
iris = datasets.load_iris()
X = iris.data
y = iris.target
class_names = iris.target_names
# Split the data into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# Run classifier, using a model that is too regularized (C too low) to see
# the impact on the results
classifier = svm.SVC(kernel="linear", C=0.01).fit(X_train, y_train)
np.set_printoptions(precision=2)
# Plot non-normalized confusion matrix
titles_options = [
    ("Confusion matrix, without normalization", None),
    ("Normalized confusion matrix", "true"),
]
for title, normalize in titles_options:
    disp = ConfusionMatrixDisplay.from_estimator(
        classifier,
        X_test,
        y_test,
        display_labels=class_names,
        cmap=plt.cm.Blues,
        normalize=normalize,
    )
    disp.ax_.set_title(title)

    print(title)
    print(disp.confusion_matrix)

plt.show()
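When only the numbers are needed and no plot, the matrix can also be computed directly with `sklearn.metrics.confusion_matrix`. The sketch below repeats the setup from the script so that it runs on its own; the variable names `cm_raw` and `cm_norm` are chosen here for illustration.

```python
from sklearn import datasets, svm
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Same data, split, and deliberately over-regularized model as the script.
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0
)
classifier = svm.SVC(kernel="linear", C=0.01).fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Raw counts: rows are true classes, columns are predicted classes.
cm_raw = confusion_matrix(y_test, y_pred)

# normalize="true" divides each row by the number of true samples in that class.
cm_norm = confusion_matrix(y_test, y_pred, normalize="true")

print(cm_raw)
print(cm_norm.round(2))
```

Passing `normalize="pred"` or `normalize="all"` instead normalizes over the predicted-class columns or over the whole matrix, respectively.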
Total running time of the script: (0 minutes 0.122 seconds)