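Examples of Using FrozenEstimator#
This example showcases some use cases of FrozenEstimator.
FrozenEstimator is a utility class that freezes a fitted estimator. This is useful, for instance, when we want to pass a fitted estimator to a meta-estimator such as FixedThresholdClassifier without letting the meta-estimator refit it.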
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
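Setting a decision threshold for a pre-fitted classifier#
Fitted classifiers in scikit-learn use an arbitrary decision threshold to decide which class a given sample belongs to. The decision threshold is either 0.0 on the value returned by decision_function, or 0.5 on the probability returned by predict_proba.
However, one might want to set a custom decision threshold. We can do this by using FixedThresholdClassifier and wrapping the classifier with FrozenEstimator.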
from sklearn.datasets import make_classification
from sklearn.frozen import FrozenEstimator
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import FixedThresholdClassifier, train_test_split
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
classifier = LogisticRegression().fit(X_train, y_train)
print(
    "Probability estimates for three data points:\n"
    f"{classifier.predict_proba(X_test[-3:]).round(3)}"
)
print(
    "Predicted class for the same three data points:\n"
    f"{classifier.predict(X_test[-3:])}"
)
Probability estimates for three data points:
[[0.18 0.82]
[0.29 0.71]
[0. 1. ]]
Predicted class for the same three data points:
[1 1 1]
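As a quick check of the default thresholds described above, the following sketch rebuilds the output of predict by hand, once from predict_proba with a 0.5 cut-off and once from decision_function with a 0.0 cut-off. The variable names (from_proba, from_decision, and so on) are illustrative and not part of the original example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
classifier = LogisticRegression().fit(X_train, y_train)

# Threshold the positive-class probability at 0.5 and map back to labels.
positive_proba = classifier.predict_proba(X_test)[:, 1]
from_proba = classifier.classes_[(positive_proba > 0.5).astype(int)]

# Threshold the decision function at 0.0 and map back to labels.
decision_values = classifier.decision_function(X_test)
from_decision = classifier.classes_[(decision_values > 0.0).astype(int)]

# Both hand-made predictions should agree with classifier.predict.
default_pred = classifier.predict(X_test)
same_as_proba = np.array_equal(from_proba, default_pred)
same_as_decision = np.array_equal(from_decision, default_pred)
print(same_as_proba, same_as_decision)
```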
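Now imagine you want to set a different decision threshold on the probability estimates. We can do this by wrapping the classifier with FrozenEstimator and passing it to FixedThresholdClassifier.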
threshold_classifier = FixedThresholdClassifier(
    estimator=FrozenEstimator(classifier), threshold=0.9
)
Note that calling fit on the FixedThresholdClassifier defined above does not refit the underlying classifier, because the classifier is wrapped in FrozenEstimator.
Now, let's see how the predictions changed with respect to the probability threshold.
print(
    "Probability estimates for three data points with FixedThresholdClassifier:\n"
    f"{threshold_classifier.predict_proba(X_test[-3:]).round(3)}"
)
print(
    "Predicted class for the same three data points with FixedThresholdClassifier:\n"
    f"{threshold_classifier.predict(X_test[-3:])}"
)
Probability estimates for three data points with FixedThresholdClassifier:
[[0.18 0.82]
[0.29 0.71]
[0. 1. ]]
Predicted class for the same three data points with FixedThresholdClassifier:
[0 0 1]
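We see that the probability estimates stay the same, but since a different decision threshold is used, the predicted classes are different.
Please refer to sphx_glr_auto_examples_model_selection_plot_cost_sensitive_learning.py to learn about cost-sensitive learning and decision threshold tuning.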
Calibration of a pre-fitted classifier#
You can use FrozenEstimator to calibrate a pre-fitted classifier using CalibratedClassifierCV.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import brier_score_loss
calibrated_classifier = CalibratedClassifierCV(
    estimator=FrozenEstimator(classifier)
).fit(X_train, y_train)
prob_pos_clf = classifier.predict_proba(X_test)[:, 1]
clf_score = brier_score_loss(y_test, prob_pos_clf)
print(f"No calibration: {clf_score:.3f}")
prob_pos_calibrated = calibrated_classifier.predict_proba(X_test)[:, 1]
calibrated_score = brier_score_loss(y_test, prob_pos_calibrated)
print(f"With calibration: {calibrated_score:.3f}")
No calibration: 0.033
With calibration: 0.032
Total running time of the script: (0 minutes 0.032 seconds)