log_loss

sklearn.metrics.log_loss(y_true, y_pred, *, normalize=True, sample_weight=None, labels=None)

Log loss, aka logistic loss or cross-entropy loss.

This is the loss function used in (multinomial) logistic regression and extensions of it such as neural networks, defined as the negative log-likelihood of a logistic model that returns y_pred probabilities for its training data y_true. The log loss is only defined for two or more labels. For a single sample with true label \(y \in \{0,1\}\) and a probability estimate \(p = \operatorname{Pr}(y = 1)\), the log loss is:

\[L_{\log}(y,p)= -(y \log(p)+(1 - y)\log(1 - p))\]
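As a quick check of the formula, the per-sample loss can be computed by hand and averaged. This is a minimal sketch, assuming NumPy is available; the variable names (`p`, `per_sample`, `manual`) are illustrative and not part of the scikit-learn API.

```python
import numpy as np
from sklearn.metrics import log_loss

# Binary ground-truth labels and the estimated Pr(y = 1) for each sample.
y_true = np.array([1, 0, 0, 1])
p = np.array([0.9, 0.1, 0.2, 0.65])

# Per-sample loss, term by term as in the formula: -(y log p + (1 - y) log(1 - p)).
per_sample = -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# With the default normalize=True, log_loss returns the mean of these values.
manual = per_sample.mean()
from_sklearn = log_loss(y_true, p)

print(manual, from_sklearn)
```

Both values should agree up to floating-point rounding, since the probabilities here are far from 0 and 1 and clipping has no effect.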

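Read more in the User Guide.

Parameters:
y_true : array-like or label indicator matrix

Ground truth (correct) labels for n_samples samples.

y_pred : array-like of float, shape = (n_samples, n_classes) or (n_samples,)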

Predicted probabilities, as returned by a classifier's predict_proba method. If y_pred.shape = (n_samples,) the probabilities provided are assumed to be that of the positive class. The labels in y_pred are assumed to be ordered alphabetically, as done by LabelBinarizer.

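y_pred values are clipped to [eps, 1-eps], where eps is the machine precision for y_pred's dtype.

normalize : bool, default=True

If true, return the mean loss per sample. Otherwise, return the sum of the per-sample losses.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

labels : array-like, default=None

If not provided, labels will be inferred from y_true. If labels is None and y_pred has shape (n_samples,), the labels are assumed to be binary and are inferred from y_true.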

Added in version 0.18.
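The effect of normalize, sample_weight and labels can be illustrated on a small data set. The following is a sketch under stated assumptions: the weights and the single-class `y_true` are made-up illustration data, and `p_true` is a hand-written helper array, not something returned by scikit-learn.

```python
import numpy as np
from sklearn.metrics import log_loss

y_true = ["spam", "ham", "ham", "spam"]
# Columns follow the alphabetical label order: [Pr(ham), Pr(spam)].
y_pred = [[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]]

# normalize=True (default) returns the mean; normalize=False returns the sum.
mean_loss = log_loss(y_true, y_pred)
total_loss = log_loss(y_true, y_pred, normalize=False)

# sample_weight turns the mean into a weighted mean of the per-sample losses.
weights = [1.0, 1.0, 1.0, 3.0]
weighted_loss = log_loss(y_true, y_pred, sample_weight=weights)

# Hand-written check: probability each row assigns to its true class.
p_true = np.array([0.9, 0.9, 0.8, 0.65])
expected_weighted = np.average(-np.log(p_true), weights=weights)

# When y_true happens to contain a single class, the label set cannot be
# inferred from it, so the full set is passed through labels.
one_class_loss = log_loss(
    ["ham", "ham"],
    [[0.9, 0.1], [0.8, 0.2]],
    labels=["ham", "spam"],
)

print(mean_loss, total_loss, weighted_loss, one_class_loss)
```

Here the total is the mean multiplied by the number of samples, and the weighted result leans toward the loss of the last sample, which carries three times the weight of the others.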

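Returns:
loss : float

Log loss, aka logistic loss or cross-entropy loss.

Notes

The logarithm used is the natural logarithm (base-e).

References

C.M. Bishop (2006). Pattern Recognition and Machine Learning. Springer, p. 209.

Examples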

>>> from sklearn.metrics import log_loss
>>> log_loss(["spam", "ham", "ham", "spam"],
...          [[.1, .9], [.9, .1], [.8, .2], [.35, .65]])
0.21616...
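The column ordering of y_pred and the one-dimensional shorthand for the positive class can also be seen with a fitted classifier. This is a minimal sketch: the toy data and the choice of LogisticRegression are assumptions made for illustration, and any estimator exposing predict_proba could be used instead.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# Toy one-feature data set with string labels (illustration only).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array(["ham", "ham", "ham", "spam", "spam", "spam"])

clf = LogisticRegression().fit(X, y)

# predict_proba columns follow clf.classes_, which is sorted alphabetically,
# matching the ordering that log_loss assumes for y_pred.
proba = clf.predict_proba(X)
print(clf.classes_)

# Full (n_samples, n_classes) matrix of probabilities.
loss_2d = log_loss(y, proba)

# Shape (n_samples,): only the positive-class column, here "spam",
# the label that sorts last.
loss_1d = log_loss(y, proba[:, 1])

print(loss_2d, loss_1d)
```

The two calls should give the same loss up to rounding, because the one-dimensional form is interpreted as the probability of the positive class.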