log_loss

sklearn.metrics.log_loss(y_true, y_pred, *, normalize=True, sample_weight=None, labels=None)

Log loss, aka logistic loss or cross-entropy loss.

This is the loss function used in (multinomial) logistic regression and extensions of it such as neural networks, defined as the negative log-likelihood of a logistic model that returns y_pred probabilities for its training data y_true. The log loss is only defined for two or more labels. For a single sample with true label \(y \in \{0,1\}\) and a probability estimate \(p = \operatorname{Pr}(y = 1)\), the log loss is:

\[L_{\log}(y, p) = -(y \log(p) + (1 - y) \log(1 - p))\]
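As a minimal sketch of the single-sample formula above, the snippet below evaluates it directly with the standard library. The helper name `binary_log_loss` is hypothetical and not part of scikit-learn; it only illustrates how the loss grows when a confident prediction turns out to be wrong.

```python
import math


def binary_log_loss(y, p):
    """Per-sample log loss for a true label y in {0, 1} and p = Pr(y = 1).

    Hypothetical helper for illustration; it assumes 0 < p < 1.
    """
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# True label is 1 in both cases; only the predicted probability differs.
confident_right = binary_log_loss(1, 0.9)  # equals -log(0.9), a small loss
confident_wrong = binary_log_loss(1, 0.1)  # equals -log(0.1), a large loss

print(confident_right, confident_wrong)
```

The loss is asymmetric in this sense: assigning probability 0.1 to the true class costs roughly twenty times more than assigning it 0.9.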

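Read more in the User Guide.

Parameters:

y_true : array-like or label indicator matrix

Ground truth (correct) labels for n_samples samples.

y_pred : array-like of float, shape = (n_samples, n_classes) or (n_samples,)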

Predicted probabilities, as returned by a classifier's predict_proba method. If y_pred.shape = (n_samples,) the probabilities provided are assumed to be that of the positive class. The labels in y_pred are assumed to be ordered alphabetically, as done by LabelBinarizer.

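y_pred values are clipped to [eps, 1-eps], where eps is the machine precision for y_pred's dtype.

normalize : bool, default=True

If true, return the mean loss per sample. Otherwise, return the sum of the per-sample losses.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

labels : array-like, default=None

If not provided, labels will be inferred from y_true. If labels is None and y_pred has shape (n_samples,), the labels are assumed to be binary and are inferred from y_true.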

Added in version 0.18.
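The parameters above can be illustrated together in one short sketch. The data reuses the spam/ham values from this page's example; the `weights` array is an illustrative assumption, and the weighted-average check assumes the default normalize=True.

```python
import numpy as np
from sklearn.metrics import log_loss

y_true = ["spam", "ham", "ham", "spam"]
# Columns follow the sorted label order: ["ham", "spam"].
y_pred = np.array([[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]])

# y_pred: a 1-D array is read as the probability of the positive class,
# which is "spam" here because it sorts last.
loss_2d = log_loss(y_true, y_pred)
loss_1d = log_loss(y_true, y_pred[:, 1])

# normalize: mean per-sample loss (default) versus the plain sum.
loss_sum = log_loss(y_true, y_pred, normalize=False)

# sample_weight: with normalize=True the result is a weighted average
# of the per-sample losses.
weights = np.array([1.0, 1.0, 1.0, 3.0])  # illustrative weights
per_sample = -np.log(np.array([0.9, 0.9, 0.8, 0.65]))  # prob. of the true class
loss_weighted = log_loss(y_true, y_pred, sample_weight=weights)
expected_weighted = np.average(per_sample, weights=weights)

# labels: required when y_true does not contain every class, because the
# column order of y_pred can no longer be inferred from y_true alone.
loss_one_class = log_loss(["ham", "ham"], [[0.9, 0.1], [0.8, 0.2]],
                          labels=["ham", "spam"])

print(loss_2d, loss_1d, loss_sum, loss_weighted, loss_one_class)
```

Passing labels explicitly is also a reasonable habit when scoring small validation folds, where a rare class may be missing from y_true.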

Returns:

loss : float

Log loss, aka logistic loss or cross-entropy loss.

Notes

The logarithm used is the natural logarithm (base-e).
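Because the natural logarithm is used, the loss is expressed in nats. If a value in bits is wanted, for example to compare with an information-theoretic figure, dividing by ln 2 converts it. The sketch below does this for a single 50/50 guess; it is an illustration of the unit conversion only, not something the function does itself.

```python
import math

# Log loss of a 50/50 guess on one sample, using the natural logarithm.
loss_nats = -math.log(0.5)

# Convert nats to bits by dividing by ln(2).
loss_bits = loss_nats / math.log(2)

print(loss_nats, loss_bits)
```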

References

C.M. Bishop (2006). Pattern Recognition and Machine Learning. Springer, p. 209.

Examples

>>> from sklearn.metrics import log_loss
>>> log_loss(["spam", "ham", "ham", "spam"],
...          [[.1, .9], [.9, .1], [.8, .2], [.35, .65]])
0.21616...
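To see where the value in the example comes from, the following sketch recomputes it by hand with NumPy only. It assumes the columns of y_pred follow the sorted label order ("ham", then "spam"), picks the probability assigned to each sample's true class, and averages the negative logs. The variable names are illustrative.

```python
import numpy as np

y_true = np.array(["spam", "ham", "ham", "spam"])
y_pred = np.array([[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]])

# Sorted unique labels give the assumed column order: ["ham", "spam"].
classes = np.unique(y_true)

# Column index of the true label for each sample.
col = np.searchsorted(classes, y_true)

# Probability the model assigned to the true class of each sample.
p_true = y_pred[np.arange(len(y_true)), col]

# Mean negative natural log of those probabilities.
manual_loss = float(-np.log(p_true).mean())

print(manual_loss)
```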