备注

Go to the end 下载完整的示例代码。或者通过浏览器中的MysterLite或Binder运行此示例

管道#

在Windows笔记本中显示管道的默认配置为 'diagram' 哪里 set_config(display='diagram') .要停用HTML表示，请使用 set_config(display='text') .

要查看管道可视化中的更多详细步骤，请单击管道中的步骤。

# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause

使用预处理步骤和分类器调试管道#

本节构建了一个 Pipeline 通过预处理步骤， StandardScaler 和分类器， LogisticRegression ，并显示其视觉表示。

from sklearn import set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

steps = [
    ("preprocessing", StandardScaler()),
    ("classifier", LogisticRegression()),
]
pipe = Pipeline(steps)

要可视化图表，默认值为 display='diagram' .

set_config(display="diagram")
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('preprocessing', StandardScaler()),
                ('classifier', LogisticRegression())])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

要查看文本管道，请更改为 display='text' .

set_config(display="text")
pipe

Pipeline(steps=[('preprocessing', StandardScaler()),
                ('classifier', LogisticRegression())])

放回默认显示

set_config(display="diagram")

创建链接多个预处理步骤和分类器的管道#

本节构建了一个 Pipeline 通过多个预处理步骤， PolynomialFeatures 和 StandardScaler 和分类器步骤， LogisticRegression ，并显示其视觉表示。

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('standard_scaler', StandardScaler()),
                ('polynomial', PolynomialFeatures(degree=3)),
                ('classifier', LogisticRegression(C=2.0))])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

Displaying a Pipeline and Dimensionality Reduction and Classifier#

本节构建了一个 Pipeline 利用降维步骤， PCA ，一个分类器， SVC ，并显示其视觉表示。

from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

steps = [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
pipe = Pipeline(steps)
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('reduce_dim', PCA(n_components=4)),
                ('classifier', SVC(kernel='linear'))])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

连接一个复杂的管道，连接一个柱式Transformer#

本节构建了一个复杂的 Pipeline 与 ColumnTransformer 和分类器， LogisticRegression ，并显示其视觉表示。

import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)

categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)

pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe  # click on the diagram below to see the details of each step

Pipeline(steps=[('columntransformer',
                 ColumnTransformer(transformers=[('categorical',
                                                  Pipeline(steps=[('imputation_constant',
                                                                   SimpleImputer(fill_value='missing',
                                                                                 strategy='constant')),
                                                                  ('onehot',
                                                                   OneHotEncoder(handle_unknown='ignore'))]),
                                                  ['state', 'gender']),
                                                 ('numerical',
                                                  Pipeline(steps=[('imputation_mean',
                                                                   SimpleImputer()),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['age', 'weight'])])),
                ('logisticregression', LogisticRegression(max_iter=500))])

In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.

Pipeline

?Documentation for PipelineiNot fitted

Pipeline(steps=[('columntransformer',
                 ColumnTransformer(transformers=[('categorical',
                                                  Pipeline(steps=[('imputation_constant',
                                                                   SimpleImputer(fill_value='missing',
                                                                                 strategy='constant')),
                                                                  ('onehot',
                                                                   OneHotEncoder(handle_unknown='ignore'))]),
                                                  ['state', 'gender']),
                                                 ('numerical',
                                                  Pipeline(steps=[('imputation_mean',
                                                                   SimpleImputer()),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  ['age', 'weight'])])),
                ('logisticregression', LogisticRegression(max_iter=500))])

columntransformer: ColumnTransformer

?Documentation for columntransformer: ColumnTransformer

ColumnTransformer(transformers=[('categorical',
                                 Pipeline(steps=[('imputation_constant',
                                                  SimpleImputer(fill_value='missing',
                                                                strategy='constant')),
                                                 ('onehot',
                                                  OneHotEncoder(handle_unknown='ignore'))]),
                                 ['state', 'gender']),
                                ('numerical',
                                 Pipeline(steps=[('imputation_mean',
                                                  SimpleImputer()),
                                                 ('scaler', StandardScaler())]),
                                 ['age', 'weight'])])

categorical

['state', 'gender']

SimpleImputer

?Documentation for SimpleImputer

SimpleImputer(fill_value='missing', strategy='constant')

OneHotEncoder

?Documentation for OneHotEncoder

OneHotEncoder(handle_unknown='ignore')

numerical

['age', 'weight']

SimpleImputer

?Documentation for SimpleImputer

SimpleImputer()

StandardScaler

?Documentation for StandardScaler

StandardScaler()

LogisticRegression

?Documentation for LogisticRegression

LogisticRegression(max_iter=500)

管道#

使用预处理步骤和分类器调试管道#

创建链接多个预处理步骤和分类器的管道#

Displaying a Pipeline and Dimensionality Reduction and Classifier#

连接一个复杂的管道，连接一个柱式Transformer#

使用分类器在管道上启动网格搜索#