Displaying Pipelines#
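The default configuration for displaying a pipeline in a Jupyter Notebook is 'diagram', which corresponds to set_config(display='diagram'). To deactivate the HTML representation, use set_config(display='text').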
To see more detail about a step in the pipeline visualization, click on that step in the diagram.
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
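As a small sketch of how the display setting behaves, the snippet below reads the current value with get_config and switches it temporarily with config_context, so the global default is restored on exit. The variable names inside and after are illustrative only.

```python
from sklearn import config_context, get_config, set_config

# Make sure we start from the default HTML diagram display.
set_config(display="diagram")

# Temporarily switch to the plain-text representation.
with config_context(display="text"):
    inside = get_config()["display"]

# Outside the context manager the previous setting is back in effect.
after = get_config()["display"]
print(inside, after)
```

Using config_context rather than set_config may be convenient when only a single cell or function should use the text display.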
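Displaying a Pipeline with a Preprocessing Step and Classifier#
This section constructs a Pipeline with a preprocessing step, StandardScaler, and a classifier, LogisticRegression, and displays its visual representation.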
from sklearn import set_config
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
steps = [
    ("preprocessing", StandardScaler()),
    ("classifier", LogisticRegression()),
]
pipe = Pipeline(steps)
To visualize the diagram, use display='diagram', which is the default.
set_config(display="diagram")
pipe # click on the diagram below to see the details of each step
To view the pipeline as text, change to display='text'.
set_config(display="text")
pipe
Pipeline(steps=[('preprocessing', StandardScaler()),
                ('classifier', LogisticRegression())])
Restore the default display:
set_config(display="diagram")
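Outside a notebook, the same diagram can be obtained as an HTML string. The following is a minimal sketch, assuming the public helper sklearn.utils.estimator_html_repr is available in the installed version; the variable names html and text are illustrative.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import estimator_html_repr

pipe = Pipeline(
    [
        ("preprocessing", StandardScaler()),
        ("classifier", LogisticRegression()),
    ]
)

# HTML markup for the diagram, usable outside a notebook.
html = estimator_html_repr(pipe)

# repr() always returns the plain-text form, whatever the display setting.
text = repr(pipe)
print(text)
```

The html string could then be written to a file and opened in a browser.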
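Displaying a Pipeline Chaining Multiple Preprocessing Steps & Classifier#
This section constructs a Pipeline with multiple preprocessing steps, PolynomialFeatures and StandardScaler, and a classifier step, LogisticRegression, and displays its visual representation.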
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
steps = [
    ("standard_scaler", StandardScaler()),
    ("polynomial", PolynomialFeatures(degree=3)),
    ("classifier", LogisticRegression(C=2.0)),
]
pipe = Pipeline(steps)
pipe # click on the diagram below to see the details of each step
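To illustrate what the chained steps do to the data, here is a hedged sketch that fits the same kind of pipeline on synthetic data. The dataset from make_classification and the max_iter=1000 setting are assumptions added for this illustration and are not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data with 4 input features (illustrative only).
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1, random_state=0
)

pipe = Pipeline(
    [
        ("standard_scaler", StandardScaler()),
        ("polynomial", PolynomialFeatures(degree=3)),
        # max_iter is raised here only to help the solver converge.
        ("classifier", LogisticRegression(C=2.0, max_iter=1000)),
    ]
)
pipe.fit(X, y)

# Slicing the pipeline keeps only the preprocessing steps.
n_expanded = pipe[:-1].transform(X).shape[1]
score = pipe.score(X, y)
print(n_expanded, score)
```

With 4 inputs and degree 3, PolynomialFeatures (including the bias column) produces 35 features, which is what the classifier step receives.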
Displaying a Pipeline with Dimensionality Reduction and a Classifier#
This section constructs a Pipeline with a dimensionality reduction step, PCA, and a classifier, SVC, and displays its visual representation.
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
steps = [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
pipe = Pipeline(steps)
pipe # click on the diagram below to see the details of each step
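As a rough sketch of this pipeline in use, the snippet below fits it on synthetic data and inspects the fitted PCA step through named_steps. The 10-feature dataset is an assumption made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic data with 10 features (illustrative only).
X, y = make_classification(n_samples=150, n_features=10, random_state=0)

pipe = Pipeline(
    [("reduce_dim", PCA(n_components=4)), ("classifier", SVC(kernel="linear"))]
)
pipe.fit(X, y)

# Fitted steps are accessible by the names given in the steps list.
variance_ratio = pipe.named_steps["reduce_dim"].explained_variance_ratio_
predictions = pipe.predict(X)
print(variance_ratio.shape, predictions.shape)
```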
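Displaying a Complex Pipeline Chaining a Column Transformer#
This section constructs a complex Pipeline with a ColumnTransformer and a classifier, LogisticRegression, and displays its visual representation.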
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe # click on the diagram below to see the details of each step
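Because the ColumnTransformer selects columns by name, this pipeline expects a pandas DataFrame as input. The sketch below fits an equivalent pipeline on a tiny, entirely made-up table. The values, the target y and the use of pandas are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; missing values appear in the numeric columns.
X = pd.DataFrame(
    {
        "state": ["CA", "NY", "CA", "TX", "NY", "TX"],
        "gender": ["F", "M", "M", "F", "F", "M"],
        "age": [25.0, np.nan, 47.0, 35.0, 52.0, 29.0],
        "weight": [60.0, 82.5, np.nan, 70.0, 65.0, 90.0],
    }
)
y = np.array([0, 1, 1, 0, 1, 0])

numeric_preprocessor = Pipeline(
    [("imputation_mean", SimpleImputer(strategy="mean")), ("scaler", StandardScaler())]
)
categorical_preprocessor = Pipeline(
    [
        ("imputation_constant", SimpleImputer(fill_value="missing", strategy="constant")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)

pipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))
pipe.fit(X, y)

# make_pipeline names each step after its lowercased class name.
step_names = list(pipe.named_steps)
# 3 state columns + 2 gender columns + 2 scaled numeric columns.
n_columns = pipe[:-1].transform(X).shape[1]
predictions = pipe.predict(X)
print(step_names, n_columns)
```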
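Displaying a Grid Search over a Pipeline with a Classifier#
This section constructs a GridSearchCV over a Pipeline with a RandomForestClassifier and displays its visual representation.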
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
numeric_preprocessor = Pipeline(
    steps=[
        ("imputation_mean", SimpleImputer(missing_values=np.nan, strategy="mean")),
        ("scaler", StandardScaler()),
    ]
)
categorical_preprocessor = Pipeline(
    steps=[
        (
            "imputation_constant",
            SimpleImputer(fill_value="missing", strategy="constant"),
        ),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocessor = ColumnTransformer(
    [
        ("categorical", categorical_preprocessor, ["state", "gender"]),
        ("numerical", numeric_preprocessor, ["age", "weight"]),
    ]
)
pipe = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", RandomForestClassifier())]
)
param_grid = {
    "classifier__n_estimators": [200, 500],
    "classifier__max_features": ["sqrt", "log2"],
    "classifier__max_depth": [4, 5, 6, 7, 8],
    "classifier__criterion": ["gini", "entropy"],
}
grid_search = GridSearchCV(pipe, param_grid=param_grid, n_jobs=1)
grid_search # click on the diagram below to see the details of each step
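The keys of param_grid follow the step__parameter naming convention, where the prefix is the step name given to the Pipeline. The following sketch shows the convention on a deliberately small grid so that it runs quickly. The synthetic data, the scaler step and the reduced grid are assumptions for illustration and differ from the example above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, ParameterGrid
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative only).
X, y = make_classification(n_samples=120, n_features=6, random_state=0)

pipe = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("classifier", RandomForestClassifier(random_state=0)),
    ]
)

# "classifier__" routes each parameter to the step named "classifier".
param_grid = {
    "classifier__n_estimators": [10, 20],
    "classifier__max_depth": [2, 3],
}

# Number of parameter combinations the search will evaluate.
n_candidates = len(ParameterGrid(param_grid))

grid_search = GridSearchCV(pipe, param_grid=param_grid, cv=3, n_jobs=1)
grid_search.fit(X, y)
best = grid_search.best_params_
print(n_candidates, best)
```

Valid keys for a given pipeline can be listed with pipe.get_params().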
Total running time of the script: (0 minutes 0.094 seconds)