13. Choosing the right estimator

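Often the hardest part of solving a machine learning problem is finding the right estimator for the job. Different estimators are better suited to different types of data and different problems.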

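The flowchart below is meant to give users a rough guide to which estimators to try on their data. Click any estimator in the chart to see its documentation. The orange "Try next" arrows should be read as "if this estimator does not achieve the desired result, follow the arrow and try the next one". Use the scroll wheel to zoom in and out, and click and drag to pan. You can also download the chart: ml_map.svg.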

[Figure: scikit-learn algorithm cheat sheet. A flowchart that begins at START and branches on yes/no questions: more than 50 samples (otherwise "get more data"), predicting a category, do you have labeled data, predicting a quantity, just looking, predicting structure (otherwise "tough luck"). Further checks (<10K samples, <100K samples, text data, number of categories known, few features should be important) lead to four groups of estimators, linked by "TRY NEXT" arrows.]

classification: SGD Classifier, Linear SVC, Kernel Approximation, KNeighbors Classifier, SVC, Ensemble Classifiers, Naive Bayes
clustering: MeanShift, VBGMM, MiniBatch KMeans, KMeans, Spectral Clustering, GMM
regression: SGD Regressor, Lasso, ElasticNet, RidgeRegression, SVR(kernel="linear"), SVR(kernel="rbf"), Ensemble Regressors
dimensionality reduction: Randomized PCA, Kernel Approximation, IsoMap, Spectral Embedding, LLE
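
The "try next" idea can be sketched as a simple loop: evaluate one estimator, and move on to the next only if it falls short. The example below is a minimal illustration, not part of the chart itself. The helper `try_in_order`, the `target` accuracy threshold, and the use of the iris dataset are all assumptions made for this sketch, and the candidate order (LinearSVC, then KNeighborsClassifier, then SVC) is one plausible reading of the classification branch for a small, non-text dataset.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC


def try_in_order(candidates, X, y, target=0.9, cv=5):
    """Hypothetical helper: return (name, score) for the first candidate
    whose mean cross-validated accuracy reaches `target`, or the best
    candidate seen if none reaches it."""
    best = None
    for name, estimator in candidates:
        score = cross_val_score(estimator, X, y, cv=cv).mean()
        if best is None or score > best[1]:
            best = (name, score)
        if score >= target:
            break  # good enough: no need to follow the next arrow
    return best


# Small labeled dataset (assumed here purely for illustration).
X, y = load_iris(return_X_y=True)

# Candidates in an assumed "try next" order; each is scaled first
# because SVM and nearest-neighbor models are sensitive to feature scale.
candidates = [
    ("LinearSVC", make_pipeline(StandardScaler(), LinearSVC())),
    ("KNeighborsClassifier", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("SVC", make_pipeline(StandardScaler(), SVC())),
]

name, score = try_in_order(candidates, X, y)
print(name, round(score, 3))
```

Stopping at the first estimator that is good enough mirrors the chart's intent: start with the simpler, cheaper model and escalate only when the results call for it. In practice the threshold and the metric would depend on the problem at hand.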