Version 0.18#

Warning

Scikit-learn 0.18 is the last major release of scikit-learn to support Python 2.6. Later versions of scikit-learn will require Python 2.7 or higher.

Version 0.18.2#

June 20, 2017

Changelog#

Code Contributors#

Aman Dalmia, Loic Esteve, Nate Guerin, Sergei Lebedev

Version 0.18.1#

November 11, 2016

Changelog#

Enhancements#

Bug fixes#

API changes summary#

Trees and forests

  • The min_weight_fraction_leaf parameter of tree-based classifiers and regressors now assumes uniform sample weights by default if the sample_weight argument is not passed to the fit function. Previously, the parameter was silently ignored. #7301 by Nelson Liu.

  • Cloning and pickling of tree splitting criterion classes is now memory safe. #7680 by Ibraim Ganiev.
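
The min_weight_fraction_leaf behaviour described above can be illustrated with a minimal sketch. The synthetic data and the 0.25 threshold are assumptions chosen for illustration and are not taken from the release notes. With no sample_weight passed to fit, every sample should count equally, so each leaf is expected to hold at least 25% of the samples.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: 100 samples, 2 features (illustrative values only).
rng = np.random.RandomState(0)
X = rng.rand(100, 2)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# No sample_weight is passed to fit, so uniform weights are assumed and
# each leaf must contain at least 25% of the 100 samples.
clf = DecisionTreeClassifier(min_weight_fraction_leaf=0.25, random_state=0)
clf.fit(X, y)

# Leaves are the nodes without a left child (marked with -1).
tree = clf.tree_
is_leaf = tree.children_left == -1
leaf_sizes = tree.n_node_samples[is_leaf]
print(leaf_sizes)
```

Inspecting leaf_sizes is a quick way to confirm that the constraint is applied rather than silently ignored.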

Linear, kernelized and related models

Version 0.18#

September 28, 2016

Model Selection Enhancements and API Changes#

  • The model_selection module

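    The new module sklearn.model_selection, which groups together the functionality formerly found in sklearn.cross_validation, sklearn.grid_search and sklearn.learning_curve, introduces new possibilities such as nested cross-validation and better manipulation of parameter searches with Pandas.

    Many things will stay the same, but there are some key differences. Read below to learn more about the changes.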

  • Data-independent CV splitters enabling nested cross-validation

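    The new cross-validation splitters, defined in sklearn.model_selection, are no longer initialized with any data-dependent parameters such as y. Instead they expose a split method that takes in the data and yields a generator for the different splits.

    This change makes it possible to use the cross-validation splitters to perform nested cross-validation, facilitated by the model_selection.GridSearchCV and model_selection.RandomizedSearchCV utilities.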
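
Nested cross-validation with data-independent splitters can be sketched as follows. The iris dataset, the SVC estimator, the parameter grid and the numbers of splits are all assumptions made for illustration. The inner splitter drives the parameter search, and the outer splitter scores the whole search procedure.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Splitters are constructed without any data; the data is passed to split().
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=1)

# split() returns a generator of (train_indices, test_indices) pairs.
first_train, first_test = next(outer_cv.split(X, y))

# Inner loop: hyperparameter search. Outer loop: score the whole search.
search = GridSearchCV(SVC(kernel="rbf"), param_grid={"C": [1, 10]}, cv=inner_cv)
nested_scores = cross_val_score(search, X, y, cv=outer_cv)
print(nested_scores.shape)
```

Because neither splitter stores the data, the same inner_cv object can be reused on every outer training fold.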

  • The enhanced cv_results_ attribute

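    The new cv_results_ attribute (of model_selection.GridSearchCV and model_selection.RandomizedSearchCV), introduced in lieu of the grid_scores_ attribute, is a dict of 1D arrays in which the elements of each array correspond to the parameter settings (i.e. search candidates).

    The cv_results_ dict can easily be imported into pandas as a DataFrame for exploring the search results.

    The cv_results_ arrays include scores for each cross-validation split (with keys such as 'split0_test_score'), as well as their mean ('mean_test_score') and standard deviation ('std_test_score').

    The ranks of the search candidates (based on their mean cross-validation score) are available at cv_results_['rank_test_score'].

    The values of each parameter are stored separately as numpy masked object arrays. The value for a given search candidate is masked if the corresponding parameter is not applicable. Additionally, a list of all the parameter dicts is stored at cv_results_['params'].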
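
The structure of cv_results_ described above can be explored with a short sketch. The dataset, estimator and grid are assumptions chosen so that one parameter (gamma) does not apply to every candidate, which makes the masking visible.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Two sub-grids: "gamma" does not apply to the linear-kernel candidates.
param_grid = [
    {"kernel": ["linear"], "C": [1, 10]},
    {"kernel": ["rbf"], "C": [1, 10], "gamma": [0.01, 0.1]},
]
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X, y)

# cv_results_ is a dict of 1D arrays, one entry per search candidate.
results = search.cv_results_
df = pd.DataFrame(results)
print(df[["params", "mean_test_score", "std_test_score", "rank_test_score"]])

# Per-parameter values are masked arrays; masked entries mark candidates
# for which the parameter is not applicable.
gamma_mask = np.ma.getmaskarray(results["param_gamma"])
print(gamma_mask)
```

Sorting df by "rank_test_score" gives a quick overview of the best candidates.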

  • Parameters n_folds and n_iter renamed to n_splits

    Some parameter names have changed: the n_folds parameter in the new model_selection.KFold, model_selection.GroupKFold (see below for the name change) and model_selection.StratifiedKFold is now renamed to n_splits. The n_iter parameter in model_selection.ShuffleSplit, the new class model_selection.GroupShuffleSplit and model_selection.StratifiedShuffleSplit is now renamed to n_splits.

  • Rename of splitter classes which accept group labels along with data

    The cross-validation splitters LabelKFold, LabelShuffleSplit, LeaveOneLabelOut and LeavePLabelOut have been renamed to model_selection.GroupKFold, model_selection.GroupShuffleSplit, model_selection.LeaveOneGroupOut and model_selection.LeavePGroupsOut respectively.

    Note the change from singular to plural form in model_selection.LeavePGroupsOut.

  • Fit parameter labels renamed to groups

    The labels parameter in the split method of the newly renamed splitters model_selection.GroupKFold, model_selection.LeaveOneGroupOut, model_selection.LeavePGroupsOut and model_selection.GroupShuffleSplit is renamed to groups, following the new nomenclature of their class names.

  • Parameter n_labels renamed to n_groups

    The parameter n_labels in the newly renamed model_selection.LeavePGroupsOut is changed to n_groups.
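
The renamed splitters and parameters (n_splits, groups, n_groups) can be seen together in a minimal sketch. The toy arrays below are assumptions made for illustration: eight samples in four groups of two.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, LeavePGroupsOut

# Toy data: 8 samples belonging to 4 groups of 2 samples each.
X = np.arange(16).reshape(8, 2)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])
groups = np.array([0, 0, 1, 1, 2, 2, 3, 3])

# n_splits replaces n_folds; groups replaces labels in split().
gkf = GroupKFold(n_splits=2)
group_overlaps = []
for train_idx, test_idx in gkf.split(X, y, groups=groups):
    # Record whether any group appears on both sides of the split.
    overlap = set(groups[train_idx]) & set(groups[test_idx])
    group_overlaps.append(len(overlap))

# n_groups replaces n_labels: hold out every combination of 2 groups.
lpgo = LeavePGroupsOut(n_groups=2)
n_lpgo_splits = lpgo.get_n_splits(X, y, groups=groups)
print(group_overlaps, n_lpgo_splits)
```

Passing groups by keyword keeps call sites readable and makes any leftover use of the old labels name fail loudly.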

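  • Training scores and timing information

    cv_results_ also includes the training scores for each cross-validation split (with keys such as 'split0_train_score'), as well as their mean ('mean_train_score') and standard deviation ('std_train_score'). To avoid the cost of evaluating training scores, set return_train_score=False.

    Additionally, the mean and standard deviation of the time taken to split, train and score the model across all the cross-validation splits are available under the keys 'mean_time' and 'std_time' respectively.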
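
The effect of return_train_score can be checked with a small sketch. The dataset, estimator and grid are assumptions for illustration. The flag is passed explicitly in both searches because its default may differ between releases, and timing keys are discovered by substring rather than hard-coded because their exact names may also vary by release.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
param_grid = {"max_depth": [1, 3]}

# Search that also evaluates every candidate on its training folds.
with_train = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=3,
    return_train_score=True,
)
with_train.fit(X, y)

# Search that skips training-set scoring to save computation.
without_train = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=3,
    return_train_score=False,
)
without_train.fit(X, y)

# Collect the relevant keys instead of assuming their exact names.
train_keys = sorted(k for k in with_train.cv_results_ if "train_score" in k)
time_keys = sorted(k for k in with_train.cv_results_ if "time" in k)
print(train_keys)
print(time_keys)
```

Comparing mean training and test scores per candidate is a simple way to spot overfitting across the grid.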

Changelog#

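New features#

Classifiers and Regressors

Other estimators

Model selection and evaluation

Enhancements#

Trees and ensembles

Linear, kernelized and related models

Decomposition, manifold learning and clustering

Preprocessing and feature selection

Model evaluation and meta-estimators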

Metrics

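Miscellaneous

Bug fixes#

Trees and ensembles

Linear, kernelized and related models

Decomposition, manifold learning and clustering

Preprocessing and feature selection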

  • preprocessing.data._transform_selected now always passes a copy of X to transform function when copy=True (#7194). By Caio Oliveira.

Model evaluation and meta-estimators

Metrics

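Miscellaneous

  • model_selection.tests._search._check_param_grid now works correctly with all types that extend/implement Sequence (except string), including range (Python 3.x) and xrange (Python 2.x). #7323 by Viacheslav Kovalevskyi.

  • utils.extmath.randomized_range_finder is more numerically stable when many power iterations are requested, since it applies LU normalization by default. If n_iter<2, numerical issues are unlikely, so no normalization is applied. Other normalization options are available: 'none', 'LU' and 'QR'. #5141 by Giorgio Patrini.

  • Fix a bug where some formats of scipy.sparse matrix, and estimators with them as parameters, could not be passed to base.clone. By Loic Esteve.

  • datasets.load_svmlight_file is now able to read long int QID values. #7101 by Ibraim Ganiev.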
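
The normalization options listed above for utils.extmath.randomized_range_finder can be tried side by side. This is a minimal sketch: the random input matrix, the target size of 5 and n_iter=4 are assumptions for illustration, and the keyword name power_iteration_normalizer is the one used by the library function rather than something stated in these notes.

```python
import numpy as np
from sklearn.utils.extmath import randomized_range_finder

# Random dense matrix standing in for real data (illustrative only).
rng = np.random.RandomState(0)
A = rng.rand(50, 20)

# Compute an approximate basis for the range of A with each normalizer.
bases = {}
for normalizer in ("none", "LU", "QR"):
    bases[normalizer] = randomized_range_finder(
        A,
        size=5,
        n_iter=4,
        power_iteration_normalizer=normalizer,
        random_state=0,
    )

# Each result should have 5 orthonormal columns of length 50.
for name, Q in bases.items():
    is_orthonormal = np.allclose(Q.T @ Q, np.eye(5))
    print(name, Q.shape, is_orthonormal)
```

On a small, well-conditioned matrix like this one the three options are expected to behave similarly; the difference matters mainly when many power iterations are applied to badly scaled data.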

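API changes summary#

Linear, kernelized and related models

Decomposition, manifold learning and clustering

Model evaluation and meta-estimators

Code Contributors#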

Aditya Joshi, Alejandro, Alexander Fabisch, Alexander Loginov, Alexander Minyushkin, Alexander Rudy, Alexandre Abadie, Alexandre Abraham, Alexandre Gramfort, Alexandre Saint, alexfields, Alvaro Ulloa, alyssaq, Amlan Kar, Andreas Mueller, andrew giessel, Andrew Jackson, Andrew McCulloh, Andrew Murray, Anish Shah, Arafat, Archit Sharma, Ariel Rokem, Arnaud Joly, Arnaud Rachez, Arthur Mensch, Ash Hoover, asnt, b0noI, Behzad Tabibian, Bernardo, Bernhard Kratzwald, Bhargav Mangipudi, blakeflei, Boyuan Deng, Brandon Carter, Brett Naul, Brian McFee, Caio Oliveira, Camilo Lamus, Carol Willing, Cass, CeShine Lee, Charles Truong, Chyi-Kwei Yau, CJ Carey, codevig, Colin Ni, Dan Shiebler, Daniel, Daniel Hnyk, David Ellis, David Nicholson, David Staub, David Thaler, David Warshaw, Davide Lasagna, Deborah, definitelyuncertain, Didi Bar-Zev, djipey, dsquareindia, edwinENSAE, Elias Kuthe, Elvis DOHMATOB, Ethan White, Fabian Pedregosa, Fabio Ticconi, fisache, Florian Wilhelm, Francis, Francis O'Donovan, Gael Varoquaux, Ganiev Ibraim, ghg, Gilles Louppe, Giorgio Patrini, Giovanni Cherubin, Giovanni Lanzani, Glenn Qian, Gordon Mohr, govin-vatsan, Graham Clenaghan, Greg Reda, Greg Stupp, Guillaume Lemaitre, Gustav Mörtberg, halwai, Harizo Rajaona, Harry Mavroforakis, hashcode55, hdmetor, Henry Lin, Hobson Lane, Hugo Bowne-Anderson, Igor Andriushchenko, Imaculate, Inki Hwang, Isaac Sijaranamual, Ishank Gulati, Issam Laradji, Iver Jordal, jackmartin, Jacob Schreiber, Jake Vanderplas, James Fiedler, James Routley, Jan Zikes, Janna Brettingen, jarfa, Jason Laska, jblackburne, jeff levesque, Jeffrey Blackburne, Jeffrey04, Jeremy Hintz, jeremynixon, Jeroen, Jessica Yung, Jill-Jênn Vie, Jimmy Jia, Jiyuan Qian, Joel Nothman, johannah, John, John Boersma, John Kirkham, John Moeller, jonathan.striebel, joncrall, Jordi, Joseph Munoz, Joshua Cook, JPFrancoia, jrfiedler, JulianKahnert, juliathebrave, kaichogami, KamalakerDadi, Kenneth Lyons, Kevin Wang, kingjr, kjell, Konstantin Podshumok, Kornel 
Kielczewski, Krishna Kalyan, krishnakalyan3, Kvle Putnam, Kyle Jackson, Lars Buitinck, ldavid, LeiG, LeightonZhang, Leland McInnes, Liang-Chi Hsieh, Lilian Besson, lizsz, Loic Esteve, Louis Tiao, Léonie Borne, Mads Jensen, Maniteja Nandana, Manoj Kumar, Manvendra Singh, Marco, Mario Krell, Mark Bao, Mark Szepieniec, Martin Madsen, MartinBpr, MaryanMorel, Massil, Matheus, Mathieu Blondel, Mathieu Dubois, Matteo, Matthias Ekman, Max Moroz, Michael Scherer, michiaki ariga, Mikhail Korobov, Moussa Taifi, mrandrewandrade, Mridul Seth, nadya-p, Naoya Kanai, Nate George, Nelle Varoquaux, Nelson Liu, Nick James, NickleDave, Nico, Nicolas Goix, Nikolay Mayorov, ningchi, nlathia, okbalefthanded, Okhlopkov, Olivier Grisel, Panos Louridas, Paul Strickland, Perrine Letellier, pestrickland, Peter Fischer, Pieter, Ping-Yao, Chang, practicalswift, Preston Parry, Qimu Zheng, Rachit Kansal, Raghav RV, Ralf Gommers, Ramana.S, Rammig, Randy Olson, Rob Alexander, Robert Lutz, Robin Schucker, Rohan Jain, Ruifeng Zheng, Ryan Yu, Rémy Léone, saihttam, Saiwing Yeung, Sam Shleifer, Samuel St-Jean, Sartaj Singh, Sasank Chilamkurthy, saurabh.bansod, Scott Andrews, Scott Lowe, seales, Sebastian Raschka, Sebastian Saeger, Sebastián Vanrell, Sergei Lebedev, shagun Sodhani, shanmuga cv, Shashank Shekhar, shawpan, shengxiduan, Shota, shuckle16, Skipper Seabold, sklearn-ci, SmedbergM, srvanrell, Sébastien Lerique, Taranjeet, themrmax, Thierry, Thierry Guillemot, Thomas, Thomas Hallock, Thomas Moreau, Tim Head, tKammy, toastedcornflakes, Tom, TomDLT, Toshihiro Kamishima, tracer0tong, Trent Hauck, trevorstephens, Tue Vo, Varun, Varun Jewalikar, Viacheslav, Vighnesh Birodkar, Vikram, Villu Ruusmann, Vinayak Mehta, walter, waterponey, Wenhua Yang, Wenjian Huang, Will Welch, wyseguy7, xyguo, yanlend, Yaroslav Halchenko, yelite, Yen, YenChenLin, Yichuan Liu, Yoav Ram, Yoshiki, Zheng RuiFeng, zivori, Óscar Nájera