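What's new in 1.2.0 (December 26, 2020)#

These are the changes in pandas 1.2.0. See Release notes for a full changelog including other versions of pandas.

Warning

The xlwt package for writing old-style .xls Excel files is no longer maintained. The xlrd package is now only for reading old-style .xls files.

Previously, the default argument engine=None to read_excel() would result in using the xlrd engine in many cases, including new Excel 2007+ (.xlsx) files. If openpyxl is installed, many of these cases will now default to using the openpyxl engine. See the read_excel() documentation for more details.

Thus, it is strongly encouraged to install openpyxl to read Excel 2007+ (.xlsx) files. Please do not report issues when using xlrd to read .xlsx files. This is no longer supported; switch to using openpyxl instead.

Attempting to use the xlwt engine will raise a FutureWarning unless the option io.excel.xls.writer is set to "xlwt". While this option is now deprecated and will also raise a FutureWarning, it can be set globally to suppress the warning. Users are recommended to write .xlsx files using the openpyxl engine instead.

Enhancements#

Optionally disallow duplicate labels#

Series and DataFrame can now be created with the allows_duplicate_labels=False flag to control whether the index or columns can contain duplicate labels (GH28394). This can be used to prevent accidental introduction of duplicate labels, which can affect downstream operations.

By default, duplicates continue to be allowed.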

In [1]: pd.Series([1, 2], index=['a', 'a'])
Out[1]:
a    1
a    2
Length: 2, dtype: int64

In [2]: pd.Series([1, 2], index=['a', 'a']).set_flags(allows_duplicate_labels=False)
...
DuplicateLabelError: Index has duplicates.
      positions
label
a        [0, 1]

pandas will propagate the allows_duplicate_labels property through many operations.

In [3]: a = (
   ...:     pd.Series([1, 2], index=['a', 'b'])
   ...:       .set_flags(allows_duplicate_labels=False)
   ...: )

In [4]: a
Out[4]:
a    1
b    2
Length: 2, dtype: int64

# An operation introducing duplicates
In [5]: a.reindex(['a', 'b', 'a'])
...
DuplicateLabelError: Index has duplicates.
      positions
label
a        [0, 2]

[1 rows x 1 columns]

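Warning

This is an experimental feature. Currently, many methods fail to propagate the allows_duplicate_labels value. In future versions it is expected that every method taking or returning one or more DataFrame or Series objects will propagate allows_duplicate_labels.

See Duplicate Labels for more.

The allows_duplicate_labels flag is stored in the new DataFrame.flags attribute. This stores global attributes that apply to the pandas object. This differs from DataFrame.attrs, which stores information that applies to the dataset.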
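The distinction between flags and attrs can be sketched as follows. This is a minimal illustration, not part of the original notes; the attrs key "source" and its value are made up for the example.

```python
import pandas as pd

s = pd.Series([1, 2], index=["a", "b"])
s.attrs["source"] = "sensor-7"  # hypothetical dataset-level metadata

# set_flags returns a new object; the original keeps the default flag.
strict = s.set_flags(allows_duplicate_labels=False)

print(s.flags.allows_duplicate_labels)       # flag on the original object
print(strict.flags.allows_duplicate_labels)  # flag on the new object

# An operation that would introduce duplicate labels is rejected.
try:
    strict.reindex(["a", "b", "a"])
    raised = False
except pd.errors.DuplicateLabelError:
    raised = True
print(raised)
```

The flag describes the pandas object itself (whether its labels may repeat), while attrs is a free-form dictionary for information about the data.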

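Passing arguments to fsspec backends#

Many read/write functions have acquired the storage_options optional argument, to pass a dictionary of parameters to the storage backend. This allows, for example, for passing credentials to S3 and GCS storage. The details of which parameters can be passed to which backends can be found in the documentation of the individual storage backends (detailed in the fsspec docs for builtin implementations and linked to external ones). See the section Reading/writing remote files.

GH35655 added fsspec support (including storage_options) for reading Excel files.

Support for binary file handles in `to_csv`#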

to_csv() supports file handles in binary mode (GH19827 and GH35058) with encoding (GH13068 and GH23854) and compression (GH22555). If pandas does not automatically detect whether the file handle is opened in binary or text mode, it is necessary to provide mode="wb".

For example:

In [1]: import io

In [2]: data = pd.DataFrame([0, 1, 2])

In [3]: buffer = io.BytesIO()

In [4]: data.to_csv(buffer, encoding="utf-8", compression="gzip")
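To check what ends up in the buffer, the compressed bytes can be decompressed and parsed again. The following is a small round-trip sketch that assumes pandas 1.2 or later; the column name "a" and the variable names are illustrative only.

```python
import gzip
import io

import pandas as pd

data = pd.DataFrame({"a": [0, 1, 2]})

# Write gzip-compressed, UTF-8 encoded CSV into an in-memory binary handle.
buffer = io.BytesIO()
data.to_csv(buffer, encoding="utf-8", compression="gzip", index=False)

# The handle now holds raw gzip bytes; decompress and parse them again.
raw = buffer.getvalue()
text = gzip.decompress(raw).decode("utf-8")
restored = pd.read_csv(io.StringIO(text))
print(restored)
```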

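Support for short caption and table position in `to_latex`#

DataFrame.to_latex() now allows one to specify a floating table position (GH35281) and a short caption (GH36267).

The keyword position has been added to set the position.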

In [5]: data = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

In [6]: table = data.to_latex(position='ht')

In [7]: print(table)
\begin{table}[ht]
\centering
\begin{tabular}{lrr}
\toprule
{} &  a &  b \\
\midrule
0 &  1 &  3 \\
1 &  2 &  4 \\
\bottomrule
\end{tabular}
\end{table}

Usage of the keyword caption has been extended. Besides taking a single string as an argument, one can optionally provide a tuple (full_caption, short_caption) to add a short caption macro.

In [8]: data = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

In [9]: table = data.to_latex(caption=('the full long caption', 'short caption'))

In [10]: print(table)
\begin{table}
\centering
\caption[short caption]{the full long caption}
\begin{tabular}{lrr}
\toprule
{} &  a &  b \\
\midrule
0 &  1 &  3 \\
1 &  2 &  4 \\
\bottomrule
\end{tabular}
\end{table}

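Change in default floating precision for `read_csv` and `read_table`#

For the C parsing engine, the methods read_csv() and read_table() previously defaulted to a parser that could read floating point numbers slightly incorrectly with respect to the last bit in precision. The option float_precision="high" has always been available to avoid this issue. Beginning with this version, the default is now to use the more accurate parser: float_precision=None corresponds to the high precision parser, and the new option float_precision="legacy" selects the legacy parser. The change to using the higher precision parser by default should have no impact on performance. (GH17154)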
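The available parser settings can be compared side by side. This is an illustrative sketch only: the sample values and the helper name parse are made up, and whether the legacy parser differs in the last bit depends on the input.

```python
import io

import pandas as pd

csv_text = "x\n0.1\n1e-7\n123456789.123456789\n"


def parse(precision):
    """Parse the sample with the C engine and the given float_precision."""
    return pd.read_csv(
        io.StringIO(csv_text), engine="c", float_precision=precision
    )["x"]


default = parse(None)           # now the high precision parser
high = parse("high")            # explicit high precision parser
legacy = parse("legacy")        # the pre-1.2.0 default parser
round_trip = parse("round_trip")

print(default.equals(high))
print((legacy - high).abs().max())  # any last-bit differences show up here
```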

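Experimental nullable data types for float data#

We've added Float32Dtype / Float64Dtype and FloatingArray. These are extension data types dedicated to floating point data that can hold the pd.NA missing value indicator (GH32265, GH34307).

While the default float data type already supports missing values using np.nan, these new data types use pd.NA (and its corresponding behavior) as the missing value indicator, in line with the already existing nullable integer and boolean data types.

One example where the behavior of np.nan and pd.NA differs is comparison operations: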

# the default NumPy float64 dtype
In [11]: s1 = pd.Series([1.5, None])

In [12]: s1
Out[12]: 
0    1.5
1    NaN
Length: 2, dtype: float64

In [13]: s1 > 1
Out[13]: 
0     True
1    False
Length: 2, dtype: bool

# the new nullable float64 dtype
In [14]: s2 = pd.Series([1.5, None], dtype="Float64")

In [15]: s2
Out[15]: 
0     1.5
1    <NA>
Length: 2, dtype: Float64

In [16]: s2 > 1
Out[16]: 
0    True
1    <NA>
Length: 2, dtype: boolean

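See the Experimental NA scalar to denote missing values doc section for more details on the behavior when using the pd.NA missing value indicator.

As shown above, the dtype can be specified using the "Float64" or "Float32" string (capitalized to distinguish it from the default "float64" data type). Alternatively, you can also use the dtype object: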

In [17]: pd.Series([1.5, None], dtype=pd.Float32Dtype())
Out[17]: 
0     1.5
1    <NA>
Length: 2, dtype: Float32

Operations with the existing integer or boolean nullable data types that give float results will now also use the nullable floating data types (GH38178).
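As a brief sketch of that last point (variable names are illustrative), true division of a nullable integer array yields a nullable float result, and the missing value stays pd.NA:

```python
import pandas as pd

ints = pd.array([1, 2, None], dtype="Int64")

# True division of nullable integers gives a nullable float array.
halves = ints / 2
print(halves.dtype)

# Comparisons on nullable floats propagate pd.NA into a "boolean" result.
floats = pd.Series([1.5, None], dtype="Float64")
mask = floats > 1
print(mask)
```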

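Warning

Experimental: the new floating data types are currently experimental, and their behavior or API may still change without warning. Especially the behavior regarding NaN (distinct from NA missing values) is subject to change.

Index/column name preservation when aggregating#

When aggregating using concat() or the DataFrame constructor, pandas will now attempt to preserve index and column names whenever possible (GH35847). In the case where all inputs share a common name, this name will be assigned to the result. When the input names do not all agree, the result will be unnamed. Here is an example where the index name is preserved: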

In [18]: idx = pd.Index(range(5), name='abc')

In [19]: ser = pd.Series(range(5, 10), index=idx)

In [20]: pd.concat({'x': ser[1:], 'y': ser[:-1]}, axis=1)
Out[20]: 
       x    y
abc          
1    6.0  6.0
2    7.0  7.0
3    8.0  8.0
4    9.0  NaN
0    NaN  5.0

[5 rows x 2 columns]

The same is true for MultiIndex, but the logic is applied separately on a level-by-level basis.
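The contrast between agreeing and disagreeing names can be sketched like this. The index names "key" and "id" are arbitrary examples.

```python
import pandas as pd

left = pd.Series([1, 2], index=pd.Index(["a", "b"], name="key"))
same = pd.Series([3, 4], index=pd.Index(["b", "c"], name="key"))
other = pd.Series([5, 6], index=pd.Index(["b", "c"], name="id"))

# All inputs agree on the index name, so the result keeps it.
kept = pd.concat({"x": left, "y": same}, axis=1)

# The names disagree, so the resulting index is unnamed.
dropped = pd.concat({"x": left, "y": other}, axis=1)

print(kept.index.name)
print(dropped.index.name)
```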

GroupBy supports EWM operations directly#

DataFrameGroupBy now supports exponentially weighted window operations directly (GH16037).

In [21]: df = pd.DataFrame({'A': ['a', 'b', 'a', 'b'], 'B': range(4)})

In [22]: df
Out[22]: 
   A  B
0  a  0
1  b  1
2  a  2
3  b  3

[4 rows x 2 columns]

In [23]: df.groupby('A').ewm(com=1.0).mean()
Out[23]: 
            B
A            
a 0  0.000000
  2  1.333333
b 1  1.000000
  3  2.333333

[4 rows x 1 columns]

Additionally, mean supports execution via Numba with the engine and engine_kwargs arguments. Numba must be installed as an optional dependency to use this feature.

Other enhancements#

Added day_of_week (compatibility alias dayofweek) property to Timestamp, DatetimeIndex, Period, PeriodIndex (GH9605)
Added day_of_year (compatibility alias dayofyear) property to Timestamp, DatetimeIndex, Period, PeriodIndex (GH9605)
Added set_flags() for setting table-wide flags on a Series or DataFrame (GH28394)
DataFrame.applymap() now supports na_action (GH23803)
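Index with object dtype supports division and multiplication (GH34160)
io.sql.get_schema() now supports a schema keyword argument that will add a schema into the create table statement (GH28486)
DataFrame.explode() and Series.explode() now support exploding of sets (GH35614)
DataFrame.hist() now supports time series (datetime) data (GH32590)
Styler.set_table_styles() now allows the direct styling of rows and columns and can be chained (GH35607)
Styler now allows direct CSS class name addition to individual data cells (GH36159)
Rolling.mean() and Rolling.sum() use Kahan summation to calculate the mean to avoid numerical problems (GH10319, GH11645, GH13254, GH32761, GH36031)
DatetimeIndex.searchsorted(), TimedeltaIndex.searchsorted(), PeriodIndex.searchsorted(), and Series.searchsorted() with datetime-like dtypes will now try to cast string arguments (list-like and scalar) to the matching datetime-like type (GH36346)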
Added methods IntegerArray.prod(), IntegerArray.min(), and IntegerArray.max() (GH33790)
Calling a NumPy ufunc on a DataFrame with extension types now preserves the extension types when possible (GH23743)
Calling a binary-input NumPy ufunc on multiple DataFrame objects now aligns, matching the behavior of binary operations and ufuncs on Series (GH23743). This change has been reverted in pandas 1.2.1, and the behaviour to not align DataFrames is deprecated instead, see the 1.2.1 release notes.
Where possible RangeIndex.difference() and RangeIndex.symmetric_difference() will return RangeIndex instead of Int64Index (GH36564)
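DataFrame.to_parquet() now supports MultiIndex for columns in parquet format (GH34777)
read_parquet() gained a use_nullable_dtypes=True option to use nullable dtypes that use pd.NA as missing value indicator where possible for the resulting DataFrame (default is False, and only applicable for engine="pyarrow") (GH31242)
Added Rolling.sem() and Expanding.sem() to compute the standard error of the mean (GH26476)
Rolling.var() and Rolling.std() use Kahan summation and Welford's method to avoid numerical issues (GH37051)
DataFrame.corr() and DataFrame.cov() use Welford's method to avoid numerical issues (GH37448)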
DataFrame.plot() now recognizes xlabel and ylabel arguments for plots of type scatter and hexbin (GH37001)
DataFrame now supports the divmod operation (GH37165)
DataFrame.to_parquet() now returns a bytes object when no path argument is passed (GH37105)
Rolling now supports the closed argument for fixed windows (GH34315)
DatetimeIndex and Series with datetime64 or datetime64tz dtypes now support std (GH37436)
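Window now supports all Scipy window types in win_type with flexible keyword argument support (GH34556)
testing.assert_index_equal() now has a check_order parameter that allows indexes to be checked in an order-insensitive manner (GH37478)
read_csv() supports memory-mapping for compressed files (GH37621)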
Add support for min_count keyword for DataFrame.groupby() and DataFrame.resample() for functions min, max, first and last (GH37821, GH37768)
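Improved error reporting for DataFrame.merge() when invalid merge column definitions were given (GH16228)
Improved numerical stability for Rolling.skew(), Rolling.kurt(), Expanding.skew() and Expanding.kurt() through implementation of Kahan summation (GH6929)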
Improved error reporting for subsetting columns of a DataFrameGroupBy with axis=1 (GH37725)
Implement method cross for DataFrame.merge() and DataFrame.join() (GH5401)
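When read_csv(), read_sas() and read_json() are called with chunksize/iterator they can be used in a with statement, as they return context managers (GH38225)
Augmented the list of named colors available for styling Excel exports, enabling all CSS4 colors (GH38247)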
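Two of the enhancements listed above, the cross merge and the chunked readers acting as context managers, can be tried with a short sketch. The frames and CSV text here are made-up sample data.

```python
import io

import pandas as pd

sizes = pd.DataFrame({"size": ["S", "M"]})
colors = pd.DataFrame({"color": ["red", "green", "blue"]})

# how="cross" builds the Cartesian product: 2 x 3 = 6 rows.
combos = sizes.merge(colors, how="cross")
print(combos)

# A chunked reader can be used in a with statement, which closes it on exit.
csv_text = "a,b\n1,2\n3,4\n5,6\n"
total = 0
with pd.read_csv(io.StringIO(csv_text), chunksize=2) as reader:
    for chunk in reader:
        total += int(chunk["a"].sum())
print(total)
```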

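Notable bug fixes#

These are bug fixes that might have notable behavior changes.

Consistency of DataFrame Reductions#

DataFrame.any() and DataFrame.all() with bool_only=True now determine whether to exclude object-dtype columns on a column-by-column basis, instead of checking if all object-dtype columns can be considered boolean.

This prevents pathological behavior where applying the reduction on a subset of columns could result in a larger Series result. See (GH37799).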

In [24]: df = pd.DataFrame({"A": ["foo", "bar"], "B": [True, False]}, dtype=object)

In [25]: df["C"] = pd.Series([True, True])

Previous behavior:

In [5]: df.all(bool_only=True)
Out[5]:
C    True
dtype: bool

In [6]: df[["B", "C"]].all(bool_only=True)
Out[6]:
B    False
C    True
dtype: bool

New behavior:

In [26]: df.all(bool_only=True)
Out[26]: 
B    False
C     True
Length: 2, dtype: bool

In [27]: df[["B", "C"]].all(bool_only=True)
Out[27]: 
B    False
C     True
Length: 2, dtype: bool

Other DataFrame reductions with numeric_only=None will also avoid this pathological behavior (GH37827):

In [28]: df = pd.DataFrame({"A": [0, 1, 2], "B": ["a", "b", "c"]}, dtype=object)

Previous behavior:

In [3]: df.mean()
Out[3]: Series([], dtype: float64)

In [4]: df[["A"]].mean()
Out[4]:
A    1.0
dtype: float64

New behavior:

In [29]: df.mean()
Out[29]: 
A    1.0
Length: 1, dtype: float64

In [30]: df[["A"]].mean()
Out[30]: 
A    1.0
Length: 1, dtype: float64

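Moreover, DataFrame reductions with numeric_only=None will now be consistent with their Series counterparts. In particular, for reductions where the Series method raises TypeError, the DataFrame reduction will now consider that column non-numeric instead of casting to a NumPy array which may have different semantics (GH36076, GH28949, GH21020).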

In [31]: ser = pd.Series([0, 1], dtype="category", name="A")

In [32]: df = ser.to_frame()

Previous behavior:

In [5]: df.any()
Out[5]:
A    True
dtype: bool

New behavior:

In [33]: df.any()
Out[33]: Series([], Length: 0, dtype: bool)

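Increased minimum version for Python#

pandas 1.2.0 supports Python 3.7.1 and higher (GH35214).

Increased minimum versions for dependencies#

Some minimum supported versions of dependencies were updated (GH35214). If installed, we now require: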

Package	Minimum Version	Required	Changed
numpy	1.16.5	X	X
pytz	2017.3	X	X
python-dateutil	2.7.3	X
bottleneck	1.2.1
numexpr	2.6.8		X
pytest (dev)	5.0.1		X
mypy (dev)	0.782		X

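For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.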

Package	Minimum Version	Changed
beautifulsoup4	4.6.0
fastparquet	0.3.2
fsspec	0.7.4
gcsfs	0.6.0
lxml	4.3.0	X
matplotlib	2.2.3	X
numba	0.46.0
openpyxl	2.6.0	X
pyarrow	0.15.0	X
pymysql	0.7.11	X
pytables	3.5.1	X
s3fs	0.4.0
scipy	1.2.0
sqlalchemy	1.2.8	X
xarray	0.12.3	X
xlrd	1.2.0	X
xlsxwriter	1.0.2	X
xlwt	1.3.0	X
pandas-gbq	0.12.0

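See Dependencies and Optional dependencies for more.

Other API changes#

Sorting in descending order is now stable for Series.sort_values() and Index.sort_values() for datetime-like Index subclasses. This will affect sort order when sorting a DataFrame on multiple columns, sorting with a key function that produces duplicates, or requesting the sorting index when using Index.sort_values(). When using Series.value_counts(), the count of missing values is no longer necessarily last in the list of duplicate counts. Instead, its position corresponds to the position in the original Series. When using Index.sort_values() for datetime-like Index subclasses, NaTs previously ignored the na_position argument and were sorted to the beginning. Now they respect na_position, the default being last, same as other Index subclasses (GH35992)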
Passing an invalid fill_value to Categorical.take(), DatetimeArray.take(), TimedeltaArray.take(), or PeriodArray.take() now raises a TypeError instead of a ValueError (GH37733)
Passing an invalid fill_value to Series.shift() with a CategoricalDtype now raises a TypeError instead of a ValueError (GH37733)
Passing an invalid value to IntervalIndex.insert() or CategoricalIndex.insert() now raises a TypeError instead of a ValueError (GH37733)
Attempting to reindex a Series with a CategoricalIndex with an invalid fill_value now raises a TypeError instead of a ValueError (GH37733)
CategoricalIndex.append() with an index that contains non-category values will now cast instead of raising TypeError (GH38098)
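The na_position change for datetime-like indexes described in the first item can be observed with a small sketch (the dates are arbitrary sample values):

```python
import pandas as pd

idx = pd.DatetimeIndex(["2020-01-02", None, "2020-01-01"])

# NaT now follows na_position; the default places it last.
last = idx.sort_values()
first = idx.sort_values(na_position="first")

print(last)
print(first)
```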

Deprecations#

Deprecated parameter inplace in MultiIndex.set_codes() and MultiIndex.set_levels() (GH35626)
Deprecated parameter dtype of method copy() for all Index subclasses. Use the astype() method instead for changing dtype (GH35853)
Deprecated parameters levels and codes in MultiIndex.copy(). Use the set_levels() and set_codes() methods instead (GH36685)
Date parser functions parse_date_time(), parse_date_fields(), parse_all_fields() and generic_parser() from pandas.io.date_converters are deprecated and will be removed in a future version; use to_datetime() instead (GH35741)
DataFrame.lookup() is deprecated and will be removed in a future version, use DataFrame.melt() and DataFrame.loc() instead (GH35224)
The method Index.to_native_types() is deprecated. Use .astype(str) instead (GH28867)
Deprecated indexing DataFrame rows with a single datetime-like string as df[string] (given the ambiguity whether it is indexing the rows or selecting a column), use df.loc[string] instead (GH36179)
Deprecated Index.is_all_dates() (GH27744)
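The default value of regex for Series.str.replace() will change from True to False in a future release. In addition, single character regular expressions will not be treated as literal strings when regex=True is set (GH24804)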
Deprecated automatic alignment on comparison operations between DataFrame and Series, do frame, ser = frame.align(ser, axis=1, copy=False) before e.g. frame == ser (GH28759)
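Rolling.count() with min_periods=None will default to the size of the window in a future version (GH31302)
Using "outer" ufuncs on DataFrames to return 4d ndarray is now deprecated. Convert to an ndarray first (GH23743)
Deprecated slice-indexing on tz-aware DatetimeIndex with naive datetime objects, to match scalar indexing behavior (GH36148)
Index.ravel() returning a np.ndarray is deprecated, in the future this will return a view on the same index (GH19956)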
Deprecate use of strings denoting units with 'M', 'Y' or 'y' in to_timedelta() (GH36666)
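Index methods &, |, and ^ behaving as the set operations Index.intersection(), Index.union(), and Index.symmetric_difference(), respectively, are deprecated and in the future will behave as pointwise boolean operations matching Series behavior. Use the named set methods instead (GH36758)
Categorical.is_dtype_equal() and CategoricalIndex.is_dtype_equal() are deprecated and will be removed in a future version (GH37545)
Series.slice_shift() and DataFrame.slice_shift() are deprecated, use Series.shift() or DataFrame.shift() instead (GH37601)
Partial slicing on unordered DatetimeIndex objects with keys that are not in the index is deprecated and will be removed in a future version (GH18531)
The how keyword in PeriodIndex.astype() is deprecated and will be removed in a future version, use index.to_timestamp(how=how) instead (GH37982)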
Deprecated Index.asi8() for Index subclasses other than DatetimeIndex, TimedeltaIndex, and PeriodIndex (GH37877)
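The inplace parameter of Categorical.remove_unused_categories() is deprecated and will be removed in a future version (GH37643)
The null_counts parameter of DataFrame.info() is deprecated and replaced by show_counts. It will be removed in a future version (GH37999)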
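Two of the suggested replacements can be written out explicitly so that code does not depend on the deprecated defaults. This is an illustrative sketch with made-up sample values.

```python
import pandas as pd

s = pd.Series(["a.b", "c.d"])

# Pass regex explicitly rather than relying on the changing default.
literal = s.str.replace(".", "-", regex=False)   # "." as a plain character
pattern = s.str.replace(r"\.", "-", regex=True)  # "." escaped in a regex

left = pd.Index([1, 2, 3])
right = pd.Index([2, 3, 4])

# Named set methods instead of the &, | and ^ operators.
common = left.intersection(right)
either = left.union(right)
exclusive = left.symmetric_difference(right)

print(literal.tolist(), common.tolist(), either.tolist(), exclusive.tolist())
```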

Calling NumPy ufuncs on non-aligned DataFrames

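Calling NumPy ufuncs on non-aligned DataFrames changed behaviour in pandas 1.2.0 (to align the inputs before calling the ufunc), but this change is reverted in pandas 1.2.1. The behaviour to not align is now deprecated instead, see the 1.2.1 release notes for more details.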
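One way to avoid depending on either behaviour is to align the inputs explicitly before calling the ufunc. The sketch below uses made-up frames and DataFrame.align() with its default outer join; it is a possible pattern, not the only one.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2]}, index=[0, 1])
df2 = pd.DataFrame({"a": [10, 20]}, index=[1, 2])

# Align on the union of the labels first; missing entries become NaN.
left, right = df1.align(df2)

# The ufunc now sees two frames with identical labels.
result = np.add(left, right)
print(result)
```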

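Performance improvements#

Performance improvements when creating DataFrame or Series with dtype str or StringDtype from array with many string elements (GH36304, GH36317, GH36325, GH36432, GH37371)
Performance improvement in GroupBy.agg() with the numba engine (GH35759)
Performance improvements when creating Series.map() from a huge dictionary (GH34717)
Performance improvement in GroupBy.transform() with the numba engine (GH36240)
Styler uuid method altered to compress data transmission over web whilst maintaining reasonably low table collision probability (GH36345)
Performance improvement in to_datetime() with non-ns time unit for float dtype columns (GH20445)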
Performance improvement in setting values on an IntervalArray (GH36310)
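The internal index method _shallow_copy() now makes the new index and original index share cached attributes, avoiding creating these again if they have been created on either. This can speed up operations that depend on creating copies of existing indexes (GH36840)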
Performance improvement in RollingGroupby.count() (GH35625)
Small performance decrease to Rolling.min() and Rolling.max() for fixed windows (GH36567)
Reduced peak memory usage in DataFrame.to_pickle() when using protocol=5 in Python 3.8+ (GH34244)
Faster dir calls when the object has many index labels, e.g. dir(ser) (GH37450)
Performance improvement in ExpandingGroupby (GH37064)
Performance improvement in Series.astype() and DataFrame.astype() for Categorical (GH8628)
Performance improvement in DataFrame.groupby() for float dtype (GH28303), changes of the underlying hash-function can lead to changes in float based indexes sort ordering for ties (e.g. Index.value_counts())
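Performance improvement in pd.isin() for inputs with more than 1e6 elements (GH36611)
Performance improvement for DataFrame.__setitem__() with list-like indexers (GH37954)
read_json() now avoids reading entire file into memory when chunksize is specified (GH34548)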
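The chunked read_json() path mentioned in the last item applies to line-delimited JSON. A minimal sketch, using an in-memory sample rather than a real file:

```python
import io

import pandas as pd

json_lines = (
    '{"a": 1, "b": "x"}\n'
    '{"a": 2, "b": "y"}\n'
    '{"a": 3, "b": "z"}\n'
)

# With lines=True and chunksize, records are consumed a few lines at a time.
rows = 0
with pd.read_json(io.StringIO(json_lines), lines=True, chunksize=2) as reader:
    for chunk in reader:
        rows += len(chunk)
print(rows)
```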

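Bug fixes#

Categorical#

Categorical.fillna() will always return a copy, validate a passed fill value regardless of whether there are any NAs to fill, and disallow an NaT as a fill value for numeric categories (GH36530)
Bug in Categorical.__setitem__() that incorrectly raised when trying to set a tuple value (GH20439)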
Bug in CategoricalIndex.equals() incorrectly casting non-category entries to np.nan (GH37667)
Bug in CategoricalIndex.where() incorrectly setting non-category entries to np.nan instead of raising TypeError (GH37977)
Bug in Categorical.to_numpy() and np.array(categorical) with tz-aware datetime64 categories incorrectly dropping the time zone information instead of casting to object dtype (GH38136)

Datetime-like#

Bug in DataFrame.combine_first() that would convert datetime-like column on other DataFrame to integer when the column is not present in original DataFrame (GH28481)
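Bug in DatetimeArray.date where a ValueError would be raised with a read-only backing array (GH33530)
Bug in NaT comparisons failing to raise TypeError on invalid inequality comparisons (GH35046)
Bug in DateOffset where attributes reconstructed from pickle files differ from original objects when input values exceed normal ranges (e.g. months=12) (GH34511)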
Bug in DatetimeIndex.get_slice_bound() where datetime.date objects were not accepted or naive Timestamp with a tz-aware DatetimeIndex (GH35690)
Bug in DatetimeIndex.slice_locs() where datetime.date objects were not accepted (GH34077)
Bug in DatetimeIndex.searchsorted(), TimedeltaIndex.searchsorted(), PeriodIndex.searchsorted(), and Series.searchsorted() with datetime64, timedelta64 or Period dtype placement of NaT values being inconsistent with NumPy (GH36176, GH36254)
Inconsistency in DatetimeArray, TimedeltaArray, and PeriodArray method __setitem__ casting arrays of strings to datetime-like scalars but not scalar strings (GH36261)
Bug in DatetimeArray.take() incorrectly allowing fill_value with a mismatched time zone (GH37356)
Bug in DatetimeIndex.shift incorrectly raising when shifting empty indexes (GH14811)
Timestamp and DatetimeIndex comparisons between tz-aware and tz-naive objects now follow the standard library datetime behavior, returning True/False for !=/== and raising for inequality comparisons (GH28507)
Bug in DatetimeIndex.equals() and TimedeltaIndex.equals() incorrectly considering int64 indexes as equal (GH36744)
Series.to_json(), DataFrame.to_json(), and read_json() now implement time zone parsing when orient structure is table (GH35973)
astype() now attempts to convert to datetime64[ns, tz] directly from object with inferred time zone from string (GH35973)
Bug in TimedeltaIndex.sum() and Series.sum() with timedelta64 dtype on an empty index or series returning NaT instead of Timedelta(0) (GH31751)
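Bug in DatetimeArray.shift() incorrectly allowing fill_value with a mismatched time zone (GH37299)
Bug in adding a BusinessDay with nonzero offset to a non-scalar other (GH37457)
Bug in to_datetime() with a read-only array incorrectly raising (GH34857)
Bug in Series.isin() with datetime64[ns] dtype and DatetimeIndex.isin() incorrectly casting integers to datetimes (GH36621)
Bug in Series.isin() with datetime64[ns] dtype and DatetimeIndex.isin() failing to consider tz-aware and tz-naive datetimes as always different (GH35728)
Bug in Series.isin() with PeriodDtype dtype and PeriodIndex.isin() failing to consider arguments with different PeriodDtype as always different (GH37528)
The Period constructor now correctly handles the value argument (GH34621 and GH17053)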
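Two of the fixes in this list are easy to observe directly: comparisons between tz-aware and tz-naive timestamps, and the sum of an empty timedelta64 Series. A short sketch with arbitrary sample timestamps:

```python
import pandas as pd

naive = pd.Timestamp("2020-01-01")
aware = pd.Timestamp("2020-01-01", tz="UTC")

# Equality checks return a plain boolean instead of raising.
print(naive == aware)
print(naive != aware)

# Ordering comparisons between aware and naive values raise TypeError.
try:
    naive < aware
    raised = False
except TypeError:
    raised = True
print(raised)

# Summing an empty timedelta64 Series gives a zero Timedelta, not NaT.
empty_total = pd.Series([], dtype="timedelta64[ns]").sum()
print(empty_total)
```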

Timedelta#

Bug in TimedeltaIndex, Series, and DataFrame floor-division with timedelta64 dtypes and NaT in the denominator (GH35529)
Bug in parsing of ISO 8601 durations in Timedelta and to_datetime() (GH29773, GH36204)
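Bug in to_timedelta() with a read-only array incorrectly raising (GH34857)
Bug in Timedelta incorrectly truncating to sub-second portion of a string input when it has precision higher than nanoseconds (GH36738)

Timezones#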

Bug in date_range() was raising AmbiguousTimeError for valid input with ambiguous=False (GH35297)
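Bug in Timestamp.replace() was losing fold information (GH37610)

Numeric#

Bug in to_numeric() where float precision was incorrect (GH31364)
Bug in DataFrame.any() with axis=1 and bool_only=True ignoring the bool_only keyword (GH32432)
Bug in Series.equals() where a ValueError was raised when NumPy arrays were compared to scalars (GH35267)
Bug in Series where two Series each have a DatetimeIndex with different time zones having those indexes incorrectly changed when performing arithmetic operations (GH33671)
Bug in pandas.testing module functions when used with check_exact=False on complex numeric types (GH28235)
Bug in DataFrame.__rmatmul__() error handling reporting transposed shapes (GH21581)
Bug in Series flex arithmetic methods where the result when operating with a list, tuple or np.ndarray would have an incorrect name (GH36760)
Bug in IntegerArray multiplication with timedelta and np.timedelta64 objects (GH36870)
Bug in MultiIndex comparison with tuple incorrectly treating tuple as array-like (GH21517)
Bug in DataFrame.diff() with datetime64 dtypes including NaT values failing to fill NaT results correctly (GH32441)
Bug in DataFrame arithmetic ops incorrectly accepting keyword arguments (GH36843)
Bug in IntervalArray comparisons with Series not returning Series (GH36908)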
Bug in DataFrame allowing arithmetic operations with list of array-likes with undefined results. Behavior changed to raising ValueError (GH36702)
Bug in DataFrame.std() with timedelta64 dtype and skipna=False (GH37392)
Bug in DataFrame.min() and DataFrame.max() with datetime64 dtype and skipna=False (GH36907)
Bug in DataFrame.idxmax() and DataFrame.idxmin() with mixed dtypes incorrectly raising TypeError (GH38195)

Conversion#

Bug in DataFrame.to_dict() with orient='records' now returns python native datetime objects for datetime-like columns (GH21256)
Bug in Series.astype() conversion from string to float raised in presence of pd.NA values (GH37626)

Strings#

Bug in Series.to_string(), DataFrame.to_string(), and DataFrame.to_latex() adding a leading space when index=False (GH24980)
Bug in to_numeric() raising a TypeError when attempting to convert a string dtype Series containing only numeric strings and NA (GH37262)

Interval#

Bug in DataFrame.replace() and Series.replace() where Interval dtypes would be converted to object dtypes (GH34871)
Bug in IntervalIndex.take() with negative indices and fill_value=None (GH37330)
Bug in IntervalIndex.putmask() with datetime-like dtype incorrectly casting to object dtype (GH37968)
Bug in IntervalArray.astype() incorrectly dropping dtype information with a CategoricalDtype object (GH37984)

Indexing#

Bug in PeriodIndex.get_loc() incorrectly raising ValueError on non-datelike strings instead of KeyError, causing similar errors in Series.__getitem__(), Series.__contains__(), and Series.loc.__getitem__() (GH34240)
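Bug in Index.sort_values() where, when empty values were passed, the method would break by trying to compare missing values instead of pushing them to the end of the sort order (GH35584)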
Bug in Index.get_indexer() and Index.get_indexer_non_unique() where int64 arrays are returned instead of intp (GH36359)
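Bug in DataFrame.sort_index() where parameter ascending passed as a list on a single level index gives wrong result (GH32334)
Bug in DataFrame.reset_index() was incorrectly raising a ValueError for input with a MultiIndex with missing values in a level with Categorical dtype (GH24206)
Bug in indexing with boolean masks on datetime-like values sometimes returning a view instead of a copy (GH36210)
Bug in DataFrame.__getitem__() and DataFrame.loc.__getitem__() with IntervalIndex columns and a numeric indexer (GH26490)
Bug in Series.loc.__getitem__() with a non-unique MultiIndex and an empty-list indexer (GH13691)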
Bug in indexing on a Series or DataFrame with a MultiIndex and a level named "0" (GH37194)
Bug in Series.__getitem__() when using an unsigned integer array as an indexer giving incorrect results or segfaulting instead of raising KeyError (GH37218)
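Bug in Index.where() incorrectly casting numeric values to strings (GH37591)
Bug in DataFrame.loc() returning empty result when indexer is a slice with negative step size (GH38071)
Bug in Series.loc() and DataFrame.loc() raising when the index was of object dtype and the given numeric label was in the index (GH26491)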
Bug in DataFrame.loc() returned requested key plus missing values when loc was applied to single level from a MultiIndex (GH27104)
Bug in indexing on a Series or DataFrame with a CategoricalIndex using a list-like indexer containing NA values (GH37722)
Bug in DataFrame.loc.__setitem__() expanding an empty DataFrame with mixed dtypes (GH37932)
Bug in DataFrame.xs() ignored droplevel=False for columns (GH19056)
Bug in DataFrame.reindex() raising IndexingError wrongly for empty DataFrame with tolerance not None or method="nearest" (GH27315)
Bug in indexing on a Series or DataFrame with a CategoricalIndex using list-like indexer that contains elements that are in the index's categories but not in the index itself failing to raise KeyError (GH37901)
Bug in inserting a boolean label into a DataFrame with a numeric Index columns incorrectly casting to integer (GH36319)
Bug in DataFrame.iloc() and Series.iloc() aligning objects in __setitem__ (GH22046)
Bug in MultiIndex.drop() not raising if labels are only partially found (GH37820)
Bug in DataFrame.loc() did not raise KeyError when missing combination was given with slice(None) for remaining levels (GH19556)
Bug in DataFrame.loc() raising TypeError when non-integer slice was given to select values from MultiIndex (GH25165, GH24263)
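Bug in Series.at() returning Series with one element instead of scalar when index is a MultiIndex with one level (GH38053)
Bug in DataFrame.loc() when the indexer is ordered differently than the MultiIndex being filtered (GH31330, GH34603)
Bug in DataFrame.loc() and DataFrame.__getitem__() raising KeyError when columns were MultiIndex with only one level (GH29749)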
Bug in Series.__getitem__() and DataFrame.__getitem__() raising blank KeyError without missing keys for IntervalIndex (GH27365)
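Bug in setting a new label on a DataFrame or Series with a CategoricalIndex incorrectly raising TypeError when the new label is not among the index's categories (GH38098)
Bug in Series.loc() and Series.iloc() raising ValueError when inserting a list-like np.array, list or tuple in an object Series of equal length (GH37748, GH37486)
Bug in Series.loc() and Series.iloc() setting all the values of an object Series with those of a list-like ExtensionArray instead of inserting it (GH38271)

Missing#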

Bug in SeriesGroupBy.transform() now correctly handles missing values for dropna=False (GH35014)
Bug in Series.nunique() with dropna=True was returning incorrect results when both NA and None missing values were present (GH37566)
Bug in Series.interpolate() where kwarg limit_area and limit_direction had no effect when using methods pad and backfill (GH31048)

MultiIndex#

Bug in DataFrame.xs() when used with IndexSlice raises TypeError with message "Expected label or tuple of labels" (GH35301)
Bug in DataFrame.reset_index() with NaT values in index raises ValueError with message "cannot convert float NaN to integer" (GH36541)
Bug in DataFrame.combine_first() when used with MultiIndex containing string and NaN values raises TypeError (GH36562)
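Bug in MultiIndex.drop() dropped NaN values when a non-existing key was given as input (GH18853)
Bug in MultiIndex.drop() dropping more values than expected when index has duplicates and is not sorted (GH33494)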

I/O#

read_sas() no longer leaks resources on failure (GH35566)
Bug in DataFrame.to_csv() and Series.to_csv() caused a ValueError when it was called with a filename in combination with mode containing a b (GH35058)
Bug in read_csv() with float_precision='round_trip' did not handle decimal and thousands parameters (GH35365)
to_pickle() and read_pickle() were closing user-provided file objects (GH35679)
to_csv() passes compression arguments for 'gzip' always to gzip.GzipFile (GH28103)
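to_csv() did not support zip compression for binary file object not having a filename (GH35058)
to_csv() and read_csv() did not honor compression and encoding for path-like objects that are internally converted to file-like objects (GH35677, GH26124, GH32392)
DataFrame.to_pickle(), Series.to_pickle(), and read_pickle() did not support compression for file-objects (GH26237, GH29054, GH29570)
Bug in LongTableBuilder.middle_separator() was duplicating LaTeX longtable entries in the List of Tables of a LaTeX document (GH34360)
Bug in read_csv() with engine='python' truncating data if multiple items present in first row and first element started with BOM (GH36343)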
Removed private_key and verbose from read_gbq() as they are no longer supported in pandas-gbq (GH34654, GH30200)
Bumped minimum pytables version to 3.5.1 to avoid a ValueError in read_hdf() (GH24839)
Bug in read_table() and read_csv() when delim_whitespace=True and sep=default (GH36583)
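Bug in DataFrame.to_json() and Series.to_json() when used with lines=True and orient='records': the last line of the record was not terminated with a newline character (GH36888)
Bug in read_parquet() with fixed offset time zones. String representation of time zones was not recognized (GH35997, GH36004)
Bug in DataFrame.to_html(), DataFrame.to_string(), and DataFrame.to_latex() ignoring the na_rep argument when float_format was also specified (GH9046, GH13828)
Bug in output rendering of complex numbers showing too many trailing zeros (GH36799)
Bug in HDFStore threw a TypeError when exporting an empty DataFrame with datetime64[ns, tz] dtypes with a fixed HDF5 store (GH20594)
Bug in HDFStore was dropping time zone information when exporting a Series with datetime64[ns, tz] dtypes with a fixed HDF5 store (GH20594)
read_csv() was closing user-provided binary file handles when engine="c" and an encoding was requested (GH36980)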
Bug in DataFrame.to_hdf() was not dropping missing rows with dropna=True (GH35719)
Bug in read_html() was raising a TypeError when supplying a pathlib.Path argument to the io parameter (GH37705)
DataFrame.to_excel(), Series.to_excel(), DataFrame.to_markdown(), and Series.to_markdown() now support writing to fsspec URLs such as S3 and Google Cloud Storage (GH33987)
Bug in read_fwf() with skip_blank_lines=True was not skipping blank lines (GH37758)
Parse missing values using read_json() with dtype=False to NaN instead of None (GH28501)
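read_fwf() was inferring compression with compression=None which was not consistent with the other read_* functions (GH37909)
DataFrame.to_html() was ignoring formatters argument for ExtensionDtype columns (GH36525)
Bumped minimum xarray version to 0.12.3 to avoid reference to the removed Panel class (GH27101, GH37983)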
DataFrame.to_csv() was re-opening file-like handles that also implement os.PathLike (GH38125)
Bug in conversion of a sliced pyarrow.Table with missing values to a DataFrame (GH38525)
Bug in read_sql_table() raising a sqlalchemy.exc.OperationalError when column names contained a percentage sign (GH37517)
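As one concrete illustration from this list, the line-delimited JSON output (GH36888) now ends every record with a newline, so outputs can be concatenated safely. A minimal sketch with made-up data, assuming pandas 1.2 or later:

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# One JSON record per line, each terminated by a newline character.
text = df.to_json(orient="records", lines=True)
print(repr(text))

# The output can be read back as line-delimited JSON.
back = pd.read_json(io.StringIO(text), lines=True)
print(back)
```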

Period#

Bug in DataFrame.replace() and Series.replace() where Period dtypes would be converted to object dtypes (GH34871)

Plotting#

Bug in DataFrame.plot() was rotating xticklabels when subplots=True, even if the x-axis wasn't an irregular time series (GH29460)
Bug in DataFrame.plot() where a marker letter in the style keyword sometimes caused a ValueError (GH21003)
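Bug in DataFrame.plot.bar() and Series.plot.bar() where tick positions were assigned by value order instead of using the actual value for numeric or a smart ordering for string (GH26186, GH11465). This fix has been reverted in pandas 1.2.1, see What's new in 1.2.1 (January 20, 2021)
Twinned axes were losing their tick labels, which should only happen to all but the last row or column of 'externally' shared axes (GH33819)
Bug in Series.plot() and DataFrame.plot() was throwing a ValueError when the Series or DataFrame was indexed by a TimedeltaIndex with a fixed frequency and the x-axis lower limit was greater than the upper limit (GH37454)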
Bug in DataFrameGroupBy.boxplot() when subplots=False would raise a KeyError (GH16748)
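Bug in DataFrame.plot() and Series.plot() was overwriting matplotlib's shared y axes behavior when no sharey parameter was passed (GH37942)
Bug in DataFrame.plot() was raising a TypeError with ExtensionDtype columns (GH32073)

Styler#

Bug in Styler.render() HTML was generated incorrectly because of a formatting error in the rowspan attribute, it now matches with w3 syntax (GH38234)

Groupby/resample/rolling#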

Bug in DataFrameGroupBy.count() and SeriesGroupBy.sum() returning NaN for missing categories when grouped on multiple Categoricals. Now returning 0 (GH35028)
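Bug in DataFrameGroupBy.apply() that would sometimes throw an erroneous ValueError if the grouping axis had duplicate entries (GH16646)
Bug in DataFrame.resample() that would throw a ValueError when resampling from "D" to "24H" over a transition into daylight savings time (DST) (GH35219)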
Bug when combining methods DataFrame.groupby() with DataFrame.resample() and DataFrame.interpolate() raising a TypeError (GH35325)
Bug in DataFrameGroupBy.apply() where a non-nuisance grouping column would be dropped from the output columns if another groupby method was called before .apply (GH34656)
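Bug when subsetting columns on a DataFrameGroupBy (e.g. df.groupby('a')[['b']]) would reset the attributes axis, dropna, group_keys, level, mutated, sort, and squeeze to their default values (GH9959)
Bug in DataFrameGroupBy.tshift() failing to raise ValueError when a frequency cannot be inferred for the index of a group (GH35937)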
Bug in DataFrame.groupby() does not always maintain column index name for any, all, bfill, ffill, shift (GH29764)
Bug in DataFrameGroupBy.apply() raising error with np.nan group(s) when dropna=False (GH35889)
Bug in Rolling.sum() returned wrong values when dtypes were mixed between float and integer and axis=1 (GH20649, GH35596)
Bug in Rolling.count() returned np.nan with FixedForwardWindowIndexer as window, min_periods=0 and only missing values in the window (GH35579)
Bug where pandas.core.window.Rolling produces incorrect window sizes when using a PeriodIndex (GH34225)
Bug in DataFrameGroupBy.ffill() and DataFrameGroupBy.bfill() where a NaN group would return filled values instead of NaN when dropna=True (GH34725)
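Bug in RollingGroupby.count() where a ValueError was raised when specifying the closed parameter (GH35869)
Bug in DataFrameGroupBy.rolling() returning wrong values with partial centered window (GH36040)
Bug in DataFrameGroupBy.rolling() returned wrong values with time aware window containing NaN. Raises ValueError because windows are not monotonic now (GH34617)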
Bug in Rolling.__iter__() where a ValueError was not raised when min_periods was larger than window (GH37156)
Using Rolling.var() instead of Rolling.std() avoids numerical issues for Rolling.corr() when Rolling.var() is still within floating point precision while Rolling.std() is not (GH31286)
Bug in DataFrameGroupBy.quantile() and Resampler.quantile() raised TypeError when values were of type Timedelta (GH29485)
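Bug in Rolling.median() and Rolling.quantile() returned wrong values for BaseIndexer subclasses with non-monotonic starting or ending points for windows (GH37153)
Bug in DataFrame.groupby() dropped nan groups from result with dropna=False when grouping over a single column (GH35646, GH35542)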
Bug in DataFrameGroupBy.head(), DataFrameGroupBy.tail(), SeriesGroupBy.head(), and SeriesGroupBy.tail() would raise when used with axis=1 (GH9772)
Bug in DataFrameGroupBy.transform() would raise when used with axis=1 and a transformation kernel (e.g. "shift") (GH36308)
Bug in DataFrameGroupBy.resample() using .agg with sum produced different result than just calling .sum (GH33548)
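Bug in DataFrameGroupBy.apply() dropped values on nan group when returning the same axes with the original frame (GH38227)
Bug in DataFrameGroupBy.quantile() could not handle array-like q when grouping by columns (GH33795)
Bug in DataFrameGroupBy.rank() with datetime64tz or period dtype incorrectly casting results to those dtypes instead of returning float64 dtype (GH38187)

Reshaping#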

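Bug in DataFrame.crosstab() was returning incorrect results on inputs with duplicate row names, duplicate column names or duplicate names between row and column labels (GH22529)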
Bug in DataFrame.pivot_table() with aggfunc='count' or aggfunc='sum' returning NaN for missing categories when pivoted on a Categorical. Now returning 0 (GH31422)
Bug in concat() and DataFrame constructor where input index names are not preserved in some cases (GH13475)
Bug in func crosstab() when using multiple columns with margins=True and normalize=True (GH35144)
Bug in DataFrame.stack() where an empty DataFrame.stack would raise an error (GH36113). Now returning an empty Series with empty MultiIndex.
Bug in Series.unstack(). Now a Series with single level of Index trying to unstack would raise a ValueError (GH36113)
Bug in DataFrame.agg() with func={'name':<FUNC>} incorrectly raising TypeError when DataFrame.columns==['Name'] (GH36212)
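Bug in Series.transform() would give incorrect results or raise when the argument func was a dictionary (GH35811)
Bug in DataFrame.pivot() did not preserve MultiIndex level names for columns when rows and columns are both multiindexed (GH36360)
Bug in DataFrame.pivot() modified index argument when columns was passed but values was not (GH37635)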
Bug in DataFrame.join() returned a non deterministic level-order for the resulting MultiIndex (GH36910)
Bug in DataFrame.combine_first() caused wrong alignment with dtype string and one level of MultiIndex containing only NA (GH37591)
Fixed regression in merge() on merging DatetimeIndex with empty DataFrame (GH36895)
Bug in DataFrame.apply() not setting index of return value when func return type is dict (GH37544)
Bug in DataFrame.merge() and pandas.merge() returning inconsistent ordering in result for how=right and how=left (GH35382)
Bug in merge_ordered() couldn't handle list-like left_by or right_by (GH35269)
Bug in merge_ordered() returned wrong join result when length of left_by or right_by equals to the rows of left or right (GH38166)
Bug in merge_ordered() did not raise when elements in left_by or right_by do not exist in the left columns or right columns (GH38167)
Bug in DataFrame.drop_duplicates() not validating bool dtype for ignore_index keyword (GH38274)

ExtensionArray#

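Fixed bug where DataFrame column set to scalar extension type via a dict instantiation was considered an object type rather than the extension type (GH35965)
Fixed bug where astype() with equal dtype and copy=False would return a new object (GH28488)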
Fixed bug when applying a NumPy ufunc with multiple outputs to an IntegerArray returning None (GH36913)
Fixed an inconsistency in PeriodArray's __init__ signature to those of DatetimeArray and TimedeltaArray (GH37289)
Reductions for BooleanArray, Categorical, DatetimeArray, FloatingArray, IntegerArray, PeriodArray, TimedeltaArray, and PandasArray are now keyword-only methods (GH37541)
Fixed a bug where a TypeError was wrongly raised if a membership check was made on an ExtensionArray containing NaN-like values (GH37867)
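The astype() and keyword-only reduction entries above can be illustrated with a nullable integer array. This is a minimal sketch assuming pandas 1.2 or later and the "Int64" extension dtype; the variable names are illustrative and not taken from the release notes.

```python
import pandas as pd

values = pd.array([1, 2, None], dtype="Int64")

# astype() with an equal dtype and copy=False hands back the same object (GH28488).
same = values.astype("Int64", copy=False)
is_same_object = same is values

# Reductions take their options by keyword only (GH37541).
total = values.sum(skipna=True)

# Passing skipna positionally is rejected with a TypeError.
try:
    values.sum(True)
    positional_error = None
except TypeError as exc:
    positional_error = exc

print(is_same_object, total, type(positional_error).__name__)
```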

Other#

Bug in DataFrame.replace() and Series.replace() incorrectly raising an AssertionError instead of a ValueError when invalid parameter combinations are passed (GH36045)
Bug in DataFrame.replace() and Series.replace() with numeric values and string to_replace (GH34789)
Fixed metadata propagation in Series.abs() and ufuncs called on Series and DataFrames (GH28283)
Bug in DataFrame.replace() and Series.replace() incorrectly casting from PeriodDtype to object dtype (GH34871)
Fixed bug in metadata propagation incorrectly copying DataFrame columns as metadata when the column name overlaps with the metadata name (GH37037)
Fixed metadata propagation in the Series.dt and Series.str accessors and in the DataFrame.duplicated, DataFrame.stack, DataFrame.unstack, DataFrame.pivot, DataFrame.append, DataFrame.diff, DataFrame.applymap and DataFrame.update methods (GH28283, GH37381)
Fixed metadata propagation when selecting columns with DataFrame.__getitem__ (GH28283)
Bug in Index.intersection() with non-Index failing to set the correct name on the returned Index (GH38111)
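Bug in RangeIndex.intersection() failing to set the correct name on the returned Index in some corner cases (GH38197)
Bug in Index.difference() failing to set the correct name on the returned Index in some corner cases (GH38268)
Bug in Index.union() behaving differently depending on whether the operand is an Index or another list-like (GH36384)
Bug in Index.intersection() casting mismatching numeric dtypes to object dtype instead of the minimal common dtype (GH38122)
Bug in IntervalIndex.union() returning an incorrectly typed Index when empty (GH38282)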
Passing an array with 2 or more dimensions to the Series constructor now raises the more specific ValueError rather than a bare Exception (GH35744)
Bug in dir where dir(obj) would not show attributes defined on the instance for pandas objects (GH37173)
Bug in Index.drop() raising InvalidIndexError when the index has duplicates (GH38051)
Bug in RangeIndex.difference() returning Int64Index in some cases where it should return RangeIndex (GH38028)
Fixed bug in assert_series_equal() when comparing a datetime-like array with an equivalent non-extension dtype array (GH37609)
Bug in is_bool_dtype() would raise when passed a valid string such as "boolean" (GH38386)
Fixed regression in logical operators raising ValueError when the columns of a DataFrame are a CategoricalIndex with unused categories (GH38367)
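Two entries in this list are easy to check directly: the Series constructor rejecting multi-dimensional input (GH35744) and is_bool_dtype() accepting the "boolean" string alias (GH38386). The sketch below assumes pandas 1.2 or later with NumPy installed; the variable names are illustrative.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_bool_dtype

# A 2-D array is not valid Series data and raises ValueError.
try:
    pd.Series(np.zeros((2, 2)))
    constructor_error = None
except ValueError as exc:
    constructor_error = exc

# The "boolean" alias names the nullable boolean dtype and no longer raises.
string_alias_is_bool = is_bool_dtype("boolean")

print(type(constructor_error).__name__, string_alias_is_bool)
```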

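Contributors#

A total of 257 people contributed patches to this release. People with a "+" by their names contributed a patch for the first time.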

21CSM +
AbdulMAbdi +
Abhiraj Hinge +
Abhishek Mangla +
Abo7atm +
Adam Spannbauer +
Albert Villanova del Moral
Alex Kirko
Alex Lim +
Alex Thorne +
Aleš Erjavec +
Ali McMaster
Amanda Dsouza +
Amim Knabben +
Andrew Wieteska
Anshoo Rajput +
Anthony Milbourne
Arun12121 +
Asish Mahapatra
Avinash Pancham +
BeanNan +
Ben Forbes +
Brendan Wilby +
Bruno Almeida +
Byron Boulton +
Chankey Pathak
Chris Barnes +
Chris Lynch +
Chris Withers
Christoph Deil +
Christopher Hadley +
Chuanzhu Xu
Coelhudo +
Dan Moore
Daniel Saxton
David Kwong +
David Li +
David Mrva +
Deepak Pandey +
Deepyaman Datta
Devin Petersohn
Dmitriy Perepelkin +
Douglas Hanley +
Dāgs Grīnbergs +
Eli Treuherz +
Elliot Rampono +
Erfan Nariman
Eric Goddard
Eric Leung +
Eric Wieser
Ethan Chen +
Eve +
Eyal Trabelsi +
Fabian Gebhart +
Fangchen Li
Felix Claessen +
Finlay Maguire +
Florian Roscheck +
Gabriel Monteiro
Gautham +
Gerard Jorgensen +
Gregory Livschitz
Hans
Harsh Sharma
Honfung Wong +
Igor Gotlibovych +
Iqrar Agalosi Nureyza
Irv Lustig
Isaac Virshup
Jacob Peacock
Jacob Stevens-Haas +
Jan Müller +
Janus
Jeet Parekh
Jeff Hernandez +
Jeff Reback
Jiaxiang
Joao Pedro Berno Zanutto +
Joel Nothman
Joel Whittier +
John Karasinski +
John McGuigan +
Johnny Pribyl +
Jonas Laursen +
Jonathan Shreckengost +
Joris Van den Bossche
Jose +
JoseNavy +
Josh Temple +
Jun Kudo +
Justin Essert
Justin Sexton +
Kaiqi Dong
Kamil Trocewicz +
Karthik Mathur
Kashif +
Kenny Huynh
Kevin Sheppard
Kumar Shivam +
Leonardus Chen +
Levi Matus +
Lucas Rodés-Guirao +
Luis Pinto +
Lynch +
Marc Garcia
Marco Gorelli
Maria-Alexandra Ilie +
Marian Denes
Mark Graham +
Martin Durant
Matt Roeschke
Matthew Roeschke
Matthias Bussonnier
Maxim Ivanov +
Mayank Chaudhary +
MeeseeksMachine
Meghana Varanasi +
Metehan Kutlu +
Micael Jarniac +
Micah Smith +
Michael Marino
Miroslav Šedivý
Mohammad Jafar Mashhadi
Mohammed Kashif +
Nagesh Kumar C +
Nidhi Zare +
Nikhil Choudhary +
Number42
Oleh Kozynets +
OlivierLuG
Pandas Development Team
Paolo Lammens +
Paul Ganssle
Pax +
Peter Liu +
Philip Cerles +
Pranjal Bhardwaj +
Prayag Savsani +
Purushothaman Srikanth +
Qbiwan +
Rahul Chauhan +
Rahul Sathanapalli +
Rajat Bishnoi +
Ray Bell
Reshama Shaikh +
Richard Shadrach
Robert Bradshaw
Robert de Vries
Rohith295
S Mono +
S.TAKENO +
Sahid Velji +
Sam Cohen +
Sam Ezebunandu +
Sander +
Sarthak +
Sarthak Vineet Kumar +
Satrio H Wicaksono +
Scott Lasley
Shao Yang Hong +
Sharon Woo +
Shubham Mehra +
Simon Hawkins
Sixuan (Cherie) Wu +
Souris Ash +
Steffen Rehberg
Suvayu Ali
Sven
SylvainLan +
T. JEGHAM +
Terji Petersen
Thomas Dickson +
Thomas Heavey +
Thomas Smith
Tobias Pitters
Tom Augspurger
Tomasz Sakrejda +
Torsten Wörtwein +
Ty Mick +
UrielMaD +
Uwe L. Korn
Vikramaditya Gaonkar +
VirosaLi +
W.R +
Warren White +
Wesley Boelrijk +
William Ayd
Yanxian Lin +
Yassir Karroum +
Yong Kai Yi +
Yuanhao Geng +
Yury Mikhaylov +
Yutaro Ikeda
Yuya Takashina +
Zach Brookler +
Zak Kohler +
ZhihuiChen0903 +
abmyii
alexhtn +
asharma13524 +
attack68
beanan +
chinhwee
cleconte987
danchev +
ebardie +
edwardkong
elliot rampono +
estasney +
gabicca
geetha-rangaswamaiah +
gfyoung
guru kiran
hardikpnsp +
icanhazcodeplz +
ivanovmg +
jbrockmendel
jeschwar
jnecus
joooeey +
junk +
krajatcl +
lacrosse91 +
leo +
lpkirwin +
lrjball
lucasrodes +
ma3da +
mavismonica +
mlondschien +
mzeitlin11 +
nguevara +
nrebena
parkdj1 +
partev
patrick
realead
rxxg +
samilAyoub +
sanderland
shawnbrown
sm1899 +
smartvinnetou
ssortman +
steveya +
taytzehao +
tiagohonorato +
timhunderwood
tkmz-n +
tnwei +
tpanza +
vineethraj510 +
vmdhhh +
xinrong-databricks +
yonas kassa +
yonashub +
Ádám Lippai +
