pandas.DataFrame.value_counts

DataFrame.value_counts(subset=None, normalize=False, sort=True, ascending=False, dropna=True)

Return a Series containing counts of unique rows in the DataFrame.

New in version 1.1.0.

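Parameters

subset (list-like, optional): Columns to use when counting unique combinations.
normalize (bool, default False): Return proportions rather than frequencies.
sort (bool, default True): Sort by frequencies.
ascending (bool, default False): Sort in ascending order.
dropna (bool, default True): Do not include counts of rows that contain NA values.

New in version 1.3.0.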
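
None of the examples further down exercise `subset`, so here is a minimal sketch of it. It reuses the `num_legs` / `num_wings` frame from the Examples section and counts distinct values of one column only. The variable name `wing_counts` is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
    index=["falcon", "dog", "cat", "ant"],
)

# Restrict the count to a subset of columns by passing a list-like.
# Only the listed columns define what counts as a "unique row".
wing_counts = df.value_counts(subset=["num_wings"])

# Three rows have num_wings == 0 and one row has num_wings == 2,
# so the counts still add up to the number of rows in df.
print(wing_counts)
print(wing_counts.sum() == len(df))
```

The other columns are ignored entirely, which is why `dog`, `cat` and `ant` collapse into a single entry even though their `num_legs` values differ.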

Returns

Series

See also

Series.value_counts: Equivalent method on Series.

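Notes

The returned Series will have a MultiIndex with one level per input column. By default, rows that contain any NA values are omitted from the result, and the resulting Series is sorted in descending order so that the first element is the most frequently occurring row.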
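
The MultiIndex described above can be inspected and flattened as in the following sketch. It assumes the same `num_legs` / `num_wings` frame used in the Examples section; the column name `"n"` and the variable names are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
    index=["falcon", "dog", "cat", "ant"],
)

counts = df.value_counts()

# The index has one level per input column, named after that column.
print(list(counts.index.names))

# A full tuple key selects the count for one combination of values.
print(counts.loc[(4, 0)])

# reset_index turns the index levels back into ordinary columns;
# the counts land in a column named "n".
flat = counts.reset_index(name="n")
print(flat)
```

Flattening with `reset_index` is convenient when the counts need to be merged with other frames or written out as a table.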

Examples

>>> df = pd.DataFrame({'num_legs': [2, 4, 4, 6],
...                    'num_wings': [2, 0, 0, 0]},
...                   index=['falcon', 'dog', 'cat', 'ant'])
>>> df
        num_legs  num_wings
falcon         2          2
dog            4          0
cat            4          0
ant            6          0

>>> df.value_counts()
num_legs  num_wings
4         0            2
2         2            1
6         0            1
dtype: int64

>>> df.value_counts(sort=False)
num_legs  num_wings
2         2            1
4         0            2
6         0            1
dtype: int64

>>> df.value_counts(ascending=True)
num_legs  num_wings
2         2            1
6         0            1
4         0            2
dtype: int64

>>> df.value_counts(normalize=True)
num_legs  num_wings
4         0            0.50
2         2            0.25
6         0            0.25
dtype: float64
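
As a quick check on what `normalize=True` computes, the sketch below divides the raw counts by their total and compares the result with the normalized output. The variable names are illustrative, and the comparison sorts both results by index first so that it does not depend on how ties are ordered.

```python
import pandas as pd

df = pd.DataFrame(
    {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
    index=["falcon", "dog", "cat", "ant"],
)

counts = df.value_counts()
proportions = df.value_counts(normalize=True)

# Dividing each count by the total number of counted rows
# gives the same values as normalize=True.
manual = counts / counts.sum()

print(manual.sort_index().tolist() == proportions.sort_index().tolist())

# Proportions over all counted rows add up to 1.
print(proportions.sum())
```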

With dropna set to False we can also count rows with NA values.

>>> df = pd.DataFrame({'first_name': ['John', 'Anne', 'John', 'Beth'],
...                    'middle_name': ['Smith', pd.NA, pd.NA, 'Louise']})
>>> df
  first_name middle_name
0       John       Smith
1       Anne        <NA>
2       John        <NA>
3       Beth      Louise

>>> df.value_counts()
first_name  middle_name
Beth        Louise         1
John        Smith          1
dtype: int64

>>> df.value_counts(dropna=False)
first_name  middle_name
Anne        NaN            1
Beth        Louise         1
John        Smith          1
            NaN            1
dtype: int64
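
One way to see the effect of `dropna` is to compare the totals of the two results, as in this sketch. It assumes pandas 1.3.0 or later (where `dropna` is available) and reuses the name frame from the example above; `complete_only` and `with_na` are names chosen here.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "first_name": ["John", "Anne", "John", "Beth"],
        "middle_name": ["Smith", pd.NA, pd.NA, "Louise"],
    }
)

# Default: rows that contain any NA value are left out of the counts.
complete_only = df.value_counts()

# dropna=False: rows that contain NA values are counted as well.
with_na = df.value_counts(dropna=False)

# The default total matches the number of fully populated rows,
# while the dropna=False total matches the number of rows in df.
print(complete_only.sum() == len(df.dropna()))
print(with_na.sum() == len(df))
```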
