pandas.Series.sample

Series.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None, ignore_index=False)

Return a random sample of items from an axis of the object.

You can pass random_state to make the result reproducible.
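
As a minimal sketch of the reproducibility point above, the snippet below draws twice with the same integer seed and compares the two draws. The data and variable names are illustrative assumptions, not part of the pandas API.

```python
import pandas as pd

# Illustrative data; the labels and values are made up for this sketch.
s = pd.Series([2, 4, 8, 0],
              index=["falcon", "dog", "spider", "fish"],
              name="num_legs")

# Two calls with the same integer seed should select the same items.
first = s.sample(n=3, random_state=42)
second = s.sample(n=3, random_state=42)

print(first.equals(second))
```

Omitting random_state would normally give a different draw on each call, which is why seeded calls are preferred in tests and documentation.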

Parameters

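n (int, optional): Number of items from the axis to return. Cannot be used with frac. Defaults to 1 if frac is None.
frac (float, optional): Fraction of axis items to return. Cannot be used with n.
replace (bool, default False): Allow or disallow sampling the same row more than once.
weights (str or ndarray-like, optional): The default, None, gives every item equal probability. If a Series is passed, it is aligned with the target object on the index. Index values in weights that are not found in the sampled object are ignored, and index values in the sampled object that are not in weights are assigned a weight of zero. If called on a DataFrame, a column name is accepted when axis=0. Unless weights is a Series, it must have the same length as the axis being sampled. If the weights do not sum to 1, they are normalized to sum to 1. Missing values in the weights column are treated as zero. Infinite values are not allowed.
random_state (int, array-like, BitGenerator, np.random.RandomState, np.random.Generator, optional): If int, array-like, or BitGenerator, it is used as the seed for the random number generator. If np.random.RandomState or np.random.Generator, it is used as given.

Changed in version 1.1.0: array-like and BitGenerator objects are now passed to np.random.RandomState() as the seed.

Changed in version 1.4.0: np.random.Generator objects are now accepted.
axis ({0 or 'index', 1 or 'columns', None}, default None): Axis to sample. Accepts an axis number or name. The default is the stat axis for the given data type (0 for Series and DataFrames).
ignore_index (bool, default False): If True, the resulting index is labeled 0, 1, …, n - 1.

New in version 1.3.0.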
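
The effect of ignore_index can be sketched as follows, assuming pandas 1.3.0 or later. The variable names kept and reset are illustrative only.

```python
import pandas as pd

# Illustrative data for this sketch.
s = pd.Series([2, 4, 8, 0], index=["falcon", "dog", "spider", "fish"])

# Same seed for both calls, so the same items are selected each time.
kept = s.sample(n=3, random_state=1)                      # original labels are kept
reset = s.sample(n=3, random_state=1, ignore_index=True)  # labels are replaced by 0, 1, 2

print(list(kept.index))
print(list(reset.index))
```

Resetting the index this way can be convenient when the original labels carry no meaning downstream, since it avoids a separate reset_index(drop=True) call.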

Returns

Series or DataFrame: A new object of the same type as the caller, containing n items randomly sampled from the caller object.

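See also

DataFrameGroupBy.sample: Generates random samples from each group of a DataFrame object.
SeriesGroupBy.sample: Generates random samples from each group of a Series object.
numpy.random.choice: Generates a random sample from a given 1-D numpy array.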

Notes

If frac > 1, replace should be set to True.
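
A small sketch of the note above: requesting more items than the object holds is expected to fail unless sampling with replacement is enabled. The raised flag is an illustrative helper, and the exact error message may differ between pandas versions.

```python
import pandas as pd

s = pd.Series([2, 4, 8, 0])

# frac > 1 with the default replace=False is expected to raise ValueError.
try:
    s.sample(frac=2)
    raised = False
except ValueError:
    raised = True

# With replace=True the same request succeeds and returns 2 * len(s) items.
upsampled = s.sample(frac=2, replace=True, random_state=1)

print(raised, len(upsampled))
```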

Examples

>>> df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
...                    'num_wings': [2, 0, 0, 0],
...                    'num_specimen_seen': [10, 2, 1, 8]},
...                   index=['falcon', 'dog', 'spider', 'fish'])
>>> df
        num_legs  num_wings  num_specimen_seen
falcon         2          2                 10
dog            4          0                  2
spider         8          0                  1
fish           0          0                  8

Extract 3 random elements from the Series df['num_legs']. Note that random_state is used to keep the examples reproducible.

>>> df['num_legs'].sample(n=3, random_state=1)
fish      0
spider    8
falcon    2
Name: num_legs, dtype: int64

A random 50% sample of the DataFrame, with replacement:

>>> df.sample(frac=0.5, replace=True, random_state=1)
      num_legs  num_wings  num_specimen_seen
dog          4          0                  2
fish         0          0                  8

An upsampled DataFrame, drawn with replacement. Note that the replace parameter has to be True when frac > 1.

>>> df.sample(frac=2, replace=True, random_state=1)
        num_legs  num_wings  num_specimen_seen
dog            4          0                  2
fish           0          0                  8
falcon         2          2                 10
falcon         2          2                 10
fish           0          0                  8
dog            4          0                  2
fish           0          0                  8
dog            4          0                  2

Using a DataFrame column as weights. Rows with larger values in the num_specimen_seen column are more likely to be sampled.

>>> df.sample(n=2, weights='num_specimen_seen', random_state=1)
        num_legs  num_wings  num_specimen_seen
falcon         2          2                 10
fish           0          0                  8
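
Weights can also be passed as a Series, in which case they are aligned on the index as described for the weights parameter. The sketch below uses a hypothetical weights Series w: the label "unicorn" does not exist in the sampled object and is ignored, while "dog" and "spider" are missing from w and so receive a weight of zero.

```python
import pandas as pd

s = pd.Series([2, 4, 8, 0],
              index=["falcon", "dog", "spider", "fish"],
              name="num_legs")

# Hypothetical weights, aligned to s by index label rather than by position.
w = pd.Series({"falcon": 3.0, "fish": 1.0, "unicorn": 5.0})

# Only "falcon" and "fish" have non-zero weight, so sampling two items
# without replacement can only return those two labels.
picked = s.sample(n=2, weights=w, random_state=0)

print(sorted(picked.index))
```

Because the weights here sum to 4 rather than 1 after alignment, this also relies on the normalization step described for the weights parameter.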
