Comma-separated values

File extension

.csv

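We probably know comma-separated values (csv) files mainly from working with people who use spreadsheet software such as Microsoft Excel. Since the native file format of such software can be hard to read, we prefer to get csv exports of these spreadsheets.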

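In general, a csv file is a plain-text format in which each line holds one record, and each record consists of several fields separated by a delimiter. The order of the fields is the same for every record.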

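Note that csv files are not standardized, so many different flavors exist. For example, the delimiter does not have to be a comma; it can also be a simple space. If the delimiter is part of a field value, for example inside a string, it has to be quoted or escaped, and how this is done depends on the flavor of the csv implementation. Comments are also handled differently by each implementation: some allow them in the header, marked with #, while others do not support them at all.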

Therefore, csv files can be used for very simple data, but may prove unsuitable for complex data.
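The point about delimiters inside field values can be illustrated with a small sketch. The sample strings below are made up for this illustration; they show two common flavors that Python's csv module can read, one that quotes the field and one that escapes the delimiter.

```python
import csv
import io

# Hypothetical sample: the label of record 1 contains the delimiter ';'.
quoted_sample = 'id;label\n1;"He; cold"\n2;Ar\n'

# Flavor 1: fields containing the delimiter are wrapped in quote characters.
reader = csv.reader(io.StringIO(quoted_sample), delimiter=';', quotechar='"')
rows = list(reader)
print(rows)  # [['id', 'label'], ['1', 'He; cold'], ['2', 'Ar']]

# Flavor 2: no quoting, the delimiter is protected by an escape character.
escaped_sample = '1;He\\; cold\n'
reader = csv.reader(io.StringIO(escaped_sample), delimiter=';',
                    quoting=csv.QUOTE_NONE, escapechar='\\')
escaped_rows = list(reader)
print(escaped_rows)  # [['1', 'He; cold']]
```

A reader configured for one flavor will generally misread a file written in the other, which is why the flavor has to be known in advance.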

Let us look at a csv file that stores the result of an atomistic simulation:

with open('md_result.csv', 'r') as f:
    print(f.read())
# Atom types:
# - He: Helium
# - Ar: Argon
# Potentials:
# - He: Lennard-Jones potential with epsilon/k_B = 10.22 K, sigma = 256 pm
# - Ar: Lennard-Jones potential with epsilon/k_B = 120 K, sigma = 341 pm
# Simulation box size: 100 µm x 200 µm x 300 µm
# Periodic boundary conditions in all directions
# Step: 0
# Time: 0 s
# All quantities are given in SI units
atom_id type x-position y-position z-position x-velocity y-velocity z-velocity
0 He 5.7222e-07 4.8811e-09 2.0415e-07 -2.9245e+01 1.0045e+02 1.2828e+02
1 He 9.7710e-07 3.6371e-07 4.7311e-07 -1.9926e+02 2.3275e+02 -5.3438e+02
2 Ar 6.4989e-07 6.7873e-07 9.5000e-07 -1.5592e+00 -3.7876e+02 8.4091e+01
3 Ar 5.9024e-08 3.7138e-07 7.3455e-08 3.4282e+02 1.5682e+02 -3.8991e+01
4 He 7.6746e-07 8.3017e-08 4.8520e-07 -3.0450e+01 -3.7975e+02 -3.3632e+02
5 Ar 1.7226e-07 4.6023e-07 4.7356e-08 -3.1151e+02 -4.2939e+02 -6.9474e+02
6 Ar 9.6394e-07 7.2845e-07 8.8623e-07 -8.2636e+01 4.5098e+01 -1.0626e+01
7 He 5.4450e-07 4.6373e-07 6.2270e-07 1.5889e+02 2.5858e+02 -1.5150e+02
8 He 7.9322e-07 9.4700e-07 3.5194e-08 -1.9703e+02 1.5674e+02 -1.8520e+02
9 Ar 2.7797e-07 1.6487e-07 8.2403e-07 -3.8650e+01 -6.9632e+02 2.1642e+02
10 He 1.1842e-07 6.3244e-07 5.0958e-07 -1.4963e+02 4.2288e+02 -7.6309e+01
11 Ar 2.0359e-07 8.3369e-07 9.6348e-07 4.8457e+02 -2.6741e+02 -3.5254e+02
12 He 5.1019e-07 2.2470e-07 2.3846e-08 -2.3192e+02 -9.9510e+01 3.2770e+01
13 Ar 3.5383e-07 8.4581e-07 7.2340e-07 -3.0395e+02 4.7316e+01 2.2253e+02
14 He 3.8515e-07 2.8940e-07 5.6028e-07 2.3308e+02 2.5418e+02 4.2983e+02
15 He 1.5842e-07 9.8225e-07 5.7859e-07 1.9963e+02 2.0311e+02 -4.2560e+02
16 He 3.6831e-07 7.6520e-07 2.9884e-07 6.6341e+01 2.2232e+02 -9.7653e+01
17 He 2.8696e-07 1.5129e-07 6.4060e-07 9.0358e+01 -6.7459e+01 -6.4782e+01
18 He 1.0325e-07 9.9012e-07 3.4381e-07 7.1108e+01 1.1060e+01 1.5912e+01
19 Ar 4.3929e-07 7.5363e-07 9.9974e-07 2.3919e+02 1.7383e+02 3.3529e+02

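Note that the first line that is not a comment contains the field names. This will become important later. Using the csv module from the Python standard library, we can read the file quite nicely: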

import csv

number_of_rows_to_skip = 12  # 11 comment lines + 1 line with the field names
with open('md_result.csv', 'r', newline='') as f:
    # skip the first rows
    for _ in range(number_of_rows_to_skip):
        next(f)

    csv_reader = csv.reader(f, delimiter=' ')
    for row in csv_reader:
        print(row)

This produces the following output:

['0', 'He', '5.7222e-07', '4.8811e-09', '2.0415e-07', '-2.9245e+01', '1.0045e+02', '1.2828e+02']
['1', 'He', '9.7710e-07', '3.6371e-07', '4.7311e-07', '-1.9926e+02', '2.3275e+02', '-5.3438e+02']
['2', 'Ar', '6.4989e-07', '6.7873e-07', '9.5000e-07', '-1.5592e+00', '-3.7876e+02', '8.4091e+01']
['3', 'Ar', '5.9024e-08', '3.7138e-07', '7.3455e-08', '3.4282e+02', '1.5682e+02', '-3.8991e+01']
['4', 'He', '7.6746e-07', '8.3017e-08', '4.8520e-07', '-3.0450e+01', '-3.7975e+02', '-3.3632e+02']
['5', 'Ar', '1.7226e-07', '4.6023e-07', '4.7356e-08', '-3.1151e+02', '-4.2939e+02', '-6.9474e+02']
['6', 'Ar', '9.6394e-07', '7.2845e-07', '8.8623e-07', '-8.2636e+01', '4.5098e+01', '-1.0626e+01']
['7', 'He', '5.4450e-07', '4.6373e-07', '6.2270e-07', '1.5889e+02', '2.5858e+02', '-1.5150e+02']
['8', 'He', '7.9322e-07', '9.4700e-07', '3.5194e-08', '-1.9703e+02', '1.5674e+02', '-1.8520e+02']
['9', 'Ar', '2.7797e-07', '1.6487e-07', '8.2403e-07', '-3.8650e+01', '-6.9632e+02', '2.1642e+02']
['10', 'He', '1.1842e-07', '6.3244e-07', '5.0958e-07', '-1.4963e+02', '4.2288e+02', '-7.6309e+01']
['11', 'Ar', '2.0359e-07', '8.3369e-07', '9.6348e-07', '4.8457e+02', '-2.6741e+02', '-3.5254e+02']
['12', 'He', '5.1019e-07', '2.2470e-07', '2.3846e-08', '-2.3192e+02', '-9.9510e+01', '3.2770e+01']
['13', 'Ar', '3.5383e-07', '8.4581e-07', '7.2340e-07', '-3.0395e+02', '4.7316e+01', '2.2253e+02']
['14', 'He', '3.8515e-07', '2.8940e-07', '5.6028e-07', '2.3308e+02', '2.5418e+02', '4.2983e+02']
['15', 'He', '1.5842e-07', '9.8225e-07', '5.7859e-07', '1.9963e+02', '2.0311e+02', '-4.2560e+02']
['16', 'He', '3.6831e-07', '7.6520e-07', '2.9884e-07', '6.6341e+01', '2.2232e+02', '-9.7653e+01']
['17', 'He', '2.8696e-07', '1.5129e-07', '6.4060e-07', '9.0358e+01', '-6.7459e+01', '-6.4782e+01']
['18', 'He', '1.0325e-07', '9.9012e-07', '3.4381e-07', '7.1108e+01', '1.1060e+01', '1.5912e+01']
['19', 'Ar', '4.3929e-07', '7.5363e-07', '9.9974e-07', '2.3919e+02', '1.7383e+02', '3.3529e+02']
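The hard-coded number of rows to skip breaks as soon as a file has more or fewer comment lines. A possible alternative, sketched here on a shortened stand-in for md_result.csv, is to filter out the comment lines before they reach the reader. The helper name skip_comments is hypothetical, and the sketch assumes that no data line starts with #.

```python
import csv
import io

# Shortened stand-in for md_result.csv (assumed layout: '#' comments, then a header line).
sample = """\
# Step: 0
# Time: 0 s
atom_id type x-position
0 He 5.7222e-07
1 He 9.7710e-07
"""


def skip_comments(lines, comment_char='#'):
    """Yield only the lines that do not start with the comment character."""
    for line in lines:
        if not line.startswith(comment_char):
            yield line


with io.StringIO(sample) as f:
    # csv.reader accepts any iterable of strings, not only file objects
    csv_reader = csv.reader(skip_comments(f), delimiter=' ')
    header = next(csv_reader)  # the first non-comment line holds the field names
    rows = list(csv_reader)

print(header)  # ['atom_id', 'type', 'x-position']
print(rows)    # [['0', 'He', '5.7222e-07'], ['1', 'He', '9.7710e-07']]
```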

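But as the rows printed by csv.reader show, all numbers are read as strings. This is because csv files do not store any type information. A quick hack could be: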

import csv

number_of_rows_to_skip = 12  # 11 comment lines + 1 line with the field names
possible_types = (int, float, str)

with open('md_result.csv', 'r', newline='') as f:
    # skip the first rows
    for _ in range(number_of_rows_to_skip):
        next(f)

    csv_reader = csv.reader(f, delimiter=' ')
    for row in csv_reader:
        for i, entry in enumerate(row):
            for possible_type in possible_types:
                try:
                    entry = possible_type(entry)
                except ValueError:
                    # this type does not fit, try the next one;
                    # any other exception propagates on its own
                    continue
                else:
                    row[i] = entry
                    break
        print(row)

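Here we define the order of the types to try: in this example we first check whether the entry can be converted to an integer, then to a float, and finally to a string. The order matters, because float() also accepts strings such as '0', so int has to be tried first. If a conversion succeeds, we set the entry of the row to the new value and leave the loop over the types. Now the output is close to what we want.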

[0, 'He', 5.7222e-07, 4.8811e-09, 2.0415e-07, -29.245, 100.45, 128.28]
[1, 'He', 9.771e-07, 3.6371e-07, 4.7311e-07, -199.26, 232.75, -534.38]
[2, 'Ar', 6.4989e-07, 6.7873e-07, 9.5e-07, -1.5592, -378.76, 84.091]
[3, 'Ar', 5.9024e-08, 3.7138e-07, 7.3455e-08, 342.82, 156.82, -38.991]
[4, 'He', 7.6746e-07, 8.3017e-08, 4.852e-07, -30.45, -379.75, -336.32]
[5, 'Ar', 1.7226e-07, 4.6023e-07, 4.7356e-08, -311.51, -429.39, -694.74]
[6, 'Ar', 9.6394e-07, 7.2845e-07, 8.8623e-07, -82.636, 45.098, -10.626]
[7, 'He', 5.445e-07, 4.6373e-07, 6.227e-07, 158.89, 258.58, -151.5]
[8, 'He', 7.9322e-07, 9.47e-07, 3.5194e-08, -197.03, 156.74, -185.2]
[9, 'Ar', 2.7797e-07, 1.6487e-07, 8.2403e-07, -38.65, -696.32, 216.42]
[10, 'He', 1.1842e-07, 6.3244e-07, 5.0958e-07, -149.63, 422.88, -76.309]
[11, 'Ar', 2.0359e-07, 8.3369e-07, 9.6348e-07, 484.57, -267.41, -352.54]
[12, 'He', 5.1019e-07, 2.247e-07, 2.3846e-08, -231.92, -99.51, 32.77]
[13, 'Ar', 3.5383e-07, 8.4581e-07, 7.234e-07, -303.95, 47.316, 222.53]
[14, 'He', 3.8515e-07, 2.894e-07, 5.6028e-07, 233.08, 254.18, 429.83]
[15, 'He', 1.5842e-07, 9.8225e-07, 5.7859e-07, 199.63, 203.11, -425.6]
[16, 'He', 3.6831e-07, 7.652e-07, 2.9884e-07, 66.341, 222.32, -97.653]
[17, 'He', 2.8696e-07, 1.5129e-07, 6.406e-07, 90.358, -67.459, -64.782]
[18, 'He', 1.0325e-07, 9.9012e-07, 3.4381e-07, 71.108, 11.06, 15.912]
[19, 'Ar', 4.3929e-07, 7.5363e-07, 9.9974e-07, 239.19, 173.83, 335.29]
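The casting loop can also be packed into a small reusable function. This is only a sketch of the same idea; the name cast_entry is hypothetical and not part of the csv module.

```python
def cast_entry(entry, possible_types=(int, float, str)):
    """Return entry converted by the first type in possible_types that accepts it."""
    for possible_type in possible_types:
        try:
            return possible_type(entry)
        except ValueError:
            continue
    # no type matched (only possible if str is not in possible_types)
    return entry


row = ['0', 'He', '5.7222e-07', '-2.9245e+01']
converted = [cast_entry(entry) for entry in row]
print(converted)  # [0, 'He', 5.7222e-07, -29.245]
```

Keep in mind that this kind of guessing has side effects: a text field that happens to read nan or inf is accepted by float() and therefore becomes a float.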

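However, programming with this still requires you to know exactly which field index corresponds to which quantity. In addition, the layout may differ from file to file, so hard-coded indices can lead to wrong results. It would be nicer if we could access the fields by name, e.g. row['atom_id'] to get the ID of a record. This is where csv.DictReader comes in.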

>>> import csv
>>> number_of_rows_to_skip = 11  # comment lines only; DictReader needs the header line
>>> with open('md_result.csv', 'r', newline='') as f:
...     # skip the first rows
...     for _ in range(number_of_rows_to_skip):
...         next(f)
...
...     csv_reader = csv.DictReader(f, delimiter=' ')
...     for row in csv_reader:
...         print(row)
...
OrderedDict([('atom_id', '0'), ('type', 'He'), ('x-position', '5.7222e-07'), ('y-position', '4.8811e-09'), ('z-position', '2.0415e-07'), ('x-velocity', '-2.9245e+01'), ('y-velocity', '1.0045e+02'), ('z-velocity', '1.2828e+02')])
OrderedDict([('atom_id', '1'), ('type', 'He'), ('x-position', '9.7710e-07'), ('y-position', '3.6371e-07'), ('z-position', '4.7311e-07'), ('x-velocity', '-1.9926e+02'), ('y-velocity', '2.3275e+02'), ('z-velocity', '-5.3438e+02')])
OrderedDict([('atom_id', '2'), ('type', 'Ar'), ('x-position', '6.4989e-07'), ('y-position', '6.7873e-07'), ('z-position', '9.5000e-07'), ('x-velocity', '-1.5592e+00'), ('y-velocity', '-3.7876e+02'), ('z-velocity', '8.4091e+01')])
OrderedDict([('atom_id', '3'), ('type', 'Ar'), ('x-position', '5.9024e-08'), ('y-position', '3.7138e-07'), ('z-position', '7.3455e-08'), ('x-velocity', '3.4282e+02'), ('y-velocity', '1.5682e+02'), ('z-velocity', '-3.8991e+01')])
OrderedDict([('atom_id', '4'), ('type', 'He'), ('x-position', '7.6746e-07'), ('y-position', '8.3017e-08'), ('z-position', '4.8520e-07'), ('x-velocity', '-3.0450e+01'), ('y-velocity', '-3.7975e+02'), ('z-velocity', '-3.3632e+02')])
OrderedDict([('atom_id', '5'), ('type', 'Ar'), ('x-position', '1.7226e-07'), ('y-position', '4.6023e-07'), ('z-position', '4.7356e-08'), ('x-velocity', '-3.1151e+02'), ('y-velocity', '-4.2939e+02'), ('z-velocity', '-6.9474e+02')])
OrderedDict([('atom_id', '6'), ('type', 'Ar'), ('x-position', '9.6394e-07'), ('y-position', '7.2845e-07'), ('z-position', '8.8623e-07'), ('x-velocity', '-8.2636e+01'), ('y-velocity', '4.5098e+01'), ('z-velocity', '-1.0626e+01')])
OrderedDict([('atom_id', '7'), ('type', 'He'), ('x-position', '5.4450e-07'), ('y-position', '4.6373e-07'), ('z-position', '6.2270e-07'), ('x-velocity', '1.5889e+02'), ('y-velocity', '2.5858e+02'), ('z-velocity', '-1.5150e+02')])
OrderedDict([('atom_id', '8'), ('type', 'He'), ('x-position', '7.9322e-07'), ('y-position', '9.4700e-07'), ('z-position', '3.5194e-08'), ('x-velocity', '-1.9703e+02'), ('y-velocity', '1.5674e+02'), ('z-velocity', '-1.8520e+02')])
OrderedDict([('atom_id', '9'), ('type', 'Ar'), ('x-position', '2.7797e-07'), ('y-position', '1.6487e-07'), ('z-position', '8.2403e-07'), ('x-velocity', '-3.8650e+01'), ('y-velocity', '-6.9632e+02'), ('z-velocity', '2.1642e+02')])
OrderedDict([('atom_id', '10'), ('type', 'He'), ('x-position', '1.1842e-07'), ('y-position', '6.3244e-07'), ('z-position', '5.0958e-07'), ('x-velocity', '-1.4963e+02'), ('y-velocity', '4.2288e+02'), ('z-velocity', '-7.6309e+01')])
OrderedDict([('atom_id', '11'), ('type', 'Ar'), ('x-position', '2.0359e-07'), ('y-position', '8.3369e-07'), ('z-position', '9.6348e-07'), ('x-velocity', '4.8457e+02'), ('y-velocity', '-2.6741e+02'), ('z-velocity', '-3.5254e+02')])
OrderedDict([('atom_id', '12'), ('type', 'He'), ('x-position', '5.1019e-07'), ('y-position', '2.2470e-07'), ('z-position', '2.3846e-08'), ('x-velocity', '-2.3192e+02'), ('y-velocity', '-9.9510e+01'), ('z-velocity', '3.2770e+01')])
OrderedDict([('atom_id', '13'), ('type', 'Ar'), ('x-position', '3.5383e-07'), ('y-position', '8.4581e-07'), ('z-position', '7.2340e-07'), ('x-velocity', '-3.0395e+02'), ('y-velocity', '4.7316e+01'), ('z-velocity', '2.2253e+02')])
OrderedDict([('atom_id', '14'), ('type', 'He'), ('x-position', '3.8515e-07'), ('y-position', '2.8940e-07'), ('z-position', '5.6028e-07'), ('x-velocity', '2.3308e+02'), ('y-velocity', '2.5418e+02'), ('z-velocity', '4.2983e+02')])
OrderedDict([('atom_id', '15'), ('type', 'He'), ('x-position', '1.5842e-07'), ('y-position', '9.8225e-07'), ('z-position', '5.7859e-07'), ('x-velocity', '1.9963e+02'), ('y-velocity', '2.0311e+02'), ('z-velocity', '-4.2560e+02')])
OrderedDict([('atom_id', '16'), ('type', 'He'), ('x-position', '3.6831e-07'), ('y-position', '7.6520e-07'), ('z-position', '2.9884e-07'), ('x-velocity', '6.6341e+01'), ('y-velocity', '2.2232e+02'), ('z-velocity', '-9.7653e+01')])
OrderedDict([('atom_id', '17'), ('type', 'He'), ('x-position', '2.8696e-07'), ('y-position', '1.5129e-07'), ('z-position', '6.4060e-07'), ('x-velocity', '9.0358e+01'), ('y-velocity', '-6.7459e+01'), ('z-velocity', '-6.4782e+01')])
OrderedDict([('atom_id', '18'), ('type', 'He'), ('x-position', '1.0325e-07'), ('y-position', '9.9012e-07'), ('z-position', '3.4381e-07'), ('x-velocity', '7.1108e+01'), ('y-velocity', '1.1060e+01'), ('z-velocity', '1.5912e+01')])
OrderedDict([('atom_id', '19'), ('type', 'Ar'), ('x-position', '4.3929e-07'), ('y-position', '7.5363e-07'), ('z-position', '9.9974e-07'), ('x-velocity', '2.3919e+02'), ('y-velocity', '1.7383e+02'), ('z-velocity', '3.3529e+02')])

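Note

DictReader returns an OrderedDict only in Python 3.6 and 3.7. Before Python 3.6, and again since Python 3.8, it returns a regular dict instead.

Now that the fields are in an OrderedDict, the routine for casting the field entries is slightly different: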

>>> possible_types = (int, float, str)
>>> number_of_rows_to_skip = 11  # comment lines only; DictReader needs the header line
>>> with open('md_result.csv', 'r', newline='') as f:
...     # skip the first rows
...     for _ in range(number_of_rows_to_skip):
...         next(f)
...
...     csv_reader = csv.DictReader(f, delimiter=' ')
...     for row in csv_reader:
...         for key, entry in row.items():
...             for possible_type in possible_types:
...                 try:
...                     entry = possible_type(entry)
...                 except ValueError:
...                     continue
...                 else:
...                     row[key] = entry
...                     break
...         print(row)
...
OrderedDict([('atom_id', 0), ('type', 'He'), ('x-position', 5.7222e-07), ('y-position', 4.8811e-09), ('z-position', 2.0415e-07), ('x-velocity', -29.245), ('y-velocity', 100.45), ('z-velocity', 128.28)])
OrderedDict([('atom_id', 1), ('type', 'He'), ('x-position', 9.771e-07), ('y-position', 3.6371e-07), ('z-position', 4.7311e-07), ('x-velocity', -199.26), ('y-velocity', 232.75), ('z-velocity', -534.38)])
OrderedDict([('atom_id', 2), ('type', 'Ar'), ('x-position', 6.4989e-07), ('y-position', 6.7873e-07), ('z-position', 9.5e-07), ('x-velocity', -1.5592), ('y-velocity', -378.76), ('z-velocity', 84.091)])
OrderedDict([('atom_id', 3), ('type', 'Ar'), ('x-position', 5.9024e-08), ('y-position', 3.7138e-07), ('z-position', 7.3455e-08), ('x-velocity', 342.82), ('y-velocity', 156.82), ('z-velocity', -38.991)])
OrderedDict([('atom_id', 4), ('type', 'He'), ('x-position', 7.6746e-07), ('y-position', 8.3017e-08), ('z-position', 4.852e-07), ('x-velocity', -30.45), ('y-velocity', -379.75), ('z-velocity', -336.32)])
OrderedDict([('atom_id', 5), ('type', 'Ar'), ('x-position', 1.7226e-07), ('y-position', 4.6023e-07), ('z-position', 4.7356e-08), ('x-velocity', -311.51), ('y-velocity', -429.39), ('z-velocity', -694.74)])
OrderedDict([('atom_id', 6), ('type', 'Ar'), ('x-position', 9.6394e-07), ('y-position', 7.2845e-07), ('z-position', 8.8623e-07), ('x-velocity', -82.636), ('y-velocity', 45.098), ('z-velocity', -10.626)])
OrderedDict([('atom_id', 7), ('type', 'He'), ('x-position', 5.445e-07), ('y-position', 4.6373e-07), ('z-position', 6.227e-07), ('x-velocity', 158.89), ('y-velocity', 258.58), ('z-velocity', -151.5)])
OrderedDict([('atom_id', 8), ('type', 'He'), ('x-position', 7.9322e-07), ('y-position', 9.47e-07), ('z-position', 3.5194e-08), ('x-velocity', -197.03), ('y-velocity', 156.74), ('z-velocity', -185.2)])
OrderedDict([('atom_id', 9), ('type', 'Ar'), ('x-position', 2.7797e-07), ('y-position', 1.6487e-07), ('z-position', 8.2403e-07), ('x-velocity', -38.65), ('y-velocity', -696.32), ('z-velocity', 216.42)])
OrderedDict([('atom_id', 10), ('type', 'He'), ('x-position', 1.1842e-07), ('y-position', 6.3244e-07), ('z-position', 5.0958e-07), ('x-velocity', -149.63), ('y-velocity', 422.88), ('z-velocity', -76.309)])
OrderedDict([('atom_id', 11), ('type', 'Ar'), ('x-position', 2.0359e-07), ('y-position', 8.3369e-07), ('z-position', 9.6348e-07), ('x-velocity', 484.57), ('y-velocity', -267.41), ('z-velocity', -352.54)])
OrderedDict([('atom_id', 12), ('type', 'He'), ('x-position', 5.1019e-07), ('y-position', 2.247e-07), ('z-position', 2.3846e-08), ('x-velocity', -231.92), ('y-velocity', -99.51), ('z-velocity', 32.77)])
OrderedDict([('atom_id', 13), ('type', 'Ar'), ('x-position', 3.5383e-07), ('y-position', 8.4581e-07), ('z-position', 7.234e-07), ('x-velocity', -303.95), ('y-velocity', 47.316), ('z-velocity', 222.53)])
OrderedDict([('atom_id', 14), ('type', 'He'), ('x-position', 3.8515e-07), ('y-position', 2.894e-07), ('z-position', 5.6028e-07), ('x-velocity', 233.08), ('y-velocity', 254.18), ('z-velocity', 429.83)])
OrderedDict([('atom_id', 15), ('type', 'He'), ('x-position', 1.5842e-07), ('y-position', 9.8225e-07), ('z-position', 5.7859e-07), ('x-velocity', 199.63), ('y-velocity', 203.11), ('z-velocity', -425.6)])
OrderedDict([('atom_id', 16), ('type', 'He'), ('x-position', 3.6831e-07), ('y-position', 7.652e-07), ('z-position', 2.9884e-07), ('x-velocity', 66.341), ('y-velocity', 222.32), ('z-velocity', -97.653)])
OrderedDict([('atom_id', 17), ('type', 'He'), ('x-position', 2.8696e-07), ('y-position', 1.5129e-07), ('z-position', 6.406e-07), ('x-velocity', 90.358), ('y-velocity', -67.459), ('z-velocity', -64.782)])
OrderedDict([('atom_id', 18), ('type', 'He'), ('x-position', 1.0325e-07), ('y-position', 9.9012e-07), ('z-position', 3.4381e-07), ('x-velocity', 71.108), ('y-velocity', 11.06), ('z-velocity', 15.912)])
OrderedDict([('atom_id', 19), ('type', 'Ar'), ('x-position', 4.3929e-07), ('y-position', 7.5363e-07), ('z-position', 9.9974e-07), ('x-velocity', 239.19), ('y-velocity', 173.83), ('z-velocity', 335.29)])

As long as the field names in the files are consistent, you can write code that requires less maintenance.
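A small sketch may make the maintenance argument more concrete. The two sample layouts below are invented for this illustration and differ only in their column order; the hypothetical helper atom_types reads both without any change because it addresses the field by name.

```python
import csv
import io


def atom_types(text):
    """Return the 'type' field of every record, independent of the column order."""
    reader = csv.DictReader(io.StringIO(text), delimiter=' ')
    return [row['type'] for row in reader]


# Same records, different column order (hypothetical layouts).
layout_a = "atom_id type x-velocity\n0 He -2.9245e+01\n1 He -1.9926e+02\n2 Ar -1.5592e+00\n"
layout_b = "type x-velocity atom_id\nHe -2.9245e+01 0\nHe -1.9926e+02 1\nAr -1.5592e+00 2\n"

print(atom_types(layout_a))  # ['He', 'He', 'Ar']
print(atom_types(layout_b))  # ['He', 'He', 'Ar']
```

With index-based access, row[1] would return the atom type for the first layout but the velocity for the second.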

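Another way to read csv files is the loadtxt() function of NumPy. If you specify the data type as described in Structured arrays, the type conversion is done for you while the dictionary-like access by field name is retained. You can also specify the comment character to be ignored and the number of rows to skip: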

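import numpy as np

csv_dtype = [
    ('atom_id', np.int32),
    ('type', np.bytes_, 2),  # np.string_ was removed in NumPy 2.0; np.bytes_ is the same type
    ('position', np.float64, 3),
    ('velocity', np.float64, 3)
]
with open('md_result.csv', 'r') as f:
    # skip the 11 comment lines and the line with the field names
    md_data = np.loadtxt(f, dtype=csv_dtype, skiprows=12)
print(md_data)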
[ ( 0, b'He', [  5.72220000e-07,   4.88110000e-09,   2.04150000e-07], [ -29.245 ,  100.45  ,  128.28  ])
 ( 1, b'He', [  9.77100000e-07,   3.63710000e-07,   4.73110000e-07], [-199.26  ,  232.75  , -534.38  ])
 ( 2, b'Ar', [  6.49890000e-07,   6.78730000e-07,   9.50000000e-07], [  -1.5592, -378.76  ,   84.091 ])
 ( 3, b'Ar', [  5.90240000e-08,   3.71380000e-07,   7.34550000e-08], [ 342.82  ,  156.82  ,  -38.991 ])
 ( 4, b'He', [  7.67460000e-07,   8.30170000e-08,   4.85200000e-07], [ -30.45  , -379.75  , -336.32  ])
 ( 5, b'Ar', [  1.72260000e-07,   4.60230000e-07,   4.73560000e-08], [-311.51  , -429.39  , -694.74  ])
 ( 6, b'Ar', [  9.63940000e-07,   7.28450000e-07,   8.86230000e-07], [ -82.636 ,   45.098 ,  -10.626 ])
 ( 7, b'He', [  5.44500000e-07,   4.63730000e-07,   6.22700000e-07], [ 158.89  ,  258.58  , -151.5   ])
 ( 8, b'He', [  7.93220000e-07,   9.47000000e-07,   3.51940000e-08], [-197.03  ,  156.74  , -185.2   ])
 ( 9, b'Ar', [  2.77970000e-07,   1.64870000e-07,   8.24030000e-07], [ -38.65  , -696.32  ,  216.42  ])
 (10, b'He', [  1.18420000e-07,   6.32440000e-07,   5.09580000e-07], [-149.63  ,  422.88  ,  -76.309 ])
 (11, b'Ar', [  2.03590000e-07,   8.33690000e-07,   9.63480000e-07], [ 484.57  , -267.41  , -352.54  ])
 (12, b'He', [  5.10190000e-07,   2.24700000e-07,   2.38460000e-08], [-231.92  ,  -99.51  ,   32.77  ])
 (13, b'Ar', [  3.53830000e-07,   8.45810000e-07,   7.23400000e-07], [-303.95  ,   47.316 ,  222.53  ])
 (14, b'He', [  3.85150000e-07,   2.89400000e-07,   5.60280000e-07], [ 233.08  ,  254.18  ,  429.83  ])
 (15, b'He', [  1.58420000e-07,   9.82250000e-07,   5.78590000e-07], [ 199.63  ,  203.11  , -425.6   ])
 (16, b'He', [  3.68310000e-07,   7.65200000e-07,   2.98840000e-07], [  66.341 ,  222.32  ,  -97.653 ])
 (17, b'He', [  2.86960000e-07,   1.51290000e-07,   6.40600000e-07], [  90.358 ,  -67.459 ,  -64.782 ])
 (18, b'He', [  1.03250000e-07,   9.90120000e-07,   3.43810000e-07], [  71.108 ,   11.06  ,   15.912 ])
 (19, b'Ar', [  4.39290000e-07,   7.53630000e-07,   9.99740000e-07], [ 239.19  ,  173.83  ,  335.29  ])]
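The following sketch shows the comment character and the dictionary-like field access on a shortened stand-in for md_result.csv. It makes two assumptions that differ from the example above: the data is read from an in-memory string, and the atom type is stored as a unicode string ('U2') so that it prints without the b prefix.

```python
import io

import numpy as np

# Shortened stand-in for md_result.csv: 2 comment lines, 1 header line, 2 records.
sample = """\
# Step: 0
# Time: 0 s
atom_id type x-position y-position z-position x-velocity y-velocity z-velocity
0 He 5.7222e-07 4.8811e-09 2.0415e-07 -2.9245e+01 1.0045e+02 1.2828e+02
2 Ar 6.4989e-07 6.7873e-07 9.5000e-07 -1.5592e+00 -3.7876e+02 8.4091e+01
"""

csv_dtype = [
    ('atom_id', np.int32),
    ('type', 'U2'),               # unicode instead of bytes (assumption of this sketch)
    ('position', np.float64, 3),  # three columns are collected into one field
    ('velocity', np.float64, 3),
]

# skiprows counts comment lines as well, so 3 rows have to be skipped here
md_data = np.loadtxt(io.StringIO(sample), dtype=csv_dtype, comments='#', skiprows=3)

print(md_data['type'])            # ['He' 'Ar']
print(md_data['velocity'].shape)  # (2, 3)
print(md_data[0]['atom_id'])      # 0
```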

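This makes the data very convenient to use. For example, the speed of each atom, i.e. the magnitude of its velocity vector, can easily be computed as follows: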

print(np.linalg.norm(md_data['velocity'], axis=1))

with the output

[ 165.53317168  615.98627785  387.98565049  378.99652094  508.18147093
  874.11550713   94.73885146  339.20775124  312.54965765  730.20064455
  455.01614782  656.20167982  254.48575481  379.66301803  551.07856763
  512.09488281  251.72091508  130.04611632   73.70117372  447.04374406]
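Field access combines well with boolean masks. As a sketch, the mean speed per atom type could be computed as shown below. To stay self-contained, the example builds a small structured array by hand from three of the records above instead of reading the file, and it stores the type as a unicode string, which is an assumption of this sketch.

```python
import numpy as np

md_dtype = [('type', 'U2'), ('velocity', np.float64, 3)]
# Three records taken from the table above (atoms 0, 2 and 1).
md_data = np.array(
    [
        ('He', (-29.245, 100.45, 128.28)),
        ('Ar', (-1.5592, -378.76, 84.091)),
        ('He', (-199.26, 232.75, -534.38)),
    ],
    dtype=md_dtype,
)

# magnitude of each velocity vector
speed = np.linalg.norm(md_data['velocity'], axis=1)

for atom_type in np.unique(md_data['type']):
    mask = md_data['type'] == atom_type  # selects the atoms of one type
    print(atom_type, speed[mask].mean())
```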