Comma-separated values

File extension: .csv

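We probably know comma-separated values (CSV) files mainly from working with people who use spreadsheet software such as Microsoft Excel. Because the native file formats of such software can be cumbersome to read, we kindly ask for these spreadsheets to be exported as CSV.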

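In general, a CSV file is a plain-text file that holds one record per line, and each record consists of several fields separated by a delimiter. The order of the fields is the same for every record.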

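Note that CSV files are not standardized, so many different flavors exist. For example, the delimiter does not have to be a comma; it can also be a simple space. If the delimiter is part of a field value (for example, inside a string), it has to be quoted or escaped, and how this is done depends on the flavor. Comments are also handled differently by each implementation: some allow comments in the header, introduced by #, while others do not support them at all.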
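To make the point about flavors more concrete, here is a minimal sketch using the csv module from the Python standard library. The record and the in-memory buffers are made up for illustration. It writes the same record twice: once with the default dialect, which quotes a field that contains the delimiter, and once with quoting switched off, so that the delimiter is escaped with a backslash instead.

```python
import csv
import io

# Hypothetical record whose second field contains the delimiter (a comma)
row = ['0', 'Helium, gas', '5.7222e-07']

# Flavor 1: default dialect, fields containing the delimiter are quoted
quoted = io.StringIO(newline='')
csv.writer(quoted).writerow(row)

# Flavor 2: no quoting, the delimiter is escaped with a backslash instead
escaped = io.StringIO(newline='')
csv.writer(escaped, quoting=csv.QUOTE_NONE, escapechar='\\').writerow(row)

print(repr(quoted.getvalue()))
print(repr(escaped.getvalue()))

# Reading back recovers the record if the reader uses matching settings
quoted.seek(0)
round_trip = next(csv.reader(quoted))
print(round_trip == row)
```

A reader configured for one flavor will generally misread a file written in the other, which is why the flavor should be stated alongside the file.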

CSV files are therefore fine for very simple data, but they may not be suitable for complex data.

Let us take a look at a CSV file that holds the result of an atomistic simulation:

with open('md_result.csv', 'r') as f:
    print(f.read())

# Atom types:
# - He: Helium
# - Ar: Argon
# Potentials:
# - He: Lennard-Jones potential with epsilon/k_B = 10.22 K, sigma = 256 pm
# - Ar: Lennard-Jones potential with epsilon/k_B = 120 K, sigma = 341 pm
# Simulation box size: 100 µm x 200 µm x 300 µm
# Periodic boundary conditions in all directions
# Step: 0
# Time: 0 s
# All quantities are given in SI units
atom_id type x-position y-position z-position x-velocity y-velocity z-velocity
0 He 5.7222e-07 4.8811e-09 2.0415e-07 -2.9245e+01 1.0045e+02 1.2828e+02
1 He 9.7710e-07 3.6371e-07 4.7311e-07 -1.9926e+02 2.3275e+02 -5.3438e+02
2 Ar 6.4989e-07 6.7873e-07 9.5000e-07 -1.5592e+00 -3.7876e+02 8.4091e+01
3 Ar 5.9024e-08 3.7138e-07 7.3455e-08 3.4282e+02 1.5682e+02 -3.8991e+01
4 He 7.6746e-07 8.3017e-08 4.8520e-07 -3.0450e+01 -3.7975e+02 -3.3632e+02
5 Ar 1.7226e-07 4.6023e-07 4.7356e-08 -3.1151e+02 -4.2939e+02 -6.9474e+02
6 Ar 9.6394e-07 7.2845e-07 8.8623e-07 -8.2636e+01 4.5098e+01 -1.0626e+01
7 He 5.4450e-07 4.6373e-07 6.2270e-07 1.5889e+02 2.5858e+02 -1.5150e+02
8 He 7.9322e-07 9.4700e-07 3.5194e-08 -1.9703e+02 1.5674e+02 -1.8520e+02
9 Ar 2.7797e-07 1.6487e-07 8.2403e-07 -3.8650e+01 -6.9632e+02 2.1642e+02
10 He 1.1842e-07 6.3244e-07 5.0958e-07 -1.4963e+02 4.2288e+02 -7.6309e+01
11 Ar 2.0359e-07 8.3369e-07 9.6348e-07 4.8457e+02 -2.6741e+02 -3.5254e+02
12 He 5.1019e-07 2.2470e-07 2.3846e-08 -2.3192e+02 -9.9510e+01 3.2770e+01
13 Ar 3.5383e-07 8.4581e-07 7.2340e-07 -3.0395e+02 4.7316e+01 2.2253e+02
14 He 3.8515e-07 2.8940e-07 5.6028e-07 2.3308e+02 2.5418e+02 4.2983e+02
15 He 1.5842e-07 9.8225e-07 5.7859e-07 1.9963e+02 2.0311e+02 -4.2560e+02
16 He 3.6831e-07 7.6520e-07 2.9884e-07 6.6341e+01 2.2232e+02 -9.7653e+01
17 He 2.8696e-07 1.5129e-07 6.4060e-07 9.0358e+01 -6.7459e+01 -6.4782e+01
18 He 1.0325e-07 9.9012e-07 3.4381e-07 7.1108e+01 1.1060e+01 1.5912e+01
19 Ar 4.3929e-07 7.5363e-07 9.9974e-07 2.3919e+02 1.7383e+02 3.3529e+02

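Note that the first line that is not a comment contains the field names. This will become important later. Using the csv module from the Python standard library, we can read the file quite easily. Here we skip 12 lines: the 11 comment lines and the line with the field names.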

import csv

number_of_rows_to_skip = 12
with open('md_result.csv', 'r', newline='') as f:
    # skip the first rows
    for _ in range(number_of_rows_to_skip):
        next(f)

    csv_reader = csv.reader(f, delimiter=' ')
    for row in csv_reader:
        print(row)

This results in the following output:

['0', 'He', '5.7222e-07', '4.8811e-09', '2.0415e-07', '-2.9245e+01', '1.0045e+02', '1.2828e+02']
['1', 'He', '9.7710e-07', '3.6371e-07', '4.7311e-07', '-1.9926e+02', '2.3275e+02', '-5.3438e+02']
['2', 'Ar', '6.4989e-07', '6.7873e-07', '9.5000e-07', '-1.5592e+00', '-3.7876e+02', '8.4091e+01']
['3', 'Ar', '5.9024e-08', '3.7138e-07', '7.3455e-08', '3.4282e+02', '1.5682e+02', '-3.8991e+01']
['4', 'He', '7.6746e-07', '8.3017e-08', '4.8520e-07', '-3.0450e+01', '-3.7975e+02', '-3.3632e+02']
['5', 'Ar', '1.7226e-07', '4.6023e-07', '4.7356e-08', '-3.1151e+02', '-4.2939e+02', '-6.9474e+02']
['6', 'Ar', '9.6394e-07', '7.2845e-07', '8.8623e-07', '-8.2636e+01', '4.5098e+01', '-1.0626e+01']
['7', 'He', '5.4450e-07', '4.6373e-07', '6.2270e-07', '1.5889e+02', '2.5858e+02', '-1.5150e+02']
['8', 'He', '7.9322e-07', '9.4700e-07', '3.5194e-08', '-1.9703e+02', '1.5674e+02', '-1.8520e+02']
['9', 'Ar', '2.7797e-07', '1.6487e-07', '8.2403e-07', '-3.8650e+01', '-6.9632e+02', '2.1642e+02']
['10', 'He', '1.1842e-07', '6.3244e-07', '5.0958e-07', '-1.4963e+02', '4.2288e+02', '-7.6309e+01']
['11', 'Ar', '2.0359e-07', '8.3369e-07', '9.6348e-07', '4.8457e+02', '-2.6741e+02', '-3.5254e+02']
['12', 'He', '5.1019e-07', '2.2470e-07', '2.3846e-08', '-2.3192e+02', '-9.9510e+01', '3.2770e+01']
['13', 'Ar', '3.5383e-07', '8.4581e-07', '7.2340e-07', '-3.0395e+02', '4.7316e+01', '2.2253e+02']
['14', 'He', '3.8515e-07', '2.8940e-07', '5.6028e-07', '2.3308e+02', '2.5418e+02', '4.2983e+02']
['15', 'He', '1.5842e-07', '9.8225e-07', '5.7859e-07', '1.9963e+02', '2.0311e+02', '-4.2560e+02']
['16', 'He', '3.6831e-07', '7.6520e-07', '2.9884e-07', '6.6341e+01', '2.2232e+02', '-9.7653e+01']
['17', 'He', '2.8696e-07', '1.5129e-07', '6.4060e-07', '9.0358e+01', '-6.7459e+01', '-6.4782e+01']
['18', 'He', '1.0325e-07', '9.9012e-07', '3.4381e-07', '7.1108e+01', '1.1060e+01', '1.5912e+01']
['19', 'Ar', '4.3929e-07', '7.5363e-07', '9.9974e-07', '2.3919e+02', '1.7383e+02', '3.3529e+02']
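A hard-coded number of rows to skip breaks as soon as the number of comment lines changes. One possible alternative, sketched here under the assumption that every comment line starts with #, is to filter the comment lines out before they reach csv.reader. The helper name skip_comments and the shortened sample data are made up for this illustration.

```python
import csv
import io

# A shortened, made-up stand-in for md_result.csv
sample = io.StringIO(
    "# Step: 0\n"
    "# Time: 0 s\n"
    "atom_id type x-position\n"
    "0 He 5.7222e-07\n"
    "1 He 9.7710e-07\n"
)


def skip_comments(lines, comment_char='#'):
    """Yield only the lines that do not start with the comment character."""
    for line in lines:
        if not line.startswith(comment_char):
            yield line


# csv.reader accepts any iterable of strings, so the generator can be passed directly
csv_reader = csv.reader(skip_comments(sample), delimiter=' ')
header = next(csv_reader)  # first non-comment line holds the field names
rows = list(csv_reader)
print(header)
print(rows)
```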

As you can see, however, all numbers are read in as strings, because CSV files do not retain type information. A quick hack might look as follows:

import csv

number_of_rows_to_skip = 12
possible_types = (int, float, str)

with open('md_result.csv', 'r', newline='') as f:
    # skip the first rows
    for _ in range(number_of_rows_to_skip):
        next(f)

    csv_reader = csv.reader(f, delimiter=' ')
    for row in csv_reader:
        for i, entry in enumerate(row):
            for possible_type in possible_types:
                try:
                    entry = possible_type(entry)
                except ValueError:
                    # conversion failed, try the next type
                    continue
                else:
                    row[i] = entry
                    break
        print(row)

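Here we define the order in which the types are tried: in this example we first check whether an entry can be converted to an integer, then to a float, and finally to a string. If a conversion succeeds, we set the entry of the row to the new value and leave the loop over the types. The output is now closer to what we want: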

[0, 'He', 5.7222e-07, 4.8811e-09, 2.0415e-07, -29.245, 100.45, 128.28]
[1, 'He', 9.771e-07, 3.6371e-07, 4.7311e-07, -199.26, 232.75, -534.38]
[2, 'Ar', 6.4989e-07, 6.7873e-07, 9.5e-07, -1.5592, -378.76, 84.091]
[3, 'Ar', 5.9024e-08, 3.7138e-07, 7.3455e-08, 342.82, 156.82, -38.991]
[4, 'He', 7.6746e-07, 8.3017e-08, 4.852e-07, -30.45, -379.75, -336.32]
[5, 'Ar', 1.7226e-07, 4.6023e-07, 4.7356e-08, -311.51, -429.39, -694.74]
[6, 'Ar', 9.6394e-07, 7.2845e-07, 8.8623e-07, -82.636, 45.098, -10.626]
[7, 'He', 5.445e-07, 4.6373e-07, 6.227e-07, 158.89, 258.58, -151.5]
[8, 'He', 7.9322e-07, 9.47e-07, 3.5194e-08, -197.03, 156.74, -185.2]
[9, 'Ar', 2.7797e-07, 1.6487e-07, 8.2403e-07, -38.65, -696.32, 216.42]
[10, 'He', 1.1842e-07, 6.3244e-07, 5.0958e-07, -149.63, 422.88, -76.309]
[11, 'Ar', 2.0359e-07, 8.3369e-07, 9.6348e-07, 484.57, -267.41, -352.54]
[12, 'He', 5.1019e-07, 2.247e-07, 2.3846e-08, -231.92, -99.51, 32.77]
[13, 'Ar', 3.5383e-07, 8.4581e-07, 7.234e-07, -303.95, 47.316, 222.53]
[14, 'He', 3.8515e-07, 2.894e-07, 5.6028e-07, 233.08, 254.18, 429.83]
[15, 'He', 1.5842e-07, 9.8225e-07, 5.7859e-07, 199.63, 203.11, -425.6]
[16, 'He', 3.6831e-07, 7.652e-07, 2.9884e-07, 66.341, 222.32, -97.653]
[17, 'He', 2.8696e-07, 1.5129e-07, 6.406e-07, 90.358, -67.459, -64.782]
[18, 'He', 1.0325e-07, 9.9012e-07, 3.4381e-07, 71.108, 11.06, 15.912]
[19, 'Ar', 4.3929e-07, 7.5363e-07, 9.9974e-07, 239.19, 173.83, 335.29]

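But programming with this still requires you to know exactly which field index corresponds to which entry. In addition, the format may differ from file to file, so hard-coded indices can lead to wrong results. It would be better if we could access the fields by name, for example row['atom_id'] to get the ID of a record. This is where csv.DictReader comes in. This time we skip only the 11 comment lines, so that DictReader can take the field names from the first line it reads.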

>>> import csv
>>> number_of_rows_to_skip = 11
>>> with open('md_result.csv', 'r', newline='') as f:
...     # skip the first rows
...     for _ in range(number_of_rows_to_skip):
...         next(f)
...
...     csv_reader = csv.DictReader(f, delimiter=' ')
...     for row in csv_reader:
...         print(row)
...
OrderedDict([('atom_id', '0'), ('type', 'He'), ('x-position', '5.7222e-07'), ('y-position', '4.8811e-09'), ('z-position', '2.0415e-07'), ('x-velocity', '-2.9245e+01'), ('y-velocity', '1.0045e+02'), ('z-velocity', '1.2828e+02')])
OrderedDict([('atom_id', '1'), ('type', 'He'), ('x-position', '9.7710e-07'), ('y-position', '3.6371e-07'), ('z-position', '4.7311e-07'), ('x-velocity', '-1.9926e+02'), ('y-velocity', '2.3275e+02'), ('z-velocity', '-5.3438e+02')])
OrderedDict([('atom_id', '2'), ('type', 'Ar'), ('x-position', '6.4989e-07'), ('y-position', '6.7873e-07'), ('z-position', '9.5000e-07'), ('x-velocity', '-1.5592e+00'), ('y-velocity', '-3.7876e+02'), ('z-velocity', '8.4091e+01')])
OrderedDict([('atom_id', '3'), ('type', 'Ar'), ('x-position', '5.9024e-08'), ('y-position', '3.7138e-07'), ('z-position', '7.3455e-08'), ('x-velocity', '3.4282e+02'), ('y-velocity', '1.5682e+02'), ('z-velocity', '-3.8991e+01')])
OrderedDict([('atom_id', '4'), ('type', 'He'), ('x-position', '7.6746e-07'), ('y-position', '8.3017e-08'), ('z-position', '4.8520e-07'), ('x-velocity', '-3.0450e+01'), ('y-velocity', '-3.7975e+02'), ('z-velocity', '-3.3632e+02')])
OrderedDict([('atom_id', '5'), ('type', 'Ar'), ('x-position', '1.7226e-07'), ('y-position', '4.6023e-07'), ('z-position', '4.7356e-08'), ('x-velocity', '-3.1151e+02'), ('y-velocity', '-4.2939e+02'), ('z-velocity', '-6.9474e+02')])
OrderedDict([('atom_id', '6'), ('type', 'Ar'), ('x-position', '9.6394e-07'), ('y-position', '7.2845e-07'), ('z-position', '8.8623e-07'), ('x-velocity', '-8.2636e+01'), ('y-velocity', '4.5098e+01'), ('z-velocity', '-1.0626e+01')])
OrderedDict([('atom_id', '7'), ('type', 'He'), ('x-position', '5.4450e-07'), ('y-position', '4.6373e-07'), ('z-position', '6.2270e-07'), ('x-velocity', '1.5889e+02'), ('y-velocity', '2.5858e+02'), ('z-velocity', '-1.5150e+02')])
OrderedDict([('atom_id', '8'), ('type', 'He'), ('x-position', '7.9322e-07'), ('y-position', '9.4700e-07'), ('z-position', '3.5194e-08'), ('x-velocity', '-1.9703e+02'), ('y-velocity', '1.5674e+02'), ('z-velocity', '-1.8520e+02')])
OrderedDict([('atom_id', '9'), ('type', 'Ar'), ('x-position', '2.7797e-07'), ('y-position', '1.6487e-07'), ('z-position', '8.2403e-07'), ('x-velocity', '-3.8650e+01'), ('y-velocity', '-6.9632e+02'), ('z-velocity', '2.1642e+02')])
OrderedDict([('atom_id', '10'), ('type', 'He'), ('x-position', '1.1842e-07'), ('y-position', '6.3244e-07'), ('z-position', '5.0958e-07'), ('x-velocity', '-1.4963e+02'), ('y-velocity', '4.2288e+02'), ('z-velocity', '-7.6309e+01')])
OrderedDict([('atom_id', '11'), ('type', 'Ar'), ('x-position', '2.0359e-07'), ('y-position', '8.3369e-07'), ('z-position', '9.6348e-07'), ('x-velocity', '4.8457e+02'), ('y-velocity', '-2.6741e+02'), ('z-velocity', '-3.5254e+02')])
OrderedDict([('atom_id', '12'), ('type', 'He'), ('x-position', '5.1019e-07'), ('y-position', '2.2470e-07'), ('z-position', '2.3846e-08'), ('x-velocity', '-2.3192e+02'), ('y-velocity', '-9.9510e+01'), ('z-velocity', '3.2770e+01')])
OrderedDict([('atom_id', '13'), ('type', 'Ar'), ('x-position', '3.5383e-07'), ('y-position', '8.4581e-07'), ('z-position', '7.2340e-07'), ('x-velocity', '-3.0395e+02'), ('y-velocity', '4.7316e+01'), ('z-velocity', '2.2253e+02')])
OrderedDict([('atom_id', '14'), ('type', 'He'), ('x-position', '3.8515e-07'), ('y-position', '2.8940e-07'), ('z-position', '5.6028e-07'), ('x-velocity', '2.3308e+02'), ('y-velocity', '2.5418e+02'), ('z-velocity', '4.2983e+02')])
OrderedDict([('atom_id', '15'), ('type', 'He'), ('x-position', '1.5842e-07'), ('y-position', '9.8225e-07'), ('z-position', '5.7859e-07'), ('x-velocity', '1.9963e+02'), ('y-velocity', '2.0311e+02'), ('z-velocity', '-4.2560e+02')])
OrderedDict([('atom_id', '16'), ('type', 'He'), ('x-position', '3.6831e-07'), ('y-position', '7.6520e-07'), ('z-position', '2.9884e-07'), ('x-velocity', '6.6341e+01'), ('y-velocity', '2.2232e+02'), ('z-velocity', '-9.7653e+01')])
OrderedDict([('atom_id', '17'), ('type', 'He'), ('x-position', '2.8696e-07'), ('y-position', '1.5129e-07'), ('z-position', '6.4060e-07'), ('x-velocity', '9.0358e+01'), ('y-velocity', '-6.7459e+01'), ('z-velocity', '-6.4782e+01')])
OrderedDict([('atom_id', '18'), ('type', 'He'), ('x-position', '1.0325e-07'), ('y-position', '9.9012e-07'), ('z-position', '3.4381e-07'), ('x-velocity', '7.1108e+01'), ('y-velocity', '1.1060e+01'), ('z-velocity', '1.5912e+01')])
OrderedDict([('atom_id', '19'), ('type', 'Ar'), ('x-position', '4.3929e-07'), ('y-position', '7.5363e-07'), ('z-position', '9.9974e-07'), ('x-velocity', '2.3919e+02'), ('y-velocity', '1.7383e+02'), ('z-velocity', '3.3529e+02')])

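Note

The output above was produced with Python 3.6 or 3.7, where DictReader returns an OrderedDict for each row. In older versions and since Python 3.8, DictReader returns a regular dict instead of its ordered variant, OrderedDict.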

Now that the fields are stored in an OrderedDict, the routine for converting the field entries is slightly different:

>>> possible_types = (int, float, str)
>>> number_of_rows_to_skip = 11
>>> with open('md_result.csv', 'r', newline='') as f:
...     # skip the first rows
...     for _ in range(number_of_rows_to_skip):
...         next(f)
...
...     csv_reader = csv.DictReader(f, delimiter=' ')
...     for row in csv_reader:
...         for key, entry in row.items():
...             for possible_type in possible_types:
...                 try:
...                     entry = possible_type(entry)
...                 except ValueError:
...                     # conversion failed, try the next type
...                     continue
...                 else:
...                     row[key] = entry
...                     break
...         print(row)
...
OrderedDict([('atom_id', 0), ('type', 'He'), ('x-position', 5.7222e-07), ('y-position', 4.8811e-09), ('z-position', 2.0415e-07), ('x-velocity', -29.245), ('y-velocity', 100.45), ('z-velocity', 128.28)])
OrderedDict([('atom_id', 1), ('type', 'He'), ('x-position', 9.771e-07), ('y-position', 3.6371e-07), ('z-position', 4.7311e-07), ('x-velocity', -199.26), ('y-velocity', 232.75), ('z-velocity', -534.38)])
OrderedDict([('atom_id', 2), ('type', 'Ar'), ('x-position', 6.4989e-07), ('y-position', 6.7873e-07), ('z-position', 9.5e-07), ('x-velocity', -1.5592), ('y-velocity', -378.76), ('z-velocity', 84.091)])
OrderedDict([('atom_id', 3), ('type', 'Ar'), ('x-position', 5.9024e-08), ('y-position', 3.7138e-07), ('z-position', 7.3455e-08), ('x-velocity', 342.82), ('y-velocity', 156.82), ('z-velocity', -38.991)])
OrderedDict([('atom_id', 4), ('type', 'He'), ('x-position', 7.6746e-07), ('y-position', 8.3017e-08), ('z-position', 4.852e-07), ('x-velocity', -30.45), ('y-velocity', -379.75), ('z-velocity', -336.32)])
OrderedDict([('atom_id', 5), ('type', 'Ar'), ('x-position', 1.7226e-07), ('y-position', 4.6023e-07), ('z-position', 4.7356e-08), ('x-velocity', -311.51), ('y-velocity', -429.39), ('z-velocity', -694.74)])
OrderedDict([('atom_id', 6), ('type', 'Ar'), ('x-position', 9.6394e-07), ('y-position', 7.2845e-07), ('z-position', 8.8623e-07), ('x-velocity', -82.636), ('y-velocity', 45.098), ('z-velocity', -10.626)])
OrderedDict([('atom_id', 7), ('type', 'He'), ('x-position', 5.445e-07), ('y-position', 4.6373e-07), ('z-position', 6.227e-07), ('x-velocity', 158.89), ('y-velocity', 258.58), ('z-velocity', -151.5)])
OrderedDict([('atom_id', 8), ('type', 'He'), ('x-position', 7.9322e-07), ('y-position', 9.47e-07), ('z-position', 3.5194e-08), ('x-velocity', -197.03), ('y-velocity', 156.74), ('z-velocity', -185.2)])
OrderedDict([('atom_id', 9), ('type', 'Ar'), ('x-position', 2.7797e-07), ('y-position', 1.6487e-07), ('z-position', 8.2403e-07), ('x-velocity', -38.65), ('y-velocity', -696.32), ('z-velocity', 216.42)])
OrderedDict([('atom_id', 10), ('type', 'He'), ('x-position', 1.1842e-07), ('y-position', 6.3244e-07), ('z-position', 5.0958e-07), ('x-velocity', -149.63), ('y-velocity', 422.88), ('z-velocity', -76.309)])
OrderedDict([('atom_id', 11), ('type', 'Ar'), ('x-position', 2.0359e-07), ('y-position', 8.3369e-07), ('z-position', 9.6348e-07), ('x-velocity', 484.57), ('y-velocity', -267.41), ('z-velocity', -352.54)])
OrderedDict([('atom_id', 12), ('type', 'He'), ('x-position', 5.1019e-07), ('y-position', 2.247e-07), ('z-position', 2.3846e-08), ('x-velocity', -231.92), ('y-velocity', -99.51), ('z-velocity', 32.77)])
OrderedDict([('atom_id', 13), ('type', 'Ar'), ('x-position', 3.5383e-07), ('y-position', 8.4581e-07), ('z-position', 7.234e-07), ('x-velocity', -303.95), ('y-velocity', 47.316), ('z-velocity', 222.53)])
OrderedDict([('atom_id', 14), ('type', 'He'), ('x-position', 3.8515e-07), ('y-position', 2.894e-07), ('z-position', 5.6028e-07), ('x-velocity', 233.08), ('y-velocity', 254.18), ('z-velocity', 429.83)])
OrderedDict([('atom_id', 15), ('type', 'He'), ('x-position', 1.5842e-07), ('y-position', 9.8225e-07), ('z-position', 5.7859e-07), ('x-velocity', 199.63), ('y-velocity', 203.11), ('z-velocity', -425.6)])
OrderedDict([('atom_id', 16), ('type', 'He'), ('x-position', 3.6831e-07), ('y-position', 7.652e-07), ('z-position', 2.9884e-07), ('x-velocity', 66.341), ('y-velocity', 222.32), ('z-velocity', -97.653)])
OrderedDict([('atom_id', 17), ('type', 'He'), ('x-position', 2.8696e-07), ('y-position', 1.5129e-07), ('z-position', 6.406e-07), ('x-velocity', 90.358), ('y-velocity', -67.459), ('z-velocity', -64.782)])
OrderedDict([('atom_id', 18), ('type', 'He'), ('x-position', 1.0325e-07), ('y-position', 9.9012e-07), ('z-position', 3.4381e-07), ('x-velocity', 71.108), ('y-velocity', 11.06), ('z-velocity', 15.912)])
OrderedDict([('atom_id', 19), ('type', 'Ar'), ('x-position', 4.3929e-07), ('y-position', 7.5363e-07), ('z-position', 9.9974e-07), ('x-velocity', 239.19), ('y-velocity', 173.83), ('z-velocity', 335.29)])

As long as the field names are consistent across your files, you can write code that needs less maintenance.
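Once fields are accessed by name, the conversion no longer has to guess types. The following is a minimal sketch, not the approach used above: it assumes that the types of the named fields are known in advance and that every other field is a float. The converters mapping and the shortened sample data are made up for this illustration.

```python
import csv
import io

# A shortened, made-up stand-in for md_result.csv without comment lines
sample = io.StringIO(
    "atom_id type x-position x-velocity\n"
    "0 He 5.7222e-07 -2.9245e+01\n"
    "2 Ar 6.4989e-07 -1.5592e+00\n"
)

# Hypothetical mapping from field name to converter;
# fields that are not listed are assumed to be floats
converters = {'atom_id': int, 'type': str}

records = []
for row in csv.DictReader(sample, delimiter=' '):
    records.append({key: converters.get(key, float)(value)
                    for key, value in row.items()})

print(records[0]['atom_id'], records[1]['x-velocity'])
```

Compared with trying int, float, and str in turn, an explicit mapping fails loudly if a field contains something unexpected instead of silently leaving it as a string.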

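Another way to read CSV files is the loadtxt() function of NumPy. If you specify the data type as a structured array dtype, as shown in the following, the type conversion is done for you while dictionary-like access is retained. You can also specify the comment character to be ignored and the number of rows to skip: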

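import numpy as np

csv_dtype = [
    ('atom_id', np.int32),
    ('type', np.bytes_, 2),  # 2-byte string; np.string_ was removed in NumPy 2.0
    ('position', np.float64, 3),
    ('velocity', np.float64, 3)
]
# skip the 11 comment lines and the line with the field names
with open('md_result.csv', 'r') as f:
    md_data = np.loadtxt(f, dtype=csv_dtype, skiprows=12)
print(md_data)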

[ ( 0, b'He', [  5.72220000e-07,   4.88110000e-09,   2.04150000e-07], [ -29.245 ,  100.45  ,  128.28  ])
 ( 1, b'He', [  9.77100000e-07,   3.63710000e-07,   4.73110000e-07], [-199.26  ,  232.75  , -534.38  ])
 ( 2, b'Ar', [  6.49890000e-07,   6.78730000e-07,   9.50000000e-07], [  -1.5592, -378.76  ,   84.091 ])
 ( 3, b'Ar', [  5.90240000e-08,   3.71380000e-07,   7.34550000e-08], [ 342.82  ,  156.82  ,  -38.991 ])
 ( 4, b'He', [  7.67460000e-07,   8.30170000e-08,   4.85200000e-07], [ -30.45  , -379.75  , -336.32  ])
 ( 5, b'Ar', [  1.72260000e-07,   4.60230000e-07,   4.73560000e-08], [-311.51  , -429.39  , -694.74  ])
 ( 6, b'Ar', [  9.63940000e-07,   7.28450000e-07,   8.86230000e-07], [ -82.636 ,   45.098 ,  -10.626 ])
 ( 7, b'He', [  5.44500000e-07,   4.63730000e-07,   6.22700000e-07], [ 158.89  ,  258.58  , -151.5   ])
 ( 8, b'He', [  7.93220000e-07,   9.47000000e-07,   3.51940000e-08], [-197.03  ,  156.74  , -185.2   ])
 ( 9, b'Ar', [  2.77970000e-07,   1.64870000e-07,   8.24030000e-07], [ -38.65  , -696.32  ,  216.42  ])
 (10, b'He', [  1.18420000e-07,   6.32440000e-07,   5.09580000e-07], [-149.63  ,  422.88  ,  -76.309 ])
 (11, b'Ar', [  2.03590000e-07,   8.33690000e-07,   9.63480000e-07], [ 484.57  , -267.41  , -352.54  ])
 (12, b'He', [  5.10190000e-07,   2.24700000e-07,   2.38460000e-08], [-231.92  ,  -99.51  ,   32.77  ])
 (13, b'Ar', [  3.53830000e-07,   8.45810000e-07,   7.23400000e-07], [-303.95  ,   47.316 ,  222.53  ])
 (14, b'He', [  3.85150000e-07,   2.89400000e-07,   5.60280000e-07], [ 233.08  ,  254.18  ,  429.83  ])
 (15, b'He', [  1.58420000e-07,   9.82250000e-07,   5.78590000e-07], [ 199.63  ,  203.11  , -425.6   ])
 (16, b'He', [  3.68310000e-07,   7.65200000e-07,   2.98840000e-07], [  66.341 ,  222.32  ,  -97.653 ])
 (17, b'He', [  2.86960000e-07,   1.51290000e-07,   6.40600000e-07], [  90.358 ,  -67.459 ,  -64.782 ])
 (18, b'He', [  1.03250000e-07,   9.90120000e-07,   3.43810000e-07], [  71.108 ,   11.06  ,   15.912 ])
 (19, b'Ar', [  4.39290000e-07,   7.53630000e-07,   9.99740000e-07], [ 239.19  ,  173.83  ,  335.29  ])]

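This makes the data very convenient to work with. For example, the speed of each atom (the magnitude of its velocity) can be computed easily as follows: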

print(np.linalg.norm(md_data['velocity'], axis=1))

with the output

[ 165.53317168  615.98627785  387.98565049  378.99652094  508.18147093
  874.11550713   94.73885146  339.20775124  312.54965765  730.20064455
  455.01614782  656.20167982  254.48575481  379.66301803  551.07856763
  512.09488281  251.72091508  130.04611632   73.70117372  447.04374406]
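The named fields of a structured array also combine well with boolean masks. As a minimal sketch, the following reads a made-up three-atom excerpt from an in-memory buffer, passes the comment character explicitly, and computes the mean speed of the helium atoms only. The variable names sample, speeds, and is_helium are chosen for this illustration.

```python
import io
import numpy as np

# Made-up three-atom excerpt standing in for md_result.csv
sample = io.StringIO(
    "# Step: 0\n"
    "atom_id type x-position y-position z-position x-velocity y-velocity z-velocity\n"
    "0 He 5.7222e-07 4.8811e-09 2.0415e-07 -2.9245e+01 1.0045e+02 1.2828e+02\n"
    "1 He 9.7710e-07 3.6371e-07 4.7311e-07 -1.9926e+02 2.3275e+02 -5.3438e+02\n"
    "2 Ar 6.4989e-07 6.7873e-07 9.5000e-07 -1.5592e+00 -3.7876e+02 8.4091e+01\n"
)

csv_dtype = [
    ('atom_id', np.int32),
    ('type', np.bytes_, 2),
    ('position', np.float64, 3),
    ('velocity', np.float64, 3),
]
# skiprows=2 skips the comment line and the line with the field names
md_data = np.loadtxt(sample, dtype=csv_dtype, comments='#', skiprows=2)

speeds = np.linalg.norm(md_data['velocity'], axis=1)
is_helium = md_data['type'] == b'He'  # the type field holds bytes, not str
print(speeds[is_helium].mean())
```

Because the type field is stored as bytes, the comparison has to use b'He' rather than 'He'.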