Accessing a Table#

Accessing table properties and data is generally consistent with the basic interface for numpy structured arrays.

Quick Overview#

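For a quick overview, the code below shows the basics of accessing table data. Where relevant, a comment notes what type of object is returned. Except where noted, table access returns objects that can be modified in order to update the original table data or properties. See also Copy versus Reference to learn more about this topic.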

Make a table::

from astropy.table import Table
import numpy as np

arr = np.arange(15).reshape(5, 3)
t = Table(arr, names=('a', 'b', 'c'), meta={'keywords': {'key1': 'val1'}})

Table properties::

t.columns   # Dict of table columns (access by column name, index, or slice)
t.colnames  # List of column names
t.meta      # Dict of meta-data
len(t)      # Number of table rows

Access table data::

t['a']       # Column 'a'
t['a'][1]    # Row 1 of column 'a'
t[1]         # Row 1
t[1]['a']    # Column 'a' of row 1
t[1][1:]     # Row 1, columns b and c
t[2:5]       # Table object with rows 2:5
t[[1, 3, 4]]  # Table object with rows 1, 3, 4 (copy)
t[np.array([1, 3, 4])]  # Table object with rows 1, 3, 4 (copy)
t[[]]        # Same table definition but with no rows of data
t['a', 'c']  # Table with cols 'a', 'c' (copy)
dat = np.array(t)  # Copy table data to numpy structured array object
t['a'].quantity  # an astropy.units.Quantity for Column 'a'
t['a'].to('km')  # an astropy.units.Quantity for Column 'a' in units of kilometers
t.columns[1]  # Column 1 (which is the 'b' column)
t.columns[0:2]  # New table with columns 0 and 1

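Note

Although they appear nearly equivalent, there is roughly a factor of two performance difference between t[1]['a'] (slower, because an intermediate Row object is created) and t['a'][1] (faster). Use the latter whenever possible.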

Print table or column::

print(t)     # Print formatted version of table to the screen
t.pprint()   # Same as above
t.pprint(show_unit=True)  # Show column unit
t.pprint(show_name=False)  # Do not show column names
t.pprint_all() # Print full table no matter how long / wide it is (same as t.pprint(max_lines=-1, max_width=-1))

t.more()  # Interactively scroll through table like Unix "more"

print(t['a'])    # Formatted column values
t['a'].pprint()  # Same as above, with same options as Table.pprint()
t['a'].more()    # Interactively scroll through column
t['a', 'c'].pprint()  # Print columns 'a' and 'c' of table

lines = t.pformat()  # Formatted table as a list of lines (same options as pprint)
lines = t['a'].pformat()  # Formatted column values as a list

Details#

For all of the following examples it is assumed that the table has been created as follows:

>>> from astropy.table import Table, Column
>>> import numpy as np
>>> import astropy.units as u

>>> arr = np.arange(15, dtype=np.int32).reshape(5, 3)
>>> t = Table(arr, names=('a', 'b', 'c'), meta={'keywords': {'key1': 'val1'}})
>>> t['a'].format = "{:.3f}"  # print with 3 digits after decimal point
>>> t['a'].unit = 'm sec^-1'
>>> t['a'].description = 'unladen swallow velocity'
>>> print(t)
     a      b   c
  m sec^-1
  -------- --- ---
     0.000   1   2
     3.000   4   5
     6.000   7   8
     9.000  10  11
    12.000  13  14

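Note

In the example above, the format, unit, and description attributes of the Column were set directly. For mixin columns like Quantity you must set these attributes via the info attribute, for example, t['a'].info.format = "{:.3f}". You can use the info attribute with Column objects as well, so the general solution that works with any table column is to go through the info attribute. See Mixin Attributes for more information.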

Summary Information#

You can get summary information about the table as follows:

>>> t.info
<Table length=5>
name dtype   unit   format       description
---- ----- -------- ------ ------------------------
   a int32 m sec^-1 {:.3f} unladen swallow velocity
   b int32
   c int32

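If called as a function, you can supply an option that specifies the type of information to return. The built-in option choices are 'attributes' (column attributes, which is the default) or 'stats' (basic column statistics). The option argument can also be a list of available options::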

>>> t.info('stats')  
<Table length=5>
name mean   std   min max
---- ---- ------- --- ---
   a    6 4.24264   0  12
   b    7 4.24264   1  13
   c    8 4.24264   2  14

>>> t.info(['attributes', 'stats'])  
<Table length=5>
name dtype   unit   format       description        mean   std   min max
---- ----- -------- ------ ------------------------ ---- ------- --- ---
   a int32 m sec^-1 {:.3f} unladen swallow velocity    6 4.24264   0  12
   b int32                                             7 4.24264   1  13
   c int32                                             8 4.24264   2  14

Columns also have an info property that has the same behavior and arguments, but provides information about a single column::

>>> t['a'].info
name = a
dtype = int32
unit = m sec^-1
format = {:.3f}
description = unladen swallow velocity
class = Column
n_bad = 0
length = 5

>>> t['a'].info('stats')  
name = a
mean = 6
std = 4.24264
min = 0
max = 12
n_bad = 0
length = 5

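Accessing Properties#

The code below shows accessing the table columns as a TableColumns object, getting the column names, the table metadata, and the number of table rows. The table metadata is an OrderedDict by default::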

>>> t.columns
<TableColumns names=('a','b','c')>

>>> t.colnames
['a', 'b', 'c']

>>> t.meta  # Dict of meta-data
{'keywords': {'key1': 'val1'}}

>>> len(t)
5

Accessing Data#

As expected, you can access a table column by name and get an element from that column with a numerical index:

>>> t['a']  # Column 'a'
<Column name='a' dtype='int32' unit='m sec^-1' format='{:.3f}' description='unladen swallow velocity' length=5>
 0.000
 3.000
 6.000
 9.000
12.000


>>> t['a'][1]  # Row 1 of column 'a'
3

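When a table column is printed, it is formatted according to the format attribute (see Format Specifier). Note the difference between the column representation above and how it appears via print() or str()::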

>>> print(t['a'])
   a
m sec^-1
--------
   0.000
   3.000
   6.000
   9.000
  12.000

Likewise, a table row and a column from that row can be selected:

>>> t[1]  # Row object corresponding to row 1
<Row index=1>
   a       b     c
m sec^-1
 int32   int32 int32
-------- ----- -----
   3.000     4     5

>>> t[1]['a']  # Column 'a' of row 1
3

A Row object has the same columns and metadata as its parent table:

>>> t[1].columns
<TableColumns names=('a','b','c')>

>>> t[1].meta
{'keywords': {'key1': 'val1'}}

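Slicing a table returns a new table object with references to the original data within the slice region (see Copy versus Reference). The table metadata and column definitions are copied::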

>>> t[2:5]  # Table object with rows 2:5 (reference)
<Table length=3>
   a       b     c
m sec^-1
 int32   int32 int32
-------- ----- -----
   6.000     7     8
   9.000    10    11
  12.000    13    14

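It is possible to select table rows with an array of indexes or by specifying multiple column names. This returns a copy of the original table for the selected rows or columns::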

>>> print(t[[1, 3, 4]])  # Table object with rows 1, 3, 4 (copy)
     a      b   c
  m sec^-1
  -------- --- ---
     3.000   4   5
     9.000  10  11
    12.000  13  14


>>> print(t[np.array([1, 3, 4])])  # Table object with rows 1, 3, 4 (copy)
     a      b   c
  m sec^-1
  -------- --- ---
     3.000   4   5
     9.000  10  11
    12.000  13  14


>>> print(t['a', 'c'])  # or t[['a', 'c']] or t[('a', 'c')]
...                     # Table with cols 'a', 'c' (copy)
     a      c
  m sec^-1
  -------- ---
     0.000   2
     3.000   5
     6.000   8
     9.000  11
    12.000  14

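We can select rows from a table using conditionals to create boolean masks. A table indexed with a boolean array returns only the rows where the mask array element is True. Different conditionals can be combined using the bitwise operators::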

>>> mask = (t['a'] > 4) & (t['b'] > 8)  # Table rows where column a > 4
>>> print(t[mask])                      # and b > 8
...
     a      b   c
  m sec^-1
  -------- --- ---
     9.000  10  11
    12.000  13  14

Finally, you can access the underlying table data as a native numpy structured array by creating a copy or reference with numpy.array()::

>>> data = np.array(t)  # copy of data in t as a structured array
>>> data = np.array(t, copy=False)  # reference to data in t

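Possibly Missing Columns#

In some cases it is not guaranteed that a column is present in a table, but there is a good default value that can be used if it is not. The columns of a Table can be represented as a dict subclass instance through the columns attribute, which means that a replacement for missing columns can be provided using the dict.get() method: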

>>> t.columns.get("b", np.zeros(len(t)))
<Column name='b' dtype='int32' length=5>
 1
 4
 7
10
13
>>> t.columns.get("x", np.zeros(len(t)))
array([0., 0., 0., 0., 0.])

In the case of a single Row it is possible to use its get() method without having to go through columns::

>>> row = t[2]
>>> row.get("c", -1)
8
>>> row.get("y", -1)
-1

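Table Equality#

We can check table data equality using two different methods:

The == comparison operator. In the general case this returns a 1D array with dtype=bool that maps each row to True if and only if the entire row matches. For incomparable data (different dtype or unbroadcastable lengths), a boolean False is returned. This is in contrast to numpy, where trying to compare structured arrays may raise an exception.
The Table values_equal() method, which compares table values element by element. This returns a boolean True or False for each table element, so you get a Table of values.

Note

Both methods report equality after broadcasting, which matches numpy array comparison.

Examples#

To check table equality: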

>>> t1 = Table(rows=[[1, 2, 3],
...                  [4, 5, 6],
...                  [7, 7, 9]], names=['a', 'b', 'c'])
>>> t2 = Table(rows=[[1, 2, -1],
...                  [4, -1, 6],
...                  [7, 7, 9]], names=['a', 'b', 'c'])

>>> t1 == t2
array([False, False,  True])

>>> t1.values_equal(t2)  # Compare to another table
<Table length=3>
 a     b     c
bool  bool  bool
---- ----- -----
True  True False
True False  True
True  True  True

>>> t1.values_equal([2, 4, 7])  # Compare to an array column-wise
<Table length=3>
  a     b     c
 bool  bool  bool
----- ----- -----
False  True False
 True False False
 True  True False

>>> t1.values_equal(7)  # Compare to a scalar column-wise
<Table length=3>
  a     b     c
 bool  bool  bool
----- ----- -----
False False False
False False False
 True  True False

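Formatted Printing#

The values in a table or column can be printed or retrieved as a formatted table using one of several methods:

The print() function.
The Table.more() or Column.more() methods to interactively scroll through table values.
The Table.pprint() or Column.pprint() methods to print a formatted version of the table to the screen.
The Table.pformat() or Column.pformat() methods to return the formatted table or column as a list of fixed-width strings. This can be used as a quick way to save a table.

These methods use the Format Specifier if available and strive to make the output readable. By default, table and column printing will not print a table larger than the available interactive screen size. If the screen size cannot be determined (in a non-interactive environment or on Windows), a default size of 25 rows by 80 columns is used. If a table is too large, rows and/or columns are cut from the middle so that it fits.

Example#

To print a formatted table: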

>>> arr = np.arange(3000).reshape(100, 30)  # 100 rows x 30 columns array
>>> t = Table(arr)
>>> print(t)
col0 col1 col2 col3 col4 col5 col6 ... col23 col24 col25 col26 col27 col28 col29
---- ---- ---- ---- ---- ---- ---- ... ----- ----- ----- ----- ----- ----- -----
   0    1    2    3    4    5    6 ...    23    24    25    26    27    28    29
  30   31   32   33   34   35   36 ...    53    54    55    56    57    58    59
  60   61   62   63   64   65   66 ...    83    84    85    86    87    88    89
  90   91   92   93   94   95   96 ...   113   114   115   116   117   118   119
 120  121  122  123  124  125  126 ...   143   144   145   146   147   148   149
 150  151  152  153  154  155  156 ...   173   174   175   176   177   178   179
 180  181  182  183  184  185  186 ...   203   204   205   206   207   208   209
 210  211  212  213  214  215  216 ...   233   234   235   236   237   238   239
 240  241  242  243  244  245  246 ...   263   264   265   266   267   268   269
 270  271  272  273  274  275  276 ...   293   294   295   296   297   298   299
 ...  ...  ...  ...  ...  ...  ... ...   ...   ...   ...   ...   ...   ...   ...
2700 2701 2702 2703 2704 2705 2706 ...  2723  2724  2725  2726  2727  2728  2729
2730 2731 2732 2733 2734 2735 2736 ...  2753  2754  2755  2756  2757  2758  2759
2760 2761 2762 2763 2764 2765 2766 ...  2783  2784  2785  2786  2787  2788  2789
2790 2791 2792 2793 2794 2795 2796 ...  2813  2814  2815  2816  2817  2818  2819
2820 2821 2822 2823 2824 2825 2826 ...  2843  2844  2845  2846  2847  2848  2849
2850 2851 2852 2853 2854 2855 2856 ...  2873  2874  2875  2876  2877  2878  2879
2880 2881 2882 2883 2884 2885 2886 ...  2903  2904  2905  2906  2907  2908  2909
2910 2911 2912 2913 2914 2915 2916 ...  2933  2934  2935  2936  2937  2938  2939
2940 2941 2942 2943 2944 2945 2946 ...  2963  2964  2965  2966  2967  2968  2969
2970 2971 2972 2973 2974 2975 2976 ...  2993  2994  2995  2996  2997  2998  2999
Length = 100 rows

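more() Method#

In order to browse all rows of a table or column, use the Table.more() or Column.more() methods. These let you interactively scroll through the rows much like the Unix more command. Once part of the table or column is displayed, the supported navigation keys are:

f, space: forward one page
b: back one page
r: refresh same page
n: next row
p: previous row
<: go to beginning
>: go to end
q: quit browsing
h: print this help

pprint() Method#

In order to fully control the print output, use the Table.pprint() or Column.pprint() methods. These have keyword arguments max_lines, max_width, show_name, show_unit, and show_dtype, with meanings as shown below: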

>>> arr = np.arange(3000, dtype=float).reshape(100, 30)
>>> t = Table(arr)
>>> t['col0'].format = '%e'
>>> t['col0'].unit = 'km**2'
>>> t['col29'].unit = 'kg sec m**-2'

>>> t.pprint(max_lines=8, max_width=40)
    col0     ...    col29
    km2      ... kg sec m**-2
------------ ... ------------
0.000000e+00 ...         29.0
         ... ...          ...
2.940000e+03 ...       2969.0
2.970000e+03 ...       2999.0
Length = 100 rows

>>> t.pprint(max_lines=8, max_width=40, show_unit=False)
    col0     ... col29
------------ ... ------
0.000000e+00 ...   29.0
         ... ...    ...
2.940000e+03 ... 2969.0
2.970000e+03 ... 2999.0
Length = 100 rows

>>> t.pprint(max_lines=8, max_width=40, show_name=False)
    km2      ... kg sec m**-2
------------ ... ------------
0.000000e+00 ...         29.0
3.000000e+01 ...         59.0
         ... ...          ...
2.940000e+03 ...       2969.0
2.970000e+03 ...       2999.0
Length = 100 rows

>>> t.pprint(max_lines=8, max_width=40, show_dtype=True)
    col0       col1  ...    col29
    km2              ... kg sec m**-2
  float64    float64 ...   float64
------------ ------- ... ------------
0.000000e+00     1.0 ...         29.0
         ...     ... ...          ...
2.970000e+03  2971.0 ...       2999.0
Length = 100 rows

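In order to force printing all values regardless of the output length or width, use pprint_all(), which is equivalent to setting max_lines and max_width to -1 in pprint(). pprint_all() takes the same arguments as pprint(). For the wide table in this example you see six lines of wrapped output like the following: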

>>> t.pprint_all(max_lines=8)  
    col0         col1     col2   col3   col4   col5   col6   col7   col8   col9  col10  col11  col12  col13  col14  col15  col16  col17  col18  col19  col20  col21  col22  col23  col24  col25  col26  col27  col28     col29
    km2                                                                                                                                                                                                               kg sec m**-2
------------ ----------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------------
0.000000e+00    1.000000    2.0    3.0    4.0    5.0    6.0    7.0    8.0    9.0   10.0   11.0   12.0   13.0   14.0   15.0   16.0   17.0   18.0   19.0   20.0   21.0   22.0   23.0   24.0   25.0   26.0   27.0   28.0         29.0
         ...         ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...    ...          ...
2.940000e+03 2941.000000 2942.0 2943.0 2944.0 2945.0 2946.0 2947.0 2948.0 2949.0 2950.0 2951.0 2952.0 2953.0 2954.0 2955.0 2956.0 2957.0 2958.0 2959.0 2960.0 2961.0 2962.0 2963.0 2964.0 2965.0 2966.0 2967.0 2968.0       2969.0
2.970000e+03 2971.000000 2972.0 2973.0 2974.0 2975.0 2976.0 2977.0 2978.0 2979.0 2980.0 2981.0 2982.0 2983.0 2984.0 2985.0 2986.0 2987.0 2988.0 2989.0 2990.0 2991.0 2992.0 2993.0 2994.0 2995.0 2996.0 2997.0 2998.0       2999.0
Length = 100 rows

For columns, the syntax and behavior of pprint() is the same except that there is no max_width keyword argument:

>>> t['col3'].pprint(max_lines=8)
 col3
------
   3.0
  33.0
   ...
2943.0
2973.0
Length = 100 rows

Column Alignment#

Individual columns can be aligned in several different ways for an enhanced viewing experience:

>>> t1 = Table()
>>> t1['long column name 1'] = [1, 2, 3]
>>> t1['long column name 2'] = [4, 5, 6]
>>> t1['long column name 3'] = [7, 8, 9]
>>> t1['long column name 4'] = [700000, 800000, 900000]
>>> t1['long column name 2'].info.format = '<'
>>> t1['long column name 3'].info.format = '0='
>>> t1['long column name 4'].info.format = '^'
>>> t1.pprint()
 long column name 1 long column name 2 long column name 3 long column name 4
------------------ ------------------ ------------------ ------------------
                 1 4                  000000000000000007       700000
                 2 5                  000000000000000008       800000
                 3 6                  000000000000000009       900000

Conveniently, alignment can be handled another way, by passing a list to the keyword argument align::

>>> t1 = Table()
>>> t1['column1'] = [1, 2, 3]
>>> t1['column2'] = [2, 4, 6]
>>> t1.pprint(align=['<', '0='])
column1 column2
------- -------
1       0000002
2       0000004
3       0000006

It is also possible to set the alignment of all columns with a single string value:

>>> t1.pprint(align='^')
column1 column2
------- -------
   1       2
   2       4
   3       6

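The fill character for the alignment can be set as a prefix to the alignment character (see Format Specification Mini-Language for additional explanation). This can be done both in the align argument and in the column format attribute. Note the interesting interaction below: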

>>> t1 = Table([[1.0, 2.0], [1, 2]], names=['column1', 'column2'])

>>> t1['column1'].format = '#^.2f'
>>> t1.pprint()
column1 column2
------- -------
##1.00#       1
##2.00#       2

Now if we set a global alignment, it seems like our original column format is lost:

>>> t1.pprint(align='!<')
column1 column2
------- -------
1.00!!! 1!!!!!!
2.00!!! 2!!!!!!

The way to avoid this is to explicitly specify the alignment strings for every column and use None where the column format should be used::

>>> t1.pprint(align=[None, '!<'])
column1 column2
------- -------
##1.00# 1!!!!!!
##2.00# 2!!!!!!

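pformat() Method#

In order to get the formatted output for manipulation or writing to a file, use the Table.pformat() or Column.pformat() methods. These behave just like pprint() but return a list corresponding to each formatted line in the pprint() output. The pformat_all() method can be used to return a list for all lines in the Table.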

>>> lines = t['col3'].pformat(max_lines=8)

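Hiding Columns#

The Table class has functionality to selectively show or hide certain columns within the table when using any of the print methods. This can be useful for columns that are very wide or else "uninteresting" for various reasons. The specification of which columns are output is associated with the table itself, so it persists through slicing, copying, and serialization (e.g., saving to ECSV format). One use case is for specialized table subclasses that contain auxiliary columns that are not typically useful to the user.

The specification of which columns to include when printing is handled through two complementary Table attributes:

pprint_include_names: column names to include, where the default value of None implies including all columns.
pprint_exclude_names: column names to exclude, where the default value of None implies excluding no columns.

Typically you should use just one of the two attributes at a time. However, both can be set at once, and the set of columns that actually gets printed is conceptually expressed in this pseudocode: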

include_names = (set(table.pprint_include_names() or table.colnames)
                 - set(table.pprint_exclude_names() or ()))

Examples#

Let us start by defining a simple table with one row and six columns:

>>> from astropy.table.table_helpers import simple_table
>>> t = simple_table(size=1, cols=6)
>>> print(t)
 a   b   c   d   e   f
--- --- --- --- --- ---
  1 1.0   c   4 4.0   f

Now you can get the value of the pprint_include_names attribute by calling it as a function, and then include some names for printing::

>>> print(t.pprint_include_names())
None
>>> t.pprint_include_names = ('a', 'c', 'e')
>>> print(t.pprint_include_names())
('a', 'c', 'e')
>>> print(t)
 a   c   e
--- --- ---
  1   c 4.0

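Now you can instead exclude some columns from printing. Note that for both include and exclude, you can add column names that do not exist in the table. This allows predefining the attributes before the table has been fully constructed::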

>>> t.pprint_include_names = None  # Revert to printing all columns
>>> t.pprint_exclude_names = ('a', 'c', 'e', 'does-not-exist')
>>> print(t)
 b   d   f
--- --- ---
1.0   4   f

Next you can add or remove names from the attribute::

>>> t = simple_table(size=1, cols=6)  # Start with a fresh table
>>> t.pprint_exclude_names.add('b')  # Single name
>>> t.pprint_exclude_names.add(['d', 'f'])  # List or tuple of names
>>> t.pprint_exclude_names.remove('f')  # Single name or list/tuple of names
>>> t.pprint_exclude_names()
('b', 'd')

Finally, you can temporarily set the attributes within a context manager. For example::

>>> t = simple_table(size=1, cols=6)
>>> t.pprint_include_names = ('a', 'b')
>>> print(t)
 a   b
--- ---
  1 1.0

>>> # Show all (for pprint_include_names the value of None => all columns)
>>> with t.pprint_include_names.set(None):
...     print(t)
 a   b   c   d   e   f
--- --- --- --- --- ---
  1 1.0   c   4 4.0   f

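The specification of names for these attributes can include Unix-style globs like * and ?. See fnmatch for details (in particular, how to escape those characters if needed). For example::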

>>> t = Table()
>>> t.pprint_exclude_names = ['boring*']
>>> t['a'] = [1]
>>> t['b'] = ['b']
>>> t['boring_ra'] = [122.0]
>>> t['boring_dec'] = [89.9]
>>> print(t)
 a   b
--- ---
  1   b

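Multidimensional Columns#

If a column has more than one dimension then each element of the column is itself an array. In the example below there are three rows, each of which is a 2 x 2 array. The formatted output for such a column shows only the first and last value of each row element and indicates the array dimensions in the column name header: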

>>> t = Table()
>>> arr = [ np.array([[ 1.,  2.],
...                   [10., 20.]]),
...         np.array([[ 3.,  4.],
...                   [30., 40.]]),
...         np.array([[ 5.,  6.],
...                   [50., 60.]]) ]
>>> t['a'] = arr
>>> t['a'].shape
(3, 2, 2)
>>> t.pprint()
     a
-----------
1.0 .. 20.0
3.0 .. 40.0
5.0 .. 60.0

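In order to see all of the data values for a multidimensional column, use the column representation. This uses the standard numpy mechanism for printing any array: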

>>> t['a'].data
array([[[ 1.,  2.],
        [10., 20.]],
       [[ 3.,  4.],
        [30., 40.]],
       [[ 5.,  6.],
        [50., 60.]]])

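Structured Array Columns#

For columns which are structured arrays, the format string must use "new style" format strings with parameter substitutions corresponding to the field names in the structured array. Consider the following example, which includes a column of parameter values where the value, min, and max are stored in the column as fields named val, min, and max. By default the field values are shown as a tuple::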

>>> pars = np.array(
...   [(1.2345678, -20, 3),
...    (12.345678, 4.5678, 33)],
...   dtype=[('val', 'f8'), ('min', 'f8'), ('max', 'f8')]
... )
>>> t = Table()
>>> t['a'] = [1, 2]
>>> t['par'] = pars
>>> print(t)
 a    par [val, min, max]
--- ------------------------
  1    (1.2345678, -20., 3.)
  2 (12.345678, 4.5678, 33.)

However, setting the format string appropriately allows formatting each of the field values and controlling the overall output:

>>> t['par'].info.format = '{val:6.2f} ({min:5.1f}, {max:5.1f})'
>>> print(t)
 a   par [val, min, max]
--- ---------------------
  1   1.23 (-20.0,   3.0)
  2  12.35 (  4.6,  33.0)
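
To see why the field names must appear in the format string, it helps to apply the same string by hand with plain Python. This is only a model of the substitution using str.format() on each record, not a description of how astropy implements it internally.

```python
import numpy as np

pars = np.array(
    [(1.2345678, -20, 3), (12.345678, 4.5678, 33)],
    dtype=[('val', 'f8'), ('min', 'f8'), ('max', 'f8')],
)
fmt = '{val:6.2f} ({min:5.1f}, {max:5.1f})'

# Format each record by passing its fields as keyword arguments.
cells = [fmt.format(**{name: rec[name] for name in pars.dtype.names})
         for rec in pars]
```

A placeholder that does not match a field name would raise a KeyError here, which mirrors the requirement stated above.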

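Columns with Units#

Note

Table and QTable instances handle entries with units differently. The following describes Table. Quantity and QTable explains how the use of QTable differs from Table.

A Column object with units within a standard Table has certain quantity-related conveniences available. To begin with, it can be converted explicitly to a Quantity object via the quantity property and the to() method: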

>>> data = [[1., 2., 3.], [40000., 50000., 60000.]]
>>> t = Table(data, names=('a', 'b'))
>>> t['a'].unit = u.m
>>> t['b'].unit = 'km/s'
>>> t['a'].quantity  
<Quantity [1., 2., 3.] m>
>>> t['b'].to(u.kpc/u.Myr)  
<Quantity [40.9084866 , 51.13560825, 61.3627299 ] kpc / Myr>

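Note that the quantity property is actually a view of the data in the column, not a copy. Hence, you can set the values of a column in a way that respects units by making in-place changes to the quantity property: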

>>> t['b']
<Column name='b' dtype='float64' unit='km / s' length=3>
40000.0
50000.0
60000.0

>>> t['b'].quantity[0] = 45000000*u.m/u.s
>>> t['b']
<Column name='b' dtype='float64' unit='km / s' length=3>
45000.0
50000.0
60000.0

Even without explicit conversion, columns with units can be treated like a Quantity in some arithmetic expressions (see the warning below for caveats):

>>> t['a'] + .005*u.km  
<Quantity [6., 7., 8.] m>
>>> from astropy.constants import c
>>> (t['b'] / c).decompose()  
<Quantity [0.15010384, 0.16678205, 0.20013846]>

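Warning

Table columns do not always behave the same as Quantity. Table columns act more like regular numpy arrays unless either explicitly converted to a Quantity or combined with a Quantity using an arithmetic operator. For instance, the following does not work the way you would expect: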

>>> data = [[30, 90]]
>>> t = Table(data, names=('angle',))
>>> t['angle'].unit = 'deg'
>>> np.sin(t['angle'])  
<Column name='angle' dtype='float64' unit='deg' length=2>
-0.988031624093
 0.893996663601

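This is wrong both in that it says the result is in degrees, and in that sin treated the values as radians rather than degrees. If at all in doubt that you will get the right result, the safest choice is to either use QTable or to explicitly convert to Quantity::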

>>> np.sin(t['angle'].quantity)  
<Quantity [0.5, 1. ]>

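Bytestring Columns#

The use of bytestring columns (numpy 'S' dtype) is possible with astropy tables since they can be compared with the natural Python string (str) type. See The bytes/str dichotomy in Python 3 for a very brief overview of the difference.

The standard method of representing strings in numpy is via the Unicode 'U' dtype. The problem is that this requires 4 bytes per character, and if you have a very large number of strings this could fill memory and impact performance. A very common use case is that these strings are actually ASCII and can be represented with 1 byte per character. In astropy it is possible to work directly and conveniently with bytestring data in Table and Column operations.

Note that the bytestring issue is a particular problem when dealing with HDF5 files, where character data are read as bytestrings ('S' dtype) when using the Unified File Read/Write Interface. Since HDF5 files are frequently used to store very large datasets, the memory bloat associated with conversion to 'U' dtype is unacceptable.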
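
The memory cost can be seen directly from the numpy item sizes. This sketch uses only numpy and two short ASCII strings chosen for illustration.

```python
import numpy as np

names = ['abc', 'def']
as_unicode = np.array(names, dtype='U3')  # 4 bytes per character
as_bytes = np.array(names, dtype='S3')    # 1 byte per character

unicode_bytes = as_unicode.itemsize  # bytes used per 3-character element
byte_bytes = as_bytes.itemsize
```

For a three-character string the Unicode element takes 12 bytes and the bytestring element takes 3, so the ratio of four carries over to arrays of any length.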

Examples#

The examples below illustrate dealing with bytestring data in astropy::

>>> t = Table([['abc', 'def']], names=['a'], dtype=['S'])

>>> t['a'] == 'abc'  # Gives expected answer
array([ True, False])

>>> t['a'] == b'abc'  # Still gives expected answer
array([ True, False])

>>> t['a'][0] == 'abc'  # Expected answer
True

>>> t['a'][0] == b'abc'  # Cannot compare to bytestring
False

>>> t['a'][0] = 'bä'
>>> t
<Table length=2>
  a
bytes3
------
    bä
   def

>>> t['a'] == 'bä'
array([ True, False])

>>> # Round trip unicode strings through HDF5
>>> t.write('test.hdf5', format='hdf5', path='data', overwrite=True)
>>> t2 = Table.read('test.hdf5', format='hdf5', path='data')
>>> t2
<Table length=2>
 col0
bytes3
------
    bä
   def