```>>> from env_helper import info; info()
```
```Page last updated: 2022-12-28 00:02:24

Linux distribution: Debian GNU/Linux 11 (bullseye)
OS kernel: Linux-5.10.0-20-amd64-x86_64-with-glibc2.31
Python version: 3.9.2
```

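# 11.3. Data Processing Methods

## 11.3.1. Part V: Geometric Operations

geopandas makes available all of the geometric manipulation tools provided by the shapely library.

### 1. Constructive Methods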

```GeoSeries.buffer(distance, resolution=16)
```

```GeoSeries.boundary
```

```GeoSeries.centroid
```

```GeoSeries.convex_hull
```

```GeoSeries.envelope
```

```GeoSeries.simplify(tolerance, preserve_topology=True)
```

```GeoSeries.unary_union
```
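As a minimal sketch of the constructive methods listed above, the following applies each of them to a single L-shaped polygon. The polygon, the variable names, and the tolerance values are illustrative assumptions, not part of the original tutorial. Because `union_all()` only exists in newer GeoPandas releases, the sketch falls back to the `unary_union` property used in this section.

```python
from shapely.geometry import Polygon
from geopandas import GeoSeries

# An L-shaped (non-convex) polygon, so the hull and envelope differ from the input.
l_shape = Polygon([(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)])
g = GeoSeries([l_shape])

buffered = g.buffer(0.5)       # all points within 0.5 units of the shape
boundary = g.boundary          # the outline, as a lower-dimensional geometry
centroid = g.centroid          # geometric centre, as a Point
hull = g.convex_hull           # smallest convex polygon containing the shape
envelope = g.envelope          # axis-aligned bounding rectangle
simplified = g.simplify(0.1)   # simplified copy that stays within the tolerance of the original

# Merge every geometry in the series into one shapely geometry.
merged = g.union_all() if hasattr(g, "union_all") else g.unary_union

print("original area:", g.area.iloc[0])
print("convex hull area:", hull.area.iloc[0])
print("envelope area:", envelope.area.iloc[0])
print("boundary length:", boundary.length.iloc[0])
print("centroid:", centroid.iloc[0])
```

All of these return new objects; the original GeoSeries `g` is left unchanged.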

### 2. Affine Transformations

```GeoSeries.rotate(angle, origin='center', use_radians=False)
```

```GeoSeries.scale(xfact=1.0, yfact=1.0, zfact=1.0, origin='center')
```

```GeoSeries.skew(xs=0.0, ys=0.0, origin='center', use_radians=False)
```

```GeoSeries.translate(xoff=0.0, yoff=0.0, zoff=0.0)
```
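The four affine transformations can be sketched on a unit square as follows. The square, the angles, and the factors are arbitrary values chosen for illustration; only the method names and parameters come from GeoPandas.

```python
from shapely.geometry import Polygon
from geopandas import GeoSeries

square = GeoSeries([Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])])

# Rotate 45 degrees counter-clockwise about the centre of the bounding box.
rotated = square.rotate(45)
# Stretch by 2 along x and 3 along y, keeping the point (0, 0) fixed.
scaled = square.scale(xfact=2.0, yfact=3.0, origin=(0, 0))
# Shear along the x axis by 45 degrees, keeping the point (0, 0) fixed.
skewed = square.skew(xs=45, origin=(0, 0))
# Shift by 1 unit in x and 2 units in y.
moved = square.translate(xoff=1.0, yoff=2.0)

for name, series in [("rotated", rotated), ("scaled", scaled),
                     ("skewed", skewed), ("moved", moved)]:
    print(name, series.total_bounds, series.area.iloc[0])
```

Rotation, shearing, and translation preserve area; only `scale` changes it.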

### 3. Examples of Geometric Operations

```>>> %matplotlib inline
>>> import geopandas as gpd
>>> from geopandas import GeoSeries
>>> from shapely.geometry import Polygon
>>> # One triangle and two unit squares
>>> p1 = Polygon([(0, 0), (1, 0), (1, 1)])
>>> p2 = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
>>> p3 = Polygon([(2, 0), (3, 0), (3, 1), (2, 1)])
>>> g = GeoSeries([p1, p2, p3])
```

```>>> print(g.area)
```
```0    0.5
1    1.0
2    1.0
dtype: float64
```

```>>> g.buffer(0.5)
```
```0    POLYGON ((-0.35355 0.35355, 0.64645 1.35355, 0...
1    POLYGON ((-0.50000 0.00000, -0.50000 1.00000, ...
2    POLYGON ((1.50000 0.00000, 1.50000 1.00000, 1....
dtype: geometry
```

GeoPandas uses descartes to generate matplotlib plots. To plot our own GeoSeries, use:

```>>> g.plot()
```
```<AxesSubplot:>
```

```>>> from shapely.geometry import Point
>>> import numpy as np
>>> # Generate 2000 random points inside a bounding box
>>> xmin, xmax, ymin, ymax = 900000, 1080000, 120000, 280000
>>> xc = (xmax - xmin) * np.random.random(2000) + xmin
>>> yc = (ymax - ymin) * np.random.random(2000) + ymin
>>> pts = GeoSeries([Point(x, y) for x, y in zip(xc, yc)])
```

```>>> # Draw a circle of radius 2000 around each point
>>> circles = pts.buffer(2000)
```

```>>> # Collapse the circles into a single shapely geometry
>>> mp = circles.unary_union
```
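To see what collapsing buffers into one geometry does, here is a small deterministic sketch with three hand-picked points instead of 2000 random ones (the coordinates and radius are assumptions for illustration). Two of the circles overlap and the third is isolated, so the merged result has two parts and a smaller area than the sum of the individual circles.

```python
from shapely.geometry import Point
from geopandas import GeoSeries

# Two points close enough for their buffers to overlap, plus one far away.
pts = GeoSeries([Point(0, 0), Point(1, 0), Point(10, 0)])
circles = pts.buffer(1.0)

# union_all() only exists in newer GeoPandas releases; fall back to the
# unary_union property used in this section otherwise.
merged = circles.union_all() if hasattr(circles, "union_all") else circles.unary_union

print(merged.geom_type)
print("sum of circle areas:", circles.area.sum())
print("area of merged geometry:", merged.area)
```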

## 11.3.2. Part VI: Set Operations with Overlay

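(Note for users familiar with the shapely library: overlay can be thought of as offering versions of the standard shapely set operations that deal with the complexities of applying set operations to two GeoSeries. The standard shapely set operations are also available as GeoSeries methods.)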
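For contrast with overlay, the sketch below uses the standard set operations as GeoSeries methods. These pair geometries row by row (aligned on the index), whereas overlay considers every pair of overlapping geometries across the two inputs. The two series reuse the square coordinates from this section; the `rowwise_*` names are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Polygon

polys1 = gpd.GeoSeries([Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
                        Polygon([(2, 2), (4, 2), (4, 4), (2, 4)])])
polys2 = gpd.GeoSeries([Polygon([(1, 1), (3, 1), (3, 3), (1, 3)]),
                        Polygon([(3, 3), (5, 3), (5, 5), (3, 5)])])

# Each method combines row i of polys1 with row i of polys2 only.
rowwise_intersection = polys1.intersection(polys2)
rowwise_union = polys1.union(polys2)
rowwise_difference = polys1.difference(polys2)
rowwise_symdiff = polys1.symmetric_difference(polys2)

print(rowwise_intersection.area.tolist())
print(rowwise_union.area.tolist())
print(rowwise_difference.area.tolist())
print(rowwise_symdiff.area.tolist())
```

Note that the overlap between the second square of `polys1` and the first square of `polys2` is never computed here, because those geometries sit in different rows.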

### 1. The Different Overlay Operations

```>>> from shapely.geometry import Polygon
>>>
>>> # Two pairs of 2x2 squares, offset so that they partly overlap
>>> polys1 = gpd.GeoSeries([Polygon([(0,0), (2,0), (2,2), (0,2)]),
...                         Polygon([(2,2), (4,2), (4,4), (2,4)])])
>>> polys2 = gpd.GeoSeries([Polygon([(1,1), (3,1), (3,3), (1,3)]),
...                         Polygon([(3,3), (5,3), (5,5), (3,5)])])
>>>
>>> df1 = gpd.GeoDataFrame({'geometry': polys1, 'df1':[1,2]})
>>> df2 = gpd.GeoDataFrame({'geometry': polys2, 'df2':[1,2]})
```

```>>> ax = df1.plot(color='red');
>>>
>>> df2.plot(ax=ax, color='green');
```

```>>> res_union = gpd.overlay(df1, df2, how='union')
>>>
>>> res_union
```

| | df1 | df2 | geometry |
| --- | --- | --- | --- |
| 0 | 1.0 | 1.0 | POLYGON ((1.00000 2.00000, 2.00000 2.00000, 2.... |
| 1 | 2.0 | 1.0 | POLYGON ((2.00000 3.00000, 3.00000 3.00000, 3.... |
| 2 | 2.0 | 2.0 | POLYGON ((3.00000 4.00000, 4.00000 4.00000, 4.... |
| 3 | 1.0 | NaN | POLYGON ((0.00000 2.00000, 1.00000 2.00000, 1.... |
| 4 | 2.0 | NaN | MULTIPOLYGON (((2.00000 3.00000, 2.00000 4.000... |
| 5 | NaN | 1.0 | MULTIPOLYGON (((1.00000 2.00000, 1.00000 3.000... |
| 6 | NaN | 2.0 | POLYGON ((3.00000 5.00000, 5.00000 5.00000, 5.... |

```>>> ax = res_union.plot()
>>>
>>> df1.plot(ax=ax, facecolor='none');
>>>
>>> df2.plot(ax=ax, facecolor='none');
```

```>>> res_intersection = gpd.overlay(df1, df2, how='intersection')
>>>
>>> res_intersection
```

| | df1 | df2 | geometry |
| --- | --- | --- | --- |
| 0 | 1 | 1 | POLYGON ((1.00000 2.00000, 2.00000 2.00000, 2.... |
| 1 | 2 | 1 | POLYGON ((2.00000 3.00000, 3.00000 3.00000, 3.... |
| 2 | 2 | 2 | POLYGON ((3.00000 4.00000, 4.00000 4.00000, 4.... |

```>>> ax = res_intersection.plot()
>>>
>>> df1.plot(ax=ax, facecolor='none');
>>>
>>> df2.plot(ax=ax, facecolor='none');
```

`how='symmetric_difference'` is the opposite of `'intersection'`: it returns the geometries that are part of only one of the two GeoDataFrames, not of both:

```>>> res_symdiff = gpd.overlay(df1, df2, how='symmetric_difference')
>>>
>>> res_symdiff
```

| | df1 | df2 | geometry |
| --- | --- | --- | --- |
| 0 | 1.0 | NaN | POLYGON ((0.00000 2.00000, 1.00000 2.00000, 1.... |
| 1 | 2.0 | NaN | MULTIPOLYGON (((2.00000 3.00000, 2.00000 4.000... |
| 2 | NaN | 1.0 | MULTIPOLYGON (((1.00000 2.00000, 1.00000 3.000... |
| 3 | NaN | 2.0 | POLYGON ((3.00000 5.00000, 5.00000 5.00000, 5.... |

```>>> ax = res_symdiff.plot()
>>>
>>> df1.plot(ax=ax, facecolor='none');
>>>
>>> df2.plot(ax=ax, facecolor='none');
```

```>>> res_difference = gpd.overlay(df1, df2, how='difference')
>>>
>>> res_difference
```

| | geometry | df1 |
| --- | --- | --- |
| 0 | POLYGON ((0.00000 2.00000, 1.00000 2.00000, 1.... | 1 |
| 1 | MULTIPOLYGON (((2.00000 3.00000, 2.00000 4.000... | 2 |

```>>> ax = res_difference.plot()
>>>
>>> df1.plot(ax=ax, facecolor='none');
>>>
>>> df2.plot(ax=ax, facecolor='none');
```

```>>> res_identity = gpd.overlay(df1, df2, how='identity')
>>>
>>> res_identity
>>>
```

| | df1 | df2 | geometry |
| --- | --- | --- | --- |
| 0 | 1.0 | 1.0 | POLYGON ((1.00000 2.00000, 2.00000 2.00000, 2.... |
| 1 | 2.0 | 1.0 | POLYGON ((2.00000 3.00000, 3.00000 3.00000, 3.... |
| 2 | 2.0 | 2.0 | POLYGON ((3.00000 4.00000, 4.00000 4.00000, 4.... |
| 3 | 1.0 | NaN | POLYGON ((0.00000 2.00000, 1.00000 2.00000, 1.... |
| 4 | 2.0 | NaN | MULTIPOLYGON (((2.00000 3.00000, 2.00000 4.000... |

```>>> ax = res_identity.plot()
>>>
>>> df1.plot(ax=ax, facecolor='none');
>>>
>>> df2.plot(ax=ax, facecolor='none');
```
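The five `how` options can be compared side by side. The sketch below rebuilds the same `df1` and `df2` and records, for each option, the number of result rows and the total area; the `summary` dictionary is an illustrative helper, not part of GeoPandas. Each input covers 8 square units and the two overlap in 3, which makes the totals easy to check by hand.

```python
import geopandas as gpd
from shapely.geometry import Polygon

df1 = gpd.GeoDataFrame(
    {"df1": [1, 2]},
    geometry=[Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
              Polygon([(2, 2), (4, 2), (4, 4), (2, 4)])],
)
df2 = gpd.GeoDataFrame(
    {"df2": [1, 2]},
    geometry=[Polygon([(1, 1), (3, 1), (3, 3), (1, 3)]),
              Polygon([(3, 3), (5, 3), (5, 5), (3, 5)])],
)

summary = {}
for how in ["union", "intersection", "symmetric_difference", "difference", "identity"]:
    result = gpd.overlay(df1, df2, how=how)
    # Store (number of rows, total area) for each overlay mode.
    summary[how] = (len(result), float(result.area.sum()))
    print(how, summary[how])
```

`identity` keeps all of `df1` (split where `df2` cuts it), so its total area equals the area of `df1`, while `difference` keeps only the parts of `df1` that `df2` does not cover.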

### 2. Overlay Countries Example

```>>> world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
>>> capitals = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
>>>
>>> # Select South America and some columns
>>> countries = world[world['continent'] == "South America"]
>>> countries = countries[['geometry', 'name']]
>>>
>>> # Project to a CRS that uses meters as the distance measure
>>> countries = countries.to_crs('epsg:3395')
>>> capitals = capitals.to_crs('epsg:3395')
```

```>>> # Look at countries:
>>> countries.plot();
>>>
>>> # Now buffer cities to find the area within 500 km.
>>> # Check CRS -- World Mercator, units of meters.
>>> capitals.crs
>>>
>>> # Make a 500 km buffer
>>> capitals['geometry'] = capitals.buffer(500000)
>>> capitals.plot();
```

```>>> country_cores = gpd.overlay(countries, capitals, how='intersection')
```
```>>> country_cores.plot();
```

```>>> country_peripheries = gpd.overlay(countries, capitals, how='difference')
```
```>>> country_peripheries.plot();
```
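The core/periphery split can be checked on synthetic data without downloading anything. In this sketch the "country" is a hypothetical 1000 km square and the "capital" a single point at its centre, both in made-up metre coordinates; the names `Squareland` and `Centre City` are invented for illustration. The intersection and difference together should cover the whole country exactly once.

```python
import math

import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical country: a 1000 km x 1000 km square, coordinates in metres.
countries = gpd.GeoDataFrame(
    {"name": ["Squareland"]},
    geometry=[Polygon([(0, 0), (1_000_000, 0), (1_000_000, 1_000_000), (0, 1_000_000)])],
)
# Hypothetical capital at the centre of the square.
capitals = gpd.GeoDataFrame({"city": ["Centre City"]}, geometry=[Point(500_000, 500_000)])

# Replace each capital point by a 200 km buffer around it.
capitals["geometry"] = capitals.buffer(200_000)

country_cores = gpd.overlay(countries, capitals, how="intersection")
country_peripheries = gpd.overlay(countries, capitals, how="difference")

core_area = country_cores.area.sum()
periphery_area = country_peripheries.area.sum()
print("core share of the country:", core_area / countries.area.sum())
print("ideal circle share:", math.pi * 200_000 ** 2 / countries.area.sum())
```

The core share comes out slightly below the ideal circle share because `buffer` approximates the circle with a polygon.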

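## 11.3.3. Part VII: Aggregation with Dissolve

dissolve can be thought of as doing three things: (a) it dissolves all the geometries within a given group into a single geometric feature (using the unary_union method); (b) it aggregates all the rows of data in a group using groupby.aggregate(); and (c) it combines those two results.

### 1. dissolve Example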

```>>> world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
>>>
>>> # Keep only the grouping column and the geometry
>>> world = world[['continent', 'geometry']]
>>>
>>> continents = world.dissolve(by='continent')
>>>
>>> continents.plot();
>>>
>>> continents.head()
```

| continent | geometry |
| --- | --- |
| Africa | MULTIPOLYGON (((-16.714 13.595, -17.126 14.374... |
| Antarctica | MULTIPOLYGON (((-180.000 -84.713, -179.942 -84... |
| Asia | MULTIPOLYGON (((27.192 40.691, 26.358 40.152, ... |
| Europe | MULTIPOLYGON (((-177.664 71.133, -178.694 70.8... |
| North America | MULTIPOLYGON (((-169.529 62.977, -170.291 63.1... |

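The example above only merges geometries. To illustrate the aggregation step as well, here is a minimal sketch on synthetic data, assuming a made-up `population` column and two hypothetical regions `A` and `B`. Passing `aggfunc='sum'` adds up the attribute values of each group while the geometries are merged.

```python
import geopandas as gpd
from shapely.geometry import box

# Four unit squares: the two in region A touch, the two in region B do not.
gdf = gpd.GeoDataFrame(
    {"region": ["A", "A", "B", "B"], "population": [10, 20, 5, 7]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 0, 6, 1), box(8, 0, 9, 1)],
)

# One row per region: geometries merged, populations summed.
regions = gdf.dissolve(by="region", aggfunc="sum")

print(regions["population"])
print(regions.geometry.geom_type)
```

The grouping column becomes the index of the result, and a group whose pieces do not touch is returned as a multi-part geometry.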

## 11.3.4. Merging Data

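```>>> %matplotlib inline
>>> import geopandas as gpd
>>>
>>> # Reload the full datasets (the dissolve example kept only two columns of world)
>>> world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
>>> cities = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
>>>
>>> # For attribute join
>>> country_shapes = world[['geometry', 'iso_a3']]
>>> country_names = world[['name', 'iso_a3']]
>>>
>>> # For spatial join
>>> countries = world[['geometry', 'name']]
>>> countries = countries.rename(columns={'name':'country'})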
```

### Attribute Joins

```>>> # `country_shapes` is a GeoDataFrame with country shapes and ISO codes
>>> country_shapes.head()
```

| | geometry | iso_a3 |
| --- | --- | --- |
| 0 | MULTIPOLYGON (((180.00000 -16.06713, 180.00000... | FJI |
| 1 | POLYGON ((33.90371 -0.95000, 34.07262 -1.05982... | TZA |
| 2 | POLYGON ((-8.66559 27.65643, -8.66512 27.58948... | ESH |
| 3 | MULTIPOLYGON (((-122.84000 49.00000, -122.9742... | CAN |
| 4 | MULTIPOLYGON (((-122.84000 49.00000, -120.0000... | USA |

```>>> country_names.head()
```

| | name | iso_a3 |
| --- | --- | --- |
| 0 | Fiji | FJI |
| 1 | Tanzania | TZA |
| 2 | W. Sahara | ESH |
| 3 | Canada | CAN |
| 4 | United States of America | USA |

```>>> # Merge on the shared 'iso_a3' column
>>> country_shapes = country_shapes.merge(country_names, on='iso_a3')
>>> country_shapes.head()
```

| | geometry | iso_a3 | name |
| --- | --- | --- | --- |
| 0 | MULTIPOLYGON (((180.00000 -16.06713, 180.00000... | FJI | Fiji |
| 1 | POLYGON ((33.90371 -0.95000, 34.07262 -1.05982... | TZA | Tanzania |
| 2 | POLYGON ((-8.66559 27.65643, -8.66512 27.58948... | ESH | W. Sahara |
| 3 | MULTIPOLYGON (((-122.84000 49.00000, -122.9742... | CAN | Canada |
| 4 | MULTIPOLYGON (((-122.84000 49.00000, -120.0000... | USA | United States of America |

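One detail worth checking in an attribute join is which object the `merge` is called from. The sketch below uses three hypothetical country codes (`AAA`, `BBB`, `CCC`) and invented names: calling `merge` on the GeoDataFrame keeps the spatial type, while calling it on the plain pandas DataFrame returns a plain DataFrame.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Hypothetical shapes keyed by a made-up three-letter code.
shapes = gpd.GeoDataFrame(
    {"iso_a3": ["AAA", "BBB", "CCC"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)
# Plain attribute table sharing the same key column.
names = pd.DataFrame(
    {"iso_a3": ["AAA", "BBB", "CCC"], "name": ["Aland", "Bland", "Cland"]}
)

merged = shapes.merge(names, on="iso_a3")    # called from the spatial dataset
swapped = names.merge(shapes, on="iso_a3")   # called from the plain DataFrame

print(type(merged).__name__)
print(type(swapped).__name__)
```

For this reason it is usually safer to call `merge` from the spatial dataset, as the tutorial does.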

### Spatial Joins

```>>> # One GeoDataFrame of countries, one of cities.
>>> # We want to merge them so we can get each city's country.
>>> countries.head()
```

| | geometry | country |
| --- | --- | --- |
| 0 | MULTIPOLYGON (((180.00000 -16.06713, 180.00000... | Fiji |
| 1 | POLYGON ((33.90371 -0.95000, 34.07262 -1.05982... | Tanzania |
| 2 | POLYGON ((-8.66559 27.65643, -8.66512 27.58948... | W. Sahara |
| 3 | MULTIPOLYGON (((-122.84000 49.00000, -122.9742... | Canada |
| 4 | MULTIPOLYGON (((-122.84000 49.00000, -120.0000... | United States of America |

```>>> cities.head()
```

| | name | geometry |
| --- | --- | --- |
| 0 | Vatican City | POINT (12.45339 41.90328) |
| 1 | San Marino | POINT (12.44177 43.93610) |

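A spatial join of this kind can be sketched on synthetic data, assuming two hypothetical square countries and three invented cities, one of which lies outside both. The sketch relies on the default `intersects` test of `gpd.sjoin`, so it avoids the keyword for the spatial test, whose name differs between GeoPandas versions.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical countries: two adjacent unit squares.
countries = gpd.GeoDataFrame(
    {"country": ["Westland", "Eastland"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)
# Hypothetical cities: one per country, plus one outside both.
cities = gpd.GeoDataFrame(
    {"name": ["Alpha", "Beta", "Nowhere"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5, 5)],
)

# Inner join: keep only cities that intersect some country polygon.
cities_with_country = gpd.sjoin(cities, countries, how="inner")

print(cities_with_country[["name", "country"]])
```

With `how="inner"` the city that falls outside every polygon is dropped; `how="left"` would keep it with a missing country.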
```>>> # Note: GeoPandas 0.10 and later rename the `op` argument to `predicate`.
>>> cities_with_country = gpd.sjoin(cities, countries, how="inner", op='intersects')