geometry access with .geometry

geocompx · Sep 13, 2023 · 71f18c9 · 71f18c9
1 parent af2abd9
commit 71f18c9
Show file tree

Hide file tree

Showing 5 changed files with 29 additions and 27 deletions.
diff --git a/01-spatial-data.qmd b/01-spatial-data.qmd
@@ -222,16 +222,18 @@ gdf[gdf['name_long'] == 'Egypt'].explore()
 
 A vital column in a `GeoDataFrame` is the geometry column, of class `GeoSeries`.
 The geometry column contains the geometric part of the vector layer.
+The geometry column can be accessed by name, which typically (e.g., when reading from a file) is `'geometry'`, as in `gdf['geometry']`.
+However, the recommendation is to use the fixed `.geometry` property, which refers to the geometry column regardless whether its name is `'geometry'` or any other name.
 In the case of the `gdf` object, the geometry column contains `'MultiPolygon'`s  associated with each country:
 
 ```{python}
-gdf['geometry']
+gdf.geometry
 ```
 
 The geometry column also contains the spatial reference information, if any:
 
 ```{python}
-gdf['geometry'].crs
+gdf.geometry.crs
 ```
 
 Many geometry operations, such as calculating the centroid, buffer, or bounding box of each feature involve just the geometry.
@@ -243,14 +245,14 @@ gdf.centroid
 ```
 
 ```{python}
-gdf['geometry'].centroid
+gdf.geometry.centroid
 ```
 
-Note that `.centroid`, and other similar operators in `geopandas` such as `.buffer` (@sec-buffers) or `.convex_hull`, return only the geometry (i.e., a `GeoSeries`), not a `GeoDataFrame` with the original attribute data. In case we want the latter, we can create a copy of the `GeoDataFrame` and then "overwrite" its geometry (or, we can overwrite the geometries directly in case we don't need the original ones, as in `gdf['geometry']=gdf.centroid`):
+Note that `.centroid`, and other similar operators in `geopandas` such as `.buffer` (@sec-buffers) or `.convex_hull`, return only the geometry (i.e., a `GeoSeries`), not a `GeoDataFrame` with the original attribute data. In case we want the latter, we can create a copy of the `GeoDataFrame` and then "overwrite" its geometry (or, we can overwrite the geometries directly in case we don't need the original ones, as in `gdf.geometry=gdf.centroid`):
 
 ```{python}
 gdf2 = gdf.copy()
-gdf2['geometry'] = gdf.centroid
+gdf2.geometry = gdf.centroid
 gdf2
 ```
 
@@ -259,13 +261,13 @@ Note that the types of geometries contained in a geometry column (and, thus, a v
 Accordingly, the `.type` property returns a `Series` (of type `string`), rather than a single value:
 
 ```{python}
-gdf['geometry'].type
+gdf.geometry.type
 ```
 
 To summarize the occurrence of different geometry types in a geometry column, we can use the **pandas** method called `value_counts`:
 
 ```{python}
-gdf['geometry'].type.value_counts()
+gdf.geometry.type.value_counts()
 ```
 
 In this case, we see that the `gdf` layer contains only `'MultiPolygon'` geometries.
@@ -353,13 +355,13 @@ Each element in the geometry column is a geometry object, of class `shapely`.
 For example, here is one specific geometry selected by implicit index (that of Canada):
 
 ```{python}
-gdf['geometry'].iloc[3]
+gdf.geometry.iloc[3]
 ```
 
 We can also select a specific geometry based on the `'name_long'` attribute:
 
 ```{python}
-gdf[gdf['name_long'] == 'Egypt']['geometry'].iloc[0]
+gdf[gdf['name_long'] == 'Egypt'].geometry.iloc[0]
 ```
 
 The **shapely** package is compatible with the Simple Features standard.

diff --git a/03-spatial-operations.qmd b/03-spatial-operations.qmd
@@ -78,7 +78,7 @@ canterbury
 
 ```{python}
 # Does each 'nz_height' point intersect with 'canterbury'?
-sel = nz_height.intersects(canterbury['geometry'].iloc[0])
+sel = nz_height.intersects(canterbury.geometry.iloc[0])
 sel
 ```
 
@@ -112,7 +112,7 @@ Alternatively, we can evaluate other methods, such as `.disjoint` to obtain all
 
 ```{python}
 # Is each 'nz_height' point disjoint from 'canterbury'?
-sel = nz_height.disjoint(canterbury['geometry'].iloc[0])
+sel = nz_height.disjoint(canterbury.geometry.iloc[0])
 canterbury_height2 = nz_height[sel]
 ```
 
@@ -345,8 +345,8 @@ We can check if any of the points are the same by creating a pairwise boolean ma
 Note that the `.to_numpy` method is applied to go from a `DataFrame` to a `numpy` array, for which `.any` gives a global rather than a row-wise summary, which is what we want in this case:
 
 ```{python}
-m = cycle_hire['geometry'].apply(
-  lambda x: cycle_hire_osm['geometry'].intersects(x)
+m = cycle_hire.geometry.apply(
+  lambda x: cycle_hire_osm.geometry.intersects(x)
 )
 m.to_numpy().any()
 ```
@@ -360,7 +360,7 @@ Note that we transform the data to a projected CRS (`27700`) to use real buffer
 ```{python}
 crs = 27700
 cycle_hire_buffers = cycle_hire.copy().to_crs(crs)
-cycle_hire_buffers['geometry'] = cycle_hire_buffers.buffer(20)
+cycle_hire_buffers.geometry = cycle_hire_buffers.buffer(20)
 z = gpd.sjoin(
   cycle_hire_buffers, 
   cycle_hire_osm.to_crs(crs)
@@ -377,7 +377,7 @@ We also go back from buffers to points using `.centroid`:
 z = z[['id', 'capacity', 'geometry']] \
     .dissolve(by='id', aggfunc='mean') \
     .reset_index()
-z['geometry'] = z.centroid
+z.geometry = z.centroid
 z
 ```
 
@@ -598,7 +598,7 @@ co
 The distance matrix `d` is obtained as follows (technically speaking, this is a `DataFrame`). In plain language, we take the geometry from each each row in `nz_height.iloc[:3, :]`, and apply the `.distance` method on `co` with that row as the argument:
 
 ```{python}
-d = nz_height.iloc[:3, :].apply(lambda x: co.distance(x['geometry']), axis=1)
+d = nz_height.iloc[:3, :].apply(lambda x: co.distance(x.geometry), axis=1)
 d
 ```
 

diff --git a/04-geometry-operations.qmd b/04-geometry-operations.qmd
@@ -190,7 +190,7 @@ In the common scenario when the original attributes of the input features need t
 
 ```{python}
 seine_buff_5km = seine.copy()
-seine_buff_5km['geometry'] = seine.buffer(5000)
+seine_buff_5km.geometry = seine.buffer(5000)
 seine_buff_5km
 ```
 
@@ -469,7 +469,7 @@ What is going on in terms of the geometries? Behind the scenes, `.dissolve` comb
 
 ```{python}
 us_west = us_states[us_states['REGION'] == 'West']
-us_west_union = us_west['geometry'].unary_union
+us_west_union = us_west.geometry.unary_union
 ```
 
 Note that the result is a `shapely` geometry, as the individual attributes are "lost" as part of dissolving. The result is shown in @fig-dissolve2.
@@ -486,7 +486,7 @@ To dissolve two (or more) groups of a `GeoDataFrame` into one geometry, we can e
 ```{python}
 sel = (us_states['REGION'] == 'West') | (us_states['NAME'] == 'Texas')
 texas_union = us_states[sel]
-texas_union = texas_union['geometry'].unary_union
+texas_union = texas_union.geometry.unary_union
 ```
 
 or concatenate the two separate subsets:

diff --git a/05-raster-vector.qmd b/05-raster-vector.qmd
@@ -74,7 +74,7 @@ To mask the image, i.e., convert all pixels which do not intersect with the `zio
 ```{python}
 out_image_mask, out_transform_mask = rasterio.mask.mask(
     src_srtm, 
-    zion['geometry'], 
+    zion.geometry, 
     crop=False, 
     nodata=9999
 )
@@ -351,7 +351,7 @@ To count occurrences of categorical raster values within polygons, we can use ma
 ```{python}
 out_image, out_transform = rasterio.mask.mask(
     src_nlcd, 
-    zion['geometry'].to_crs(src_nlcd.crs), 
+    zion.geometry.to_crs(src_nlcd.crs), 
     crop=False, 
     nodata=9999
 )
@@ -435,14 +435,14 @@ Now, we can rasterize. Rasterization is a very flexible operation: the results d
 To illustrate this flexibility we will try three different approaches to rasterization. First, we create a raster representing the presence or absence of cycle hire points (known as presence/absence rasters). In this case, we transfer the value of `1` to all pixels where at least one point falls in. To transform the point `GeoDataFrame` into a generator of `shapely` geometries and the (fixed) values, we use the following expression:
 
 ```{python}
-((g, 1) for g in cycle_hire_osm_projected['geometry'].to_list())
+((g, 1) for g in cycle_hire_osm_projected.geometry.to_list())
 ```
 
 Therefore, the rasterizing expression is:
 
 ```{python}
 ch_raster1 = rasterio.features.rasterize(
-    ((g, 1) for g in cycle_hire_osm_projected['geometry'].to_list()),
+    ((g, 1) for g in cycle_hire_osm_projected.geometry.to_list()),
     out_shape=shape,
     transform=transform
 )
@@ -458,7 +458,7 @@ To count the number of stations, we can use the fixed value of `1` combined with
 
 ```{python}
 ch_raster2 = rasterio.features.rasterize(
-    ((g, 1) for g in cycle_hire_osm_projected['geometry'].to_list()),
+    ((g, 1) for g in cycle_hire_osm_projected.geometry.to_list()),
     out_shape=shape,
     transform=transform,
     merge_alg=rasterio.enums.MergeAlg.add
@@ -514,7 +514,7 @@ california
 Second, we "cast" the polygon into a `"MultiLineString"` geometry, using the `.boundary` property that `GeoSeries` have: 
 
 ```{python}
-california_borders = california['geometry'].boundary
+california_borders = california.geometry.boundary
 california_borders
 ```
 
@@ -551,7 +551,7 @@ Compare it to a polygon rasterization, with `all_touched=False` by default, whic
 
 ```{python}
 california_raster2 = rasterio.features.rasterize(
-    ((g, 1) for g in california['geometry'].to_list()),
+    ((g, 1) for g in california.geometry.to_list()),
     out_shape=shape,
     transform=transform,
     fill=np.nan

diff --git a/06-reproj.qmd b/06-reproj.qmd
@@ -320,7 +320,7 @@ lonlat2UTM(174.7, -36.9)
 Here is another example for London (where we "unpack" the coordinates of the 1^st^ geometry in `lnd_layer` into the `lonlat2UTM` function arguments):
 
 ```{python}
-lonlat2UTM(*lnd_layer['geometry'].iloc[0].coords[0])
+lonlat2UTM(*lnd_layer.geometry.iloc[0].coords[0])
 ```
 
 Currently, we also have tools helping us to select a proper CRS.