diff --git a/01-spatial-data.html b/01-spatial-data.html index 1816a3cf..cfd4595a 100644 --- a/01-spatial-data.html +++ b/01-spatial-data.html @@ -418,17 +418,17 @@

1.2.2 Vector layers

The most commonly used geographic vector data structure is the vector layer. There are several approaches for working with vector layers in Python, ranging from low-level packages (e.g., osgeo, fiona) to the relatively high-level geopandas package that is the focus of this section. Before writing and running code for creating and working with geographic vector objects, we need to import geopandas (by convention as gpd for more concise code) and shapely.

-
+
import pandas as pd
 import shapely
 import geopandas as gpd

We also limit the maximum number of printed rows to six, to save space, using the 'display.max_rows' option of pandas.

-
+
pd.set_option('display.max_rows', 6)

Projects often start by importing an existing vector layer saved as a GeoPackage (.gpkg) file, an ESRI Shapefile (.shp), or another geographic file format. The following call to gpd.read_file imports a GeoPackage file named world.gpkg, located in the data directory under Python’s current working directory, into a GeoDataFrame named gdf.

-
+
gdf = gpd.read_file('data/world.gpkg')

The result is an object of type (class) GeoDataFrame with 177 rows (features) and 11 columns, as shown in the output of the following code:

@@ -438,14 +438,14 @@

geopandas.geodataframe.GeoDataFrame

-
+
gdf.shape
(177, 11)

The GeoDataFrame class is an extension of the DataFrame class from the popular pandas package (McKinney 2010). This means we can treat the non-spatial attributes of a vector layer as a table and process them using ordinary (i.e., non-spatial) pandas functions and methods. For example, standard data frame subsetting methods can be used. The code below creates a subset of the gdf dataset containing only the country name and the geometry.

-
+
gdf = gdf[['name_long', 'geometry']]
 gdf
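To make the relationship between GeoDataFrame and DataFrame concrete without depending on world.gpkg, here is a minimal sketch using a hypothetical three-row layer named toy. The point coordinates and the 'pop' column are placeholder values invented for illustration, not data from the book. It checks that GeoDataFrame is a subclass of DataFrame and that selecting columns together with 'geometry' still yields a GeoDataFrame.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical stand-in for the world layer; all values are placeholders
toy = gpd.GeoDataFrame(
    {
        'name_long': ['Egypt', 'Fiji', 'Chile'],
        'pop': [1, 2, 3],
    },
    geometry=[Point(30, 27), Point(178, -18), Point(-71, -35)],
)

# GeoDataFrame inherits from the pandas DataFrame class
is_subclass = issubclass(gpd.GeoDataFrame, pd.DataFrame)

# Ordinary column subsetting; keeping 'geometry' preserves the spatial class
subset = toy[['name_long', 'geometry']]
subset_columns = list(subset.columns)
```

Because the geometry column is retained, subset remains a GeoDataFrame rather than being reduced to a plain DataFrame.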
@@ -504,7 +504,7 @@

The following expression creates a subset based on a condition, in this case that the value in the 'name_long' column equals the string 'Egypt'.

-
+
gdf[gdf['name_long'] == 'Egypt']
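It can help to split a conditional subset like this into two steps: the comparison first produces a boolean Series (a mask), and indexing with that mask keeps only the rows where it is True. The sketch below illustrates this on a hypothetical three-row layer named toy, whose coordinates are placeholders rather than real country geometries.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical stand-in for the world layer; coordinates are placeholders
toy = gpd.GeoDataFrame(
    {'name_long': ['Egypt', 'Fiji', 'Chile']},
    geometry=[Point(30, 27), Point(178, -18), Point(-71, -35)],
)

# Step 1: the comparison returns one boolean value per row
mask = toy['name_long'] == 'Egypt'

# Step 2: indexing with the mask keeps only the rows where it is True
egypt = toy[mask]
matched_names = egypt['name_long'].tolist()
```

The result, egypt, is again a GeoDataFrame, so spatial methods remain available on the filtered rows.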
@@ -579,7 +579,7 @@