ch03 corrections (spatial join)
michaeldorman committed Sep 23, 2023
1 parent ff9e282 commit 9bccfa0
Showing 1 changed file with 26 additions and 17 deletions.
43 changes: 26 additions & 17 deletions 03-spatial-operations.qmd
@@ -322,46 +322,55 @@ points.iloc[1].distance(poly.iloc[0])

Joining two non-spatial datasets relies on a shared 'key' variable, as described in @sec-vector-attribute-joining. Spatial data joining applies the same concept, but instead relies on spatial relations, described in the previous section. As with attribute data, joining adds new columns to the target object (the argument x in joining functions), from a source object (y).

The process is illustrated by the following example: imagine you have ten points randomly distributed across the Earth's surface and you ask, for the points that are on land, which countries are they in? Implementing this idea in a reproducible example will build your geographic data handling skills and show how spatial joins work. The starting point is to create points that are randomly scattered over the Earth's surface:
The process is illustrated by the following example: imagine you have ten points randomly distributed across the Earth's surface and you ask, for the points that are on land, which countries are they in? Implementing this idea in a reproducible example will build your geographic data handling skills and show how spatial joins work. The starting point is to create points that are randomly scattered over the Earth's surface (@fig-spatial-join (a)):

```{python}
np.random.seed(11) ## set seed for reproducibility
np.random.seed(11) ## set seed for reproducibility
bb = world.total_bounds ## the world's bounds
x = np.random.uniform(low=bb[0], high=bb[2], size=10)
y = np.random.uniform(low=bb[1], high=bb[3], size=10)
random_points = gpd.points_from_xy(x, y, crs=4326)
random_points = gpd.GeoSeries(random_points)
random_points = gpd.GeoDataFrame({'geometry': random_points})
random_points
```

The scenario illustrated in @fig-spatial-join shows that the `random_points` object (top left) lacks attribute data, while the world (top right) has attributes, including country names shown for a sample of countries in the legend. Spatial joins are implemented with `gpd.sjoin`, as illustrated in the code chunk below. The output is the `random_joined` object which is illustrated in @fig-spatial-join (bottom left). Before creating the joined dataset, we use spatial subsetting to create world_random, which contains only countries that contain random points, to verify the number of country names returned in the joined dataset should be four (see the top right panel of @fig-spatial-join).
The scenario illustrated in @fig-spatial-join shows that the `random_points` object (@fig-spatial-join (a)) lacks attribute data, while `world` (@fig-spatial-join (b)) has attributes, including country names shown for a sample of countries in the legend. Spatial joins are implemented with `x.sjoin(y)`, as illustrated in the code chunk below. The output is the `random_joined` object, which is illustrated in @fig-spatial-join (c). Before creating the joined dataset, we use spatial subsetting to create `world_random`, which contains only the countries that contain random points, to verify that the number of country names returned in the joined dataset should be four (see @fig-spatial-join (b)):

```{python}
# Subset
world_random = world[world.intersects(random_points.unary_union)]
world_random
```

Now we can do the spatial join:

```{python}
# Spatial join
random_joined = gpd.sjoin(random_points, world, how='left')
random_joined = random_points.sjoin(world, how='left')
random_joined
```
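
To make the `how='left'` behaviour concrete, here is a minimal conceptual sketch in plain Python. It is not how `geopandas` implements `sjoin`: axis-aligned rectangles stand in for country polygons, and the function name, region names, and coordinates are all hypothetical. It only illustrates that a left join keeps every point, attaching the attributes of any region it falls in and a missing value otherwise.

```python
# Conceptual sketch only: rectangles stand in for country polygons,
# and every name and coordinate below is hypothetical.
def left_spatial_join(points, regions):
    """Attach to each point the name of every region that contains it.

    points:  list of (x, y) tuples
    regions: dict mapping name -> (xmin, ymin, xmax, ymax)
    Points that fall in no region are kept with name None (left join).
    """
    joined = []
    for x, y in points:
        # Boundary-inclusive test, analogous to an 'intersects' relation
        matches = [
            name
            for name, (xmin, ymin, xmax, ymax) in regions.items()
            if xmin <= x <= xmax and ymin <= y <= ymax
        ]
        if not matches:
            # Left join: keep the unmatched point, with a missing attribute
            joined.append({"x": x, "y": y, "name": None})
        for name in matches:
            joined.append({"x": x, "y": y, "name": name})
    return joined


regions = {"A": (0, 0, 10, 10), "B": (20, 0, 30, 10)}
points = [(5, 5), (25, 5), (15, 5)]  # the last point lies between the regions
result = left_spatial_join(points, regions)
for row in result:
    print(row)
```

In the `geopandas` result above, the same pattern shows up as points over the ocean that keep their geometry but have `NaN` in the joined `world` columns.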

The input points and countries, the illustration of intersecting countries, and the join result, are shown in @fig-spatial-join:

```{python}
#| label: fig-spatial-join
#| fig-cap: Illustration of a spatial join. A new attribute variable is added to random points (top left) from source world object (top right) resulting in the data represented in the final panel.
#| fig-cap: Illustration of a spatial join
#| fig-subcap:
#| - A new attribute variable is added to random points,
#| - from source world object,
#| - resulting in points associated with country names
#| layout-ncol: 2
fig, axes = plt.subplots(2, 2, figsize=(8,4))
base = world.plot(color='white', edgecolor='lightgrey', ax=axes[0][0])
random_points.plot(ax=base, color='None', edgecolor='red')
base = world.plot(color='white', edgecolor='lightgrey', ax=axes[0][1])
world_random.plot(ax=base, column='name_long')
base = world.plot(color='white', edgecolor='lightgrey', ax=axes[1][0])
random_joined.geometry.plot(ax=base, color='grey');
random_joined.plot(ax=base, column='name_long', legend=True)
fig.delaxes(axes[1][1]);
# Random points
base = world.plot(color='white', edgecolor='lightgrey')
random_points.plot(ax=base, color='None', edgecolor='red');
# World countries intersecting with the points
base = world.plot(color='white', edgecolor='lightgrey')
world_random.plot(ax=base, column='name_long');
# Points with joined country names
base = world.plot(color='white', edgecolor='lightgrey')
random_joined.geometry.plot(ax=base, color='grey')
random_joined.plot(ax=base, column='name_long', legend=True);
```

### Non-overlapping joins