Skip to content

Commit

Permalink
raster of distances to nearest geometry
Browse files Browse the repository at this point in the history
  • Loading branch information
michaeldorman committed Aug 1, 2023
1 parent fafa670 commit 28b195f
Show file tree
Hide file tree
Showing 6 changed files with 167 additions and 58 deletions.
99 changes: 50 additions & 49 deletions 04-spatial-operations.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -25,35 +25,36 @@ import numpy as np
import os
import rasterio
import scipy.ndimage
from rasterio.plot import show
import rasterio.plot
import rasterio.merge
import rasterio.features
```

and load the sample data for this chapter:

```{python}
#| echo: false
from pathlib import Path
data_path = Path("data")
file_path = Path("data/landsat.tif")
data_path = Path('data')
file_path = Path('data/landsat.tif')
if not file_path.exists():
if not data_path.is_dir():
os.mkdir(data_path)
print("Attempting to get the data")
print('Attempting to get the data')
import requests
r = requests.get("https://github.com/geocompx/geocompy/releases/download/0.1/landsat.tif")
with open(file_path, "wb") as f:
r = requests.get('https://github.com/geocompx/geocompy/releases/download/0.1/landsat.tif')
with open(file_path, 'wb') as f:
f.write(r.content)
```

```{python}
nz = gpd.read_file("data/nz.gpkg")
nz_height = gpd.read_file("data/nz_height.gpkg")
nz = gpd.read_file('data/nz.gpkg')
nz_height = gpd.read_file('data/nz_height.gpkg')
world = gpd.read_file('data/world.gpkg')
cycle_hire = gpd.read_file('data/cycle_hire.gpkg')
cycle_hire_osm = gpd.read_file('data/cycle_hire_osm.gpkg')
src_elev = rasterio.open("data/elev.tif")
src_multi_rast = rasterio.open("data/landsat.tif")
src_elev = rasterio.open('data/elev.tif')
src_multi_rast = rasterio.open('data/landsat.tif')
src_grain = rasterio.open('data/grain.tif')
```

Expand Down Expand Up @@ -625,15 +626,7 @@ In case we want all values at once we can apply `list`. The result is `16`:
list(src_elev.sample([(0.1, 0.1)]))
```

Raster objects can also be subset with another raster object, as demonstrated in the code chunk below:

...

```{python}
# ...
```

Another common use case of spatial subsetting is using a boolean mask, based on another raster with the same extent and resolution, or the original one, as illustrated in Figure .... To do that, we "erase" the values in the array of one raster, according to another corresponding "mask" raster. For example, let us read the `elev.tif` raster array:
Another common use case of spatial subsetting is using a boolean mask, based on another raster with the same extent and resolution, or the original one, as illustrated in @fig-raster-subset. To do that, we "erase" the values in the array of one raster, according to another corresponding "mask" raster. For example, let us read the `elev.tif` raster array:

```{python}
elev = src_elev.read(1)
Expand Down Expand Up @@ -664,15 +657,15 @@ The result is shown in @fig-raster-subset.
#| fig-cap: Original raster (left). Raster mask (middle). Output of masking a raster (right).
fig, axes = plt.subplots(ncols=3, figsize=(8,4))
show(elev, ax=axes[0])
show(mask, ax=axes[1])
show(elev1, ax=axes[2])
axes[0].set_title("Original")
axes[1].set_title("Mask")
axes[2].set_title("Result");
rasterio.plot.show(elev, ax=axes[0])
rasterio.plot.show(mask, ax=axes[1])
rasterio.plot.show(elev1, ax=axes[2])
axes[0].set_title('Original')
axes[1].set_title('Mask')
axes[2].set_title('Result');
```

The above approach can be also used to replace some values (e.g., expected to be wrong) with `nan`:
The above approach can be also used to replace some values (e.g., expected to be wrong) with `np.nan`:

```{python}
elev1 = elev.copy()
Expand All @@ -687,7 +680,7 @@ These operations are in fact Boolean local operations, since we compare cell-wis

The term 'map algebra' was coined in the late 1970s to describe a "set of conventions, capabilities, and techniques" for the analysis of geographic raster and (although less prominently) vector data (Tomlin 1994). In this context, we define map algebra more narrowly, as operations that modify or summarise raster cell values, with reference to surrounding cells, zones, or statistical functions that apply to every cell.

Map algebra operations tend to be fast, because raster datasets only implicitly store coordinates, hence the old adage raster is faster but vector is corrector. The location of cells in raster datasets can be calculated by using its matrix position and the resolution and origin of the dataset (stored in the header). For the processing, however, the geographic position of a cell is barely relevant as long as we make sure that the cell position is still the same after the processing. Additionally, if two or more raster datasets share the same extent, projection and resolution, one could treat them as matrices for the processing.
Map algebra operations tend to be fast, because raster datasets only implicitly store coordinates, hence the old adage "raster is faster but vector is corrector". The location of cells in raster datasets can be calculated by using its matrix position and the resolution and origin of the dataset (stored in the header). For the processing, however, the geographic position of a cell is barely relevant as long as we make sure that the cell position is still the same after the processing. Additionally, if two or more raster datasets share the same extent, projection and resolution, one could treat them as matrices for the processing.

This is the way that map algebra works with the terra package. First, the headers of the raster datasets are queried and (in cases where map algebra operations work on more than one dataset) checked to ensure the datasets are compatible. Second, map algebra retains the so-called one-to-one locational correspondence, meaning that cells cannot move. This differs from matrix algebra, in which values change position, for example when multiplying or dividing matrices.

Expand Down Expand Up @@ -727,17 +720,17 @@ elev + elev
#| fig-cap: 'Examples of different local operations of the elev raster object: adding two rasters, squaring, applying logarithmic transformation, and performing a logical operation.'
fig, axes = plt.subplots(ncols=4, figsize=(8,4))
show(elev + elev, ax=axes[0], cmap="Oranges")
show(elev1 ** 2, ax=axes[1], cmap="Oranges") # using elev here produces int8 overflows
show(np.log(elev), ax=axes[2], cmap="Oranges")
show(elev > 5, ax=axes[3], cmap="Oranges")
axes[0].set_title("elev+elev")
axes[1].set_title("elev ** 2")
axes[2].set_title("np.log(elev)")
axes[3].set_title("elev > 5");
rasterio.plot.show(elev + elev, ax=axes[0], cmap='Oranges')
rasterio.plot.show(elev1 ** 2, ax=axes[1], cmap='Oranges') # using elev here produces int8 overflows
rasterio.plot.show(np.log(elev), ax=axes[2], cmap='Oranges')
rasterio.plot.show(elev > 5, ax=axes[3], cmap='Oranges')
axes[0].set_title('elev+elev')
axes[1].set_title('elev ** 2')
axes[2].set_title('np.log(elev)')
axes[3].set_title('elev > 5');
```

Another good example of local operations is the classification of intervals of numeric values into groups such as grouping a digital elevation model into low (class 1), middle (class 2) and high elevations (class 3). Here, we assign the raster values in the ranges 012, 1224 and 2436 are reclassified to take values 1, 2 and 3, respectively.
Another good example of local operations is the classification of intervals of numeric values into groups such as grouping a digital elevation model into low (class 1), middle (class 2) and high elevations (class 3). Here, we assign the raster values in the ranges 0--12, 12--24 and 24--36 are reclassified to take values 1, 2 and 3, respectively.

```{python}
recl = elev.copy()
Expand All @@ -753,10 +746,10 @@ The reclassified result is shown in @fig-raster-reclassify.
#| fig-cap: Reclassifying a continuous raster into three categories.
fig, axes = plt.subplots(ncols=2, figsize=(8,4))
show(elev, ax=axes[0], cmap="Oranges")
show(recl, ax=axes[1], cmap="Oranges")
axes[0].set_title("Original")
axes[1].set_title("Reclassified");
rasterio.plot.show(elev, ax=axes[0], cmap='Oranges')
rasterio.plot.show(recl, ax=axes[1], cmap='Oranges')
axes[0].set_title('Original')
axes[1].set_title('Reclassified');
```

The calculation of the normalized difference vegetation index (NDVI) is a well-known local (pixel-by-pixel) raster operation. It returns a raster with values between -1 and 1; positive values indicate the presence of living plants (mostly > 0.2). NDVI is calculated from red and near-infrared (NIR) bands of remotely sensed imagery, typically from satellite systems such as Landsat or Sentinel. Vegetation absorbs light heavily in the visible light spectrum, and especially in the red channel, while reflecting NIR light, explaining the NVDI formula:
Expand Down Expand Up @@ -796,10 +789,10 @@ The result is shown in @fig-raster-ndvi.
#| fig-cap: RGB image (left) and NDVI values (right) calculated for the example satellite file of the Zion National Park.
fig, axes = plt.subplots(ncols=2, figsize=(8,4))
show(multi_rast_rgb, ax=axes[0], cmap="RdYlGn")
show(ndvi, ax=axes[1], cmap="Greens")
axes[0].set_title("RGB image")
axes[1].set_title("NDVI");
rasterio.plot.show(multi_rast_rgb, ax=axes[0], cmap='RdYlGn')
rasterio.plot.show(ndvi, ax=axes[1], cmap='Greens')
axes[0].set_title('RGB image')
axes[1].set_title('NDVI');
```

### Focal operations
Expand Down Expand Up @@ -915,9 +908,17 @@ z

This returns the statistics for each category, here the mean elevation for each grain size class. For example, the mean elevation in pixels characterized by grain size `0` is `14.8`, and so on.

### Global operations and distances
### Global operations and distances {#sec-global-operations-and-distances}

...
Global operations are a special case of zonal operations with the entire raster dataset representing a single zone.
The most common global operations are descriptive statistics for the entire raster dataset such as the minimum or maximum---we already discussed those in @sec-summarizing-raster-objects.

Aside from that, global operations are also useful for the computation of distance and weight rasters.
In the first case, one can calculate the distance from each cell to specific target cells or vector geometries.
For example, one might want to compute the distance to the nearest coast (see example in @sec-distance-to-nearest-geometry).
We might also want to consider topography, that means, we are not only interested in the pure distance but would like also to avoid the crossing of mountain ranges when going to the coast.
To do so, we can weight the distance with elevation so that each additional altitudinal meter "prolongs" the Euclidean distance (this is beyond the scope of the book).
Visibility and viewshed computations also belong to the family of global operations (this is also beyond the scope of the book).

### Map algebra counterparts in vector processing

Expand Down Expand Up @@ -949,9 +950,9 @@ Both inputs and the result are shown in @fig-raster-merge:
#| fig-cap: Raster merging
fig, axes = plt.subplots(2, 2, figsize=(8,4))
show(src_1, ax=axes[0][0])
show(src_2, ax=axes[0][1])
show(out_image, transform=out_transform, ax=axes[1][0])
rasterio.plot.show(src_1, ax=axes[0][0])
rasterio.plot.show(src_2, ax=axes[0][1])
rasterio.plot.show(out_image, transform=out_transform, ax=axes[1][0])
axes[1][1].axis('off')
axes[0][0].set_title('aut.tif')
axes[0][1].set_title('ch.tif')
Expand Down
12 changes: 6 additions & 6 deletions 05-geometry-operations.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -645,8 +645,8 @@ The second step is to update the `transform`, taking into account the change in

```{python}
new_transform = src.transform * src.transform.scale(
(src.width / r.shape[0]),
(src.height / r.shape[1])
(src.width / r.shape[1]),
(src.height / r.shape[0])
)
new_transform
```
Expand All @@ -670,8 +670,8 @@ In case we need to export the new raster, we need to update the metadata:
dst_kwargs = src.meta.copy()
dst_kwargs.update({
'transform': new_transform,
'width': r.shape[0],
'height': r.shape[1],
'width': r.shape[1],
'height': r.shape[0],
})
dst_kwargs
```
Expand Down Expand Up @@ -712,8 +712,8 @@ And here is the same expression as shown for aggregation, to calculate the new t

```{python}
new_transform2 = src.transform * src.transform.scale(
(src.width / r2.shape[0]),
(src.height / r2.shape[1])
(src.width / r2.shape[1]),
(src.height / r2.shape[0])
)
new_transform2
```
Expand Down
110 changes: 109 additions & 1 deletion 06-raster-vector.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ import rasterio
import rasterio.mask
import rasterstats
import rasterio.plot
import rasterio.features
import math
import os
```
Expand All @@ -43,6 +44,8 @@ zion = gpd.read_file('data/zion.gpkg')
zion_points = gpd.read_file('data/zion_points.gpkg')
cycle_hire_osm = gpd.read_file('data/cycle_hire_osm.gpkg')
us_states = gpd.read_file('data/us_states.gpkg')
nz = gpd.read_file('data/nz.gpkg')
src_nz_elev = rasterio.open('data/nz_elev.tif')
```

## Introduction
Expand Down Expand Up @@ -356,7 +359,7 @@ axes[1].set_title('Categorical data extraction');

## Rasterization {#sec-rasterization}

### Rasterizing points
### Rasterizing points {#sec-rasterizing-points}

Rasterization is the conversion of vector objects into their representation in raster objects. Usually, the output raster is used for quantitative analysis (e.g., analysis of terrain) or modeling. As we saw in @sec-spatial-class, the raster data model has some characteristics that make it conducive to certain methods. Furthermore, the process of rasterization can help simplify datasets because the resulting values all have the same spatial resolution: rasterization can be seen as a special type of geographic data aggregation.

Expand Down Expand Up @@ -571,6 +574,8 @@ There are three standard methods to convert a raster to a vector layer:
* Raster to points
* Raster to contours

### Raster to polygons

The most straightforward form of vectorization is the first one, converting raster cells to polygons, where each pixel is represented by a rectangular polygon. The second method, raster to points, has the additional step of calculating polygon centroids. The third method, raster to contours, is somewhat unrelated. Let us demonstrate the three in the given order.

The `rasterio.features.shapes` function can be used to access to the raster pixel as polygon geometries, as well as raster values. The returned object is a generator, which yields `geometry,value` pairs. The additional `transform` argument is used to yield true spatial coordinates of the polygons, which is usually what we want.
Expand Down Expand Up @@ -627,6 +632,8 @@ The resulting polygon layer is shown in @fig-raster-to-polygons. As shown using
result.plot(column='value', edgecolor='black', legend=True);
```

### Raster to points {#sec-raster-to-points}

To transform raster to points, we can use `rasterio.features.shapes`, as in conversion to polygons, only with the addition of the `.centroid` method to go from polygons to their centroids. However, to avoid dissolving nearby pixels, we will actually convert a raster with consecutive IDs, then extract the "true" values by point (it is not strictly necessary in this example, since the values of `elev.tif` are all unique):

```{python}
Expand Down Expand Up @@ -670,6 +677,8 @@ result.plot(column='value', legend=True, ax=axes[1])
rasterio.plot.show(src_elev, cmap='Greys', ax=axes[1]);
```

### Raster to contours

Another common type of spatial vectorization is the creation of contour lines representing lines of continuous height or temperatures (isotherms) for example. We will use a real-world digital elevation model (DEM) because the artificial raster elev produces parallel lines (task for the reader: verify this and explain why this happens). Plotting contour lines is straightforward, using the `contour=True` option of `rasterio.plot.show` (@fig-raster-contours1):

```{python}
Expand Down Expand Up @@ -721,5 +730,104 @@ rasterio.plot.show(src_dem, ax=ax)
contours.plot(ax=ax, edgecolor='black');
```

## Distance to nearest geometry {#sec-distance-to-nearest-geometry}

Calculating a raster of distances to the nearest geometry is an example of a "global" raster operation (@sec-global-operations-and-distances).
To demonstrate it, suppose that we need to calculate a raster representing the distance to the nearest coast in New Zealand.
This example also wraps many of the concepts introduced in this chapter and in previous chapter, such as raster aggregation, raster conversion to points, and rasterizing points.

For the coastline, we will dissolve the New Zealand administrative division polygon layer and "extract" the boundary (which is a `"MultiLineString"` geometry):

```{python}
coastline = gpd.GeoSeries(nz.unary_union, crs=nz.crs) \
.to_crs(src_nz_elev.crs) \
.boundary
coastline
```

For a "template" raster, we will aggregate the New Zealand DEM, in the `nz_elev.tif` file, to 5 times coarser resolution.
The code section below follows the aggeregation example in @sec-raster-agg-disagg, then replaces the original (aggregated) values with unique IDs, which is a preliminary step when converting to points, as explained in @sec-raster-to-points.
Finally, we also replace "erase" (i.e., replace with `np.nan`) IDs which were `np.nan` in the aggregated elevation raster, i.e., beyond the land area of New Zealand:

```{python}
factor = 0.2
# Reading aggregated array
r = src_nz_elev.read(1,
out_shape=(
int(src_nz_elev.height * factor),
int(src_nz_elev.width * factor)
),
resampling=rasterio.enums.Resampling.average
)
# Updating the transform
new_transform = src_nz_elev.transform * src_nz_elev.transform.scale(
(src_nz_elev.width / r.shape[1]),
(src_nz_elev.height / r.shape[0])
)
# Generating unique IDs per cell
ids = r.copy()
ids = np.arange(0, r.size).reshape(r.shape).astype(np.float32)
# "Erasing" irrelevant IDs
ids[np.isnan(r)] = np.nan
ids
```

The result is an array named `ids` with the IDs, and the corresponding `new_transform`, as plotted in @fig-raster-distances1:

```{python}
#| label: fig-raster-distances1
#| fig-cap: Template with cell IDs to calculate distance to nearest geometry
fig, ax = plt.subplots()
rasterio.plot.show(ids, transform=new_transform, ax=ax)
gpd.GeoSeries(coastline).plot(ax=ax, edgecolor='black');
```

To calculate distances, we must convert each pixel to a vector (point) geometry.
We use the technique demonstrated in @sec-raster-to-points:

```{python}
shapes = rasterio.features.shapes(ids, transform=new_transform)
pol = list(shapes)
pnt = [shapely.geometry.shape(i[0]).centroid for i in pol]
```

The result `pnt` is a `list` of `shapely` geometries, representing raster cell centroids (excluding `np.nan` pixels):

```{python}
print(pnt[0])
```

Next we calculate the correspinding `list` of distances, using the `.distance` method from `shapely`:

```{python}
distances = [(i, i.distance(coastline)) for i in pnt]
distances[0]
```

Finally, we rasterize (see @sec-rasterizing-points) the distances into our raster template:

```{python}
image = rasterio.features.rasterize(
distances,
out_shape=ids.shape,
dtype=np.float_,
transform=new_transform,
fill=np.nan
)
image
```

The resulting raster of distances is shown in @fig-raster-distances2:

```{python}
#| label: fig-raster-distances2
#| fig-cap: Distance to nearest coastline in New Zealand
fig, ax = plt.subplots()
rasterio.plot.show(image, transform=new_transform, ax=ax)
gpd.GeoSeries(coastline).plot(ax=ax, edgecolor='black');
```

## Exercises

Loading

0 comments on commit 28b195f

Please sign in to comment.