# load the library
library(terra)
library(tidyverse)
Basic Manipulation
1 Library
The terra
package in R is a powerful and versatile package for working with geospatial data, including vector and raster data. It provides a wide range of functionality for reading, processing, analyzing, and visualizing spatial data.
For more in-depth information and resources on the terra package and spatial data science in R, you can explore the original website Spatial Data Science.
Firstly load the library to the R space:
The libraries for spatial data in Python are divided into several libraries, unlike the comprehensive terra
library in R. For vector data, you can use the geopandas
library, and for raster data, rasterio
is a good choice, among others.
For more in-depth information and resources on the spatial data science in Python, you can explore the website Python Open Source Spatial Programming & Remote Sensing.
import os
import pandas as pd
import numpy as np
# Vector
import geopandas as gpd
from shapely.geometry import Point, LineString, Polygon, shape
import fiona
# Rster
import rasterio
from rasterio.plot import show as rast_plot
from rasterio.crs import CRS
from rasterio.warp import calculate_default_transform, reproject, Resampling
import rasterio.features
from rasterio.enums import Resampling
# Plot
import matplotlib.pyplot as plt
2 Creating spatial data manually
Creating spatial data manually is not a common practice due to the typically large volumes of data required. However, by starting from scratch and creating spatial data manually, you can gain a deeper understanding of the data’s structure and properties. This manual creation process helps you become more familiar with how spatial data is organized and can be a valuable learning exercise.
The examples provided here are just a few methods for manually creating spatial data. There are numerous ways to create spatial data in R with the terra
package. You can refer to the package documentation, specifically the rast()
and vect()
functions, to explore more advanced methods for creating and manipulating spatial data.
2.1 Vector
As introduced in the section, spatial vector data typically consists of three main components:
- Geometry: Describes the spatial location and shape of features.
- Attributes: Non-spatial properties associated with features.
- CRS (Coordinate Reference System): Defines the spatial reference framework.
# Define the coordinate reference system (CRS) with EPSG codes
<- "EPSG:31468"
crs_31468
# Define coordinates for the first polygon
<- c(4484566, 4483922, 4483002, 4481929, 4481222, 4482500, 4483000, 4484666, 4484233)
x_polygon_1 <- c(5554566, 5554001, 5553233, 5554933, 5550666, 5551555, 5550100, 5551711, 5552767)
y_polygon_1 <- cbind(id=1, part=1, x_polygon_1, y_polygon_1)
geometry_polygon_1 # Define coordinates for the second polygon
<- c(4481929, 4481222, 4480500)
x_polygon_2 <- c(5554933, 5550666, 5552555)
y_polygon_2 <- cbind(id=2, part=1, x_polygon_2, y_polygon_2)
geometry_polygon_2 # Combine the two polygons into one data frame
<- rbind(geometry_polygon_1, geometry_polygon_2)
geometry_polygon
# Create a vector layer for the polygons, specifying their type, attributes, CRS, and additional attributes
<- vect(geometry_polygon, type="polygons",
vect_Test atts = data.frame(ID_region = 1:2, Name = c("a", "b")),
crs = crs_31468)
$region_area <- expanse(vect_Test)
vect_Test
# Visualize the created polygons
plot(vect_Test)
# Define the coordinate reference system (CRS) with EPSG codes
= "EPSG:31468"
crs_31468
# Define coordinates for the first polygon
= [4484566, 4483922, 4483002, 4481929, 4481222, 4482500, 4483000, 4484666, 4484233]
x_polygon_1 = [5554566, 5554001, 5553233, 5554933, 5550666, 5551555, 5550100, 5551711, 5552767]
y_polygon_1 # Create a list of coordinate pairs for the first polygon
= Polygon([(x, y) for x, y in zip(x_polygon_1, y_polygon_1)])
geometry_polygon_1
# Define coordinates for the second polygon
= [4481929, 4481222, 4480500]
x_polygon_2 = [5554933, 5550666, 5552555]
y_polygon_2 # Create a list of coordinate pairs for the second polygon
= Polygon([(x, y) for x, y in zip(x_polygon_2, y_polygon_2)])
geometry_polygon_2
# Construct Shapely polygons using the lists of coordinates
= [geometry_polygon_1, geometry_polygon_2]
geometry_polygon
# Create a GeoDataFrame with the polygons, specifying their attributes, CRS, and additional attributes
= gpd.GeoDataFrame({
vect_Test 'ID_region': [1, 2],
'Name': ['a', 'b'],
'geometry': geometry_polygon,
=crs_31468)
}, crs
# Calculate the region area and add it as a new column
'region_area'] = vect_Test.area
vect_Test[
# Visualize the created polygons
vect_Test.plot() plt.show()
plt.close()
2.2 Raster
For raster data, the geometry is relatively simple and can be defined by the following components:
- Coordinate of Original Point (X0, Y0) plus Resolutions (X and Y)
- Boundaries (Xmin, Xmax, Ymin, Ymax) plus Number of Rows and Columns
One of the most critical aspects of raster data is the values stored within its cells. You can set or modify these values using the values()<-
function in R.
<- rast(ncol=10, nrow=10, xmin=-150, xmax=-80, ymin=20, ymax=60)
rast_Test values(rast_Test) <- runif(ncell(rast_Test))
plot(rast_Test)
= "C:\\Lei\\HS_Web\\data_share/raster_Py.tif"
fn_Rast_Test
# Create a new raster with the specified dimensions and extent
= 10, 10
ncol, nrow = -150, -80, 20, 60
xmin, xmax, ymin, ymax
# Create the empty raster with random values
with rasterio.open(
fn_Rast_Test,"w",
="GTiff",
driver=np.float32,
dtype=1,
count=ncol,
width=nrow,
height=rasterio.transform.from_origin(xmin, ymax, (xmax - xmin) / ncol, (ymax - ymin) / nrow),
transform="EPSG:4326"
crsas dst:
) # Generate random values and assign them to the raster
= np.random.rand(nrow, ncol).astype(np.float32)
random_values 1) # Write the values to band 1
dst.write(random_values,
# Now you have an empty raster with random values, and you can read and manipulate it as needed
with rasterio.open(fn_Rast_Test) as src:
= src.read(1)
rast_Test
rast_plot(rast_Test)
Certainly, you can directly create a data file like an ASC (ASCII) file for raster data.
3 Read and write
Reading and writing data are fundamental processes that precede spatial data manipulation. Spatial data is typically acquired from external sources.
The test files are available in Github
However, due to the substantial differences between raster and vector data structures, they are often handled separately.
Data Type | Read | Write |
---|---|---|
Vector | vect() |
writeVect() |
Raster | rast() |
writeRast() |
# Read shp-file as a vector layer
<- vect("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/minibeispiel_polygon.geojson")
vect_Test
# Read raster file
<- rast("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/minibeispiel_raster.asc")
rast_Test
# Info and Plot of vector layer
vect_Test
class : SpatVector
geometry : polygons
dimensions : 2, 3 (geometries, attributes)
extent : 4480500, 4484666, 5550100, 5554933 (xmin, xmax, ymin, ymax)
source : minibeispiel_polygon.geojson
coord. ref. : DHDN / 3-degree Gauss-Kruger zone 4 (EPSG:31468)
names : ID_region Name region_area
type : <int> <chr> <num>
values : 1 a 8.853e+06
2 b 2.208e+06
plot(vect_Test)
# Info and Plot of raster layer
rast_Test
class : SpatRaster
dimensions : 5, 5, 1 (nrow, ncol, nlyr)
resolution : 1000, 1000 (x, y)
extent : 4480000, 4485000, 5550000, 5555000 (xmin, xmax, ymin, ymax)
coord. ref. :
source : minibeispiel_raster.asc
name : minibeispiel_raster
plot(rast_Test)
Export:
= "fn_Output_Vector.geojson"
fn_Vect_Out writeVector(vect_Test, fn_Vect_Out, "GeoJSON")
= "fn_OutPut_Raster.tif"
fn_Rast_Out writeRaster(rast_Test, fn_Rast_Out)
Data Type | Read | Write |
---|---|---|
Vector | geopandas.read_file() |
geopandas.to_file() |
Raster | rastio.open('r') |
rastio.open('w') |
# Read GeoJSON file as a vector layer
= gpd.read_file("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/minibeispiel_polygon.geojson")
vect_Test
# Read raster file
= rasterio.open("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/minibeispiel_raster.asc")
rast_Test
# Info and Plot of the vector layer
print(vect_Test)
ID_region ... geometry
0 1 ... MULTIPOLYGON (((4484566.000 5554566.000, 44839...
1 2 ... MULTIPOLYGON (((4481929.000 5554933.000, 44812...
[2 rows x 4 columns]
vect_Test.plot() plt.show()
plt.close()
# Info and Plot of the raster layer
print(rast_Test.profile)
{'driver': 'AAIGrid', 'dtype': 'float32', 'nodata': -9999.0, 'width': 5, 'height': 5, 'count': 1, 'crs': CRS.from_wkt('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433],AXIS["Longitude",EAST],AXIS["Latitude",NORTH]]'), 'transform': Affine(1000.0, 0.0, 4480000.0,
0.0, -1000.0, 5555000.0), 'blockysize': 1, 'tiled': False}
rast_plot(rast_Test)
Export:
= "fn_Output_Vector.geojson"
fn_Vect_Out ="GeoJSON")
vect_Test.to_file(fn_Vect_Out, driver
# Write the raster to a GeoTIFF file
= "fn_OutPut_Raster.tif"
fn_Rast_Out with rasterio.open(fn_Rast_Out, 'w', driver='GTiff', dtype=rast_Test.dtype, count=1, width=rast_Test.shape[1], height=rast_Test.shape[0], transform=rast_Test.transform, crs=rast_Test.crs) as dst:
1) dst.write(rast_Test,
4 Coordinate Reference Systems
4.1 Assigning a CRS
In cases where the Coordinate Reference System (CRS) information is not included in the data file’s content, you can assign it manually using the crs()
function. This situation often occurs when working with raster data in formats like ASC (Arc/Info ASCII Grid) or other file formats that may not store CRS information.
crs(rast_Test) <- "EPSG:31468"
rast_Test
class : SpatRaster
dimensions : 5, 5, 1 (nrow, ncol, nlyr)
resolution : 1000, 1000 (x, y)
extent : 4480000, 4485000, 5550000, 5555000 (xmin, xmax, ymin, ymax)
coord. ref. : DHDN / 3-degree Gauss-Kruger zone 4 (EPSG:31468)
source : minibeispiel_raster.asc
name : minibeispiel_raster
As the results showed, the CRS information has been filled with the necessary details in line coord. ref.
.
= rasterio.open("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/minibeispiel_raster.asc", 'r+')
rast_Test = CRS.from_epsg(31468)
rast_Test.crs print(rast_Test.crs)
EPSG:31468
The use of EPSG (European Petroleum Survey Group) codes is highly recommended for defining Coordinate Reference Systems (CRS) in spatial data. You can obtain information about EPSG codes from the EPSG website.
You should not use this approach to change the CRS of a data set from what it is to what you want it to be. Assigning a CRS is like labeling something.
4.2 Transforming vector data
The transformation of vector data is relatively simple, as it involves applying a mathematical formula to the coordinates of each point to obtain their new coordinates. This transformation can be considered as without loss of precision.
The project()
function can be utilized to reproject both vector and raster data.
# New CRS
<- "EPSG:4326"
crs_New # Reproject
<- project(vect_Test, crs_New)
vect_Test_New
# Info of vector layer
vect_Test_New
class : SpatVector
geometry : polygons
dimensions : 2, 3 (geometries, attributes)
extent : 11.72592, 11.78419, 50.08692, 50.13034 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
names : ID_region Name region_area
type : <int> <chr> <num>
values : 1 a 8.853e+06
2 b 2.208e+06
geopands.to_crs()
# New CRS
= "EPSG:4326"
crs_New
# Reproject the vector layer to the new CRS
= vect_Test.to_crs(crs=crs_New)
vect_Test_New
# Info of vector layer
print(vect_Test_New)
ID_region ... geometry
0 1 ... MULTIPOLYGON (((11.78268 50.12711, 11.77370 50...
1 2 ... MULTIPOLYGON (((11.74578 50.13033, 11.73611 50...
[2 rows x 4 columns]
4.3 Transforming raster data
Vector data can be transformed from lon/lat coordinates to planar and back without loss of precision. This is not the case with raster data. A raster consists of rectangular cells of the same size (in terms of the units of the CRS; their actual size may vary). It is not possible to transform cell by cell. For each new cell, values need to be estimated based on the values in the overlapping old cells. If the values are categorical data, the “nearest neighbor” method is commonly used. Otherwise some sort of interpolation is employed (e.g. “bilinear”). (From Spatial Data Science)
Because projection of rasters affects the cell values, in most cases you will want to avoid projecting raster data and rather project vector data.
4.3.1 With CRS
The simplest approach is to provide a new CRS:
project()
# New CRS
<- "EPSG:4326"
crs_New # Reproject
<- project(rast_Test, crs_New, method = 'near')
rast_Test_New
# Info and Plot of vector layer
rast_Test_New
class : SpatRaster
dimensions : 4, 6, 1 (nrow, ncol, nlyr)
resolution : 0.01176853, 0.01176853 (x, y)
extent : 11.7188, 11.78941, 50.08395, 50.13102 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source(s) : memory
name : minibeispiel_raster
min value : 1
max value : 3
plot(rast_Test)
plot(rast_Test_New)
= 'C:\\Lei\\HS_Web\\data_share/minibeispiel_raster.tif'
fn_Rast_New # Define the new CRS
= {'init': 'EPSG:4326'}
new_crs = calculate_default_transform(
transform, width, height *rast_Test.bounds)
rast_Test.crs, new_crs, rast_Test.width, rast_Test.height, = rast_Test.meta.copy()
kwargs
kwargs.update({'crs': new_crs,
'transform': transform,
'width': width,
'height': height
}) = rasterio.open(fn_Rast_New, 'w', **kwargs)
rast_Test_New
reproject(=rasterio.band(rast_Test, 1),
source=rasterio.band(rast_Test_New, 1),
destination#src_transform=rast_Test.transform,
=rast_Test.crs,
src_crs#dst_transform=transform,
=new_crs,
dst_crs=Resampling.nearest) resampling
(Band(ds=<open BufferedDatasetWriter name='C:/Lei/HS_Web/data_share/minibeispiel_raster.tif' mode='w'>, bidx=1, dtype='float32', shape=(4, 6)), None)
rast_Test_New.close()
= rasterio.open("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/minibeispiel_raster.asc")
rast_Test rast_plot(rast_Test)
= rasterio.open(fn_Rast_New)
rast_Test_New rast_plot(rast_Test_New)
4.3.2 With Mask Raster
A second way is provide an existing SpatRaster
with the geometry you desire, with special boundary and resolution, this is a better way.
# New CRS
<- rast(ncol=10, nrow=10, xmin=265000, xmax=270000, ymin=5553000, ymax=5558000)
rast_Mask crs(rast_Mask) <- "EPSG:25833"
values(rast_Mask) <- 1
# Reproject
<- project(rast_Test, rast_Mask)
rast_Test_New
# Info and Plot of vector layer
rast_Test_New
class : SpatRaster
dimensions : 10, 10, 1 (nrow, ncol, nlyr)
resolution : 500, 500 (x, y)
extent : 265000, 270000, 5553000, 5558000 (xmin, xmax, ymin, ymax)
coord. ref. : ETRS89 / UTM zone 33N (EPSG:25833)
source(s) : memory
name : minibeispiel_raster
min value : 1
max value : 3
plot(rast_Test)
plot(rast_Test_New)
5 Vector data manipulation
In vector manipulation, it’s crucial to handle both attributes and shapes, especially when combining multiple shapes or layers with other shapes and addressing overlapping layers.
5.1 Attributes manipulation
5.1.1 Extract all Attributes
as.data.frame()
<- as.data.frame(vect_Test)
df_Attr df_Attr
ID_region Name region_area
1 1 a 8853404
2 2 b 2208109
5.1.2 Extract one with attribute name
$name
[, "name"]
$ID_region vect_Test
[1] 1 2
"ID_region"] vect_Test[,
class : SpatVector
geometry : polygons
dimensions : 2, 1 (geometries, attributes)
extent : 4480500, 4484666, 5550100, 5554933 (xmin, xmax, ymin, ymax)
source : minibeispiel_polygon.geojson
coord. ref. : DHDN / 3-degree Gauss-Kruger zone 4 (EPSG:31468)
names : ID_region
type : <int>
values : 1
2
5.1.3 Add a new attribute
$name <-
[, "name"] <-
$New_Attr <- c("n1", "n2")
vect_Test"New_Attr"] <- c("n1", "n2") vect_Test[,
5.1.4 Merge several attributes
- same order
cbind()
- common (key-)attributes
merge()
<- data.frame(Name = c("a", "b"), new_Attr2 = c(9, 6))
df_New_Attr
cbind(vect_Test, df_New_Attr)
class : SpatVector
geometry : polygons
dimensions : 2, 6 (geometries, attributes)
extent : 4480500, 4484666, 5550100, 5554933 (xmin, xmax, ymin, ymax)
source : minibeispiel_polygon.geojson
coord. ref. : DHDN / 3-degree Gauss-Kruger zone 4 (EPSG:31468)
names : ID_region Name region_area New_Attr Name new_Attr2
type : <int> <chr> <num> <chr> <chr> <num>
values : 1 a 8.853e+06 n1 a 9
2 b 2.208e+06 n2 b 6
merge(vect_Test, df_New_Attr, by = "Name")
class : SpatVector
geometry : polygons
dimensions : 2, 5 (geometries, attributes)
extent : 4480500, 4484666, 5550100, 5554933 (xmin, xmax, ymin, ymax)
coord. ref. : DHDN / 3-degree Gauss-Kruger zone 4 (EPSG:31468)
names : Name ID_region region_area New_Attr new_Attr2
type : <chr> <int> <num> <chr> <num>
values : a 1 8.853e+06 n1 9
b 2 2.208e+06 n2 6
5.1.5 Delete a attribute
$name <- NULL
$New_Attr <- c("n1", "n2")
vect_Test"New_Attr"] <- c("n1", "n2") vect_Test[,
5.2 Object Append and aggregate
5.2.1 Append new Objects
rbind()
# New Vect
# Define the coordinate reference system (CRS) with EPSG codes
<- "EPSG:31468"
crs_31468
# Define coordinates for the first polygon
<- c(4480400, 4481222, 4480500)
x_polygon_3 <- c(5551000, 5550666, 5552555)
y_polygon_3 <- cbind(id=3, part=1, x_polygon_3, y_polygon_3)
geometry_polygon_3
# Create a vector layer for the polygons, specifying their type, attributes, CRS, and additional attributes
<- vect(geometry_polygon_3, type="polygons", atts = data.frame(ID_region = 3, Name = c("b")), crs = crs_31468)
vect_New $region_area <- expanse(vect_New)
vect_New
# Append the objects
<- rbind(vect_Test, vect_New)
vect_Append vect_Append
class : SpatVector
geometry : polygons
dimensions : 3, 4 (geometries, attributes)
extent : 4480400, 4484666, 5550100, 5554933 (xmin, xmax, ymin, ymax)
source : minibeispiel_polygon.geojson
coord. ref. : DHDN / 3-degree Gauss-Kruger zone 4 (EPSG:31468)
names : ID_region Name region_area New_Attr
type : <int> <chr> <num> <chr>
values : 1 a 8.853e+06 n1
2 b 2.208e+06 n2
3 b 6.558e+05 NA
pandas.concat()
# Define the coordinate reference system (CRS) with EPSG code
= "EPSG:31468"
crs_31468
# Define coordinates for the new polygon
= [4480400, 4481222, 4480500]
x_polygon_3 = [5551000, 5550666, 5552555]
y_polygon_3
# Create a Polygon geometry
= Polygon(zip(x_polygon_3, y_polygon_3))
geometry_polygon_3
# Create a GeoDataFrame for the new polygon
= gpd.GeoDataFrame({'ID_region': [3], 'Name': ['b'], 'geometry': [geometry_polygon_3]}, crs=crs_31468)
vect_New
# Calculate the region area
'region_area'] = vect_New['geometry'].area
vect_New[
# Append the new GeoDataFrame to the existing one
= gpd.GeoDataFrame(pd.concat([vect_Test, vect_New], ignore_index=True), crs=crs_31468)
vect_Append
# Now, vect_Append contains the combined data
print(vect_Append)
ID_region ... geometry
0 1 ... MULTIPOLYGON (((4484566.000 5554566.000, 44839...
1 2 ... MULTIPOLYGON (((4481929.000 5554933.000, 44812...
2 3 ... POLYGON ((4480400.000 5551000.000, 4481222.000...
[3 rows x 4 columns]
5.2.2 Aggregate / Dissolve
It is common to aggregate (“dissolve”) polygons that have the same value for an attribute of interest.
aggregate()
# Aggregate by the "Name"
<- terra::aggregate(vect_Append, by = "Name")
vect_Aggregated vect_Aggregated
class : SpatVector
geometry : polygons
dimensions : 2, 5 (geometries, attributes)
extent : 4480400, 4484666, 5550100, 5554933 (xmin, xmax, ymin, ymax)
coord. ref. : DHDN / 3-degree Gauss-Kruger zone 4 (EPSG:31468)
names : Name mean_ID_region mean_region_area New_Attr agg_n
type : <chr> <num> <num> <chr> <int>
values : a 1 8.853e+06 n1 1
b 2.5 1.432e+06 NA 2
plot(vect_Append, "ID_region")
plot(vect_Aggregated, "Name")
geopandas.dissolve()
# Aggregate by the "Name"
= vect_Append.dissolve(by="Name", aggfunc="first")
vect_Aggregated
print(vect_Aggregated)
geometry ... region_area
Name ...
a POLYGON ((4484566.000 5554566.000, 4483922.000... ... 8.853404e+06
b POLYGON ((4480400.000 5551000.000, 4480500.000... ... 2.208109e+06
[2 rows x 3 columns]
vect_Test.plot() plt.show()
plt.close()
vect_Aggregated.plot() plt.show()
plt.close()
5.3 Overlap
To perform operations that involve overlap between two vector datasets, we will create a new vector dataset:
<- as.polygons(rast_Test)[1,]
vect_Overlap names(vect_Overlap) <- "ID_Rast"
plot(vect_Overlap, "ID_Rast")
# Read the raster and get the shapes
= rasterio.open("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/minibeispiel_raster.asc", 'r+')
rast_Test = CRS.from_epsg(31468)
rast_Test.crs = rast_Test.transform
transform = rasterio.features.shapes(rast_Test.read(1), transform=transform)
shapes
# Convert the shapes to a GeoDataFrame
= [shape(s) for s, v in shapes if v == 1]
geometries = gpd.GeoDataFrame({'geometry': geometries})
vect_Overlap
# Add an "ID_Rast" column to the GeoDataFrame
'ID_Rast'] = range(1, len(geometries) + 1)
vect_Overlap[="EPSG:31468"
vect_Overlap.crs
# Plot the polygons with "ID_Rast" as the attribute
='ID_Rast')
vect_Overlap.plot(column plt.show()
plt.close()
5.3.1 Erase
erase()
<- erase(vect_Test, vect_Overlap)
vect_Erase plot(vect_Erase, "ID_region")
geopandas.overlay(how='difference')
= gpd.overlay(vect_Test, vect_Overlap, how='difference')
vect_Erase ='ID_region', cmap='jet')
vect_Erase.plot(column plt.show()
plt.close()
5.3.2 Intersect
intersect()
<- terra::intersect(vect_Test, vect_Overlap)
vect_Intersect plot(vect_Intersect, "ID_region")
geopandas.overlay(how='intersection')
= gpd.overlay(vect_Test, vect_Overlap, how='intersection')
vect_Intersect ='ID_region', cmap='jet')
vect_Intersect.plot(column plt.show()
plt.close()
5.3.3 Union
Appends the geometries and attributes of the input.
union()
<- terra::union(vect_Test, vect_Overlap)
vect_Union plot(vect_Union, "ID_region")
geopandas.overlay(how='union')
= gpd.overlay(vect_Test, vect_Overlap, how='union')
vect_Union ='ID_region', cmap='jet')
vect_Union.plot(column plt.show()
plt.close()
5.3.4 Cover
cover()
is a combination of intersect()
and union()
. intersect returns new (intersected) geometries with the attributes of both input datasets. union appends the geometries and attributes of the input. cover returns the intersection and appends the other geometries and attributes of both datasets.
cover()
<- terra::cover(vect_Test, vect_Overlap)
vect_Cover plot(vect_Cover, "ID_region")
geopandas.overlay(how='identity')
= gpd.overlay(vect_Test, vect_Overlap, how='identity')
vect_Cover ='ID_region', cmap='jet')
vect_Cover.plot(column plt.show()
plt.close()
5.3.5 Difference
symdif()
<- terra::symdif(vect_Test, vect_Overlap)
vect_Difference plot(vect_Difference, "ID_region")
geopandas.overlay(how='symmetric_difference')
= gpd.overlay(vect_Test, vect_Overlap, how='symmetric_difference')
vect_Difference ='ID_region', cmap='jet')
vect_Difference.plot(column plt.show()
plt.close()
6 Raster data manipulation
Compared to vector data, raster data stores continuous numeric values more, leading to significant differences in manipulation and analysis approaches.
Data preparation for Python:
= rasterio.open("https://raw.githubusercontent.com/HydroSimul/Web/main/data_share/minibeispiel_raster.asc", 'r+')
rast_Test = CRS.from_epsg(31468)
rast_Test.crs = rast_Test.read(1)
rast_Test_data == -9999] = np.nan rast_Test_data[rast_Test_data
6.1 Raster algebra
Many generic functions that allow for simple and elegant raster algebra have been implemented for Raster objects, including the normal algebraic operators such as +
, -
, *
, /
, logical operators such as >
, >=
, <
, ==
, !
, and functions like abs
, round
, ceiling
, floor
, trunc
, sqrt
, log
, log10
, exp
, cos
, sin
, atan
, tan
, max
, min
, range
, prod
, sum
, any
, all
. In these functions, you can mix raster objects with numbers, as long as the first argument is a raster object. (Spatial Data Science)
<- rast_Test + 10
rast_Add plot(rast_Add)
= rast_Test_data + 10
rast_Add rast_plot(rast_Add)
6.2 Replace with Condition
rast[condition] <-
# Copy to a new raster
<- rast_Test
rast_Replace
# Replace
> 1] <- 10
rast_Replace[rast_Replace plot(rast_Replace)
rast[condition] =
= rast_Test_data
rast_Replace
# Replace values greater than 1 with 10
> 1] = 10
rast_Replace[rast_Replace rast_plot(rast_Replace)
6.3 Summary of multi-layers
<- mean(rast_Test, rast_Replace)
rast_Mean plot(rast_Mean)
= (rast_Test_data + rast_Replace) / 2
rast_Mean rast_plot(rast_Mean)
6.4 Aggregate and disaggregate
aggregate()
disagg()
# Aggregate by factor 2
<- aggregate(rast_Test, 2)
rast_Aggregate plot(rast_Aggregate)
# Disaggregate by factor 2
<- disagg(rast_Test, 2)
rast_Disagg rast_Disagg
class : SpatRaster
dimensions : 10, 10, 1 (nrow, ncol, nlyr)
resolution : 500, 500 (x, y)
extent : 4480000, 4485000, 5550000, 5555000 (xmin, xmax, ymin, ymax)
coord. ref. : DHDN / 3-degree Gauss-Kruger zone 4 (EPSG:31468)
source(s) : memory
name : minibeispiel_raster
min value : 1
max value : 3
plot(rast_Disagg)
# Aggregate by factor 2
= rast_Test_data
rast_Aggregate = rast_Aggregate[::2, ::2]
rast_Aggregate rast_plot(rast_Aggregate)
# Disaggregate by factor 2
= rast_Test_data
rast_Disagg = np.repeat(np.repeat(rast_Disagg, 2, axis=0), 2, axis=1)
rast_Disagg rast_plot(rast_Disagg)
6.5 Crop
The crop function lets you take a geographic subset of a larger raster object with an extent. But you can also use other spatial object, in them an extent can be extracted.
crop()
- with extention
- with rster
- with vector
<- crop(rast_Test, vect_Test[1,])
rast_Crop plot(rast_Crop)
6.6 Trim
trim()
Trim (shrink) a SpatRaster
by removing outer rows and columns that are NA or another value.
<- rast_Test
rast_Trim0 21:25] <- NA
rast_Trim0[<- trim(rast_Trim0) rast_Trim
plot(rast_Trim0)
plot(rast_Trim)
6.7 Mask
mask()
crop(mask = TRUE)
=mask()
+trim()
When you use mask manipulation in spatial data analysis, it involves setting the cells that are not covered by a mask to NA (Not Available) values. If you apply the crop(mask = TRUE)
operation, it means that not only will the cells outside of the mask be set to NA, but the resulting raster will also be cropped to match the extent of the mask.
<- mask(rast_Disagg, vect_Test[1,])
rast_Mask <- crop(rast_Disagg, vect_Test[1,], mask = TRUE) rast_CropMask
plot(rast_Mask)
plot(rast_CropMask)
= vect_Test.iloc[0:1].geometry.values[0]
vect_Mask
# Create a mask for the vect_Mask on the raster
= rasterio.features.geometry_mask([vect_Mask], out_shape=rast_Test.shape, transform=rast_Test.transform, invert=True)
rast_Mask # Apply the mask to the raster
= rast_Test_data.copy()
rast_Crop ~rast_Mask] = rast_Test.nodata # Set values outside the geometry to nodata
rast_Crop[
rast_plot(rast_Crop)