Introduction to Raster Data
- Raster data is pixelated data where each pixel is associated with a specific location.
- Raster data always has an extent and a resolution.
- The extent is the geographical area covered by a raster.
- The resolution is the area covered by each pixel of a raster.
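
As a minimal sketch of how extent and resolution relate, the snippet below derives the extent of a north-up raster from its top-left corner, pixel size, and shape. The function name `raster_extent` and all numbers are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def raster_extent(x_origin, y_origin, pixel_width, pixel_height, n_cols, n_rows):
    """Return the extent of a north-up raster whose origin is its top-left corner."""
    xmax = x_origin + n_cols * pixel_width
    ymin = y_origin - n_rows * pixel_height
    return Extent(x_origin, ymin, xmax, y_origin)


# Hypothetical raster: 100 columns x 50 rows of 10 m pixels,
# with its top-left corner at (500000, 4100000) in a projected CRS.
extent = raster_extent(500_000.0, 4_100_000.0, 10.0, 10.0, n_cols=100, n_rows=50)

# The area covered by each pixel, in square metres.
pixel_area = 10.0 * 10.0
```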
Introduction to Vector Data
- Vector data structures represent specific features on the Earth’s surface along with attributes of those features.
- Vector data is often interpreted data, and may have been collected for a different purpose than the one you want to use it for.
- Vector objects are either points, lines, or polygons.
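
The three vector object types can be sketched as GeoJSON-style features, each pairing a geometry with a dictionary of attributes. The feature names, coordinates, and attribute values below are made up for illustration.

```python
# A point: a single (x, y) coordinate pair.
tree = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [4.90, 52.37]},
    "properties": {"species": "oak"},
}

# A line: an ordered list of at least two coordinate pairs.
road = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[4.90, 52.37], [4.91, 52.38], [4.93, 52.38]],
    },
    "properties": {"name": "Example Road"},
}

# A polygon: a closed ring whose first and last coordinates are identical.
field = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[4.90, 52.37], [4.92, 52.37], [4.92, 52.39],
                         [4.90, 52.39], [4.90, 52.37]]],
    },
    "properties": {"crop": "wheat"},
}

features = [tree, road, field]
geometry_types = [feature["geometry"]["type"] for feature in features]
```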
Coordinate Reference Systems
- All geospatial datasets (raster and vector) are associated with a specific coordinate reference system.
- A coordinate reference system includes a datum, a projection, and additional parameters specific to the dataset.
- All maps are distorted because of the projection.
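
To make the distortion concrete, here is a small sketch using the spherical Mercator projection as one example. Its local linear scale factor is 1 / cos(latitude), so features appear increasingly stretched away from the equator. The choice of Mercator and the function name are assumptions made for illustration; other projections distort in different ways.

```python
import math


def mercator_scale_factor(latitude_deg: float) -> float:
    """Local linear scale factor of the spherical Mercator projection."""
    return 1.0 / math.cos(math.radians(latitude_deg))


# Print the stretch relative to the equator at a few latitudes.
for latitude in (0, 30, 60):
    print(latitude, round(mercator_scale_factor(latitude), 3))
```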
The Geospatial Landscape
- Many software packages exist for working with geospatial data.
- Command-line programs allow you to automate and reproduce your work.
- JupyterLab provides a user-friendly interface for working with Python.
Access satellite imagery using Python
- Accessing satellite images via the providers’ APIs enables more reliable and scalable data retrieval.
- STAC catalogs can be browsed and searched using the same tools and scripts.
- `rioxarray` allows you to open and download remote raster files.
Read and visualize raster data
Resampling the raster image
- `rioxarray` and `xarray` are for working with multidimensional arrays, much as `pandas` is for working with tabular data.
- `rioxarray` stores CRS information as a CRS object that can be converted to an EPSG code or PROJ4 string.
- Missing raster data are filled with nodata values, which should be handled with care for statistics and visualization.
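
The sketch below illustrates why nodata values need care. It uses a plain NumPy array as a stand-in for a raster band, and the nodata value `-9999.0` is an assumption for the example. Including the fill value in a statistic gives a meaningless result, whereas replacing it with NaN and using a NaN-aware function does not.

```python
import numpy as np

nodata = -9999.0  # hypothetical nodata value for this example
band = np.array([[10.0, 12.0, nodata],
                 [11.0, nodata, 13.0]])

# Naive mean: the nodata fill values drag the result far below any real value.
naive_mean = band.mean()

# Replace nodata with NaN, then use a NaN-aware statistic.
masked = np.where(band == nodata, np.nan, band)
valid_mean = np.nanmean(masked)
```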
Vector data in Python
- Load spatial objects into Python with the `geopandas.read_file()` function.
- Spatial objects can be plotted directly with the `GeoDataFrame`’s `.plot()` method.
- Convert the CRS of spatial objects with `.to_crs()`.
- Create a buffer around spatial objects with `.buffer()`. Note that this generates a `GeoSeries` object.
- Merge spatial objects with `pd.concat()`.
Crop raster data with rioxarray and geopandas
- Use `clip_box` to crop a raster with a bounding box.
- Use `clip` to crop a raster with a given polygon.
- Use `reproject_match` to match two raster datasets.
Raster Calculations in Python
- Python’s built-in math operators are fast and simple options for raster math.
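
As an illustration of raster math with ordinary operators, the sketch below computes a normalized difference index, (NIR - red) / (NIR + red), element by element. Plain NumPy arrays stand in for raster bands here, and the band values are invented; the choice of this particular index is an assumption made for the example.

```python
import numpy as np

# Hypothetical reflectance values for two bands of the same 2 x 2 raster.
red = np.array([[0.10, 0.20],
                [0.30, 0.40]])
nir = np.array([[0.50, 0.60],
                [0.30, 0.80]])

# The arithmetic operators apply pixel by pixel across both arrays.
ndvi = (nir - red) / (nir + red)
```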
Calculating Zonal Statistics on Rasters
- Zones can be extracted by attribute columns of a vector dataset.
- Zones can be rasterized using `rasterio.features.rasterize`.
- Calculate zonal statistics with `xrspatial.zonal_stats` over the rasterized zones.
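
The idea behind zonal statistics can be sketched with NumPy alone: given a raster of zone IDs aligned with a raster of values, compute a statistic over the pixels belonging to each zone. This is a conceptual sketch, not how `xrspatial.zonal_stats` is implemented, and the arrays and the convention that `0` means "outside any zone" are assumptions for the example.

```python
import numpy as np

# Rasterized zones: each pixel holds a zone ID (0 = outside any zone, by assumption).
zones = np.array([[1, 1, 2],
                  [1, 2, 2],
                  [0, 0, 2]])

# Value raster on the same grid as the zones.
values = np.array([[1.0, 2.0, 10.0],
                   [3.0, 20.0, 30.0],
                   [5.0, 7.0, 40.0]])

# Mean of the value pixels that fall inside each zone.
zonal_mean = {
    int(zone_id): float(values[zones == zone_id].mean())
    for zone_id in np.unique(zones)
    if zone_id != 0
}
```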
Parallel raster computations using Dask
- The `%%time` Jupyter magic command can be used to profile calculations.
- Data ‘chunks’ are the unit of parallelization in raster calculations.
- `(rio)xarray` can open raster files as chunked arrays.
- The chunk shape and size can significantly affect the calculation performance.
- Cloud-optimized GeoTIFFs have an internal structure that enables efficient parallel reads.
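
To show what chunking means in practice, the sketch below splits an array shape into blocks of a given chunk size and returns the index slices for each block. Each block is a piece of work that could be processed independently. The helper `chunk_slices` is hypothetical and only illustrates the bookkeeping; Dask handles this internally.

```python
from itertools import product


def chunk_slices(shape, chunks):
    """Return a list with one tuple of slices per chunk of an array of this shape."""
    ranges = [
        [slice(start, min(start + size, dim)) for start in range(0, dim, size)]
        for dim, size in zip(shape, chunks)
    ]
    return list(product(*ranges))


# A 5 x 4 array split into chunks of at most 2 x 2 pixels.
# Chunks along the last row are smaller because 5 is not a multiple of 2.
blocks = chunk_slices(shape=(5, 4), chunks=(2, 2))
```

Changing the `chunks` argument changes both the number of blocks and their size, which is the trade-off behind the performance differences noted above.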