How to read data from STAC#
This notebook shows how to read the data in from a STAC asset using xarray and a little hidden helper library called xpystac.
tl;dr#
For any PySTAC object that can be represented as an ndimensional dataset you can read the data using the following command:
xr.open_dataset(object)
Dependencies#
There are lots of optional dependencies depending on where and how the data you are interested in are stored. Here are some of the libraries that you will probably need:
dask - to delay data loading until access
fsspec - to access data from remote storage
pystac - STAC object structures
xarray, rioxarray - data structures
xpystac, stackstac - helper for loading pystac into xarray objects
[1]:
!pip install adlfs dask 'fsspec[http]' planetary_computer --quiet
!pip install stackstac xarray xpystac zarr --quiet
Despite all these install instructions, the import block is very straightforward
[2]:
import xarray as xr
import pystac
Examples#
Here are a few examples of the different types of objects that you can open in xarray.
COGs#
Read all the data from the COGs referenced by the assets on an item.
[3]:
landsat_item = pystac.Item.from_file(
"https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2/items/LC09_L2SP_088084_20230408_02_T2"
)
xr.open_dataset(landsat_item)
[3]:
<xarray.Dataset>
Dimensions: (time: 1, y: 7802, x: 7762, band: 19)
Coordinates: (12/32)
* time (time) datetime64[ns] 2023-04-08T23:37:51.63...
id (time) <U31 ...
* x (x) float64 3.774e+05 3.774e+05 ... 6.102e+05
* y (y) float64 -3.713e+06 ... -3.947e+06
proj:shape object ...
sci:doi <U16 ...
... ...
raster:bands (band) object ...
classification:bitfields (band) object ...
common_name (band) object ...
center_wavelength (band) object ...
full_width_half_max (band) object ...
epsg int64 ...
Dimensions without coordinates: band
Data variables: (12/19)
qa (time, y, x) float64 ...
red (time, y, x) float64 ...
blue (time, y, x) float64 ...
drad (time, y, x) float64 ...
emis (time, y, x) float64 ...
emsd (time, y, x) float64 ...
... ...
swir16 (time, y, x) float64 ...
swir22 (time, y, x) float64 ...
coastal (time, y, x) float64 ...
qa_pixel (time, y, x) float64 ...
qa_radsat (time, y, x) float64 ...
qa_aerosol (time, y, x) float64 ...
Attributes:
spec: RasterSpec(epsg=32656, bounds=(377370.0, -3947130.0, 610230....
crs: epsg:32656
transform: | 30.00, 0.00, 377370.00|\n| 0.00,-30.00,-3713070.00|\n| 0.0...
resolution: 30.0Zarr#
Read from an asset that references data stored in zarr
[4]:
daymet_collection = pystac.Collection.from_file(
"https://planetarycomputer.microsoft.com/api/stac/v1/collections/daymet-daily-hi"
)
daymet_asset = daymet_collection.assets["zarr-abfs"]
xr.open_dataset(daymet_asset)
[4]:
<xarray.Dataset>
Dimensions: (time: 14965, y: 584, x: 284, nv: 2)
Coordinates:
lat (y, x) float32 ...
lon (y, x) float32 ...
* time (time) datetime64[ns] 1980-01-01T12:00:00 ... 20...
* x (x) float32 -5.802e+06 -5.801e+06 ... -5.519e+06
* y (y) float32 -3.9e+04 -4e+04 ... -6.21e+05 -6.22e+05
Dimensions without coordinates: nv
Data variables:
dayl (time, y, x) float32 ...
lambert_conformal_conic int16 ...
prcp (time, y, x) float32 ...
srad (time, y, x) float32 ...
swe (time, y, x) float32 ...
time_bnds (time, nv) datetime64[ns] ...
tmax (time, y, x) float32 ...
tmin (time, y, x) float32 ...
vp (time, y, x) float32 ...
yearday (time) int16 ...
Attributes:
Conventions: CF-1.6
Version_data: Daymet Data Version 4.0
Version_software: Daymet Software Version 4.0
citation: Please see http://daymet.ornl.gov/ for current Daymet ...
references: Please see http://daymet.ornl.gov/ for current informa...
source: Daymet Software Version 4.0
start_year: 1980Reference file#
If the collection has a reference file we can use that
[5]:
cmip6_collection = pystac.Collection.from_file(
"https://planetarycomputer.microsoft.com/api/stac/v1/collections/nasa-nex-gddp-cmip6"
)
cmip6_asset = cmip6_collection.assets["ACCESS-CM2.historical"]
xr.open_dataset(cmip6_asset)
[5]:
<xarray.Dataset>
Dimensions: (time: 23741, lat: 600, lon: 1440)
Coordinates:
* lat (lat) float64 -59.88 -59.62 -59.38 -59.12 ... 89.38 89.62 89.88
* lon (lon) float64 0.125 0.375 0.625 0.875 ... 359.1 359.4 359.6 359.9
* time (time) datetime64[us] 1950-01-01T12:00:00 ... 2014-12-31T12:00:00
Data variables:
hurs (time, lat, lon) float32 ...
huss (time, lat, lon) float32 ...
pr (time, lat, lon) float32 ...
rlds (time, lat, lon) float32 ...
rsds (time, lat, lon) float32 ...
sfcWind (time, lat, lon) float32 ...
tas (time, lat, lon) float32 ...
tasmax (time, lat, lon) float32 ...
tasmin (time, lat, lon) float32 ...
Attributes: (12/22)
Conventions: CF-1.7
activity: NEX-GDDP-CMIP6
cmip6_institution_id: CSIRO-ARCCSS
cmip6_license: CC-BY-SA 4.0
cmip6_source_id: ACCESS-CM2
contact: Dr. Rama Nemani: rama.nemani@nasa.gov, Dr. Bridget...
... ...
scenario: historical
source: BCSD
title: ACCESS-CM2, r1i1p1f1, historical, global downscale...
tracking_id: 16d27564-470f-41ea-8077-f4cc3efa5bfe
variant_label: r1i1p1f1
version: 1.0