This project aims to connect Sentinel-1 with the Open Data Cube.
It takes Synthetic Aperture Radar data (specifically, GRD scenes from the Sentinel-1 platform) and prepares it for ingestion into an Open Data Cube instance (such as Digital Earth Australia), using the Sentinel Toolbox (SNAP) software.
NOTE: this is a work in progress; use at your own risk.
Prerequisites:
- Sentinel-1 GRD (Ground Range Detected amplitude) scenes.
- Precise orbital ephemeris metadata. (Possibly also calibration?)
- A digital model of the scattering surface elevation (DSM/DEM).
- gpt (graph processing tool) from the Sentinel Toolbox software.
- Access to a configured ODC instance. (A quick toolchain check is sketched below.)
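One way to confirm the toolchain before starting (this assumes gpt has been symlinked into the working directory as described under "Process imagery", and that a datacube config already exists):

    ./gpt -h | head -n 1     # graph processing tool prints its usage banner
    datacube system check    # confirms the ODC index database is reachable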
Stages:
- Update metadata (i.e. orbit vectors)
- Trim border noise (an artifact of the S1 GRD products)
- Calibrate (radiometric, outputting beta-nought)
- Flatten (radiometric terrain correction)
- Range-Doppler (geometric terrain correction)
- Format for the AGDC (e.g. export metadata, tile and index)
Initially, this will use auto-downloaded auxiliary data (ephemeris, DEM). Later, the intention is to use GDAL tools to subset a DSM, or to test the efficiency of chunked file formats for the DSM raster.
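The eventual DSM subsetting could be as simple as a gdal_translate window read; this is only a sketch (the mosaic filename and bounding box are placeholders):

    # Clip a larger DSM mosaic to (roughly) a scene footprint.
    # -projwin takes ulx uly lrx lry in georeferenced coordinates.
    gdal_translate -projwin 149.0 -34.0 151.0 -36.0 \
        -co TILED=YES dsm_mosaic.tif dsm_subset.tif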
Steps 1-4 will be combined in a gpt XML graph, sketched below.
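A possible shape for that graph (the operator names are the standard SNAP ones, but the node wiring and parameters here are illustrative rather than the project's actual graph):

    <graph id="s1_grd_stages_1_to_4">
      <version>1.0</version>
      <node id="read">
        <operator>Read</operator>
        <parameters><file>${input}</file></parameters>
      </node>
      <node id="orbit">
        <!-- stage 1: update orbit state vectors -->
        <operator>Apply-Orbit-File</operator>
        <sources><sourceProduct refid="read"/></sources>
      </node>
      <node id="denoise">
        <!-- stage 2: trim border noise -->
        <operator>Remove-GRD-Border-Noise</operator>
        <sources><sourceProduct refid="orbit"/></sources>
      </node>
      <node id="calibrate">
        <!-- stage 3: radiometric calibration, outputting beta-nought -->
        <operator>Calibration</operator>
        <sources><sourceProduct refid="denoise"/></sources>
        <parameters>
          <outputBetaBand>true</outputBetaBand>
          <outputSigmaBand>false</outputSigmaBand>
        </parameters>
      </node>
      <node id="flatten">
        <!-- stage 4: radiometric terrain flattening -->
        <operator>Terrain-Flattening</operator>
        <sources><sourceProduct refid="calibrate"/></sources>
      </node>
      <node id="write">
        <operator>Write</operator>
        <sources><sourceProduct refid="flatten"/></sources>
        <parameters><file>${output}</file></parameters>
      </node>
    </graph>

which would be invoked as e.g. "./gpt graph.xml -Pinput=scene.zip -Poutput=flattened.dim".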
Step 5 will be a gpt command-line instruction. (Some operators chain together inefficiently, at least in previous gpt versions.)
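For example, something like the following (the DEM name and pixel spacing are illustrative choices; the trailing argument is the source product):

    ./gpt Terrain-Correction -PdemName="SRTM 3Sec" \
        -PpixelSpacingInMeter=25.0 -t corrected.dim flattened.dim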
Step 6 will be a python prep script.
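For indexing, that script would emit a dataset yaml roughly of this shape (the fields follow the datacube "eo" metadata type; every value below is a placeholder, not project output):

    id: 00000000-0000-0000-0000-000000000000
    product_type: gamma0
    creation_dt: 2018-01-01T00:00:00
    platform: {code: SENTINEL_1A}
    instrument: {name: SAR}
    format: {name: ENVI}
    extent:
      coord:
        ul: {lat: -34.0, lon: 149.0}
        ur: {lat: -34.0, lon: 151.0}
        ll: {lat: -36.0, lon: 149.0}
        lr: {lat: -36.0, lon: 151.0}
      from_dt: 2018-01-01T00:00:00
      center_dt: 2018-01-01T00:00:15
      to_dt: 2018-01-01T00:00:30
    grid_spatial:
      projection:
        spatial_reference: EPSG:4326
        geo_ref_points:
          ul: {x: 149.0, y: -34.0}
          ur: {x: 151.0, y: -34.0}
          ll: {x: 149.0, y: -36.0}
          lr: {x: 151.0, y: -36.0}
    image:
      bands:
        vv: {path: output1.img}
    lineage: {source_datasets: {}}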
The overall orchestration will initially be a shell script. (Other options would be a Makefile or a python cluster scheduling script.)
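As a sketch, that shell script might just loop over the scene list (the list format and output naming conventions here are assumptions):

    #!/bin/bash
    # For each scene: run the combined graph (steps 1-4), then
    # terrain-correct (step 5). Output naming is illustrative.
    while read -r scene; do
        base=$(basename "$scene" .zip)
        ./gpt graph.xml -Pinput="$scene" -Poutput="${base}_flat.dim"
        ./gpt Terrain-Correction -PdemName="SRTM 3Sec" \
            -t "${base}_tc.dim" "${base}_flat.dim"
    done < example_list.txt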
A jupyter notebook will demonstrate the result (using the opendatacube API).
Current limitations:
- Ocean is masked out. (This is a consequence of the nodata value used for the DEM by the terrain-correction steps.)
- Border noise is not entirely eliminated (some noisy perimeter pixels remain).
- A unified radiometric/geometric terrain-correction operator might be more efficient (reducing file I/O against the DSM).
- Further comparison with GAMMA software output is necessary.
- Signal intensity units are unspecified.
- Ancillary data is currently auto-downloaded (e.g. the 3-second SRTM DEM, which is suboptimal).
- The output format is an ENVI raster (approximately 10x larger than the input zip) rather than a cloud-optimised GeoTIFF; a conversion sketch follows this list.
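The eventual conversion could use GDAL's COG driver (available in GDAL >= 3.1; filenames are placeholders):

    gdal_translate -of COG -co COMPRESS=DEFLATE output1.img output1.tif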
Process imagery
- Ensure the graph processing tool is available (run "ln -s ../snap6/bin/gpt gpt" after installing SNAP)
- Batch process some scenes (run "./bulk.sh example_list.txt" after confirming example input)
(Takes 10-15 min per scene, using 4 cores and 10-15 GB of memory, on VDI@NCI.)
Insert into Open Data Cube
- Ensure the environment has been prepared (run "datacube system check")
- Define the products (run "datacube product add productdef.yaml")
- For each newly preprocessed scene, run a preparation script (e.g. "python prep.py output1.dim") to generate metadata (yaml) in an appropriate format for datacube indexing.
- Index the prepared scenes into the datacube (e.g. "datacube dataset add output*.yaml --auto-match").
- Verify the data using the datacube API (e.g. a python notebook).
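A minimal API check might look like the following ("s1_gamma0" is a placeholder for whatever product name productdef.yaml defines):

    import datacube

    dc = datacube.Datacube()
    # Confirm the newly indexed datasets are discoverable.
    datasets = dc.find_datasets(product='s1_gamma0')
    print(len(datasets), 'datasets indexed')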