A Python package for efficient processing of cubic earth observation (EO) data 🚀
GitHub: https://github.com/andesdatacube/cubexpress/ 🌐
PyPI: https://pypi.org/project/cubexpress/ 🛠️
Overview
CubeXpress is a Python package designed to simplify and accelerate the process of working with Google Earth Engine (GEE) data cubes. With features like multi-threaded downloads, automatic subdivision of large requests, and direct pixel-level computations on GEE, CubeXpress helps you handle massive datasets with ease.
Key Features
- Fast Image and Collection Downloads
  Retrieve single images or entire collections at once, taking advantage of multi-threaded requests.
- Automatic Tiling
  Large images are split ("quadsplit") into smaller sub-tiles, preventing errors with GEE’s size limits.
- Direct Pixel Computations
  Perform computations (e.g., band math) directly on GEE, then fetch results in a single step.
- Scalable & Efficient
  Optimized memory usage and parallelism let you handle complex tasks in big data environments.
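To make the "quadsplit" idea concrete, here is a minimal, pure-Python sketch of recursive four-way tiling. It is an illustration only, not CubeXpress's actual implementation: the `quadsplit` function, the `max_pixels` threshold, and the choice to raise an error at the depth limit are all assumptions made for this example. The `max_depth` argument is loosely modeled on the `max_deep_level` option used later in this README.

```python
from typing import List, Tuple

# A tile is (x_offset, y_offset, width, height), all in pixels.
Tile = Tuple[int, int, int, int]


def quadsplit(tile: Tile, max_pixels: int, depth: int = 0, max_depth: int = 5) -> List[Tile]:
    """Split a tile into quadrants until each one holds at most max_pixels pixels."""
    x, y, w, h = tile
    if w * h <= max_pixels:
        return [tile]  # small enough to request in one piece
    if depth >= max_depth:
        # Assumption for this sketch: give up instead of splitting forever.
        raise ValueError("tile still too large at maximum split depth")

    half_w, half_h = w // 2, h // 2
    quadrants = [
        (x, y, half_w, half_h),                            # top-left
        (x + half_w, y, w - half_w, half_h),               # top-right
        (x, y + half_h, half_w, h - half_h),               # bottom-left
        (x + half_w, y + half_h, w - half_w, h - half_h),  # bottom-right
    ]

    tiles: List[Tile] = []
    for qx, qy, qw, qh in quadrants:
        if qw > 0 and qh > 0:  # skip empty quadrants of very thin tiles
            tiles.extend(quadsplit((qx, qy, qw, qh), max_pixels, depth + 1, max_depth))
    return tiles


tiles = quadsplit((0, 0, 1024, 1024), max_pixels=512 * 512)
```

With a depth limit of 5, a single request can be divided into at most 4^5 = 1024 sub-tiles in this scheme.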
Installation
Install the latest version from PyPI:
pip install cubexpress
Note: You need a valid Google Earth Engine account and earthengine-api installed (pip install earthengine-api). Also run ee.Initialize() before using CubeXpress.
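Before initializing Earth Engine, it can help to confirm that both packages are importable. The following is a small standard-library sketch; the helper name `missing_packages` is hypothetical. `ee` is the import name of the earthengine-api package.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str] = ("ee", "cubexpress")) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Install these before continuing:", ", ".join(missing))
```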
Basic Usage
Download a single ee.Image
import ee
import cubexpress
# Initialize Earth Engine
ee.Initialize(project="your-project-id")
# Create a raster transform
geotransform = cubexpress.lonlat2rt(
lon=-76.5,
lat=-9.5,
edge_size=128, # Width=Height=128 pixels
scale=90 # 90m resolution
)
# Define a single Request
request = cubexpress.Request(
id="dem_test",
raster_transform=geotransform,
bands=["elevation"],
image="NASA/NASADEM_HGT/001" # Note: you can wrap with ee.Image("NASA/NASADEM_HGT/001").divide(10000) if needed
)
# Build the RequestSet
cube_requests = cubexpress.RequestSet(requestset=[request])
# Download with multi-threading
cubexpress.getcube(
request=cube_requests,
output_path="output_dem",
nworkers=4,
max_deep_level=5
)
This will create a GeoTIFF named dem_test.tif in the output_dem folder, containing the elevation band.
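As a quick sanity check on the size of the request, the ground footprint follows from `edge_size` and `scale` by simple multiplication. The sketch below assumes `scale` is expressed in meters per pixel, as the comments in the example suggest; the helper `footprint_m` is hypothetical and not part of CubeXpress.

```python
def footprint_m(edge_size: int, scale: float) -> float:
    """Side length of a square tile on the ground, in meters."""
    return edge_size * scale


# The DEM example above: 128 pixels at 90 m per pixel.
dem_side = footprint_m(edge_size=128, scale=90)
dem_side_km = dem_side / 1000  # 11.52 km per side
```

So the DEM request covers a square of roughly 11.5 km by 11.5 km around the chosen point.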
Download pixel values from an ee.ImageCollection
You can fetch multiple images by constructing a RequestSet with several Request objects. For example, filter Sentinel-2 images near a point:
import ee
import cubexpress
ee.Initialize(project="your-project-id")
# Filter a Sentinel-2 collection
point = ee.Geometry.Point([-97.59, 33.37])
collection = ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED") \
.filterBounds(point) \
.filterDate('2024-01-01', '2024-01-31')
# Extract image IDs
image_ids = collection.aggregate_array('system:id').getInfo()
# Set the geotransform
geotransform = cubexpress.lonlat2rt(
lon=-97.59,
lat=33.37,
edge_size=512,
scale=10
)
# Build multiple requests
requests = [
cubexpress.Request(
id=f"s2test_{i}",
raster_transform=geotransform,
bands=["B4", "B3", "B2"],
image=image_id # Note: you can wrap with ee.Image(image_id).divide(10000) if needed
)
for i, image_id in enumerate(image_ids)
]
# Create the RequestSet
cube_requests = cubexpress.RequestSet(requestset=requests)
# Download
cubexpress.getcube(
request=cube_requests,
output_path="output_sentinel",
nworkers=4,
max_deep_level=5
)
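The sequential IDs in the example (`s2test_0`, `s2test_1`, ...) do not record which scene each file came from. One option is to derive the ID from the last segment of the Earth Engine asset ID. This is a sketch only: `request_id_from_asset` is a hypothetical helper, and it assumes `Request` accepts any string as `id` and uses it as the output file name, as the `dem_test` example suggests.

```python
def request_id_from_asset(asset_id: str, prefix: str = "") -> str:
    """Use the final path segment of an asset ID as a traceable request ID."""
    scene = asset_id.rsplit("/", 1)[-1]
    return f"{prefix}{scene}"


example_id = request_id_from_asset(
    "COPERNICUS/S2_HARMONIZED/20170804T154911_20170804T155116_T18SUJ",
    prefix="s2_",
)
```

In the list comprehension above, this would replace `id=f"s2test_{i}"` with `id=request_id_from_asset(image_id)`.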
Process and extract a pixel from an ee.Image
If you provide an ee.Image with custom calculations (e.g., .divide(10000), .normalizedDifference(...)), CubeXpress can run those on GEE, then download the result. For large results, it automatically splits the image into sub-tiles.
import ee
import cubexpress
ee.Initialize(project="your-project-id")
# Example: NDVI from Sentinel-2
image = ee.Image("COPERNICUS/S2_HARMONIZED/20170804T154911_20170804T155116_T18SUJ") \
.normalizedDifference(["B8", "B4"]) \
.rename("NDVI")
geotransform = cubexpress.lonlat2rt(
lon=-76.59,
lat=38.89,
edge_size=256,
scale=10
)
request = cubexpress.Request(
id="ndvi_test",
raster_transform=geotransform,
bands=["NDVI"],
image=image # custom expression
)
cube_requests = cubexpress.RequestSet(requestset=[request])
cubexpress.getcube(
request=cube_requests,
output_path="output_ndvi",
nworkers=2,
max_deep_level=5
)
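For reference, `normalizedDifference(["B8", "B4"])` computes (B8 - B4) / (B8 + B4) per pixel on the Earth Engine side. The local, pure-Python version below only illustrates that formula on single values; returning NaN when the sum is zero is a choice made for this sketch, not a statement about how Earth Engine handles that case.

```python
import math


def normalized_difference(first: float, second: float) -> float:
    """(first - second) / (first + second), or NaN when the sum is zero."""
    total = first + second
    if total == 0:
        return math.nan  # sketch-only convention for an undefined ratio
    return (first - second) / total


# Near-infrared (B8) = 3000 and red (B4) = 1000 in raw reflectance units.
ndvi = normalized_difference(3000, 1000)  # 2000 / 4000 = 0.5
```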
Advanced Usage
Same Set of Sentinel-2 Images for Multiple Points
Below is an advanced example showing how to work with multiple points and a Sentinel-2 image collection in one script. We first create a global collection for the date range, then filter it point by point, keeping only the images that intersect each coordinate. Finally, we download them in parallel using CubeXpress.
import ee
import cubexpress
# Initialize Earth Engine with your project
ee.Initialize(project="your-project-id")
# Define multiple points (longitude, latitude)
points = [
(-97.64, 33.37),
(-97.59, 33.37)
]
# Start with a broad Sentinel-2 collection
collection = (
ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
.filterDate("2024-01-01", "2024-01-31")
)
# Build a list of Request objects
requestset = []
for i, (lon, lat) in enumerate(points):
    # Create a point geometry for the current coordinates
    point_geom = ee.Geometry.Point([lon, lat])
    collection_filtered = collection.filterBounds(point_geom)
    # Convert the filtered collection into a list of asset IDs
    image_ids = collection_filtered.aggregate_array("system:id").getInfo()
    # Define a geotransform for this point
    geotransform = cubexpress.lonlat2rt(
        lon=lon,
        lat=lat,
        edge_size=512, # Adjust the image size in pixels
        scale=10 # 10m resolution for Sentinel-2
    )
    # Create one Request per image found for this point
    requestset.extend([
        cubexpress.Request(
            id=f"s2test_{i}_{idx}",
            raster_transform=geotransform,
            bands=["B4", "B3", "B2"],
            image=image_id
        )
        for idx, image_id in enumerate(image_ids)
    ])
# Combine into a RequestSet
cube_requests = cubexpress.RequestSet(requestset=requestset)
# Download everything in parallel
results = cubexpress.getcube(
request=cube_requests,
nworkers=4,
output_path="images_s2",
max_deep_level=5
)
print("Downloaded files:", results)
How it works:
- Points: We define multiple coordinates in points.
- Global collection: We retrieve a broad Sentinel-2 collection covering the desired date range.
- Per-point filter: For each point, we call .filterBounds(...) to get only images intersecting that location.
- Geotransform: We create a local geotransform (edge_size, scale) defining the spatial extent and resolution around each point.
- Requests: Each point-image pair becomes a Request, stored in a single list.
- Parallel download: With cubexpress.getcube(), all requests are fetched in parallel, automatically splitting large outputs into sub-tiles if needed (up to max_deep_level).
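The parallel download step can be pictured as a thread pool mapping a fetch function over the request IDs. The sketch below uses only the standard library and a stand-in `fake_fetch` that returns a path instead of contacting Earth Engine. Both `run_parallel` and `fake_fetch` are hypothetical names, and nothing here is taken from CubeXpress internals.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_parallel(
    request_ids: List[str],
    fetch: Callable[[str], str],
    nworkers: int = 4,
) -> Dict[str, str]:
    """Run fetch for every request ID using up to nworkers threads."""
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # pool.map preserves input order, so results line up with request_ids.
        paths = list(pool.map(fetch, request_ids))
    return dict(zip(request_ids, paths))


def fake_fetch(request_id: str) -> str:
    """Stand-in for a real download: just build the expected output path."""
    return f"images_s2/{request_id}.tif"


results = run_parallel(["s2test_0_0", "s2test_0_1", "s2test_1_0"], fake_fetch)
```

Threads suit this kind of workload because each request spends most of its time waiting on the network rather than using the CPU.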
License
This project is licensed under the MIT License.
Built with 🌎 and ❤️ by the CubeXpress team