Building Geospatial Data Cubes for Earth Data Science

A Tutorial for Tribal College Students and Faculty

Author: Lilly Jones, PhD, Daear Consulting, LLC
Primary Focus: Pine Ridge (Oglala Lakota)
Level: Beginner
License: AGPL-3.0

What Is a Data Cube?

When scientists study the environment, they collect data across three dimensions simultaneously: space (where), time (when), and variable (what). A data cube is a data structure that holds all three dimensions together so you can ask questions like:

"What was the vegetation condition at Pine Ridge in August 2012 compared to August 2022? How does that change compare to the drought record?"

A flat spreadsheet has no built-in notion of location or time, so it struggles with that question. A data cube is designed for it.

This tutorial teaches you to build and query geospatial data cubes in Python using satellite and climate data centered on Pine Ridge.
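As a rough illustration of the idea, here is a minimal sketch of a tiny cube built with xarray. Everything in it is an assumption made for illustration: the values are random numbers standing in for NDVI, the grid coordinates are hypothetical, and the variable name ndvi is just a label. The notebooks build the real thing from satellite data.

```python
import numpy as np
import xarray as xr

# Synthetic values only: random numbers standing in for NDVI, not real observations.
rng = np.random.default_rng(seed=0)

times = np.array(["2012-08-12", "2022-08-13"], dtype="datetime64[ns]")
lats = np.linspace(43.5, 43.0, 4)      # hypothetical grid, north to south
lons = np.linspace(-103.0, -102.0, 5)  # hypothetical grid, west to east

# One variable ("what") indexed by time ("when") and lat/lon ("where").
cube = xr.Dataset(
    data_vars={
        "ndvi": (("time", "lat", "lon"), rng.uniform(0.1, 0.8, size=(2, 4, 5))),
    },
    coords={"time": times, "lat": lats, "lon": lons},
)

# Select each date by its label, then compare the two maps cell by cell.
aug_2012 = cube["ndvi"].sel(time="2012-08-12")
aug_2022 = cube["ndvi"].sel(time="2022-08-13")
change = aug_2022 - aug_2012

print(cube.sizes)
print("Mean change across the grid:", float(change.mean()))
```

Because the dimensions carry names and coordinate labels, the comparison is written in terms of dates rather than row and column numbers.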

Why Pine Ridge?

Pine Ridge is home to the Oglala Lakota Nation and is one of the largest reservations in the United States by land area. The land is primarily mixed-grass prairie, badlands, and ponderosa pine hills, shaped by climate, fire, drought, and the management decisions of the people who steward it.

Understanding environmental change on these lands through data science is not just a technical exercise. It is a contribution to Tribal sovereignty, land management capacity, and the long-term health of the community.

Data Sovereignty

Before you start, read docs/data_sovereignty.md. The data in this tutorial comes from public federal sources, including NASA satellites, NOAA weather stations, and Census boundaries. That data describes Indigenous lands. The frameworks that govern responsible use of that data, including OCAP®, CARE, FAIR, and IEEE 2890-2025, are introduced in the data sovereignty document and referenced throughout the tutorial.

Tutorial Structure

Notebook	Topic	What you will build
00	Orientation	Understanding the data cube concept
01	Arrays and Dimensions	From numpy to xarray: adding meaning to numbers
02	Your First Data Cube	MODIS NDVI cube over Pine Ridge
03	Time Slicing	Querying across years, seasons, and events
04	Multi-Variable Cube	Adding temperature and precipitation
05	Spatial Operations	Clip, mask, and reproject with rioxarray
06	Analysis Patterns	Anomalies, trends, and composites
07	Open Data Cube Intro	What ODC adds and when to use it
08	Your Own Question	Build a cube for something you care about

Quick Start

# Clone the repository
git clone https://github.com/olc-techsupport/data_cube_tutorial
cd data_cube_tutorial

# Create and activate the environment
conda env create -f environment.yml
conda activate tribal-datacube

Start with 00_orientation.ipynb.
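Before opening the first notebook, it can help to confirm that the environment activated correctly. The sketch below is one way to do that from Python. The package list is an assumption drawn from the libraries named in this README; environment.yml is the authoritative list and may contain more.

```python
from importlib.util import find_spec

# Packages named in this README (assumption: environment.yml may list others).
REQUIRED = ["numpy", "xarray", "rioxarray"]


def missing_packages(names):
    """Return the names that cannot be imported in the active environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

If anything is reported missing, the most likely cause is that the conda environment is not active in the current terminal.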

Data

See README.md for the full data inventory and citation information. The cache/ directory is created automatically when you run the notebooks. It is listed in .gitignore and should never be committed.
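The notebooks handle the cache for you. For readers curious how a directory like this can be created on demand, here is a minimal sketch using only the standard library. The helper names ensure_cache_dir and cached_path and the file name example.nc are hypothetical; the notebooks may do this differently.

```python
import tempfile
from pathlib import Path


def ensure_cache_dir(root="."):
    """Create <root>/cache if it does not exist and return its path."""
    cache_dir = Path(root) / "cache"
    cache_dir.mkdir(parents=True, exist_ok=True)  # no error if already present
    return cache_dir


def cached_path(cache_dir, filename):
    """Return the path for a cached file and whether it already exists."""
    path = Path(cache_dir) / filename
    return path, path.exists()


# Demonstrated in a temporary directory so running this changes nothing in the repository.
with tempfile.TemporaryDirectory() as tmp:
    cache = ensure_cache_dir(tmp)
    path, exists = cached_path(cache, "example.nc")  # hypothetical file name
    print("Cache directory:", cache.name)
    print("Already downloaded:", exists)
```

Checking for a cached file before downloading keeps repeated notebook runs fast and avoids unnecessary requests to the data providers.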

Sources used in this tutorial:

MODIS MOD13Q1 NDVI (250m, 16-day): NASA/ORNL DAAC
gridMET daily climate (4km): University of Idaho Climatology Lab
Census TIGER AIANNH boundaries: US Census Bureau

MODIS MOD13Q1 NDVI

What: 16-day composite NDVI at 250m resolution
Source: NASA ORNL DAAC MODIS Web Service
URL: https://modis.ornl.gov/rst/api/v1/
No account required for point time series queries
Citation: Didan, K. (2021). MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061. NASA EOSDIS Land Processes DAAC. doi:10.5067/MODIS/MOD13Q1.061
Used in: Notebooks 02–06
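MODIS products label each composite with a date code (the letter A, the year, and the day of year) and store NDVI as scaled integers. The sketch below shows how those raw values can be decoded. The scale factor, fill value, and valid range are stated here as understood from the MOD13Q1 product documentation and should be checked against the user guide; the function names are hypothetical and the sample inputs are made up.

```python
from datetime import date, timedelta

NDVI_SCALE = 0.0001          # raw integers are NDVI multiplied by 10,000
NDVI_FILL = -3000            # value used where no data is available
NDVI_VALID = (-2000, 10000)  # raw values outside this range are not valid NDVI


def modis_date_to_date(code):
    """Convert a MODIS date code such as 'A2012225' to a calendar date."""
    year = int(code[1:5])
    day_of_year = int(code[5:8])
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


def scale_ndvi(raw):
    """Return NDVI as a float, or None for fill or out-of-range values."""
    if raw == NDVI_FILL or not (NDVI_VALID[0] <= raw <= NDVI_VALID[1]):
        return None
    return raw * NDVI_SCALE


def composite_start_days(step=16, days_in_year=365):
    """Day-of-year on which each 16-day composite period begins."""
    return list(range(1, days_in_year + 1, step))


print(modis_date_to_date("A2012225"))   # a mid-August composite
print(scale_ndvi(7215), scale_ndvi(-3000))
print(len(composite_start_days()), "composites per year")
```

Treating fill values as missing rather than as very low NDVI matters for every later step, since a single unmasked fill value can distort a seasonal average.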

gridMET Daily Surface Meteorology

What: Daily 4km gridded temperature, precipitation, and fire weather variables
Source: University of Idaho Climatology Lab via OPeNDAP
URL: https://www.climatologylab.org/gridmet.html
No account required
Citation: Abatzoglou, J.T. (2013). Development of gridded surface meteorological data for ecological applications and modelling. International Journal of Climatology. doi:10.1002/joc.3413
Used in: Notebooks 04–06
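gridMET is daily while MOD13Q1 arrives every 16 days, so the two time steps have to be reconciled before they can share a cube. One possible approach, sketched below with the standard library only, is to total the daily values within 16-day windows counted from 1 January. This is an assumption about how the alignment could be done, not a description of what Notebook 04 does; the function names are hypothetical and the precipitation values are synthetic.

```python
from datetime import date, timedelta


def composite_index(day, step=16):
    """Zero-based index of the 16-day window (counted from 1 January) containing `day`."""
    return (day.timetuple().tm_yday - 1) // step


def sum_by_composite(daily, step=16):
    """Sum {date: value} daily records into {(year, window_index): total}."""
    totals = {}
    for day, value in daily.items():
        key = (day.year, composite_index(day, step))
        totals[key] = totals.get(key, 0.0) + value
    return totals


# Synthetic daily precipitation (mm): 1.0 mm on every day of January 2012.
daily_precip = {date(2012, 1, 1) + timedelta(days=i): 1.0 for i in range(31)}

totals = sum_by_composite(daily_precip)
for (year, window), total in sorted(totals.items()):
    print(year, "window", window, "total", total, "mm")
```

Summing suits precipitation; for temperature a mean over each window would usually be the more sensible choice.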

Census TIGER AIANNH Boundaries

What: American Indian/Alaska Native/Native Hawaiian Areas boundaries
Source: US Census Bureau TIGER/Line Shapefiles (2023 vintage)
URL: https://www.census.gov/cgi-bin/geo/shapefiles/index.php
No account required
Citation: US Census Bureau, TIGER/Line Shapefiles (2023).
Governance note: Census-defined boundaries are for statistical purposes only. They do not represent legal jurisdiction or Tribal self-definition.
Used in: Notebooks 02–08

Data Sovereignty and Governance

All data in this tutorial describes Indigenous lands and environments. The frameworks that guide responsible use are:

OCAP®: https://fnigc.ca/ocap-training/
CARE Principles: https://www.gida-global.org/care
FAIR Principles: https://www.go-fair.org/fair-principles/
IEEE 2890-2025: https://standards.ieee.org/ieee/2890/10318/

See docs/data_sovereignty.md for full discussion.

For Instructors

Each notebook is self-contained and includes:

A conceptual introduction (no code required)
Step-by-step code cells with detailed comments
Discussion questions for the classroom
A "Going Further" section pointing to more advanced resources

Recommended sequence: notebooks 00–06 in order, then 07 as an optional advanced session, then 08 as a capstone project.

Estimated time: 2–3 hours per notebook for a classroom session, or self-paced.

Acknowledgments

This work is part of a broader effort to build earth data science capacity at Tribal colleges and universities.

Data governance frameworks referenced: OCAP®, CARE Principles, FAIR Principles, IEEE 2890-2025.
