An unsupervised machine-learning pipeline for classifying farmed red deer (Cervus elaphus) behaviour from wearable tri-axial accelerometer data.
STAG discovers prototypical movement patterns directly from sensor streams using k-means clustering, chains them into higher-order behavioural sequences via a first-order Markov transition model, and runs on a 16 MHz microcontroller at over 4 × 10³ classifications per second — no GPU, cloud link, or labelled training data required at inference time.
| Stage | Module | Description |
|---|---|---|
| 1 | `stag.sync` | Synchronise head & ear accelerometer streams via calibration-drop events |
| 2 | `stag.database` | Ingest synchronised `.h5` files into a SQLite database (SQLAlchemy ORM) |
| 3 | `stag.gps` | Compute ground speed and path tortuosity from GPS fixes (NZMG projection) |
| 4 | `stag.clustering` | GPU-accelerated k-means with contiguous leave-out stability analysis |
| 5 | `stag.analysis` | Transition matrices, bout statistics, and Markov super-prototypes |
```
stag/
├── stag/                 # Python package
│   ├── sync/             # Sensor synchronisation
│   ├── database/         # SQLAlchemy ORM & database construction
│   ├── gps/              # GPS trajectory analysis & plotting
│   ├── clustering/       # k-means clustering & meta-analysis
│   ├── analysis/         # Label analysis, Markov transitions, preprocessing
│   └── utils/            # Logging, filename generation, helpers
├── scripts/              # Runnable entry-point scripts
├── slurm/                # HPC job submission scripts (NeSI / Aoraki)
├── data/                 # Deer code CSVs and auxiliary data
│   └── deer_codes/       # Animal identification lookup tables
├── docs/                 # Sphinx documentation source
├── tests/                # Unit tests (to be expanded)
├── CITATION.cff          # Machine-readable citation metadata
├── LICENSE               # MIT License
├── pyproject.toml        # Build system & dependencies
└── environment.yml       # Conda environment specification
```
```bash
conda env create -f environment.yml
conda activate stag
pip install -e .
```

For GPU-accelerated clustering you additionally need RAPIDS cuML installed in your environment.
The pipeline runs in five stages. Each stage is a single command from the
repo root and produces the input file the next stage consumes; paths are
resolved from stag/constants.py and can be overridden via the CLI flags
shown for each stage. The full sequence reproduces the manuscript's
Sprint 1–3 analyses end-to-end.
```bash
conda env create -f environment.yml && conda activate stag
pip install -e ".[dev]"
pytest   # 80+ unit tests
```

```python
from stag.sync.data_sync import BetterDataSync

syncer = BetterDataSync(
    deer_id="R1_D1",
    head_data=head_df,
    ear_data=ear_df,
    window_dict={"start": 0, "end": 50000},
)
syncer.run_synchronization()  # writes deer_data_gps.db
```

The synchroniser locates three calibration drops in both signals,
solves for the time offset, fuses ear + head into a single
6-dimensional accelerometer stream, and writes one row per sample into
the SQLite database at LOCAL_DATA_DIR/deer_data_gps.db. One deer per
call; loop over the deer-code lookup CSV in data/deer_codes/ to fill
the full cohort.
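The offset-solving step is simple enough to sketch in isolation. This is an illustrative stand-in, not the code in `stag.sync.data_sync`; it assumes the three drop timestamps have already been matched between the two streams:

```python
import statistics

def solve_time_offset(head_drops, ear_drops):
    """Estimate the ear-vs-head clock offset (seconds) from matched
    calibration-drop timestamps. The median of the pairwise differences
    is robust to a single mis-localised drop."""
    return statistics.median(e - h for h, e in zip(head_drops, ear_drops))

# Three calibration drops detected in each stream:
offset = solve_time_offset([10.0, 40.0, 70.0], [12.5, 42.4, 72.6])
# offset = 2.5 s: shift the ear stream by this amount before fusing
```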
```bash
python scripts/preprocess_clustering_data.py   # → clust_data_maxabs_6col.npy
```

Streams the synchronised database, clips column 5 to ±7.99 g (the per-animal sensor saturation), applies the per-column MaxAbs scaler whose divisors are recorded in clust_data_maxabs_6col.maxabs.csv, and emits a memmap-friendly .npy that the GPU clustering reads directly.
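The clip-then-scale step can be sketched with NumPy. Column index and clip value follow the description above; the function name and the zero-column guard are illustrative, not the script's actual code:

```python
import numpy as np

def preprocess(x, clip_col=5, clip_val=7.99):
    """Clip the saturating column, then MaxAbs-scale every column so
    each feature lies in [-1, 1]. Returns scaled data and the divisors
    (which the real pipeline records to a .maxabs.csv sidecar)."""
    x = x.copy()
    x[:, clip_col] = np.clip(x[:, clip_col], -clip_val, clip_val)
    divisors = np.max(np.abs(x), axis=0)
    divisors[divisors == 0] = 1.0  # guard against all-zero columns
    return x / divisors, divisors

data = np.array([[ 1.0, -4.0,  2.0, 0.50,  3.0,  9.5],
                 [-2.0,  2.0, -1.0, 0.25, -6.0, -8.5]])
scaled, div = preprocess(data)
# column 5 is clipped to ±7.99 *before* its divisor is computed
```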
```bash
python scripts/run_internal_metrics.py --chosen-k 8   # → results/internal_metrics/
```

Sweeps k = 2..30 under the contiguous-leave-out protocol (50 cut positions per k), computes Calinski–Harabasz, Hungarian-matched centroid stability, stratified Silhouette, and the Kneedle elbow on the inertia curve, then renders the four-panel Figure 2 plus a one-row selection_summary.csv recording the k = 8 choice and its bounds.
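The Hungarian-matched stability metric pairs each full-data centroid with its nearest leave-out counterpart before averaging distances, so label permutations between runs don't inflate the score. A minimal sketch using SciPy's `linear_sum_assignment` (the Hungarian algorithm); this is an illustrative reduction, not the `stag.clustering` meta-analysis code:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def centroid_stability(c_full, c_loo):
    """Hungarian-match two (k, d) centroid sets by pairwise Euclidean
    distance and return the mean matched distance; lower = more stable."""
    cost = np.linalg.norm(c_full[:, None, :] - c_loo[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()

a = np.array([[0.0, 0.0], [1.0, 1.0]])
b = np.array([[1.1, 1.0], [0.0, 0.1]])  # same centroids, permuted + jittered
# matching recovers a[0]<->b[1] and a[1]<->b[0] despite the label swap
```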
```bash
python scripts/run_external_validation.py   # → results/sprint2/
```

Loads every ground-truthing register found under data/annotations/, computes the confusion matrix, ARI, and NMI against the k = 8 labels, and runs the 1,000-bootstrap sub-human test. Writes a Markdown report and the underlying CSVs alongside Figure 3.
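ARI and NMI come ready-made in scikit-learn's metrics module; the confusion-matrix step itself needs only NumPy. A toy sketch (labels and shapes here are invented, not from the annotation registers):

```python
import numpy as np

def confusion(truth, pred, n_true, n_pred):
    """Count co-occurrences of annotated behaviours (rows) against
    k-means cluster labels (columns)."""
    m = np.zeros((n_true, n_pred), dtype=int)
    for t, p in zip(truth, pred):
        m[t, p] += 1
    return m

truth = [0, 0, 1, 1, 2, 2]   # 3 annotated behaviours
pred  = [3, 3, 1, 1, 0, 1]   # 4 cluster labels
cm = confusion(truth, pred, 3, 4)
# a matrix that is diagonal-heavy after Hungarian matching of rows to
# columns indicates good agreement between clusters and annotations
```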
```bash
python scripts/cache_label_timeline.py   # one-off helper
python scripts/run_sequence_stats.py --n-shuffles 1000 --percentile 99.9
# → results/sprint3/
```

Run-length-encodes the per-sample cluster labels into bout streams, generates the first-order Markov shuffle null distribution, identifies super-prototypes that clear both the joint 99.9th-percentile and the Benjamini–Hochberg q < 0.05 thresholds, and writes super_prototype_triplets.csv plus the day/night Wilcoxon table and per-animal hourly time budgets.
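The bout-encoding and transition-matrix steps are compact enough to sketch; these helpers are illustrative stand-ins, not the `stag.analysis` implementation:

```python
import numpy as np

def run_length_encode(labels):
    """Collapse per-sample labels into (label, length) bouts."""
    bouts = []
    for lab in labels:
        if bouts and bouts[-1][0] == lab:
            bouts[-1][1] += 1
        else:
            bouts.append([lab, 1])
    return [(lab, n) for lab, n in bouts]

def transition_matrix(bout_labels, k):
    """Row-normalised first-order Markov transitions between bouts."""
    t = np.zeros((k, k))
    for a, b in zip(bout_labels, bout_labels[1:]):
        t[a, b] += 1
    rows = t.sum(axis=1, keepdims=True)
    return np.divide(t, rows, out=np.zeros_like(t), where=rows > 0)

labels = [0, 0, 1, 1, 1, 2, 0, 0]
bouts = run_length_encode(labels)           # [(0, 2), (1, 3), (2, 1), (0, 2)]
T = transition_matrix([b for b, _ in bouts], k=3)
```

Shuffling the bout-label sequence (while keeping bout durations fixed) and recomputing `T` many times is one way to build the kind of Markov null distribution the script tests super-prototypes against.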
```bash
python scripts/run_tortuosity.py R4_D1 bart_paths   # one deer, one path-system
```

Loads the GPS track for the named deer, computes Hausdorff-corrected tortuosity and speed, and saves the trajectory and summary plots.
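The basic tortuosity ratio (without the Hausdorff correction the script applies) is travelled distance over chord distance, sketched here on projected coordinates:

```python
import math

def tortuosity(points):
    """Path tortuosity: travelled distance divided by the straight-line
    (chord) distance between the first and last fix.
    1.0 = perfectly straight; larger values = more winding."""
    step = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    return step / chord if chord else math.inf

track = [(0, 0), (3, 4), (6, 0)]  # two 5 m legs, 6 m end-to-end
tortuosity(track)                 # → 10 / 6 ≈ 1.667
```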
Each stage is independent — re-run any stage after changing its parameters without invalidating the earlier ones, as long as the input file from the previous stage is still on disk.
For the SLURM-orchestrated versions of stages 2 and 3 (cohort-scale
runs on the Aoraki HPC cluster), see slurm/.
The trained nearest-centroid classifier (stag.embedded.nearest_centroid,
Q4.12 fixed-point, K = 8, D = 6) was benchmarked on ten microcontrollers
spanning every architectural class currently used in animal-borne
biologger and maker-board deployments. The identical C source was
compiled with the vendor-recommended gcc cross-toolchain at -Os -g
and run under a cycle-tracking emulator (simavr for AVR, Renode for
Cortex-M, mspdebug for MSP430, Espressif QEMU for Xtensa). Per-MCU
firmware, linker scripts, and Python runners live under
stag/embedded/benchmark/.
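The kernel itself is C, but its arithmetic is easy to mirror in Python. A sketch of Q4.12 nearest-centroid classification; names here are illustrative, and the real kernel accumulates the squared distances in 32-bit integers to avoid overflow:

```python
Q = 12  # Q4.12: 4 integer bits, 12 fractional bits, 16-bit signed storage

def to_q412(x):
    """Quantise a float to Q4.12 fixed point, saturating at int16 range."""
    return max(-32768, min(32767, round(x * (1 << Q))))

def classify(sample_q, centroids_q):
    """Nearest centroid by squared Euclidean distance, integer-only:
    no sqrt is needed because argmin is preserved under squaring."""
    best, best_d = -1, None
    for k, c in enumerate(centroids_q):
        d = sum((s - v) * (s - v) for s, v in zip(sample_q, c))
        if best_d is None or d < best_d:
            best, best_d = k, d
    return best

cents = [[to_q412(v) for v in row] for row in [[0.0] * 6, [1.0] * 6]]
sample = [to_q412(0.9)] * 6
classify(sample, cents)  # nearest to the all-ones centroid → 1
```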
| MCU | Architecture | Clock | Cyc / call | Throughput | × 50 Hz |
|---|---|---|---|---|---|
| ATmega328P (Arduino Uno) | 8-bit AVR | 16 MHz | 3 554 | 4.5 k / s | 90× |
| ATmega32U4 (Pro Micro / Feather) | 8-bit AVR | 16 MHz | 3 554 | 4.5 k / s | 90× |
| ATmega2560 (Arduino Mega) | 8-bit AVR | 16 MHz | 3 730 | 4.3 k / s | 86× |
| MSP430G2553 (TI LaunchPad) | 16-bit MSP430 | 16 MHz | 31 652 | 506 / s | 10× |
| SAMD21G18A (Feather M0 / MKR Zero) | Cortex-M0+ | 48 MHz | 798 ‡ | 60 k / s | 1 200× |
| nRF52840 (Feather nRF52840) | Cortex-M4F | 64 MHz | 415 ‡ | 154 k / s | 3 080× |
| RP2040 (Raspberry Pi Pico) | Cortex-M0+ | 133 MHz | 798 ‡ | 167 k / s | 3 340× |
| STM32F407 (STM32F4-Discovery) | Cortex-M4F | 168 MHz | 415 ‡ | 405 k / s | 8 100× |
| ESP32 (Espressif WROOM-32) | Xtensa LX6 | 240 MHz | 188 § | 1.28 M / s | 25 600× |
| i.MX RT1064 (NXP RT106x / Teensy) | Cortex-M7 | 600 MHz | 415 ‡† | 1.45 M / s | 29 000× |
Every silicon class clears the 50 Hz inertial-sampling budget by at
least an order of magnitude, with the value-line MSP430G2553 (no
hardware multiplier; software-emulated __mulhi3) being the slowest
at a 10× margin.
- ‡ Renode `cpu ExecutedInstructions`, single-issue — silicon will be the same or faster.
- † The Cortex-M7's dual-issue pipeline typically realises ≈ 1.3 IPC on integer workloads; RT106x silicon throughput exceeds the reported value by an estimated 25 %.
- § Espressif QEMU virtual CCOUNT under `-icount shift=auto`; pipeline and cache effects not modelled.
- AVR rows are simavr-measured hardware cycles.
See also CITATION.cff for machine-readable citation metadata.
This project is licensed under the MIT License — see LICENSE for details.
- Alexander R. H. Matthews — Department of Zoology, University of Otago, Dunedin, New Zealand
- Lindsay R. Matthews — Matthews Research International LP, New Zealand
- Bart R. H. Geurten — Department of Zoology, University of Otago, Dunedin, New Zealand (corresponding author: bart.geurten@otago.ac.nz)
This pipeline was developed at the Department of Zoology, University of
Otago. Early development — including the original per-replicate
exploratory Jupyter notebooks, the per-deer data-merging notebooks, and
the tortuosity-analysis notebooks — took place under
github.com/alexrhmatthews/headshake_project
and remains available there for provenance. The codebase was then
reorganised and renamed to STAG for publication: every function the
original notebooks performed has been refactored into the production
Python package (stag.sync.data_sync and stag.sync.utils for the
ear/head IMU peak-matching and 3-drop calibration sync; stag.gps.*
for the GPS and tortuosity analyses; stag.database.* for the
SQLAlchemy ORM and SQLite consolidation; stag.clustering.* and
stag.analysis.* for k-means and the first-order Markov sequence
analysis) and is exercised by the scripts/ entry points and the
tests/ suite. The notebooks themselves are therefore not duplicated
in this repository.