Jianhao Zheng, Liyuan Zhu, Zihan Zhu, Iro Armeni
Computer Vision and Pattern Recognition (CVPR) 2026
Paper Coming Soon | Video | Project Page
WildPose is a unified monocular camera pose-estimation framework for in-the-wild videos, including dynamic scenes, static scenes, and low-ego-motion sequences. It combines 3D-aware features from a frozen MASt3R backbone with differentiable dense bundle adjustment and learned motion masks for robust trajectory estimation.
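The intuition behind the motion masks is that pixels on moving objects should not drive the pose optimization. A minimal toy sketch of that weighting idea in PyTorch (illustrative only; downweight_dynamic and the tensor shapes are hypothetical, not the WildPose API):

import torch

def downweight_dynamic(confidence, motion_mask):
    # Toy version of the idea: residuals on likely-moving pixels get low
    # weight, so pose optimization relies on the static background.
    return confidence * (1.0 - motion_mask)

confidence = torch.rand(1, 64, 64)   # stand-in for feature-matching confidence
motion_mask = torch.rand(1, 64, 64)  # stand-in for a learned motion mask (1 = dynamic)
weights = downweight_dynamic(confidence, motion_mask)
print(weights.shape)  # torch.Size([1, 64, 64])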
TODO:
- Add run and evaluation scripts for the Sintel dataset.
- Add scripts for custom videos.
- Release training code.
The following setup follows the working notes in log of installation.md. The tested environment uses Python 3.10, PyTorch 2.5.1 with CUDA 12.4 wheels, MASt3R, lietorch, and the local CUDA extension built from setup.py.
- Clone the repository and initialize submodules.
git clone --recursive https://github.com/GradientSpaces/WildPose.git
cd WildPose
git submodule update --init --recursive
- Create and activate the conda environment.
conda create -n wildpose python=3.10 ninja mkl mkl-include -c conda-forge -y
conda activate wildpose
- Install PyTorch and torch-scatter.
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.5.0+cu124.html
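Optionally, verify that the CUDA-enabled wheels were picked up before compiling anything (a quick sanity check; the exact version string may differ on your machine):

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"

On this setup it should print something like 2.5.1+cu124 12.4 True.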
- Install local dependencies.
# Optional: set this to match your GPU architectures before compiling CUDA extensions.
export TORCH_CUDA_ARCH_LIST="7.5;8.6;8.9;9.0"
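# Not sure which values to list? You can query the compute capability of the
# visible GPU first (assumes the CUDA-enabled torch install from the step above):
# python -c "import torch; print(torch.cuda.get_device_capability())"  # e.g. (8, 6) -> "8.6"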
pip install --no-build-isolation thirdparty/lietorch
pip install --no-build-isolation -e thirdparty/mast3r
- Build WildPose's CUDA backend and install Python requirements.
pip install --no-build-isolation .
pip install -r requirements.txt
- Check the installation.
python - <<'PY'
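# Each import below checks one installed component; droid_backends is the
# local CUDA extension built from setup.py by the pip install step above.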
import torch
import lietorch
import mast3r
import droid_backends
print("CUDA available:", torch.cuda.is_available())
PY
Create a pretrained/ directory and download both checkpoints from the WildPose Hugging Face repository:
mkdir -p pretrained/
wget https://huggingface.co/gradient-spaces/WildPose/resolve/main/wildpose_v0.pth -P pretrained/
wget https://huggingface.co/gradient-spaces/WildPose/resolve/main/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth -P pretrained/
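Optionally, confirm that both checkpoints load (a minimal check assuming they are standard torch.save archives; their exact contents are not documented here):

python - <<'PY'
import torch
# Load on CPU; no GPU is needed for this check.
for path in ["pretrained/wildpose_v0.pth",
             "pretrained/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth"]:
    ckpt = torch.load(path, map_location="cpu")
    size = len(ckpt) if isinstance(ckpt, dict) else "?"
    print(path, "->", type(ckpt).__name__, "with", size, "top-level entries")
PY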
Download the dynamic benchmark datasets:
bash scripts_downloading/download_dynamic_all.sh
This downloads the Wild-SLAM Mocap, Bonn Dynamic, and TUM RGB-D dynamic sequences used by scripts_run/run_dynamic_all.sh into:
datasets/
Wild_SLAM_Mocap/
Bonn/
TUM_RGBD/
Run all dynamic benchmarks:
bash scripts_run/run_dynamic_all.sh
You can also run one benchmark group at a time:
bash scripts_run/run_dynamic_all.sh wild_slam_mocap
bash scripts_run/run_dynamic_all.sh bonn
bash scripts_run/run_dynamic_all.sh tum_rgbd
Results are saved under the output/ directory. Each benchmark has one folder, and each scene/sequence has its own run folder:
output/
<benchmark>/
<scene>/
cfg.yaml
video.npz
traj/
est_poses_full.txt
metrics_full_traj.txt
metrics_kf_traj.txt
full_traj_2d.png
kf_traj_2d.png
before_final_ba/
metrics_kf_traj.txt
kf_traj_2d.png
For example, a Wild-SLAM Mocap run writes to output/Wild_SLAM_Mocap/crowd/, while a 7-Scenes run writes to output/7scenes/chess/. The estimated full trajectory is stored in TUM format at traj/est_poses_full.txt, and the main pose metrics are summarized in traj/metrics_full_traj.txt.
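Because est_poses_full.txt is in TUM trajectory format, each line is one pose of the form timestamp tx ty tz qx qy qz qw, so it can be parsed with a few lines of NumPy (a small usage sketch; the path is the Wild-SLAM Mocap example above):

import numpy as np

# TUM trajectory format: timestamp tx ty tz qx qy qz qw per line; '#' starts a comment.
traj = np.loadtxt("output/Wild_SLAM_Mocap/crowd/traj/est_poses_full.txt")
timestamps = traj[:, 0]
positions = traj[:, 1:4]    # translation (tx, ty, tz)
quaternions = traj[:, 4:8]  # orientation quaternion (qx, qy, qz, qw)
print(len(traj), "poses; first position:", positions[0])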
Download the static benchmark datasets:
bash scripts_downloading/download_static_all.sh
This downloads the 7-Scenes and TUM RGB-D static benchmark sequences used by scripts_run/run_static_all.sh into:
datasets/
7-scenes/
TUM_RGBD/
Run all static benchmarks:
bash scripts_run/run_static_all.sh
You can also run one benchmark group at a time:
bash scripts_run/run_static_all.sh seven_scenes
bash scripts_run/run_static_all.sh tum_ablation
This repository is initialized from WildGS-SLAM, and the README structure follows its release style. We thank the WildGS-SLAM authors for releasing their codebase.
WildPose also builds on ideas and components from DROID-SLAM, lietorch, MASt3R/DUSt3R, GO-SLAM/GlORIE-SLAM-style factor graph optimization, MoGe, and standard RGB-D trajectory evaluation tools. We thank the authors of these projects for making their work publicly available.
This work is supported by the Center for Integrated Facility Engineering (CIFE) and the Stanford Robotics Center (SRC). We also thank Stanford's Marlowe and Sherlock clusters for providing GPU computing resources for model training and evaluation.
If you find this code or paper useful, please cite:
@inproceedings{Zheng2026WildPose,
author = {Zheng, Jianhao and Zhu, Liyuan and Zhu, Zihan and Armeni, Iro},
title = {WildPose: A Unified Framework for Robust Pose Estimation in the Wild},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2026}
}
For questions, comments, and bug reports, please contact Jianhao Zheng.
