gpt2cpp
- tokenizer training and encode/decode,
- dataset packing into binary corpora,
- GPT-2-style decoder-only transformer modules,
- training/evaluation/generation workflows,
- terminal chat,
- a minimal local web demo,
- CPU-first layout with optional LibTorch CUDA acceleration,
- production-minded repo structure, artifacts, logging, and tests.
Honest status: the tokenizer, data pipeline, CLI, and serving pieces are fully owned C++ code. The model/training/inference stack is built on LibTorch to give a realistic CPU/CUDA path. Some advanced performance features, such as a full KV cache, fused kernels, and AMP scaler integration, are explicitly marked as extension points rather than faked.
CPU-only build without LibTorch (tokenizer/data/CLI tooling only):

```sh
cmake -S . -B build -DGPT2CPP_ENABLE_TORCH=OFF -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
```

LibTorch CPU build:

```sh
cmake -S . -B build -DCMAKE_PREFIX_PATH=/path/to/libtorch -DGPT2CPP_ENABLE_TORCH=ON -DGPT2CPP_ENABLE_CUDA=OFF -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
```

This path uses a local Python virtual environment to provide Torch and its CMake package, then builds the full training/inference stack on CPU.
Shortcut:
```sh
./scripts/setup_macos.sh
```

Equivalent manual steps:

```sh
python3 -m venv .venv
. .venv/bin/activate
python -m pip install torch
cmake -S . -B build \
  -DCMAKE_BUILD_TYPE=Release \
  -DGPT2CPP_ENABLE_TORCH=ON \
  -DGPT2CPP_ENABLE_CUDA=OFF \
  -DCMAKE_PREFIX_PATH="$(python -c 'import torch; print(torch.utils.cmake_prefix_path)')"
cmake --build build -j
```
Quick demo:

```sh
./scripts/run_demo.sh
```
Or run the steps yourself:

```sh
./build/gpt2cpp train --config configs/tiny.toml
LATEST_MODEL="$(ls -td runs/*/model_bundle | head -n1)"
./build/gpt2cpp generate --model "$LATEST_MODEL" --prompt "To be"
```

Optional local web demo:
```sh
LATEST_MODEL="$(ls -td runs/*/model_bundle | head -n1)"
./build/gpt2cpp serve --model "$LATEST_MODEL" --port 8080 --web-root web
```

For CUDA, point CMake at a CUDA-enabled LibTorch:

```sh
cmake -S . -B build -DCMAKE_PREFIX_PATH=/path/to/libtorch -DGPT2CPP_ENABLE_TORCH=ON -DGPT2CPP_ENABLE_CUDA=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
```
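At runtime the binary has to decide between CPU and GPU. The repo's actual wiring isn't shown here; the sketch below is a minimal, hypothetical device picker, assuming the CMake option is surfaced as a compile definition of the same name.

```cpp
#include <torch/torch.h>

// Hypothetical helper: prefer CUDA when the build enables it and a GPU is
// visible at runtime; otherwise fall back to CPU. The macro name mirrors the
// CMake option but is an assumption about this repo's build wiring.
torch::Device pick_device() {
#ifdef GPT2CPP_ENABLE_CUDA
    if (torch::cuda::is_available()) {
        return torch::Device(torch::kCUDA, 0);
    }
#endif
    return torch::Device(torch::kCPU);
}
```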
Train a tokenizer:

```sh
./build/gpt2cpp tokenizer train --input examples/tiny_shakespeare/sample.txt --output artifacts/tokenizer --vocab-size 512 --special "<|system|>,<|user|>,<|assistant|>,<|endoftext|>"
```
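For intuition, here is a minimal sketch of the core of byte-level BPE encoding: greedily applying the lowest-ranked merge until none applies. The function name and the pair-to-rank map are illustrative assumptions, not the repo's real API; the ranks would come from merges.tsv.

```cpp
#include <climits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical encoder core: repeatedly merge the adjacent pair with the
// lowest rank (earliest learned merge) until no adjacent pair is mergeable.
std::vector<std::string> apply_merges(
    std::vector<std::string> pieces,
    const std::map<std::pair<std::string, std::string>, int>& ranks) {
    while (pieces.size() > 1) {
        int best_rank = INT_MAX;
        size_t best_i = 0;
        for (size_t i = 0; i + 1 < pieces.size(); ++i) {
            auto it = ranks.find({pieces[i], pieces[i + 1]});
            if (it != ranks.end() && it->second < best_rank) {
                best_rank = it->second;
                best_i = i;
            }
        }
        if (best_rank == INT_MAX) break;  // no applicable merge left
        pieces[best_i] += pieces[best_i + 1];
        pieces.erase(pieces.begin() + best_i + 1);
    }
    return pieces;
}
```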
Prepare packed data:

```sh
./build/gpt2cpp data prepare --tokenizer artifacts/tokenizer --input examples/tiny_shakespeare --output artifacts/dataset --val-ratio 0.1
```
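The exact binary layout of train.bin/val.bin and dataset.meta is internal to the repo. As a sketch only, assuming train.bin is a flat array of little-endian uint32 token ids with no header, a reader could look like this:

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical reader: treats the packed file as a flat array of uint32 token
// ids. The real id width and any header/metadata are described by
// dataset.meta, so this layout is an assumption for illustration only.
std::vector<uint32_t> load_packed_tokens(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);
    const std::streamsize bytes = in.tellg();
    std::vector<uint32_t> ids(static_cast<size_t>(bytes) / sizeof(uint32_t));
    in.seekg(0);
    in.read(reinterpret_cast<char*>(ids.data()),
            static_cast<std::streamsize>(ids.size() * sizeof(uint32_t)));
    return ids;
}
```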
Train:

```sh
./build/gpt2cpp train --config configs/tiny.toml
```
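The training loop itself lives in the repo; the sketch below shows the shape of one next-token-prediction step with LibTorch, under the assumption of a decoder-only module (`model` is a stand-in for the repo's GPT-2 module, not its real type).

```cpp
#include <torch/torch.h>

// One hypothetical optimization step for a decoder-only LM: cross-entropy
// between logits at positions [0, T) and targets shifted right by one.
template <typename Model>
void train_step(Model& model, torch::optim::AdamW& opt, torch::Tensor batch) {
    // batch: [B, T+1] int64 token ids; inputs are the first T, targets the last T.
    auto inputs  = batch.slice(/*dim=*/1, 0, batch.size(1) - 1);
    auto targets = batch.slice(/*dim=*/1, 1, batch.size(1));
    auto logits  = model->forward(inputs);          // [B, T, vocab]
    auto loss = torch::nn::functional::cross_entropy(
        logits.reshape({-1, logits.size(-1)}),      // [B*T, vocab]
        targets.reshape({-1}));                     // [B*T]
    opt.zero_grad();
    loss.backward();
    opt.step();
}
```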
Generate:

```sh
./build/gpt2cpp generate --model runs/.../model_bundle --prompt "Once upon a time" --temperature 0.8 --top-k 40
```
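The --temperature and --top-k flags map onto standard sampling; a minimal LibTorch sketch of that combination (function name hypothetical, logits assumed to be the 1-D last-position scores over the vocabulary):

```cpp
#include <torch/torch.h>

// Hypothetical temperature + top-k sampling over last-position logits [vocab].
int64_t sample_next(torch::Tensor logits, double temperature, int64_t top_k) {
    logits = logits / temperature;                    // <1 sharpens, >1 flattens
    auto [vals, idx] = torch::topk(logits, top_k);    // keep the k best logits
    auto probs = torch::softmax(vals, /*dim=*/-1);    // renormalize over the k
    auto pick = torch::multinomial(probs, /*num_samples=*/1);
    return idx[pick.item<int64_t>()].item<int64_t>(); // map back to a vocab id
}
```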
Chat:

```sh
./build/gpt2cpp chat --model runs/.../model_bundle
```
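The chat mode presumably formats turns with the special tokens registered at tokenizer training. The template below is a guess at that shape, not the repo's actual format; `Turn` and `build_prompt` are hypothetical names.

```cpp
#include <string>
#include <vector>

struct Turn { std::string role; std::string text; };  // role: system/user/assistant

// Hypothetical chat template using the special tokens from tokenizer training.
// The real `chat` subcommand may format turns differently.
std::string build_prompt(const std::vector<Turn>& turns) {
    std::string out;
    for (const auto& t : turns) {
        out += "<|" + t.role + "|>" + t.text + "<|endoftext|>";
    }
    out += "<|assistant|>";  // cue the model to produce the reply
    return out;
}
```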
Serve:

```sh
./build/gpt2cpp serve --model runs/.../model_bundle --port 8080 --web-root web
```

Then open:

http://127.0.0.1:8080/
Pipeline:

```text
raw text / jsonl
      │
      ▼
byte-level BPE tokenizer
      │
      ├── vocab.tsv
      ├── merges.tsv
      └── special_tokens.tsv
      │
      ▼
packed dataset writer
      │
      ├── train.bin
      ├── val.bin
      └── dataset.meta
      │
      ▼
GPT-2-style transformer (LibTorch)
      │
      ├── checkpoints
      ├── metrics
      ├── samples
      └── model_bundle/
            │
            ├── generate
            ├── terminal chat
            └── local web server + UI
```
The exported bundle layout:

```text
model_bundle/
├── model.pt
├── config.snapshot.toml
├── tokenizer/
│   ├── tokenizer.meta
│   ├── vocab.tsv
│   ├── merges.tsv
│   └── special_tokens.tsv
└── manifest.txt
```
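How model.pt is serialized (TorchScript archive vs. a plain parameter archive) is an implementation detail of the repo; the sketch below assumes `torch::load` into a known module type, with `GPT2Model` and `load_bundle` as hypothetical names.

```cpp
#include <torch/torch.h>
#include <string>

// Hypothetical bundle loader: restores weights from model.pt. The config
// snapshot and tokenizer files would be parsed by the repo's own loaders,
// which are not shown here.
template <typename GPT2Model>
void load_bundle(GPT2Model& model, const std::string& bundle_dir) {
    torch::load(model, bundle_dir + "/model.pt");
}
```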
Tests cover:
- tokenizer round-trip
- packed dataset integrity
- model shape smoke test
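The round-trip property is the one hard guarantee of a byte-level tokenizer: decode(encode(s)) must reproduce s exactly, including non-ASCII bytes. A minimal sketch of such a test, with `Tokenizer` and its methods standing in for the repo's real API:

```cpp
#include <cassert>
#include <string>

// Hypothetical round-trip test: byte-level BPE must reproduce the exact
// input bytes after encode + decode, even for multi-byte UTF-8.
template <typename Tokenizer>
void test_round_trip(const Tokenizer& tok) {
    const std::string text = "To be, or not to be: naïve café\n";
    assert(tok.decode(tok.encode(text)) == text);
}
```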
Implemented:
- tokenizer + packing + CPU-only tooling
- tiny generation path
- terminal UX
- training + evaluation + checkpoints
- model bundle export
- local web demo
Next up (the extension points noted above):
- CUDA polish
- mixed precision integration
- cleaner inference hot path
- KV cache (see the sketch after this list)
- SSE/WebSocket streaming
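For the KV cache extension point, the core data structure is small. A minimal sketch, assuming per-layer key/value tensors shaped [B, heads, T, head_dim] and appending along the sequence dimension (the struct and method names are hypothetical):

```cpp
#include <torch/torch.h>
#include <vector>

// Hypothetical per-layer KV cache for incremental decoding: each new token's
// keys/values are appended so earlier positions are never recomputed.
struct KVCache {
    std::vector<torch::Tensor> k, v;  // one entry per transformer layer

    void append(size_t layer, torch::Tensor k_new, torch::Tensor v_new) {
        if (layer >= k.size()) { k.resize(layer + 1); v.resize(layer + 1); }
        k[layer] = k[layer].defined() ? torch::cat({k[layer], k_new}, /*dim=*/2) : k_new;
        v[layer] = v[layer].defined() ? torch::cat({v[layer], v_new}, /*dim=*/2) : v_new;
    }
};
```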
Further ideas:
- LoRA finetuning
- quantization
- ONNX/TensorRT export
- batched serving
- SSE token streaming
- retrieval augmentation
- conversation persistence
- function/tool calling experiments