cipher813/alpha-engine-data

alpha-engine-data

Part of Nous Ergon — Autonomous Multi-Agent Trading System. Repo and S3 names use the underlying project name alpha-engine.

Centralized data collection, storage, and distribution. Owns the price universe (ArcticDB), macro indicators, universe returns, the engineered feature store, the RAG ingestion step, and per-ticker alternative data.

System overview, Step Function orchestration, and module relationships live in alpha-engine-docs. Code index lives in OVERVIEW.md.

What this does

  • Maintains a 10-year ArcticDB price universe across ~900 S&P 500+400 tickers, refreshed weekly with daily EOD appends
  • Ingests macro indicators from FRED (rates, VIX, commodities) and computes derived signals (yield-curve slope, VIX term slope, market breadth)
  • Pulls per-ticker alternative data (analyst consensus, EPS revisions, options chains, insider filings, 13F holdings, news sentiment) only for tickers promoted by Research, which keeps API spend bounded
  • Computes the engineered feature store used by the Predictor for both training and inference
  • Runs the RAG ingestion step: SEC 10-K/10-Q/8-K, earnings transcripts, and thesis history embedded into the pgvector knowledge base that Research's qual-analyst agents query
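
As a rough illustration of the derived signals listed above, the sketch below computes a yield-curve slope, a VIX term slope, and a market-breadth figure from plain Python values. The function names, the 10-year minus 2-year convention, the ratio form of the VIX term slope, and the "fraction of tickers above a trailing mean" breadth definition are all assumptions made for illustration; the definitions used in this repo may differ.

```python
from statistics import fmean


def yield_curve_slope(long_yield_pct: float, short_yield_pct: float) -> float:
    """Long-tenor minus short-tenor yield in percentage points (assumed: 10Y - 2Y)."""
    return long_yield_pct - short_yield_pct


def vix_term_slope(front_vix: float, back_vix: float) -> float:
    """Back-tenor over front-tenor implied vol, minus 1 (assumed definition).

    Positive values indicate contango, negative values backwardation.
    """
    if front_vix <= 0:
        raise ValueError("front_vix must be positive")
    return back_vix / front_vix - 1.0


def market_breadth(closes_by_ticker: dict[str, list[float]], window: int = 50) -> float:
    """Fraction of tickers whose latest close is above their trailing mean (assumed definition)."""
    above = 0
    eligible = 0
    for closes in closes_by_ticker.values():
        if len(closes) < window:
            continue  # not enough history to form the trailing mean
        eligible += 1
        if closes[-1] > fmean(closes[-window:]):
            above += 1
    return above / eligible if eligible else float("nan")
```

Tickers with fewer than `window` closes are skipped rather than counted as "below", so a newly added constituent does not drag the breadth figure down.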

Phase 2 measurement contribution

Data is the substrate that every other module measures against. In Phase 2 this repo contributes feature coverage, freshness tracking, and per-feature drift detection that downstream modules can rely on. Every signal, prediction, and trade traces back to inputs that have been validated, freshness-checked, and tagged with quality flags.
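
To make freshness tracking and per-feature drift detection concrete, here is a minimal sketch of one way such a check could look. Everything in it is hypothetical: the `FeatureHealth` record, the `check_feature` helper, the 3-day freshness budget, and the z-score drift test are illustrative choices, not this repo's actual implementation.

```python
from dataclasses import dataclass
from datetime import date
from statistics import fmean, pstdev


@dataclass(frozen=True)
class FeatureHealth:
    name: str
    age_days: int      # calendar days since the last observation
    is_stale: bool     # True if age_days exceeds the freshness budget
    drift_z: float     # shift of the recent mean, in reference std devs
    has_drifted: bool  # True if |drift_z| exceeds the threshold


def check_feature(
    name: str,
    last_observed: date,
    as_of: date,
    reference: list[float],
    recent: list[float],
    max_age_days: int = 3,         # hypothetical freshness budget
    drift_threshold: float = 3.0,  # hypothetical z-score cutoff
) -> FeatureHealth:
    """Tag one feature with freshness and drift flags; both samples must be non-empty."""
    age_days = (as_of - last_observed).days
    ref_mean = fmean(reference)
    ref_std = pstdev(reference)
    shift = fmean(recent) - ref_mean
    if ref_std == 0:
        # A constant reference: any shift at all counts as unbounded drift.
        drift_z = 0.0 if shift == 0 else float("inf")
    else:
        drift_z = shift / ref_std
    return FeatureHealth(
        name=name,
        age_days=age_days,
        is_stale=age_days > max_age_days,
        drift_z=drift_z,
        has_drifted=abs(drift_z) > drift_threshold,
    )
```

A mean-shift z-score is only the simplest possible drift test. A distribution-level measure would catch changes in shape that leave the mean untouched.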

Architecture

flowchart LR
    APIs[External APIs<br/>polygon · FRED · FMP · SEC EDGAR · yfinance · Wikipedia] --> P1
    APIs --> RAG
    APIs --> P2
    APIs --> EOD

    P1[Phase 1 · weekly<br/>prices · macro · constituents · features]
    RAG[RAG ingestion · weekly<br/>filings · transcripts · theses]
    P2[Phase 2 · weekly<br/>alt data — promoted tickers only]
    EOD[EOD · weekday<br/>daily closes · macro refresh]

    P1 --> Arctic[(ArcticDB universe<br/>+ universe_slim)]
    EOD --> Arctic
    RAG --> Vector[(Neon pgvector)]
    P2 --> S3stage[(S3 staging)]

Quality gates run automatically after each refresh and check OHLC ordering, zero prices, extreme returns, zero volume, volume spikes, and trading-day gaps. Anomalies surface in per-step completion emails.
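
The gates named above can be sketched as a single pass over one ticker's daily bars. This is an illustrative sketch only: the `Bar` record, the `run_quality_gates` function, the flag names, and every threshold (a 50% daily move, 10x trailing volume, a 5-calendar-day gap as a stand-in for a proper trading calendar) are assumptions rather than values taken from this repo.

```python
from dataclasses import dataclass
from datetime import date
from statistics import fmean


@dataclass(frozen=True)
class Bar:
    day: date
    open: float
    high: float
    low: float
    close: float
    volume: float


def run_quality_gates(
    bars: list[Bar],
    max_abs_return: float = 0.5,      # hypothetical: flag daily moves beyond +/-50%
    volume_spike_mult: float = 10.0,  # hypothetical: flag volume above 10x trailing mean
    volume_window: int = 20,
    max_gap_days: int = 5,            # hypothetical calendar-day proxy for a trading-day gap
) -> list[tuple[date, str]]:
    """Return (day, flag) pairs for each anomaly; bars must be sorted by day."""
    flags: list[tuple[date, str]] = []
    for i, bar in enumerate(bars):
        if min(bar.open, bar.high, bar.low, bar.close) <= 0:
            flags.append((bar.day, "zero_price"))
        # High must bound open/close from above, low from below.
        if bar.high < max(bar.open, bar.close) or bar.low > min(bar.open, bar.close):
            flags.append((bar.day, "ohlc_ordering"))
        if bar.volume <= 0:
            flags.append((bar.day, "zero_volume"))
        if i == 0:
            continue  # the remaining gates need a previous bar
        prev = bars[i - 1]
        if prev.close > 0 and abs(bar.close / prev.close - 1.0) > max_abs_return:
            flags.append((bar.day, "extreme_return"))
        baseline = fmean(b.volume for b in bars[max(0, i - volume_window):i])
        if baseline > 0 and bar.volume > volume_spike_mult * baseline:
            flags.append((bar.day, "volume_spike"))
        if (bar.day - prev.day).days > max_gap_days:
            flags.append((bar.day, "trading_day_gap"))
    return flags
```

Returning flags instead of raising lets a refresh complete and report every anomaly at once, which fits the idea of surfacing them in a completion email.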

Configuration

This repo is public. config.yaml is gitignored locally; real values (S3 bucket names, API keys, email recipients) live in the private alpha-engine-config repo. Architecture and approach are public; specific values are private.

Sister repos

Module       Repo
Executor     alpha-engine
Research     alpha-engine-research
Predictor    alpha-engine-predictor
Backtester   alpha-engine-backtester
Dashboard    alpha-engine-dashboard
Library      alpha-engine-lib
Docs         alpha-engine-docs

License

MIT — see LICENSE.
