Nicolas Richard Nicolas-Richard

👋 About Me

I'm a Software Engineer on the Infrastructure team at Chime. My work primarily involves building and maintaining highly available and scalable systems using AWS, Kubernetes, ArgoCD, Datadog, and Terraform.

I mainly code in Python and Bash, with some experience in Ruby and Go.

I write about infrastructure and engineering on my blog.

Most recent blog series:

Streaming LLM inference on EKS — the build: VPC, EKS, vLLM Production Stack, and the streaming gateway.
How much can two L4s serve? It depends on the prompt. — capacity, prefix caching, and the methodology trap.
Per-tenant concurrency caps — protecting well-behaved tenants from a bursty neighbor.
Adaptive concurrency on a multi-tenant vLLM gateway: WFQ + AIMD against a TTFT SLO — the self-tuning gateway.
Autoscaling a GPU Fleet on Inference-Aware Signals - 🎢 ↔️

I've also shared insights on Chime's engineering blog:

How We Preview Kubernetes Changes at Chime: [2023] https://medium.com/life-at-chime/how-we-preview-kubernetes-changes-at-chime-5b4871847c5e | mirror
How We Upgraded Our Core Database with Just 5 Minutes of Downtime: [2025] https://careers.chime.com/en/life-at-chime/engineering-at-chime/how-we-upgraded-our-core-database-with-just-5-minutes-of-downtime/ | mirror

💬 Ask me about ...

🏄 ⛵ 🏃 🚴 🏕️ 🎸
