Nicolas-Richard/README.md

👋 About Me

I'm a Software Engineer on the Infrastructure team at Chime. My work primarily involves building and maintaining highly available and scalable systems using AWS, Kubernetes, ArgoCD, Datadog, and Terraform.

I mainly code in Python and Bash, with some experience in Ruby and Go.

I write about infrastructure and engineering on my blog.

Most recent blog series:

  1. Streaming LLM inference on EKS - the build: VPC, EKS, vLLM Production Stack, and the streaming gateway.
  2. How much can two L4s serve? It depends on the prompt. - capacity, prefix caching, and the methodology trap.
  3. Per-tenant concurrency caps - protecting well-behaved tenants from a bursty neighbor.
  4. Adaptive concurrency on a multi-tenant vLLM gateway: WFQ + AIMD against a TTFT SLO - the self-tuning gateway.
  5. Autoscaling a GPU Fleet on Inference-Aware Signals - 🎒 ↔️

I've also shared insights on Chime's engineering blog:

💬 Ask me about ...

πŸ„ β›΅ πŸƒ 🚴 πŸ•οΈ 🎸

Pinned

  1. vllm-on-eks Public

    Streaming LLM inference on EKS - companion repo for the blog posts.

    Python

  2. chime/terraform-aws-alternat Public

    High availability implementation of AWS NAT instances.

    HCL 1.2k 89

  3. chime/mani-diffy Public

    Go tool that renders Kubernetes manifests from ArgoCD Application templates and commits them to PRs for safer template reviews

    Go 46 8

  4. uptime Public

    multi-tenant uptime monitor SaaS

    Python

  5. k8s-snitch Public

    AI Kubernetes resource watcher that indexes resource changes (Pods, ConfigMaps, ReplicaSets, Rollouts, ExternalSecrets) in real time and tells you everything about them. (Chime hackathon Q3 2025)

    Python

  6. blue-green-database-switchover Public

    Scripts for orchestrating zero-downtime blue/green database migrations using AWS RDS and Route 53.

    Shell