# promptfoo

Here are 39 public repositories matching this topic...

prompt-evaluator is an open-source toolkit for evaluating, testing, and comparing LLM prompts. It provides a GUI-driven workflow for running prompt tests, tracking token usage, visualizing results, and ensuring reliability across models from providers such as OpenAI, Anthropic (Claude), and Google (Gemini).

  • Updated Dec 4, 2025
  • TypeScript

Multi-turn conversational AI agent for medication adherence, with a CI-gated evaluation harness. Built on LangGraph 1.0 with multi-LLM support (Groq/Cerebras/Anthropic), RAG with a citation gate, and OpenInference observability. A reference implementation using 100% synthetic data; the pattern transfers to other regulated industries.

  • Updated May 14, 2026
  • Python
