agent-debugging

Here are 26 public repositories matching this topic...

najeed / ai-agent-eval-harness

The open-source MultiAgentOps evaluation and verification harness for any industry business workflow.

Updated May 14, 2026
Python

OthmanAdi / langsmith-fetch-skill

🔍 AI observability skill for Claude Code. Debug LangChain/LangGraph agents by fetching execution traces from LangSmith Studio directly in your terminal.

developer-tools observability ai-agents langchain langsmith llm-ops langsmith-tracing developer-tools-ai-agent claude-skills claude-skills-creator claude-skills-hub claude-skills-libary agent-debugging

Updated Apr 6, 2026

cylestio / agent-inspector

Local open-source dev tool to debug, secure, and evaluate LLM agents. Provides static analysis, dynamic security checks, and runtime monitoring - integrates with Cursor and Claude Code.

behavior-analysis agent-trace ai-security-tool agent-security cursor-integration claude-code-plugin agent-debugging

Updated Jan 15, 2026
Python

Ylsssq926 / clawclip

Cut your OpenClaw / ZeroClaw token bill. Find which model earns its cost. Prove whether optimizations actually work. Local, no upload.

hermes ai-agent ai-observability cost-reduction local-ai agent-tools llm-cost token-optimization agent-debugging openclaw zeroclaw hermes-agent agent-analytics prompt-efficiency

Updated May 8, 2026
TypeScript

aaronlab / browsertrace

Local replay debugger for Browser Use failures with screenshots, model I/O, failed-step timelines, and public-safe HTML exports.

Updated May 14, 2026
Python

converra / agent-triage

Diagnose your AI agents in production. Extract policies from prompts, evaluate traces, generate diagnostic reports.

Updated Mar 10, 2026
TypeScript

joshualamerton / AgentLens

A real-time observability and debugging layer for AI agents.

python machine-learning ai machine-learning-algorithms devtools agents ai-agents machine-learning-projects llms ai-devtools agent-debugging

Updated Mar 11, 2026
Python

amitmishrg / agenticlens

Visual debugging, tracing, and replay for agent workflows.

nodejs ai reactjs devtools tracing developer-tools visualizations observability debugging-tools ai-agents log-visualization jsonl ai-observability llm agentic-ai agent-workflows workflow-visualization agent-debugging execution-tracing

Updated Mar 27, 2026
JavaScript

Exploreunive / agentlens

Explain why your agent failed — root-cause debugging, memory attribution, and run divergence for LLM agents.

python memory tracing developer-tools observability ai-agents llm agent-debugging

Updated Mar 31, 2026
Python

Tarunjit45 / ChainWatch

ChainWatch is a flight data recorder for multi-step AI systems. It's a CLI-based tool that records every step in an AI decision chain, links them together in order, prevents tampering, and allows you to verify the chain's integrity and replay the full decision flow.

ai artificial-intelligence audit-log autonomous-agents ai-agents ai-engineering ai-observability llm llmops ai-tracing agent-observability ai-audit agent-debugging tool-using-agents decision-tracing

Updated Jan 22, 2026
Python
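
The "link steps in order, then verify integrity" idea in the ChainWatch description above can be illustrated with a hash chain. The following is a minimal sketch under stated assumptions, not ChainWatch's actual implementation or API: the record layout and the names `append_step`, `verify_chain`, and `GENESIS` are hypothetical and chosen for illustration only. Note that a hash chain of this kind makes tampering detectable rather than impossible.

```python
import hashlib
import json

# Hypothetical placeholder "previous hash" for the first record.
GENESIS = "0" * 64


def _digest(index, payload, prev_hash):
    """Hash a record's contents together with the previous record's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,  # stable key order so the same data always hashes the same
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_step(chain, payload):
    """Append one decision step, linking it to the hash of the step before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    record = {
        "index": index,
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(index, payload, prev_hash),
    }
    chain.append(record)
    return record


def verify_chain(chain):
    """Return True only if every record is in order, linked, and unmodified."""
    prev_hash = GENESIS
    for i, record in enumerate(chain):
        if record["index"] != i or record["prev_hash"] != prev_hash:
            return False
        expected = _digest(record["index"], record["payload"], record["prev_hash"])
        if record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True
```

In this sketch, editing any recorded payload after the fact changes its recomputed hash, so `verify_chain` fails from that record onward; replaying the decision flow is then just iterating over the records in index order.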

kangjinghang / agent-chatlens

🔍 A beautiful web viewer for AI agent session files. Browse Claude Code & OpenClaw conversations with chat-style UI, timeline visualization, and zero setup.

react visualization typescript developer-tools dark-mode chat-ui claude conversation-analysis jsonl vite ai-agent session-viewer claude-code agent-debugging openclaw jsonl-viewer tool-call-visualization

Updated Apr 13, 2026
TypeScript

mda-diaz / runlens

RunLens helps teams compare and debug AI agent runs with step timelines, run diffs, and cost analysis.

python ai-agents fastapi observability-analyze llmops agent-debugging

Updated Apr 1, 2026
HTML

ptaramona / drill-sergeant-skill

Enforce communication discipline & execution hygiene for agent teams. Detect loops, route violations, stale work, and missing ownership.

message-filtering multi-agent-systems workflow-automation autonomous-agents ai-agents team-communication agent-coordination agent-orchestration agent-governance agent-debugging agent-supervision execution-monitoring

Updated Mar 11, 2026

David-Wu1119 / agentreplay

Local recorder and replay verifier for AI-agent command runs.

developer-tools replay observability ai-agent llm-security agent-debugging

Updated May 10, 2026
TypeScript

Zijian-Ni / agent-replay

🔄 Record, replay, and debug AI agent execution traces — the DevTools for AI agents

debugger devtools trace openai replay ai-agent llm anthropic agent-debugging

Updated Mar 27, 2026
TypeScript

MukundaKatta / agent-trajectory-replay-paper

Preprint paper package — Agent Trajectory Replay for Debugging Tool-Using AI Workflow Regressions (Zenodo DOI 10.5281/zenodo.20073574)

research open-science regression-testing ai-agents preprint tool-use agent-debugging trajectory-replay artifact-paper

Updated May 7, 2026
Python

daslabhq / scenegrad

TDD for AI agents — watch world state morph step-by-step. Drop-in for Vercel AI SDK / Anthropic SDK / LangChain. Scrubbable trajectories + bulk grid view.

typescript ai tdd evaluation observability trajectory ai-agents llm anthropic vercel-ai-sdk llm-agents agent-evaluation agent-observability agent-debugging

Updated May 12, 2026
TypeScript

rty90 / Android-Agent-Reliability-Runtime

Android Agent Reliability Runtime: a debugging and safety runtime for mobile GUI agents. Detect readiness, block unsafe actions, verify progress, diagnose failures, and save reproducible traces.

adb android-automation llm-agent mobile-agent gui-agent agent-observability android-agent agent-runtime agent-debugging ui-automation-testing

Updated May 8, 2026
Python

lewisnsmith / flight

MCP/tool call flight recorder | transparent STDIO proxy that logs every AI agent tool call for inspection, debugging, and research

cli typescript mcp developer-tools observability ai-agents claude llm model-context-protocol agent-debugging

Updated May 4, 2026
TypeScript

miloantaeus / agent-audit

Free self-serve diagnostic for AI coding agents (Claude Code, Cursor, Aider, Codex, custom Agent SDK). 32-rule library detects silent failures, deadlocks, runaway cost, prompt injection, hallucinated tool calls, frozen state, infinite loops, eval drift. Built by an autonomous AI agent.

developer-tools cursor observability ai-agents openai-codex aider ai-coding claude-code agent-sdk agent-debugging

Updated May 13, 2026
