feat: daemon speaks both HTTP and MCP transports #191
📝 Walkthrough
This PR implements an MCP daemon bridge that lets the MCP stdio server forward tool calls to a running Gradata daemon over HTTP instead of spawning local brains. The daemon advertises itself, both stdio and daemon expose HTTP MCP endpoints, and comprehensive tests validate end-to-end bridging.
Changes: MCP daemon bridge
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Gradata/src/gradata/daemon.py`:
- Around line 798-807: Replace the direct call to _write_pid_file(...) that
writes self._brain_dir / ".daemon.json" with the repository's atomic JSON write
helper so the file is written atomically; call the helper to write the same
payload (port, pid/process info and started_at) to self._brain_dir /
".daemon.json" instead of using _write_pid_file, using the same inputs
(actual_port and self._started_at) and keep the contextlib.suppress(OSError)
wrapper around the atomic write to preserve best-effort, non-fatal behavior.
- Around line 662-666: The endpoint currently assumes the parsed JSON from
self._read_json() is a dict and calls body.get(...), which raises AttributeError
for non-object JSON; modify the handler to first check that body is an instance
of dict (e.g., if not isinstance(body, dict): self._send_json({"error":"request
body must be an object"}, 400) and return) before accessing body.get("name") and
body.get("arguments"), and keep using _send_json to return the 400 error for
invalid bodies.
In `@Gradata/src/gradata/mcp_server.py`:
- Around line 123-130: The loop currently accepts any daemon that answers
/health; change the probe-and-accept logic to verify the daemon’s reported
brain_dir matches the requested brain before returning a client: modify or
overload cls._probe(url) to return the health dict (or add a new method like
cls._probe_health(url)), then in the candidates loop compare
health.get("brain_dir") to the requested_brain_dir parameter (or an
expected_brain_dir variable passed into the caller) and only do _log.info(...)
and return cls(url) when they match; allow returning a client for mismatched
brain_dir only when an explicit override flag (e.g., allow_cross_brain or an env
var like GRADATA_DAEMON_ALLOW_CROSS_BRAIN) is set and documented.
In `@Gradata/tests/test_mcp_daemon_bridge.py`:
- Around line 107-113: The test test_discover_finds_daemon_via_advert_file calls
_DaemonClient.discover but does not clear environment overrides, so clear
GRADATA_DAEMON_URL and GRADATA_DAEMON_PORT before discovery (use pytest's
monkeypatch.delenv or os.environ.pop with default) to ensure advert-file
discovery is deterministic; apply the same change to the other advert-based test
referenced around lines 148-169 so both tests remove those env vars prior to
calling _DaemonClient.discover.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d9ace558-08bb-42fb-ba1d-367dc4e69154
📒 Files selected for processing (4)
- Gradata/src/gradata/daemon.py
- Gradata/src/gradata/mcp_server.py
- Gradata/tests/test_mcp_daemon_bridge.py
- Gradata/tests/test_mcp_server.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: pytest (py3.12)
- GitHub Check: pytest (py3.11)
- GitHub Check: pytest windows-latest / py3.12
- GitHub Check: pytest ubuntu-latest / py3.12
- GitHub Check: pytest ubuntu-latest / py3.11
- GitHub Check: pytest macos-latest / py3.11
- GitHub Check: pytest macos-latest / py3.12
- GitHub Check: pytest windows-latest / py3.11
🧰 Additional context used
📓 Path-based instructions (2)
Gradata/tests/**/*.py
📄 CodeRabbit inference engine (Gradata/AGENTS.md)
`Gradata/tests/**/*.py`: Set `BRAIN_DIR` environment variable via `tmp_path` in conftest.py for test isolation; ensure `_paths.py` module cache refreshes when calling `Brain.init()` directly inside tests
Add unit tests in `tests/test_*.py` for every CI push without LLM calls (deterministic); mark integration tests with `@pytest.mark.integration` and skip them by default (they hit real LLM APIs)
Files:
- Gradata/tests/test_mcp_server.py
- Gradata/tests/test_mcp_daemon_bridge.py
Gradata/src/**/*.py
📄 CodeRabbit inference engine (Gradata/AGENTS.md)
`Gradata/src/**/*.py`: Prefer `sentence-transformers` for local embeddings, `google-genai` for Gemini embeddings, `cryptography` for AES-GCM encrypted system.db, `bm25s` for BM25 rule ranking, and `mem0ai` for external memory adapters; guard all optional dependency imports with `try / except ImportError` at the call site, never at module level
Maintain strict layering: Layer 0 (Primitives: _types.py, _db.py, _events.py, _paths.py, _file_lock.py; Patterns: contrib/patterns/) must never import from Layer 1 (Enhancements: enhancements/, rules/) or Layer 2 (Public API: brain.py, cli.py, daemon.py, mcp_server.py)
Never use bare `except: pass`; use typed exceptions or at minimum `logger.warning(...)` with `exc_info=True` to avoid silent failure in a memory product
Never import from out-of-scope sibling directories `../Sprites/` or `../Hausgem/` within `gradata/*` code; that is a layering bug
Never leak private-sibling paths into public docs/code: no references to `../Sprites/`, `../Hausgem/`, email addresses, OneDrive paths, or Sprites-specific examples from inside `gradata/*`
Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes
Files:
- Gradata/src/gradata/mcp_server.py
- Gradata/src/gradata/daemon.py
    body = self._read_json()
    tool_name = body.get("name", "")
    arguments = body.get("arguments") or {}
    if not isinstance(arguments, dict):
        self._send_json({"error": "arguments must be an object"}, 400)
Reject non-object request bodies before calling .get().
json.loads() can return a list/string/number. In that case Line 663 raises AttributeError, and this endpoint drops the request instead of returning a 400.
Suggested fix
     def _handle_mcp_tool_call(self) -> None:
         self.daemon._reset_idle_timer()
         body = self._read_json()
    +    if not isinstance(body, dict):
    +        self._send_json({"error": "request body must be an object"}, 400)
    +        return
         tool_name = body.get("name", "")
         arguments = body.get("arguments") or {}
         if not isinstance(arguments, dict):
             self._send_json({"error": "arguments must be an object"}, 400)
             return
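The failure mode can be reproduced outside the daemon. The block below is a minimal standalone sketch of the validation order, not the daemon's code: `validate_tool_call` is a hypothetical helper that returns an HTTP-style status and payload, standing in for the real handler's `self._read_json()` and `self._send_json()` calls.

```python
import json
from typing import Any


def validate_tool_call(raw: bytes) -> tuple[int, dict[str, Any]]:
    """Return (status, payload) for a raw /mcp/tool-call request body.

    Hypothetical helper for illustration only; the real handler reads the
    body via self._read_json() and replies via self._send_json().
    """
    try:
        body = json.loads(raw)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return 400, {"error": "request body must be valid JSON"}
    # json.loads() may return a list, str, number, bool, or None,
    # none of which has a .get() method.
    if not isinstance(body, dict):
        return 400, {"error": "request body must be an object"}
    arguments = body.get("arguments") or {}
    if not isinstance(arguments, dict):
        return 400, {"error": "arguments must be an object"}
    return 200, {"name": body.get("name", ""), "arguments": arguments}
```

Checking the body type before the first `.get()` means a payload such as `[1, 2]` gets a 400 response rather than an unhandled `AttributeError`.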
    # Always advertise the daemon inside the brain dir so the stdio
    # MCP bridge (and any other local client) can discover us without
    # needing an explicit --pid-file. Best-effort; failures are non-fatal.
    with contextlib.suppress(OSError):
        _write_pid_file(
            self._brain_dir / ".daemon.json",
            actual_port,
            self._brain_dir,
            self._started_at,
        )
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Write .daemon.json atomically.
This advert is now part of the discovery path. A mid-write crash can leave malformed JSON behind, causing the bridge to miss the right daemon or fall through to another candidate. Please route the .daemon.json write through the repo’s atomic JSON helper instead of _write_pid_file().
As per coding guidelines, "Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes".
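For reference, an atomic JSON write usually follows the write-to-temp-then-rename pattern. The sketch below is an assumption about what such a helper could look like, not the repository's actual helper (its name and signature are not shown in this PR); `atomic_write_json` is a hypothetical name.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def atomic_write_json(path: Path, payload: dict[str, Any]) -> None:
    """Write JSON so readers see the old file or the new one, never a partial.

    Hypothetical stand-in for the repo's atomic-write helper.
    """
    # The temp file lives in the target directory: os.replace() is only
    # atomic when source and destination are on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file on any failure, then re-raise.
        with contextlib.suppress(OSError):
            os.unlink(tmp_name)
        raise
```

A call site could keep the existing best-effort wrapper, for example `with contextlib.suppress(OSError): atomic_write_json(brain_dir / ".daemon.json", payload)`, where the exact payload fields are whatever `_write_pid_file` currently emits.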
    for url in candidates:
        url = url.rstrip("/")
        if url in seen:
            continue
        seen.add(url)
        if cls._probe(url):
            _log.info("MCP bridge: connected to gradata daemon at %s", url)
            return cls(url)
Validate health["brain_dir"] before accepting a daemon candidate.
Line 128 treats any daemon that answers /health as compatible. If the caller asked for brain B while brain A is listening via GRADATA_DAEMON_URL, GRADATA_DAEMON_PORT, or the :8765 fallback, mutating tools will be forwarded into the wrong brain. Please compare the daemon’s reported brain_dir with the requested brain_dir before returning a client, and only allow cross-brain forwarding via an explicit override.
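One way the check could look is sketched below. It assumes the probe returns the parsed `/health` dict (or `None` when unreachable). `pick_daemon`, `same_brain`, `probe_health`, and `allow_cross_brain` are hypothetical names for illustration, not existing APIs in `mcp_server.py`.

```python
from pathlib import Path
from typing import Callable, Iterable, Mapping, Optional


def same_brain(reported: object, expected: Path) -> bool:
    """True when the daemon's reported brain_dir resolves to the expected path."""
    if not isinstance(reported, str) or not reported:
        return False
    return Path(reported).resolve() == expected.resolve()


def pick_daemon(
    candidates: Iterable[str],
    probe_health: Callable[[str], Optional[Mapping[str, object]]],
    expected_brain_dir: Path,
    allow_cross_brain: bool = False,
) -> Optional[str]:
    """Return the first candidate URL whose daemon owns expected_brain_dir."""
    seen: set[str] = set()
    for url in candidates:
        url = url.rstrip("/")
        if url in seen:
            continue
        seen.add(url)
        health = probe_health(url)  # None when the daemon is unreachable
        if health is None:
            continue
        # Accept a mismatched brain only under the explicit override.
        if allow_cross_brain or same_brain(health.get("brain_dir"), expected_brain_dir):
            return url
    return None
```

Comparing resolved paths rather than raw strings avoids false mismatches from trailing slashes or relative paths.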
    def test_discover_finds_daemon_via_advert_file(live_daemon, brain_dir: Path) -> None:
        """A daemon advertising itself in <brain>/.daemon.json must be discovered."""
        _d, base = live_daemon
        client = _DaemonClient.discover(brain_dir)
        assert client is not None
        # The brain-dir-advertised port should win over the 8765 fallback.
        assert client.base_url == base
Clear daemon env overrides in advert-based bridge tests.
Discovery checks GRADATA_DAEMON_URL and GRADATA_DAEMON_PORT before <brain>/.daemon.json. Without removing those vars here, these tests become host-dependent and can bind to the wrong daemon on a developer machine or CI runner.
As per coding guidelines, "Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)".
Also applies to: 148-169
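In a pytest suite, `monkeypatch.delenv(name, raising=False)` for each variable is the simplest fix, since pytest restores the environment after the test. The sketch below shows the same idea using only the standard library. `without_daemon_env` is a hypothetical helper, not something in the test suite.

```python
import os
from contextlib import contextmanager
from typing import Iterator

DAEMON_ENV_VARS = ("GRADATA_DAEMON_URL", "GRADATA_DAEMON_PORT")


@contextmanager
def without_daemon_env(names: tuple[str, ...] = DAEMON_ENV_VARS) -> Iterator[None]:
    """Temporarily remove daemon override variables, restoring them on exit."""
    # pop() with a default never raises, so unset variables are fine.
    saved = {name: os.environ.pop(name, None) for name in names}
    try:
        yield
    finally:
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```

Wrapping the `_DaemonClient.discover(brain_dir)` call in `with without_daemon_env():` would make the advert file the first candidate probed, regardless of the host's environment.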
Boss review (comment-only; gh CLI blocks formal reviews on org-owned PRs): Approve. Solid architecture: MCP stdio bridge delegating to daemon over HTTP avoids double-flock cleanly. Key points:
One nit: the `live_daemon` fixture manually writes `.daemon.json`; consider using `daemon.start()` to test the real lifecycle. Non-blocking. No security concerns: all traffic stays on 127.0.0.1.
Summary
Eliminates the brain-flock contention between `gradata.daemon` (HTTP :8765) and `gradata.mcp_server` (stdio). Daemon stays the sole brain owner; mcp_server becomes a thin stdio↔HTTP bridge.
Refs #190
Design (Option A)
- Daemon exposes `GET /mcp/tools` and `POST /mcp/tool-call`, wired into the existing `mcp_server._dispatch()` under `daemon._brain_lock` (no logic duplication)
- Daemon writes `<brain>/.daemon.json` on `start()` and deletes it on cleanup, so stdio clients can discover the port without a pidfile flag
- `mcp_server.run_server` gains an HTTP-bridge path (`_DaemonClient`). When a daemon is reachable, all `tools/call` requests are forwarded over HTTP, `Brain` is never instantiated, and the flock is never acquired
- Discovery order: `$GRADATA_DAEMON_URL` → `$GRADATA_DAEMON_PORT` → `<brain>/.daemon.json` → fallback `127.0.0.1:8765`
- `--no-daemon` CLI flag forces legacy in-process mode (tests, debugging)
Compatibility
- Existing stdio invocation (`python3 -m gradata.mcp_server --brain-dir ...`) keeps working unchanged; it transparently bridges to the running daemon
- Daemon HTTP on `:8765` unchanged
Tests
New: `tests/test_mcp_daemon_bridge.py` adds 5 tests covering discover-none, discover-via-advert, the `/mcp/tools` endpoint, the `/mcp/tool-call` endpoint, and the headline `run_server` bridge test asserting `Brain` is never instantiated when a daemon is reachable.
Updated: `tests/test_mcp_server.py` passes `use_daemon=False` in `run_server(...)` call sites so existing tests don't accidentally bridge to a live host daemon.
Known caveat
`_DaemonClient.discover` falls back to `127.0.0.1:8765` even when `brain_dir` doesn't strictly match the daemon's reported brain_dir. Intentional for single-host single-daemon use. Could tighten by cross-checking `health.brain_dir` before accepting; worth doing if multi-brain is a real use case.
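The discovery order described in the design can be sketched as a candidate list built from highest to lowest priority. This is an illustration under stated assumptions, not the code in `_DaemonClient.discover`: `candidate_urls` is a hypothetical name, the `http://127.0.0.1` host for the port-based entries is assumed, and the advert file is assumed to be a JSON object with an integer `port` field.

```python
import json
from pathlib import Path
from typing import Mapping

FALLBACK_URL = "http://127.0.0.1:8765"


def candidate_urls(brain_dir: Path, env: Mapping[str, str]) -> list[str]:
    """List daemon URLs to probe, highest priority first."""
    urls: list[str] = []
    if env.get("GRADATA_DAEMON_URL"):
        urls.append(env["GRADATA_DAEMON_URL"])
    if env.get("GRADATA_DAEMON_PORT"):
        urls.append(f"http://127.0.0.1:{env['GRADATA_DAEMON_PORT']}")
    try:
        advert = json.loads((brain_dir / ".daemon.json").read_text(encoding="utf-8"))
    except (OSError, ValueError):
        advert = None  # missing or malformed advert: skip it
    # Assumption: the advert is a JSON object with an integer "port" field.
    if isinstance(advert, dict) and isinstance(advert.get("port"), int):
        urls.append(f"http://127.0.0.1:{advert['port']}")
    urls.append(FALLBACK_URL)
    return urls
```

Passing the environment as a mapping (for example `candidate_urls(brain_dir, os.environ)`) keeps the ordering testable without touching process state.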