Add weekly correction-outcome telemetry snapshot script by Gradata · Pull Request #187 · Gradata/gradata

Gradata · 2026-05-12T19:18:45Z

Summary

add scripts/weekly_correction_snapshot.py to compute deterministic weekly correction/graduation aggregates from NDJSON
add unit tests for empty input, malformed rows, deterministic top-category ordering, and zero-division handling
add docs at docs/weekly-correction-snapshot.md with usage and output schema

Validation

pytest -q Gradata/tests/test_weekly_correction_snapshot.py

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

coderabbitai · 2026-05-12T19:18:59Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: dd9dfb38-fa9d-475e-ad80-03e2de8c9133

📥 Commits

Reviewing files that changed from the base of the PR and between 1d1523e and 2131f76.

📒 Files selected for processing (1)

Gradata/tests/test_byo_key_provider.py

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: pytest (py3.12)
GitHub Check: pytest (py3.11)
GitHub Check: pytest ubuntu-latest / py3.12
GitHub Check: pytest macos-latest / py3.12
GitHub Check: pytest macos-latest / py3.11
GitHub Check: pytest windows-latest / py3.12
GitHub Check: pytest ubuntu-latest / py3.11
GitHub Check: pytest windows-latest / py3.11

🧰 Additional context used

📓 Path-based instructions (1)

Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

Gradata/tests/test_byo_key_provider.py

🔇 Additional comments (1)

Gradata/tests/test_byo_key_provider.py (1)

3-8: Optional dependency gating is correctly handled.

pytest.importorskip("httpx") on Line 7 is a good fit here since all tests depend on patching httpx.post; this keeps test runs stable when httpx is not installed.

📝 Walkthrough

Add scripts/weekly_correction_snapshot.py: CLI tool (main(argv) -> int) to compute deterministic weekly correction/graduation aggregates from NDJSON (reads --input or stdin).
New helper functions: parse_rows(lines: list[str]) -> (rows, skipped_rows) and aggregate(rows) -> summary dict; outputs compact deterministic JSON to stdout.
Parsing behavior: skips empty, non-JSON, and non-object rows and reports skipped_rows.
Aggregation metrics: total_corrections, accepted_graduations, rejection_count, acceptance_rate (zero-division safe, rounded to 6 decimals), and top_rule_categories (top 5, normalized and deterministically ordered).
Category normalization: lowercased, trimmed, empty/missing → "unknown"; top categories sorted by count (desc) then name (asc).
Tests: comprehensive unit and end-to-end tests covering malformed/empty input, deterministic category ordering, zero-division handling, stdin and --input file CLI behavior, and single-outcome handling. Tests now skip BYO-key (httpx) cases when optional httpx is not installed.
Docs: docs/weekly-correction-snapshot.md added with usage, schema, and examples.
Project config: pyproject.toml test config updated to include scripts on PYTHONPATH for tests.
Breaking changes / security fixes: none introduced.

Walkthrough

Adds a CLI script that reads NDJSON weekly correction events (file or stdin), skips malformed rows, classifies rows, aggregates totals and acceptance rate, computes top 5 normalized rule categories, emits a deterministic compact JSON snapshot, and includes docs, tests, and pytest/.gitignore wiring.

Changes

Weekly Correction Snapshot Feature

Layer / File(s)	Summary
Script implementation `Gradata/scripts/weekly_correction_snapshot.py`	Core CLI with category normalization and predicates, `parse_rows`, `aggregate`, `_read_lines`, argparse `--input` wiring, stdout compact JSON, and `__main__` guard.
Documentation and project configuration `Gradata/docs/weekly-correction-snapshot.md`, `.gitignore`, `Gradata/pyproject.toml`	Adds usage and output schema docs; unignores `Gradata/scripts/weekly_correction_snapshot.py`; adds `scripts` to pytest `pythonpath`.
Test suite `Gradata/tests/test_weekly_correction_snapshot.py`, `Gradata/tests/test_byo_key_provider.py`	Unit tests for malformed-row skipping, division-safe empty aggregation, deterministic aggregation and top-category normalization, end-to-end stdin/file->stdout JSON, single-outcome handling; and conditional skip for `httpx`-dependent tests.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Gradata/gradata#124: Also modifies scripts-related .gitignore rules and touches scripts/* ignore patterns.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding a telemetry snapshot script for weekly correction-outcome data aggregation.
Description check	✅ Passed	The description is directly related to the changeset, clearly detailing the three main additions: the script, unit tests, and documentation with validation steps.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/weekly-correction-snapshot

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.20.0)

OpenGrep fatal error (exit code 2):
┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

�[1m Loading rules from local config...�[0m
[00.19][ERROR]: Error: exception Glob.Lexer.Syntax_error("malformed glob pattern: missing ']'")
Raised at Glob__Lexer.syntax_error in file "libs/glob/Lexer.mll", line 8, characters 2-26
Called from Glob__Lexer.__ocaml_lex_token_rec in file "libs/glob/Lexer.mll", line 29, characters 26-53
Cal

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

Gradata/tests/test_weekly_correction_snapshot.py (1)
1-77: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

pythonpath = ["src"] must include scripts for tests to load.

The test file imports from scripts import weekly_correction_snapshot, but pytest's pythonpath configuration only adds Gradata/src to the module search path. Since Gradata/scripts/ is not included, the test module fails to import and never runs in CI.

Update Gradata/pyproject.toml to add "scripts" to the pythonpath list, or relocate weekly_correction_snapshot.py into src/ if it's production code rather than a utility script.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Gradata/tests/test_weekly_correction_snapshot.py` around lines 1 - 77, The
tests import scripts.weekly_correction_snapshot but the test PYTHONPATH only
includes "src", so imports fail; fix by updating the project configuration to
make the tests find the scripts package: either add "scripts" to the pythonpath
list in pyproject.toml (ensure pythonpath = ["src","scripts"]) or move
weekly_correction_snapshot.py into the src package and update any imports;
reference the test file Gradata/tests/test_weekly_correction_snapshot.py and the
module name weekly_correction_snapshot when applying the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Gradata/docs/weekly-correction-snapshot.md`:
- Around line 26-27: The doc for acceptance_rate currently describes raw
division but the code uses round(..., 6); update the documentation text for
`acceptance_rate` to state that the value is computed as `accepted_graduations /
(accepted_graduations + rejection_count)` (or `0.0` if denominator is zero) and
is rounded to 6 decimal places (i.e., uses round(..., 6)). Reference the
`acceptance_rate` field and the rounding behavior so consumers know the exact
precision.

In `@Gradata/scripts/weekly_correction_snapshot.py`:
- Around line 81-84: The current loop can double-count rows because both
_is_graduation_accepted(row) and _is_rejection(row) are evaluated independently;
change the logic so a row counts as either accepted or rejected but not both
(e.g., use an if/elif or otherwise enforce mutual exclusivity between
_is_graduation_accepted(row) and _is_rejection(row) when updating
accepted_graduations and rejection_count) and keep acceptance_rate calculation
unchanged to use the corrected tallies.

---

Outside diff comments:
In `@Gradata/tests/test_weekly_correction_snapshot.py`:
- Around line 1-77: The tests import scripts.weekly_correction_snapshot but the
test PYTHONPATH only includes "src", so imports fail; fix by updating the
project configuration to make the tests find the scripts package: either add
"scripts" to the pythonpath list in pyproject.toml (ensure pythonpath =
["src","scripts"]) or move weekly_correction_snapshot.py into the src package
and update any imports; reference the test file
Gradata/tests/test_weekly_correction_snapshot.py and the module name
weekly_correction_snapshot when applying the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 70e4d16e-6dd0-4de0-a14d-1fcfa4946f55

📥 Commits

Reviewing files that changed from the base of the PR and between 5a5c4d8 and 6e286db.

📒 Files selected for processing (4)

.gitignore
Gradata/docs/weekly-correction-snapshot.md
Gradata/scripts/weekly_correction_snapshot.py
Gradata/tests/test_weekly_correction_snapshot.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: pytest macos-latest / py3.11
GitHub Check: pytest ubuntu-latest / py3.12
GitHub Check: pytest ubuntu-latest / py3.11
GitHub Check: pytest windows-latest / py3.11
GitHub Check: pytest macos-latest / py3.12
GitHub Check: pytest windows-latest / py3.12
GitHub Check: pytest (py3.12)
GitHub Check: pytest (py3.11)

🧰 Additional context used

📓 Path-based instructions (1)

Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

Gradata/tests/test_weekly_correction_snapshot.py

🔇 Additional comments (1)

.gitignore (1)

174-178: Allowlist update is correct.

The new exception cleanly keeps Gradata/scripts/* ignored while allowing the weekly snapshot script to be versioned.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Gradata/tests/test_weekly_correction_snapshot.py`:
- Around line 57-77: Add a new test that exercises the explicit file-input
branch of snapshot.main by creating a temporary input file (using tmp_path)
containing the same payload lines as
test_main_emits_deterministic_json_with_skipped_rows, call snapshot.main with
the args ['--input', str(tmp_file)] instead of monkeypatching sys.stdin, assert
rc == 0, capture stdout via capsys, json.loads it and assert the same expected
dictionary (acceptance_rate, accepted_graduations, rejection_count,
skipped_rows, top_rule_categories, total_corrections). Ensure the test name
references the input path case (e.g.,
test_main_emits_deterministic_json_with_input_file) and reuse the same payload
string used in test_main_emits_deterministic_json_with_skipped_rows so behavior
is identical.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b3c29075-e23b-486a-be70-2909d11f4668

📥 Commits

Reviewing files that changed from the base of the PR and between 6e286db and 9f61abb.

📒 Files selected for processing (4)

Gradata/docs/weekly-correction-snapshot.md
Gradata/pyproject.toml
Gradata/scripts/weekly_correction_snapshot.py
Gradata/tests/test_weekly_correction_snapshot.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: pytest macos-latest / py3.11
GitHub Check: pytest macos-latest / py3.12
GitHub Check: pytest ubuntu-latest / py3.12
GitHub Check: pytest ubuntu-latest / py3.11
GitHub Check: pytest windows-latest / py3.11
GitHub Check: pytest windows-latest / py3.12

🧰 Additional context used

📓 Path-based instructions (2)

Gradata/**/pyproject.toml

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Maintain dependencies = [] in pyproject.toml — the base package is pure Python + stdlib with all heavy dependencies gated as optional extras: embeddings, gemini, encrypted, ranking, adapters-mem0

Files:

Gradata/pyproject.toml

Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

Gradata/tests/test_weekly_correction_snapshot.py

🔇 Additional comments (3)

Gradata/pyproject.toml (1)

168-168: Pytest path update looks correct for script-module test imports.

pythonpath = ["src", "scripts"] is consistent with the new test module import pattern.

Gradata/scripts/weekly_correction_snapshot.py (1)

81-86: Good fix for deterministic and non-contradictory outcome aggregation.

Mutual exclusivity between accepted/rejected counts is now explicit, and output ordering remains deterministic.

Also applies to: 91-94, 124-126

Gradata/docs/weekly-correction-snapshot.md (1)

23-28: Schema documentation is aligned with implementation behavior.

Field definitions, rounding precision, and category normalization are clearly and accurately described.

Also applies to: 36-36

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Add weekly correction outcome snapshot script and tests

6e286db

greptile-apps Bot reviewed May 12, 2026

View reviewed changes

coderabbitai Bot added docs feature labels May 12, 2026

coderabbitai Bot requested changes May 12, 2026

View reviewed changes

Comment thread Gradata/docs/weekly-correction-snapshot.md Outdated

Comment thread Gradata/scripts/weekly_correction_snapshot.py Outdated

Fix weekly snapshot outcome classification and test import path

9f61abb

greptile-apps Bot reviewed May 12, 2026

View reviewed changes

coderabbitai Bot requested changes May 12, 2026

View reviewed changes

Comment thread Gradata/tests/test_weekly_correction_snapshot.py

test: cover weekly snapshot --input file path

1d1523e

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

coderabbitai Bot previously approved these changes May 13, 2026

View reviewed changes

Skip BYO key tests when optional httpx is unavailable

2131f76

Gradata dismissed coderabbitai[bot]’s stale review via 2131f76 May 13, 2026 02:27

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

coderabbitai Bot approved these changes May 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add weekly correction-outcome telemetry snapshot script#187

Add weekly correction-outcome telemetry snapshot script#187
Gradata wants to merge 4 commits into
mainfrom
feat/weekly-correction-snapshot

Gradata commented May 12, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot left a comment

Uh oh!

coderabbitai Bot commented May 12, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

greptile-apps Bot left a comment

Uh oh!

greptile-apps Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Gradata commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Gradata commented May 12, 2026 •

edited

Loading

coderabbitai Bot commented May 12, 2026 •

edited

Loading