[codex] Strengthen light-mode OOXML shapes/charts and restore print-area contract #129
📝 Walkthrough
This release upgrades ExStruct to version 0.8.0, introducing a pure-Python OOXML extraction baseline for light mode that extracts shapes, connectors, and charts without COM/LibreOffice dependencies. It adds a new architecture decision record (ADR-0010), updates provenance tracking, restores print-area defaults, and enhances OOXML parsing resilience. Documentation and schemas are updated accordingly.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Pipeline
participant RichBackend
participant OOXML
participant WorkbookBuilder
Client->>Pipeline: extract(file, mode="light")
Pipeline->>RichBackend: resolve_rich_backend("light")
RichBackend->>RichBackend: OoxmlRichBackend(file_path)
Pipeline->>Pipeline: _run_light_pipeline()
Pipeline->>RichBackend: extract_shapes(mode="light")
RichBackend->>OOXML: read_sheet_drawings()
OOXML-->>RichBackend: SheetDrawingData (shapes, charts, connectors)
RichBackend-->>Pipeline: ShapeData {provenance="python_ooxml"}
Pipeline->>RichBackend: extract_charts(mode="light")
RichBackend->>OOXML: parse_charts_from_drawings()
OOXML-->>RichBackend: ChartData {provenance="python_ooxml"}
RichBackend-->>Pipeline: ChartData
Pipeline->>WorkbookBuilder: build_workbook(include_rich_artifacts=True)
WorkbookBuilder-->>Client: Workbook with shapes, charts, print_areas
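The PR description says that one broken drawing part should not erase shapes or charts from healthy sheets. A minimal sketch of that per-sheet isolation follows. It assumes a hypothetical mapping from sheet name to raw drawing XML and a hypothetical `parse_drawings_per_sheet` helper; neither is ExStruct's actual API, and real drawing parts carry namespaces and far more structure.

```python
import xml.etree.ElementTree as ET


def parse_drawings_per_sheet(drawing_xml_by_sheet: dict[str, str]) -> dict[str, list[str]]:
    """Collect shape names per sheet, isolating parse failures to one sheet.

    Hypothetical helper for illustration only; not ExStruct's real reader.
    """
    shapes_by_sheet: dict[str, list[str]] = {}
    for sheet_name, xml_text in drawing_xml_by_sheet.items():
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError:
            # Scope the failure to this drawing part only; other sheets
            # keep whatever was extracted for them.
            shapes_by_sheet[sheet_name] = []
            continue
        # Match on the local tag name so namespaced tags such as
        # "{...spreadsheetDrawing}cNvPr" are also picked up.
        shapes_by_sheet[sheet_name] = [
            el.get("name", "") for el in root.iter() if el.tag.endswith("cNvPr")
        ]
    return shapes_by_sheet
```

The point of the sketch is the placement of the `try`/`except`: it wraps a single drawing part rather than the whole workbook, so a malformed part degrades one sheet instead of all of them.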
sequenceDiagram
participant Client
participant Pipeline
participant OoxmlBackend
participant LibreOfficeBackend
participant Workbook
Client->>Pipeline: extract(file, mode="libreoffice")
Pipeline->>OoxmlBackend: extract_shapes/charts(mode="light")
OoxmlBackend-->>Pipeline: OOXML baseline shapes/charts
Pipeline->>LibreOfficeBackend: extract_shapes/charts(mode="libreoffice")
alt LibreOffice Available
LibreOfficeBackend-->>Pipeline: Enriched shapes/charts {provenance="libreoffice_uno"}
else LibreOffice Unavailable
LibreOfficeBackend-->>Pipeline: Error
Pipeline->>Pipeline: Preserve OOXML baseline
end
Pipeline->>Workbook: build(shapes + charts from either backend)
Workbook-->>Client: Workbook with best-effort rich artifacts
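The second diagram describes a baseline-then-enrich flow: the OOXML result is computed first and is preserved if LibreOffice fails. A rough sketch of that control flow, under stated assumptions, is shown below. `RichArtifacts` and `extract_with_baseline` are hypothetical names introduced for illustration; only the provenance strings `python_ooxml` and `libreoffice_uno` come from the diagrams.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RichArtifacts:
    """Hypothetical container for shapes and charts; not ExStruct's model."""
    shapes: list[str] = field(default_factory=list)
    charts: list[str] = field(default_factory=list)
    provenance: str = "python_ooxml"


def extract_with_baseline(
    ooxml_extract: Callable[[], RichArtifacts],
    libreoffice_extract: Callable[[], RichArtifacts],
) -> RichArtifacts:
    """Return enriched artifacts when possible, else the OOXML baseline."""
    # The pure-Python baseline is always computed first.
    baseline = ooxml_extract()
    try:
        # Enrichment may fail when LibreOffice is missing or errors out.
        return libreoffice_extract()
    except Exception:
        # Preserve the baseline instead of returning nothing.
        return baseline
```

Catching a broad `Exception` is a simplification here; a real backend would more likely catch a specific "backend unavailable" error type.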
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Up to standards ✅
🟢 Issues
| Metric | Results |
|---|---|
| Complexity | 19 |
| Duplication | 0 |
Pull request overview
Copilot reviewed 45 out of 66 changed files in this pull request and generated no new comments.
Summary
- restore the `light` mode print-area contract across `extract`, `process_excel`, CLI, and engine export paths
- strengthen the OOXML rich baseline so `mode="light"` can extract shapes and charts without requiring COM
- treat `FilterOptions.include_print_areas=None` as automatic inclusion, requiring `False` for explicit suppression
Why
The current branch restores behavior already accepted in `ADR-0010` and the published docs. It also promotes OOXML extraction to the baseline rich path for `light` mode, so shapes and charts can be returned without COM. In addition, it narrows OOXML drawing fallback scope so one broken drawing part does not erase shapes or charts from healthy sheets in the same workbook.
Impact
- `mode="light"` keeps `print_areas` in default structured output
- `mode="light"` supports best-effort OOXML shape/chart extraction on healthy `.xlsx`/`.xlsm` worksheets without COM
- `print_areas_dir` side output remains available on `process_excel` and CLI paths in `light` mode
Validation
- `uv run pytest tests/engine/test_engine.py tests/core/test_mode_output.py tests/cli/test_cli.py tests/core/test_ooxml_drawing.py -q`
- `uv run python scripts/gen_model_docs.py`
- `uv run task precommit-run`
Closes #128
Closes #130
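The Summary describes `include_print_areas` as a tri-state option: `None` means automatic inclusion and only an explicit `False` suppresses print areas. One possible way to express that rule is sketched below. `resolve_include_print_areas` is a hypothetical helper, and the sketch assumes that "automatic" always resolves to inclusion, which the PR text implies for `light` mode but does not spell out for every mode.

```python
from typing import Optional


def resolve_include_print_areas(include_print_areas: Optional[bool]) -> bool:
    """Resolve the tri-state flag to a concrete decision.

    Hypothetical helper; assumes None ("automatic") resolves to inclusion.
    """
    # Only an explicit False suppresses print areas; None and True include them.
    return include_print_areas is not False
```

Comparing with `is not False` rather than relying on truthiness is the key detail: a plain truthiness check would treat `None` the same as `False` and silently drop print areas from default output.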