Skip to content

fix(bigquery): clamp progress-bar cursor when query_plan shrinks (#16168)#17011

Open
jbbqqf wants to merge 1 commit intogoogleapis:mainfrom
jbbqqf:feat/16168-fix-tqdm-helpers-bounds
Open

fix(bigquery): clamp progress-bar cursor when query_plan shrinks (#16168)#17011
jbbqqf wants to merge 1 commit intogoogleapis:mainfrom
jbbqqf:feat/16168-fix-tqdm-helpers-bounds

Conversation

@jbbqqf
Copy link
Copy Markdown

@jbbqqf jbbqqf commented May 9, 2026

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #16168 🦕

Summary

wait_for_query in _tqdm_helpers.py crashes with IndexError: list index out of range when the progress-bar cursor i advances past the length of query_job.query_plan after a reload(). This patch clamps i to the current plan length before each indexed access so the progress bar plateaus at the last available stage instead of raising.

Context

The polling loop in wait_for_query reads the query plan once per iteration (line 113 — current_stage = query_job.query_plan[i]) and only advances i inside the COMPLETE branch of the previous iteration. The if i < default_total - 1 guard on line 131 uses the previous iteration's default_total, so on the next pass:

  1. query_job.reload() runs and may return a shorter plan (BigQuery has been observed to coalesce or restructure stages between polls — issue reporter saw it on a MERGE query).
  2. default_total is recomputed from the new (shorter) plan.
  3. query_job.query_plan[i] is indexed without re-checking — IndexError.

The fix is a one-line clamp before the indexing access. The semantics are: when the plan shrinks, the cursor "freezes" at the last valid index and the bar holds steady until the query result arrives. This matches user expectation better than an exception.

Changes

  • packages/google-cloud-bigquery/google/cloud/bigquery/_tqdm_helpers.py — clamp i to default_total - 1 if it has overrun the current plan, then index. Replaced redundant len(query_job.query_plan) re-read with the freshly computed default_total. A short comment explains the invariant for future readers (the reason for the clamp is non-obvious from the diff alone).
  • packages/google-cloud-bigquery/tests/unit/test__tqdm_helpers.py — new file, two regression tests. The first reproduces the exact IndexError from the issue's stack trace using a mock QueryJob whose reload() swaps the query_plan to a shorter one between polls.

Reproduce BEFORE/AFTER yourself (copy-paste)

A reviewer can verify the fix in ≤60s by pasting the block below.

# --- one-time setup ---
git clone https://github.com/googleapis/google-cloud-python.git /tmp/repro && cd /tmp/repro
python3 -m venv /tmp/repro-venv
source /tmp/repro-venv/bin/activate
pip install -q -e packages/google-cloud-bigquery pytest tqdm

# --- pull the regression test from this branch (one file) ---
git fetch https://github.com/jbbqqf/google-cloud-python.git feat/16168-fix-tqdm-helpers-bounds
git checkout FETCH_HEAD -- packages/google-cloud-bigquery/tests/unit/test__tqdm_helpers.py

# --- BEFORE (origin/main with the regression test imported) ---
git checkout origin/main -- packages/google-cloud-bigquery/google/cloud/bigquery/_tqdm_helpers.py
pytest packages/google-cloud-bigquery/tests/unit/test__tqdm_helpers.py::test_wait_for_query_handles_shrinking_query_plan -v
# Expected: FAILED — IndexError: list index out of range at _tqdm_helpers.py:113

# --- AFTER (this PR) ---
git checkout FETCH_HEAD -- packages/google-cloud-bigquery/google/cloud/bigquery/_tqdm_helpers.py
pytest packages/google-cloud-bigquery/tests/unit/test__tqdm_helpers.py -v
# Expected: 2 passed

The only thing that changes between BEFORE and AFTER is which version of _tqdm_helpers.py is checked out. The regression test is identical in both runs.

What I ran locally

  • pytest tests/unit/test__tqdm_helpers.py -v2/2 passed on the fix branch; 1 failed (the new regression test) + 1 passed on origin/main with exactly the IndexError from the issue's stack trace.
  • The wider tests/unit/job/test_query_pandas.py and tests/unit/test_table.py test files import pandas/pyarrow + dev-only test_utils; their full run is what GHA unittest.yml already exercises. The change here only touches the bounds-clamp code path that was previously unreachable from those existing tests (which use plans of fixed length).

Edge cases tested

# Scenario query_plan evolution Cursor trajectory Expected Verified by
1 Plan shrinks across reload (issue's repro) 3 stages → 1 stage i=0 → 1 (clamped) clean exit, bar closed test_wait_for_query_handles_shrinking_query_plan
2 Plan stays small but loop runs longer 1 stage (stable) i stays at 0 no overflow, clean exit test_wait_for_query_progress_does_not_overflow_default_total
3 Plan length unchanged (existing happy path) N stages (stable) i increments normally unchanged behavior existing test_to_arrow_w_tqdm_w_query_plan etc. — still pass

Risk / blast radius

  • Scope is one function (wait_for_query), one extra branch. No public-API change. No type-signature change.
  • Behavior is strictly more permissive: paths that previously crashed now plateau the progress bar. Paths that previously worked are unchanged (the clamp is a no-op when i < default_total).
  • No query_plan mutations — read-only access pattern unchanged.

Release note

fix(bigquery): clamp `wait_for_query` progress-bar cursor when `query_plan` shrinks across `reload()`, preventing `IndexError` on long-running queries (notably MERGE).

PR drafted with assistance from Claude Code. The change was reviewed manually against googleapis/google-cloud-python's source. The reproducer block above was used during development; it is the same one a reviewer can paste verbatim.

…gleapis#16168)

`wait_for_query` indexes into `query_job.query_plan` with a cursor `i`
that survives across iterations of the polling loop, but `reload()` can
return a shorter `query_plan` than the previous iteration (observed for
MERGE queries — see googleapis#16168). Clamp `i` to the current plan length before
indexing so the progress bar plateaus at the last stage instead of
raising `IndexError: list index out of range`.

Adds tests/unit/test__tqdm_helpers.py with a regression test that
fails on origin/main with the exact stack trace from the issue and
passes after the fix.
@jbbqqf jbbqqf requested review from a team as code owners May 9, 2026 15:39
@jbbqqf jbbqqf requested review from TrevorBergeron and removed request for a team May 9, 2026 15:39
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses an IndexError in the wait_for_query helper by clamping the stage index when the BigQuery query plan shrinks between iterations. It also introduces unit tests to validate this behavior. A review comment identified a non-functional loop in the newly added tests and suggested replacing it with a proper assertion to ensure the progress bar total is correctly set.

Comment on lines +120 to +122
for call in bar.total.__class__ == int and [] or []:
# placeholder: bar.total is a Mock attribute, no length assertion here
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This loop is non-functional because the expression bar.total.__class__ == int and [] or [] always evaluates to an empty list [], meaning the loop body never executes. Since progress_bar.total is assigned an integer value in wait_for_query, bar.total will be an integer after the function runs. Replacing this with a standard assertion improves test clarity and ensures the expected value is correctly set.

    assert bar.total == len(plan)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

IndexError in _tqdm_helpers.wait_for_query when query_plan changes between iterations

1 participant