feat(scripts): Add dependency version scanner tool by chalmerlowe · Pull Request #16867 · googleapis/google-cloud-python

chalmerlowe · 2026-04-29T12:30:39Z

This adds a utility with the ability to scan for common references to dependencies (Python runtimes and package dependencies) to facilitate updating code when runtimes and dependencies change.

- It can be run against an entire repo OR against specific packages within a monorepo.
- It is customizable with regex patterns and examples here.
- The test suite checks each regex against the examples to ensure the efficacy of the patterns.
- The current patterns account for edge cases such as finding `< 3.8` when searching for references to `3.7`, since they are semantically equivalent even if syntactically different.
- The scanner produces a CSV report with:
  - path/filename, package name, line number, matching pattern, full line for context, etc.
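As an illustration of the matching described above, here is a minimal sketch. It is not the PR's actual code: the function names (`build_patterns`, `scan_lines`, `write_csv`), the rule names, and the CSV columns are assumptions made for this example. It only shows how a search for `3.7` might also catch the equivalent `<3.8` and emit CSV rows.

```python
import csv
import io
import re


def build_patterns(version: str) -> dict:
    """Build illustrative patterns for a 'major.minor' version such as '3.7'."""
    major, minor = (int(part) for part in version.split("."))
    next_version = f"{major}.{minor + 1}"
    return {
        # Literal reference, e.g. "3.7"; lookarounds avoid "13.7" and "3.70".
        "explicit_version": re.compile(rf"(?<![\d.]){re.escape(version)}(?!\d)"),
        # Exclusive upper bound on the next minor, e.g. "< 3.8" (but not "<= 3.8").
        "upper_bound_next_minor": re.compile(
            rf"<\s*{re.escape(next_version)}(?!\d)"
        ),
    }


def scan_lines(lines, patterns):
    """Yield (line_number, pattern_name, stripped_line) for every match."""
    for line_number, line in enumerate(lines, 1):
        for name, pattern in patterns.items():
            if pattern.search(line):
                yield line_number, name, line.strip()


def write_csv(rows) -> str:
    """Serialize match rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["line_number", "pattern", "context"])
    writer.writerows(rows)
    return buffer.getvalue()


sample = [
    'python_requires=">=3.7"',
    'requires-python = "<3.8"',
    "timeout = 13.70",
]
rows = list(scan_lines(sample, build_patterns("3.7")))
report = write_csv(rows)
```

The third sample line is there to show that the lookarounds keep `13.70` from being reported as a reference to `3.7`.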

gemini-code-assist

Code Review

This pull request introduces a new dependency version scanner, including a configuration-driven regex scanner, a benchmarking tool, and comprehensive unit and integration tests. The review feedback highlights several areas for improvement: optimizing regex compilation in the scanner to avoid performance bottlenecks, using the tempfile module in the benchmark script to prevent race conditions, removing redundant code, improving test robustness by checking subprocess exit codes, and adhering to PEP 8 by moving imports to the top of files.
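Two of the suggestions above (using `tempfile` for scratch files and checking subprocess exit codes) can be sketched together. This is a hedged illustration only: the helper name `run_in_temp_dir` and the script contents are hypothetical and do not come from the PR's benchmark or integration tests.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_temp_dir(code: str) -> subprocess.CompletedProcess:
    """Write `code` to a script in a unique temp directory and run it.

    TemporaryDirectory picks a unique name and removes the directory on
    exit, which avoids name collisions between concurrent runs.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        return subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
        )


ok = run_in_temp_dir("print('scan complete')")
failed = run_in_temp_dir("import sys; sys.exit(2)")

# A test should assert on the exit code, not only on the captured output.
assert ok.returncode == 0, ok.stderr
```

Asserting on `returncode` means a crashing subprocess fails the test instead of passing because the output comparison happened to succeed.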

daniel-sanche · 2026-05-08T22:01:21Z

@@ -0,0 +1,34 @@
+import csv


It looks like the copyright header is missing (applies to all code files).

daniel-sanche · 2026-05-08T22:05:29Z

+Run the script from the repository root:
+
+```bash
+python3 scripts/version_scanner/version_scanner.py -d <dependency> -v <version> [options]


When I ran this, I got a ModuleNotFound error. Is there a requirements.txt or anything that captures the dependencies?

daniel-sanche · 2026-05-08T22:07:18Z

+This plan outlines the approach to update Python packages to drop support for end-of-life Python runtimes (3.7, 3.8, 3.9) OR for deprecated dependencies, and ensure the packages are configured for modern Python.
+
+#### High-Level Strategy
+- **One Branch Per Package**: To keep PRs manageable and isolated, we suggest a dedicated worktree and branch for each package (e.g., `feat/drop-<dependency>-<version>-<package-name>` i.e. `feat/drop-protobuf-4.25.8-google-cloud-bigquery`).


This is only for hand-written packages, right? I assume others would get their updates through the generator?

Should we recommend doing a generator update first, to clean up most of the packages?

daniel-sanche · 2026-05-08T22:12:22Z

@@ -0,0 +1,5 @@
+packages/google-cloud-access-context-manager


What is this?

daniel-sanche · 2026-05-08T22:14:41Z

+        self.variables = self._compute_variables()
+
+    def _compute_variables(self) -> Dict[str, str]:
+        """Compute variables for interpolation from version string."""


nit: more detailed comments/examples could be helpful for future maintainers. I'm not sure what a variable is, or the expected version string format
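For example, a docstring along these lines could make the idea concrete. This is a hypothetical sketch, not the PR's `_compute_variables`: the variable names (`major`, `minor`, `next_minor`, `nodot`) and the assumed `major.minor` input format are guesses made for illustration.

```python
from typing import Dict


def compute_variables(version: str) -> Dict[str, str]:
    """Derive interpolation variables from a version string.

    Assumed input format: dot-separated integers with at least a major
    and minor component, e.g. "3.7" or "4.25.8".

    Example: "3.7" -> {"version": "3.7", "major": "3", "minor": "7",
                       "next_minor": "8", "nodot": "37"}
    """
    parts = version.split(".")
    if len(parts) < 2 or not all(part.isdigit() for part in parts):
        raise ValueError(f"Unsupported version string: {version!r}")
    major, minor = int(parts[0]), int(parts[1])
    return {
        "version": version,
        "major": str(major),
        "minor": str(minor),
        "next_minor": str(minor + 1),  # lets a rule express "< 3.8" for 3.7
        "nodot": f"{major}{minor}",    # e.g. "py37"-style references
    }


# A rule template from config is filled in with str.format before compiling.
template = r"<\s*{major}\.{next_minor}\b"
pattern_text = template.format(**compute_variables("3.7"))
```

One caveat with this approach is that regex quantifiers such as `{2}` clash with `str.format` placeholders, so templates would need to escape them as `{{2}}`.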

daniel-sanche · 2026-05-08T22:17:33Z

+    try:
+        with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
+            skip_next = False
+            for line_num, line in enumerate(f, 1):


Are there any issues with statements that span multiple lines?
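To make the concern concrete, here is a small sketch with a made-up pattern and input (neither is taken from the PR). It shows a per-line scan missing a requirement that is split across lines, while a search over the whole text finds it.

```python
import re

# Hypothetical rule: a python_requires lower bound of 3.7.
pattern = re.compile(r"python_requires\s*=\s*\(?\s*[\"']>=\s*3\.7")

source = (
    "setup(\n"
    "    python_requires=(\n"
    '        ">=3.7"\n'
    "    ),\n"
    ")\n"
)

# Line-by-line scanning: no single line contains the full construct.
per_line_hits = [
    number
    for number, line in enumerate(source.splitlines(), 1)
    if pattern.search(line)
]

# Whole-text scanning: \s* also spans newlines, so the match succeeds.
whole_text_hit = pattern.search(source)

# Recover the 1-based line where the match starts, for reporting.
line_of_hit = source.count("\n", 0, whole_text_hit.start()) + 1
```

Reading whole files costs more memory than streaming lines, so one possible compromise is to fall back to a whole-text pass only for rules that are known to span lines.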

daniel-sanche · 2026-05-08T22:23:57Z

+def upload_to_drive(csv_path: str, matches: List[Dict[str, str]], github_repo: str = None, branch: str = "main") -> str:
+    """
+    Upload matches to a Google Sheet in Drive.
+    """


Is this necessary? It seems to add extra complexity, dependencies and test surface area, when Google Sheets makes it pretty easy to import a csv natively already

daniel-sanche · 2026-05-08T22:26:27Z

+        parts = rel_root.split(os.sep)
+
+        # Monorepo filtering
+        if target_packages and parts[0] == "packages":


There's talk of separating the packages directory into separate ones for generated and handwritten libraries. Will that be easy to address here?
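One way this could stay easy is to make the top-level directory names a parameter rather than a hard-coded `"packages"`. The sketch below is an assumption about how that might look: `package_for_path`, `should_scan`, and the `generated`/`handwritten` directory names are all hypothetical.

```python
import os
from typing import Iterable, Optional, Set


def package_for_path(
    rel_root: str, package_roots: Iterable[str] = ("packages",)
) -> Optional[str]:
    """Return the package name if rel_root is inside one of the package roots."""
    parts = rel_root.split(os.sep)
    if len(parts) >= 2 and parts[0] in set(package_roots):
        return parts[1]
    return None


def should_scan(
    rel_root: str,
    target_packages: Set[str],
    package_roots: Iterable[str] = ("packages",),
) -> bool:
    """Decide whether files under rel_root belong to a requested package."""
    if not target_packages:
        return True  # no filter requested: scan everything
    return package_for_path(rel_root, package_roots) in target_packages


targets = {"google-cloud-bigquery"}
current_layout = os.path.join("packages", "google-cloud-bigquery", "google")
split_layout = os.path.join("handwritten", "google-cloud-bigquery")

in_current = should_scan(current_layout, targets)
in_split_default = should_scan(split_layout, targets)
in_split_configured = should_scan(
    split_layout, targets, package_roots=("generated", "handwritten")
)
```

With the roots passed in (or read from config), a directory split would only require changing the list of roots, not the filtering logic.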

daniel-sanche · 2026-05-08T22:28:13Z

+
+    package_group.add_argument(
+        "--package",
+        help="Specific subdirectory filter (useful for monorepos)"


Is this specific to the structure of the monorepo's package directory? Or is this more of a generic subdirectory filter?

feat(scripts): Add dependency version scanner tool

f446ff7

chalmerlowe changed the title ~~feat(scripts): Add dependency version scanner tool~~ feat(scripts): [WIP] Add dependency version scanner tool Apr 29, 2026

chalmerlowe added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Apr 29, 2026

gemini-code-assist Bot reviewed Apr 29, 2026

chalmerlowe added 26 commits April 29, 2026 08:40

perf(search): Apply bot suggestions for regex optimization and imports

256b048

refactor(benchmark): Use tempfile for unique names and safe cleanup

1010399

refactor(benchmark): Remove redundant directory check

68f61ee

test(integration): Check exit code of subprocess in integration test

cc960b4

test(unit): Remove redundant and brittle test_regex_patterns

a4ad9ce

test(unit): Move import yaml to top of file

2743957

refactor(benchmark): Remove redundant directory check in main

47450bb

test(unit): Remove duplicate import yaml from function

c777e44

feat(version_scanner): handle invalid format strings in config and ad…

8aab801

…d tests

feat(version_scanner): handle PermissionError when reading config fil…

f63053c

…e and add tests

feat(version_scanner): extract read_package_file and handle file errors

2af97b3

refactor(version_scanner): simplify target resolution and remove dupl…

cb29438

…ication

feat(version_scanner): add format_match_for_csv helper and tests

ea0e8be

feat(version_scanner): integrate GitHub link generation into CSV report

a8824af

feat(version_scanner): default output to results directory

baafb74

feat(version_scanner): ignore version_scanner directory during scan

a1cc08e

feat(version_scanner): broaden version regex and add case insensitivity

3ceea9b

feat(version_scanner): strip newlines from matched strings

d756c07

feat(version_scanner): add word boundaries and truncate long context …

075d04b

…lines

feat(version_scanner): add console summary table

85e9ff5

feat(version_scanner): add .scannerignore file support

5c8f673

feat(version_scanner): move ignore defaults to .scannerignore file

efb3331

docs(version_scanner): add README.md

bf39072

docs(version_scanner): update README options and CLI help strings

9d9ce22

feat(version_scanner): set default for --github-repo

14e4dcc

feat(version_scanner): default config path to script directory

7fc03ca

chalmerlowe added 9 commits April 30, 2026 09:29

feat(version_scanner): support case-insensitive file ignores and add …

f64eac4

…changelog.md

feat(version_scanner): update small package list for demos

fc47dd6

Merge remote-tracking branch 'origin/main' into feat/add-version-scanner

95f6f19

Merge branch 'origin/main' into feat/add-version-scanner

761def6

feat(version_scanner): add combined_version_string rule and use word …

9289c8c

…boundaries for explicit_version_string

feat(scanner): add ability to detect ignore pragma

d771258

feat(scanner): move .scannerignore to script directory and update loo…

bafae70

…kup logic

chore(scanner): ignore repositories.bzl in scanner

94174bb

feat(scanner): add filename scanning support

d652dbf

chalmerlowe marked this pull request as ready for review May 5, 2026 13:03

chalmerlowe requested a review from a team as a code owner May 5, 2026 13:03

chalmerlowe removed the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label May 5, 2026

chalmerlowe changed the title ~~feat(scripts): [WIP] Add dependency version scanner tool~~ feat(scripts): Add dependency version scanner tool May 5, 2026

docs(scanner): update README with known issues and add binary ignores…

a1188c8

… to .scannerignore

chalmerlowe added this to the Drop support for 3.7-3.9 milestone May 5, 2026

parthea self-assigned this May 6, 2026

docs(version-scanner): merge migration guide into README.md

0a6ae92

daniel-sanche reviewed May 8, 2026
