Add native rule implementation for the standard protovalidate rules by jonbodner-buf · Pull Request #469 · bufbuild/protovalidate-java

jonbodner-buf · 2026-05-04T15:09:20Z

This PR ports the native rule support from protovalidate-go to protovalidate-java. Validation rules defined by protovalidate are handled in Java code instead of CEL. CEL expressions are still processed in CEL, as are any new rules that don't have native implementations.

The standard rules can be evaluated either through CEL or through native Java code. Native evaluation is functionally identical (the conformance suite passes in both modes) but skips CEL compilation and runtime overhead for the rules it covers — a single validate() call on a complex message can run an order of magnitude faster and allocate ~10× less.

Native rules are opt-in while the implementation matures. Enable them by configuring the validator:

Config config = Config.newBuilder().setEnableNativeRules(true).build();
Validator validator = ValidatorFactory.newBuilder().withConfig(config).build();

Native evaluators currently cover:

bool
all 12 numeric kinds (signed and unsigned int32/int64, float, double, etc.)
enum (const/in/not_in; the existing defined_only path is unchanged)
bytes
string (including all well-known formats)
repeated/map list-level rules (min_items/max_items/unique, min_pairs/max_pairs)
the google.protobuf.{Bool,Int32,Int64,UInt32,UInt64,Float,Double,String,Bytes}Value wrapper types (rules on wrapper-typed fields run through the same native evaluators after unwrapping).

Forward compatibility is preserved by a clone-and-clear contract: when protovalidate adds a new rule that this codebase hasn't yet implemented natively, the rule remains on the residual FieldRules and CEL enforces it. Native evaluation is an optimization, never a replacement.

Performance improvement for native rule implementations:

benchmark                          metric  cel                native          delta
buildBenchComplexSchema            time    15002410.58 ns/op  36947.91 ns/op  -99.8%
buildBenchComplexSchema            alloc   22874194.63 B/op   74972.01 B/op   -99.7%
buildBenchInt32GT                  time    4978570.44 ns/op   15568.07 ns/op  -99.7%
buildBenchInt32GT                  alloc   8468429.34 B/op    43944 B/op      -99.5%
validateBenchBoolConst             time    506.87 ns/op       60.67 ns/op     -88%
validateBenchBoolConst             alloc   1552 B/op          8 B/op          -99.5%
validateBenchBytesConst            time    614.72 ns/op       96.61 ns/op     -84.3%
validateBenchBytesConst            alloc   1828 B/op          140 B/op        -92.3%
validateBenchBytesIn               time    1498.31 ns/op      126.79 ns/op    -91.5%
validateBenchBytesIn               alloc   4288 B/op          168 B/op        -96.1%
validateBenchComplexSchema         time    54851.83 ns/op     4835.59 ns/op   -91.2%
validateBenchComplexSchema         alloc   130384.01 B/op     9184 B/op       -93%
validateBenchDoubleIn              time    686.87 ns/op       84 ns/op        -87.8%
validateBenchDoubleIn              alloc   2324 B/op          140 B/op        -94%
validateBenchEnumConst             time    506.34 ns/op       64.54 ns/op     -87.3%
validateBenchEnumConst             alloc   1556 B/op          72 B/op         -95.4%
validateBenchEnumNotIn             time    505.09 ns/op       65.28 ns/op     -87.1%
validateBenchEnumNotIn             alloc   1620 B/op          80 B/op         -95.1%
validateBenchEnumRules             time    726.41 ns/op       67.74 ns/op     -90.7%
validateBenchEnumRules             alloc   1960 B/op          80 B/op         -95.9%
validateBenchInt32GT               time    36965.57 ns/op     1114.67 ns/op   -97%
validateBenchInt32GT               alloc   85736.01 B/op      1244 B/op       -98.5%
validateBenchInt64Const            time    556.7 ns/op        77.81 ns/op     -86%
validateBenchInt64Const            alloc   1740 B/op          72 B/op         -95.9%
validateBenchInt64In               time    708.46 ns/op       94.13 ns/op     -86.7%
validateBenchInt64In               alloc   2048 B/op          132 B/op        -93.6%
validateBenchMap                   time    1007.53 ns/op      226.63 ns/op    -77.5%
validateBenchMap                   alloc   2912 B/op          432 B/op        -85.2%
validateBenchRepeatedBytesUnique   time    1034.72 ns/op      326.8 ns/op     -68.4%
validateBenchRepeatedBytesUnique   alloc   3304 B/op          1088 B/op       -67.1%
validateBenchRepeatedInt32Unique   time    668.98 ns/op       205.97 ns/op    -69.2%
validateBenchRepeatedInt32Unique   alloc   2256 B/op          576 B/op        -74.5%
validateBenchRepeatedMessage       time    9555.54 ns/op      645.4 ns/op     -93.2%
validateBenchRepeatedMessage       alloc   31112 B/op         1012 B/op       -96.7%
validateBenchRepeatedScalar        time    688.25 ns/op       95.64 ns/op     -86.1%
validateBenchRepeatedScalar        alloc   2240 B/op          112 B/op        -95%
validateBenchRepeatedScalarUnique  time    685.01 ns/op       203.59 ns/op    -70.3%
validateBenchRepeatedScalarUnique  alloc   2592 B/op          856 B/op        -67%
validateBenchRepeatedStringUnique  time    637.11 ns/op       232.19 ns/op    -63.6%
validateBenchRepeatedStringUnique  alloc   2176 B/op          576 B/op        -73.5%
validateBenchScalar                time    1523.08 ns/op      85.91 ns/op     -94.4%
validateBenchScalar                alloc   4492 B/op          120 B/op        -97.3%
validateBenchStringConst           time    548.64 ns/op       78.3 ns/op      -85.7%
validateBenchStringConst           alloc   1728 B/op          72 B/op         -95.8%
validateBenchStringContains        time    498.97 ns/op       74.8 ns/op      -85%
validateBenchStringContains        alloc   1628 B/op          72 B/op         -95.6%
validateBenchStringIn              time    778.2 ns/op        80.58 ns/op     -89.6%
validateBenchStringIn              alloc   2176 B/op          100 B/op        -95.4%
validateBenchStringLen             time    619.07 ns/op       70.94 ns/op     -88.5%
validateBenchStringLen             alloc   2132 B/op          72 B/op         -96.6%
validateBenchStringMinLen          time    640.65 ns/op       69.71 ns/op     -89.1%
validateBenchStringMinLen          alloc   2236 B/op          72 B/op         -96.8%
validateBenchStringPrefix          time    517.29 ns/op       79.1 ns/op      -84.7%
validateBenchStringPrefix          alloc   1732 B/op          72 B/op         -95.8%
validateBenchUint32In              time    747.06 ns/op       86.95 ns/op     -88.4%
validateBenchUint32In              alloc   2120 B/op          124 B/op        -94.2%
validateManyUnruled                time    650.71 ns/op       68.37 ns/op     -89.5%
validateManyUnruled                alloc   2192 B/op          72 B/op         -96.7%
validateMultiRuleError             time    3292.56 ns/op      433.89 ns/op    -86.8%
validateMultiRuleError             alloc   10736 B/op         2168 B/op       -79.8%
validateMultiRuleNoError           time    1958.92 ns/op      82.84 ns/op     -95.8%
validateMultiRuleNoError           alloc   5872 B/op          72 B/op         -98.8%
validateRegexPattern               time    1987 ns/op         779.5 ns/op     -60.8%
validateRegexPattern               alloc   4296 B/op          180 B/op        -95.8%
validateRepeatedRule               time    14083.24 ns/op     1186.26 ns/op   -91.6%
validateRepeatedRule               alloc   42896 B/op         1468 B/op       -96.6%
validateSimple                     time    2130.52 ns/op      882.01 ns/op    -58.6%
validateSimple                     alloc   4056 B/op          152 B/op        -96.3%
validateStringMatching             time    8325.98 ns/op      2035.29 ns/op   -75.6%
validateStringMatching             alloc   16280 B/op         960 B/op        -94.1%
validateTestByteMatching           time    6546.92 ns/op      359.91 ns/op    -94.5%
validateTestByteMatching           alloc   22120 B/op         600 B/op        -97.3%
validateWrapperTesting             time    18270 ns/op        962.24 ns/op    -94.7%
validateWrapperTesting             alloc   40404 B/op         1376 B/op       -96.6%

Run the benchmarks yourself with:

./gradlew :benchmarks:jmh
./gradlew :benchmarks:jmhCompareParams

Establishes the JMH measurement substrate for the native-rules port. Each subsequent phase will compare against these numbers to verify the port delivers real CPU and allocation wins. Ports the bench and test messages from protovalidate-go's bench.proto and native_test.proto into a new native_bench.proto, with gofakeit annotations stripped (no maintained Java equivalent of protogofakeit). Hand-built deterministic fixtures replace the gofakeit-generated values, so runs are reproducible. Adds compile-time benchmarks alongside the steady-state ones to track evaluator-construction cost — under the planned clone-and-clear dispatch, CEL will skip compilation for rules covered natively.

Lays the groundwork for replacing CEL with native Java evaluation of standard rules. Per-rule implementations land in subsequent phases; this commit only adds the surface area and the dispatch path, so behavior is unchanged when the flag is off (the default) or when no rule type has been migrated yet. Native evaluators live in build.buf.protovalidate.rules. To consume the existing evaluator pipeline they need access to a few package-private types in build.buf.protovalidate; those are widened to public and marked with a new @internal annotation so external consumers know they remain unsupported. Forward compatibility is preserved by a clone-and-clear contract: when a future rule is added to the validate proto and the native dispatcher hasn't yet learned it, the rule stays on the residual FieldRules and CEL enforces it. Native evaluation is an optimization, never a replacement. The conformance runner reads DISABLE_NATIVE_RULES so the same suite can exercise both modes without code changes — used now to verify the dispatcher is invisible (2872/2872 in both modes), and again later for parity testing.

First rule type to land natively. Confirms the dispatcher contract end to end — the violation shape, rule path, field path, and rule_value match what the CEL path produces for the same input. Wrapper-typed scalar fields bypass the dispatcher; CEL's transparent unwrapping handles them correctly today and adding native wrapper support uniformly across all scalar rule types is more naturally a follow-up once more rule types have shipped. The test asserts on Violation.getRuleValue() explicitly, since that field isn't part of the Violation proto and is therefore invisible to the conformance suite — a parity-loss vector that needs Java-side coverage to catch.

Covers all 12 proto numeric kinds (int32, sint32, sfixed32, uint32, fixed32, int64, sint64, sfixed64, uint64, fixed64, float, double) and all scalar numeric rules: gt, gte, lt, lte, const, in, not_in, plus finite for float/double. Range-combo rule ids (<type>.gt_lt[_exclusive], <type>.gte_lte[_exclusive], ...) and violation messages match protovalidate-go's reference. Unsigned correctness for uint32/fixed32/uint64/fixed64 uses Integer.compareUnsigned/Long.compareUnsigned for ordering and Integer.toUnsignedString/Long.toUnsignedString for messages, since Java stores unsigned protobuf scalars in signed primitives. Adds Value.rawValue() so native evaluators can read protobuf-java's underlying scalar (Long for uint64, Integer for int32, ...) without the CEL adaptation that ObjectValue.value(Class) applies — uint64 arrives as Guava UnsignedLong via that path, which doesn't compose with the rule values read off the typed *Rules message. The biggest steady-state benchmarks improve dramatically when native is enabled — validateBenchInt32GT drops 97% in time and 98% in allocation, and buildBenchInt32GT drops 99.7% (5ms → 16μs) because every rule is covered natively and CEL has nothing left to compile.

Covers the three enum rules that aren't already handled by the existing EnumEvaluator (defined_only). Both evaluators compose on the same field when both rules are set — protovalidate's clone-and-clear contract leaves defined_only untouched on the residual FieldRules so the existing path keeps working. Also adds bench fixtures BenchBoolConst and BenchEnumRules so Phase 2 and this phase have direct measurement targets. Previously neither phase had a bool-only or enum-only bench message and the JMH numbers were dominated by dispatcher overhead. With these fixtures, native vs CEL shows ~85% and ~90% time reduction on the steady-state path. While here, simplifies getUnknownFields().asMap().isEmpty() to just isEmpty() in the bool and numeric evaluators — UnknownFieldSet has the method directly.

Covers const, len, min_len, max_len, pattern, prefix, suffix, contains, in, not_in, plus the well-known size-only formats ip (4 or 16 bytes), ipv4 (4 bytes), ipv6 (16 bytes), uuid (16 bytes) — none of those require actual parsing per protovalidate's spec. Pattern matching uses re2j to stay byte-compatible with the CEL path, and throws ExecutionException on non-UTF-8 input to match Go's RuntimeError contract — the conformance suite expects pattern checks to fail loudly rather than silently substituting characters. ByteString doesn't expose a contains() method, so there's a small in-class implementation. Hex formatter mirrors Go's fmt.Sprintf("%x") output for byte slices, used in const/prefix/suffix/contains violation messages. validateTestByteMatching drops 92% in time and 97% in allocation when native is enabled. validateBenchComplexSchema is now down 73% in time vs the Phase 0 baseline (numerics + bytes both native).

The biggest single rule type by surface area: 14 scalar rules (const/len/min_len/max_len/len_bytes/min_bytes/max_bytes/pattern/ prefix/suffix/contains/not_contains/in/not_in), 18 well-known formats (email, hostname, the ip/ipv4/ipv6 family with prefix and prefixlen variants, uri, uri_ref, address, uuid, tuuid, ulid, host_and_port), and the well_known_regex oneof case for HTTP header name/value with a strict toggle. The well-known formats reuse the existing format helpers in CustomOverload (isEmail, isHostname, isIp, isIpPrefix, isUri, isUriRef, isHostAndPort) — those are widened to public+@internal so both the CEL custom-overload registration and the native rules path share the same parsers. No reimplementation, no risk of drift. Length rules count Unicode code points (String.codePointCount), not Java chars — surrogate-pair-encoded characters like emoji count as one. Byte-length rules count UTF-8 bytes. validateStringMatching drops 71% in time and 93% in allocation when native is enabled. validateBenchComplexSchema is now down 80% in time and 82% in allocation vs the Phase 0 baseline (numerics + bytes + string all native).

Covers repeated min_items/max_items/unique and map min_pairs/max_pairs. Element-level / key-and-value rules continue to flow through the existing ListEvaluator/MapEvaluator path; this phase only handles the top-level list/map rules that don't recurse into elements. The unique check uses a linear O(n²) scan for lists of 16 or fewer items (no allocation) and falls back to a HashSet for larger lists, mirroring Go's threshold. Element kinds with reliable Object.equals (scalars, strings, bools, bytes, enums) are supported; message/group element kinds bail to CEL since their equality is reference-only. This is the last rule type in scope for the native-rules port. With all six rule phases native, BenchComplexSchema drops 90.5% in time (55,536 → 5,260 ns/op) and 92.4% in allocation (133,120 → 10,100 B/op) vs the pre-port baseline. The remaining ~10% is enum defined_only, oneof required, field-level required, and wrapper- typed scalars — all intentionally out of scope (the first three are already non-CEL evaluators; wrapper support is deferred).

The parity test runs a representative slice of conformance fixtures — one per rule type, plus the kitchen-sink composite — through both evaluation modes and asserts byte-equal Violation.toProto() output. This is stricter than the conformance suite, which by default omits the message text from comparison; the parity test catches any between-mode drift on rule_id, field path, rule path, AND message. Documents the Config.setDisableNativeRules opt-in in README, listing what's covered and noting wrapper-typed scalars as a follow-up.

WrappedValueEvaluator pulls the inner 'value' field off a wrapper Message at evaluation time and delegates to the same scalar native evaluator (BoolRulesEvaluator, NumericRulesEvaluator, etc.) that would have been used for a direct field of the wrapped type. The RuleBase still uses the OUTER wrapper field's descriptor so violation field paths point at the user's wrapper-typed field, matching the field path produced by the CEL path. Removes the Phase 2 wrapper-bypass that returned null from the dispatcher when the outer descriptor was a Message. The CEL path was correct because CEL's runtime auto-unwraps protobuf wrappers; native evaluators expect the underlying scalar (e.g. Long for Int64Value.value), which is what the unwrap produces. ObjectValue is widened to public+@internal so the rules package can construct one when delegating into the inner scalar evaluator. validateWrapperTesting drops 94.8% in time (16,866 → 880 ns/op) and 96.6% in allocation (39,956 → 1,376 B/op). Conformance passes 2872/2872 in both modes; the parity test is extended to include WrapperDouble for explicit coverage.

@param

Reads more naturally: setting enableNativeRules to true turns native rule evaluation on, false turns it off, default is false. Same effective behavior (CEL-only by default; opt in to native), but the API name is direct rather than double-negative. Renames Config.setDisableNativeRules -> Config.setEnableNativeRules, Config.isNativeRulesDisabled -> Config.isNativeRulesEnabled, the EvaluatorBuilder field, the conformance env var DISABLE_NATIVE_RULES -> ENABLE_NATIVE_RULES, the JMH @param across both benchmark classes, and all test/doc references. Bench @param order flips to {"false","true"} so the CEL-only run still comes first; the regression-gate run is unchanged.

Adds make conformance-native that runs the conformance CLI with ENABLE_NATIVE_RULES=true, and updates the Conformance workflow to run both CEL-only (default) and native-enabled passes. Catches any between-mode drift on every PR rather than only when someone runs the parity test locally. JMH-based regression detection in CI was considered but rejected: at sub-microsecond benchmark scales JMH variance on shared CI runners exceeds the 5% gate that's meaningful, so a per-PR perf gate would be either too loose to catch real regressions or too flaky to trust. The conformance dual-mode pass is the strict correctness gate; JMH stays a developer-local A/B via the existing benchmarks/RESULTS.md workflow.

NumericTypeConfig.floatCompare/doubleCompare now treat +0.0 and -0.0 as equal so a rule like float.const = -0.0 against val = +0.0 no longer fires a spurious violation, and the formatters preserve the sign in violation messages by checking the bit pattern before the whole-number short-circuit. RepeatedRulesEvaluator.isUnique drops the linear-scan fast path for short lists; the HashSet path is uniform across sizes. Adds FloatBugConfirmationTest plus the validationtest.proto fixtures it exercises (ExampleFloat/DoubleConstNegZero, ExampleFloat/DoubleRepeatedUnique, FloatDoubleNaNNegZero) to lock in native-vs-CEL parity on -0 const messages and document that NaN/±0 unique semantics agree across both paths (both deviate from CEL's IEEE-754 spec via CustomOverload.uniqueList).

StringRulesEvaluator: - Pre-build well_known_regex sites (header_name / header_name_empty / header_value) as static finals; checkKnownRegex no longer allocates RuleSites per violation. - Drop defensive ArrayList copies of in/not_in lists; the proto runtime already returns immutable views. - Replace strVal.getBytes(UTF_8).length with a counting helper that doesn't allocate a byte array. - Store WellKnownFormat empty messages explicitly per constant rather than deriving them by string substitution. Substring derivation produced "value is empty, which is not a valid host (hostname or IP address) and port pair" for host_and_port, where the spec says "value is empty, which is not a valid host and port pair". - Drop unused ruleSuffix field on WellKnownFormat. BytesRulesEvaluator / EnumRulesEvaluator: drop defensive in/not_in copies. NumericRulesEvaluator: - Switch evaluate() from eager `new ArrayList<>(0)` to the lazy-null pattern used by the other evaluators; rename finalize → done. - Drop the rulesField parameter from tryBuild; the descriptor now lives on NumericTypeConfig. NumericTypeConfig: - Carry the FieldRules-level field descriptor on each kind. Lets the dispatcher read and clear the typed sub-message without the 12-arm rulesFieldNumberFor switch. - Extract a private create(...) helper to make the static initializers uniform. - Document the FieldRules / per-kind *Rules class-init invariant. Rules: - Drop rulesFieldNumberFor entirely; numericTryBuild reads the descriptor off the config. - Inline tryBuildRepeatedRules / tryBuildMapRules at their single call sites. MapRulesEvaluator: tautology() returns false unconditionally. The previous condition was unreachable because tryBuild already returns null when both fields are unset. WrappedValueEvaluator: assert at construction that the inner field is field number 1 named "value" (the wrapper-message contract).

The lazy-null `add(...)`, `done(...)`, and `formatList(...)` helpers were duplicated across six evaluators. Move them onto RuleBase: a static `add(...)`, an instance `done(...)` that finalizes against this base's field-path/rule-prefix elements, and a generic `formatList(...)` plus toString-default overload. Per-evaluator wrappers that were pure delegations are deleted entirely. NumericRulesEvaluator and BytesRulesEvaluator keep their `formatList` methods because each captures a non-default formatter (config.formatter and ByteString::toStringUtf8 respectively). Call sites use `RuleBase.add(...)` / `base.done(...)` / `RuleBase.formatList(...)` directly.

Adds five test classes covering contracts the conformance suite doesn't exercise directly: - FailFastTest: native evaluators must short-circuit after the first violation when failFast=true. One fixture per evaluator kind with two rules that both fail. - NotInRulesTest: pins rule_id and message text for not_in on int32, uint32, string, bytes, and enum. - ResidualClearingTest: asserts exactly one violation (not two) per rule type, proving the dispatcher's clone-and-clear contract for each kind. - WellKnownRegexTest: HTTP header_name/header_value, strict + loose modes, and the empty-header-name special case. - WrappedValueEvaluatorTest: absent wrapper field, present wrapper with default inner value, present wrapper with violating inner value. Adds the validationtest.proto fixtures these tests need (multi-rule fixtures, NotIn fixtures, HttpHeader fixtures, wrapper fixtures), pulling in google/protobuf/wrappers.proto. Parameterizes NativeRulesParityTest with @MethodSource so each fixture produces a discrete pass/fail in the JUnit report.

@benchmark

Adds 16 single-rule fixtures and matching @benchmark methods so each rule that the native path now handles has an isolated CEL-vs-native A/B (the existing BenchComplexSchema mixed too many rules into one timing). Covers the gaps identified in the second-pass review: - string: const, len, min_len (isolated), prefix, contains, in - bytes: const, in - numeric (non-int32): int64 const+in, uint32 in, double in - enum: const, not_in - repeated.unique: string, int32 Each fixture's value satisfies its rule so the benchmark measures the happy path. The `enableNativeRules` @param continues to A/B native vs CEL.

…ctor to prevent instantiation. Update README for benchmarks. Add new gradle task for comparing native rule performance.

…etEnableNativeRules() and setDisableNativeRules(). If neither is used, protovalidate-java defaults to native rules enabled.

…native implementations correctly.

jonbodner-buf · 2026-05-04T18:03:28Z

New commit reverses the behavior. Now native rules are enabled by default.

pkwarren

Took a first pass - still need to review the tests. The results look really promising - nice work!

There were a few places that may differ a bit between Java and Go that need a closer look. I also think we may be able to optimize further but that's better in follow ups.

pkwarren · 2026-05-04T20:07:26Z

-      - name: Test conformance
+      - name: Test conformance (CEL-only)
+        run: make conformance-cel
+      - name: Test conformance (native rules enabled — default mode)


nit: I'd remove the (native rules enabled ... here - there will potentially be other variants down the road.

pkwarren · 2026-05-04T20:08:53Z

-conformance: ## Execute conformance tests.
-	$(GRADLE) conformance:conformance
+conformance: ## Execute conformance tests with native rule evaluators enabled.
+	ENABLE_NATIVE_RULES=true $(GRADLE) conformance:conformance


We could define ENABLE_NATIVE_RULES ?= true above and then just update the workflow to run make conformance ENABLE_NATIVE_RULES=false for the CEL flow.

pkwarren · 2026-05-04T20:10:25Z

 * A comprehensive RPC quickstart for [Java and gRPC][grpc-java]
 * A [migration guide for protoc-gen-validate][migration-guide] users

+## Native rule evaluators (opt-in)


Now opt-out

pkwarren · 2026-05-04T20:12:29Z

+Validator validator = ValidatorFactory.newBuilder().withConfig(config).build();
+```
+
+Native evaluators currently cover bool, all 12 numeric kinds (signed and unsigned int32/int64, float, double, etc.), enum (`const`/`in`/`not_in`; the existing `defined_only` path is unchanged), bytes, string (including all well-known formats), repeated/map list-level rules (`min_items`/`max_items`/`unique`, `min_pairs`/`max_pairs`), and the `google.protobuf.{Bool,Int32,Int64,UInt32,UInt64,Float,Double,String,Bytes}Value` wrapper types — rules on wrapper-typed fields run through the same native evaluators after unwrapping.


Is this helpful information for users or a detail left to the implementation's javadoc or somewhere else?

My main concern is keeping this list in sync with any future optimizations.

pkwarren · 2026-05-04T20:18:27Z

+//
+// Override the input file:
+//   ./gradlew :benchmarks:jmhCompareParams -Presults=path/to/results.json
+tasks.register<Exec>("jmhCompareParams") {


Should we call this jmhCompareNative or jmhCompareNativeRules?

If so we should update the naming of the jq script to match.

renamed to jmhCompareNativeRules, .jq file renamed to jmh-compare-native-rules.jq

pkwarren · 2026-05-04T22:05:52Z

+  // Java's compareTo semantics (NaN.compareTo(NaN) == 0) so behavior matches the existing
+  // protovalidate-java CEL path, which uses the same Object.equals semantics.
+
+  private static int floatCompare(Float f1, Float f2) {


Since the comparator is used for more than range operations, we may need special handling for NaN in places where the comparator is used.

Double.NaN.compareTo(Double.NaN) == 0 in Java but if the Go code is using == then we'll get different behavior.

pkwarren · 2026-05-04T22:18:38Z

+     *
+     * @return this builder
+     */
+    public Builder setEnableNativeRules() {


Having two methods here is a different pattern than setAllowUnknownFields and setFailFast which take the boolean value. I'd have this match and remove setDisableNativeRules.

updated. Now there is a single method, setEnableNativeRules that takes a boolean. The default value is true.

pkwarren · 2026-05-04T22:22:51Z

@@ -71,6 +73,7 @@ final class EvaluatorBuilder {
  EvaluatorBuilder(Cel cel, Config config) {


Not from your change but we should change this to call this(cel, config, Collections.emptyList(), false) so we just do this initialization in one place.

added a new private constructor that is called by the two existing constructors. One constructor throws a checked exception, so merging them causes a great deal of disruption as the exception is bubbled up.

pkwarren · 2026-05-04T22:33:25Z

+ */
+@Retention(RetentionPolicy.SOURCE)
+@Target({ElementType.TYPE, ElementType.METHOD, ElementType.CONSTRUCTOR, ElementType.FIELD})
+public @interface Internal {}


While it makes the package bigger I wonder if we could look to move the rules.* package into build.buf.protovalidate. This way we can continue to leave some classes/methods package private.

There are also some other things we could look at as well - https://errorprone.info/api/latest/com/google/errorprone/annotations/RestrictedApi.html.

pkwarren · 2026-05-04T22:39:32Z

+  private static final Pattern ULID_REGEX =
+      Pattern.compile("^[0-7][0-9A-HJKMNP-TV-Za-hjkmnp-tv-z]{25}$");
+  private static final Pattern HEADER_NAME_REGEX =
+      Pattern.compile("^:?[0-9a-zA-Z!#$%&\\\\'*+\\-.\\^_|~`]+$");


https://github.com/bufbuild/protovalidate-go/blob/c0c9f624fd0fc37177c680ec2e40717ecbc68ac4/native_string.go#L246 uses raw strings, so it appears the escaping of the hyphen char differs betweeen Go and Java.

…package.

…eRules

…uffice. RuleViolation is the only implementation of Violation; the interface should be removed entirely.

jonbodner-buf added 17 commits April 30, 2026 13:50

jonbodner-buf requested a review from pkwarren May 4, 2026 15:09

fix formatting and comments. Remove unused field. Add private constru…

ea107a4

…ctor to prevent instantiation. Update README for benchmarks. Add new gradle task for comparing native rule performance.

jonbodner-buf force-pushed the jbodner/native-rules-port branch from 4419221 to ea107a4 Compare May 4, 2026 15:42

jonbodner-buf added 2 commits May 4, 2026 12:44

default to native rules enabled. There are now two Builder methods, s…

b7fdc0f

…etEnableNativeRules() and setDisableNativeRules(). If neither is used, protovalidate-java defaults to native rules enabled.

update the conformance workflow and the makefile to validate cel and …

9e45d6d

…native implementations correctly.

fix validation message for string.not_contains

ec8d659

pkwarren reviewed May 4, 2026

View reviewed changes

jonbodner-buf added 4 commits May 5, 2026 10:46

remove the rules package, just put all the code in the protovalidate …

6374e26

…package.

remove setDisableNativeRules and use a boolean flag on setEnableNativ…

11c77c1

…eRules

remove unneeded uses of Violation interface when RuleViolation will s…

a554ada

…uffice. RuleViolation is the only implementation of Violation; the interface should be removed entirely.

minor code review issues addressed

7d7e3d9

		@@ -71,6 +73,7 @@ final class EvaluatorBuilder {
		EvaluatorBuilder(Cel cel, Config config) {

Conversation

jonbodner-buf commented May 4, 2026

Uh oh!

jonbodner-buf commented May 4, 2026

Uh oh!

pkwarren left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants