fix(core): normalize boolean range predicates in Gremlin filters#2991
fix(core): normalize boolean range predicates in Gremlin filters#2991contrueCT wants to merge 12 commits intoapache:masterfrom
Conversation
imbajin
left a comment
There was a problem hiding this comment.
整体设计合理,规范化路径(TraversalUtil 层提前将 boolean 范围谓词转换为 eq/in/empty)是正确做法,测试覆盖也较为全面。有两处中等优先级问题需要决策:compareBoolean 的 null 处理语义,以及 neq 未被规范化。
|
@imbajin 这次提交去掉了 compareBoolean 里 null -> false 的静默转换,现在 boolean 比较遇到 null 会显式失败;hugegraph-struct 中对应实现也已同步修改,保持两边语义一致。另外,TraversalUtil 中已将 boolean neq(value) 规范化为 eq(!value),让它可以和 boolean equality 一样走 secondary index 路径。测试也补充了 null 比较异常、Edge 上的 P.neq(true/false),以及 Vertex 上对应的 boolean neq 路径覆盖。 |
There was a problem hiding this comment.
Pull request overview
This PR fixes HugeGraph’s handling of boolean comparison predicates in Gremlin query translation so boolean range filters behave consistently across traversal shapes instead of being treated like numeric range queries.
Changes:
- Added boolean-aware comparison support in both
Conditionimplementations and in range-flattening logic. - Normalized boolean
gt/gte/lt/lte/nequser-property predicates inTraversalUtilinto equivalent boolean-safe conditions. - Added unit and integration regression tests for boolean comparisons, flattening behavior, and edge query execution.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
hugegraph-struct/src/test/java/org/apache/hugegraph/query/ConditionTest.java |
Adds struct-layer tests for boolean comparison semantics. |
hugegraph-struct/src/main/java/org/apache/hugegraph/query/Condition.java |
Extends struct-layer comparison logic to support booleans. |
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/core/ConditionTest.java |
Adds server-side unit tests for boolean comparisons. |
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/core/ConditionQueryFlattenTest.java |
Adds flattening tests for boolean ranges and conflicts. |
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/core/VertexCoreTest.java |
Adds vertex traversal regression coverage for boolean neq. |
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/core/EdgeCoreTest.java |
Adds end-to-end edge traversal coverage for boolean range predicates. |
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/traversal/optimize/TraversalUtil.java |
Rewrites boolean compare predicates into normalized query conditions during traversal planning. |
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/query/ConditionQueryFlatten.java |
Updates range validation/comparison logic to understand boolean domains. |
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/query/Condition.java |
Extends backend comparison logic to support booleans. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| Collection<Condition> actual = queries.iterator().next().conditions(); | ||
| Assert.assertEquals(ImmutableList.of(eq), actual); | ||
| } |
There was a problem hiding this comment.
🧹 Minor — consider adding a deeper-nested regression test for IN [] null-propagation
The new testFlattenWithImpossibleInInsideAnd/Or/OrRight cases all exercise 2-level trees (AND(IN[], eq) / OR(IN[], eq)). The new flattenIn null-propagation is recursive and should handle arbitrary depth correctly, but there's no direct assertion for 3+ level trees such as:
// AND( OR(IN[], eq(a)), eq(b) ) → eq(a) AND eq(b)
query.query(Condition.in(key, ImmutableList.of()).or(Condition.eq(key, "a"))
.and(Condition.eq(key, "b")));Why this matters: the current implementation is correct today, but if someone later refactors flattenIn (e.g. adds short-circuits, converts to iterative, or caches intermediate nodes), deep-nested null propagation is easy to break without a direct test signal. A single 3-level AND-over-OR case would pin this behavior cheaply.
Impact if not added: no correctness regression today; moderate future regression risk when flattenIn is touched. Happy for this to land as-is and add in a follow-up if preferred.
Purpose of the PR
This PR fixes HugeGraph's inconsistent handling of boolean range predicates and unifies the internal ordering semantics as false < true. Previously, filter-step queries such as where(__.has("xxx", P.lt(true))) were pushed down through HugeGraph's condition/query pipeline and incorrectly treated as numeric-style range conditions, while equivalent match() queries could still return results normally. As a result, semantically equivalent traversals behaved differently.
Main Changes
The fix has two parts. First, boolean support is added to backend condition comparison and range-flattening logic so gt/gte/lt/lte can be evaluated consistently. Second, runtime Gremlin boolean range predicates are normalized into equivalent eq/in/empty conditions before query planning, so they are no longer misclassified as ordinary numeric range queries that require a range index. This preserves the intended boolean ordering semantics while making has(), where(), and match() behave consistently for boolean range predicates.
Verifying these changes
Regression coverage was also added for:
Does this PR potentially affect the following parts?
Documentation Status
Doc - TODODoc - DoneDoc - No Need