fix(runtime): #833 — replace(/.../g, fn) honors SSO returns (no more empty replacements)#844
Merged
Merged
Conversation
…empty replacements) `js_string_replace_regex_fn` decoded the replacer-function's NaN-boxed return value with a hand-rolled tag switch that only accepted `STRING_TAG` (0x7FFF) and `POINTER_TAG` (0x7FFD). When `js_string_concat_box`'s SSO fast path returned a ≤ 5-byte concat tagged `SHORT_STRING_TAG` (0x7FF9) — exactly the boundary `m.charAt(0).toUpperCase() + m.slice(1)` lands on for "hello" / "world" / "foo" — both `if tag == ...` arms missed, the handler pushed nothing, and every match was erased. Route the return value through `js_get_string_pointer_unified`, the existing centralized decoder that handles all four string representations (heap STRING_TAG, SSO with on-the-fly heap-materialization, POINTER_TAG, raw pointer) plus the JS-spec number-to-string coercion. Every other "JSValue → string ptr" call site already uses it; the regex-replace handler was the lone holdout doing manual tag dispatch. Net change: 16 lines of branchy tag arithmetic → 4 lines that delegate to the unified decoder. The codepath physically cannot omit a string variant if it doesn't list them itself. Gap suite improves 34/36 → 35/36 (`test_gap_regexp_advanced` flips PASS). Closes #833.
19ef2fc to
6a04461
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
str.replace(/.../g, fn)no longer drops ≤ 5-byte replacer-fn return values. The canonical idioms.replace(/(\w+)/g, m => m.charAt(0).toUpperCase() + m.slice(1))now returns the capitalized string instead of just inter-word whitespace.test_gap_regexp_advancedflips from FAIL to PASS).Root cause
js_string_replace_regex_fnincrates/perry-runtime/src/regex.rsdecoded the closure's NaN-boxed return value with a hand-rolled tag switch — it acceptedSTRING_TAG(0x7FFF) andPOINTER_TAG(0x7FFD), but notSHORT_STRING_TAG(0x7FF9, Perry's small-string optimization where ≤ 5-byte strings live inline in the NaN-box payload). Whenjs_string_concat_box's SSO fast path returned an SSO-tagged concat, both arms missed, the handler pushed nothing, and the match was effectively erased.Why the bug looked weird:
(m) => "Y",(m) => m,(m) => m + "!"(6 bytes) all worked because they didn't take the SSO path. Only outputs whose total byte length was ≤ 5 dropped — exactly the boundary the bug report'scharAt(0) + slice(1)lands on.Fix
Route the return value through
js_get_string_pointer_unified— the existing centralized decoder that handles all four string representations (heapSTRING_TAG, SSOSHORT_STRING_TAGwith on-the-fly heap-materialization,POINTER_TAG, raw heap pointer) plus the JS-spec number-to-string coercion. Every other "JSValue → string ptr" call site in the runtime already uses it; the regex-replace handler was the lone holdout. Net change: 16 lines of branchy tag arithmetic → 4 lines that delegate to the unified decoder.Test plan
node --experimental-strip-types:[Hello World Foo] / [aYbYc] / [aZcaZc] 1,4.g/h/i(heap path) unchanged,j/k(SSO path) now match node.test_gap_regexp_advancedflips PASS;test_gap_console_methodsremains as the unchanged pre-existing console.dir/group gap.cargo test --release -p perry-runtime --lib regex::green (6/0/0).Bonus: replacer functions returning numbers now coerce to their string form via the unified decoder's number-to-string arm (was: empty replacement).
Closes #833. Refs #793, #801.