explorer: two-button search semantics (closes #178, Light path) #179

Merged
rdhyee merged 3 commits into isamplesorg:main from rdhyee:explorer-search-two-button
May 8, 2026


@rdhyee rdhyee commented May 8, 2026

Summary

Light path of #178 — Hana's mockup feedback for the Interactive Explorer (discussed in 2026-05-08 tech call). Replaces the single "Search" button with two scope buttons:

  • Search Selected Areas (orange) — limits results to samples whose lat/lng falls in the current camera view rectangle.
  • Search Entire World (blue) — runs the existing full-corpus search.

Backend stays option C (per EXPLORER_STATE.md §6) — search is still side-panel + result-pin overlay; cluster layer and facet counts are unaffected. Heavy revisit (full A/B/C rethink) explicitly deferred per #178.

Verified locally

Manual browser eyeball check at `http://localhost:5880/explorer.html?perf=1#v=1&lat=35&lng=33&alt=2000000` (Cyprus camera position):

| step | result |
| --- | --- |
| Page loads | two buttons visible, old `#searchBtn` removed |
| Type `pottery` → click "Search Selected Areas" | URL becomes `?perf=1&search=pottery&search_scope=area`, search runs |
| Area-scoped result | 0 results; verified natively that none of the top-50 `pottery` matches fall in lat 30-40 / lng 25-40 |
| Camera | stays at Cyprus (auto-fly suppressed) |
| Click "Search Entire World" | URL drops `search_scope` param; 50+ results; camera flies to top-1 (Italy) |
| Perf panel row | shows `search #N area: "pottery" (0) 4.02 s` |

The 0-result outcome from Cyprus is correct behavior per the contract — the area is genuinely too tight to contain any of the top-50 hits. Users widen by panning out, which the search-help and the second button signal.

Implementation notes

SQL shape: viewport predicate goes on the OUTER query (post-join), not the inner CTE, because lat/lng live in `samples_map_lite` not `sample_facets_v2`. This means area-scoped searches can return < 50 results when the inner top-50 don't all satisfy viewport. Acceptable v1 behavior. A future tuning could increase the inner LIMIT in area mode if data shows users hit this bound often.

Dateline-crossing: when `west > east` (camera over the international date line), the longitude predicate splits into two ranges:

```sql
AND (l.longitude BETWEEN ${west} AND 180
OR l.longitude BETWEEN -180 AND ${east})
```
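The dateline rule can also be sketched in Python, the language of the perf test harness. This is a minimal illustration, not the code in `explorer.qmd`: the helper names `viewport_predicate` and `in_viewport` and the `?` placeholder style are assumptions; only the `l.latitude` / `l.longitude` columns and the `west > east` split come from this PR.

```python
def viewport_predicate(west, south, east, north):
    """Return (sql_fragment, params) restricting rows to a view rectangle.

    Hypothetical helper: degrees in, `?` placeholders out. When west > east
    the rectangle crosses the antimeridian, so longitude becomes two ranges.
    """
    lat_sql = "l.latitude BETWEEN ? AND ?"
    if west <= east:
        lng_sql = "l.longitude BETWEEN ? AND ?"
    else:
        lng_sql = "(l.longitude BETWEEN ? AND 180 OR l.longitude BETWEEN -180 AND ?)"
    # Both branches bind the same four values in the same order.
    return f"AND {lat_sql} AND {lng_sql}", [south, north, west, east]


def in_viewport(lat, lng, west, south, east, north):
    """Pure-Python equivalent of the predicate, handy for unit tests."""
    if not (south <= lat <= north):
        return False
    if west <= east:
        return west <= lng <= east
    # Dateline-crossing rectangle: east of `west` OR west of `east`.
    return lng >= west or lng <= east
```

A camera spanning 170°E to 170°W would pass `west=170, east=-170`; points at longitude 175 and -175 then fall inside the rectangle, while longitude 0 does not.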

URL persistence: `persistSearchScope()` is separate from `writeQueryState()` because the latter doesn't know about scope. `?search_scope=area` is written when area is chosen; deleted when world (default).
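The URL contract described here (write `search_scope=area`, delete the parameter for `world`) can be modelled in a few lines of Python, for example to assert on URLs in the test harness. The function name `with_search_scope` is hypothetical; the real `persistSearchScope()` lives in the browser code and may differ in detail.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_search_scope(url, scope):
    """Return `url` with `search_scope` set for 'area' and removed for 'world'."""
    if scope not in ("area", "world"):
        raise ValueError(f"unknown scope: {scope!r}")
    parts = urlsplit(url)
    # Drop any existing search_scope so the parameter never appears twice.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "search_scope"
    ]
    if scope == "area":
        query.append(("search_scope", "area"))
    # The fragment (the camera hash) is passed through untouched.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because `world` is the default, applying `"world"` to a URL that carries `search_scope=area` simply removes the parameter, which matches the "URL drops `search_scope` param" observation in the verification table.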

Auto-fly suppression: only `world` searches fly to the first result. `area` searches preserve the user's current camera — they're already where they want to be.

Enter key: uses the last-clicked scope, or URL-hydrated scope on cold boot. Defaults to `world` for keyboard-only first-time users.
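The scope precedence for the Enter key, together with the auto-fly rule from the previous note, amounts to two small decisions. A sketch in Python, with hypothetical function names that do not appear in `explorer.qmd`:

```python
VALID_SCOPES = ("area", "world")


def resolve_enter_scope(last_clicked=None, url_scope=None):
    """Scope used when the user presses Enter instead of clicking a button.

    Precedence: last-clicked button, then the URL-hydrated value, then 'world'.
    """
    for candidate in (last_clicked, url_scope):
        if candidate in VALID_SCOPES:
            return candidate
    return "world"


def should_auto_fly(scope):
    """Only world searches move the camera to the first result."""
    return scope == "world"
```

Unrecognised values (for example a hand-edited `?search_scope=foo`) fall through to `world` in this sketch; whether the real code does the same is an assumption.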

Doc + tests

Out of scope (other Hana mockup items, deferred)

Listed in #178: educational tooltips, vocabulary tree-selection, sample-type icons, accessibility halos, native Cesium control panel, table-always-visible. Each gets its own issue once Hana's "corrections coming" iteration lands.

Test plan

  • Open Explorer, type a query, click both buttons. Verify URL updates correctly.
  • At a global camera view, both buttons should return ~50 results.
  • Zoomed in tightly, "Search Selected Areas" should return < 50 (or 0) depending on what's in view.
  • `?perf=1` panel rows include scope label.
  • Reload with `?search=foo&search_scope=area` — search runs in area mode on boot.
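The perf-panel check in this plan could be automated by parsing the row text. The pattern below is inferred from the single example row quoted in this PR (`search #N area: "pottery" (0) 4.02 s`); the function name `parse_perf_row` is hypothetical, and the trailing duration is treated as optional because the format string in the commit message omits it.

```python
import re

# Pattern inferred from the example row in this PR, not from the panel's source.
PERF_ROW = re.compile(
    r'^search #(?P<n>\d+) (?P<scope>area|world): "(?P<term>.*)" '
    r'\((?P<count>\d+)\)(?: (?P<seconds>\d+(?:\.\d+)?) s)?$'
)


def parse_perf_row(text):
    """Return the fields of one perf-panel row as a dict, or None if it does not match."""
    match = PERF_ROW.match(text)
    if match is None:
        return None
    row = match.groupdict()
    row["n"] = int(row["n"])
    row["count"] = int(row["count"])
    if row["seconds"] is not None:
        row["seconds"] = float(row["seconds"])
    return row
```

A row without a scope label, as produced before this change, returns `None`, so a harness assertion on `parse_perf_row(text)["scope"]` would fail loudly on a stale build.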

Closes #178. Refs #163, #165, #166, #177.

🤖 Generated with Claude Code

Per Hana's mockup discussed in 2026-05-08 tech call: replace the
single 'Search' button with two scope buttons. Light extension of
option C — same backend, viewport-scoped variant adds an outer-query
lat/lng BETWEEN predicate.

UI:
- '.search-bar' loses the inline button (input only).
- New '.search-actions' row below the input: 'Search Selected Areas'
  (orange #ef6c00) and 'Search Entire World' (blue #1565c0). Match
  the mockup's color/intent coding.
- Search-help line unchanged (still warns about cold-search latency).

Backend (explorer.qmd doSearch):
- doSearch(scope) accepts 'area' or 'world'.
- For 'area', computeViewRectangle() → outer-query predicate
  `AND l.latitude BETWEEN ... AND l.longitude BETWEEN ...`.
  Dateline-crossing handled by splitting longitude into two ranges
  when west > east.
- The viewport predicate goes on the OUTER query (post-join), not the
  inner CTE, because lat/lng live in samples_map_lite, not
  sample_facets_v2. Implication: area-scoped searches can return < 50
  results when the inner top-50 don't all satisfy viewport — users
  widen by panning. Acceptable v1 behavior.
- Auto-fly to first result suppressed for area-scope (the user is
  already at the area they care about; flying would zoom in and
  disorient).

URL state:
- New ?search_scope=area|world param. Default 'world' (omitted from
  URL). Hydrated on boot from URL; persisted by persistSearchScope()
  (separate from writeQueryState which doesn't know about scope).
- Enter key uses the last-clicked scope (or URL-hydrated scope on
  cold boot, defaulting to world).

Instrumentation:
- isamples.search structured log gains 'scope' field.
- ?perf=1 panel row format: 'search #N <scope>: "<term>" (<count>)'.

Tests:
- New 'area-scope' canonical query in test_search_perf.py uses
  url_hash to set the camera before clicking 'Search Selected Areas'.
- _run_search takes a scope param routing to #searchAreaBtn or
  #searchWorldBtn.
- _measure_one_query honors query['url_hash'] and query['filters']['scope'].

Doc:
- EXPLORER_STATE.md §6 gains a 'Light-path addendum' explaining the
  two-button design as an extension of option C, NOT a revisit of
  A/B/C. Heavy revisit deferred until isamplesorg#170-isamplesorg#172 land.

Verified locally: area click at lat=35,lng=33,alt=2Mm → 0 results
(confirmed natively: no top-50 pottery in that rect), camera stays
put. World click → 50+ results, camera flies to top-1 (Italy).
URL hydration round-trips ?search_scope=area correctly.

Closes isamplesorg#178. Refs isamplesorg#163, isamplesorg#165, PR isamplesorg#166, PR isamplesorg#177.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee commented May 8, 2026

Review finding:

Medium: Search Selected Areas can return false zeroes because the viewport filter runs after the global top-50 limit. In explorer.qmd, the CTE ranks and limits 50 matches from sample_facets_v2 first, then the outer query joins coordinates and applies viewportPredicate. That means area mode only searches “current viewport among the global top 50,” not “top 50 within the current viewport.” For broad terms like pottery, this can show 0 results in a real area that has matches, simply because none of that area’s samples made the global top 50. This undercuts the button’s promise and reintroduces the false-zero problem the interim search work is trying to reduce.

The fix is to make area scope apply before the top-K selection. Since lat/lng live in samples_map_lite, area mode likely needs a separate query shape that joins lite inside the candidate CTE before ORDER BY ... LIMIT 50. If that is too slow, use an explicit larger candidate cap and document it as approximate, but the current exact-looking label is misleading.

Test gap: the perf harness adds an area-scope canonical query, but the committed tests/search_baseline_2026-05-08.json still has only the prior 9 labels. If the baseline artifact is meant to track the canonical query set, it should be rerun or explicitly left as historical pre-#179 data.

I did not run the full browser perf-smoke.

…esorg#179 review)

Codex round-2 review caught that the previous shape applied the
viewport predicate AFTER the global top-50 selection. Effect: 'Search
Selected Areas' was actually 'current viewport among the global top
50,' not 'top 50 within the current viewport.' For broad terms like
`pottery`, the global top-50 all happen to come from one Alaska
collection (label='Pottery AM662:...', score=3 each, all at lat=57.7
lng=-152.4); a Cyprus-area query would return 0 even though Cyprus
genuinely has 50+ pottery hits. This was the original false-zero
problem in disguise.

Fix: split into two SQL shapes.

- World mode: unchanged. CTE over sample_facets_v2 → top-50 → LEFT
  JOIN samples_map_lite. Coord-less samples still appear (lat/lng
  null) since they're legitimate text matches.
- Area mode: INNER JOIN samples_map_lite inside the candidate
  selection, viewport BETWEEN predicate applied BEFORE
  ORDER BY ... LIMIT 50. Drops coord-less samples (area-scoped search
  by definition requires coords). Top-50 within area, not within global.

Verified natively (TIGHT Cyprus rect lat 30-40 lng 25-40):
- Old SQL: 0 of top-50 pass viewport
- New SQL: 50 of top-50

Verified in browser at Cyprus camera (lat=35, lng=33, alt=1Mm):
'Search Selected Areas' for `pottery` returns 50+ results, all at
the Dead Sea pottery site (31.13, 35.53) — exactly what the user
expects.

Both SQL shapes use f.-qualified column names so the same
searchWhere/score strings work for both. EXPLORER_STATE.md §6
Light-path addendum updated to describe the two shapes and why
area mode requires coords.

Refs isamplesorg#178, isamplesorg#179.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee commented May 8, 2026

Round-2 fix (commit `cc79ec0`)

Codex was right — the previous shape had a real semantic bug. The viewport predicate ran AFTER the global top-50 selection, so "Search Selected Areas" was actually "current viewport among the global top 50," not "top 50 within the current viewport."

The bug was visible in the wild: for `pottery` the global top-50 is dominated by one Alaska collection (label='Pottery AM662:...', all at lat=57.7 lng=-152.4). Any non-Alaska area-scope query for `pottery` would return zero, including the canonical Cyprus case the interim search work was built to fix.

Fix

Split into two SQL shapes (committed):

| mode | shape |
| --- | --- |
| world | CTE over `sample_facets_v2` → top-50 → LEFT JOIN `samples_map_lite`. Unchanged. Coord-less samples still appear in results. |
| area | INNER JOIN `samples_map_lite` inside the candidate selection. Viewport BETWEEN applied BEFORE `ORDER BY ... LIMIT 50`. Drops coord-less samples (required by area). |

Both use `f.`-qualified column names so the same `searchWhere`/`score` strings work for both.
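The difference between the two shapes can be reproduced on toy data. The sketch below uses Python's built-in `sqlite3` purely as a stand-in engine; the `pid` join key, the integer `score` column, the `LIKE` match, and the row counts are all assumptions made for illustration. Only the table names, the `latitude`/`longitude` columns, and the two coordinates come from this PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample_facets_v2 (pid TEXT PRIMARY KEY, label TEXT, score INTEGER);
    CREATE TABLE samples_map_lite (pid TEXT PRIMARY KEY, latitude REAL, longitude REAL);
""")


def add(pid, score, lat, lng):
    conn.execute("INSERT INTO sample_facets_v2 VALUES (?, 'Pottery', ?)", (pid, score))
    conn.execute("INSERT INTO samples_map_lite VALUES (?, ?, ?)", (pid, lat, lng))


# Toy data: 60 high-scoring Alaska rows crowd out 5 lower-scoring Dead Sea rows.
for i in range(60):
    add(f"ak{i}", 3, 57.7, -152.4)
for i in range(5):
    add(f"ds{i}", 1, 31.13, 35.53)

RECT = (30, 40, 25, 40)  # south, north, west, east (the tight Cyprus rectangle)

# Old shape: global top-50 first, viewport filter afterwards.
OLD_SHAPE = """
WITH hits AS (
    SELECT f.pid, f.score FROM sample_facets_v2 f
    WHERE f.label LIKE '%pottery%'
    ORDER BY f.score DESC LIMIT 50
)
SELECT hits.pid FROM hits
LEFT JOIN samples_map_lite l ON l.pid = hits.pid
WHERE l.latitude BETWEEN ? AND ? AND l.longitude BETWEEN ? AND ?
"""

# New area shape: join and viewport filter inside the CTE, before LIMIT 50.
NEW_SHAPE = """
WITH hits AS (
    SELECT f.pid, f.score FROM sample_facets_v2 f
    JOIN samples_map_lite l ON l.pid = f.pid
    WHERE f.label LIKE '%pottery%'
      AND l.latitude BETWEEN ? AND ? AND l.longitude BETWEEN ? AND ?
    ORDER BY f.score DESC LIMIT 50
)
SELECT pid FROM hits
"""

old_rows = conn.execute(OLD_SHAPE, RECT).fetchall()
new_rows = conn.execute(NEW_SHAPE, RECT).fetchall()
print(len(old_rows), len(new_rows))  # 0 5
```

With the old shape every one of the 50 candidates is an Alaska row, so the viewport filter removes them all; with the new shape the Alaska rows never enter the candidate set.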

Verified

Native (TIGHT Cyprus rect lat 30-40 lng 25-40):

| SQL form | top-50 result |
| --- | --- |
| Old (filter after CTE LIMIT) | 0 ← false-zero bug |
| New (filter inside CTE) | 50 |

Browser at Cyprus camera (`lat=35, lng=33, alt=1Mm`): "Search Selected Areas" for `pottery` → 50+ results, all clustered at the Dead Sea pottery site (31.13, 35.53). Exactly what the user expects.

Doc

EXPLORER_STATE.md §6 Light-path addendum updated to describe both SQL shapes and why area mode drops coord-less samples.

Baseline JSON refresh coming in a follow-up commit on this branch — the perf-smoke is running now against the fixed code; the previous `tests/search_baseline_2026-05-08.json` was from a build before the area-scope canonical query was added (and before this fix).

Diff: +89 / -54.

…subset (isamplesorg#179)

Run after the round-2 SQL fix (commit cc79ec0). All 10 canonical
queries pass cleanly in 4m6s including the new area-scope case.

Highlights:
- single-common (pottery): 10.5s cold, 4.6s warm, 50 results
- multi-term (pottery Cyprus): 10.0s cold, 4.5s warm, 50 results
  (was 0 before isamplesorg#177 Direction A)
- diacritic (Çatalhöyük): 13.2s cold, 4.9s warm, 50 results
  (was 0 before isamplesorg#177)
- area-scope (pottery × Cyprus camera): 10.5s cold, 4.2s warm,
  50 results — confirms the round-2 fix (was 0 before cc79ec0)
- composed-source / composed-source-material: ~6s cold, faster
  because the source filter dramatically reduces the candidate set

Latency profile: 10-13s cold, 4-5s warm. Within the same envelope
as the pre-area-scope baseline; the new SQL doesn't materially
change cold/warm timings vs the world path.

field_subset string in test + JSON was stale (still said
"label+place_name samples_map_lite") — landed in test edit that
was abandoned with the honesty-fix branch when Direction B
shipped first. Corrected now.

Refs isamplesorg#167, isamplesorg#168, isamplesorg#178, isamplesorg#179.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee commented May 8, 2026

Refreshed baseline (commit `d37f703`)

Perf-smoke against the fixed code, all 10 canonical queries:

| label | results | cold (ms) | warm (ms) |
| --- | --- | --- | --- |
| single-common | 50 | 10534 | 4566 |
| single-rare | 50 | 9744 | 4493 |
| multi-term | 50 | 10013 | 4515 |
| no-hit | 0 | 9840 | 4644 |
| wildcard-pct | 50 | 12216 | 4362 |
| wildcard-under | 0 | 10134 | 4575 |
| diacritic | 50 | 13237 | 4878 |
| composed-source | 50 | 6072 | 5486 |
| composed-source-material | 0 | 6582 | 3427 |
| area-scope | 50 | 10483 | 4165 |

`area-scope` now returns 50 results (previously 0 with the pre-fix SQL). Latency envelope ~10-13s cold, ~4-5s warm — same shape as world mode; no regression from adding the INNER JOIN inside the area-mode CTE.
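The "no regression" claim can be checked arithmetically against the baseline numbers. The sketch below assumes that the nine pre-existing labels are world-mode queries and sets aside the two `composed-*` rows, whose source filter shrinks the candidate set; the one-sided comparison (area no slower than the slowest comparable world query) is an illustrative choice, not the harness's actual pass criterion.

```python
# (label, results, cold_ms, warm_ms), copied from the baseline table.
BASELINE = [
    ("single-common", 50, 10534, 4566),
    ("single-rare", 50, 9744, 4493),
    ("multi-term", 50, 10013, 4515),
    ("no-hit", 0, 9840, 4644),
    ("wildcard-pct", 50, 12216, 4362),
    ("wildcard-under", 0, 10134, 4575),
    ("diacritic", 50, 13237, 4878),
    ("composed-source", 50, 6072, 5486),
    ("composed-source-material", 0, 6582, 3427),
    ("area-scope", 50, 10483, 4165),
]


def envelope(rows, column):
    """Return (min, max) of one numeric column across the given rows."""
    values = [row[column] for row in rows]
    return min(values), max(values)


# Assumed comparison set: world-mode queries without a source filter.
comparable = [
    row for row in BASELINE
    if row[0] != "area-scope" and not row[0].startswith("composed-")
]
area = next(row for row in BASELINE if row[0] == "area-scope")

cold_lo, cold_hi = envelope(comparable, 2)  # (9744, 13237)
warm_lo, warm_hi = envelope(comparable, 3)  # (4362, 4878)

# One-sided check: area mode is no slower than the slowest comparable query.
no_regression = area[2] <= cold_hi and area[3] <= warm_hi
print(no_regression)  # True
```

The area-scope warm time (4165 ms) is in fact slightly below the fastest comparable world query (4362 ms), which is why a two-sided "within the envelope" test would be the wrong check here.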

Also fixed: the `field_subset` JSON metadata string (was stale: `"label+place_name samples_map_lite"` from before #177 Direction A; now reads `"label+description+place_name (sample_facets_v2 + lite for coords; world via LEFT JOIN, area via INNER JOIN with viewport predicate inside CTE)"`).

@rdhyee rdhyee merged commit 13a8dec into isamplesorg:main May 8, 2026
1 check passed
Development

Successfully merging this pull request may close these issues.

Interactive Explorer: integrate Hana mockup feedback (Light path on search semantics)
