Defensive: warmup write-tx so read-only reader pool can attach to a fresh WAL DB #21
Open
aiba wants to merge 2 commits into andersmurphy:master
On a brand-new SQLite file, init-db! would throw SQLITE_CANTOPEN ("unable
to open database file") from the read-only reader pool's first pragma:

1. Writer pool opens with READWRITE|CREATE; the file is created (4096
   bytes) and the writer's pragma journal_mode=WAL flips the header
   to WAL mode. But no -wal/-shm files are created yet because no
   write transaction has happened.
2. Reader pool opens with SQLITE_OPEN_READONLY. The first reader
   pragma (cache_size in default-pragma iteration order) needs to
   access WAL state, which requires the -shm file.
3. A read-only connection cannot create -shm, so SQLite returns
   SQLITE_CANTOPEN.

Anyone using init-db! with the default pragmas and a separate read-only
reader pool hits this on day one against an empty database.

Fix: between writer-pool init and reader-pool init, run a no-op
BEGIN IMMEDIATE/COMMIT on the writer connection. This materializes the
-wal/-shm files so the reader pool can attach. Cost is one tiny
transaction once per init-db! call (skipped for ":memory:" databases).

Includes a regression test that calls init-db! against a freshly-deleted
file and verifies both pools work.
The fresh-WAL failure mode this change defends against does NOT
deterministically reproduce against the bundled SQLite 3.51.3 in a
fresh JVM, even though it has been observed in production on macOS.
The previous commit's regression test passes against the unpatched
library, which makes it misleading.

Updates:

- In-source comment: explain what we know (CLI demo with system
  sqlite3 3.51.0 fails; production observed; bundled library more
  permissive but still bit production), what we don't (exact macOS
  trigger condition unknown), and why the warmup is still worth
  doing (cheap; satisfies WAL docs §5 condition 1 unconditionally).
- Test docstring: reframe as a happy-path smoke test rather than a
  strict regression test.

The fix itself is unchanged.
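
For readers who want to see the ordering outside the JVM, here is a minimal sketch of the warmup using Python's standard `sqlite3` module. It is not sqlite4clj's code: the helper names `warmup_write_tx` and `open_writer_then_reader` are hypothetical, and the sketch only assumes what the commit messages describe (writer sets WAL mode, an empty BEGIN IMMEDIATE/COMMIT runs on the writer, then a read-only connection is opened).

```python
import sqlite3
from pathlib import Path


def warmup_write_tx(writer, path):
    """Run an empty write transaction on the writer connection.

    Hypothetical helper. Returns False when skipped for ":memory:".
    """
    if path == ":memory:":
        return False
    # An empty write transaction: takes the write lock and releases it
    # without changing any data. The intent described in the PR is that
    # the -wal/-shm files exist before a read-only connection opens.
    writer.execute("BEGIN IMMEDIATE")
    writer.execute("COMMIT")
    return True


def open_writer_then_reader(path):
    """Open a read-write writer, warm it up, then open a read-only reader."""
    # isolation_level=None selects autocommit mode, so the explicit
    # BEGIN IMMEDIATE / COMMIT statements are passed to SQLite as written.
    writer = sqlite3.connect(path, isolation_level=None)
    # fetchall() steps the pragma to completion before the next statement.
    writer.execute("PRAGMA journal_mode=WAL").fetchall()

    warmup_write_tx(writer, path)

    # mode=ro requests a read-only open via SQLite's URI filename syntax.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    reader = sqlite3.connect(uri, uri=True)
    return writer, reader
```

Whether skipping the warmup actually fails depends on the SQLite build and platform, as the second commit message notes, so this sketch demonstrates the ordering rather than the failure.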
Update: please re-read. I revised this PR after my colleague pointed out that my original claims were overstated.

This is a defensive fix for a macOS-specific failure mode in init-db!, not a fully reproducible bug fix. Here's what I actually know:

The underlying SQLite behavior is real
Per SQLite WAL docs §5, a SQLITE_OPEN_READONLY connection can attach to a WAL database only if one of:

1. the -shm and -wal files already exist and are readable, or
2. there is write permission on the directory containing the database, so that the -shm and -wal files can be created, or
3. the connection is opened with the immutable URI parameter.

For sqlite4clj's reader pool: the connection is opened with literal SQLITE_OPEN_READONLY (0x1) and no immutable=1, so condition (3) doesn't apply. Condition (2) only applies to scenarios where the file itself is read-only and SQLite is opening it R/W; with an explicit SQLITE_OPEN_READONLY flag, SQLite respects the read-only contract and refuses to create files. That leaves condition (1).

After the writer pool runs pragma journal_mode=WAL but no transaction has occurred, the -shm/-wal files do not exist on disk. So none of the three conditions is satisfied → SQLITE_CANTOPEN.

Reproduces deterministically at the SQLite CLI level on macOS
(Tested against Apple's bundled sqlite3 3.51.0 on macOS Sequoia.)

I cannot reproduce it deterministically against this library's bundled SQLite (3.51.3) in a fresh JVM
This is the part where I was wrong in my original PR. I tried many variants (different paths, pool sizes, leftover -shm/-wal, closed-then-reopened writer, concurrent load, the -Dsqlite4clj.memstatus=true flag) and got "ok" every time. Bundled 3.51.3 appears more permissive than the system 3.51.0.

But it has bitten a real production system on macOS
A user running this library on macOS hit this exact stack trace in app.log:

A colleague separately reports being able to reproduce something on macOS but not on Linux, which is consistent with macOS's SQLite VFS layer (different from Linux's) being the source of the variability.
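
Since the trigger on macOS is still unknown, one way to narrow down future reports might be to record which of the three WAL docs §5 conditions held at the moment the reader pool opened. The following is a rough Python sketch of such a check. The function name `wal_readonly_conditions` and its return keys are hypothetical, it inspects only the filesystem, and it says nothing about how a particular SQLite build or VFS actually behaves.

```python
import os


def wal_readonly_conditions(db_path, immutable=False):
    """Report filesystem facts relevant to a read-only open of a WAL database.

    Hypothetical diagnostic: `immutable` is simply echoed back, because it is
    a property of how the connection is opened, not of the files on disk.
    """
    directory = os.path.dirname(os.path.abspath(db_path))
    sidecars = [db_path + "-wal", db_path + "-shm"]
    return {
        # Condition 1: both sidecar files exist and are readable.
        "sidecars_exist_and_readable": all(
            os.path.isfile(p) and os.access(p, os.R_OK) for p in sidecars
        ),
        # Condition 2: the containing directory is writable by this process.
        "directory_writable": os.access(directory, os.W_OK),
        # Condition 3: the caller intends to open with immutable=1.
        "immutable_requested": bool(immutable),
    }
```

Logging this dictionary just before the reader pool opens would show, for example, whether the sidecar files were missing when a SQLITE_CANTOPEN occurred.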
Why I still think the patch is worth merging
- The warmup is one tiny transaction per init-db! call, once per database lifetime. Cost is negligible and only paid at init time.

If you'd rather wait for someone to produce a deterministic minimal repro against the bundled library, that's a fair call to make and I won't push back. I just want to be honest about what I do and don't know about this.
Changes
- src/sqlite4clj/core.clj: between the writer-pool init and the reader-pool init, run BEGIN IMMEDIATE; COMMIT on the writer connection (skipped for :memory: databases). Comment explains the rationale.
- test/sqlite4clj/core_test.clj: a smoke test verifying init-db! still works on a fresh WAL file. (NOT a strict regression test: it passes on the unpatched library too. Docstring says so.)

All 19 existing tests still pass (74 assertions, 0 failures).