joaoh82 · May 8, 2026
diff --git a/README.md b/README.md
@@ -305,7 +305,7 @@ Lockstep versioning — one dispatch bumps every product to the same `vX.Y.Z`. T
 - [x] **7a — `VECTOR(N)` column type** *(v0.1.10)*: dense f32 vectors with bracket-array literal syntax (`[0.1, 0.2, ...]`); file format bumped to v4
 - [x] **7b — Distance functions** *(v0.1.11)*: `vec_distance_l2/cosine/dot` + `ORDER BY <expr> LIMIT k` so KNN queries work end-to-end
 - [x] **7c — Bounded-heap top-k optimization** *(v0.1.12)*
-- [x] **7d — HNSW ANN index** *(v0.1.13–15)*: `CREATE INDEX … USING hnsw (col)`; recall@10 ≥ 0.95 at default `M=16, ef_construction=200, ef_search=50`; persisted as a `KIND_HNSW` cell tree
+- [x] **7d — HNSW ANN index** *(v0.1.13–15, +SQLR-28)*: `CREATE INDEX … USING hnsw (col) [WITH (metric = '<l2|cosine|dot>')]`; recall@10 ≥ 0.95 at default `M=16, ef_construction=200, ef_search=50`; persisted as a `KIND_HNSW` cell tree, with the metric round-tripping through the synthesized `sqlrite_master` SQL
 - [x] **7e — JSON column type + path queries** *(v0.1.16)*: `JSON` / `JSONB` columns stored as canonical text; `json_extract` / `json_type` / `json_array_length` / `json_object_keys`; `$.key`, `[N]`, chained JSONPath subset
 - [x] **7g.1 — `sqlrite-ask` crate** *(v0.1.18)*: foundational natural-language → SQL via the [Anthropic API](https://docs.anthropic.com/) (Sonnet 4.6 by default), prompt-cached schema dump, sync `ureq` HTTP.
 - [x] **7g.2 — REPL `.ask` + dep-direction flip** *(v0.1.19)*: `.ask <question>` meta-command with `Run? [Y/n]` confirmation. The wiring required dropping the engine dep from `sqlrite-ask` (cargo cycle) — `sqlrite-ask` is now pure over `&str` schemas; the `Connection`/`Database` integration moved to the engine's new `ask` feature. Public surface for callers: `use sqlrite::{Connection, ConnectionAskExt}`.

diff --git a/benchmarks/README.md b/benchmarks/README.md
@@ -170,14 +170,20 @@ Even at 10k scale, the gap is large:
 
 ### W10 — vector top-10 (cosine), brute-force vs HNSW
 
-10k 384-dim vectors. Two variants per the plan: brute-force (no index) and HNSW (`CREATE INDEX … USING hnsw`). SQLRite-only — `sqlite-vec` extension wiring is a follow-up (`rusqlite[bundled]` doesn't ship it; loading a pre-compiled `.dylib` at runtime is non-trivial and was out of scope for v1).
+10k 384-dim vectors. Two variants per the plan: brute-force (no index) and HNSW (`CREATE INDEX … USING hnsw (embedding) WITH (metric = 'cosine')`, per SQLR-28). SQLRite-only — `sqlite-vec` extension wiring is a follow-up (`rusqlite[bundled]` doesn't ship it; loading a pre-compiled `.dylib` at runtime is non-trivial and was out of scope for v1).
 
-| Variant | SQLRite median | Throughput |
+The **W10.v1** numbers below predate two fixes: SQLR-23 (the benchmark was parser-bound) and SQLR-28 (the HNSW probe was L2-only, so the HNSW variant silently fell through to brute-force on cosine queries). They are retained for historical context only. **W10.v3** ships with the cosine-built index and the cosine-aware optimizer probe; its numbers are to be republished under SQLR-25.
+
+| Variant | SQLRite median (v1, retired) | Throughput |
 |---|---|---|
 | brute-force | ~122 ms | ~8 ops/s |
 | hnsw | ~132 ms | ~7 ops/s |
 
-**Read this as:** at 10k vectors × 384 dim, **HNSW barely beats brute-force**. That's not the index's fault — both numbers are dominated by the **per-iter SQL parse cost** (the 384-element bracket-array literal in the `ORDER BY` clause is ~4 KB of SQL the parser walks every iteration; the actual cosine work is ~3.8M FP ops ≈ a few ms). At a much larger corpus (millions of vectors) HNSW would dominate; at 10k the parser cost masks the algorithmic win. A future "prepare-vector-query-once" path or VECTOR-bind binding would surface the real HNSW vs brute-force gap.
+**Read this as (v1, retired):** at 10k vectors × 384 dim, **HNSW did not beat brute-force** (~132 ms vs ~122 ms median). Two reasons compounded:
+1. **Per-iter SQL parse cost** — the 384-element bracket-array literal in the `ORDER BY` clause was ~4 KB of SQL the parser walked every iteration. Fixed in SQLR-23 (`Value::Vector` bind).
+2. **Cosine queries silently brute-forced on the HNSW path** — the optimizer's `try_hnsw_probe` was L2-only; cosine queries never hit the graph. Fixed in SQLR-28 (per-index metric + matching probe).
+
+W10.v3 measures the *actual* HNSW-vs-brute-force gap with both fixes in place.
 
 ### W11 — BM25 top-10
 

diff --git a/benchmarks/src/workloads/vector.rs b/benchmarks/src/workloads/vector.rs
@@ -3,21 +3,23 @@
 //! ```sql
 //! CREATE TABLE vecs (id INTEGER PRIMARY KEY, embedding VECTOR(384));
 //! -- 10k 384-dim vectors, deterministic per-id.
-//! -- HNSW variant adds:
-//! CREATE INDEX vecs_hnsw ON vecs USING hnsw (embedding);
+//! -- HNSW variant adds (SQLR-28: cosine-built index, matched to the
+//! -- query's vec_distance_cosine):
+//! CREATE INDEX vecs_hnsw ON vecs USING hnsw (embedding)
+//!     WITH (metric = 'cosine');
 //!
 //! -- Hot loop:
 //! SELECT id FROM vecs
 //! ORDER BY vec_distance_cosine(embedding, [...]) ASC
 //! LIMIT 10;
 //! ```
 //!
-//! Two criterion groups land per driver: `W10.v1/brute-force` (no HNSW
+//! Two criterion groups land per driver: `W10.v3/brute-force` (no HNSW
 //! index — every probe full-scans + bounded-heap top-k) and
-//! `W10.v1/hnsw` (with the HNSW index, optimizer probes the graph
-//! per [`docs/supported-sql.md`](../../docs/supported-sql.md) "HNSW
-//! indexes"). The gap between the two is the headline number for
-//! "did Phase 7d's ANN actually deliver?"
+//! `W10.v3/hnsw` (with the cosine-built HNSW index, optimizer probes
+//! the graph per [`docs/supported-sql.md`](../../docs/supported-sql.md)
+//! "HNSW indexes"). The gap between the two is the headline number
+//! for "did Phase 7d's ANN actually deliver?"
 //!
 //! ## Comparator
 //!
@@ -40,10 +42,18 @@ use crate::{Driver, Value, WorkloadId};
 /// inlined as a 4 KB bracket-array literal in the SQL string. The
 /// brute-force-vs-HNSW gap should widen materially because the
 /// per-iter parser cost no longer dominates.
+///
+/// SQLR-28 — bumped again to v3: the HNSW variant now creates the
+/// index `WITH (metric = 'cosine')`, matching the hot-loop SQL's
+/// `vec_distance_cosine`. v1/v2 used the optimizer's L2-only probe,
+/// which silently fell through to brute-force on a cosine query —
+/// the HNSW variant was never actually exercising the graph. Numbers
+/// from before v3 are not comparable to v3 numbers and have been
+/// retired.
 pub const W10: WorkloadId = WorkloadId {
     id: "W10",
     name: "vector-top10",
-    version: "v2",
+    version: "v3",
 };
 
 /// `(label, with_hnsw_index)` — two variants per driver.
@@ -71,9 +81,13 @@ pub fn setup<D: Driver>(
     let dataset = vector_dataset();
     insert_rows(driver, &mut conn, &dataset)?;
     if with_hnsw {
+        // SQLR-28: build the graph for cosine — matches the hot-loop
+        // SQL's vec_distance_cosine. Without the metric clause the
+        // index defaults to L2 and the optimizer's metric gate falls
+        // through to brute-force, which is exactly the bug v3 fixes.
         driver.execute(
             &mut conn,
-            "CREATE INDEX vecs_hnsw ON vecs USING hnsw (embedding)",
+            "CREATE INDEX vecs_hnsw ON vecs USING hnsw (embedding) WITH (metric = 'cosine')",
         )?;
     }
     Ok((conn, dataset))

diff --git a/docs/phase-7-plan.md b/docs/phase-7-plan.md
@@ -163,6 +163,7 @@ SELECT id, title FROM docs ORDER BY embedding <-> [0.1, ...] LIMIT 10;
 > - **✅ 7d.1 — Pure HNSW algorithm** *(~700 LOC, shipped in v0.1.13).* `src/sql/hnsw.rs` standalone module: insert + search + layer assignment + beam search per layer + L2/cosine/dot distance dispatch. No SQL integration yet — vectors are passed in via a `get_vec` closure so the algorithm doesn't depend on table types. Tests verify recall@k ≥ 0.95 vs brute-force on randomly-generated vector sets; deterministic via a fixed RNG seed.
 > - **✅ 7d.2 — SQL integration** *(~500 LOC).* `CREATE INDEX … USING hnsw (col)` parser + engine, INSERT wiring (also calls `hnsw.insert()` incrementally), query optimizer hook (recognizes `ORDER BY vec_distance_l2(col, literal) LIMIT k` and probes the HNSW instead of full-scanning). HNSW lives in memory only at this point; the **CREATE INDEX SQL persists in `sqlrite_master` and reopen rebuilds the graph from current rows** — partial persistence ahead of 7d.3. DELETE/UPDATE on HNSW-indexed tables refused with helpful error pointing at 7d.3.
 > - **✅ 7d.3 — Persistence** *(~600 LOC).* New `KIND_HNSW` cell tag and `HnswNodeCell` encoding (varint node_id + per-layer neighbor lists). Each HNSW index gets its own page tree parallel to secondary indexes. Open path loads cells directly into `HnswIndex::from_persisted_nodes` — no algorithm runs, exact bit-for-bit reproduction. Also unblocks DELETE / UPDATE on HNSW-indexed tables: those mark the index `needs_rebuild`, save rebuilds from current rows before staging. ~2× the original 300-LOC estimate because the cell encoding + tests + rebuild path together added more than expected.
+> - **✅ 7d.4 (SQLR-28) — Per-index distance metric.** Q2's "deferred per-index metric knob" lands as `CREATE INDEX … USING hnsw (col) WITH (metric = '<l2|cosine|dot>')`. The metric is stored on `HnswIndexEntry` and round-tripped via the synthesized CREATE INDEX SQL in `sqlrite_master` (no file-format bump — pre-SQLR-28 rows omit the WITH clause and decode as L2). The optimizer's `try_hnsw_probe` widens to all three `vec_distance_*` functions but only fires when the query function matches the index's metric; mismatches fall through to brute-force. Surfaced by the SQLR-23 v2 bench: W10 uses cosine, the optimizer was L2-only, and the HNSW variant had been silently brute-forcing the entire time. SQLR-25 (republish v2 numbers) was the gating consumer.
 >
 > Each 7d.x ships as its own PR + release wave. The user-facing value lands at 7d.2; 7d.3 closes the persistence loop. 7d.1 is foundational but ships a tested algorithmic primitive on its own — useful as documentation of the engine's "from scratch" theme.
 
@@ -368,12 +369,12 @@ Q1–Q10 were resolved by the project owner on 2026-04-26. Each question keeps i
 
 ### Q2. HNSW parameters: fixed defaults or per-index configurable?
 
-> **Decided: fixed defaults** (`M=16, ef_construction=200, ef_search=50`).
+> **Decided: fixed defaults** (`M=16, ef_construction=200, ef_search=50`) for the algorithmic knobs. **Distance metric** *did* land as a per-index `WITH (metric = '<l2|cosine|dot>')` clause in **SQLR-28 / sub-phase 7d.4** — see the 7d split note above. Was deferred from the original 7d.2 cut; surfaced as a gap by the SQLR-23 v2 bench, where W10's cosine query had been silently brute-forcing because the optimizer hook was L2-only.
 
 - **Fixed:** `M=16, ef_construction=200, ef_search=50`. Simpler API, less to test. Matches sqlite-vec's defaults.
 - **Configurable:** `CREATE INDEX … USING hnsw (col) WITH (m=32, ef_construction=400)`. Power-user knobs, more code, more test matrix.
 
-**Recommendation:** fixed defaults for MVP. Configurable can land as a follow-up if anyone actually asks.
+**Recommendation:** fixed defaults for MVP. Configurable can land as a follow-up if anyone actually asks. (`metric` already came back as a follow-up; `m` / `ef_*` haven't been requested yet.)
 
 ### Q3. JSON storage format
 

diff --git a/docs/supported-sql.md b/docs/supported-sql.md
@@ -113,15 +113,18 @@ These are full-citizen indexes — they're visible via `.tables`-adjacent catalo
 ### HNSW indexes (Phase 7d)
 
 ```sql
-CREATE INDEX <name> ON <table> USING hnsw (<vector_column>);
+CREATE INDEX <name> ON <table> USING hnsw (<vector_column>)
+  [WITH (metric = '<l2|cosine|dot>')];
 ```
 
-Builds an [HNSW](https://arxiv.org/abs/1603.09320) approximate-nearest-neighbor index over a `VECTOR(N)` column. The query optimizer recognizes `ORDER BY vec_distance_l2(col, literal) LIMIT k` (or the cosine / dot variants) on an HNSW-indexed column and probes the graph instead of full-scanning. SQLR-23 — the second arg can be either an inline `[...]` literal *or* a bound `Value::Vector(...)` parameter via `Statement::query_with_params`; the optimizer recognizes both, so prepared-statement KNN queries still take the graph shortcut.
+Builds an [HNSW](https://arxiv.org/abs/1603.09320) approximate-nearest-neighbor index over a `VECTOR(N)` column. The query optimizer recognizes `ORDER BY vec_distance_l2(col, literal) LIMIT k` (or the cosine / dot variants) on an HNSW-indexed column **whose metric matches the query's distance function**, and probes the graph instead of full-scanning. SQLR-23 — the second arg can be either an inline `[...]` literal *or* a bound `Value::Vector(...)` parameter via `Statement::query_with_params`; the optimizer recognizes both, so prepared-statement KNN queries still take the graph shortcut.
 
-- Recall@10 ≥ 0.95 at default parameters (`M=16`, `ef_construction=200`, `ef_search=50`). Parameters aren't tunable from SQL yet — see Q2 of [`docs/phase-7-plan.md`](phase-7-plan.md).
-- The index is built incrementally on `INSERT`. `DELETE` / `UPDATE` mark the index `needs_rebuild`; the next save rebuilds from current rows.
-- Persisted as a `KIND_HNSW` cell tree alongside the regular page hierarchy — open path loads the graph bit-for-bit, no algorithm runs.
-- Without an HNSW index, the same `ORDER BY vec_distance_… LIMIT k` query still works — it just brute-force-scans every row (Phase 7c's bounded-heap top-k optimization keeps the memory footprint to O(k)).
+The `WITH (metric = '…')` clause picks the distance the graph is built for. Three values are recognized: `'l2'` (Euclidean — the default, also accepts `'euclidean'`), `'cosine'`, and `'dot'` (negated dot-product — also accepts `'inner_product'` / `'ip'`). Omitting the clause is equivalent to `metric = 'l2'`, so pre-SQLR-28 catalogs round-trip unchanged. **The metric is not a query-time choice** — the graph topology depends on the metric used during INSERT (neighbour pruning is metric-specific), so a query whose `vec_distance_*` function doesn't match the index's metric falls through to brute-force rather than getting a wrong answer back from the graph. If you need both L2 and cosine probes on the same column, create two indexes.
+
+- Recall@10 ≥ 0.95 at default parameters (`M=16`, `ef_construction=200`, `ef_search=50`). The `M` / `ef_*` knobs aren't tunable from SQL yet — see Q2 of [`docs/phase-7-plan.md`](phase-7-plan.md).
+- The index is built incrementally on `INSERT`. `DELETE` / `UPDATE` mark the index `needs_rebuild`; the next save rebuilds from current rows under the same metric.
+- Persisted as a `KIND_HNSW` cell tree alongside the regular page hierarchy — open path loads the graph bit-for-bit, no algorithm runs. The metric travels through the synthesized CREATE INDEX SQL in `sqlrite_master`; no file-format bump.
+- Without an HNSW index — or with a metric mismatch — the same `ORDER BY vec_distance_… LIMIT k` query still works; it just brute-force-scans every row (Phase 7c's bounded-heap top-k optimization keeps the memory footprint to O(k)).
 
 ### FTS indexes (Phase 8)