Vdb 05062026 yb#16

Open
shaharuk-yb wants to merge 181 commits into main from vdb-05062026-yb

@shaharuk-yb


No description provided.

Caroline-an777 and others added 30 commits February 10, 2025 18:58
Signed-off-by: siqi.an <ansiqi_7777@163.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>
MariaDB introduced vector support in version 11.7, enabling MariaDB
Server to function as a relational vector database.
https://mariadb.com/kb/en/vectors/

Add support for MariaDB Server, verified against MariaDB Server
version 11.7.1:

- Support MariaDB vector search with the HNSW algorithm, including
  filtered search.
- Support index and search parameters:
   - storage_engine: InnoDB or MyISAM
   - M: M parameter in MHNSW vector indexing
   - ef_search: minimal number of result candidates to look for in the
                vector index for ORDER BY ... LIMIT N queries.
   - max_cache_size: Upper limit for one MHNSW vector index cache
- Support CLI of `vectordbbench mariadbhnsw`.
* Add TiDB backend

Signed-off-by: Wish <breezewish@outlook.com>

* Fix

Signed-off-by: Wish <breezewish@outlook.com>

* Fix

Signed-off-by: Wish <breezewish@outlook.com>

* Improve error handling

Signed-off-by: Wish <breezewish@outlook.com>

---------

Signed-off-by: Wish <breezewish@outlook.com>
* Support GPU_BRUTE_FORCE index for Milvus

Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>

* MilvusGPUBruteForceTypedDict addition

Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>

---------

Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
* feat: add hnsw support

* refactor: minor fixes

* feat: reformat code

* fix: remove sql injections, reformat code
Signed-off-by: min.tian <min.tian.cn@gmail.com>
* Add --task-label option for cli

* Fix lint issues
…#523)

* Update cli.py

* Update clickhouse.py

* Update clickhouse.py

* Update cli.py

* Update config.py

* remove space
…ch#521)

* Add --concurrency-timeout option to avoid long waits; the default is 3600s.

* Fix lint error

* Update README.md, add --concurrency-timeout option
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Ubuntu and others added 30 commits April 3, 2026 18:34
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Milvus results (16c64g, force_merge, v2.6.14):
- 1M Cohere: SQ4U+FP16 (sweep refine_k) + SQ8 (sweep ef), 8 points each
- 10M Cohere: SQ4U+FP16 + SQ8 (sweep ef), 8 points each
- Total 32 benchmark configurations

ElasticCloud and ZillizCloud results from standard benchmark runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename ElasticCloud and ZillizCloud result files from 20260209 to 20260403
and update task_label to standard_20260403 for consistency with Milvus results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update db_name/label in leaderboard_v2_streaming.json to match
leaderboard_v2.json after force_merge became the default:
- Milvus: 16c64g-sq8 -> 16c64g-sq8-force_merge
- ElasticCloud: 8c60g -> 8c60g-force_merge

This fixes the website failing to associate streaming and vector
search results due to mismatched db_name keys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace ZillizCloud-8cu-perf case_id=4/5 data with new Cardinal backend
benchmark results (level 1-9, 1M and 10M datasets, v2026.4). Remove
force_merge entries as Cardinal uses unified 4-segment architecture for 10M.

New results show significant QPS improvement:
- 1M: 13,316 QPS (was 9,704) at recall 0.938
- 10M: 7,385 QPS (was 3,957) at recall 0.938

Sort all leaderboard entries by (db_name, dataset, filter_ratio, qps DESC)
to fix line chart rendering. Remove one SQ4U 1M outlier (recall=0.84).
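
The ordering described above can be sketched in Python as a single composite sort key. This is a minimal illustration, not the script actually used on the leaderboard files: the `sort_leaderboard` helper and the sample rows are hypothetical, and only the field names (`db_name`, `dataset`, `filter_ratio`, `qps`) come from the commit message. It also assumes every entry has a comparable, non-null value for each field.

```python
def sort_leaderboard(entries):
    """Return entries ordered by (db_name, dataset, filter_ratio, qps DESC)."""
    # Negating qps sorts that one field in descending order while the
    # other three fields stay ascending.
    return sorted(
        entries,
        key=lambda e: (e["db_name"], e["dataset"], e["filter_ratio"], -e["qps"]),
    )


# Hypothetical sample rows, for illustration only.
rows = [
    {"db_name": "B", "dataset": "1M", "filter_ratio": 0.0, "qps": 100.0},
    {"db_name": "A", "dataset": "1M", "filter_ratio": 0.0, "qps": 50.0},
    {"db_name": "A", "dataset": "1M", "filter_ratio": 0.0, "qps": 75.0},
]
ordered = sort_leaderboard(rows)
```

Keeping points of the same series adjacent and monotonic in QPS is presumably what lets the line chart connect them in a sensible order.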

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Upgrade pydantic to 2.x
2. Remove results/ from .gitignore; those files need to be tracked
3. Fix the coding style in the results

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
…illiztech#751)

Populate insert_duration, optimize_duration, load_duration for all
entries in result_20260403 files. Previously only the first entry per
index had values while the rest were 0.0.

Milvus (re-measured on 2.6-opt-v2):
- 1M SQ4U: insert=129.8s, optimize=152.2s, load=282.0s
- 1M SQ8:  insert=119.5s, optimize=235.9s, load=355.4s
- 10M SQ4U/SQ8: copied from existing first-entry values

ZillizCloud (from prior build runs):
- 1M:  insert=246.7s, optimize=101.2s, load=347.9s
- 10M: insert=2450.8s, optimize=136.9s, load=2587.8s

Co-authored-by: Ubuntu <ubuntu@ip-10-15-14-123.us-west-2.compute.internal>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
…lliztech#754)

Update result_20260403_standard_zillizcloud.json to use the latest validated
build timings from recent reruns for case_id=5 (1M) and case_id=4 (10M),
including insert_duration, optimize_duration, and load_duration.

Co-authored-by: Ubuntu <ubuntu@ip-10-15-14-123.us-west-2.compute.internal>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add VectorChord support and VCHORDRQ index type 
* feat: add VectorChordRQ command to CLI 
* feat: add VectorChord support to README 
* feat: add VectorChordGraph support and configuration 
* feat: add max_scan_tuples parameter to VectorChordGraph 
* feat: enhance VectorChord with improved type safety and search functionality 
* feat: add vectorchord extension creation on connection 

Co-authored-by: edgar-p <edgar.p@kakaocorp.com>
zilliztech#760)

* fix(pgvector): normalize index_type to lowercase in _create_index to match PostgreSQL access method names

PostgreSQL pgvector extension registers index access methods in lowercase
(e.g. "hnsw", "ivfflat"), but the frontend passes IndexType.HNSW.value
which is uppercase "HNSW", causing "access method HNSW does not exist" error.

* Fix index type usage in pgvector.py

Replaced index_param['index_type'] with index_type_lower for consistency.

* Add comment marker '#'

Added the # before [FIX]
Adds a complete Apache Pinot client for VectorDBBench.

Index types: HNSW (Lucene), IVF_FLAT, IVF_PQ, IVF_ON_DISK
Metrics: L2, IP, COSINE
Filters: NumGE, StrEqual
Optional dep: pip install "vectordb-bench[pinot]"

Parallel loading: thread_safe=True — each worker thread maintains its own
row buffer and flushes to Pinot via a fresh HTTP session. Since Pinot's
ingestFromFile is synchronous (blocks until HNSW index is built, ~6 min
per 100K×768D segment), concurrent flushes across threads reduce load time
significantly vs sequential flushing.

Benchmark results:
Small dataset (OpenAI 50K, 768D, L2):
  HNSW:        798 QPS, recall=1.000
  IVF_FLAT:    800 QPS, recall=1.000
  IVF_PQ:      795 QPS, recall=1.000
  IVF_ON_DISK: 691 QPS, recall=1.000

Large dataset (Cohere 1M, 768D, COSINE):
  HNSW m=16:    74 QPS, recall=0.982

Filter benchmark (Cohere 1M, COSINE, HNSW m=32):
  1% NumGE:  71 QPS, recall=0.977
  99% NumGE: 97 QPS, recall=0.649

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…oud commands (zilliztech#761)

* fix: support self-hosted Elasticsearch via --host/--port in elasticcloud commands

ElasticCloudConfig previously required cloud_id, so the elasticcloudhnsw*
subcommands could only target Elastic Cloud. Users benchmarking self-hosted
stock Elasticsearch had no working path: tencentelasticsearch accepts
host/port but forces Tencent's vsearch index_options type, which stock ES
rejects with "Unknown vector index options type [vsearch]".

Extend ElasticCloudConfig with scheme/host/port/user fields (mutually
exclusive with cloud_id) and expose them on all four ElasticCloudHNSW*
CLI subcommands. Existing cloud_id callers are unchanged.
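
The mutual exclusion between `cloud_id` and the host-style fields could look roughly like the sketch below. It is an assumption-laden illustration, not the real `ElasticCloudConfig`: the class name `EsConnectionSketch`, the default values (`http`, `9200`, `elastic`), the `target()` helper, and the rule that exactly one of `cloud_id` or `host` must be set are all hypothetical. A plain dataclass is used here instead of the project's pydantic models to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EsConnectionSketch:
    """Hypothetical stand-in for ElasticCloudConfig; not the real class."""

    cloud_id: Optional[str] = None
    scheme: str = "http"        # assumed default
    host: Optional[str] = None
    port: int = 9200            # assumed default
    user: str = "elastic"       # assumed default

    def __post_init__(self):
        # Assumption: exactly one of cloud_id / host must be provided.
        if bool(self.cloud_id) == bool(self.host):
            raise ValueError("set exactly one of cloud_id or host")

    def target(self) -> str:
        """Describe which endpoint this config points at."""
        if self.cloud_id:
            return f"cloud_id={self.cloud_id}"
        return f"{self.scheme}://{self.host}:{self.port}"


self_hosted = EsConnectionSketch(host="localhost")
cloud = EsConnectionSketch(cloud_id="abc")
```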

Refs zilliztech#758

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: apply black formatting to elastic_cloud/config.py

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lliztech#763)

* Fix: Map "ivf_flat" to "ivfflat" for pgvector index access method

- IndexType.IVFFlat.value="IVF_FLAT" → .lower()="ivf_flat" caused SQL to fail with "access method 'ivf_flat' does not exist"
- pgvector PostgreSQL extension expects "ivfflat" (no underscore), not "ivf_flat"
- Added explicit mapping after lowercase normalization: if index_type_lower == "ivf_flat": index_type_lower = "ivfflat"

* style(pgvector): fix comment wrapping and remove commented code

---------

Co-authored-by: rnagaraju <rnagaraju@zeomega.com>
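
Taken together with the earlier lowercase fix, the normalization described in these two pgvector commits amounts to something like the following sketch. The function name `pg_access_method` is hypothetical; the variable name `index_type_lower` and the `ivf_flat` to `ivfflat` mapping are taken from the commit messages.

```python
def pg_access_method(index_type: str) -> str:
    """Map a benchmark index type value to a pgvector access method name."""
    # PostgreSQL registers pgvector access methods in lowercase.
    index_type_lower = index_type.lower()
    # pgvector spells the IVFFlat access method without an underscore.
    if index_type_lower == "ivf_flat":
        index_type_lower = "ivfflat"
    return index_type_lower


hnsw_method = pg_access_method("HNSW")
ivf_method = pg_access_method("IVF_FLAT")
```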
…lliztech#764)

For non-thread-safe DBs (e.g. PgVector), ConcurrentInsertRunner clamps
max_workers to 1, so there is always exactly one worker thread. There is
no need to deepcopy self.db per thread — the single worker can use
self.db directly via the connection already opened by task()'s
`with self.db.init():`.

The original code called deepcopy(self.db) inside _get_thread_db() after
task() had already opened a live psycopg C-extension Connection on
self.db. C-extension objects cannot be deep-copied, causing:
  TypeError: no default __reduce__ due to non-trivial __cinit__

Fix: remove the deepcopy branch entirely. All workers (thread-safe or
not) now use self.db directly; thread-safety is guaranteed for
non-thread-safe DBs by the max_workers=1 clamp.
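
The failure mode and the clamp can be reproduced in miniature with the standard library alone. This is a sketch under stated assumptions, not the project's code: `FakeDB` and `clamp_workers` are hypothetical names, and a `threading.Lock` stands in for the live psycopg connection only because it, too, cannot be deep-copied.

```python
import copy
import threading


class FakeDB:
    """Hypothetical stand-in for a non-thread-safe client such as PgVector."""

    thread_safe = False

    def __init__(self):
        self.conn = None

    def open(self):
        # A lock is used purely as an un-copyable placeholder for a live
        # C-extension connection object.
        self.conn = threading.Lock()


def clamp_workers(requested: int, thread_safe: bool) -> int:
    """Non-thread-safe clients get exactly one worker."""
    return requested if thread_safe else 1


db = FakeDB()
db.open()

# Deep-copying after the connection is open fails, mirroring the bug.
try:
    copy.deepcopy(db)
    deepcopy_failed = False
except TypeError:
    deepcopy_failed = True

# With a single worker there is nothing to copy: it can use db directly.
workers = clamp_workers(8, db.thread_safe)
```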

Also clean up stale comments in pgvector.py left over from zilliztech#760/zilliztech#763.

Adds tests/test_pgvector.py with:
- unit test that reproduces the bug (fails on original, passes on fix)
- e2e regression test via ConcurrentInsertRunner + OpenAI 50K dataset

See also: zilliztech#756

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
* Add label filtering support to pgdiskann client

* Refactor pgdiskann filtering logic

* Refactor: remove unrelated function

* style: apply black formatting to pgdiskann.py

* fix: remove trailing whitespace and fix import sorting

* docs: add comments for label naming and vector storage optimization

* Revert "docs: add comments for label naming and vector storage optimization"

This reverts commit d10b296.

---------

Co-authored-by: Eesha Faisal <eesha.faisal@emumba.com>
…ech#766)

- Migrate DB config validators to pydantic v2; list all empty fields
  instead of raising on first; consolidate via `_extra_empty_skip`.
- Surface missing client modules at config render time as
  `{DB} needs `{module}` but it is not installed.`
- Replace streamlit-autorefresh with native `@st.fragment(run_every)`
  so live progress does not block UI.
- Bump streamlit to 1.47+ (picks up streamlit#11890 fragment fix);
  switch to native `st.switch_page`, drop `streamlit_extras`.
- Migrate deprecated `use_container_width=True` to `width="stretch"`.
- Patch tornado `write_message` to consume expected `WebSocketClosedError`
  on tab-close races (streamlit#9787).
- Add contract test: each DB enum resolves config_cls/init_cls or
  raises ModuleNotFoundError.

See also: zilliztech#446

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
ConcurrentInsertRunner previously defaulted to mp.cpu_count(),
spawning one worker per CPU when load_concurrency was unset.
On high-core hosts this opens many parallel client connections,
saturating modest DBs / network paths and yielding worse
load throughput than a smaller, steadier worker count.

Cap the unset default to min(cpu_count, 4). Explicit
load_concurrency from CLI / config / submitTask still wins.
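
A minimal sketch of the new default, assuming "unset" is represented as `None` or `0`; the function name `resolve_load_concurrency` and the constant `DEFAULT_WORKER_CAP` are hypothetical and only illustrate the `min(cpu_count, 4)` rule stated above.

```python
import multiprocessing as mp
from typing import Optional

DEFAULT_WORKER_CAP = 4  # cap applied only when no explicit value is given


def resolve_load_concurrency(load_concurrency: Optional[int] = None) -> int:
    """Return the number of insert workers to start."""
    # Assumption: None or 0 means the user did not set a value.
    if load_concurrency:
        return load_concurrency
    return min(mp.cpu_count(), DEFAULT_WORKER_CAP)


explicit = resolve_load_concurrency(16)   # explicit setting still wins
defaulted = resolve_load_concurrency()    # capped default
```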

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
)

Add a new vector database backend for SeekDB, connecting via
mysql-connector-python over the MySQL wire protocol.

Key components:
- seekdb.py: VectorDB implementation with heap-organized table,
  HNSW vector index, and version-aware optimize() that calls
  dbms_index_manager.refresh() on SeekDB >= 1.3.0
- config.py: DBConfig with host/port/user/password/database and
  SeekDBHNSWConfig with m/ef_construction/ef_search parameters
- cli.py: Click command `SeekDBHNSW` for command-line benchmarks

Registration:
- Add SeekDB to the DB enum in backend/clients/__init__.py with
  lazy imports for init_cls, config_cls, and case_config_cls
- Register SeekDBHNSW CLI command in cli/vectordbbench.py
- Add seekdb optional dependency in pyproject.toml
  (pip install vectordb-bench[seekdb])

Filter support:
- NonFilter and NumGE (id >= N) filters are supported
- StrEqual (label filter) is intentionally excluded since the
  table schema only has id and embedding columns

Thread safety:
- mysql.connector is not thread-safe (thread_safe = False).
  ConcurrentInsertRunner uses max_workers=1 accordingly
- rate_runner.py handles SeekDB specially: copies the db object,
  resets the connection, and calls init() per worker thread

Co-authored-by: liuhao6741 <liuhaobupt@foxmail.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
… cosine support (zilliztech#776)

* feat(oceanbase): configurable index params, KEY partitioning, HNSW_BQ cosine support

- Add --create-index-parallel CLI option (default 16)
- Add --extra-info-max-size CLI option (default 32, set 0 to omit)
- Add --partitions CLI option for KEY partitioning (default 0, no partition)
- HNSW_BQ: remove forced L2 for cosine, now supports cosine natively
- need_normalize_cosine returns False for all index types
- pyproject.toml: add pyyaml dependency, fix packages.find to include all subpackages

* fix(oceanbase): declare thread_safe=False to prevent cursor sharing across threads

* fix: restore seekdb dependency accidentally removed
Ports the DEEP1B (Yandex Deep1B, 96-dim, 1B vectors) benchmarking support
from the older `deep1b-parquet` branch onto the latest `vdb-05062026-yb`
base, adapting to upstream's refactors (Pydantic v2 datasets, Filter-based
prepare(), list-valued test_data/gt_data, generate_normal_cases UI items).

What this adds:
- Deep1B dataset + Performance96D1B case (Dataset enum, cases, UI clusters,
  display order) with 96-dim / L2 / 1B-vector LARGE size label.
- DatasetSource.Deep1BLocal / Deep1BS3 with Deep1BReader (HDF5 -> parquet,
  percentage sampling) and Deep1BS3Reader (parallel S3 parquet downloads).
- deep1b_dataset_percentage + skip_load plumbed through DatasetManager.prepare,
  TaskConfig, CaseRunner._pre_run, and the CLI --deep1b-dataset-percentage flag.
- Larger (1M) load batch size for Deep1B; download_file helper; h5py/requests deps.
- Standalone fbin/ibin -> parquet conversion script and reference docs.

Conflict resolutions vs the new base:
- Kept upstream's ClassVar[dict] _size_label, Filter API, and list-based
  test_data/gt_data; Deep1B now extracts [field].to_list() instead of storing
  DataFrames, and _read_file keeps returning polars (scalar_labels rely on it).
- Deep1B uses with_remote_resource=False and a dedicated reader branch in
  prepare(); did not adopt old-base pgvector create_index default flips.
- Trimmed leftover debug logging in the CLI run() path.

Verified: all modified modules compile, the package imports, and
test_deep1b_integration.py passes (dataset/case/enum/type2case registration).

---
_automated · Claude Code (Opus 4.8)_
`host` may now be a comma-separated list of YugabyteDB nodes (perfservice passes
the full tserver-nodes list when load balancing is requested). Each concurrent
search worker is its own process opening one connection, so split the list and
random.choice a node per worker -- connections fan out across the cluster instead
of all hitting one node. A single host (no comma) is a no-op, so default runs are
unchanged.

Random (not least-connections): N independent worker processes have no shared
counter, so querying per-node load would herd them onto the same node; random
avoids that and is a strict improvement over one hot node.
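
The per-worker node selection described above can be sketched as follows. The helper name `pick_host` and the optional `rng` argument are hypothetical, added here only to make the example testable; the split-then-`random.choice` behaviour and the single-host no-op come from the commit message.

```python
import random


def pick_host(host, rng=None):
    """Pick one node from a comma-separated host list for this worker."""
    nodes = [h.strip() for h in host.split(",") if h.strip()]
    # A single host (no comma) is returned unchanged, so default runs
    # behave exactly as before.
    if len(nodes) <= 1:
        return host
    # Each worker process chooses independently; no shared counter is needed.
    return (rng or random).choice(nodes)


single = pick_host("10.0.0.1")
chosen = pick_host("10.0.0.1, 10.0.0.2,10.0.0.3", random.Random(0))
```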

Pipeline plumbing (load_balance flag, json_to_yaml host selection) lives in the
perf-devops repo on branch vdbbench-lb-tserver-nodes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---
_automated · Claude Code (Opus 4.8 1M)_
* pgvector: use YugabyteDB psycopg3 smart driver for connection load balancing

Alternative to the comma-host client-side approach: install psycopg-yugabytedb
(a psycopg3 fork; import stays `psycopg`) and let the YB smart driver distribute
connections across the cluster instead of doing it ourselves.

- pyproject: pgvector extra -> psycopg-yugabytedb[binary]==3.3.4.1rc1 (binary
  bundles libpq; the RC param load_balance_hosts=true needs the YB-patched libpq).
  Note: cannot coexist with upstream psycopg in the same env; Python >=3.10.
- _create_connection: drop the client-side comma-split/random pick; when
  load_balance is on, set load_balance_hosts=true (+ optional topology_keys) so
  the driver discovers nodes via yb_servers() and load-balances. `host` is just a
  bootstrap seed (single master-leader endpoint is fine).
- config.py / cli.py: add load_balance (default True) and topology_keys options.
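
As a rough illustration of the option plumbing, the sketch below only assembles connection parameters and does not import or call the driver. The function name `build_connect_kwargs` is hypothetical, and whether the smart driver expects these as keyword arguments or as part of a conninfo string is an assumption not confirmed by the commit message; the parameter names `load_balance_hosts` and `topology_keys` are the ones the message mentions. The sample `topology_keys` value is a placeholder, not a verified format.

```python
def build_connect_kwargs(host, port, load_balance=True, topology_keys=None):
    """Assemble connection parameters for the (assumed) smart-driver call."""
    kwargs = {"host": host, "port": port}
    if load_balance:
        # The seed host is only a bootstrap endpoint; the driver is
        # expected to discover the remaining nodes itself.
        kwargs["load_balance_hosts"] = "true"
        if topology_keys:
            kwargs["topology_keys"] = topology_keys
    return kwargs


# 5433 is YugabyteDB's default YSQL port; host and keys are placeholders.
balanced = build_connect_kwargs("yb-seed.example", 5433, topology_keys="placeholder-keys")
plain = build_connect_kwargs("yb-seed.example", 5433, load_balance=False)
```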

Default-on so this branch exercises the smart driver out of the box. Caveat: each
search worker is a separate process opening ONE connection via a fresh driver
instance (zero-count view), so verify empirically whether the driver's per-instance
least-connections actually fans out in this model vs. pinning to the seed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---
_automated · Claude Code (Opus 4.8 1M)_

* pgvector: install pure psycopg-yugabytedb (binary wheel not published for RC)

The [binary] extra resolves to psycopg-yugabytedb-binary==3.3.4.1rc1, which is
NOT published on PyPI (only the pure py3-none-any wheel exists) -> pip errors with
"No matching distribution". Drop the extra and install the pure wheel.

The smart-driver load-balancing logic is pure-Python (the dispatcher sits under
the pool), so a stock system libpq works -- no YB-patched libpq required. The pure
wheel just needs libpq.so.5 reachable at runtime.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---
_automated · Claude Code (Opus 4.8 1M)_