Vdb 05062026 yb#16
Open
shaharuk-yb wants to merge 181 commits into
Signed-off-by: siqi.an <ansiqi_7777@163.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com> Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>
MariaDB introduced vector support in version 11.7, enabling MariaDB Server to function as a relational vector database. https://mariadb.com/kb/en/vectors/

Now add support for MariaDB Server, verified against MariaDB Server version 11.7.1:
- Support MariaDB vector search with the HNSW algorithm, including filter search.
- Support index and search parameters:
  - storage_engine: InnoDB or MyISAM
  - M: M parameter in MHNSW vector indexing
  - ef_search: minimal number of result candidates to look for in the vector index for ORDER BY ... LIMIT N queries
  - max_cache_size: upper limit for one MHNSW vector index cache
- Support CLI of `vectordbbench mariadbhnsw`.
* Add TiDB backend
  Signed-off-by: Wish <breezewish@outlook.com>
* Fix
  Signed-off-by: Wish <breezewish@outlook.com>
* Fix
  Signed-off-by: Wish <breezewish@outlook.com>
* Improve error handling
  Signed-off-by: Wish <breezewish@outlook.com>

---------

Signed-off-by: Wish <breezewish@outlook.com>
* Support GPU_BRUTE_FORCE index for Milvus
  Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
* MilvusGPUBruteForceTypedDict addition
  Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>

---------

Signed-off-by: Rachit Chaudhary <rachit.chaudhary@outlook.com>
Co-authored-by: Signed-off-by: Rachit Chaudhary - r0c0axe <Rachit.Chaudhary@walmart.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Signed-off-by: min.tian <min.tian.cn@gmail.com>
* feat: add hnsw support
* refactor: minor fixes
* feat: reformat code
* fix: remove sql injections, reformat code
Signed-off-by: min.tian <min.tian.cn@gmail.com>
* Add --task-label option for cli
* Fix lint issues

…#523)
* Update cli.py
* Update clickhouse.py
* Update clickhouse.py
* Update cli.py
* Update config.py
* remove space

…ch#521)
* Add --concurrency-timeout option to avoid waiting too long; the default is 3600s.
* Fix lint error
* Update README.md, add --concurrency-timeout option
Signed-off-by: min.tian <min.tian.cn@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Milvus results (16c64g, force_merge, v2.6.14):
- 1M Cohere: SQ4U+FP16 (sweep refine_k) + SQ8 (sweep ef), 8 points each
- 10M Cohere: SQ4U+FP16 + SQ8 (sweep ef), 8 points each
- Total 32 benchmark configurations

ElasticCloud and ZillizCloud results from standard benchmark runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename ElasticCloud and ZillizCloud result files from 20260209 to 20260403 and update task_label to standard_20260403 for consistency with Milvus results. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update db_name/label in leaderboard_v2_streaming.json to match leaderboard_v2.json after force_merge became the default:
- Milvus: 16c64g-sq8 -> 16c64g-sq8-force_merge
- ElasticCloud: 8c60g -> 8c60g-force_merge

This fixes the website failing to associate streaming and vector search results due to mismatched db_name keys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace ZillizCloud-8cu-perf case_id=4/5 data with new Cardinal backend benchmark results (level 1-9, 1M and 10M datasets, v2026.4). Remove force_merge entries as Cardinal uses unified 4-segment architecture for 10M.

New results show significant QPS improvement:
- 1M: 13,316 QPS (was 9,704) at recall 0.938
- 10M: 7,385 QPS (was 3,957) at recall 0.938

Sort all leaderboard entries by (db_name, dataset, filter_ratio, qps DESC) to fix line chart rendering. Remove one SQ4U 1M outlier (recall=0.84).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
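The sort order described above could be reproduced roughly as follows. This is a minimal sketch, not the script used for the commit: it assumes each leaderboard entry is a plain dict whose keys are named exactly `db_name`, `dataset`, `filter_ratio`, and `qps`, and `sort_leaderboard` is a hypothetical helper name.

```python
from typing import Any


def sort_leaderboard(entries: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return entries ordered by (db_name, dataset, filter_ratio, qps DESC).

    Assumption: every entry carries these four keys and qps is numeric,
    so negating it gives a descending order inside each group.
    """
    return sorted(
        entries,
        key=lambda e: (e["db_name"], e["dataset"], e["filter_ratio"], -e["qps"]),
    )


if __name__ == "__main__":
    # Made-up rows for illustration only; not benchmark data.
    rows = [
        {"db_name": "B", "dataset": "1M", "filter_ratio": 0.0, "qps": 10.0},
        {"db_name": "A", "dataset": "1M", "filter_ratio": 0.0, "qps": 5.0},
        {"db_name": "B", "dataset": "1M", "filter_ratio": 0.0, "qps": 30.0},
    ]
    for row in sort_leaderboard(rows):
        print(row["db_name"], row["qps"])
```

Keeping points of one series adjacent and monotonically ordered is what lets a line chart connect them without zig-zagging.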
1. Upgrade pydantic to 2.x
2. Remove results/ from .gitignore; those files need to be tracked
3. Fix the coding style in the results

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
…illiztech#751)

Populate insert_duration, optimize_duration, load_duration for all entries in result_20260403 files. Previously only the first entry per index had values while the rest were 0.0.

Milvus (re-measured on 2.6-opt-v2):
- 1M SQ4U: insert=129.8s, optimize=152.2s, load=282.0s
- 1M SQ8: insert=119.5s, optimize=235.9s, load=355.4s
- 10M SQ4U/SQ8: copied from existing first-entry values

ZillizCloud (from prior build runs):
- 1M: insert=246.7s, optimize=101.2s, load=347.9s
- 10M: insert=2450.8s, optimize=136.9s, load=2587.8s

Co-authored-by: Ubuntu <ubuntu@ip-10-15-14-123.us-west-2.compute.internal>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
…lliztech#754) Update result_20260403_standard_zillizcloud.json to use the latest validated build timings from recent reruns for case_id=5 (1M) and case_id=4 (10M), including insert_duration, optimize_duration, and load_duration. Co-authored-by: Ubuntu <ubuntu@ip-10-15-14-123.us-west-2.compute.internal> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add VectorChord support and VCHORDRQ index type
* feat: add VectorChordRQ command to CLI
* feat: add VectorChord support to README
* feat: add VectorChordGraph support and configuration
* feat: add max_scan_tuples parameter to VectorChordGraph
* feat: enhance VectorChord with improved type safety and search functionality
* feat: add vectorchord extension creation on connection

Co-authored-by: edgar-p <edgar.p@kakaocorp.com>
zilliztech#760)

* fix(pgvector): normalize index_type to lowercase in _create_index to match PostgreSQL access method names

  The PostgreSQL pgvector extension registers index access methods in lowercase (e.g. "hnsw", "ivfflat"), but the frontend passes IndexType.HNSW.value, which is uppercase "HNSW", causing an "access method HNSW does not exist" error.

* Fix index type usage in pgvector.py

  Replaced index_param['index_type'] with index_type_lower for consistency.

* add comment sign '#'

  Added the # before [FIX].
Adds a complete Apache Pinot client for VectorDBBench.

- Index types: HNSW (Lucene), IVF_FLAT, IVF_PQ, IVF_ON_DISK
- Metrics: L2, IP, COSINE
- Filters: NumGE, StrEqual
- Optional dep: pip install "vectordb-bench[pinot]"

Parallel loading: thread_safe=True. Each worker thread maintains its own row buffer and flushes to Pinot via a fresh HTTP session. Since Pinot's ingestFromFile is synchronous (blocks until the HNSW index is built, ~6 min per 100K×768D segment), concurrent flushes across threads reduce load time significantly vs sequential flushing.

Benchmark results:

Small dataset (OpenAI 50K, 768D, L2):
- HNSW: 798 QPS, recall=1.000
- IVF_FLAT: 800 QPS, recall=1.000
- IVF_PQ: 795 QPS, recall=1.000
- IVF_ON_DISK: 691 QPS, recall=1.000

Large dataset (Cohere 1M, 768D, COSINE):
- HNSW m=16: 74 QPS, recall=0.982

Filter benchmark (Cohere 1M, COSINE, HNSW m=32):
- 1% NumGE: 71 QPS, recall=0.977
- 99% NumGE: 97 QPS, recall=0.649

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…oud commands (zilliztech#761)

* fix: support self-hosted Elasticsearch via --host/--port in elasticcloud commands

  ElasticCloudConfig previously required cloud_id, so the elasticcloudhnsw* subcommands could only target Elastic Cloud. Users benchmarking self-hosted stock Elasticsearch had no working path: tencentelasticsearch accepts host/port but forces Tencent's vsearch index_options type, which stock ES rejects with "Unknown vector index options type [vsearch]".

  Extend ElasticCloudConfig with scheme/host/port/user fields (mutually exclusive with cloud_id) and expose them on all four ElasticCloudHNSW* CLI subcommands. Existing cloud_id callers are unchanged.

  Refs zilliztech#758

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: apply black formatting to elastic_cloud/config.py

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
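To illustrate the "mutually exclusive with cloud_id" rule, here is a rough stand-alone sketch. It is not the PR's ElasticCloudConfig: `EndpointConfig` is a hypothetical stdlib dataclass, the default values for scheme/port/user are assumptions, and the real class is presumably a pydantic model with more fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndpointConfig:
    """Hypothetical stand-in: either a cloud_id or a host, never both."""

    cloud_id: Optional[str] = None
    scheme: str = "http"          # assumed default
    host: Optional[str] = None
    port: int = 9200              # assumed default
    user: str = "elastic"         # assumed default

    def __post_init__(self) -> None:
        # Exactly one of the two addressing modes must be provided.
        if bool(self.cloud_id) == bool(self.host):
            raise ValueError("set exactly one of cloud_id or host")

    def target(self) -> dict:
        """Return the addressing part of the client keyword arguments."""
        if self.cloud_id:
            return {"cloud_id": self.cloud_id}
        return {"hosts": [f"{self.scheme}://{self.host}:{self.port}"]}


if __name__ == "__main__":
    print(EndpointConfig(host="localhost").target())
```

Validating at construction time means a config that names both a cloud deployment and a self-hosted node fails before any connection is attempted.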
…lliztech#763)

* Fix: Map "ivf_flat" to "ivfflat" for pgvector index access method
  - IndexType.IVFFlat.value="IVF_FLAT" → .lower()="ivf_flat" caused SQL to fail with "access method 'ivf_flat' does not exist"
  - The pgvector PostgreSQL extension expects "ivfflat" (no underscore), not "ivf_flat"
  - Added explicit mapping after lowercase normalization: if index_type_lower == "ivf_flat": index_type_lower = "ivfflat"
* style(pgvector): fix comment wrapping and remove commented code

---------

Co-authored-by: rnagaraju <rnagaraju@zeomega.com>
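The normalization described in this commit and in zilliztech#760 amounts to "lowercase, then apply a small alias table". A minimal sketch, assuming a hypothetical helper name `pg_access_method` (the PR itself does this inline in `_create_index`):

```python
# Alias table for names whose lowercase form still differs from the
# access method name that pgvector registers.
_ACCESS_METHOD_ALIASES = {"ivf_flat": "ivfflat"}


def pg_access_method(index_type: str) -> str:
    """Map an enum value such as "HNSW" or "IVF_FLAT" to a pgvector access method."""
    lowered = index_type.lower()
    return _ACCESS_METHOD_ALIASES.get(lowered, lowered)


if __name__ == "__main__":
    for value in ("HNSW", "IVF_FLAT"):
        print(value, "->", pg_access_method(value))
```

A lookup table keeps any further mismatches a one-line addition instead of another `if` branch.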
…lliztech#764)

For non-thread-safe DBs (e.g. PgVector), ConcurrentInsertRunner clamps max_workers to 1, so there is always exactly one worker thread. There is no need to deepcopy self.db per thread: the single worker can use self.db directly via the connection already opened by task()'s `with self.db.init():`.

The original code called deepcopy(self.db) inside _get_thread_db() after task() had already opened a live psycopg C-extension Connection on self.db. C-extension objects cannot be deep-copied, causing:

    TypeError: no default __reduce__ due to non-trivial __cinit__

Fix: remove the deepcopy branch entirely. All workers (thread-safe or not) now use self.db directly; thread-safety is guaranteed for non-thread-safe DBs by the max_workers=1 clamp.

Also clean up stale comments in pgvector.py left over from zilliztech#760/zilliztech#763.

Adds tests/test_pgvector.py with:
- unit test that reproduces the bug (fails on original, passes on fix)
- e2e regression test via ConcurrentInsertRunner + OpenAI 50K dataset

See also: zilliztech#756

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
* Add label filtering support to pgdiskann client
* Refactor pgdiskann filtering logic
* Refactor: remove unrelated function
* style: apply black formatting to pgdiskann.py
* fix: remove trailing whitespace and fix import sorting
* docs: add comments for label naming and vector storage optimization
* Revert "docs: add comments for label naming and vector storage optimization"

  This reverts commit d10b296.

---------

Co-authored-by: Eesha Faisal <eesha.faisal@emumba.com>
…ech#766)

- Migrate DB config validators to pydantic v2; list all empty fields instead of raising on first; consolidate via `_extra_empty_skip`.
- Surface missing client modules at config render time as `{DB} needs `{module}` but it is not installed.`
- Replace streamlit-autorefresh with native `@st.fragment(run_every)` so live progress does not block UI.
- Bump streamlit to 1.47+ (picks up streamlit#11890 fragment fix); switch to native `st.switch_page`, drop `streamlit_extras`.
- Migrate deprecated `use_container_width=True` to `width="stretch"`.
- Patch tornado `write_message` to consume expected `WebSocketClosedError` on tab-close races (streamlit#9787).
- Add contract test: each DB enum resolves config_cls/init_cls or raises ModuleNotFoundError.

See also: zilliztech#446

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
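The "list all empty fields instead of raising on first" behavior can be sketched without pydantic. The helper names below are hypothetical, the `skip` argument only plays the role that `_extra_empty_skip` is described as playing, and the error wording is an assumption rather than the PR's message.

```python
from typing import Any, Iterable


def find_empty_fields(values: dict[str, Any], skip: Iterable[str] = ()) -> list[str]:
    """Collect every field whose value is an empty or whitespace-only string."""
    skipped = set(skip)
    return [
        name
        for name, value in values.items()
        if name not in skipped and isinstance(value, str) and not value.strip()
    ]


def validate_not_empty(values: dict[str, Any], skip: Iterable[str] = ()) -> dict[str, Any]:
    """Raise one error that names all empty fields, not just the first one."""
    empty = find_empty_fields(values, skip)
    if empty:
        raise ValueError(f"Empty fields: {', '.join(empty)}")
    return values


if __name__ == "__main__":
    try:
        validate_not_empty({"host": "", "user": " ", "db_label": ""}, skip=["db_label"])
    except ValueError as err:
        print(err)
```

Reporting every problem in one pass saves the user from fixing one field, resubmitting, and only then learning about the next.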
ConcurrentInsertRunner previously defaulted to mp.cpu_count(), spawning one worker per CPU when load_concurrency was unset. On high-core hosts this opens many parallel client connections, saturating modest DBs / network paths and yielding worse load throughput than a smaller, steadier worker count.

Cap the unset default to min(cpu_count, 4). Explicit load_concurrency from CLI / config / submitTask still wins.

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
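The resolution rule reads roughly like the sketch below. `resolve_load_concurrency` is a hypothetical function, not the runner's actual code, and treating both `None` and `0` as "unset" is an assumption.

```python
import multiprocessing as mp
from typing import Optional

DEFAULT_WORKER_CAP = 4  # cap applied only when no explicit value is given


def resolve_load_concurrency(
    load_concurrency: Optional[int] = None,
    cpu_count: Optional[int] = None,
) -> int:
    """Pick the number of insert workers.

    An explicit positive load_concurrency always wins; otherwise fall back
    to min(cpu_count, DEFAULT_WORKER_CAP). Assumption: None or 0 means unset.
    """
    if load_concurrency:
        return load_concurrency
    cpus = cpu_count if cpu_count is not None else mp.cpu_count()
    return min(cpus, DEFAULT_WORKER_CAP)


if __name__ == "__main__":
    print(resolve_load_concurrency(cpu_count=64))                       # capped default
    print(resolve_load_concurrency(load_concurrency=16, cpu_count=64))  # explicit value
```

The optional `cpu_count` parameter exists only so the rule can be exercised deterministically on any machine.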
)

Add a new vector database backend for SeekDB, connecting via mysql-connector-python over the MySQL wire protocol.

Key components:
- seekdb.py: VectorDB implementation with heap-organized table, HNSW vector index, and version-aware optimize() that calls dbms_index_manager.refresh() on SeekDB >= 1.3.0
- config.py: DBConfig with host/port/user/password/database and SeekDBHNSWConfig with m/ef_construction/ef_search parameters
- cli.py: Click command `SeekDBHNSW` for command-line benchmarks

Registration:
- Add SeekDB to the DB enum in backend/clients/__init__.py with lazy imports for init_cls, config_cls, and case_config_cls
- Register SeekDBHNSW CLI command in cli/vectordbbench.py
- Add seekdb optional dependency in pyproject.toml (pip install vectordb-bench[seekdb])

Filter support:
- NonFilter and NumGE (id >= N) filters are supported
- StrEqual (label filter) is intentionally excluded since the table schema only has id and embedding columns

Thread safety:
- mysql.connector is not thread-safe (thread_safe = False). ConcurrentInsertRunner uses max_workers=1 accordingly
- rate_runner.py handles SeekDB specially: copies the db object, resets the connection, and calls init() per worker thread

Co-authored-by: liuhao6741 <liuhaobupt@foxmail.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
… cosine support (zilliztech#776)

* feat(oceanbase): configurable index params, KEY partitioning, HNSW_BQ cosine support
  - Add --create-index-parallel CLI option (default 16)
  - Add --extra-info-max-size CLI option (default 32, set 0 to omit)
  - Add --partitions CLI option for KEY partitioning (default 0, no partition)
  - HNSW_BQ: remove forced L2 for cosine, now supports cosine natively
  - need_normalize_cosine returns False for all index types
  - pyproject.toml: add pyyaml dependency, fix packages.find to include all subpackages
* fix(oceanbase): declare thread_safe=False to prevent cursor sharing across threads
* fix: restore seekdb dependency accidentally removed
Ports the DEEP1B (Yandex Deep1B, 96-dim, 1B vectors) benchmarking support from the older `deep1b-parquet` branch onto the latest `vdb-05062026-yb` base, adapting to upstream's refactors (Pydantic v2 datasets, Filter-based prepare(), list-valued test_data/gt_data, generate_normal_cases UI items).

What this adds:
- Deep1B dataset + Performance96D1B case (Dataset enum, cases, UI clusters, display order) with 96-dim / L2 / 1B-vector LARGE size label.
- DatasetSource.Deep1BLocal / Deep1BS3 with Deep1BReader (HDF5 -> parquet, percentage sampling) and Deep1BS3Reader (parallel S3 parquet downloads).
- deep1b_dataset_percentage + skip_load plumbed through DatasetManager.prepare, TaskConfig, CaseRunner._pre_run, and the CLI --deep1b-dataset-percentage flag.
- Larger (1M) load batch size for Deep1B; download_file helper; h5py/requests deps.
- Standalone fbin/ibin -> parquet conversion script and reference docs.

Conflict resolutions vs the new base:
- Kept upstream's ClassVar[dict] _size_label, Filter API, and list-based test_data/gt_data; Deep1B now extracts [field].to_list() instead of storing DataFrames, and _read_file keeps returning polars (scalar_labels rely on it).
- Deep1B uses with_remote_resource=False and a dedicated reader branch in prepare(); did not adopt old-base pgvector create_index default flips.
- Trimmed leftover debug logging in the CLI run() path.

Verified: all modified modules compile, the package imports, and test_deep1b_integration.py passes (dataset/case/enum/type2case registration).

---
_automated · Claude Code (Opus 4.8)_
`host` may now be a comma-separated list of YugabyteDB nodes (perfservice passes the full tserver-nodes list when load balancing is requested). Each concurrent search worker is its own process opening one connection, so split the list and random.choice a node per worker -- connections fan out across the cluster instead of all hitting one node. A single host (no comma) is a no-op, so default runs are unchanged.

Random (not least-connections): N independent worker processes have no shared counter, so querying per-node load would herd them onto the same node; random avoids that and is a strict improvement over one hot node.

Pipeline plumbing (load_balance flag, json_to_yaml host selection) lives in the perf-devops repo on branch vdbbench-lb-tserver-nodes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---
_automated · Claude Code (Opus 4.8 1M)_
* pgvector: use YugabyteDB psycopg3 smart driver for connection load balancing

  Alternative to the comma-host client-side approach: install psycopg-yugabytedb (a psycopg3 fork; import stays `psycopg`) and let the YB smart driver distribute connections across the cluster instead of doing it ourselves.

  - pyproject: pgvector extra -> psycopg-yugabytedb[binary]==3.3.4.1rc1 (binary bundles libpq; the RC param load_balance_hosts=true needs the YB-patched libpq). Note: cannot coexist with upstream psycopg in the same env; Python >=3.10.
  - _create_connection: drop the client-side comma-split/random pick; when load_balance is on, set load_balance_hosts=true (+ optional topology_keys) so the driver discovers nodes via yb_servers() and load-balances. `host` is just a bootstrap seed (single master-leader endpoint is fine).
  - config.py / cli.py: add load_balance (default True) and topology_keys options. Default-on so this branch exercises the smart driver out of the box.

  Caveat: each search worker is a separate process opening ONE connection via a fresh driver instance (zero-count view), so verify empirically whether the driver's per-instance least-connections actually fans out in this model vs. pinning to the seed.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

  ---
  _automated · Claude Code (Opus 4.8 1M)_

* pgvector: install pure psycopg-yugabytedb (binary wheel not published for RC)

  The [binary] extra resolves to psycopg-yugabytedb-binary==3.3.4.1rc1, which is NOT published on PyPI (only the pure py3-none-any wheel exists) -> pip errors with "No matching distribution". Drop the extra and install the pure wheel.

  The smart-driver load-balancing logic is pure-Python (the dispatcher sits under the pool), so a stock system libpq works -- no YB-patched libpq required. The pure wheel just needs libpq.so.5 reachable at runtime.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

  ---
  _automated · Claude Code (Opus 4.8 1M)_
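How the load_balance and topology_keys options might translate into connection parameters is sketched below. `build_connect_kwargs` is a hypothetical helper, not the PR's `_create_connection`; passing the flag as the string "true" and the example topology value are assumptions, and nothing here calls the driver itself.

```python
from typing import Any, Optional


def build_connect_kwargs(
    base: dict[str, Any],
    load_balance: bool = True,
    topology_keys: Optional[str] = None,
) -> dict[str, Any]:
    """Return connection kwargs with the smart-driver options added.

    Assumption: the driver reads load_balance_hosts / topology_keys as
    plain connection parameters, as the commit message describes.
    """
    kwargs = dict(base)  # never mutate the caller's dict
    if load_balance:
        kwargs["load_balance_hosts"] = "true"
        if topology_keys:
            kwargs["topology_keys"] = topology_keys
    return kwargs


if __name__ == "__main__":
    # Illustrative values only.
    seed = {"host": "yb-seed.example", "port": 5433, "dbname": "yugabyte"}
    print(build_connect_kwargs(seed, topology_keys="cloud1.region1.zone1"))
    print(build_connect_kwargs(seed, load_balance=False))
```

With load_balance off the base parameters pass through untouched, which keeps a single-node run identical to a plain connection.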
No description provided.