[RFC] What filtered search algorithms should DiskANN support? #1128

Merged: magdalendobson merged 24 commits into main from users/magdalen/filtered_search_rfc on Jun 10, 2026

@magdalendobson

Contributor

This PR adds an RFC on which filtered search algorithms the DiskANN repo should support.

@magdalendobson magdalendobson marked this pull request as ready for review June 2, 2026 21:01
@magdalendobson magdalendobson requested review from a team and Copilot June 2, 2026 21:01

Copilot AI left a comment

Contributor

Pull request overview

Adds an RFC documenting an empirical evaluation of DiskANN filtered-search algorithm variants, with a recommendation for which algorithms the repo should support going forward.

Changes:

  • Introduces a new RFC describing existing and proposed filtered-search algorithms (inline, beta, multi-hop, two-queue, adaptive-L).
  • Records benchmark results on two datasets (Caselaw and YFCC) with accompanying plots.
  • Proposes a path forward: add inline (optionally adaptive-L), retain multi-hop, deprecate beta, and close two-queue.

Comment thread rfcs/01128-filtered-algorithms.md
@codecov-commenter

codecov-commenter commented Jun 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.45%. Comparing base (68cc3c4) to head (0d4eadb).
⚠️ Report is 8 commits behind head on main.

@@            Coverage Diff             @@
##             main    #1128      +/-   ##
==========================================
+ Coverage   88.87%   89.45%   +0.57%     
==========================================
  Files         485      484       -1     
  Lines       92112    91407     -705     
==========================================
- Hits        81868    81767     -101     
+ Misses      10244     9640     -604     
Flag Coverage Δ
miri 89.45% <ø> (+0.57%) ⬆️
unittests 89.10% <ø> (+0.57%) ⬆️

47 files have indirect coverage changes.

Magdalen Manohar and others added 2 commits June 2, 2026 21:38
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
magdalendobson and others added 6 commits June 2, 2026 17:39
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…soft/DiskANN into users/magdalen/filtered_search_rfc
@harsha-simhadri

Contributor

Could this be in the wiki? The plots are a quite large addition to the repo.

@magdalendobson

Contributor Author

> Could this be in the wiki? The plots are a quite large addition to the repo.

Do we want an RFC that points to benchmarks in the wiki then? Because folks were pretty clear that they wanted an RFC on filtered search.

@magdalendobson magdalendobson changed the title from "[RFC]" to "[RFC] What filtered search algorithms should DiskANN support?" Jun 3, 2026
@magdalendobson

Contributor Author

> > Could this be in the wiki? The plots are a quite large addition to the repo.
>
> Do we want an RFC that points to benchmarks in the wiki then? Because folks were pretty clear that they wanted an RFC on filtered search.

I have moved the benchmark discussion section to the DiskANN wiki: https://github.com/microsoft/DiskANN/wiki/Evaluation-of-Filtered-Search-Algorithms

@arrayka arrayka left a comment

Contributor

LGTM. Approving, assuming the QPS and latency comparison between Adaptive L and the beta filter for disk search scenarios looks promising.

@magdalendobson magdalendobson linked an issue Jun 5, 2026 that may be closed by this pull request

@suri-kumkaran suri-kumkaran left a comment

Contributor

Thanks for the great work. The RFC recommendations around deprecating the beta filter, adding inline filtered search, and adding adaptive L search make sense to me.

One follow-up implementation question is how we should expose these similar and overlapping algorithms at the right level of granularity, so clients can reuse them through providers, with or without their own query planners, depending on their use cases.

@magdalendobson magdalendobson merged commit a5c745b into main Jun 10, 2026
22 checks passed
@magdalendobson magdalendobson deleted the users/magdalen/filtered_search_rfc branch June 10, 2026 14:34
magdalendobson added a commit that referenced this pull request Jun 11, 2026
This PR implements the recommendation in the [filtered search
RFC](#1128) to implement inline
filtering with the adaptive L method as an optional addition.

---------

Co-authored-by: qingcha chen <qinchen@microsoft.com>
Co-authored-by: Magdalen Manohar <magdalen@magdalen.localdomain>
Co-authored-by: Magdalen Manohar <mmanohar@microsoft.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mark Hildebrand <hildebrandmw@gmail.com>
Co-authored-by: Mark Hildebrand <mhildebrand@microsoft.com>

Linked issue (may be closed by this pull request): Post evaluation results and RFC with recommendations

10 participants