GH-46856: [C++][Python] Add binary view comparison kernels#49964
GH-46856: [C++][Python] Add binary view comparison kernels#49964Periecle wants to merge 3 commits into
Conversation
|
|
|
|
3 similar comments
|
|
|
|
|
|
There was a problem hiding this comment.
Pull request overview
This PR adds missing comparison kernel registrations for Arrow C++ compute so that binary_view and utf8_view arrays can be compared (and thus pyarrow.compute.* comparison functions work for pa.binary_view() / pa.string_view() inputs).
Changes:
- Register
(binary_view, binary_view)and(utf8_view, utf8_view)comparison kernels in C++ compute. - Extend internal kernel codegen iteration utilities to support binary-view-like arrays.
- Add C++ and Python tests covering comparisons for view types (including nulls and slicing).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| python/pyarrow/tests/test_compute.py | Adds Python regression tests ensuring view-type arrays work with the 6 comparison functions. |
| cpp/src/arrow/compute/kernels/scalar_compare.cc | Registers comparison kernels for binary_view and utf8_view. |
| cpp/src/arrow/compute/kernels/scalar_compare_test.cc | Adds C++ test coverage for array/array, sliced arrays, and array/scalar comparisons on view types. |
| cpp/src/arrow/compute/kernels/codegen_internal.h | Adds ArrayIterator support for binary-view-like arrays so generated kernels can read values correctly. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (const std::shared_ptr<DataType>& ty : {binary_view(), utf8_view()}) { | ||
| auto exec = | ||
| GenerateVarBinaryViewBase<applicator::ScalarBinaryEqualTypes, BooleanType, Op>( | ||
| *ty); | ||
| DCHECK_OK(func->AddKernel({ty, ty}, boolean(), std::move(exec))); | ||
| } |
raulcd
left a comment
There was a problem hiding this comment.
The linter CI job is currently failing, could you take a look please?
|
@raulcd I have fixed linter violations, should be all good. |
Rationale for this change
pyarrow.compute.equalfails forpa.binary_view()arrays because C++ compute has no registered comparison kernel for(binary_view, binary_view).This fixes that missing kernel path and also enables the same comparisons for
utf8_view.What changes are included in this PR?
This adds comparison kernel support for
binary_viewandutf8_view.The following functions now work for same-type inputs:
equalnot_equalgreatergreater_equallessless_equalAre these changes tested?
Added C++ tests covering:
Added Python regression tests for
pa.binary_view()andpa.string_view().Verified the same cases fail before this patch at
a0d2885b101acb439f7f79ec2237028974e74e64withArrowNotImplementedError: no kernel matching input types.Are there any user-facing changes?
pyarrow.computecomparison functions now work forpa.binary_view()andpa.string_view()arrays where they previously failed with a missing kernel error.AI Usage
Tests were generated by LLM agents along with part or PR summary
Addresses: GH-46856
Partially addresses: GH-44336