Handle PlanarConfiguration=2 across CPU + GPU multi-band reads#1532
Merged
brendancol merged 3 commits intoxarray-contrib:mainfrom May 9, 2026
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes GeoTIFF multi-band read correctness for PlanarConfiguration=2 (separate planar) across GPU tiled reads, and aligns the GPU stripped fallback’s dimension handling with the CPU reader so multi-band stripped outputs consistently return (y, x, band).
Changes:
- Add a new GPU tiled read branch for planar=2 that decodes per-band tile slabs and stacks them into
(H, W, samples). - Fix GPU stripped fallback dims/coords so 3-D CPU reads return
('y', 'x', 'band')rather than forcing 2-D dims. - Add comprehensive tests covering planar configuration × layout × band count × dtype across CPU and GPU backends.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
xrspatial/geotiff/__init__.py |
Adds planar=2 handling in read_geotiff_gpu, introduces a per-band tile decode helper, and fixes stripped fallback dims. |
xrspatial/geotiff/tests/test_planar_multiband.py |
New test module validating CPU/GPU multi-band correctness and dims across planar/layout combinations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+1400
to
+1409
| def _gpu_decode_single_band_tiles( | ||
| source, offsets, byte_counts, | ||
| tw, th, width, height, | ||
| compression, predictor, file_dtype, | ||
| *, | ||
| byte_order: str, | ||
| gpu: str, | ||
| overview_level, | ||
| cupy, | ||
| ): |
brendancol
added a commit
to brendancol/xarray-spatial
that referenced
this pull request
May 9, 2026
Seven findings, all acted on: - Sparse-tile bypass: planar=2 branch ran before the has_sparse_tile check, so a planar=2 file with sparse tiles would hit the GPU path that doesn't handle sparse blocks. Add `not has_sparse_tile` to the planar=2 gate so sparse always routes to the CPU reader. - Strict offset-count check: planar=2 required len(offsets) == tiles_per_band * samples, but validate_tile_layout only requires >=. Files with extra trailing entries that pass the CPU reader were rejected here. Relax to a minimum-length check and slice the first tiles_per_band * samples entries. - Auto-mode fallback semantics: stage-2 GPU failure in _gpu_decode_single_band_tiles raised unconditionally instead of letting the caller fall back to a whole-image CPU read like the chunky path does. Helper now returns None on stage-2 auto failure; caller catches the signal, runs read_to_array + cupy.asarray, and matches the chunky path's gpu='auto' contract. Strict mode still re-raises. - Per-band file re-reads: the helper called src.read_all() for each band's stage-2 fallback, doing N redundant full-file reads on a planar=2 image with N bands. Caller now does read_all() once before the band loop and passes the bytes via a new shared_data parameter. - Unused params: drop overview_level and cupy from the helper signature; neither was referenced. - Two assert statements for runtime shape validation replaced with explicit `if ...: raise RuntimeError(...)` so the checks survive python -O. 53 planar tests + 303 multi/band/planar/strip/tile/sparse regression tests still pass.
Two related multi-band accuracy bugs from the geotiff audit: A2 (HIGH): GPU tiled reads of planar=2 files (separate band planes) returned wrong pixel values. gpu_decode_tiles iterates a single TileOffsets list with bytes_per_pixel = itemsize * samples and assumes chunky interleave. For planar=2, each band has its own contiguous run of tiles in TileOffsets, so the kernel read across band boundaries and produced garbage. read_geotiff_gpu now detects planar==2 and decodes each band's tile slab as a single-band image, then stacks the results into (H, W, samples). The chunky planar=1 path is unchanged. A3 (MEDIUM): The GPU stripped multi-band fallback hardcoded dims=['y','x'] even for 3-D CPU reads, mirroring the pre-xarray-contrib#1525 bug. The same fix lands here so the planar=2 stripped tests pass; this matches the change in PR xarray-contrib#1525 verbatim. Both paths assert the assembled shape equals (H, W, samples) before returning, so a future kernel regression surfaces immediately rather than producing garbage. Adds xrspatial/geotiff/tests/test_planar_multiband.py covering 53 combinations: planar in {separate, contig} x layout in {tiled, stripped} x bands in {2, 3, 4} x dtype in {uint8, uint16} x backend in {CPU, GPU}, plus single-band sanity checks and an explicit A3 repro. GPU tests skip cleanly when CUDA is unavailable. Seeds are fixed. Refs audit issues A2, A3.
Seven findings, all acted on: - Sparse-tile bypass: planar=2 branch ran before the has_sparse_tile check, so a planar=2 file with sparse tiles would hit the GPU path that doesn't handle sparse blocks. Add `not has_sparse_tile` to the planar=2 gate so sparse always routes to the CPU reader. - Strict offset-count check: planar=2 required len(offsets) == tiles_per_band * samples, but validate_tile_layout only requires >=. Files with extra trailing entries that pass the CPU reader were rejected here. Relax to a minimum-length check and slice the first tiles_per_band * samples entries. - Auto-mode fallback semantics: stage-2 GPU failure in _gpu_decode_single_band_tiles raised unconditionally instead of letting the caller fall back to a whole-image CPU read like the chunky path does. Helper now returns None on stage-2 auto failure; caller catches the signal, runs read_to_array + cupy.asarray, and matches the chunky path's gpu='auto' contract. Strict mode still re-raises. - Per-band file re-reads: the helper called src.read_all() for each band's stage-2 fallback, doing N redundant full-file reads on a planar=2 image with N bands. Caller now does read_all() once before the band loop and passes the bytes via a new shared_data parameter. - Unused params: drop overview_level and cupy from the helper signature; neither was referenced. - Two assert statements for runtime shape validation replaced with explicit `if ...: raise RuntimeError(...)` so the checks survive python -O. 53 planar tests + 303 multi/band/planar/strip/tile/sparse regression tests still pass.
The previous fix did read_all() unconditionally before the band loop to amortise a stage-2 fallback across bands. When every band's GDS path succeeds, those bytes are never used. Replace the eager read_all() with a closure that materialises the buffer on first call and caches it, so: - Every-band-GDS-succeeds case: 0 read_all() calls (was 1). - Any-band-falls-back case: exactly 1 read_all() call regardless of how many bands fall back (was N). Same Copilot-finding-xarray-contrib#4 fix, just lazier.
67bb8c2 to
5560775
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two related multi-band accuracy bugs from the geotiff audit:
A2 (HIGH): GPU tiled reads of planar=2 files returned wrong pixel values.
gpu_decode_tiles/_assemble_tiles_kerneliterate a singleTileOffsetslist with
bytes_per_pixel = itemsize * samplesand assume chunky interleave.For
PlanarConfiguration=2, each band has its own contiguous run of tiles inTileOffsets, so the kernel read across band boundaries.A3 (MEDIUM): The GPU stripped multi-band fallback hardcoded
dims=['y','x']on a 3-D CPU read, the same regression that PR #1525 addresses for the stripped
path. Folded in here so the planar=2 stripped tests pass without depending on
PR #1525 merging first.
Shared root cause: neither path branched on
ifd.planar_config.Fix
read_geotiff_gpu: whenifd.planar_config == 2 and samples > 1, sliceTileOffsets/TileByteCountsper band, decode each slab as a single-bandtile sequence, then
cupy.stack(..., axis=2)into(H, W, samples). Chunky(
planar=1) goes through the existing single-pass kernel unchanged.read_geotiff_gpustripped fallback: pick('y', 'x', 'band')with a bandcoord when the CPU array is 3-D (mirrors PR Fix read_geotiff_gpu dim mismatch on stripped multi-band TIFFs #1525).
(H, W, samples)before returning.Diff is scoped to
xrspatial/geotiff/__init__.py-- the GPU kernels are nottouched, so other GPU codecs and the chunky/predictor paths see no behaviour
change.
Paths that now diverge correctly
gpu_decode_tilesTest plan
pytest xrspatial/geotiff/tests/test_planar_multiband.py-- 53 passed(planar in {separate, contig} x layout in {tiled, stripped} x bands in {2,3,4}
x dtype in {uint8, uint16} x backend in {CPU, GPU}, plus single-band sanity
and an explicit A3 repro). Seeds fixed; GPU tests skip without CUDA.
pytest xrspatial/geotiff/tests/ -k "multi or band or planar or strip or tile"-- 303 passedpytest xrspatial/geotiff/tests/test_predictor_multisample.py-- 33 passedpytest xrspatial/geotiff/tests/test_gpu_strict_fallback_1516.py-- 5 passedtest_gpu_stripped_multiband.pycases verified manuallyagainst this branch (3-band uint8, 2-band uint16, single-band sanity)
pytest xrspatial/geotiff/tests/-- 837 passed (3 pre-existingmatplotlib-deepcopy failures in test_features.py unrelated to this change)