
[Bug](stream-load) Stream Load crashes BE with strict_mode and all-expression columns #64836

Description

@re20052

Search before asking

  • I had searched in the issues and found no similar issues.

Version

Apache Doris 4.0

What's Wrong?

BE crashes (SIGSEGV) when a Stream Load uses strict_mode=true together with a columns header in which every target column is assigned via an expression (no direct-mapped column), and at least one row produces a NULL on a derived column.

Crash location

FileScanner::_convert_to_output_block() in be/src/vec/exec/scan/file_scanner.cpp.

The strict-mode branch indexes _src_slot_descs_order_by_dest[dest_index] and _dest_slot_to_src_slot_index[dest_index] without checking their size.

Root cause

These two members are populated only when FE sends dest_sid_to_src_sid_without_trans, which happens only if at least one target column is direct-mapped (that is, it appears in the columns header without an = assignment). When every target column is assigned via an expression, both containers stay empty and the strict-mode branch reads them out of bounds.
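To make the root cause concrete, here is a minimal sketch of the population pattern described above. It is a simplified model, not the actual Doris code: `SlotMapping`, `build_mapping`, and the `-1` sentinel are hypothetical, and `without_trans` only stands in for the optional FE-provided `dest_sid_to_src_sid_without_trans` map.

```cpp
#include <map>
#include <optional>
#include <vector>

// Simplified stand-in for the scanner state; all names here are hypothetical.
struct SlotMapping {
    // One entry per destination slot: the source slot id, or -1 for a derived column.
    std::vector<int> src_by_dest;
};

// `without_trans` models the optional FE-provided map (dest slot id -> src slot id).
// When it is absent, nothing is appended and the vector stays empty.
SlotMapping build_mapping(const std::vector<int>& dest_slot_ids,
                          const std::optional<std::map<int, int>>& without_trans) {
    SlotMapping m;
    if (!without_trans.has_value()) {
        return m;  // all-expression case: no mapping is built at all
    }
    for (int dest_id : dest_slot_ids) {
        auto it = without_trans->find(dest_id);
        m.src_by_dest.push_back(it == without_trans->end() ? -1 : it->second);
    }
    return m;
}
```

In this model, `build_mapping({0, 1}, std::nullopt)` returns an empty vector, so any later `src_by_dest[dest_index]` access is out of bounds, which mirrors the crash reported here.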

What You Expected?

Stream Load should not crash BE. The strict-mode branch should be guarded by a size / existence check on _src_slot_descs_order_by_dest and _dest_slot_to_src_slot_index, and fall through to the regular nullable check when no source-column mapping exists for a derived column.
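The guard proposed above could look roughly like the following. This is a sketch under assumptions, not a patch against file_scanner.cpp: `SrcSlot`, `find_src_slot`, and `should_apply_strict_check` are hypothetical names, and the real member types may differ from the plain vector used here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a source slot descriptor.
struct SrcSlot {
    bool is_nullable;
};

// Returns the source slot mapped to dest_index, or nullptr when the mapping
// is empty, too short, or holds no source for this (derived) column.
const SrcSlot* find_src_slot(const std::vector<const SrcSlot*>& src_by_dest,
                             std::size_t dest_index) {
    if (dest_index >= src_by_dest.size()) {
        return nullptr;
    }
    return src_by_dest[dest_index];
}

// Strict-mode filtering applies only when a source mapping exists;
// otherwise the caller falls through to the regular nullable check.
bool should_apply_strict_check(bool strict_mode,
                               const std::vector<const SrcSlot*>& src_by_dest,
                               std::size_t dest_index) {
    return strict_mode && find_src_slot(src_by_dest, dest_index) != nullptr;
}
```

With an empty mapping, `should_apply_strict_check(true, {}, dest_index)` is false for every index, so the all-expression case never touches the container and is handled by the regular nullable check.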

How to Reproduce?

Table

CREATE TABLE sl_min (
    k BIGINT NOT NULL,
    v VARCHAR(64) NULL
) ENGINE=OLAP
DUPLICATE KEY(k)
DISTRIBUTED BY HASH(k) BUCKETS 1
PROPERTIES ("replication_num" = "1");

Source file sl_min.csv (one row, two fields):

1,\N

Stream Load

curl --location-trusted -u root: \
    -H "label:sl_min_$(date +%s)" \
    -H "format:csv" \
    -H "column_separator:," \
    -H "columns:c1,c2,k=c1,v=c2" \
    -H "strict_mode:true" \
    -H "max_filter_ratio:0" \
    -T ./sl_min.csv \
    "http://<fe_host>:<fe_http_port>/api/<db>/sl_min/_stream_load"

The BE process crashes with SIGSEGV. Because the connection is dropped mid-response, curl reports "Warning: Binary output can mess up your terminal".

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
