Fix memory safety issues #3188

Open
texodus wants to merge 1 commit into master from memory-safety-fixes

@texodus commented Jun 19, 2026

Member

I asked Claude to find memory safety bugs in Perspective and write a PR, while I played Balatro on my phone. Here's its own summary of what it found:

Fixes

  • Uneven column lengths in columnar table create/update (table.cpp
    Table::from_cols, Table::update_cols): the table was sized from a single
    column's length while every column was filled to its own length. Now the
    table is sized to the longest column and shorter columns are null-padded, so
    all writes stay in bounds.

  • NDJSON row under-allocation (table.cpp Table::from_ndjson): capacity
    was estimated from the newline count but one row was written per parsed
    object. The table is now grown per row (amortized O(1) via geometric capacity
    growth), so concatenated objects without newline separators can no longer
    overrun the buffer.

  • Arrow row-count truncation (arrow_loader.cpp ArrowLoader::row_count):
    Arrow's 64-bit row count was silently truncated to 32 bits, under-sizing the
    destination table relative to the data written during fill. Oversized/negative
    counts are now rejected instead of truncated.

  • Arrow time32 element width (arrow_loader.cpp copy_array): time32
    values are 32-bit and map to a 4-byte column, but the loader copied 8 bytes
    per element, over-reading the source and over-writing the destination. Now
    copies 4 bytes per element.

  • first/last aggregate with a missing sort dependency (sparse_tree.cpp
    first_last_helper): the helper assumed the aggregate spec always carried
    both a value and a sort dependency and indexed the dependency list
    unconditionally. A view whose sort column falls outside the visible set could
    produce a spec without the sort dependency, causing an out-of-bounds read. Now
    guarded (covers first, last, and last − first).

  • Residency eviction data race (residency.cpp, residency.h): the
    shared pending-eviction vectors were cleared outside the manager's mutex, so
    concurrent request-thread eviction passes could double-free. All mutations now
    occur under the lock, and a dedicated mutex serializes each eviction cycle.

  • Unvalidated Arrow input (arrow_loader.cpp ArrowLoader::initialize):
    Arrow's IPC reader does not validate buffer contents, yet the fill paths index
    value/offset/dictionary buffers directly. The loaded (remotely supplied) table
    is now fully validated (ValidateFull()) before those buffers are trusted, so
    a malformed payload — bad offsets, out-of-range dictionary indices, inconsistent
    chunk lengths — is rejected instead of read out of bounds. This is the systemic
    defense behind the time32 and row-count fixes above. Note: validation is
    O(data) per ingested payload; Validate() plus targeted bounds checks would be
    a cheaper alternative if ingest throughput matters.

  • Out-of-bounds access in expression vector functions (computed_function.cpp
    diff3, norm3, cross_product3, dot_product3): these operate on
    3-element vectors and indexed v[0..2] unconditionally, but their exprtk
    parameter sequences ("VVV"/"VV"/"V") enforce vector type, not length.
    A user expression can declare shorter vectors (e.g.
    var v[2] := {1,2}; norm3(v)), causing an out-of-bounds read and, for
    diff3/cross_product3, an out-of-bounds write to the (short) output vector.
    Each function now
    clears its result for vectors shorter than 3 before indexing; this surfaces
    as an invalid expression (rejected at view creation) rather than an
    out-of-bounds access. Vectors of length 3 are unaffected.
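
To illustrate the uneven-column fix, here is a minimal sketch of sizing to the longest column and null-padding the rest. It is not the code in table.cpp: `Column` and `pad_to_longest` are hypothetical names, and `std::optional` stands in for whatever null representation the real columns use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a table column; std::nullopt models a null cell.
using Column = std::vector<std::optional<double>>;

// Size every column to the longest input column and null-pad the shorter
// ones, so any later write at a row index below the row count is in bounds.
std::vector<Column> pad_to_longest(std::vector<Column> columns) {
    std::size_t rows = 0;
    for (const Column& c : columns) {
        rows = std::max(rows, c.size());
    }
    for (Column& c : columns) {
        c.resize(rows, std::nullopt);  // shorter columns gain trailing nulls
        assert(c.size() == rows);
    }
    return columns;
}
```

With inputs of length 3 and 1, both output columns have length 3 and the short one ends in two nulls.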
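
The Arrow row-count fix amounts to a checked narrowing conversion. The sketch below assumes the destination table stores its row count as an unsigned 32-bit integer, which the summary does not state; `checked_row_count` is a hypothetical helper, not the actual `ArrowLoader::row_count`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: narrow a 64-bit row count to 32 bits, rejecting
// values that a plain static_cast would silently wrap.
// Assumption: the destination row count type is std::uint32_t.
std::uint32_t checked_row_count(std::int64_t rows) {
    constexpr std::int64_t max_rows = std::numeric_limits<std::uint32_t>::max();
    if (rows < 0 || rows > max_rows) {
        throw std::out_of_range("row count does not fit in 32 bits");
    }
    const auto narrowed = static_cast<std::uint32_t>(rows);
    assert(static_cast<std::int64_t>(narrowed) == rows);  // round-trips exactly
    return narrowed;
}
```

Throwing is one way to reject the payload; an error code or status return would serve equally well if the loader avoids exceptions.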
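
For the time32 width bug, one way to rule out a hard-coded byte count is to derive it from the element type. This is a sketch under that assumption, not the patched `copy_array`; `copy_elements` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copy `count` elements of type T. The byte count comes
// from sizeof(T), so a 32-bit element type always moves 4 bytes per element.
template <typename T>
void copy_elements(const T* src, T* dst, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable element type");
    if (count == 0) {
        return;  // nothing to copy; avoids calling memcpy with null pointers
    }
    assert(src != nullptr && dst != nullptr);
    std::memcpy(dst, src, count * sizeof(T));
}
```

Called with `std::int32_t` buffers, this copies exactly `4 * count` bytes and leaves anything past the last element of the destination untouched.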
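
The residency race comes down to mutating a shared vector outside its lock. The sketch below shows one common pattern for this, swapping the pending list out while holding the mutex so each entry is handed to exactly one caller. It is a swapped-in illustration, not the approach in residency.cpp, which per the summary also adds a dedicated mutex per eviction cycle; `PendingEvictions` and its members are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical pending-eviction list; ids stand in for evictable resources.
class PendingEvictions {
public:
    void add(int id) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push_back(id);
    }

    // Move the whole list out under the lock. The caller releases the
    // entries afterwards, so no two threads can receive the same entry.
    std::vector<int> take_all() {
        std::vector<int> out;
        std::lock_guard<std::mutex> lock(m_mutex);
        out.swap(m_pending);
        assert(m_pending.empty());
        return out;
    }

private:
    std::mutex m_mutex;
    std::vector<int> m_pending;
};
```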
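
The length guard in the expression vector functions can be sketched without exprtk. Here `std::vector<double>` stands in for an exprtk vector view and `std::nullopt` for the cleared result; `norm3_checked` is a hypothetical name, and the real functions use exprtk's own types.

```cpp
#include <cassert>
#include <cmath>
#include <optional>
#include <vector>

// Hypothetical stand-in for norm3: returns std::nullopt (the "cleared
// result") when the vector has fewer than 3 elements, instead of reading
// v[0..2] unconditionally.
std::optional<double> norm3_checked(const std::vector<double>& v) {
    if (v.size() < 3) {
        return std::nullopt;
    }
    assert(v.size() >= 3);  // invariant the indexing below relies on
    return std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
}
```

A two-element input such as `{1, 2}` yields no value, while vectors of length 3 behave as before.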

@texodus added the bug (Concrete, reproducible bugs) label Jun 19, 2026
@texodus requested a review from timkpaine June 19, 2026 14:22
@texodus force-pushed the memory-safety-fixes branch from c1774b3 to 95e8e88 June 19, 2026 16:50
Signed-off-by: Andrew Stein <steinlink@gmail.com>
@texodus force-pushed the memory-safety-fixes branch from 95e8e88 to aa54fef June 20, 2026 15:53