Replaced unsafe ptr logic with chained split_at_mut in DenseMatrix an…#371

Open
jackpots28 wants to merge 5 commits into smartcorelib:development from jackpots28:issue-368-enhancement

Conversation

@jackpots28

…d DenseMatrixMutView

Fixes #368

Checklist

  • [ x ] My branch is up-to-date with development branch.
  • [ x ] Everything works and tested on latest stable Rust.
  • [ x ] Coverage and Linting have been applied

Current behaviour

DenseMatrix and DenseMatrixMutView use unsafe pointer arithmetic to generate their mutable iterators.

New expected behavior

DenseMatrix and DenseMatrixMutView continue to produce mutable iterators, but now use a "head" / "tail" approach built on split_at_mut to iterate.

  • no test changes required

Change logs

  • Replaced unsafe pointer arithmetic in DenseMatrix / DenseMatrixMutView mutable iterators with a safe, chained split_at_mut implementation to ensure memory safety without performance loss.
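
For illustration, a minimal sketch of the head/tail idea, not the code in this PR. It assumes a column-major buffer whose columns start `stride` elements apart; the function name `first_rows_mut` and its parameters are hypothetical.

```rust
/// Hypothetical helper (not part of smartcore): collects `&mut` references
/// to the first `nrows` elements of each of `ncols` columns that start
/// `stride` elements apart in `values`.
fn first_rows_mut<T>(values: &mut [T], nrows: usize, ncols: usize, stride: usize) -> Vec<&mut T> {
    let mut refs = Vec::with_capacity(nrows * ncols);
    let mut remaining = values;
    for _ in 0..ncols {
        // The last column may carry no trailing padding, so clamp the split point.
        let end = stride.min(remaining.len());
        // `mem::take` moves the slice out, so `head` and `tail` keep the
        // full lifetime of the original borrow.
        let (head, tail) = std::mem::take(&mut remaining).split_at_mut(end);
        // Panics if this column holds fewer than `nrows` elements.
        refs.extend(head[..nrows].iter_mut());
        remaining = tail;
    }
    refs
}
```

Each `split_at_mut` call hands back two non-overlapping slices, which is what lets the borrow checker accept many live `&mut T` into one buffer without `unsafe`.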

@jackpots28 jackpots28 requested a review from Mec-iS as a code owner May 20, 2026 20:42
@Mec-iS
Collaborator

Mec-iS commented May 20, 2026

thank you!

please run cargo fmt --all -- --check and the other checks needed.

@codecov

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 43.24324% with 63 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.34%. Comparing base (70d8a0f) to head (5714314).
⚠️ Report is 17 commits behind head on development.

| Files with missing lines | Patch % | Lines |
| src/linalg/basic/matrix.rs | 42.72% | 63 Missing ⚠️ |

Additional details and impacted files
@@               Coverage Diff               @@
##           development     #371      +/-   ##
===============================================
- Coverage        45.59%   44.34%   -1.26%     
===============================================
  Files               93       94       +1     
  Lines             8034     8105      +71     
===============================================
- Hits              3663     3594      -69     
- Misses            4371     4511     +140     

@jackpots28
Author

I updated the formatting and caught the issue with sort_by that clippy complained about. Test coverage is still slightly lower "src/linalg/basic/matrix.rs: 225/364" | "59.72% coverage, 5238/8771 lines covered", should I add some tests for more coverage?

@Mec-iS
Collaborator

Mec-iS commented May 21, 2026

adding more tests on the methods affected by changes is always welcome 🐳 thanks. at your convenience.

…tView - src/linalg/basic/matrix.rs: 251/364 +7.14% tarpaulin check
@jackpots28
Author

Rolled through and updated matrix.rs to include 33 new cases

@genefold-ai genefold-ai left a comment

The security goal is right and the overall direction is solid — removing unsafe pointer arithmetic is a clear win for long-term maintainability. Two runtime bugs need to be fixed before merging (see inline comments marked 🔴), and there are three medium-priority cleanup items (🟡) that would meaningfully improve performance and reduce duplication. Happy to discuss any of these — great contribution.

// Collect all mutable references up-front using split_at_mut so
// that the resulting iterator owns no borrow of "self.values"

match (column_major, axis) {

🔴 [HIGH] Potential panic: col_slice[..nrows] when stride < nrows

In DenseMatrixMutView, when the view has a stride smaller than nrows (e.g. a sub-view into a larger matrix where padding or offset applies), split_at_mut(stride) will produce a col_slice shorter than nrows, causing col_slice[..nrows] to panic at runtime.

The original unsafe implementation handled this correctly via direct offset arithmetic; this safe version does not inherit that safety.

Please add an explicit guard before slicing:

assert!(
    col_slice.len() >= nrows,
    "iter_mut: stride ({stride}) < nrows ({nrows}): view layout is inconsistent"
);
for elem in col_slice[..nrows].iter_mut() {

Alternatively, use .get_mut(..nrows) and handle the None case explicitly instead of letting the slice index panic.
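
A minimal sketch of the `.get_mut(..nrows)` option, assuming the caller prefers an error value over a panic. The helper name `checked_rows_mut` and the error type (`String`) are illustrative choices, not part of the PR.

```rust
/// Hypothetical helper: returns the first `nrows` elements of one column
/// slice, or an error describing the mismatch when the slice is too short
/// (for example when stride < nrows).
fn checked_rows_mut<T>(col_slice: &mut [T], nrows: usize) -> Result<&mut [T], String> {
    // Read the length before taking the mutable sub-borrow.
    let len = col_slice.len();
    col_slice
        .get_mut(..nrows)
        .ok_or_else(|| format!("column holds {len} elements, expected at least {nrows}"))
}
```

The caller can then decide whether to propagate the error with `?` or turn it into a panic with a clear message.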

let nrows = self.nrows;
let ncols = self.ncols;
let ptr = self.values.as_mut_ptr();

🔴 [HIGH] Subtraction underflow: ncols - 1 panics (debug) or wraps (release) when ncols == 0

The expression if _c == ncols - 1 is evaluated without first checking whether ncols is zero. When ncols == 0:

  • In debug mode: panics with a subtraction overflow.
  • In release mode: wraps around to usize::MAX, causing the branch condition to never match and likely producing an incorrect or out-of-bounds slice.

This can be triggered by a zero-column DenseMatrixMutView (e.g. a view into an empty region). The same issue exists symmetrically for nrows - 1 in the row-major cases.

Please add an early guard at the top of the method:

if ncols == 0 || nrows == 0 {
    return Box::new(std::iter::empty());
}

This also simplifies all subsequent arithmetic by making zero-size cases unreachable.
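
A minimal sketch of the early guard, assuming a contiguous buffer and a free function in place of the real method. The name `iter_mut_guarded` is hypothetical and the body after the guard is deliberately simplified.

```rust
/// Hypothetical stand-in for the real method: yields the elements of a
/// contiguous buffer and returns an empty iterator for zero-sized shapes.
fn iter_mut_guarded<'a, T>(
    values: &'a mut [T],
    nrows: usize,
    ncols: usize,
) -> Box<dyn Iterator<Item = &'a mut T> + 'a> {
    if nrows == 0 || ncols == 0 {
        return Box::new(std::iter::empty());
    }
    // Past this point `nrows - 1` and `ncols - 1` cannot underflow.
    Box::new(values.iter_mut().take(nrows * ncols))
}
```

Where the subtraction itself has to stay, `ncols.checked_sub(1)` returns `None` for zero instead of panicking or wrapping, which makes the empty case explicit at the call site.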

let mut indexed: Vec<(usize, &'b mut T)> = by_col
.into_iter()
.enumerate()
.map(|(flat_col_idx, r)| {

🟡 [MEDIUM] Triple Vec allocation in Case A is redundant — refs can be eliminated

In Case A (column-major, row-by-row) the code allocates by_col, then indexed, then a final refs Vec that is immediately consumed by into_iter():

let mut refs: Vec<&'b mut T> = Vec::with_capacity(total);
refs.extend(indexed.into_iter().map(|(_, r)| r));
Box::new(refs.into_iter())

The final refs is a pure copy of the indexed values with keys stripped. It can be removed entirely:

Box::new(indexed.into_iter().map(|(_, r)| r))

This saves one Vec allocation and one full iteration pass per iter_mut call. The same pattern appears in Case D of this method and in both Cases A and D of DenseMatrix::iterator_mut — four sites in total should be cleaned up.
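
A small self-contained sketch of that cleanup, assuming the (key, reference) pairs have already been built. The function name `into_ordered_refs` is hypothetical.

```rust
/// Hypothetical helper: sorts (key, reference) pairs by key and hands the
/// references back in that order without building a second Vec.
fn into_ordered_refs<'a, T>(
    mut indexed: Vec<(usize, &'a mut T)>,
) -> Box<dyn Iterator<Item = &'a mut T> + 'a> {
    indexed.sort_unstable_by_key(|(idx, _)| *idx);
    // `into_iter().map(..)` strips the keys lazily as the caller iterates.
    Box::new(indexed.into_iter().map(|(_, r)| r))
}
```

The sort still runs up front, so this removes one allocation and one pass but does not make the iterator lazy.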

@@ -143,81 +143,135 @@ impl<'a, T: Debug + Display + Copy + Sized> DenseMatrixMutView<'a, T> {
}

🟡 [MEDIUM] Duplicated split_at_mut chunking logic across cases — extract a private helper

Cases B and C in DenseMatrixMutView::iter_mut, and their analogues in DenseMatrix::iterator_mut, share nearly identical code for iterating over a strided slice with split_at_mut:

let mut remaining: &'b mut [T] = self.values;
for _c in 0..ncols {
    let col_end = if _c == ncols - 1 { remaining.len() } else { stride };
    let (col_slice, tail) = remaining.split_at_mut(col_end);
    for elem in col_slice[..nrows].iter_mut() { refs.push(elem); }
    remaining = tail;
}

This 9-line block is repeated (with minor parameter changes) four times across the two impls. Consider extracting a private helper:

/// Collects mutable references to the first `take` elements of each
/// chunk of size `chunk_size` from `slice`, advancing through the tail.
fn collect_strided_mut<'a, T>(
    slice: &'a mut [T],
    chunks: usize,
    chunk_size: usize,
    take: usize,
) -> Vec<&'a mut T> {
    let mut refs = Vec::with_capacity(chunks * take);
    let mut remaining = slice;
    for i in 0..chunks {
        let end = if i == chunks - 1 { remaining.len() } else { chunk_size };
        let (head, tail) = remaining.split_at_mut(end);
        refs.extend(head[..take].iter_mut());
        remaining = tail;
    }
    refs
}

This reduces duplication, makes the zero-size guard easier to add in one place, and makes the bounds check (head.len() >= take) easier to enforce uniformly.


// Case D: row-major, col-by-col
(false, _) => {
let total = nrows * ncols;

🟡 [MEDIUM] O(n log n) sort for cross-axis traversal — consider direct index computation

Cases A and D both use this pattern to reorder elements for cross-axis iteration:

let mut indexed: Vec<(usize, &'b mut T)> = by_col
    .into_iter()
    .enumerate()
    .map(|(flat_col_idx, elem)| {
        let c = flat_col_idx / nrows;
        let r = flat_col_idx % nrows;
        (r * ncols + c, elem)
    })
    .collect();
indexed.sort_unstable_by_key(|(idx, _)| *idx);

This performs an O(n log n) sort on n = nrows * ncols elements just to reorder them into row-major (or col-major) access order. For large matrices this is a meaningful overhead compared to the original O(n) pointer arithmetic.

For DenseMatrix (contiguous, non-strided allocation), the same reordering can be done directly using chunks_mut with a transposition index formula, which is O(n):

// Case A: column-major, iterate row-by-row (no sort needed)
// Conceptually: output[r * ncols + c] = col_major_data[c * nrows + r]
// This is just a transpose read-order, achievable with two nested iterators.

If a sort is kept for correctness in the strided DenseMatrixMutView case, please add a comment explaining why direct index computation is not viable there.
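
One possible way to get the transposed order without a sort, sketched under the assumption of a contiguous column-major buffer with exactly nrows * ncols elements. The function name `row_order_mut` is hypothetical, and this is not the approach taken in the PR.

```rust
/// Hypothetical sketch: visits a contiguous column-major buffer in
/// row-major order in O(n), keeping one `IterMut` per column.
fn row_order_mut<'a, T>(
    data: &'a mut [T],
    nrows: usize,
    ncols: usize,
) -> impl Iterator<Item = &'a mut T> + 'a {
    assert_eq!(data.len(), nrows * ncols, "buffer must be contiguous");
    // `chunks_mut(0)` panics, so zero-row shapes get no column iterators.
    let mut cols: Vec<std::slice::IterMut<'a, T>> = if nrows == 0 {
        Vec::new()
    } else {
        data.chunks_mut(nrows).map(|col| col.iter_mut()).collect()
    };
    let mut k = 0usize;
    std::iter::from_fn(move || {
        if cols.is_empty() {
            return None;
        }
        // Output position k comes from column k % ncols. Each column's
        // iterator advances once per pass, so rows come out in order.
        let c = k % cols.len();
        k += 1;
        cols[c].next()
    })
}
```

This allocates ncols small iterators instead of nrows * ncols references, and each element is produced on demand. Strided views would need the per-column slices trimmed to nrows first.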

@Mec-iS
Collaborator

Mec-iS commented May 22, 2026

@jackpots28 please see the automated code-review by @genefold-ai and feel free to decide which patches to apply

@Mec-iS
Collaborator

Mec-iS commented May 22, 2026

One more issue, highlighted by another automated code review (may overlap with the previous ones):

🔴 [HIGH] Bug 1: Subtraction underflow when ncols == 0 or nrows == 0

File: src/linalg/basic/matrix.rs — both DenseMatrixMutView::iter_mut and DenseMatrix::iterator_mut
The expression _c == ncols - 1 (and symmetrically _r == nrows - 1 in the row-major cases) is evaluated without checking whether ncols or nrows is zero:

for _c in 0..ncols {
    let col_end = if _c == ncols - 1 {  // panics when ncols == 0
        remaining.len()
    } else {
        stride
    };

When ncols == 0:
• Debug builds: Panics with attempt to subtract with overflow.
• Release builds: Wraps to usize::MAX, the condition never matches, and split_at_mut(usize::MAX) panics or produces garbage.

This was correctly pointed out by the genefold-ai review. Add an early guard:

if ncols == 0 || nrows == 0 {
    return Box::new(std::iter::empty());
}

About the tests:
| 🟢 Low | Missing edge case tests | Add tests for empty/1×N/N×1 cases |

Also I am trying https://docs.rs/cargo-crap/latest/cargo_crap/ to analyse complexity/risk patterns.

@Mec-iS
Collaborator

Mec-iS commented May 22, 2026

Not too bad

$ cargo crap --workspace --lcov lcov.info --summary
warning: 13 source files had no matching entry in the LCOV report — verify your --lcov path or coverage tool configuration:
/smartcore/src/dataset/boston.rs
/smartcore/src/dataset/breast_cancer.rs
/smartcore/src/dataset/diabetes.rs
/smartcore/src/dataset/digits.rs
/smartcore/src/dataset/generator.rs
/smartcore/src/dataset/iris.rs
/smartcore/src/dataset/mod.rs
/smartcore/src/linalg/ndarray/matrix.rs
/smartcore/src/linalg/ndarray/vector.rs
/smartcore/src/model_selection/hyper_tuning/grid_search.rs
/smartcore/src/readers/csv.rs
/smartcore/src/readers/error.rs
/smartcore/src/readers/io_testing.rs
Per-crate summary:
┌───────────┬───────────┬────────┐
│ Crate     ┆ Functions ┆ Crappy │
╞═══════════╪═══════════╪════════╡
│ smartcore ┆       917 ┆     16 │
└───────────┴───────────┴────────┘
✗ Analyzed: 917 · Crappy: 16 (threshold 30) · Worst: CategoricalNBDistribution::eq (CRAP 182.0)

@Mec-iS
Collaborator

Mec-iS commented May 22, 2026

this is a problem, can't merge with this as it breaks lazy evaluation:

1. Significant Performance Regression — Eager Collection

File: src/linalg/basic/matrix.rs
Lines: Both iter_mut implementations
The original code returned lazy iterators using flat_map and pointer arithmetic. The new implementation eagerly collects all mutable references into a Vec before returning an iterator:

// New implementation pattern
for i in 0..cols {
    // ... split_at_mut logic ...
    refs.extend(col_slice.iter_mut());  // Collects all refs upfront
}
Box::new(refs.into_iter())  // Returns iterator over collected Vec

Problems:
• Memory overhead: Allocates a full Vec of nrows × ncols mutable references even for simple iteration
• Latency: Must traverse the entire matrix before yielding the first element
• Breaks iterator composability: Callers expecting lazy evaluation (e.g., .take(5).collect()) now pay the full cost

Recommendation: Use chunks_mut with flat_map to maintain lazy evaluation:

fn iter_mut<'b>(&'b mut self, axis: u8) -> Box<dyn Iterator<Item = &'b mut T> + 'b> {
    // Copy the dimensions out so the closure does not borrow `self`.
    let (nrows, ncols) = (self.nrows, self.ncols);
    if axis == 0 {
        Box::new(
            self.values
                .chunks_mut(self.stride)
                .take(ncols)
                .flat_map(move |col| col[..nrows].iter_mut()),
        )
    } else {
        // Similar for axis == 1
        todo!()
    }
}
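
A runnable sketch of the lazy pattern as a free function, covering only the case where the iteration order matches the storage order. The name `lazy_cols_mut` and its parameters are hypothetical stand-ins for the struct fields.

```rust
/// Hypothetical free function mirroring the same-axis case: walks a
/// column-major buffer column by column without collecting references.
fn lazy_cols_mut<'a, T>(
    values: &'a mut [T],
    nrows: usize,
    ncols: usize,
    stride: usize,
) -> Box<dyn Iterator<Item = &'a mut T> + 'a> {
    // `chunks_mut` panics on a chunk size of 0, so handle empty shapes first.
    if nrows == 0 || ncols == 0 || stride == 0 {
        return Box::new(std::iter::empty());
    }
    Box::new(
        values
            .chunks_mut(stride)
            .take(ncols)
            // Indexing panics if a chunk holds fewer than `nrows` elements.
            .flat_map(move |col| col[..nrows].iter_mut()),
    )
}
```

Because nothing is collected, a caller that stops early, for example with `.take(3)`, only touches the chunks it actually reaches.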

also code duplication:

3. Code Duplication Across 4 Cases

File: src/linalg/basic/matrix.rs
Lines: DenseMatrixMutView::iter_mut and DenseMatrix::iterator_mut
Both methods have nearly identical logic for handling:
• Column-major vs row-major storage
• Axis 0 (row-wise) vs axis 1 (column-wise) iteration
• split_at_mut chunking with special case for last column

This results in ~200 lines of duplicated logic.
Recommendation: Extract a shared helper function:

/// Build mutable iterator over matrix elements in specified axis
fn build_mut_iter<'b, T>(
    values: &'b mut [T],
    nrows: usize,
    ncols: usize,
    stride: usize,
    column_major: bool,
    axis: u8,
) -> Box<dyn Iterator<Item = &'b mut T> + 'b> {
    // Shared implementation
}

@jackpots28
Author

Yeah, I'm working on the deduplication via a single collector that can be used across the four cases. I also got curious and started testing on bigger matrices, and saw the degradation you described, since it does the full collect. I'll work through it!

Development

Successfully merging this pull request may close these issues.

tech-debt: replace unsafe raw-pointer iterator_mut in DenseMatrix (basic/matrix.rs) with safe split_at_mut approach

3 participants