computeCommunProb: Matrix::crossprod creates N×N dense intermediate matrix causing OOM on large spatial datasets #4

@gettygugu-dot

In computeCommunProb (modeling.R ~L193-276), the following is computed for each ligand-receptor (LR) pair:

dataLR <- Matrix::crossprod(x_, y_)  # N×N dense matrix
P1_Pspatial <- HillFunction(dataLR) * P.spatial

For large spatial datasets (e.g., Visium HD with 174K cells), crossprod produces an N×N matrix (~0.9–8 GB per LR pair, depending on gene sparsity). With parallel workers, peak memory easily exceeds 64 GB, ending in an OOM kill or a segfault.

However, P.spatial is extremely sparse (e.g., 15.6M of 30.5B entries, about 0.05%, are nonzero). Roughly 99.95% of the crossprod result is therefore multiplied by zero and discarded.

Proposed fix: pre-extract P.spatial as triplets (i, j, v) once, then for each LR pair compute L[i] * R[j] only at the nonzero spatial positions. Apply the Hill function and the agonist/antagonist terms in vectorized form on the triplet values. This reduces per-LR-pair memory from ~0.9–8 GB to ~125 MB, which is consistent with 15.6M double-precision values at 8 bytes each.
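
A minimal sketch of the triplet idea, not the actual patch. It assumes L and R are plain numeric vectors of per-cell ligand and receptor expression, that P.spatial is a general sparse matrix (e.g., dgCMatrix) from the Matrix package, and that the Hill function has the form x^n / (Kh^n + x^n). The names hill and sparse_lr_prob are hypothetical helpers introduced only for illustration, and the agonist/antagonist terms are left out:

```r
library(Matrix)

# Hypothetical stand-in for HillFunction; the form and defaults are assumptions.
hill <- function(x, Kh = 0.5, n = 1) x^n / (Kh^n + x^n)

# Hypothetical helper: evaluate Hill(L[i] * R[j]) * P.spatial[i, j]
# only where P.spatial is nonzero, never forming the N x N product.
sparse_lr_prob <- function(L, R, P.spatial, Kh = 0.5, n = 1) {
  trip <- summary(P.spatial)                 # triplets: columns i, j, x
  val  <- hill(L[trip$i] * R[trip$j], Kh, n) * trip$x
  sparseMatrix(i = trip$i, j = trip$j, x = val, dims = dim(P.spatial))
}

# Small check against the dense computation.
set.seed(1)
N <- 6
L <- runif(N)
R <- runif(N)
P.spatial <- sparseMatrix(i = c(1, 2, 5), j = c(2, 4, 6),
                          x = c(0.3, 0.7, 0.2), dims = c(N, N))

sparse_res <- sparse_lr_prob(L, R, P.spatial)
dense_res  <- hill(outer(L, R)) * as.matrix(P.spatial)
```

In a real implementation the triplets would be extracted once outside the LR loop and reused, so each pair only allocates vectors whose length equals the number of nonzero spatial entries. If P.spatial is stored as a symmetric sparse class, summary() may return only one triangle, so it would need converting to a general sparse matrix first.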
