[converter] Add aten.masked_scatter lowering #16

Merged
gokulkrishna98 merged 2 commits into apple:main from gokulkrishna98:dev/gokul/smolvlm-masked-scatter
Jun 16, 2026

@gokulkrishna98

Contributor

Description:

  • Implements aten.masked_scatter.default with semantics out[mask] = source.flatten()[: mask.sum()].
  • Flattens self, mask, and source, then for each flat position selects source[cumsum(mask) - 1] where mask is True and self otherwise. Values gathered at False positions are discarded by where, so out-of-range indices there (for example -1 before the first True) are harmless.
  • Supports dynamic shapes by reshaping with the runtime shape vector (get_shape) and using -1 to infer numel in flat reshapes; mask broadcast also routes through the runtime shape when self is dynamic.

Testing:

  • Python unit tests
  • CI
  • Enables conversion of the SmolVLM model (static config)

Register a lowering for aten.masked_scatter.default. Semantics:

    out = self.clone()
    out[mask] = source.flatten()[: mask.sum()]

The implementation flattens self, mask, and source, then for each flat
position picks source[cumsum(mask) - 1] if mask is True, else self. At
False positions where() discards the gathered value, so out-of-range
indices there are harmless. Indices at True positions are guaranteed
to lie in [0, mask.sum() - 1] by construction.
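
The flatten / cumsum / where formulation above can be sketched in NumPy. This is an illustration only, not the converter code: `masked_scatter_via_where` is a hypothetical name, NumPy ops stand in for the lowered ops, and the explicit `np.clip` is an assumption added to keep the gather in range for a non-empty source (the PR only states that out-of-range indices at False positions are harmless).

```python
import numpy as np

def masked_scatter_via_where(self_t, mask, source):
    """Sketch of the flat cumsum/where formulation (hypothetical helper)."""
    flat_self = self_t.reshape(-1)
    # Broadcast the mask to self's shape before flattening.
    flat_mask = np.broadcast_to(mask, self_t.shape).reshape(-1)
    flat_src = source.reshape(-1)

    # k-th True position (0-based) gets index k; False positions before
    # the first True get -1, which is out of range.
    idx = np.cumsum(flat_mask) - 1

    # Assumption: clamp so the gather never reads out of range. The clamped
    # values only appear at False positions, where np.where ignores them.
    safe_idx = np.clip(idx, 0, flat_src.size - 1)
    gathered = flat_src[safe_idx]

    out_flat = np.where(flat_mask, gathered, flat_self)
    return out_flat.reshape(self_t.shape)

self_t = np.arange(6, dtype=np.float32).reshape(2, 3)
mask = np.array([[False, True, False], [True, True, False]])
source = np.array([100, 200, 300, 400], dtype=np.float32)
result = masked_scatter_via_where(self_t, mask, source)
# result is [[0, 100, 2], [200, 300, 5]]; the fourth source value is unused.
```

Here the flat indices are [-1, 0, 0, 1, 2, 2]: the three True positions read source[0], source[1], source[2] in order, and every other position keeps its value from self.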

Supports dynamic shapes by reshaping with the runtime shape vector
(coreai.get_shape) instead of the type-level shape, which carries a
sentinel for unknown dims, and by using -1 in the flat reshapes to
let coreai.reshape infer the numel. Mask broadcast also routes
through the runtime shape when self is dynamic.
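
The shape handling can be illustrated with a NumPy analogy, under stated assumptions: `flatten_for_scatter` is a hypothetical helper, reading `self_t.shape` at call time stands in for the runtime shape vector from coreai.get_shape, and `reshape(-1)` stands in for the -1 inference in coreai.reshape. It does not model the type-level sentinel itself.

```python
import numpy as np

def flatten_for_scatter(self_t, mask):
    """Flatten self and a broadcast mask without hard-coding any dim."""
    # Stand-in for the runtime shape vector: read when the function runs,
    # so the same code handles any batch size.
    runtime_shape = tuple(int(d) for d in self_t.shape)

    # -1 lets reshape infer the element count instead of naming it.
    flat_self = self_t.reshape(-1)

    # The mask is broadcast against the runtime shape, then flattened.
    flat_mask = np.broadcast_to(mask, runtime_shape).reshape(-1)
    return flat_self, flat_mask, runtime_shape

row_mask = np.array([True, False, True])
for batch in (2, 5):
    x = np.zeros((batch, 3), dtype=np.float32)
    flat_x, flat_m, shape = flatten_for_scatter(x, row_mask)
    # The flat result is restored with the same runtime shape.
    restored = flat_x.reshape(shape)
    assert restored.shape == (batch, 3)
    assert flat_m.size == batch * 3
```

The loop runs the same function at two batch sizes to show that nothing in the flatten or restore step depends on a dimension known ahead of time.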

Adds TestMaskedScatter covering:
  - Static and dynamic IR FileCheck (full operand chain pinned for
    static; op order asserted for dynamic).
  - Numerical validation across {f32, f16, i32} × {static, dynamic}.
  - Corner cases: all-False mask (output == self), all-True mask
    (output == src reshape), src larger than mask.sum() (extras
    unused), and a lower-rank mask that broadcasts right-aligned
    onto self.
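
The corner cases listed above can be reproduced against a small NumPy reference of the stated semantics. This is a sketch, not the TestMaskedScatter code: `masked_scatter_ref` and the input values are hypothetical, chosen only to make each case visible.

```python
import numpy as np

def masked_scatter_ref(self_t, mask, source):
    """Reference semantics: out[mask] = source.flatten()[: mask.sum()]."""
    out = self_t.copy()
    full_mask = np.broadcast_to(mask, self_t.shape)
    out[full_mask] = source.reshape(-1)[: int(full_mask.sum())]
    return out

self_t = np.arange(6, dtype=np.float32).reshape(2, 3)
src = np.arange(100, 110, dtype=np.float32)

# All-False mask: nothing is written, output equals self.
all_false = masked_scatter_ref(self_t, np.zeros((2, 3), dtype=bool), src)

# All-True mask: output is the first self.size source values, reshaped.
all_true = masked_scatter_ref(self_t, np.ones((2, 3), dtype=bool), src[:6])

# Source larger than mask.sum(): only the first two values are consumed.
sparse = np.array([[True, False, False], [False, False, True]])
extras = masked_scatter_ref(self_t, sparse, src)

# Lower-rank mask: shape (3,) broadcasts right-aligned onto shape (2, 3).
row_mask = masked_scatter_ref(self_t, np.array([True, False, True]), src)
# row_mask is [[100, 1, 101], [102, 4, 103]]
```

In the broadcast case the mask row repeats for each row of self, and source values are consumed in row-major order across the broadcast mask.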
@gokulkrishna98 gokulkrishna98 self-assigned this Jun 15, 2026
@gokulkrishna98 gokulkrishna98 merged commit a68f1ad into apple:main Jun 16, 2026
2 checks passed