
Optimize transform of vector<bool> for more predicates#5796

Merged
StephanTLavavej merged 9 commits into microsoft:main from AlexGuteniev:implications on Feb 11, 2026

Conversation

@AlexGuteniev
Contributor

Towards #625, specifically #625 (comment) item 3

Follow up to #5769

➡️ Optimization

@rodiers asked for set subtraction.

  1. A substract like method
    /// Subtract the sets of true flags in both CFlags.
    /// S1 = {i where *this[i] == true}, S2 = {i where ac_Other[i] == true}, Result[i] == true <=> i element of S with S=S1\S2

Of the standard predicates, std::greater does the trick: it is true only if the element is in the first set and not in the second. std::less does the converse. For completeness, let's add all four remaining comparisons.
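As a minimal sketch of the set-subtraction use case: `subtract_flags` below is a hypothetical helper (not part of the STL or of this PR) that calls `std::transform` with `std::greater<>` on two `vector<bool>` inputs of equal size.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper, for illustration only.
// result[i] is true exactly when a[i] is true and b[i] is false,
// i.e. the set of true flags in a minus the set of true flags in b.
std::vector<bool> subtract_flags(const std::vector<bool>& a, const std::vector<bool>& b) {
    assert(a.size() == b.size()); // both inputs must cover the same index range
    std::vector<bool> result(a.size());
    // For bool operands, a > b holds only for (true, false).
    std::transform(a.begin(), a.end(), b.begin(), result.begin(), std::greater<>{});
    return result;
}
```

Swapping in `std::less<>` gives the converse subtraction (flags set in `b` but not in `a`).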

❓ What it even is

Apparently, there's no universally accepted name for the corresponding integer operation, unlike xnor for equal_to.

In the benchmark, we can keep comparison names, like less_equal. But we also need to name functors. Here are some ideas:

  • On Wikipedia, these operations are defined as implication and nonimplication, and the gates are imply and nimply. The other two, with the other input negated, are converse implication and converse nonimplication.
  • In our favorite ISA there are instructions for one of these operations: andn (scalar, BMI1), pandn (MMX, SSE2), and more flavors of pandn for bigger vectors and element masks, all accessible as intrinsics and generated automatically by the compiler. We could therefore call these ops andn and orn, at the cost of slight confusion with nand / nor (which we don't use anyway) and with no obvious way to mark which input is negated.
  • A historian in a Stack Overflow comment mentions names such as "selective clear" and "selective set", still with no obvious way to indicate which argument is inverted.

In the PR I went with the Wikipedia naming, but I'm open to any other option.
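To make the mapping concrete, here is a sketch of how the four comparisons correspond to word-wide bitwise operations. The function names follow the Wikipedia terminology but are assumptions for illustration; they are not necessarily the identifiers used in the PR, and `check_bit` is a hypothetical self-check.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names, for illustration only.
// Each function applies a bool comparison to all 64 bit positions of a word at once.
constexpr std::uint64_t nonimplication(std::uint64_t a, std::uint64_t b) { return a & ~b; }          // greater:       a >  b
constexpr std::uint64_t converse_nonimplication(std::uint64_t a, std::uint64_t b) { return ~a & b; } // less:          a <  b
constexpr std::uint64_t implication(std::uint64_t a, std::uint64_t b) { return ~a | b; }             // less_equal:    a <= b
constexpr std::uint64_t converse_implication(std::uint64_t a, std::uint64_t b) { return a | ~b; }    // greater_equal: a >= b

// Checks the lowest bit of each operation against the matching bool comparison.
inline void check_bit(bool a, bool b) {
    assert(((nonimplication(a, b) & 1) != 0) == (a > b));
    assert(((converse_nonimplication(a, b) & 1) != 0) == (a < b));
    assert(((implication(a, b) & 1) != 0) == (a <= b));
    assert(((converse_implication(a, b) & 1) != 0) == (a >= b));
}
```

The `a & ~b` and `~a & b` forms are the shape that andn-style instructions compute, which is where the alternative andn / orn naming comes from.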

⏱️ Benchmark results

| Benchmark | Before | After | Speedup |
|---|---|---|---|
| transform_two_inputs_aligned<less<>>/64 | 130 ns | 3.03 ns | 42.9 |
| transform_two_inputs_aligned<less<>>/4096 | 11462 ns | 10.8 ns | 1060 |
| transform_two_inputs_aligned<less<>>/65536 | 264332 ns | 141 ns | 1870 |

🚗 Drive-by

There's a more concise way to test for void, so I switched to it.

@AlexGuteniev AlexGuteniev requested a review from a team as a code owner October 22, 2025 15:28
@github-project-automation github-project-automation Bot moved this to Initial Review in STL Code Reviews Oct 22, 2025
Should not matter in practice I think, but for consistency
@StephanTLavavej StephanTLavavej removed their assignment Nov 14, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Feb 5, 2026
@StephanTLavavej
Member

The 1870x speedup exceeds all but the literally exponential one 😹 😻

Thanks for the thorough test coverage and I agree with the names!

@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Feb 9, 2026
@StephanTLavavej
Member

I'm mirroring this to the MSVC-internal repo. Please notify me if any further changes are pushed, otherwise no action is required.

@StephanTLavavej StephanTLavavej merged commit 116f1cb into microsoft:main Feb 11, 2026
45 checks passed
@github-project-automation github-project-automation Bot moved this from Merging to Done in STL Code Reviews Feb 11, 2026
@StephanTLavavej
Member

👻 🧛 🦇

@AlexGuteniev AlexGuteniev deleted the implications branch February 11, 2026 11:36

Labels

performance Must go faster
