fix: support Apple Silicon Metal API and resolve numerical underflow by zhang-zidong · Pull Request #9 · chengl7-lab/scape

zhang-zidong · 2026-03-04T06:54:08Z

Implement smart OS detection for Taichi backend (metal+f32 on Mac, gpu+f64 on Linux).
Refactor kernel probability calculations to log-space (Log-Sum-Exp) to prevent f32 underflow.
Add strict numpy dtype casting to prevent f64 leakage into Metal kernels.
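
To illustrate the underflow and the casting concern described above, here is a minimal NumPy sketch. It is not the PR's Taichi kernel code: `logsumexp_f32` and the sample values are hypothetical and only show the general Log-Sum-Exp technique combined with an explicit float32 cast.

```python
import numpy as np


def logsumexp_f32(log_terms):
    """Stable log(sum(exp(log_terms))) computed entirely in float32.

    Hypothetical helper for illustration; assumes at least one finite input.
    """
    # Explicit cast so that no float64 value leaks into the computation.
    x = np.asarray(log_terms, dtype=np.float32)
    m = np.max(x)
    # Shifting by the maximum keeps every exponent <= 0 and makes the
    # largest term contribute exactly 1.0, so the sum cannot underflow to 0.
    return m + np.log(np.sum(np.exp(x - m), dtype=np.float32))


# Two log-probabilities whose exponentials are far below the smallest
# positive float32 value (about 1e-45).
log_p = np.array([-200.0, -201.0], dtype=np.float32)

with np.errstate(divide="ignore"):
    # exp() underflows to 0 in float32, so the naive result is -inf.
    naive = np.log(np.sum(np.exp(log_p)))

# The shifted version stays finite: -200 + log(1 + exp(-1)).
stable = logsumexp_f32(log_p)
```

The two expressions are equal in exact arithmetic. They differ only in which intermediate values have to be representable in float32.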

chengl7

Dear Zidong,

Thank you very much for taking the time to submit this PR and for working on improving SCAPE. We really appreciate the effort to add Apple Silicon / Metal support and improve the numerical stability of the kernels.

Unfortunately, our team is currently understaffed and we do not have the capacity to properly test and validate these changes across the different environments that SCAPE supports. Because this code touches core Taichi kernels and backend initialization, we want to be careful before merging changes that may affect existing workflows.

In particular, some parts of the PR change how the Taichi backend and floating-point precision are selected (e.g., switching between Metal and GPU backends and modifying default dtypes). Without testing, there is a risk that these changes could affect behavior on other platforms such as Linux/CUDA systems, CPU-only environments, or existing pipelines that rely on the current float64 behavior.

For that reason, we’re not able to merge the PR right now. If you (or others in the community) are able to run and validate the changes on different systems (e.g., CUDA/Linux, CPU-only, Apple Silicon/Metal) and confirm that the behavior remains correct, please feel free to report the results here — that would greatly help us move this forward.

Thanks again for the contribution and for supporting the project!

Lu

zhang-zidong · 2026-03-06T00:39:01Z

Hi Lu,

Thank you for the detailed and transparent feedback! I completely understand your concerns—touching the core Taichi configurations and precision settings does carry risks, and it is totally reasonable to hold off on merging without full validation.

To be completely transparent about my current testing status: I have verified that the code runs on Apple Silicon (Metal, f32) without crashing. However, I have not yet performed a rigorous numerical comparison between the new Mac (f32) output and the original CPU/Linux (f64) output to confirm whether the results are statistically consistent and sufficiently close.

Just to provide a bit of reassurance on the code structure, the backend and precision changes are strictly encapsulated within a platform.system() == "Darwin" condition. On Linux or CPU-only setups, the script automatically defaults back to ti.gpu and ti.f64. This ensures the exact same precision and execution logic as the original code for non-Mac users. The Log-Sum-Exp adjustments are mathematically equivalent, serving only to prevent floating-point underflows.
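
The platform check described above could look roughly like the following sketch. The function names `select_backend` and `init_taichi` are hypothetical and are not taken from the PR. Only `platform.system()`, `ti.init`, `ti.metal`, `ti.gpu`, `ti.f32` and `ti.f64` are existing APIs.

```python
import platform


def select_backend(system_name=None):
    """Return (arch_name, float_name) for the given OS name.

    Hypothetical helper: the names are plain strings so the choice can be
    inspected and tested without Taichi installed.
    """
    if system_name is None:
        system_name = platform.system()
    if system_name == "Darwin":
        # macOS: Metal backend with 32-bit floats.
        return ("metal", "f32")
    # Everything else keeps the original configuration.
    return ("gpu", "f64")


def init_taichi():
    """Initialise Taichi with the selected backend (requires taichi)."""
    import taichi as ti  # imported lazily so select_backend works without it

    arch_name, float_name = select_backend()
    ti.init(arch=getattr(ti, arch_name), default_fp=getattr(ti, float_name))
```

Keeping the decision in a pure function makes the non-Mac path easy to check on any machine, for example `select_backend("Linux")` returns `("gpu", "f64")`. Taichi documents `ti.gpu` as falling back to the CPU when no GPU backend is available. That matches the statement about CPU-only setups, but it is worth confirming against the installed Taichi version.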

To help move this forward, I will run a comparative benchmark on my end (comparing the Metal/f32 results against the original CPU/f64 results on the toy example) to check for numerical consistency, and I will post the findings here.
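
One possible shape for such a comparison is sketched below. It assumes both runs can export their results as arrays of the same shape. The helper name `compare_outputs`, the tolerances, and the sample values are placeholders for illustration, not measurements from SCAPE.

```python
import numpy as np


def compare_outputs(ref_f64, test_f32, rtol=1e-4, atol=1e-6):
    """Summarise how far a float32 result is from a float64 reference.

    Hypothetical helper; rtol and atol are assumed values and should be
    chosen to match what downstream analysis can tolerate.
    """
    # Promote both arrays to float64 so the comparison adds no rounding.
    ref = np.asarray(ref_f64, dtype=np.float64)
    test = np.asarray(test_f32, dtype=np.float64)
    abs_err = np.abs(test - ref)
    # Guard the relative error against division by values near zero.
    denom = np.maximum(np.abs(ref), atol)
    return {
        "max_abs_err": float(abs_err.max()),
        "max_rel_err": float((abs_err / denom).max()),
        "allclose": bool(np.allclose(test, ref, rtol=rtol, atol=atol)),
    }


# Placeholder data: a float64 reference and the same values rounded to float32.
ref = np.array([0.1, 0.25, 0.65], dtype=np.float64)
test = ref.astype(np.float32)
report = compare_outputs(ref, test)
```

Reporting the maximum absolute and relative errors alongside the pass/fail flag lets reviewers judge the gap themselves instead of relying on one fixed threshold.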

In the meantime, if anyone in the community has a Linux/CUDA setup and could help run a quick test using this branch, it would be greatly appreciated!

Thank you again for your time and for maintaining this great project.

Best regards,

Zidong

chengl7 reviewed Mar 5, 2026

zhang-zidong wants to merge 1 commit into chengl7-lab:main from zhang-zidong:main
