New recipe: High Performance Tensor Transposition library (v1.0.5) #4765
High Performance Tensor Transposition Library (v1.0.5)
Wow, the recipe is pretty well-done for a first-time contributor!
We actually do have the means to target multiple microarchitectures, but we still need to fix a few minor bugs in the auditor (for example JuliaPackaging/BinaryBuilder.jl#1194 and JuliaPackaging/BinaryBuilderBase.jl#233), and to settle which build Pkg should fall back on when the current system doesn't match any of the available microarchitectures. It was also suggested that we revise the sets and names of the microarchitectures we target: JuliaPackaging/BinaryBuilderBase.jl#233 (review)
Thanks, this went smoothly. I took inspiration from a related package (TBLIS), but it still took me about a day to get everything correct and tested (mostly because of the rather primitive build process of this particular library). One issue I noticed is that, in
Ugh. Would you mind filing an issue in https://github.com/JuliaPackaging/BinaryBuilder.jl?
This adds the HPTT library (High Performance Tensor Transposition), original version here:
https://github.com/springer13/hptt
I plan to use this library in forthcoming versions of TensorOperations.jl, and it can also be useful to other tensor libraries.
The patch fixes compatibility with clang, and corresponds to the changes of PR springer13/hptt#16.
While the library promises dedicated optimisations for the ARM platform, these lead to compiler issues (I believe the original source file contains bugs or is otherwise broken), so I decided to fall back to the general build flags on this platform and to disable the ARM-specific optimisations.
On Intel platforms, I enable AVX optimisations, which yields a warning that I cannot assume these to be present. The library does not support switching AVX on or off at runtime, so the only way around this would be to have separate builds depending on the availability of AVX, which I do not know how to realise using BinaryBuilder. I assume most users actually have AVX available.
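
To illustrate why AVX is a build-time rather than a runtime decision, here is a minimal C++ sketch that contrasts what the compiler was told with what the CPU reports. It is not taken from HPTT: the function names `compiled_with_avx`, `cpu_has_avx` and `safe_to_run` are hypothetical, and the runtime check assumes a GCC- or clang-compatible compiler on x86, where `__builtin_cpu_supports` is available.

```cpp
// Hypothetical helpers, not part of HPTT.

// True if this translation unit was compiled with AVX enabled (e.g. -mavx).
// The compiler defines __AVX__ in that case and is then free to emit AVX
// instructions anywhere in the generated code.
constexpr bool compiled_with_avx() {
#ifdef __AVX__
    return true;
#else
    return false;
#endif
}

// True if the CPU running this code reports AVX support.
// Assumption: GCC/clang builtin on x86; elsewhere we conservatively say no.
bool cpu_has_avx() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;
#endif
}

// A binary built with AVX is only safe on a CPU that has AVX;
// a binary built without AVX runs on any CPU of the same architecture.
bool safe_to_run() {
    return !compiled_with_avx() || cpu_has_avx();
}
```

A check like `safe_to_run()` can only warn or abort early. It cannot turn an AVX build into a non-AVX one, which is why separate builds per microarchitecture would be needed.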