The microarchitecture of AI chips differs from vendor to vendor, and Ascend is quite different from NVIDIA GPUs. To make Triton ops perform well on Ascend, code written for NVIDIA GPUs usually needs some changes. This repo has two main purposes:
- providing typical ops that have already been tuned for Ascend
- providing tutorials that show tips and Triton DSL extensions for Ascend
Three kinds of examples are provided:
- basic examples
- best practices
- DSL extensions
The examples depend on the following software:
- CANN
- PyTorch and PyTorch Ascend
- Triton Ascend
This project is released under the MIT license; see the LICENSE file for details.