Neural Style Transfer: Challenges with Apple’s AI Ecosystem

I continued my deep dive into neural network research on my Mac. I tried to improve the performance of my current solution and, man, did I open a can of worms 🪱… Here is what I learned, and why Apple’s NPU performance is NOT what you think it is 🤬. The main points 👇🏻
🤬 The ANE (Apple’s name for its NPU) is always underutilized. You simply can’t pin a workload to the NPU alone (there is no configuration option for that); the CPU is always in the loop. It’s either CPU+NPU (a newer option), CPU+GPU, or everything together. This hides the ANE’s problems, because the other compute bros are always there to help (see the compute-unit sketch after this list).
👎🏻 OpenCL is no longer supported on macOS, and neither is Vulkan (only through a Metal wrapper such as MoltenVK). That leaves Core ML and Metal (MPS).
🤷🏻‍♂️ Onnxruntime (Microsoft’s ML inference runtime) mostly falls back to the CPU. Note: onnxruntime has to be custom-built with Core ML support so that OpenCV can use it as an ML backend provider (see the onnxruntime sketch after this list).
🐌 A Core ML backend for OpenCV is not supported yet. A thread pool is your friend (a sketch follows this list).
😰 The AI ecosystem is fragmented. There is no unified API.
💩 Model conversion problems:
- ONNX models can’t be converted to Core ML at the moment, so models coming from the other big frameworks (TensorFlow, PyTorch) pick up side effects during conversion: glitches, suboptimal execution, bugs.
- Some layers in converted models are not supported on the GPU or the NPU, so you won’t get the acceleration gains you expect.
- For conversion, PyTorch looks like a more viable choice than the others (see the conversion sketch below).
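To make the compute-unit point concrete, here is a minimal sketch of how that preference is expressed when loading a Core ML model with coremltools. The model file name, input name, and shape are made up for illustration; the key detail is that CPU_AND_NE is as close to “NPU only” as the API lets you get.

```python
# Sketch: choosing compute units for a Core ML model (coremltools >= 6).
# "style_transfer.mlpackage" and the input name "image" are hypothetical.
import numpy as np
import coremltools as ct

model = ct.models.MLModel(
    "style_transfer.mlpackage",
    # Other options: CPU_ONLY, CPU_AND_GPU, ALL.
    # There is no ANE-only setting; the CPU is always available as a fallback.
    compute_units=ct.ComputeUnit.CPU_AND_NE,
)

x = np.random.rand(1, 3, 512, 512).astype(np.float32)
result = model.predict({"image": x})
```

Even then, compute_units only constrains which units Core ML may use; the framework still decides per layer where each operation actually runs, which is exactly why real ANE utilization is so hard to pin down.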
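For the onnxruntime point, this is roughly what requesting the Core ML execution provider looks like, assuming your onnxruntime build (or wheel) was actually compiled with it; the model path and input name are placeholders.

```python
# Sketch: ONNX inference via the CoreML execution provider, CPU as fallback.
# "style.onnx" and the input name "input" are hypothetical.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "style.onnx",
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)
# If CoreMLExecutionProvider is missing from this list, you silently got the CPU fallback.
print(session.get_providers())

x = np.random.rand(1, 3, 512, 512).astype(np.float32)
outputs = session.run(None, {"input": x})
```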
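And since OpenCV’s dnn module cannot hand work off to Core ML, one pragmatic reading of “a thread pool is your friend” is to parallelize CPU inference across frames. A rough sketch with a hypothetical ONNX model; each worker thread gets its own Net, because a cv2.dnn.Net should not be shared between concurrently running setInput()/forward() calls.

```python
# Sketch: thread-pooled CPU inference with OpenCV's dnn module.
# "style.onnx" and the frame paths are hypothetical.
import threading
from concurrent.futures import ThreadPoolExecutor

import cv2

_tls = threading.local()

def _get_net():
    # Lazily create one Net per worker thread.
    if not hasattr(_tls, "net"):
        _tls.net = cv2.dnn.readNetFromONNX("style.onnx")
    return _tls.net

def stylize(frame):
    blob = cv2.dnn.blobFromImage(frame, scalefactor=1.0 / 255, size=(512, 512))
    net = _get_net()
    net.setInput(blob)
    return net.forward()

frames = [cv2.imread(p) for p in ("content_0.jpg", "content_1.jpg")]
with ThreadPoolExecutor(max_workers=4) as pool:
    stylized = list(pool.map(stylize, frames))
```

OpenCV releases the GIL inside its native calls, so the worker threads genuinely run in parallel on separate cores.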
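Finally, on the conversion side: going through PyTorch means tracing the model to TorchScript and handing it to coremltools. A minimal sketch with a toy network standing in for a real style-transfer model; the shapes and deployment target are assumptions.

```python
# Sketch: PyTorch -> Core ML conversion with coremltools.
import torch
import torch.nn as nn
import coremltools as ct

class TinyStyleNet(nn.Module):
    """Toy stand-in for a real style-transfer network."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.sigmoid(self.conv(x))

model = TinyStyleNet().eval()
example = torch.rand(1, 3, 512, 512)
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=(1, 3, 512, 512))],
    # Newer targets unlock newer op sets, which can help ANE mapping.
    minimum_deployment_target=ct.target.macOS13,
)
mlmodel.save("TinyStyleNet.mlpackage")
```

If the converter chokes on an unsupported layer, or the converted model runs but looks wrong, that is exactly the kind of side effect described in the list above.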
The full demo of the solution is down here 👇🏻
🍎 What has to be done for Apple’s AI to improve?
- Be more open to the community and support open standards and projects (like OpenCL and OpenCV).
- The ANE’s capacity is not fully utilized; this should be properly tested, and the reasons explained.
- Support ML model inference directly on the ANE (not the hybrid approach we have now).
- Training runs on a combination of GPU and CPU at best; someday the ANE should be in the loop as well.
- There are no open benchmarks for the ANE. That’s why third parties don’t trust its performance or invest in supporting it. Once that’s fixed, things will get better.
- There is no low-level access to the ANE, which limits the options for optimization.
🧑🏻🧑🏿🧑🏼 What has to be done with the AI ecosystem in general?
- There should be a unified standard and a committee in place for neural network issues, not just a bunch of greedy companies each pulling the blanket to their own side.
- Help companies and communities adopt and support that standard.
- Protobufs are widely used by the community for intermediate graph representation. In my opinion, they have caused more harm to the IT community than NULL pointers: they create dependency hell.
- The latest Python should work out of the box. Python 3.8 (from 2019) still dominates the AI field 🤦🏼‍♂️. Use conda, pyenv, or any other supported environment manager, because macOS ships the latest version by default. Plain pip is not an option, sorry!
Conclusion
Apple made a big performance leap with Apple Silicon, yet AI is still a weak point. Improvements are needed in developer experience, training, and open-standards support. Things are moving, slowly, with market makers like NVIDIA, Microsoft, Google and, surprise-surprise, Intel; AMD has only just started, and Apple, Qualcomm, and the Chinese government are involved as well. NVIDIA still rules the market, but Apple is, in fact, making great improvements on the on-device side. With that said, I wish Apple good luck! With WWDC25 on the horizon, I hope some of these problems will be addressed.