AI Edge Torch: High Performance Inference of PyTorch Models on Mobile Devices
The intersection of artificial intelligence (AI) and mobile technology is rapidly evolving, offering new opportunities for developers to bring powerful AI applications directly to users' devices. A notable development in this landscape is AI Edge Torch, a framework designed to deploy PyTorch models onto mobile devices efficiently, delivering high performance across a wide range of hardware.
Recent collaborations between the AI Edge Torch team and companies such as Shopify, Adobe, and Niantic have yielded impressive advancements in on-device AI capabilities. For instance, Shopify has leveraged AI Edge Torch to build on-device background removal for product images, a feature slated to appear in an upcoming release of the Shopify app.
Recognizing the diverse ecosystem of mobile hardware, the AI Edge Torch initiative has emphasized compatibility across platforms. This has involved extensive cooperation with manufacturers of CPUs, GPUs, and other accelerators, including industry leaders such as Arm, Google, MediaTek, Qualcomm, and Samsung System LSI. These efforts have not only broadened support for PyTorch on mobile devices but have also improved the performance of PyTorch models when converted to TensorFlow Lite (TFLite) and run with accelerator delegates.
A milestone in these efforts is the introduction of a new TensorFlow Lite delegate by Qualcomm. This development represents a significant step forward in mobile AI, enabling accelerated execution of AI models on mobile devices. The Qualcomm Tensor Neural Processing Unit (NPU) powers this new delegate: compared to traditional CPU or GPU processing, models run on average twenty times faster than on CPU and five times faster than on GPU. This enhances the feasibility and efficiency of sophisticated AI applications on mobile devices.
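At the application level, a TFLite delegate is attached when the interpreter is constructed, and inference transparently falls back to the default CPU path if the delegate is unavailable. A minimal sketch using TensorFlow's Python API, assuming the delegate ships as a vendor-provided shared library (the library path an app would pass is device-specific and not shown here):

```python
import tensorflow as tf


def make_interpreter(model_path, delegate_path=None):
    """Build a TFLite interpreter, optionally routed through a hardware delegate."""
    delegates = []
    if delegate_path is not None:
        try:
            # Load the vendor-provided delegate shared library.
            delegates.append(tf.lite.experimental.load_delegate(delegate_path))
        except (ValueError, OSError):
            # Delegate unavailable on this device: fall back to the
            # default CPU execution path.
            pass
    interpreter = tf.lite.Interpreter(
        model_path=model_path, experimental_delegates=delegates
    )
    interpreter.allocate_tensors()
    return interpreter
```

On Android, the same pattern applies through the TFLite Java/Kotlin `Interpreter.Options` API; the Python form above is convenient for desk-side validation of a converted model.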
Qualcomm has also unveiled the Qualcomm AI Hub, an innovative cloud service designed to simplify the testing of TFLite models across a vast array of Android devices. This platform illustrates the performance improvements possible with the new delegate, highlighting the diverse capabilities of Qualcomm’s technology across different devices.
Looking ahead, the AI Edge Torch team is dedicated to refining this technology through continued development in the open. Future updates are expected to extend model support, enhance GPU compatibility, and incorporate new quantization modes, all aimed at a polished 1.0 release. Furthermore, development of the AI Edge Torch Generative API is underway, promising to empower developers to deploy custom generative AI models at the edge with strong performance.
The progress of AI Edge Torch has been made possible by the invaluable feedback from early access users and the collaborative efforts of hardware partners and the XNNPACK ecosystem contributors. The broader PyTorch community has also played a crucial role in guiding the development of this innovative technology.
As AI Edge Torch moves forward, its potential to revolutionize mobile AI applications is clearer than ever. By bridging the gap between advanced AI models and the diverse landscape of mobile hardware, AI Edge Torch is setting the stage for a new era of intelligent mobile applications, accessible anywhere, anytime.