Tether announces the release of a cross-platform BitNet LoRA framework that supports large model training and inference on consumer-grade GPUs and smartphones.

CycleProphet · 2026-03-17T14:06:18+00:00

Tether CEO Paolo Ardoino said the Tether AI team has released a new version of QVAC Fabric that integrates the BitNet LoRA framework, enabling training and inference of large models on consumer-grade GPUs and smartphones. GPU inference runs 2 to 11 times faster than on CPU, and memory usage is up to 90% lower than with full-precision models.

Deep Tide TechFlow, March 17: According to Tether CEO Paolo Ardoino, the Tether AI team has released a new version of QVAC Fabric that integrates the cross-platform BitNet LoRA framework, enabling training and inference of billion-parameter models on consumer-grade GPUs and smartphones.

The new QVAC Fabric LLM is the first to run BitNet LoRA fine-tuning and inference across AMD, Intel, Apple Metal, and mobile GPUs. On flagship devices, GPU inference is 2 to 11 times faster than CPU inference, and memory usage is up to 90% lower than with full-precision models. The Tether team has fine-tuned models with up to 3.8 billion parameters on flagship phones such as the Pixel 9, S25, and iPhone 16, and models with up to 13 billion parameters on the iPhone 16. The related code has been open-sourced on GitHub.
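
As a rough check on the memory figure, the sketch below estimates the weight storage for the two model sizes mentioned above. It rests on assumptions not stated in the announcement: that "full precision" means 16-bit weights, that BitNet weights are ternary (about log2(3) ≈ 1.58 bits each), and that only the weights are counted, ignoring activations, LoRA adapters, and packing overhead. The function names are hypothetical and are not part of QVAC Fabric.

```python
import math

# Assumption: ternary weights {-1, 0, +1} carry log2(3) bits of information each.
TERNARY_BITS = math.log2(3)  # ~1.585

def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30

def reduction(baseline_bits: float, quantized_bits: float) -> float:
    """Fraction of weight memory saved relative to the baseline precision."""
    return 1 - quantized_bits / baseline_bits

if __name__ == "__main__":
    # Model sizes taken from the article; the 16-bit baseline is an assumption.
    for params in (3.8e9, 13e9):
        fp16 = weight_memory_gib(params, 16)
        ternary = weight_memory_gib(params, TERNARY_BITS)
        print(f"{params / 1e9:.1f}B params: {fp16:.2f} GiB at 16-bit, "
              f"{ternary:.2f} GiB at ~1.58-bit")
    print(f"Weight-only reduction vs 16-bit: {reduction(16, TERNARY_BITS):.1%}")
```

Under these assumptions the weight-only saving comes out close to the quoted "up to 90%". Real implementations often pack ternary weights into 2 bits, which would give a somewhat smaller saving.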
