Particle.news

Apple and Tel Aviv University Detail PCG Method That Accelerates AI Speech by About 40%

The study presents a lightweight decoding change suited to on-device assistants without announcing any product integration.

Overview

  • Researchers describe Principled Coarse-Grained Acceptance, which groups perceptually similar acoustic tokens into overlapping Acoustic Similarity Groups.
  • A small proposer model suggests tokens that a larger judge model verifies at the group level, adapting speculative decoding to acoustic-token systems.
  • In reported evaluations, generation speed increased by roughly 40% while maintaining lower word error rates than prior speedup methods and achieving a human-rated naturalness score of 4.09.
  • A stress test that substituted 91.4% of tokens with alternatives from the same group produced only a +0.007 rise in word error rate and a −0.027 change in speaker similarity.
  • Because the technique is applied at inference time and adds about 37 MB to store the groups, coverage notes that it could reduce Siri response latency, though no rollout has been announced.
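
To make the group-level verification idea above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' implementation: the function names, the toy probabilities, and the groups are all hypothetical, and the acceptance rule shown (pooling probability mass over a token's group and applying the usual speculative-decoding ratio test) is a simplification that may differ from the rule in the study.

```python
import random

def acceptance_probability(token, draft_probs, target_probs, groups=None):
    """Probability of keeping a drafted token (illustrative only).

    groups=None gives the standard token-level speculative-decoding test.
    Otherwise probability mass is pooled over the token's similarity group,
    which is an assumed simplification of group-level acceptance.
    """
    members = [token] if groups is None else groups[token]
    q = sum(draft_probs[t] for t in members)   # mass from the small proposer
    p = sum(target_probs[t] for t in members)  # mass from the larger judge
    if q == 0.0:
        return 0.0
    return min(1.0, p / q)

def accept(token, draft_probs, target_probs, groups=None, rng=random):
    """Randomly accept or reject one drafted token."""
    prob = acceptance_probability(token, draft_probs, target_probs, groups)
    return rng.random() < prob

# Hypothetical 4-token vocabulary. The two models disagree on which of the
# similar-sounding tokens 0 and 1 is more likely.
draft_probs = [0.6, 0.2, 0.1, 0.1]
target_probs = [0.2, 0.6, 0.1, 0.1]

# Hypothetical overlapping groups: token 1 belongs to the groups of tokens
# 0, 1 and 2.
groups = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2], 3: [3]}

token_level = acceptance_probability(0, draft_probs, target_probs)
group_level = acceptance_probability(0, draft_probs, target_probs, groups)
print(token_level, group_level)
```

In this toy example the token-level test keeps token 0 only about a third of the time, because the judge prefers token 1. The group-level test always keeps it, because both models place the same total mass on the group containing tokens 0 and 1. Fewer rejected drafts means fewer fallbacks to the larger model, which is the intuition behind the reported speedup.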