Particle: Apple and Tel Aviv University Detail PCG Method That Accelerates AI Speech by About 40%

Overview

Researchers describe Principled Coarse-Grained Acceptance (PCG), which groups perceptually similar acoustic tokens into overlapping Acoustic Similarity Groups.
A small proposer model suggests tokens that a larger judge model verifies at the group level, adapting speculative decoding to acoustic-token systems.
In reported evaluations, generation speed increased by roughly 40% while word error rates stayed lower than those of prior speedup methods, and the output earned a human-rated naturalness score of 4.09.
A stress test that substituted 91.4% of tokens with alternatives from the same group produced only a +0.007 rise in word error rate and a −0.027 change in speaker similarity.
Because the technique is applied at inference time and adds about 37 MB to store the groups, press coverage notes that it could reduce Siri response latency, though no rollout has been announced.
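
To make the group-level check concrete, here is a minimal Python sketch. It assumes the judge compares the probability mass that each model places on the proposed token's whole group, in place of the exact-token ratio used in standard speculative decoding. The function names, the four-token vocabulary, the probabilities, and the groups are all hypothetical illustrations, not the researchers' implementation.

```python
import random

def group_mass(probs, group):
    """Total probability a model assigns to all tokens in one group."""
    return sum(probs[t] for t in group)

def acceptance_probability(token, proposer_probs, judge_probs, groups):
    """Chance of keeping a proposed token when the check is made per group.

    Assumption: the usual ratio min(1, p/q) is computed on group mass
    instead of on the single proposed token.
    """
    group = groups[token]
    q = group_mass(proposer_probs, group)  # proposer mass on the group
    p = group_mass(judge_probs, group)     # judge mass on the same group
    if q <= 0.0:
        return 0.0
    return min(1.0, p / q)

def verify(token, proposer_probs, judge_probs, groups, rng):
    """Randomly accept or reject the proposed token."""
    return rng.random() < acceptance_probability(
        token, proposer_probs, judge_probs, groups
    )

# Hypothetical toy data: tokens 0 and 1 are assumed to sound alike.
proposer_probs = [0.6, 0.1, 0.2, 0.1]
judge_probs = [0.1, 0.6, 0.2, 0.1]

# Overlapping groups: each token maps to its own set of similar tokens.
groups = {
    0: (0, 1),
    1: (0, 1, 2),
    2: (1, 2),
    3: (3,),
}

# Exact-token check for comparison: judge and proposer disagree on token 0.
exact_ratio = min(1.0, judge_probs[0] / proposer_probs[0])
group_ratio = acceptance_probability(0, proposer_probs, judge_probs, groups)
print(exact_ratio, group_ratio)
```

With these made-up numbers, an exact-token check would keep token 0 only about one time in six, while the group-level check always keeps it because both models place the same total mass on the group. More accepted proposals per judge pass is the kind of effect that would produce a speedup.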