Lightmatter has announced its first commercially ready photonic processing unit, claiming 1000x faster AI inference than NVIDIA's H100. The company calls it Passage, and it is not just a benchmark curiosity.
How the Architecture Works
Passage combines optical matrix multiplication with conventional electronic control logic. The expensive part of running a large language model, namely the large matrix multiplications performed at every transformer layer, is routed through silicon photonic waveguides, where the multiply-accumulate work is carried out as light propagates through the chip. Scheduling, memory control, and precision-critical operations remain on conventional electronic circuitry.
This hybrid approach sidesteps the major limitation of pure photonic computing: the difficulty of implementing arbitrary digital logic using only light. Instead, light does what it is uniquely good at (high-bandwidth, low-energy analog computation) while electrons handle control flow.
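The division of labor described here can be illustrated with a toy numerical model. The sketch below is not Lightmatter's design: the function names, the 8-bit converter depth, and the Gaussian noise level are all assumptions chosen for illustration. It treats the optical stage as a matrix-vector product with quantized inputs, additive noise, and quantized outputs, and keeps the nonlinearity in exact digital arithmetic.

```python
import random


def quantize(x, bits, full_scale):
    """Round x to the nearest level of a signed converter (assumed DAC/ADC model)."""
    levels = 2 ** (bits - 1) - 1  # e.g. 127 levels per sign for 8 bits
    clipped = max(-full_scale, min(full_scale, x))
    return round(clipped / full_scale * levels) / levels * full_scale


def analog_matvec(weights, x, bits=8, noise_std=0.01, rng=None):
    """Toy stand-in for an optical matrix-vector product.

    Inputs are quantized (DAC), dot products pick up additive Gaussian
    noise (analog domain), and results are quantized again (ADC).
    All three effects are modeling assumptions, not vendor figures.
    """
    rng = rng or random.Random(0)
    x_scale = max(abs(v) for v in x) or 1.0
    x_q = [quantize(v, bits, x_scale) for v in x]
    raw = [
        sum(w * v for w, v in zip(row, x_q)) + rng.gauss(0.0, noise_std)
        for row in weights
    ]
    out_scale = max(abs(v) for v in raw) or 1.0
    return [quantize(v, bits, out_scale) for v in raw]


def relu(values):
    """Nonlinearity kept in exact digital arithmetic on the electronic side."""
    return [max(0.0, v) for v in values]


def hybrid_layer(weights, x, **analog_kwargs):
    """Analog matrix multiply followed by a digital activation."""
    return relu(analog_matvec(weights, x, **analog_kwargs))


def exact_layer(weights, x):
    """All-digital reference for comparison."""
    return relu([sum(w * v for w, v in zip(row, x)) for row in weights])


if __name__ == "__main__":
    rng = random.Random(42)
    W = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(4)]
    x = [rng.uniform(-1, 1) for _ in range(16)]
    approx = hybrid_layer(W, x, rng=random.Random(1))
    exact = exact_layer(W, x)
    for a, e in zip(approx, exact):
        print(f"hybrid={a:+.4f}  exact={e:+.4f}  error={abs(a - e):.4f}")
```

Comparing the two outputs gives a feel for why precision-critical operations would stay on the electronic side: the analog path trades some accuracy for throughput, and the size of that trade depends on the converter depth and noise level assumed.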
Why 1000x Is Believable
The speedup claim is specifically for inference throughput on transformer models, not general FP32 FLOPS. The bottleneck in serving large models is not raw compute but memory bandwidth and energy per token. Photonic matrix multiplication sidesteps DRAM reads entirely for the weight matrices, which is where the gain comes from.
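The memory-bandwidth argument can be made concrete with a back-of-envelope bound. The sketch below is a simplified model, not a benchmark: it assumes every weight is streamed from memory once per decode step and costs about two floating-point operations (one multiply, one add) per generated token, and it ignores the KV cache, activations, and interconnect. The function name and all input figures are hypothetical placeholders, not measured or vendor-confirmed values.

```python
def decode_tokens_per_second(params, bytes_per_param, mem_bandwidth_bytes_s,
                             peak_flops, batch_size=1):
    """Upper bounds on decode throughput under a simple roofline-style model."""
    weight_bytes = params * bytes_per_param
    # Each decode step reads all weights once; the read is shared by the batch.
    bandwidth_bound = batch_size * mem_bandwidth_bytes_s / weight_bytes
    # Roughly 2 FLOPs (multiply + add) per parameter per generated token.
    compute_bound = peak_flops / (2 * params)
    limiter = "memory bandwidth" if bandwidth_bound < compute_bound else "compute"
    return {
        "bandwidth_bound": bandwidth_bound,
        "compute_bound": compute_bound,
        "achievable": min(bandwidth_bound, compute_bound),
        "limiter": limiter,
    }


if __name__ == "__main__":
    # Placeholder inputs: a 70B-parameter model at 2 bytes per weight on an
    # accelerator with 3 TB/s of memory bandwidth and 1 PFLOP/s of peak compute.
    result = decode_tokens_per_second(
        params=70e9,
        bytes_per_param=2,
        mem_bandwidth_bytes_s=3.0e12,
        peak_flops=1.0e15,
    )
    for key, value in result.items():
        print(key, value)
```

With these placeholder inputs at batch size 1, the bandwidth bound works out to about 21 tokens per second against a compute bound of about 7,100, a gap of more than 300x. Under this model, removing the weight reads attacks the binding constraint, which is the reasoning behind quoting the speedup for inference throughput rather than raw FLOPS.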
Independent researchers who have reviewed Lightmatter's whitepaper note that the 1000x figure is achievable in specific throughput-bound workloads. More general compute tasks will see smaller, but still substantial, improvements.
What It Means for AI Infrastructure
A 1000x improvement in inference cost per token would restructure the entire competitive landscape for AI serving. API pricing for frontier models could drop by two orders of magnitude. Data centers could serve the same traffic with a fraction of the rack space and power.
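One hedged way to relate a 1000x gain on the accelerated work to a price drop of roughly two orders of magnitude is an Amdahl-style cost model. It is an assumption introduced here for illustration, not something taken from Lightmatter's materials: only a share of serving cost is sped up, and the rest (networking, host CPUs, storage, margins) is left unchanged. The function name and the example shares are hypothetical.

```python
def overall_cost_reduction(accelerated_share, speedup):
    """Amdahl-style factor by which total cost per token falls.

    accelerated_share: fraction of today's cost that the speedup applies to.
    speedup: improvement factor on that fraction.
    """
    if not 0.0 <= accelerated_share <= 1.0:
        raise ValueError("accelerated_share must be between 0 and 1")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    remaining_cost = (1.0 - accelerated_share) + accelerated_share / speedup
    return 1.0 / remaining_cost


if __name__ == "__main__":
    # Hypothetical shares of serving cost attributable to the accelerated work.
    for share in (0.90, 0.99, 1.00):
        factor = overall_cost_reduction(share, 1000)
        print(f"share={share:.2f} -> cost falls by about {factor:.0f}x")
```

Under these assumptions, a 1000x speedup applied to 99% of serving cost yields an overall reduction of about 91x, and applied to 90% yields about 10x. The non-accelerated remainder quickly dominates, so the end-to-end saving depends heavily on how much of the bill the accelerated work really represents.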
The first commercial deployments are expected through cloud partnerships. Availability is planned for late 2026. Engineering samples are already in the hands of select hyperscalers for testing.