Overview
- Andreessen Horowitz and Lightspeed led the financing, with participation from Databricks’ venture arm and the UC Berkeley Chancellor’s Fund.
- Inferact was founded by vLLM core maintainers Simon Mo, Woosuk Kwon, Kaichao You, and Roger Wang.
- The company says it will continue funding and stewarding the open-source vLLM project as model architectures and hardware evolve.
- vLLM’s efficiency features, such as PagedAttention, quantisation, and multi-token generation, aim to cut memory waste and speed up responses. The library counts thousands of contributors and is used at companies including Meta and Google.
- Inferact outlines a commercial platform built for production use; reporting indicates plans for a serverless vLLM offering plus observability, troubleshooting, and disaster recovery features.