In a world of trillion-parameter AI models and massive GPU clusters, Fastino is charting a different course — training AI models on sub-$100K gaming GPU setups.
The Palo Alto-based startup has just raised $17.5 million in a seed round led by Khosla Ventures, bringing its total funding to nearly $25 million. Earlier backers include Microsoft’s M12 and Insight Partners, which contributed $7 million in a pre-seed round last year.
Fastino’s approach is built around ultra-compact, task-specific models that the company says outperform large foundation models on speed, accuracy, and cost, especially for targeted enterprise needs like document summarization or data redaction. According to Fastino, these small models can deliver a complete response in a single token, at millisecond speed.
“We’re not trying to build the biggest model in the world,” said Ash Lewis, CEO and co-founder of Fastino.
“We’re focused on models that solve specific problems exceptionally well, without the compute overhead.”
While it is still early, Fastino’s strategy is attracting attention. Its lean architecture appeals to enterprises frustrated by the inefficiency and cost of giant models. Competitors such as Cohere and Anthropic also offer compact models, but Fastino is betting that its speed and affordability will set it apart.
With the new funding, Fastino is building a contrarian-minded AI team, hiring researchers from top labs who care less about leaderboard dominance than about real-world performance.
Fastino’s vision reflects a growing shift in enterprise AI: smaller, smarter, and more efficient.