Sierra Ventures: Our Early-Stage Investment in Smallest AI

Smallest AI: The Future of Voice AI
We are thrilled to announce our Series Seed investment in Smallest AI, the company solving the biggest Enterprise Voice use cases!
In 2024, I was exchanging notes in an AI practitioners group where I saw a founder asking unusually deep technical questions. Out of curiosity, I reached out hoping to learn more. It was clear that they were onto something big. Luckily, a few months later, the founder decided to join my “Scenic Routes and Startup Roots” car ride event, which further convinced me that this was the founding duo I needed to back.
We’ve now entered a new era. Voice AI agents are no longer research projects or demos - they’re deployed in production. Whether it’s buying or selling a house, scheduling a doctor’s appointment, or returning a purchase, there’s a growing chance you’ve already interacted with an AI voice agent without realizing it.
What feels like a natural exchange requires sophisticated infrastructure: accurate speech recognition, context understanding, personalization, and the ability to handle interruptions or errors without sounding robotic. The bar for “human-like” interaction is high, and customers can get turned off the moment they realize they are interacting with AI.
Why Smallest AI
- Incredible team: Sudarshan and Akshat are exceptional AI researchers who were classmates at IIT Guwahati. They have handpicked the best AI talent to solve complex AI challenges. Their suite of products means customers don’t need to stitch together ASR, LLM/SLM, and TTS.
- Industry-leading performance: The company has some of the fastest, highest-quality voice AI models in the world (Lightning Text to Speech) that outperform ElevenLabs and Cartesia across all parameters. Their SLM Electron outperforms GPT-4.1 Mini in both quality and latency for real-time conversational use cases. Smallest AI delivers great performance through proprietary models, including:
  - Lightning TTS – outperforming ElevenLabs and Cartesia in real-time speech benchmarks
  - Electron SLM – delivering 10x faster time-to-first-byte than GPT-4.1 Mini, while beating much larger models on key benchmarks
- Integrated Voice AI stack: Delivering a natural voice interaction requires a complex workflow under the hood. It starts with Automatic Speech Recognition (ASR) to convert speech into text, enabling the system to capture intent. Accuracy matters, especially in noisy environments, and vendor capabilities vary widely. The text then flows into a language model (LLM or SLM) that generates a response, which is converted back into speech using Text-to-Speech (TTS) technology.
Each step, whether ASR, language modeling, or TTS, has its own challenges. Stitching them together is even harder. The bar for performance is very high. The entire loop must complete in under 500 to 600 milliseconds to feel real-time. To make that possible, the TTS component should take less than 100 milliseconds. The synthesized voice has to sound natural, and the response must be coherent. AI agents should not interrupt customers while they are speaking. Achieving both speed and quality simultaneously is a non-trivial problem, but solving it unlocks a new generation of human-like AI voice agents.
Smallest AI is pushing the limits of both speed and quality to create voice agents that feel truly natural and responsive. By optimizing every step in the loop, from low-latency TTS to contextually coherent responses, they are helping to deliver real-time, human-like conversational AI that sets a new benchmark for performance.
- Happy customers: Customers consistently told us how much they enjoyed working with the team. The founders are available at any time of day if support is needed. Smallest AI has successfully enabled customers to switch from legacy voice AI vendors to the modern AI stack.
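To make the latency targets from the voice AI stack discussion concrete, here is a minimal Python sketch of a per-turn budget check. The two thresholds (a loop of roughly 500 to 600 milliseconds, TTS under 100 milliseconds) come from the text above. Everything else is an assumption for illustration: the `TurnLatency` and `check_budget` names are hypothetical, the sample timings are made up, and the stages are simply summed as if they ran one after another, whereas a production system would likely stream and overlap them. It is not a description of how Smallest AI measures or enforces latency.

```python
from dataclasses import dataclass

# Targets taken from the text: the full loop should finish in under
# roughly 500 to 600 ms, and TTS should take less than 100 ms.
LOOP_BUDGET_MS = 600  # upper end of the stated range
TTS_BUDGET_MS = 100


@dataclass
class TurnLatency:
    """Per-stage latency for one conversational turn, in milliseconds."""
    asr_ms: float  # speech -> text
    lm_ms: float   # text -> response (LLM or SLM)
    tts_ms: float  # response -> speech

    @property
    def total_ms(self) -> float:
        # Simplifying assumption: stages run strictly in sequence.
        return self.asr_ms + self.lm_ms + self.tts_ms


def check_budget(turn: TurnLatency) -> list[str]:
    """Return a list of budget violations; an empty list means on target."""
    problems = []
    if turn.tts_ms >= TTS_BUDGET_MS:
        problems.append(
            f"TTS took {turn.tts_ms:.0f} ms (target: under {TTS_BUDGET_MS} ms)"
        )
    if turn.total_ms >= LOOP_BUDGET_MS:
        problems.append(
            f"loop took {turn.total_ms:.0f} ms (target: under {LOOP_BUDGET_MS} ms)"
        )
    return problems


# Hypothetical measurements, for illustration only.
fast_turn = TurnLatency(asr_ms=150, lm_ms=250, tts_ms=80)
slow_turn = TurnLatency(asr_ms=200, lm_ms=350, tts_ms=140)

print(check_budget(fast_turn))
print(check_budget(slow_turn))
```

The first turn stays inside both targets, while the second misses both. Even in this simplified form, the arithmetic shows why TTS speed matters so much: every millisecond TTS saves is a millisecond that recognition and response generation can spend on quality.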
We can’t wait to help the team build an incredible company!