What does Inworld AI actually do?
Game studios creating NPCs with AI-driven dialogue have historically faced a trade-off: pre-recorded audio sounds natural but cannot respond dynamically to player input, and scripted dialogue covers only what the writers anticipated. The alternative, building custom TTS infrastructure from scratch, is expensive and time-consuming. Most independent studios lack the resources to build real-time voice generation for interactive characters, and off-the-shelf TTS tools are not designed for the sub-100ms latency that interactive dialogue requires. As a result, most AI-powered game characters either sound robotic or require enormous engineering investment.
Inworld's Realtime TTS API is built specifically for the latency constraints of interactive game characters. It connects to your LLM and generates voice output with sub-100ms latency, so the character can start speaking while the language model is still generating the rest of its reply, and the delay is imperceptible in gameplay. Voice 2.0 gives you 58 pre-built character voices across a range of ages, accents, and emotional registers, or you can design a custom voice identity from scratch. You control tone and emotional energy via text prompts mid-scene. Teleport is the real-time streaming layer that handles the live voice session between your game and Inworld's servers.
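To make the "speaks while the language model is still processing" idea concrete, here is a minimal Python sketch of one common way to get that overlap: buffer the LLM's token stream and hand each sentence to the voice engine as soon as it is complete. This is an illustration under stated assumptions, not Inworld's actual client code. The names `sentence_chunks`, `speak_while_generating`, and the `synthesize` callback are hypothetical, and a real integration would replace `synthesize` with a call to whichever TTS client you use.

```python
import re
from typing import Callable, Iterable, Iterator

# Sentence boundary: whitespace that follows '.', '!' or '?'.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentence_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each complete sentence as soon as it appears in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be mid-sentence, so keep buffering it.
        buffer = parts[-1]
    # Flush whatever remains once the LLM stream ends.
    if buffer.strip():
        yield buffer.strip()


def speak_while_generating(
    tokens: Iterable[str],
    synthesize: Callable[[str], None],
) -> int:
    """Send sentences to a TTS callback while tokens are still arriving.

    `synthesize` is a hypothetical stand-in for a real TTS request.
    Returns the number of sentences sent.
    """
    count = 0
    for sentence in sentence_chunks(tokens):
        synthesize(sentence)
        count += 1
    return count


if __name__ == "__main__":
    # Simulated LLM output arriving in uneven pieces.
    llm_tokens = ["Halt", "! Who", " goes", " there", "? State your", " business."]
    speak_while_generating(llm_tokens, lambda text: print("TTS <-", text))
```

In this sketch the first sentence reaches the TTS callback after only two tokens have arrived, which is the property that lets a character begin speaking before the full reply exists. Splitting on sentence punctuation is a simplification; abbreviations such as "Dr." would need extra handling.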