Thinking Machines unveils TML-Interaction-Small for real-time AI interaction

Thinking Machines has announced TML-Interaction-Small, an AI model designed to replace conventional turn-taking systems with an always-present interactive approach. The 276B-parameter mixture-of-experts model has 12B active parameters and processes conversation as a continuous live stream rather than as a sequence of discrete turns. Traditional AI voice systems operate like walkie-talkies, with one side speaking at a time; TML-Interaction-Small can listen, speak, and respond to various cues simultaneously, which makes the interaction more fluid and human-like. Interactivity is built into the model itself, eliminating the need for separate detectors and timing rules. Thinking Machines reports strong early results and is positioning the model for broader use in real-time collaboration ahead of a wider release later this year.

Thinking Machines: Thinking Machines Lab is an artificial intelligence research and product company co-founded by former OpenAI CTO Mira Murati and other OpenAI alumni. It focuses on building multimodal AI that interacts naturally with humans through conversation, sight, and collaboration. The company recently deepened ties with Google Cloud for AI infrastructure and continues to develop products such as Tinker for model fine-tuning. In this announcement, it introduced TML-Interaction-Small, its first AI model built for always-present interaction, which processes live streams of audio, video, and text.
TML-Interaction-Small: TML-Interaction-Small is a multimodal mixture-of-experts AI model developed by Thinking Machines Lab as a research preview for real-time human-AI collaboration. It handles interactions through continuous micro-turns, allowing it to listen, speak, watch, and use tools simultaneously without traditional turn-taking pauses. The model enables novel behaviors such as context-driven interruptions, reactions to visual cues, and background processing for complex tasks, all while maintaining conversational presence.

Research Preview: Launched as an early model with demonstrations of live translation, code debugging, and timed reminders, ahead of a wider release later this year.
Architecture Advantage: Interactivity is trained natively into the model, bypassing the separate voice detectors, turn rules, and speech components used in most real-time systems.
Interaction Innovation: Shifts AI from walkie-talkie-style turn-taking to always-present systems that listen, watch, and respond simultaneously, as human collaborators do.