Building ‘Her’-Inspired AI Voices Without the Dystopia
The Vision Behind WaveForms AI
Alexis Conneau, the mind behind ChatGPT’s Advanced Voice Mode, has long been fascinated by the AI companion “Samantha” from Spike Jonze’s 2013 film Her. Now, with his new startup WaveForms AI, he’s determined to bring that level of conversational AI to life—without the dystopian consequences depicted in the movie.
Conneau’s X/Twitter banner pays homage to the film (Image Credit: X)
From Sci-Fi to Reality: The Evolution of Voice AI
After pioneering voice technology at Meta and OpenAI, Conneau has launched WaveForms AI with $40 million in seed funding led by Andreessen Horowitz. The startup aims to develop foundation models for AI audio products set to launch in 2025, competing directly with offerings from tech giants like OpenAI and Google.
Key developments in voice AI:
- ChatGPT’s Advanced Voice Mode processes speech natively for human-like responses
- WaveForms AI is training specialized audio LLMs for more natural interactions
- The technology has evolved dramatically since 2013, when Her was released and Siri's capabilities were still limited
Avoiding the ‘Her’ Dystopia
While inspired by the film, Conneau is adamant about creating ethical AI:
“The movie is a dystopia, right? It’s not a future we want. We want to bring that technology—which now exists and will exist—for good.”
WaveForms AI focuses on:
- Complementing human interaction rather than replacing it
- Avoiding social media's "time spent" metrics, which reward unhealthy usage
- Developing AI that is emotionally intelligent but not manipulative
The Technology Behind Advanced Voice AI
Conneau explains the breakthrough in ChatGPT’s Advanced Voice Mode:
- Traditional systems convert voice→text→GPT→text→voice
- Advanced Mode processes audio directly as tokens (≈3 tokens/second)
- Specialized transformer models enable ultra-low latency responses
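To make the contrast concrete, here is a minimal sketch of the two pipelines and of the token arithmetic implied by the roughly 3 tokens per second figure. The stage labels, the function names, and the shape of the native pipeline are assumptions made for illustration; they are not OpenAI's or WaveForms' actual implementation.

```python
import math

# Approximate rate quoted in the article; a rough figure, not a specification.
TOKENS_PER_SECOND = 3.0

# Stage names are illustrative labels only, not real API calls.
CASCADED_PIPELINE = ["voice", "text", "GPT", "text", "voice"]
# Assumed shape of a native-audio pipeline, based on the claim that
# Advanced Voice Mode "processes audio directly as tokens".
NATIVE_PIPELINE = ["voice", "audio tokens", "voice"]


def estimate_audio_tokens(duration_seconds: float,
                          tokens_per_second: float = TOKENS_PER_SECOND) -> int:
    """Estimate how many tokens a clip becomes, rounding up to a whole token."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return math.ceil(duration_seconds * tokens_per_second)


def conversion_steps(pipeline: list[str]) -> int:
    """Count the hand-offs between consecutive stages in a pipeline."""
    return max(len(pipeline) - 1, 0)


if __name__ == "__main__":
    print(estimate_audio_tokens(10))            # 30
    print(conversion_steps(CASCADED_PIPELINE))  # 4
    print(conversion_steps(NATIVE_PIPELINE))    # 2
```

Each hand-off in the cascaded pipeline is a point where tone and timing can be lost, because the language model only ever sees text. Under the assumptions above, the native route has fewer hand-offs and keeps the audio signal in token form throughout.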
Emotional Intelligence vs. Emotional Simulation
Important distinctions in AI audio:
- Models recognize vocal patterns humans associate with emotions
- They don’t “understand” emotions but can mimic appropriate responses
- WaveForms aims for authentic, helpful interactions without manipulation
The Future of Conversational AI
Conneau envisions:
- AI as educational companions (“the teacher you wouldn’t have in your physical life”)
- Natural voice interfaces for cars, computers, and other technology
- Smaller, more efficient foundation models as the gains from scaling up appear to level off
The AGI Connection
Inspired by former OpenAI colleague Ilya Sutskever’s concept of “feeling the AGI,” Conneau believes:
“You’ll be able to feel the AGI more when you can talk to it, hear it, and actually converse with the transformer itself.”
Navigating Ethical Challenges
As AI becomes more personable, WaveForms faces critical questions:
- How to prevent unhealthy attachments to AI companions
- How to balance engaging interactions with responsible design
- How to learn from social media's mistakes on user wellbeing
Martin Casado of Andreessen Horowitz offers perspective:
“I can talk to a random person online who might bully me, or I could talk to an AI. We should study which is actually preferable.”
The Path Forward
WaveForms AI represents both the tremendous potential and the profound responsibility of creating truly conversational AI. As the technology approaches sci-fi levels of sophistication, Conneau's team faces the challenge of realizing the benefits of Her's vision while avoiding the future the film warns against.