MEETING YOUR YOUNGER SELF
What would you say if you could talk to the version of yourself who needed it most?
For Pearson and VaynerMedia, we helped create a moment where that question wasn’t hypothetical, but painfully, beautifully real.
Together with our sister company Hummingbird, Tool was responsible for designing and building real-time, emotionally responsive avatars that allowed three women to meet and speak with their younger selves.
Not actors.
Not scripted stand-ins.
But themselves.
THE IDEA
Each story carried its own truth:
- Escaping homelessness and building a stable future for your family
- Proving everyone wrong and earning a place at NYU after unimaginable loss
- Finally earning the degree you’d secretly claimed to hold your whole life, even to your own children
The premise was simple, and devastatingly powerful:
The women were told they would mentor a student with a similar background.
Instead, they met their younger selves — and were given the chance to say the things they never got to hear.
TOOL'S ROLE
Tool led the creation of the real-time talking avatars, enabling a live, unscripted, emotionally intelligent conversation between past and present.
This wasn’t about spectacle.
It was about recognition, trust, and emotional truth — all happening in real time.
THE CHALLENGE
Creating a real-time avatar of someone who no longer exists
Real-time avatars require specific video training footage — at least two minutes, recorded under strict conditions.
But these younger versions of Jasmine, Natosha, and Savannah don’t exist anymore.
There was no footage.
Only memories.
RECREATING THE PAST
Step 1: Building the younger self
We began by recreating a single, photoreal image of each woman as she was during that defining chapter of her life.
This process combined:
- Archival personal photos
- Detailed personal memories and self-descriptions
- Facial reconstruction and generative image workflows
- Careful styling and character design
Each avatar had to stand upright, neutral, and camera-ready, matching the exact physical constraints of the projection glass.
Our direction duo John & Aqsa worked closely with the women, translating interview insights into wardrobe, posture, and accessories: details each would instantly recognize as her own.
MAKING STILLNESS SPEAK
Step 2: From image to living presence
The next challenge was turning a static image into a two-minute continuous driver video — the foundation for real-time interaction.
AI video generation typically tops out at roughly eight seconds of usable footage.
We needed 120 seconds, without visual breaks, artifacts, or inconsistencies.
Using a custom ComfyUI workflow with WAN 2.2, we merged:
- The static younger-self image
- A performance by a live actor
- Carefully controlled micro-movement, gestures, and emotional nuance
Every blink, breath, and shift mattered.
Any visual hiccup would compound downstream — especially once trained into the avatar model.
Working with real actors was essential. This wasn’t animation. It was performance.
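One way to reason about that 8-second ceiling is as a segmentation problem: chain short generation windows, each overlapping the last so the seam can be conditioned and crossfaded away. The sketch below is illustrative only; the clip length, overlap, and stitching strategy are assumptions, not the production WAN 2.2 workflow.

```python
# Hypothetical planning sketch: covering a 120-second driver video with
# short, overlapping generation windows. Values are assumptions.

def plan_segments(total_s: float, clip_s: float = 8.0, overlap_s: float = 1.0):
    """Plan overlapping clip windows covering [0, total_s].

    Each clip after the first starts overlap_s before the previous one
    ends, so the final frames of clip N can seed clip N+1 and a short
    crossfade can hide the seam.
    """
    step = clip_s - overlap_s
    segments = []
    start = 0.0
    while start + clip_s < total_s:
        segments.append((start, start + clip_s))
        start += step
    segments.append((start, total_s))  # final clip absorbs the remainder
    return segments

if __name__ == "__main__":
    for seg in plan_segments(120.0):
        print(seg)
```

Even in this toy form, the math shows why any visual hiccup compounds: 120 seconds at these settings means seventeen chained clips, each inheriting whatever artifacts the previous one introduced.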
TRAINING THE AVATAR
Step 3: Real-time conversation under emotional constraints
The driver videos were then fed into Tavus, a platform primarily built for customer service — not trauma-informed conversations with your past self.
We stress-tested every variable:
- Camera distance: close-ups to full body
- Emotional range and expressiveness
- Eye contact and gaze deviation
- Angled compositions for over-the-shoulder filming
Each pipeline run took over a day.
Decisions were irreversible.
To manage risk, we ran multiple pipelines in parallel, learning early where the system bent — and where it broke.
GIVING THE AVATAR A VOICE
A person isn’t defined only by how they look, but by how they sound.
During interviews, the women recorded high-quality voice data using professional Zoom recorders, guided carefully by the directors to capture:
- Emotional cadence
- Rhythm and pacing
- Natural hesitation and warmth
While we initially explored aging their voices down, we quickly realized something important:
No one truly remembers how they sounded back then.
But everyone recognizes their own voice now.
That recognition became part of the emotional impact.
Voice cloning was handled using ElevenLabs, preserving familiarity and authenticity over artificial youthfulness.
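To make the "familiarity over youthfulness" choice concrete, here is a minimal sketch of the kind of request body ElevenLabs-style text-to-speech accepts. The model ID and settings values are illustrative assumptions, not the production configuration.

```python
# Illustrative sketch: assembling a text-to-speech request body.
# voice settings here are assumed values, not Tool's actual choices.

def build_tts_payload(text: str, stability: float = 0.5,
                      similarity_boost: float = 0.8) -> dict:
    """Assemble a JSON body for a text-to-speech call, favouring a
    natural, recognisable delivery over heavy stylisation."""
    return {
        "text": text,
        "model_id": "eleven_multilingual_v2",  # assumed model choice
        "voice_settings": {
            "stability": stability,                # lower = more expressive
            "similarity_boost": similarity_boost,  # closeness to source voice
        },
    }
```

A higher similarity setting keeps the cloned voice anchored to the recordings the women actually made, which is exactly the recognition effect described above.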
TEACHING THE AVATAR HOW TO CARE
The final layer was intelligence.
Through deep interviews, our directors uncovered core memories, fears, beliefs, and wounds — the emotional architecture behind each story.
That information was structured into a conversational framework that allowed the avatar to:
- Respond with empathy
- Reflect shared memories
- Stay emotionally aligned with the moment
We used OpenAI GPT-4.1 mini for its balance of speed and response quality, ensuring the conversation felt immediate — not mechanical.
The goal wasn’t improvisation.
It was emotional coherence.
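A conversational framework like the one described might, in rough outline, flatten the interview material into a structured system prompt. The field names and wording below are hypothetical; Tool's actual framework is not public.

```python
# Illustrative only: structuring interview material into a system prompt.
# Field names and prompt wording are assumptions, not the real framework.

def build_persona_prompt(name: str, age: int, memories: list[str],
                         beliefs: list[str], boundaries: list[str]) -> str:
    """Flatten one story's emotional architecture into a single system
    prompt that keeps the model in character and within bounds."""
    lines = [
        f"You are {name} at age {age}, speaking with your present-day self.",
        "Stay emotionally aligned with her. Do not invent new facts.",
        "Core memories you may reflect on:",
        *[f"- {m}" for m in memories],
        "Beliefs and fears that shape how you speak:",
        *[f"- {b}" for b in beliefs],
        "Never do the following:",
        *[f"- {b}" for b in boundaries],
    ]
    return "\n".join(lines)
```

Constraining the model this way trades improvisational range for exactly what the section names: emotional coherence.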
ON SITE
Tool supported the full live experience:
- Real-time projection mapping onto glass
- Keying and compositing
- Audio routing and capture
- System monitoring and fail-safes
All so the women could forget the technology — and simply be present.
THE RESULT
Three conversations that were never supposed to happen.
Three moments of closure, validation, and healing — unfolding live.
This wasn’t AI pretending to be human.
It was technology stepping aside, just enough, to let humanity speak.
TECHNOLOGY & TOOLS
- Tavus — real-time avatar creation
- ElevenLabs — voice cloning
- Weavy — image generation & editing
- Photoshop
- ComfyUI — custom WAN 2.2 animation workflow on RunPod
- OpenAI GPT-4.1 mini