The voice Alexis “Lexi” Bogan had before last summer was exuberant.
She loved to belt out Taylor Swift and Zach Bryan ballads in the car. She laughed all the time. In high school, she was a soprano in the chorus.
Then that voice was gone.
Doctors in August removed a life-threatening tumor lodged near the back of her brain. When the breathing tube came out a month later, Bogan had trouble swallowing and strained to say “hi” to her parents. Months of rehabilitation aided her recovery, but her speech is still impaired.
In April, the 21-year-old got her old voice back. Not the real one, but an AI-generated voice clone that she can summon from a phone app. Trained on a 15-second time capsule of her teenage voice, sourced from a cooking demonstration video she recorded for a high school project, her synthetic but remarkably real-sounding AI voice can now say almost anything she wants.
THE RISKS
Experts have warned that rapidly improving AI voice-cloning technology can amplify phone scams, disrupt elections and violate the dignity of people, living or dead, who never consented to having their voice recreated to say things they never spoke.
It’s been used to produce deceptive robocalls to New Hampshire voters mimicking President Joe Biden. In Maryland, a high school athletic director was charged with using AI to generate a fake audio clip of the school’s principal making racist remarks.
But Bogan and a team of doctors at Rhode Island’s Lifespan hospital group believe they’ve found a use that justifies the risks. She’s one of the first people and the first with her condition to work with ChatGPT-maker OpenAI to replicate a lost voice.
“We’re hoping Lexi’s a trailblazer as the technology develops,” said Dr. Rohaid Ali, a neurosurgery resident at Brown University’s medical school and Rhode Island Hospital. Millions of people with debilitating strokes, throat cancer or neurodegenerative diseases could benefit, he said.
TRAINING AN AI VOICE
Bogan had to go back a few years to find a suitable recording of her voice to “train” the AI system on how she spoke. It was a video in which she explained how to make a pasta salad.
Her doctors intentionally fed the AI system just a 15-second clip. Cooking sounds make other parts of the video imperfect. It was also all that OpenAI needed — an improvement over previous technology requiring much lengthier samples.
Getting something useful out of 15 seconds could be vital for any future patients who have no trace of their voice on the internet. A brief voicemail left for a relative might have to suffice.
When they tested it for the first time, everyone was stunned by the quality of Bogan’s voice clone. “I get so emotional every time I hear her voice,” said her mother, Pamela Bogan, tears in her eyes.
USING AN AI VOICE
Bogan types a few words or sentences into her phone and her custom-built app instantly reads them aloud.
She now uses her AI voice about 40 times a day and sends feedback she hopes will help future patients. One of her first experiments was to speak to the kids at the preschool where she works as a teaching assistant.
She’s used it at stores to ask where to find items. It’s helped her reconnect with her dad, who has hearing loss and was struggling to understand her. And it’s made it easier for her to order fast food.
“Hi, can I please get a grande iced brown sugar oat milk shaken espresso,” said Bogan’s AI voice as she held the phone out her car’s window at a Starbucks drive-thru.
“I think it’s awesome that I can have that sound again,” she said. It’s helping to boost her confidence and restoring a part of her identity she thought she was losing forever.
WHO’S NEXT?
Bogan’s doctors have started cloning the voices of other willing Rhode Island patients and hope to bring the technology to hospitals around the world. OpenAI said it is treading cautiously in expanding the use of the tool it calls Voice Engine, which is not yet publicly available.
Other companies with commercially available voice-generation services say they prohibit impersonation or abuse, but they vary in how they enforce their terms of use.
“We want to make sure that everyone whose voice is used in the service is consenting on an ongoing basis,” said Jeff Harris, OpenAI’s lead on the product. “We want to make sure that it’s not used in political contexts.”
Harris said OpenAI’s next step involves developing a secure “voice authentication” tool so users can replicate only their own voice, with a possible exception for trusted medical providers working with a patient.
While for now she must fiddle with her phone to get the voice engine to talk, Bogan imagines an AI voice engine that improves upon older remedies for speech recovery by melding with the human body or translating words in real time.
She’s less sure about what will happen as she grows older and her AI voice continues to sound like she did as a teenager. Maybe the technology could “age” her AI voice, she said.
For now, “even though I don’t have my voice fully back, I have something that helps me find my voice again,” she said.
___
The Associated Press and OpenAI have a licensing and technology agreement that allows OpenAI access to part of AP’s text archives.
Matt O’Brien, The Associated Press