From text to natural voice

Enrique Mendez, 9, and his older brother, Cristian, 11, sorted through a plastic bin of toys in their New Jersey home. "I want to play with the wrestling guys," said Enrique in a voice not quite his own, but pretty close.

Enrique has Down syndrome and speech apraxia, which means that he cannot speak, aside from a few grunts and "Ma" in the word "Mama." He was able to speak to his brother, though, with an iPad loaded with the latest version of a widely used text-to-speech application, Proloquo2Go. "The voice now matches the boy," said John Mendez, Enrique's father.

Until recently, devices that help children like Enrique speak used modified adult voices. The effect can be startling to those listening because it doesn't sound like a child's voice. Most existing children's voices sound "like adults on helium," said David Niemeijer, chief executive and lead developer at AssistiveWare, which developed the software Enrique tested.

AssistiveWare and its partner, Acapela Group, developed the next version, Proloquo2Go 2.1, which features two children's voices - known as Josh and Ella - actually recorded by children. The $190 (Rs 10,574) application went on sale on iTunes last Wednesday, but people who already own the app can add the latest voices at no charge.

Few, if any, other companies offer true children's voices, largely because of the challenges of recording children. The average 10-year-old cannot spend hundreds of hours in a sound booth recording the library of phrases needed to create a synthetic child's voice.

Sound engineering can manipulate adult voices, adding filters that adjust for the higher pitch of a child's voice, for example. But without a baseline recording, the voices to date have lacked the natural sound of a child's voice. With little competitive pressure to replicate children's voices, most companies decided children could get by with the altered adult voices.

The release of Proloquo2Go's boy and girl voices - the company also has two other children's voices with a British accent for that market - is an indicator of new progress in the decades-old text-to-speech industry.

The progress is, in part, a side effect of the adoption of automated voices in everything from credit card company service lines to the grocery store checkout kiosk. Faster computer processors with more memory have also enabled sound engineers to make artificial voices sound more human. Many of the larger voice companies like Nuance, in Massachusetts, and Ivona, in Poland, now offer voices in multiple languages and accents.

Proloquo2Go, which runs on Apple's mobile devices, is used by tens of thousands of children with disabilities like autism and cerebral palsy. Proloquo2Go "can be a good fit for some people, but not for everyone," said Janice C. Light, a professor at Pennsylvania State University.

Said Niemeijer, the AssistiveWare chief: "A degree of assessment is definitely necessary because the parents often just go out and buy the device and it doesn't work out. Parents often have too high hopes."

During the recording sessions for Proloquo2Go 2.1, audio engineers collected several thousand phrases and hundreds of words. From this bank of recordings, the application can synthesize any word in the English language. For example, the word "impressive" is stitched together from pieces of the words "impossible," "president" and "detective."
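
The stitching idea can be illustrated with a toy sketch. It is not AssistiveWare's or Acapela's method: real systems join recorded sound units, not letters, and the function name `stitch` and the three-word bank here are hypothetical. The sketch only shows how a new word might be covered, piece by piece, with the longest fragments available in a small set of recorded words.

```python
def stitch(target, recorded_words):
    """Greedily cover `target` with the longest fragments found in recorded words.

    Letters stand in for sound units here; this is an illustration only.
    Returns a list of (fragment, source_word) pairs.
    """
    pieces = []
    pos = 0
    while pos < len(target):
        best = None
        # Try the longest remaining fragment first, then shorter ones.
        for end in range(len(target), pos, -1):
            fragment = target[pos:end]
            source = next((w for w in recorded_words if fragment in w), None)
            if source is not None:
                best = (fragment, source)
                break
        if best is None:
            raise ValueError(f"no recorded word contains {target[pos]!r}")
        pieces.append(best)
        pos += len(best[0])
    return pieces


# Hypothetical three-word bank taken from the article's example.
bank = ["impossible", "president", "detective"]
pieces = stitch("impressive", bank)
for fragment, source in pieces:
    print(f"{fragment!r} from {source!r}")
```

With letters as a stand-in, the fragments come from all three words in the bank. An actual synthesizer would also have to smooth the joins so that pitch and timing sound continuous.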

Most text-to-speech devices do give users the ability to say almost anything, and many allow users to choose whether they want to sound happy, angry or sad. The challenge facing the industry is how to develop text-to-speech technologies that can predict the emotion, or tone, a person might want to use.

Many in the industry agree that a synthetic voice, even one that expresses basic emotions, is barely adequate to allow someone with a speech disability to speak normally. "You often can't really chip in sharp/sarcastic comments," wrote Martin Pistorius, a 36-year-old Web developer and author, in an email. He lost his voice after contracting meningitis when he was 12 and has been using text-to-speech technology for 10 years. "By the time you've composed it, the moment has gone so it wouldn't really be funny or appropriate any more."

"I'm pretty quick at getting my message out, but even so I still can't keep up with the pace of normal conversation," he wrote.
