Friday, October 23, 2009

Speech processing

From the IET:

"'You what…?' are the last words the police want to hear over their radios when in hot pursuit with sirens blaring. The same goes in court if a jury can’t understand the recording of a critical 999 call made outside a noisy nightclub. In both cases, it would be tempting to reach for help from a speech-enhancement algorithm to separate the message from the medium.

And yet research by the Centre for Law Enforcement Audio Research (CLEAR) has shown that most speech-enhancement techniques improve sound quality at the expense of intelligibility, particularly when the signal-to-noise ratio (SNR) is very low. Closely related as they are, speech quality and intelligibility are not identical.

“You can find speech signals that an average human listener would judge as good quality but they wouldn’t be able to understand all the words,” explains Patrick Naylor, reader in speech and audio signal processing at Imperial College London and a member of CLEAR. “Equally, it is sometimes possible to pick out all the words in a poor-quality signal that is full of background noise.”

A great deal of the work on the intelligibility of speech has been done to improve the effectiveness of hearing aids and telecoms channels, where the SNR is typically better than 10 to 20dB. But in law-enforcement situations, the audio quality can be far worse, with SNRs of 0dB or lower quite common. And intelligibility is far more important than sound quality. If you cannot make out all the words in a recorded phone call or police interview tape, it loses its value as evidence. Similarly, those using communications equipment links in noisy environments, such as police drivers, put themselves and others at risk if they misunderstand what they are hearing or react slowly because they are trying to pick the signal out of the noise."
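For readers unfamiliar with the decibel figures quoted above, here is a minimal sketch of what they mean in terms of power. It uses the standard definition of SNR as ten times the base-10 logarithm of the signal-to-noise power ratio; the function names are my own illustration and do not come from the article or from CLEAR's work.

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels, given two average powers."""
    return 10.0 * math.log10(signal_power / noise_power)

def noise_to_signal_ratio(snr_in_db):
    """Noise power as a fraction of signal power for a given SNR in dB."""
    return 10.0 ** (-snr_in_db / 10.0)

# Hypothetical power values, chosen only to illustrate the scale.
print(snr_db(1.0, 1.0))    # equal signal and noise power: the 0 dB case
print(snr_db(100.0, 1.0))  # signal 100 times stronger than noise: the 20 dB case
print(noise_to_signal_ratio(10.0))  # at 10 dB the noise carries a tenth of the signal's power
```

In other words, at the 0dB level described as common in law-enforcement recordings, the background noise is as powerful as the speech itself, and below 0dB it is more powerful.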

Read the rest of the article by clicking here.
