Friday, May 28, 2004

Indexing Audio Conversations

Get Real has a writeup on an HP Research project called Speechbot. From the article:

John Dowdell points to an interesting research project being conducted at HP Labs, the SpeechBot. As the site describes, "SpeechBot is a search engine for audio & video content that is hosted and played from other websites".

Digging a little deeper into the technical documentation for SpeechBot, I came across this summary:

SpeechBot (http://www.compaq.com/speechbot) is the first Internet search site for indexing streaming spoken audio on the web. Unlike previous attempts to index spoken audio on the Web, which have relied on either adjacent text, metadata, or hand supplied transcripts and close captions, SpeechBot uses automatic speech recognition technology to transcribe and index documents that do not have transcripts or other content information. The use of speech recognition permits the efficient and cost-effective indexing of thousands of hours of audio content, which were previously inaccessible. Because of this indexing, SpeechBot allows users to quickly search for relevant content in long audio documents and yields a high precision on first page-retrieved items.

Read more.

It's not often I get to discuss my two favorite subjects in one post, but this could have some interesting potential. Already, with products like OneNote, it is possible to take notes during a meeting while recording the audio, and have the audio time-indexed to your notes. When you search your notes for a keyword, you can start the audio to hear exactly what was being said at the moment you took the note. That is perfect for project meetings, and very useful on a personal level for conference sessions. Also, many financial companies already record phone conversations.
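To make the time-indexing idea concrete, here is a minimal sketch in Python. It is not how OneNote works internally; the `Note` class, the `find_playback_offsets` function, and the five-second lead-in are all assumptions made up for illustration. It assumes only that each note carries the number of seconds into the recording at which it was typed.

```python
from dataclasses import dataclass


@dataclass
class Note:
    # Hypothetical structure: seconds from the start of the recording
    # at which the note was typed, plus the note text itself.
    offset_seconds: float
    text: str


def find_playback_offsets(notes, keyword, lead_in=5.0):
    """Return audio start positions for every note containing keyword.

    Playback starts lead_in seconds before the note was typed, since
    people usually write a note shortly after the words were spoken.
    The result is clamped so it never goes below zero.
    """
    needle = keyword.lower()
    return [
        max(0.0, note.offset_seconds - lead_in)
        for note in notes
        if needle in note.text.lower()
    ]


notes = [
    Note(12.0, "Kickoff and introductions"),
    Note(95.5, "Budget approved for phase two"),
    Note(310.0, "Open question on budget owner"),
]

offsets = find_playback_offsets(notes, "budget")
```

Here `offsets` holds the two positions in the recording, in seconds, where playback would start for the notes that mention "budget".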

What if it were policy to record such audio? And, one step further, policy to automagically index it? And then to archive it as an unstructured data source in a data warehouse or an enterprise document repository? The capabilities are not so far-fetched. Querying an ODS and finding an index of audio conversations (or transcriptions) alongside historical support requests would be quite valuable in the right context. Not to mention other uses, such as project technical meeting archives indexed for technical documentation purposes in an IT data warehouse.
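As a rough sketch of what such an index could look like, the Python below builds a simple inverted index over transcript segments. It assumes the speech recognition step has already happened and produced (recording id, start time, text) tuples; that step is not shown. The function names and the sample call ids are hypothetical, and this is not a description of how SpeechBot or any particular warehouse product does it.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")


def build_index(segments):
    """Map each word to the (recording_id, start_seconds) pairs where it occurs.

    segments is an iterable of (recording_id, start_seconds, text) tuples,
    assumed to come from an upstream transcription step.
    """
    index = defaultdict(list)
    for recording_id, start_seconds, text in segments:
        for word in set(WORD.findall(text.lower())):
            index[word].append((recording_id, start_seconds))
    return index


def search(index, query):
    """Return the segments that contain every word in the query, sorted."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


segments = [
    ("call-001", 0.0, "Thanks for calling support"),
    ("call-001", 42.0, "The printer driver fails to install"),
    ("call-002", 15.0, "Driver install worked after the reboot"),
]

index = build_index(segments)
results = search(index, "driver install")
```

Each result pairs a recording with the offset at which to start listening, which is the kind of row that could sit next to a support request in an ODS.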

An Architect of any persuasion would probably find such a searchable archive very valuable. Combine it with some BAM features and perhaps he could be alerted whenever someone uses the word "enable."

1 Comment:

At 6:19 PM, Anonymous said...

Duncan,

I'm very intrigued about the OneNote applications. I had previously heard about the voice annotation features but I didn't realize that it was quite so extensive.

I think we agree that there is tremendous potential in better accessing this information. With PocketPCs getting 600MHz processors and up (not to mention Tablet PCs' full-on processors), it's only a matter of time before some of this moves to our pockets.

Greg (from GetReal)

 
