In a recent blog post, Google provided several details on the tests it is conducting to make audio search a reality in the years to come. Will it ever be possible for Google to create an index of audio content that users can browse like web pages in their organic search results? Judging by the blog post published by the Mountain View firm, audio search is a difficult task to pull off. The results of the latest tests are explained in that post by Tim Olson of KQED, a San Francisco radio station Google has partnered with to make audio more visible in search results.
With the help of KUNGFU.AI, a specialized artificial intelligence company that provides data, Google and KQED are running several batteries of tests aimed at determining how to transcribe audio content quickly and faithfully, the way textual content is indexed. Currently, there is no reliable way to transcribe audio content accurately enough to make it easily findable for users. The only way for audio search to be deployed on a global scale would be through automated audio-to-text transcription: manually transcribing every piece of audio content would demand enormous resources and effort from content creators.
The limitations of Speech-to-Text technology
Tim Olson of the radio station KQED points out that audio transcriptions must reach a high degree of precision, especially when transcribing an audio news item (broadcast on the radio) into text content. Even Speech-to-Text, the automatic speech recognition technology built on Google's AI, is not enough to reach that level of precision. Google, KQED, and KUNGFU.AI ran tests applying Speech-to-Text to a series of audio news items from KQED radio. These tests gave mixed results: the technology's AI struggled to correctly identify and understand proper names referring to named entities.
Named entities often need specific context to be understood precisely, and it is precisely this context that the Speech-to-Text AI struggles to assimilate. To make this more concrete, Tim Olson gives the example of an audio news item from KQED radio that contains several named entities relating to the San Francisco Bay Area. Like many local stations, KQED's local news is rich in local references: proper names referring to subjects, people, places, and businesses tied to the San Francisco area. For example, radio presenters use acronyms like "CHP" for the California Highway Patrol and shorthand like "the Peninsula" for the peninsula stretching from the San Francisco area to San Jose.
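Google's actual pipeline is not public, but the kind of post-processing this problem calls for can be illustrated with a small sketch. Assuming a raw transcript in which local acronyms come out as lowercase letter sequences, a glossary of known local entities (the patterns below are hypothetical) could restore them after recognition:

```python
import re

# Hypothetical glossary of Bay Area named entities.
# Keys are patterns a recognizer tends to emit; values are the intended entity.
LOCAL_ENTITIES = {
    r"\bc\.? ?h\.? ?p\.?\b": "CHP",        # California Highway Patrol
    r"\bthe peninsula\b": "the Peninsula",  # the SF-to-San-Jose peninsula
}

def restore_entities(transcript: str) -> str:
    """Apply glossary substitutions to a raw speech-to-text transcript."""
    for pattern, entity in LOCAL_ENTITIES.items():
        transcript = re.sub(pattern, entity, transcript, flags=re.IGNORECASE)
    return transcript

raw = "a c h p officer closed a lane on the peninsula this morning"
print(restore_entities(raw))
# a CHP officer closed a lane on the Peninsula this morning
```

A simple substitution table like this only patches known errors; it cannot supply the geographical context the article describes, which is exactly why generic recognition falls short on local news.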
What are the prospects?
These named entities, which take on a very specific meaning depending on geographical context, are hard for the Speech-to-Text AI to identify. When it does not understand a named entity, the AI guesses what was said as best it can. That is, of course, unworkable for web search: a bad transcription can completely change the meaning of what was said aloud. The goal for Google and its media partners is to make audio search possible globally as quickly as possible. This would open up a new facet of web search, granting visibility to billions of pieces of content that are currently barely exposed.
David Stoller, Google’s head of partnerships for news and press, said this technology will be shared globally once its development reaches the required level. Tim Olson also notes that machine learning models do not learn from their mistakes on their own, hence the need for humans to step in and improve the relevance of these technologies. For him, the next step is for all media outlets to invest in developing this technology, with newsrooms constantly feeding back common errors in audio-to-text transcriptions.
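In its simplest form, the feedback loop Olson describes could be a shared table of corrections that newsroom staff keep appending to, applied to every new transcript. A minimal sketch, with all names hypothetical:

```python
class CorrectionLog:
    """Accumulates human-reported transcription errors and applies them."""

    def __init__(self):
        self.corrections: dict[str, str] = {}

    def report(self, heard: str, intended: str) -> None:
        """A newsroom editor logs that `heard` should have been `intended`."""
        self.corrections[heard] = intended

    def apply(self, transcript: str) -> str:
        """Apply every logged correction to a new transcript."""
        for heard, intended in self.corrections.items():
            transcript = transcript.replace(heard, intended)
        return transcript

log = CorrectionLog()
log.report("k q e d", "KQED")
print(log.apply("this is k q e d news"))
# this is KQED news
```

Because the model itself does not improve, the value here lies entirely in the accumulated human feedback, which is why Olson argues every newsroom needs to contribute.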