Link to the event on MS Teams:
We encourage you to attend the lecture and to bring this event to the attention of your co-workers, colleagues, doctoral students, and master's students.
Below you will find the speaker's biography and a short abstract of the lecture.
Prof. Dr. Bernd T. Meyer
Carl von Ossietzky Universität Oldenburg, Germany
Bernd T. Meyer received the Ph.D. degree from the University of Oldenburg, Germany, in 2009, where he was a member of the Medical Physics Group. He was a Visiting Researcher in the speech group at the International Computer Science Institute, Berkeley, CA, USA, and worked in the Center for Language and Speech Processing at Johns Hopkins University, Baltimore, MD, USA. Since 2019, he has been Professor of Communication Acoustics at the University of Oldenburg. His research interests include the relation between speech and hearing, with a special interest in models of human speech perception, automatic speech processing, and its applications in hearing technology.
“Deep learning for models of speech perception and
automated speech audiometry with Alexa”
Deep learning has given speech technology a major boost and enabled devices with relatively high robustness in automatic speech recognition (ASR). For some use cases, the underlying algorithms have become so robust that their degradation in the presence of noise is similar to that of human listeners perceiving noisy speech. In my talk I will provide examples of models for speech intelligibility, perceived speech quality, and subjective listening effort derived from deep neural networks that are based on estimates of phoneme probabilities calculated from acoustic observations. In some cases, these algorithms outperform baseline models despite the fact that they operate on a mixture of noise and speech, in contrast to other approaches that often require separate noise and speech inputs. This implies that these algorithms need less a priori knowledge, which could make them interesting for hearing research, e.g., for continuous optimization of parameters in future hearing devices. The underlying statistical models were trained with hundreds or thousands of hours of speech and are harder to analyze than many established models; yet they are not black boxes, since we have various methods to study their properties, which will be briefly outlined. A second application of robust ASR is conducting listening tests in which the subject listens to a noisy sentence and responds with the words he or she recognized. In a clinical setting, such a test is guided by a human supervisor who logs the correct word responses, which is time-consuming and expensive. I will present some of our results for speech audiometry using an ASR system and also briefly introduce the Alexa skill for performing a screening procedure in the living room.