P74, Session 2 (Friday 10 January 2025, 09:30-11:30)
Can we know audiovisual speech perception ability without measuring it?
Speech comprehension is often boosted by watching a talker’s face. This is of critical importance for anyone who is hearing impaired, and the benefit that a hearing aid brings is likely to depend on whether or not the talker’s face is visible. However, the influence of visual speech is not normally considered when treating hearing loss. This is partly because of large individual differences, which make it difficult to quantify the benefit without direct measurement. If we could develop efficient ways of predicting the boost provided by seeing a talker’s face, it might become practical to factor this in when treating hearing loss. Here we ask: is it possible to predict an individual’s audiovisual speech perception from measurements of the separate modalities?
We fitted a novel model of audiovisual integration to a normative dataset of ~200 individuals from a broad demographic, collected on-line, who performed an open-set speech-in-noise task in three modalities: visual-only, auditory-only and audiovisual. The model was able to fit the individual variation in audiovisual speech perception under the assumption that it depended only on auditory-only and visual-only speech perception (R ~ 0.95), even though unisensory performance varied widely across individuals. This suggests that the “audiovisual integration function” for speech is relatively consistent across a diverse population, with individual differences attributable to differences in unisensory perception.
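The abstract does not specify the model’s functional form, so the sketch below is purely illustrative of the general approach: a single integration function, shared across individuals, that maps each person’s unisensory scores to a predicted audiovisual score. Here we assume a probability-summation rule with one free gain parameter, fitted to simulated placeholder data standing in for the normative set; the function name, parameterisation, and data are all hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical integration function (assumed, not the authors' model):
# a probability-summation rule scaled by one gain parameter g that is
# shared across all individuals.
def av_from_unisensory(X, g):
    p_a, p_v = X                                 # unisensory proportions correct in [0, 1]
    p_sum = 1.0 - (1.0 - p_a) * (1.0 - p_v)      # independent-channels baseline
    return np.clip(g * p_sum, 0.0, 1.0)          # gain-scaled, bounded prediction

# Simulated placeholder data standing in for the ~200-person normative set
rng = np.random.default_rng(0)
p_a = rng.uniform(0.1, 0.9, 200)                 # auditory-only scores
p_v = rng.uniform(0.0, 0.5, 200)                 # visual-only (lipreading) scores
p_av = np.clip(1 - (1 - p_a) * (1 - p_v) + rng.normal(0, 0.03, 200), 0, 1)

# Fit the single shared parameter across all individuals
(g_hat,), _ = curve_fit(av_from_unisensory, (p_a, p_v), p_av, p0=[1.0])
pred = av_from_unisensory((p_a, p_v), g_hat)
r = np.corrcoef(pred, p_av)[0, 1]                # correlation analogous to the reported R
print(f"fitted gain g = {g_hat:.2f}, R = {r:.2f}")
```

The key design point this illustrates is that only the integration function is fitted; all between-person variation in the predicted audiovisual scores comes from the unisensory inputs.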
To test whether this model might have the predictive power to be of clinical use, we used it to predict new data that had played no part in fitting. We applied the model to a new dataset, collected in person, in which participants were either young (<35) or older (>50) and either had normal hearing or a diagnosed hearing loss (most used hearing aids). Their visual-only and auditory-only speech perception scores were used to predict audiovisual speech perception, using the model fit to the on-line data. The model predicted audiovisual performance quite well. Exactly how well is left for the meeting, to provide intrigue. Overall, these data support the idea that we can predict audiovisual speech perception from perception of the individual senses.
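Continuing the illustrative sketch above (and reusing its hypothetical av_from_unisensory, g_hat, and rng), the out-of-sample test amounts to applying the already-fitted function to the new cohort’s unisensory scores without any refitting; the cohort data here are again simulated placeholders, not the in-person measurements.

```python
# Placeholder unisensory scores for the new in-person cohort,
# including poorer auditory scores for hearing-impaired listeners
p_a_new = rng.uniform(0.05, 0.8, 40)
p_v_new = rng.uniform(0.0, 0.5, 40)

# Predict audiovisual performance with the normatively fitted gain: no refitting
p_av_pred = av_from_unisensory((p_a_new, p_v_new), g_hat)

# p_av_pred would then be compared against the cohort's measured
# audiovisual scores (e.g. via correlation or prediction error)
```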