Deep Posterior Mean Opinion Score for Speech
Title: Deep Posterior Mean Opinion Score for Speech
DNr: Berzelius-2024-88
Project Type: LiU Berzelius
Principal Investigator: Saikat Chatterjee <sach@kth.se>
Affiliation: Kungliga Tekniska högskolan
Duration: 2024-03-25 – 2024-10-01
Classification: 20205
Keywords:

Abstract

We propose to develop a deep neural network (DNN) based method that provides a posterior distribution of the mean opinion score (MOS) for an input speech signal. The DNN outputs the statistical parameters of this posterior distribution. We refer to the proposed method as deep posterior MOS (DeePMOS). For robust training of DeePMOS, we will combine maximum-likelihood learning, stochastic gradient noise, and a student-teacher learning setup. Using the mean of the posterior as a point estimate, we will then evaluate DeePMOS on standard performance measures. The results will be published in standard venues such as Interspeech, ICASSP, and the IEEE TASLP journal. The project requires computational resources, for which we apply here.
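
As an illustration of the maximum-likelihood step, the following minimal sketch computes the negative log-likelihood of MOS labels under a predicted posterior. It assumes the posterior is Gaussian and that the network outputs a mean and a log-variance per utterance; the abstract does not state the distribution family, and the function names (gaussian_nll, batch_nll, point_estimate) are hypothetical.

```python
import math


def gaussian_nll(mos_label, mean, log_var):
    """Negative log-likelihood of one MOS label under N(mean, exp(log_var)).

    Assumption: the posterior is Gaussian and the network predicts the
    log-variance, which keeps the variance positive without constraints.
    """
    var = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (mos_label - mean) ** 2 / var)


def batch_nll(labels, means, log_vars):
    """Average NLL over a batch; minimizing it is maximum-likelihood learning."""
    losses = [gaussian_nll(y, m, lv) for y, m, lv in zip(labels, means, log_vars)]
    return sum(losses) / len(losses)


def point_estimate(mean, log_var):
    """Posterior mean used as the point estimate of MOS, as in the abstract."""
    return mean


if __name__ == "__main__":
    # Hypothetical network outputs for two utterances with MOS labels 3.0 and 4.0.
    labels = [3.0, 4.0]
    means = [3.0, 3.0]
    log_vars = [0.0, 0.0]
    print("average NLL:", batch_nll(labels, means, log_vars))
    print("point estimates:", [point_estimate(m, lv) for m, lv in zip(means, log_vars)])
```

In this sketch the predicted variance acts as a per-utterance uncertainty: a larger variance lowers the penalty for a mismatched mean but is itself penalized through the log-variance term. Stochastic gradient noise and the student-teacher setup mentioned in the abstract are not shown.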