Publication
TripletCough: Cougher Identification and Verification From Contact-Free Smartphone-Based Audio Recordings Using Metric Learning.
Scientific Article/Review - 03.06.2022
Jokic Stefan, Cleres David, Rassouli Frank, Steurer-Stey Claudia, Puhan Milo Alan, Brutsche Martin, Fleisch Elgar, Barata Filipe
Short Description/Objective
Cough, a symptom associated with many prevalent respiratory diseases, can serve as a potential biomarker for diagnosis and disease progression. Consequently, the development of cough monitoring systems and, in particular, of automatic cough detection algorithms has been studied since the early 2000s. Recently, there has been an increased focus on the efficiency of such algorithms, as implementation on consumer-centric devices such as smartphones would provide a scalable and affordable solution for monitoring cough with contact-free sensors. Current algorithms, however, cannot distinguish between the coughs of different individuals and thus cannot function reliably in situations where multiple individuals must be monitored in shared environments. We therefore propose a weakly supervised metric learning approach for cougher recognition based on smartphone audio recordings of coughs. Our approach involves a triplet network architecture built on convolutional neural networks (CNNs). The CNNs of the triplet network learn an embedding function that maps Mel spectrograms of cough recordings to an embedding space in which they are more easily distinguishable. Using audio recordings of nocturnal coughs from asthmatic patients captured with a smartphone, our approach achieved a mean accuracy of 88% (±10% SD) on two-way identification tests with 12 enrollment samples, and an accuracy of 80% and an equal error rate (EER) of 20% on verification tests. Furthermore, our approach outperformed human raters on verification tests by, on average, 8% in accuracy, 4% in false acceptance rate (FAR), and 12% in false rejection rate (FRR). Our code and models are publicly available.
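
To illustrate the metric learning setup described in the abstract, the following is a minimal sketch of a triplet network trained with a triplet loss, assuming PyTorch. The CoughEncoder architecture, the embedding dimension, the input spectrogram shape, and the margin are illustrative assumptions for this sketch, not the configuration used in the paper; the authors' actual code and models are publicly available.

    # Minimal triplet-network sketch for cough embeddings (assumes PyTorch).
    # All layer sizes and hyperparameters below are illustrative assumptions,
    # not the architecture from the paper.
    import torch
    import torch.nn as nn

    class CoughEncoder(nn.Module):
        """Maps a Mel spectrogram (1 x n_mels x n_frames) to an embedding vector."""
        def __init__(self, embedding_dim=128):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.AdaptiveAvgPool2d(1),  # global pooling -> fixed-size features
            )
            self.fc = nn.Linear(32, embedding_dim)

        def forward(self, x):
            z = self.conv(x).flatten(1)
            z = self.fc(z)
            return nn.functional.normalize(z, dim=1)  # unit-norm embeddings

    encoder = CoughEncoder()
    # The triplet loss pulls the anchor and positive (same cougher) together
    # and pushes the negative (different cougher) away by at least the margin.
    triplet_loss = nn.TripletMarginLoss(margin=0.2)

    # Dummy batch of 8 Mel spectrograms per branch (64 Mel bands x 128 frames).
    anchor, positive, negative = (torch.randn(8, 1, 64, 128) for _ in range(3))
    loss = triplet_loss(encoder(anchor), encoder(positive), encoder(negative))
    loss.backward()

With an encoder trained this way, identification and verification reduce to distance comparisons in the embedding space: a query cough is embedded and compared (e.g., by Euclidean distance) against the embeddings of a cougher's enrollment samples, with verification accepting or rejecting based on a distance threshold that trades off FAR against FRR (the EER is the operating point at which the two are equal).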