Audio-Based Snoring Index in relation to AHI

Quick Summary

What we're exploring: The paper "Dynamics of snoring sounds and its connection with obstructive sleep apnea" reports a significant correlation between the number of snore events per hour (computed from audio recordings) and the apnea-hypopnea index (AHI). We try to reproduce this result on the PSG-Audio dataset.

Data

Data was downloaded from our MinIO bucket `rekonas-dataset-psg-audio`.
I made direct use of the extracted FLAC audio files provided at some point by Corius at `rekonas-dataset-psg-audio/cornelius.reyneke/psg_audio_V3/APNEA_FLAC`. I did not try to reproduce the extraction, and unfortunately no scripts for extracting the FLACs from the EDFs are available.
For processing this data I provide a few Python scripts with their dependencies included (runnable via `uv run`). These (a) filter the audio using a bandpass filter and (b) extract the apnea/hypopnea events and sleep stages from the XMLs (which carry the extension `.rml`) in `rekonas-dataset-psg-audio/psg_audio_V3/APNEA_RML_clean`.

Notes on the XMLs/Annotations:

The XMLs only include the onset of each sleep stage, not its duration, and therefore cannot be parsed by our internal RKNS. A custom XML parser was required.
The XMLs use namespaces, which complicates parsing.
Sleep stages are available from both experts and automatic staging for this dataset. I used the expert-based stages.
For a few recordings no corresponding FLAC is available. Corius mentioned that a few failed during his extraction.
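
The two parsing issues above (namespaces, and stages given only by onset) can be sketched as follows. This is a minimal illustration, not the actual parser: the namespace URI, the element name `Stage`, and the attributes `Type` and `Start` are assumptions made for the toy document, not the real `.rml` schema. Each stage's duration is derived as the gap to the next onset, and the last stage is closed with a recording end time that has to be supplied separately.

```python
import xml.etree.ElementTree as ET

# Toy document. Namespace, tag, and attribute names are hypothetical.
TOY_XML = """<Study xmlns="http://example.com/psg">
  <Staging>
    <Stage Type="Wake" Start="0"/>
    <Stage Type="NonREM1" Start="300"/>
    <Stage Type="NonREM2" Start="420"/>
    <Stage Type="Wake" Start="1500"/>
  </Staging>
</Study>
"""

# Prefix-to-URI map so namespaced tags can be addressed in path queries.
NS = {"psg": "http://example.com/psg"}


def stage_intervals(xml_text, recording_end_s):
    """Return (label, onset_s, duration_s) tuples from onset-only stages."""
    root = ET.fromstring(xml_text)
    stages = sorted(
        (float(el.get("Start")), el.get("Type"))
        for el in root.iterfind(".//psg:Stage", NS)
    )
    # A stage lasts until the next onset; the final one until recording end.
    ends = [onset for onset, _ in stages[1:]] + [float(recording_end_s)]
    return [(label, onset, end - onset) for (onset, label), end in zip(stages, ends)]


intervals = stage_intervals(TOY_XML, recording_end_s=1800)
```

Without the prefix map, a query for plain `Stage` finds nothing in a namespaced document, which is the usual stumbling block with these files.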

There is a significant correlation between the extracted snore index and the AHI.
The correlation seems strongest when only snores that occur at least 20 s after the previous snore are counted (instead of the 10 s proposed in the paper).
The correlation seems strongest for N2, with the caveat that there is little data for the other stages, as the patients slept badly (~4 h total sleep time on average).
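
The counting rule behind these findings can be sketched as below. This is one reading of the rule, under stated assumptions: a snore is counted only if its onset lies at least `min_gap_s` after the previous detected snore (counted or not), and the first snore is always counted. The function names are hypothetical, and the note does not state which correlation coefficient was used, so Pearson's r here is an assumption.

```python
from math import sqrt


def count_spaced_snores(onsets_s, min_gap_s=20.0):
    """Count snores starting at least min_gap_s after the previous snore."""
    count = 0
    previous = None
    for onset in sorted(onsets_s):
        if previous is None or onset - previous >= min_gap_s:
            count += 1
        previous = onset  # gap is measured to the previous snore, counted or not
    return count


def snore_index(onsets_s, sleep_time_s, min_gap_s=20.0):
    """Counted snores per hour of sleep."""
    if sleep_time_s <= 0:
        raise ValueError("sleep_time_s must be positive")
    return count_spaced_snores(onsets_s, min_gap_s) / (sleep_time_s / 3600.0)


def pearson_r(xs, ys):
    """Pearson correlation of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Given one snore index and one AHI per recording, `pearson_r(snore_indices, ahis)` gives the correlation across recordings. Measuring the gap to the last counted snore instead would be an equally plausible reading and yields higher counts for long, regular snore trains.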

I have no proper explanation for why the 20 s minimum interval increases the correlation.
The filtering described in the paper doesn't work well for the available audio data. I instead used a basic Butterworth filter of lower order (12). This could definitely be improved.
Talking and other sounds are not really filtered out by the filtering + thresholding approach.
The detection threshold was chosen arbitrarily.
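
For reference, the filtering + thresholding approach discussed above could look roughly like the sketch below. It is not the actual script: the cutoffs `low_hz` and `high_hz`, the frame length `frame_s`, and the threshold are placeholder values, and the function names are hypothetical. Note also that passing order 12 to `scipy.signal.butter` for a bandpass yields an overall filter of order 24, and that `sosfiltfilt` runs it forward and backward; whether that matches the "order 12" used for the results is an assumption.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass(audio, fs, low_hz=100.0, high_hz=1000.0, order=12):
    """Zero-phase Butterworth bandpass; cutoffs are placeholder values."""
    # Second-order sections keep a high-order design numerically stable.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, audio)


def frame_rms(audio, fs, frame_s=0.1):
    """RMS energy of consecutive non-overlapping frames."""
    n = int(frame_s * fs)
    usable = len(audio) // n * n  # drop the incomplete trailing frame
    frames = np.asarray(audio[:usable]).reshape(-1, n)
    return np.sqrt(np.mean(frames**2, axis=1))


def event_onsets(rms, threshold, frame_s=0.1):
    """Onset times (s) where the frame RMS rises above the threshold."""
    above = np.asarray(rms) > threshold
    if above.size == 0:
        return np.array([])
    rising = np.flatnonzero(np.diff(above.astype(int)) == 1) + 1
    if above[0]:  # recording starts inside an event
        rising = np.concatenate(([0], rising))
    return rising * frame_s
```

Anything loud within the passband crosses the threshold, so talking is detected just like snoring, which is the limitation noted above.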

Should we proceed?: Maybe

If yes, what needs to happen?

Include knowledge about the SpO2 measurements in the selection of snores.
Consider using a proper snore detection method.
Instead of simply counting snores, a smarter aggregation of the events should be considered. The sensitivity to the minimum interval suggests that the intervals between snores are highly relevant. Maybe consider the frequency of snore events?
Consider using the machine labels.
Look at correlation with only apneas and only hypopneas respectively.
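
For the last point, separate apnea and hypopnea indices can be derived from the same event list by counting each group per hour of sleep, as in this minimal sketch. The event type strings and the function name are assumptions for illustration and would need to be matched to the labels actually present in the annotations.

```python
from collections import Counter

# Hypothetical event type labels; adjust to the annotation files.
APNEA_TYPES = {"ObstructiveApnea", "CentralApnea", "MixedApnea"}
HYPOPNEA_TYPES = {"Hypopnea"}


def event_indices(event_types, sleep_time_s):
    """Events per hour of sleep: apneas (AI), hypopneas (HI), both (AHI)."""
    if sleep_time_s <= 0:
        raise ValueError("sleep_time_s must be positive")
    hours = sleep_time_s / 3600.0
    counts = Counter(event_types)
    apneas = sum(counts[t] for t in APNEA_TYPES)
    hypopneas = sum(counts[t] for t in HYPOPNEA_TYPES)
    return {
        "AI": apneas / hours,
        "HI": hypopneas / hours,
        "AHI": (apneas + hypopneas) / hours,
    }
```

Each of the three values can then be correlated with the snore index separately.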