Vocal fold lesions and early-stage laryngeal cancer alter voice acoustics in ways that could allow AI detection.
Researchers in the US have shown that abnormalities of the vocal folds can be detected from voice sounds.
Such ‘vocal fold lesions’ might be benign, such as nodules or polyps, but can also signify early laryngeal cancer.
Dr Phillip Jenkins, a postdoctoral fellow in clinical informatics at Oregon Health & Science University, is the corresponding author of the study, which was published in Frontiers in Digital Health.
He said: ‘These proof-of-concept findings pave the way for a new AI application: recognising early warning signs of laryngeal cancer from voice recordings. Here we show that with this dataset, we could use vocal biomarkers to distinguish voices from patients with vocal fold lesions from those without such lesions.’
Jenkins and his colleagues are part of the ‘Bridge2AI-Voice’ project within the US National Institutes of Health’s ‘Bridge to Artificial Intelligence’ (Bridge2AI) consortium, an initiative to apply AI to complex biomedical challenges.
They analysed variations in tone, pitch, volume, and clarity within the first version of the public Bridge2AI-Voice dataset, comprising 12,523 voice recordings from 306 participants across North America.
A small number of these were from patients with known laryngeal cancer, benign vocal fold lesions or other voice-related conditions such as spasmodic dysphonia and unilateral vocal fold paralysis.
The team focused on differences in various acoustic features, including the mean fundamental frequency (pitch), jitter (the variation in pitch during speech), shimmer (the variation in amplitude), and the harmonic-to-noise ratio (a measure of the relationship between harmonic components and noise in speech).
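To make these measures concrete, the sketch below computes them from a handful of values using common textbook definitions. It is an illustration only, not the study's analysis pipeline: the function names are hypothetical, and it assumes that the duration and peak amplitude of each vocal fold cycle, and the normalised autocorrelation at the pitch period, have already been extracted from the recording.

```python
import math


def mean_f0(periods):
    """Mean fundamental frequency in Hz from cycle durations in seconds."""
    return sum(1.0 / t for t in periods) / len(periods)


def _relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods):
    """Cycle-to-cycle variation in period (pitch), as a fraction of the mean period."""
    return _relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation in peak amplitude, as a fraction of the mean amplitude."""
    return _relative_perturbation(amplitudes)


def hnr_db(r):
    """Harmonic-to-noise ratio in dB.

    Assumes r is the normalised autocorrelation at the pitch period (0 < r < 1),
    read as the share of signal energy that is periodic.
    """
    return 10.0 * math.log10(r / (1.0 - r))


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    periods = [0.010, 0.011, 0.010, 0.011]   # seconds per cycle
    amplitudes = [1.0, 0.9, 1.0, 0.9]        # arbitrary units
    print("mean F0 (Hz):", mean_f0(periods))
    print("jitter:", local_jitter(periods))
    print("shimmer:", local_shimmer(amplitudes))
    print("HNR (dB):", hnr_db(0.99))
```

In this formulation a steadier, less noisy voice gives lower jitter and shimmer and a higher harmonic-to-noise ratio. Real analyses typically rely on dedicated speech-analysis software to extract the per-cycle values first.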
Significant differences were found in the harmonic-to-noise ratio and fundamental frequency between men without voice disorders, men with benign vocal fold lesions, and men with laryngeal cancer. No acoustic features were found to distinguish these groups among women, although a larger dataset may reveal such differences.
The authors concluded that variation in the harmonic-to-noise ratio, in particular, could help monitor the progression of vocal fold lesions and detect laryngeal cancer at an early stage, at least in men.
Jenkins said: ‘Our results suggest that ethically sourced, large, multi-institutional datasets like Bridge2AI-Voice could soon help make our voice a practical biomarker for cancer risk in clinical care.’
Having established the proof of concept, the next step involves applying these algorithms to larger datasets and testing them in clinical environments with patient data.
Jenkins added: ‘To move from this study to an AI tool that recognises vocal fold lesions, we would train models using an even larger dataset of voice recordings, labelled by professionals. We must then validate the system to ensure it performs equally well for women and men.
‘Voice-based health tools are already being piloted. Building on our findings, I estimate that with larger datasets and clinical validation, similar tools to detect vocal fold lesions might enter pilot testing within the next few years.’
The authors concluded that future studies should aim to increase sample sizes and include more detailed data, such as lesion sizes.
Additionally, the sex of participants influenced the results, which should be taken into account in future recruitment to avoid biased datasets.
Experts say further research should investigate how voice features differ across the various benign and malignant lesions.


