
Lupine Publishers Group



Accuracy of Laryngoscopy for Quantitative Vocal Fold Analysis in Combination with AI, A Cohort Study of Manual Artefacts

by Mette Pedersen*, Christian F Larsen

Abstract: Introduction: A cohort of high-speed videoendoscopies was evaluated for usability in deep learning. The aim of our study was to find the percentage of our 15,732 high-speed videos that could be used for deep learning (AI). A screening of the material showed that some videos had artefacts that made them non-usable for deep learning.

Material: A random sample was drawn with the Wolfram Alpha random number generator from 15,732 videos of 7,909 patients. The categories of non-usable videos are described: the rear parts of the vocal folds not seen, the epiglottis or uvula blocking the view, parts of the vocal folds not seen, no vibration of the vocal folds, persistently constricted larynx, picture taken from an oblique angle, the front part of the vocal folds not seen, and parts of the arytenoid region not seen.

Method: Assuming the assessments are independent with regard to whether there is a finding, the total number of assessments with a given finding is binomially distributed. With 100 assessments, an observed incidence of 1, 10 and 25 findings will result in estimated 95% confidence intervals of [0%-3%], [4%-16%] and [17%-33%], respectively. The 95% confidence intervals are calculated as Wald intervals, using the asymptotic Normal approximation to the distribution of the estimated proportion in the binomial distribution. Assuming the incidence of each of the different findings was below 25%, the expected length of the 95% confidence interval is at most 16 percentage points (33-17) with 100 assessments; with 200 and 500 assessments, the corresponding length is 14 and 8 percentage points, respectively. Based on these calculations, 100 randomised films were considered sufficient for the analysis.

Results and Conclusion: The prospective cohort study of high-speed videos covered 12 years, from February 2007 to January 2019, in an otorhinolaryngology medical centre. A total of 15,732 high-speed video films of the larynx, including the vocal folds, from 7,909 patients had been consecutively sampled (4,000 frames per second, Richard Wolf Ltd. endocam 5562). Observations of usable versus non-usable high-speed videos, with 95% confidence intervals, showed that only 51% were usable. Notably, oblique-angle pictures (10%) and insufficient pictures of the front of the vocal folds and arytenoids (14%) were the largest groups of non-usable videos. Both can be improved by the examiner in future examinations. Various video and deep learning programs are discussed.
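
The Wald intervals quoted in the Method part of the abstract can be checked with a minimal sketch such as the one below. It is not the authors' code: the function name `wald_interval`, the constant `Z_95`, and the clipping of the bounds to the range 0 to 1 are assumptions made for illustration, and the abstract's own figures are rounded to whole percentages.

```python
import math

# Two-sided 95% quantile of the standard normal distribution.
Z_95 = 1.959964


def wald_interval(findings, n, z=Z_95):
    """Wald (normal-approximation) confidence interval for a binomial proportion.

    findings: number of assessments with the finding (hypothetical parameter name)
    n:        total number of assessments
    Returns (lower, upper) as proportions, clipped to [0, 1].
    """
    p = findings / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # The three incidences used as examples in the abstract, with 100 assessments.
    for k in (1, 10, 25):
        lower, upper = wald_interval(k, 100)
        print(f"{k}/100 findings: [{lower:.1%}, {upper:.1%}]")
```

Calling `wald_interval(findings, n)` with other sample sizes shows how the interval narrows as more videos are assessed. The Wald approximation is known to behave poorly for proportions near 0 or 1, which is why the lower bound for 1 finding in 100 has to be clipped at zero here.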
