Session: 10.4.3 - Vortex Dynamics III
Paper Number: 158722
Using Schlieren Imaging and Flow Feature Analysis to Classify Alphabetic Speech Sounds
Abstract:
Speech, an essential facet of human communication, requires intricate coordination of the vocal tract's shape and movements. Articulation disorders can severely hinder an individual's ability to produce clear and accurate speech sounds, impairing communication, social interaction, and overall quality of life. This study explores how advanced imaging technology can support speech therapy for individuals with speech disorders. A Schlieren optical system was designed to capture speech airflows during the articulation of alphabetic characters. A hybrid analytical framework that integrates advanced computational methods with an AI-driven approach was applied to analyze flow features, focusing specifically on vortex dynamics, for letter-specific speech flow classification. This framework allows complex spatiotemporal patterns in speech flows to be extracted and analyzed, aiding in the differentiation of phonetic sounds. A novel ResNet-twin-LSTM architecture was developed to generate high-dimensional temporal and spatial feature vectors from Schlieren images. The model processes these vectors using bidirectional LSTM layers, which capture both forward and backward dependencies in temporal data and thereby improve the model's ability to classify complex, dynamic phonetic patterns over time. For the first seven alphabetic characters (A-G), the proposed method achieved a classification accuracy exceeding 95%, even with a limited video dataset. This result indicates that the twin-LSTM architecture effectively captured the consistent spatiotemporal flow features inherent in each letter. By comparison, a conventional ResNet-mono-LSTM architecture yielded significantly lower classification accuracy on the same Schlieren video set.
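The bidirectional processing described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes that per-frame feature vectors have already been extracted from the Schlieren images (for example, by a ResNet backbone) and shows only how a forward and a backward LSTM pass are combined at each time step. The weights are random and untrained, and the helper names (`make_lstm_params`, `lstm_step`, `run_lstm`, `bidirectional_features`) and all dimensions are hypothetical. This is not the authors' ResNet-twin-LSTM implementation, whose details the abstract does not specify.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm_params(input_dim, hidden_dim, rng):
    """Random (untrained) weights and zero biases for the four LSTM gates."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    # Gates: i = input, f = forget, g = candidate cell, o = output.
    return {gate: (matrix(hidden_dim, input_dim + hidden_dim), [0.0] * hidden_dim)
            for gate in ("i", "f", "g", "o")}


def affine(weights, bias, vec):
    """Compute weights @ vec + bias using plain lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_step(params, x, h, c):
    """One standard LSTM update for input x, hidden state h, cell state c."""
    z = x + h  # list concatenation of the input and the previous hidden state
    i = [sigmoid(v) for v in affine(*params["i"], z)]
    f = [sigmoid(v) for v in affine(*params["f"], z)]
    g = [math.tanh(v) for v in affine(*params["g"], z)]
    o = [sigmoid(v) for v in affine(*params["o"], z)]
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, frames, hidden_dim):
    """Run an LSTM over a sequence, returning the hidden state at every step."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    outputs = []
    for x in frames:
        h, c = lstm_step(params, x, h, c)
        outputs.append(h)
    return outputs


def bidirectional_features(frames, fwd_params, bwd_params, hidden_dim):
    """Concatenate forward-in-time and backward-in-time hidden states per frame."""
    fwd = run_lstm(fwd_params, frames, hidden_dim)
    # Process the reversed sequence, then flip back so indices align with frames.
    bwd = run_lstm(bwd_params, frames[::-1], hidden_dim)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical input: 5 video frames, each already reduced to a 4-D feature vector.
    frames = [[rng.random() for _ in range(4)] for _ in range(5)]
    fwd_params = make_lstm_params(4, 3, rng)
    bwd_params = make_lstm_params(4, 3, rng)
    features = bidirectional_features(frames, fwd_params, bwd_params, 3)
    print(len(features), len(features[0]))
```

Each output vector combines context from earlier frames (forward pass) and later frames (backward pass). In a classifier, such vectors would typically be pooled or fed to a final layer that predicts the letter; that stage is omitted here.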
Presenting Author: Jinxiang Xi, University of Massachusetts, Lowell
Presenting Author Biography: Dr. Mohamed obtained his PhD in Biomedical Engineering from UMASS Lowell in 2024. He has published 30 journal papers in respiratory dynamics, inhalation drug delivery, experimental fluid mechanics, and machine learning.
Paper Type
Technical Presentation Only