American Society of Mechanical Engineers

Session: 10.4.3 - Vortex Dynamics III

Paper Number: 158722

158722 - Using Schlieren Imaging and Flow Feature Analysis to Classify Alphabetic Speech Sounds

Abstract:

Speech, an essential facet of human communication, requires the intricate coordination of the vocal tract's shape and movements. Articulation disorders can severely hinder an individual's ability to produce clear and accurate speech sounds, disrupting communication effectiveness, social interactions, and overall quality of life. This study investigates how advanced imaging technology can support speech therapy and help address the challenges faced by individuals with speech disorders. A Schlieren optical system was designed to capture the airflows produced while speakers articulate alphabetic characters. A hybrid analytical framework integrating advanced computational methods with an AI-driven approach was applied to analyze flow features, focusing specifically on vortex dynamics for letter-specific speech flow classification. This approach allows complex spatiotemporal patterns in speech flows to be extracted and analyzed, aiding in the differentiation of phonetic sounds. A novel ResNet-twin-LSTM architecture was developed to generate high-dimensional temporal and spatial feature vectors from Schlieren images. The model processes these vectors using bidirectional LSTM layers, which capture both forward and backward dependencies in temporal data and thereby enhance the model's ability to classify complex, dynamic phonetic patterns over time. For the first seven alphabetic characters (A-G), the proposed method achieved a classification accuracy exceeding 95%, even with a limited video dataset. This result indicates that the twin-LSTM architecture effectively captured the consistent spatiotemporal flow features inherent in each letter. By comparison, a conventional ResNet-mono-LSTM architecture yielded significantly lower classification accuracy on the same Schlieren video set.
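The bidirectional step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' ResNet-twin-LSTM: the abstract does not specify the twin structure, layer sizes, or training procedure, so none of that is reproduced here. The sketch assumes each Schlieren frame has already been reduced to a feature vector by a CNN backbone (random numbers stand in for those features), and the names `init_lstm`, `run_lstm`, and `classify_sequence`, the 512-dimensional feature size, the 30-frame clip length, and the hidden size of 64 are all hypothetical choices made for illustration. The weights are random and untrained, so the output probabilities carry no meaning beyond showing the data flow.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(input_dim, hidden_dim, rng):
    """Random (untrained) weights for one LSTM direction; gates stacked as [i, f, g, o]."""
    scale = 1.0 / np.sqrt(hidden_dim)
    return {
        "W": rng.uniform(-scale, scale, (4 * hidden_dim, input_dim)),
        "U": rng.uniform(-scale, scale, (4 * hidden_dim, hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
    }


def run_lstm(frames, params):
    """Run one LSTM direction over frames of shape (T, input_dim); return the final hidden state."""
    hidden_dim = params["U"].shape[1]
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x in frames:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output gates
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def classify_sequence(frames, fwd, bwd, w_out, b_out):
    """Bidirectional pass: read the clip forward and reversed, then classify the joined states."""
    h_fwd = run_lstm(frames, fwd)        # dependencies from past to future
    h_bwd = run_lstm(frames[::-1], bwd)  # dependencies from future to past
    logits = w_out @ np.concatenate([h_fwd, h_bwd]) + b_out
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


# Hypothetical sizes: 30 frames per clip, 512-dim per-frame features, 7 classes (A-G).
T, FEATURE_DIM, HIDDEN_DIM, NUM_CLASSES = 30, 512, 64, 7
rng = np.random.default_rng(0)
frames = rng.standard_normal((T, FEATURE_DIM))  # stand-in for CNN features of Schlieren frames
fwd = init_lstm(FEATURE_DIM, HIDDEN_DIM, rng)
bwd = init_lstm(FEATURE_DIM, HIDDEN_DIM, rng)
w_out = rng.standard_normal((NUM_CLASSES, 2 * HIDDEN_DIM)) * 0.1
b_out = np.zeros(NUM_CLASSES)

probs = classify_sequence(frames, fwd, bwd, w_out, b_out)
print(probs.shape, float(probs.sum()))
```

Concatenating the final forward and backward states is only one common way to summarize a bidirectional pass; pooling over all time steps would be an equally plausible choice, and the abstract does not say which the authors used.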

Presenting Author: Jinxiang Xi, University of Massachusetts, Lowell

Presenting Author Biography: Dr. Mohamed obtained his PhD in Biomedical Engineering from UMASS Lowell in 2024. He has published 30 journal papers in respiratory dynamics, inhalation drug delivery, experimental fluid mechanics, and machine learning.

Paper Type

Technical Presentation Only