Title: Towards Automatic Analysis of Audio Recordings from Children with Autism Spectrum Disorder
Committee:
Dr. David Anderson, ECE, Chair, Advisor
Dr. Mark Clements, ECE
Dr. Chin-Hui Lee, ECE
Dr. Omer Inan, ECE
Dr. Catherine Lord, UCLA
Abstract: Autism spectrum disorder (ASD) is a neurodevelopmental disorder that can negatively impact learning, behavior, and social communication and interaction. In the United States, 1 in 59 children aged eight was diagnosed with ASD, according to the 2014 report of the Centers for Disease Control and Prevention (CDC) Autism and Developmental Disabilities Monitoring (ADDM) Network. Unfortunately, manual analysis of recordings of children with ASD is expensive, time-consuming, and does not scale well. This dissertation addresses general approaches for the automatic analysis of audio recordings of children with ASD. First, we demonstrate that an environmental feature representation in the i-vector space can be used to improve diarization of the audio recordings. Next, we address the problem of diarizing audio recordings of infants and toddlers: we use a time-delay neural network (TDNN) architecture for diarization of older children and propose a fine-tuning mechanism to improve accuracy on recordings from infants and toddlers. Finally, we evaluate several vocalization metrics that can aid clinicians in their diagnosis. One metric of particular interest to clinicians is the child's response rate to questions from parents. We build an interrogative utterance detector that features a stack of convolutional neural network (CNN) layers with a self-attention mechanism; with this architecture, we identify question segments from parents and subsequently analyze response rates. Other vocalization metrics evaluated here are conversational turns, child utterance frequency and duration, and adult question rates.
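
The abstract describes an interrogative utterance detector built from a stack of CNN layers with a self-attention mechanism. The sketch below is a minimal illustration of that general idea, not the dissertation's actual architecture: the feature dimension, channel widths, number of attention heads, and pooling strategy are all assumptions made for the example.

```python
# Illustrative sketch only (assumed hyperparameters, not the dissertation's model):
# a stack of 1-D convolutions over frame-level acoustic features, followed by
# self-attention and a binary classifier that scores whether an utterance is a question.
import torch
import torch.nn as nn


class QuestionDetector(nn.Module):
    def __init__(self, n_feats=40, n_channels=128, n_heads=4):
        super().__init__()
        # CNN stack: captures local spectro-temporal patterns in the input features
        self.cnn = nn.Sequential(
            nn.Conv1d(n_feats, n_channels, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(n_channels, n_channels, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # Self-attention over the frame sequence ([batch, time, channels])
        self.attn = nn.MultiheadAttention(n_channels, n_heads, batch_first=True)
        # Utterance-level classifier: question vs. non-question
        self.classifier = nn.Linear(n_channels, 1)

    def forward(self, feats):
        # feats: [batch, time, n_feats] acoustic features (e.g. log-mel filterbanks)
        x = self.cnn(feats.transpose(1, 2)).transpose(1, 2)  # [B, T, C]
        x, _ = self.attn(x, x, x)                            # self-attention over frames
        x = x.mean(dim=1)                                    # pool over time
        return torch.sigmoid(self.classifier(x))             # P(question)


# Example: score a batch of 3 utterances, each 200 frames of 40-dim features
detector = QuestionDetector()
scores = detector(torch.randn(3, 200, 40))
print(scores.shape)  # torch.Size([3, 1])
```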
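
The vocalization metrics mentioned at the end of the abstract (conversational turns, child utterance frequency and duration) can be computed from diarized segments. The following is a hypothetical post-processing sketch under assumed conventions: speaker labels "CHI" (child) and "ADU" (adult), a simple (start, end, speaker) segment format, and a 5-second gap threshold for counting a conversational turn.

```python
# Hypothetical sketch: compute simple vocalization metrics from diarization output.
# Segment format, speaker labels, and the turn-gap threshold are assumptions.

def vocalization_metrics(segments, turn_window=5.0):
    """segments: list of (start_sec, end_sec, speaker) tuples, sorted by start time."""
    child_utts = [(s, e) for s, e, spk in segments if spk == "CHI"]
    child_count = len(child_utts)
    child_duration = sum(e - s for s, e in child_utts)

    # Count a conversational turn when adjacent segments come from different
    # speakers and the gap between them is at most turn_window seconds.
    turns = 0
    for (s1, e1, spk1), (s2, e2, spk2) in zip(segments, segments[1:]):
        if spk1 != spk2 and 0.0 <= s2 - e1 <= turn_window:
            turns += 1

    return {
        "child_utterances": child_count,
        "child_speech_seconds": child_duration,
        "conversational_turns": turns,
    }


# Example usage on a toy diarization result
segs = [(0.0, 1.2, "ADU"), (1.8, 2.5, "CHI"), (3.0, 4.1, "ADU")]
print(vocalization_metrics(segs))
```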