Abstract:
Detecting voice changes in patients with Parkinson's disease (PD) could enable early diagnosis and intervention before the onset of disabling physical symptoms. This study explores static and dynamic speech features for PD detection. A comparative analysis of articulation transition characteristics shows that the number of articulation transitions and the trend of the fundamental frequency curve differ significantly between healthy control (HC) speakers and PD patients. Motivated by this observation, we propose applying a bidirectional long short-term memory (LSTM) model to capture the time-series dynamic features of a speech signal for PD detection. The dynamic speech features are obtained by computing the energy content in the transition from unvoiced to voiced segments (onset) and in the transition from voiced to unvoiced segments (offset). Under two evaluation methods, 10-fold cross-validation (CV) and a dataset split in which no individual's samples appear in more than one subset, the experimental results show that the proposed method markedly improves PD detection accuracy over traditional machine learning models that use static features.
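The onset and offset measurement described above can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes that a per-frame voicing decision is already available as a list of booleans (for example, from a pitch tracker), and it uses whole-frame log energy as a stand-in for whatever energy representation the method actually computes. The function names and the `context` parameter are hypothetical.

```python
import math


def frame_log_energy(signal, frame_len, hop):
    """Return the log energy of each full frame of `signal`."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame)
        # Small constant avoids log(0) on silent frames.
        energies.append(math.log(energy + 1e-10))
    return energies


def find_transitions(voiced):
    """Return (onsets, offsets) as lists of frame indices.

    An onset is the first voiced frame after an unvoiced one;
    an offset is the first unvoiced frame after a voiced one.
    """
    onsets, offsets = [], []
    for i in range(1, len(voiced)):
        if voiced[i] and not voiced[i - 1]:
            onsets.append(i)
        elif voiced[i - 1] and not voiced[i]:
            offsets.append(i)
    return onsets, offsets


def transition_energy(energies, voiced, context=2):
    """Collect the energy values in a window around each transition.

    `context` (assumed, illustrative) is the number of frames kept on
    each side of the transition boundary. Windows are clipped at the
    ends of the recording.
    """
    onsets, offsets = find_transitions(voiced)

    def window(i):
        lo = max(0, i - context)
        hi = min(len(energies), i + context)
        return energies[lo:hi]

    onset_seqs = [window(i) for i in onsets]
    offset_seqs = [window(i) for i in offsets]
    return onset_seqs, offset_seqs
```

Each returned window is a short time series around one transition. Sequences of this kind are the sort of input a bidirectional LSTM could consume, whereas a static-feature model would typically reduce them to summary statistics first.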