Audio Metric Learning by Using Siamese Autoencoders for One Shot Human Fall Detection

Abstract:

In recent years, several supervised and unsupervised approaches to fall detection have been presented in the literature. These are generally based on a corpus of examples of human falls, which are, however, hard to collect. For this reason, fall detection algorithms should be designed to extract as much information as possible from the few available data related to the type of events to be detected. The one-shot learning paradigm naturally matches these constraints, and it inspired the novel Siamese Neural Network (SNN) architecture for human fall detection proposed in this contribution. Acoustic data are employed as input, and the twin convolutional autoencoders composing the SNN are trained to perform metric learning in the audio domain and thus extract robust features for the final classification stage. A large acoustic dataset was recorded in three real rooms with different floor types, with human falls performed by four volunteers, and then adopted for the experiments. The obtained results show that the proposed approach, which relies on only two real human fall events in the training phase, achieves an F1-Measure of 93.58% during testing, remarkably outperforming the recent supervised and unsupervised state-of-the-art techniques selected for comparison.
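To make the architecture concrete, the following is a minimal sketch (in PyTorch) of a Siamese pair of weight-shared convolutional autoencoders for audio metric learning. All layer sizes, the 64x64 spectrogram input, the contrastive loss, and the margin and weighting values are illustrative assumptions, not the configuration used in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvAutoencoder(nn.Module):
    """One branch: encodes a spectrogram to an embedding and reconstructs it."""
    def __init__(self, emb_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64x64 -> 32x32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32x32 -> 16x16
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, emb_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(emb_dim, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def siamese_loss(net, x1, x2, same, margin=1.0, alpha=0.5):
    """Contrastive metric loss on the embeddings plus a reconstruction term.
    `same` is 1 for pairs of the same class (e.g. two falls), else 0."""
    z1, r1 = net(x1)   # the twins share weights: one module, two forward passes
    z2, r2 = net(x2)
    d = F.pairwise_distance(z1, z2)
    contrastive = (same * d.pow(2) +
                   (1 - same) * F.relu(margin - d).pow(2)).mean()
    recon = F.mse_loss(r1, x1) + F.mse_loss(r2, x2)
    return contrastive + alpha * recon

# Toy usage: a batch of 8 spectrogram pairs with random same/different labels.
net = ConvAutoencoder()
x1, x2 = torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64)
same = torch.randint(0, 2, (8,)).float()
loss = siamese_loss(net, x1, x2, same)
loss.backward()

The key design point is that both inputs pass through the same module, so the learned distance is symmetric; pairing a contrastive loss with reconstruction terms is one standard way to combine metric learning with autoencoder training, and stands in here for whatever objective the authors actually optimize.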