Depression Detection in Speech
Megha Ghosh | Sukesh Shenoy
Given a clinical audio interview designed to support the diagnosis of psychological distress conditions such as anxiety, depression, and post-traumatic stress disorder, a Convolutional Neural Network (CNN) learns characteristics of depression from speech and uses them to classify each respondent as depressed or non-depressed. The audio segments are converted to spectrogram images, and the images are converted to TensorFlow tensors. The normalized images are then fed into the CNN to detect whether the person is depressed.
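The preprocessing described above (audio segment, then spectrogram image, then normalized array) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the frame length, hop size, dB scaling, min-max normalization, and the synthetic tone standing in for an interview segment are all assumptions, as are the helper names `log_spectrogram` and `normalize_image`. The resulting array could then be handed to TensorFlow, for example with `tf.convert_to_tensor`, which is left out here to keep the sketch dependent on NumPy only.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram in dB from a short-time Fourier transform.

    frame_len and hop are illustrative values, not taken from the source.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)
    # Small offset avoids log(0); transpose to (freq_bins, n_frames).
    return 20.0 * np.log10(magnitude + 1e-10).T

def normalize_image(spec):
    """Min-max scale to [0, 1] and add a trailing channel axis for a CNN."""
    lo, hi = spec.min(), spec.max()
    scaled = (spec - lo) / (hi - lo + 1e-12)
    return scaled[..., np.newaxis].astype(np.float32)

# Hypothetical input: a 1-second 440 Hz tone at 16 kHz in place of real speech.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
segment = np.sin(2 * np.pi * 440.0 * t)

image = normalize_image(log_spectrogram(segment))
print(image.shape, image.dtype)
```

The trailing channel axis matches the height × width × channels layout that image-based CNN layers commonly expect; a real pipeline would likely also fix the segment duration so that every image has the same width.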
Input Variables: Features extracted from spectrograms of the audio interviews
Output Variables: Depression status (Yes/No)
Metrics to Monitor
Statistical: Somers' D | Accuracy | Precision and Recall | Confusion Matrix | F1 Score | ROC and AUC | Prevalence | Detection Rate | Balanced Accuracy | Cohen's Kappa | Concordance | Gini Coefficient | KS Statistic | Youden's J Index
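As a rough illustration of how the statistical metrics listed above relate to one another for a binary depressed/non-depressed classifier, the sketch below computes them in plain Python. The labels, scores, and 0.5 decision threshold are made-up example values, and the function names are hypothetical. Detection rate is taken here as TP / N and prevalence as (TP + FN) / N, which is one common convention and an assumption about what the monitoring system reports.

```python
def threshold_metrics(y_true, y_pred):
    """Metrics derived from the 2x2 confusion matrix (1 = depressed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    n = tp + fp + fn + tn
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "prevalence": (tp + fn) / n,
        "detection_rate": tp / n,
        "balanced_accuracy": (recall + specificity) / 2,
        "cohens_kappa": (accuracy - expected) / (1 - expected),
        "youdens_j": recall + specificity - 1,
    }

def ranking_metrics(y_true, scores):
    """Metrics based on how well scores rank positives above negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    n_pairs = len(pos) * len(neg)
    concordant = sum(1 for p in pos for q in neg if p > q)
    discordant = sum(1 for p in pos for q in neg if p < q)
    ties = n_pairs - concordant - discordant
    auc = (concordant + 0.5 * ties) / n_pairs
    # KS: largest gap between the score CDFs of the two classes.
    ks = max(
        abs(sum(1 for s in pos if s <= c) / len(pos)
            - sum(1 for s in neg if s <= c) / len(neg))
        for c in set(scores)
    )
    return {
        "concordance": concordant / n_pairs,
        "auc": auc,
        "somers_d": (concordant - discordant) / n_pairs,
        "gini": 2 * auc - 1,
        "ks_statistic": ks,
    }

# Made-up example data: true labels and model scores for eight respondents.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

at_threshold = threshold_metrics(y_true, y_pred)
ranking = ranking_metrics(y_true, scores)
print(at_threshold)
print(ranking)
```

For a binary outcome, Somers' D and the Gini coefficient both equal 2 × AUC − 1, so they carry the same information as the ROC curve's area. The confusion-matrix metrics depend on the chosen threshold, while the ranking metrics do not.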
Infrastructure: Log Bytes | Logging/User/IAMPolicy | Logging/User/VPN | CPU Utilization | Memory Usage | Error Count | Prediction Count | Prediction Latencies | Private Endpoint Prediction Latencies | Private Endpoint Response Count
Visit Model: github.com