For most diseases, building large databases of labeled genetic data is expensive and time-consuming. Genetic Generative Adversarial Networks (gGAN) is a semi-supervised approach, based on an innovative GAN architecture, that creates large synthetic genetic data sets starting from a small amount of labeled data and a large amount of unlabeled data. Although dengue is used to demonstrate the method, it is entirely general and can be applied to other diseases. The goal is to determine the propensity of a new individual to develop the severe form of the illness from their genetic profile alone. The model is self-aware: it can determine whether a new genetic profile is sufficiently compatible with the data on which the network was trained and is thus suitable for prediction.
Input variables :
1. Unlabeled data: genetic data from Phase 3 of the 1000 Genomes Project, which consists of 2504 individuals genotyped for more than 84 million variants
2. Labeled data: genotypes measured at 322 polymorphic loci for the Dengue Infection phenotype in human subjects
Output variables :
1. Unlabeled output: whether the genetic profile is real or synthetic
2. Labeled output: whether the individual with the corresponding genetic profile is likely to develop Severe Dengue
Statistical | : | Somers' D | Accuracy | Precision and Recall | Confusion Matrix | F1 Score | ROC and AUC | Prevalence | Detection Rate | Balanced Accuracy | Cohen's Kappa | Concordance | Gini Coefficient | KS Statistic | Youden's J Index |
Business | : | Population at High Risk of Disease | Risk by Geography | Risk by Demographics | Risk by Clinical Parameters | Optimized Hospital Resource Utilization | Decreased Cost of Care | Decreased Patient Visits |
Infrastructure | : | Log Bytes | Logging/User/IAMPolicy | Logging/User/VPN | CPU Utilization | Memory Usage | Error Count | Prediction Count | Prediction Latencies | Private Endpoint Prediction Latencies | Private Endpoint Response Count |
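Several of the statistical metrics listed above can be derived from a single binary confusion matrix (severe dengue vs. not severe). The following is a minimal sketch using the standard textbook definitions, not the model authors' evaluation code. The `Confusion` class, the `summary_metrics` function, and the example counts are hypothetical, introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary confusion matrix (hypothetical helper type)."""
    tp: int  # severe cases predicted as severe
    fp: int  # non-severe cases predicted as severe
    fn: int  # severe cases predicted as non-severe
    tn: int  # non-severe cases predicted as non-severe


def summary_metrics(c: Confusion) -> dict:
    """Compute common metrics; assumes every denominator is non-zero."""
    n = c.tp + c.fp + c.fn + c.tn
    recall = c.tp / (c.tp + c.fn)        # sensitivity
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / n

    # Agreement expected by chance, used by Cohen's kappa.
    p_pos = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_neg = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    p_chance = p_pos + p_neg

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "prevalence": (c.tp + c.fn) / n,
        "detection_rate": c.tp / n,
        "balanced_accuracy": (recall + specificity) / 2,
        "cohens_kappa": (accuracy - p_chance) / (1 - p_chance),
        "youdens_j": recall + specificity - 1,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only (not results from the model).
    example = Confusion(tp=40, fp=10, fn=20, tn=30)
    for name, value in summary_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Rank-based metrics in the list (ROC and AUC, Somers' D, Gini Coefficient, KS Statistic, Concordance) need the predicted scores rather than thresholded labels, so they are outside the scope of this sketch.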
Visit Model : github.com
Additional links : arxiv.org
Model Category | : | Public |
Date Published | : | July, 2020 |
Healthcare Domain | : | Life Sciences |
Code | : | github.com |
Health Risk Management |
Health Risk Prediction |