Classification and clustering analysis of patient length-of-stay (LOS) in a US hospital

Duncan Wang

McGill University

The effective management of patient hospital stays is one of the most challenging and most important priorities of modern healthcare systems. In this analysis, data extracted from the MIMIC database is used to build two models that together provide predictive and exploratory insights into patient hospital stays. The first is a machine learning classification model that predicts the categorical length of a patient’s hospital stay from the patient’s observable characteristics at the time of admission. The second uses unsupervised learning to cluster patients by the number of various patient-caretaker interactions (such as procedures, inputs taken, and drugs prescribed), which can quantify the human or physical resources a patient uses during their stay. LOS is divided into three classes with a comparable number of observations in each: short stays (0–5 days), medium stays (6–10 days), and long stays (greater than 10 days). Three classification models are compared for predicting the length of a patient’s stay: Multinomial Logistic Regression (MLR), Random Forest (RF), and Gradient Boosting Machine (GBM). GBM marginally outperformed the other two models. To complement the classification task, a K-Means clustering model is used to explore the human or physical resources used on patients during their stay.
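The clustering step described above can be illustrated with a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. This is not the author's implementation: the function names `kmeans` and `sq_dist`, the three count features, and the toy values are assumptions made for illustration. A real analysis would typically scale the features first and use a library implementation.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Cluster `points` (tuples of numbers) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:  # assignments have stopped changing
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical per-patient counts: (procedures, inputs taken, drugs prescribed).
# These values are invented for illustration, not taken from MIMIC.
patients = [
    (1, 2, 1), (2, 1, 2), (1, 1, 1),          # low resource use
    (40, 35, 50), (42, 30, 55), (38, 33, 48),  # high resource use
]
labels, centroids = kmeans(patients, k=2)
```

On this toy input the two groups are well separated, so the low-use and high-use patients end up in different clusters; which cluster receives which label index depends on the random initialization.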

Input Variables : Gender, age, admission type, location, diagnosis, insurance
Output Variables : Length of stay (Short stays: 0–5 days, Medium stays: 6–10 days, Long stays: greater than 10 days)
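As a small illustration of the output variable, the three LOS classes can be expressed as a binning function. This is a sketch under assumptions: the name `los_class` and the label strings are hypothetical, and because the stated ranges are in whole days, the treatment of fractional values (for example, 5.5 days counted as medium) is an assumption rather than something the source specifies.

```python
def los_class(days: float) -> str:
    """Map a length of stay in days to 'short', 'medium', or 'long'.

    Boundaries follow the stated classes: 0-5, 6-10, and more than 10 days.
    Assumption: fractional stays fall into the next class above the boundary.
    """
    if days < 0:
        raise ValueError("length of stay cannot be negative")
    if days <= 5:
        return "short"
    if days <= 10:
        return "medium"
    return "long"


# Boundary values for each class.
classes = [los_class(d) for d in (0, 5, 6, 10, 11)]
```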

Metrics to Monitor

Statistical	:	Somers' D | Accuracy | Precision and Recall | Confusion Matrix | F1 Score | ROC and AUC | Prevalence | Detection Rate | Balanced Accuracy | Cohen's Kappa | Concordance | Gini Coefficient | KS Statistic | Youden's J Index
Business	:	Bed Occupancy Rate | Medical Equipment Utilization | Optimized Hospital Resource Utilization
Infrastructure	:	Log Bytes | Logging/User/IAMPolicy | Logging/User/VPN | CPU Utilization | Memory Usage | Error Count | Prediction Count | Prediction Latencies | Private Endpoint Prediction Latencies | Private Endpoint Response Count
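Several of the statistical metrics listed above follow directly from a confusion matrix. The sketch below computes accuracy, balanced accuracy (mean per-class recall), and Cohen's kappa for a three-class problem. The function name `summarize` and the matrix values are invented for illustration and are not results from this model.

```python
def summarize(cm):
    """Compute accuracy, balanced accuracy, and Cohen's kappa.

    `cm[i][j]` is the number of cases with true class i predicted as class j.
    Assumes every class has at least one true example.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_totals = [sum(row) for row in cm]  # true counts per class
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts

    accuracy = sum(cm[i][i] for i in range(k)) / n
    recalls = [cm[i][i] / row_totals[i] for i in range(k)]
    balanced_accuracy = sum(recalls) / k

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Hypothetical confusion matrix; rows and columns ordered short, medium, long.
example_cm = [
    [50, 30, 20],
    [30, 40, 30],
    [20, 30, 50],
]
metrics = summarize(example_cm)
```

With balanced classes, as in the LOS setup described earlier, accuracy and balanced accuracy coincide; kappa additionally discounts the agreement a random classifier with the same marginals would achieve.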

Visit Model : github.com

Additional links : towardsdatascience.com

Model Category	:	Public
Date Published	:	March, 2021
Healthcare Domain	:	Payer Provider
Code	:	github.com
