Keywords: Exploratory Data Analysis (EDA), Logistic Regression, Random Forest, Gradient Boosted Decision Trees, Predictive Modeling, Feature Importance, Python
Summary: This project focuses on predicting mental health outcomes, specifically depressive disorders, using the Behavioral Risk Factor Surveillance System (BRFSS) dataset. With over 440,000 records and 2,000+ variables, the data includes health-related, demographic, and disability-related features.
The methodology involves exploratory data analysis (EDA) to uncover trends and relationships, followed by predictive modeling using Logistic Regression, Random Forest, and Gradient Boosted Decision Trees. Feature importance analysis highlights key predictors like difficulty concentrating, physical health, and education level.
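As a rough illustration of the modeling step described above, the sketch below fits the three model families with scikit-learn and ranks features by random forest importance. It is not the project's actual pipeline: the data is a synthetic, imbalanced stand-in generated with make_classification rather than BRFSS, and the hyperparameters, the 75/25 split, and the choice of ROC AUC as the metric are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the survey features (assumption: not BRFSS data).
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=4,
    weights=[0.8, 0.2],  # roughly 20% positive class
    random_state=0,
)

# Stratify so the train and test sets keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters here are illustrative defaults, not tuned values.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model and score it with ROC AUC, which is less sensitive to
# class imbalance than plain accuracy.
auc_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    positive_proba = model.predict_proba(X_test)[:, 1]
    auc_scores[name] = roc_auc_score(y_test, positive_proba)

# Impurity-based importances from the random forest, highest first.
importances = models["random_forest"].feature_importances_
ranked_features = sorted(
    enumerate(importances), key=lambda pair: pair[1], reverse=True
)

for name, score in auc_scores.items():
    print(f"{name}: ROC AUC = {score:.3f}")
print("Top features (index, importance):", ranked_features[:3])
```

On real survey data, the feature indices would map back to named columns, which is how predictors such as difficulty concentrating or education level would surface in the ranking.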
All three models performed similarly, and their results were constrained by class imbalance in the dataset and by inherent limitations of the data. The findings underscore the value of early detection, resource allocation, and targeted public health interventions. Future work may explore bias mitigation and the prediction of depression severity.
Project Pitch