@Article{info:doi/10.2196/66665, author="Mardini, Mamoun T and Khalil, Georges E and Bai, Chen and DivaKaran, Aparna Menon and Ray, Jessica M", title="Identifying Adolescent Depression and Anxiety Through Real-World Data and Social Determinants of Health: Machine Learning Model Development and Validation", journal="JMIR Ment Health", year="2025", month="Feb", day="12", volume="12", pages="e66665", keywords="social determinants of health; adolescents; anxiety; depression; machine learning; real-world data; teenagers; youth; XGBoost; cross-validation technique; SHapley Additive exPlanation; mental health; mental disorder; mental illness; health outcomes; clinical data", abstract="Background: The prevalence of adolescent mental health conditions such as depression and anxiety has significantly increased. Despite the potential of machine learning (ML), there is a shortage of models that use real-world data (RWD) to enhance early detection and intervention for these conditions. Objective: This study aimed to identify depression and anxiety in adolescents using ML techniques on RWD and social determinants of health (SDoH). Methods: We analyzed RWD of adolescents aged 10-17 years, considering various factors such as demographics, prior diagnoses, prescribed medications, medical procedures, and laboratory measurements recorded before the onset of anxiety or depression. Clinical data were linked with SDoH at the block level. Three separate models were developed to predict anxiety, depression, and both conditions. Our ML model of choice was Extreme Gradient Boosting (XGBoost) and we evaluated its performance using the nested cross-validation technique. To interpret the model predictions, we used the Shapley additive explanation method. Results: Our cohort included 52,054 adolescents, identifying 12,572 with anxiety, 7812 with depression, and 14,019 with either condition. 
The models achieved area under the curve values of 0.80 for anxiety, 0.81 for depression, and 0.78 for both combined. Excluding SDoH data had a minimal impact on model performance. Shapley additive explanation analysis identified gender, race, educational attainment, and various medical factors as key predictors of anxiety and depression. Conclusions: This study highlights the potential of ML in early identification of depression and anxiety in adolescents using RWD. By leveraging RWD, health care providers may more precisely identify at-risk adolescents and intervene earlier, potentially leading to improved mental health outcomes.", issn="2368-7959", doi="10.2196/66665", url="https://mental.jmir.org/2025/1/e66665" }