TY  - JOUR
AU  - Mardini, Mamoun T
AU  - Khalil, Georges E
AU  - Bai, Chen
AU  - DivaKaran, Aparna Menon
AU  - Ray, Jessica M
PY  - 2025
DA  - 2025/2/12
TI  - Identifying Adolescent Depression and Anxiety Through Real-World Data and Social Determinants of Health: Machine Learning Model Development and Validation
JO  - JMIR Ment Health
SP  - e66665
VL  - 12
KW  - social determinants of health
KW  - adolescents
KW  - anxiety
KW  - depression
KW  - machine learning
KW  - real-world data
KW  - teenagers
KW  - youth
KW  - XGBoost
KW  - cross-validation technique
KW  - SHapley Additive exPlanation
KW  - mental health
KW  - mental disorder
KW  - mental illness
KW  - health outcomes
KW  - clinical data
AB  - Background: The prevalence of adolescent mental health conditions such as depression and anxiety has significantly increased. Despite the potential of machine learning (ML), there is a shortage of models that use real-world data (RWD) to enhance early detection and intervention for these conditions. Objective: This study aimed to identify depression and anxiety in adolescents using ML techniques on RWD and social determinants of health (SDoH). Methods: We analyzed RWD of adolescents aged 10‐17 years, considering various factors such as demographics, prior diagnoses, prescribed medications, medical procedures, and laboratory measurements recorded before the onset of anxiety or depression. Clinical data were linked with SDoH at the block-level. Three separate models were developed to predict anxiety, depression, and both conditions. Our ML model of choice was Extreme Gradient Boosting (XGBoost) and we evaluated its performance using the nested cross-validation technique. To interpret the model predictions, we used the Shapley additive explanation method. Results: Our cohort included 52,054 adolescents, identifying 12,572 with anxiety, 7812 with depression, and 14,019 with either condition. The models achieved area under the curve values of 0.80 for anxiety, 0.81 for depression, and 0.78 for both combined. Excluding SDoH data had a minimal impact on model performance. Shapley additive explanation analysis identified gender, race, educational attainment, and various medical factors as key predictors of anxiety and depression. Conclusions: This study highlights the potential of ML in early identification of depression and anxiety in adolescents using RWD. By leveraging RWD, health care providers may more precisely identify at-risk adolescents and intervene earlier, potentially leading to improved mental health outcomes.
SN  - 2368-7959
UR  - https://mental.jmir.org/2025/1/e66665
UR  - https://doi.org/10.2196/66665
DO  - 10.2196/66665
ID  - info:doi/10.2196/66665
ER  -