Feasibility, Acceptability, and Preliminary Efficacy of a Smartphone App–Led Cognitive Behavioral Therapy for Depression Under Therapist Supervision: Open Trial

Background: Major depressive disorder affects approximately 1 in 5 adults during their lifetime and is the leading cause of disability worldwide. Yet only a minority receive adequate treatment due to person-level (eg, geographical distance to providers) and systems-level (eg, shortage of trained providers) barriers. Digital tools could narrow this treatment gap by reducing the time and frequency of therapy sessions needed for effective treatment through the provision of flexible, automated support. Objective: This study aimed to examine the feasibility, acceptability, and preliminary clinical effect of Mindset for Depression, a deployment-ready 8-week smartphone-based cognitive behavioral therapy (CBT) supported by brief teletherapy appointments with a therapist. Methods: This 8-week, single-arm open trial tested the Mindset for Depression app when combined with 8 brief (16-25 minutes) video conferencing visits with a licensed doctoral-level CBT therapist (n=28 participants). The app offers flexible, accessible psychoeducation, CBT skills practice, and support to patients as well as clinician guidance to promote sustained engagement, monitor safety, and tailor treatment to individual patient needs. To increase accessibility and thus generalizability, all study procedures were conducted remotely. Feasibility and acceptability were assessed via attrition, patient expectations and feedback, and treatment utilization. The primary clinical outcome measure was the clinician-rated Hamilton Depression Rating Scale, administered at pretreatment, midpoint, and posttreatment.


Introduction
Major depressive disorder (MDD), characterized by hallmark symptoms of persistent depressed mood and loss of interest in activities [1], is highly prevalent. In 2020, an estimated 21 million adults in the United States were affected (8.4% population prevalence) [2]. The rates of elevated depressive symptoms have continued to rise since the COVID-19 pandemic, now affecting nearly 1 in 3 adults [3]. Depression is the leading cause of disability worldwide [4] and is associated with economic costs exceeding US $326.2 billion in the United States alone [5]. Despite the substantial personal and societal impact of MDD, only a minority of individuals meeting diagnostic criteria (let alone those at risk or with subthreshold symptoms) receive care; even fewer receive minimally adequate treatment (estimates range from 3% in low- to middle-income countries to 23% in high-income countries [6]), such as cognitive behavioral therapy (CBT), the most widely studied and recommended psychotherapy [7]. At a systems level, the limited availability of trained clinicians is a substantial contributor to low treatment utilization [8]. There simply are not and will not be enough clinicians to meet current demands for mental health care. Additionally, many people do not seek treatment for depression due to obstacles such as geographic distance from care providers, high costs, and stigma [8,9]. Given the prevalence of MDD and the substantial treatment gaps that exist, there is a clear need for low-barrier, more widely accessible, effective treatments for MDD.
Technology could bridge such gaps. Smartphone apps or other digital tools could reduce the time and frequency of sessions by supplementing clinician effort with automated, validated support that can be used flexibly between sessions [10,11]. However, standalone apps are not sufficient or appealing for many patients [12,13]. The majority of apps, even those that are grounded in empirically supported treatments, have high dropout rates, which limits their effectiveness [14-16]. The absence of concurrent human support is often cited as the major reason for nonadherence or nonengagement [13,17-19]. Some engagement is likely a minimum requirement for an app-based therapy to be effective; guidance from a trained provider should further mitigate issues of comprehension, personalization, problem-solving, and interference from comorbid or life concerns [20]. Equivalent effects of face-to-face CBT and internet-delivered CBT for depression have been found for treatments that are therapist guided, meaning patients are in contact with a therapist throughout treatment (eg, weekly sessions, check-in phone calls, asynchronous messaging, and weekly feedback emails) [21-23]. Moreover, many users simply want access to a therapist and are less willing to engage in self-directed digital treatments [24,25]. Thus, a digital service that combines mobile-based CBT with brief remote individual sessions with a clinician (ie, teletherapy monitoring) has the potential to greatly enhance the scalability of high-quality app-based treatment, particularly for moderately and severely ill patients, while reducing clinician burden and cost [26-28].
The purpose of this study was to conduct an open trial to test the feasibility, acceptability, and efficacy of the Mindset for Depression app (a novel, smartphone-based CBT program) with brief video-conferencing appointments with a therapist. We hypothesized that the treatment would be feasible and acceptable. We also hypothesized that treatment would yield statistically significant reductions in depression symptom severity (primary clinical outcome) as well as improvements in functioning and quality of life (secondary clinical outcomes) from baseline to posttreatment (week 8). The treatment was tested for patients with moderate to severe depression: those who would typically be referred for one-on-one outpatient therapy [29].

Study Design
This open trial tested the Mindset for Depression app when combined with brief (16-25 minutes) video-conferencing visits with a CBT therapist over 8 weeks. The primary outcomes were feasibility, acceptability, and preliminary efficacy, as measured by change in depression symptom severity. To increase accessibility and thus generalizability, all study procedures were conducted remotely.

Ethical Considerations
The study was approved by the institutional review board of Massachusetts General Hospital (2020P001958). All participants provided informed consent prior to the initiation of study procedures and were given the ability to opt out at any point. Data were deidentified to protect participants' privacy. Participants were compensated US $25 at each of the mid-treatment, end-of-treatment, and 3-month follow-up assessments.

Participants
Eligible participants, recruited between May 2022 and February 2023, were at least 18 years old, living in Massachusetts, presenting with a current primary Diagnostic and Statistical Manual of Mental Disorders: 5th Edition (DSM-5) diagnosis of MDD, and experiencing at least moderately severe symptoms (Patient Health Questionnaire-9 [PHQ-9] score ≥10). Participants taking psychotropic medication were on a stable dose for at least 2 months prior to enrollment and were asked to remain on the same stable dose throughout the study period.

Exclusion criteria included 4 or more prior sessions of CBT for depression (assessed via self-report and interview with an independent evaluator), current severe substance use disorder, lifetime bipolar disorder or psychosis, acute and active suicidal ideation as indicated by clinical judgment, a score ≥ 2 on the past month suicidal ideation subscale of the Columbia-Suicide Severity Rating Scale [30], concurrent psychological treatment, and inability to engage with treatment (eg, did not own a supported smartphone).

Treatment
The Mindset for Depression app provides key CBT-derived content for adults with MDD and was designed to be used in conjunction with a therapist over 8 weeks. The duration of CBT trials typically ranges from 6 to 20 sessions [31]. Mindset for Depression was built in collaboration between researchers at the Massachusetts General Hospital and Koa Health. The app-based format allows participants to review CBT content and accompanying skills practice exercises at their convenience and own pace and with support from their therapist.
The app and clinician dashboard were developed through collaborative, user-centered design, integrating perspectives from clinicians (MDs and psychologists with expertise in MDD and CBT), digital health researchers, patients with MDD and experience with CBT and other therapies, engineers, and designers. Through this approach, the product being tested was deployment ready (eg, built on a commercial platform, able to be quickly scaled and professionally maintained to minimize technical difficulties, and ensure compliance with up-to-date privacy and security standards) and therefore well positioned to succeed outside of research studies [32].

CBT Modules
The app delivers content in 8 steps, corresponding to the 8 weeks of treatment. A summary of these steps is presented in Table 1. Participants were also allowed access to the app during the 3-month follow-up period. Core CBT skills covered across treatment include psychoeducation, cognitive restructuring and core beliefs, behavioral activation, mindfulness, and relapse prevention [33,34]. Step 1 comprises psychoeducation about MDD and the CBT model and background and skills practice for identifying and restructuring "thinking traps," or maladaptive automatic thoughts [35,36]. Step 2 focuses on the short- and long-term impact of withdrawal and avoidance on mood and provides a structure for recording daily activities and monitoring associated moods. Step 3 introduces behavioral activation and scheduling and provides guidance for identifying valued or new activities and setting specific, measurable, achievable, relevant, and time-bound (SMART) goals [37,38]. Activity scheduling and monitoring (ie, recording completed activities and associated mood ratings), with an emphasis on personal values and meaning, continue for the remainder of treatment. Step 4 introduces mindfulness (present-focused and nonjudgmental awareness) and offers a guided mindful breathing audio exercise [39]. Steps 5 and 6 provide users with additional mindfulness approaches, including grounding and letting go of unhelpful thoughts. Step 7 builds on prior cognitive skills and delves into the definition of core beliefs, their relationship to automatic thoughts and feelings, and strategies to identify and challenge them (eg, downward arrow technique and building self-esteem).
Step 8 concludes with relapse prevention by helping users consolidate treatment skills, anticipate future challenges, and plan for continued practice and flexible use of skills. Example screenshots from the smartphone app are included in Figure 1.

Therapists
Each participant was matched with a licensed doctoral-level therapist. Therapists were trained in and actively practicing CBT for MDD and provided with study-specific training in using the Mindset app and therapist dashboard prior to beginning the trial. To ensure proficiency, therapists were required to complete the Massachusetts General Hospital Psychiatry Academy CBT training course and pass (>90% correct) both the corresponding CBT knowledge test and an MDD knowledge test. Weekly supervision from the principal investigator (expert in CBT) was also provided. To ensure ongoing high-quality treatment, including that implementation fidelity targets were met and non-CBT techniques were absent, sessions were audio recorded, and 40 of the 224 planned sessions (18.9% of the 212 sessions ultimately conducted) were randomly selected and rated for competency and treatment adherence by an independent rater. Adherence raters were experienced in CBT for MDD and further trained and supervised. Core elements of each treatment session (5-6 items) were rated for adherence on a 7-point scale (1=not at all to 7=completely adherent) and then a global rating of adherence was assigned. The full adherence scale is included in Multimedia Appendix 1. Competence was rated on 12 aspects (32 items) of CBT for MDD (eg, positive outlook, knowledge, clear communication, empathy, flexibility, and empowering the patient). Each item was scored on a 5-point scale (1=not at all to 5=completely competent) and then a global rating of competence was assigned. Overall, adherence and competence were high, with 100% of all rated sessions evaluated as "completely" adherent and 100% of all rated sessions evaluated as "mostly" or "completely" competent.
Therapists offered each patient 8 video-conferencing appointments (16-25 minutes; via HIPAA [Health Insurance Portability and Accountability Act]-compliant video conference) to be conducted weekly. This duration of appointment corresponds to a clinician billing code (CPT-90832), helping to ensure that the reimbursement of clinician time would not become a barrier to scale-up following the research. As needed, because of a therapist's or patient's schedule, up to 2 sessions were able to be scheduled per week. Throughout the treatment, participants were able to communicate with their therapists between sessions through asynchronous in-app secure messaging. Sessions were meant to support a patient's progress through the app-led treatment. In this way, the model mimicked the "flipped classroom," a new pedagogical approach shown to improve student learning [40]. In a flipped classroom, students watch or read lectures and complete initial practice problems asynchronously, reserving valuable classroom time for active problem-solving with an instructor. As such, sessions were intended to monitor risk as needed, help participants set goals, enhance motivation, clarify and practice the skills learned via the Mindset app to best meet the patient's needs, brainstorm ideas for homework, and problem-solve treatment barriers that arose. Therapists were instructed to work within a CBT framework and not to introduce other treatment modalities. Such fidelity was monitored in weekly supervision, via therapist self-checks included within session records ("Did you use any of the following non-CBT techniques? [check all that apply]") and via adherence ratings (ie, the degree to which forbidden content was introduced). The therapist dashboard was a separate web-based portal wherein therapists could receive and respond to messages and track participant progress in the app.

Assessments
Assessments were conducted by master's or doctoral-level independent evaluators who were not involved in treatment, were complemented by participant self-report, and occurred at baseline, mid-treatment (week 4), end of treatment (week 8), and follow-up (3 months posttreatment). Evaluators completed training on all clinician-administered measures and were required to maintain high reliability (intraclass correlation coefficient >0.75) with a gold standard expert rater; 18.9% (20/106) of assessments were randomly selected and rated to prevent rater drift. Evaluators were not privy to participants' progress in treatment (eg, app content reviewed and session notes). Adverse events, life events, and changes in medication or outside treatment were surveyed at each assessment or when a patient reported them to study staff.

Baseline Diagnostic Assessment
The Mini International Neuropsychiatric Interview was used to establish eligibility and characterize the sample. It is a reliable, validated semistructured diagnostic assessment of DSM-5 psychiatric disorders [41].

Feasibility and Acceptability
Participants completed the self-reported measures as follows. The Credibility/Expectancy Questionnaire (CEQ) [42], completed at baseline and week 4, is a 6-item, self-reported Likert-type questionnaire that assesses patients' judgments about the credibility of the treatment rationale and treatment expectancy. Items on each subscale are summed for subscale total scores that can range from 3 to 27, where higher scores mean higher treatment credibility and higher outcome expectancy. We assessed the internal consistency of scales with coefficient omega (McDonald ω), given the heterogeneity of variances across scale items; coefficient ω can be interpreted in the same way as Cronbach α. The internal consistency of the credibility items in this sample ranged from ω=0.69 at baseline to ω=0.77 at week 4; for the expectancy items, internal consistency ranged from ω=0.82 at baseline to ω=0.96 at week 4. The Mobile Application Rating Scale User Version (uMARS) [43], administered at week 8, collects evaluations of mobile health apps. The 26 items assess participants' evaluations of engagement, functionality, aesthetics, information quality, subjective app quality, and perceived impact. Items are rated on differently worded 5-point Likert scales ranging from 1 (inadequate) to 5 (excellent). An overall app rating score can be calculated as the mean score of the first 4 subscales (engagement, functionality, aesthetics, and information quality; a range of 1-5), where higher scores indicate higher overall perceived app quality. In this sample, the internal consistency of the 4 subscales used in the overall mean score was ω=0.76 for engagement, ω=0.83 for functionality, ω=0.72 for aesthetics, and ω=0.62 for information quality, with an overall item consistency of ω=0.83. The Client Satisfaction Questionnaire (CSQ) [44], completed at weeks 4 and 8, is an 8-item questionnaire assessing satisfaction with clinical services received. Each item uses a 4-point Likert scale. Items are summed for a total score ranging from 8 to 32, with higher scores indicating greater satisfaction. The internal consistency of the CSQ was ω=0.93 at week 4 and ω=0.89 at week 8. Treatment use was assessed with a single question: "On average, how much time (in minutes) do you spend using the app or practicing skills from the app in total, per week?" Answers were collected as the number of minutes in integer format, where more time spent on and off the app was interpreted as greater treatment use. In addition, app use data were collected automatically based on the actions participants completed in the app. Due to technical issues, 2 participants' app use data were inadvertently not recorded. The internal consistency values for the CEQ credibility subscale at baseline and the uMARS information quality subscale fell below 0.7 and are a noted limitation.
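The scoring rules described above for the uMARS overall rating and the CSQ total can be sketched in a few lines. This is a minimal illustration only, not the study's scoring code: the function names, the dictionary keys, and the input validation are assumptions, and the example input reuses the subscale means reported later in the Results.

```python
from statistics import mean

# The four uMARS subscales that feed the overall app quality rating.
UMARS_QUALITY_SUBSCALES = ("engagement", "functionality", "aesthetics", "information")


def umars_overall(subscale_scores: dict) -> float:
    """Overall app quality: mean of the four quality subscale scores (each 1-5)."""
    values = [subscale_scores[name] for name in UMARS_QUALITY_SUBSCALES]
    if any(not 1 <= value <= 5 for value in values):
        raise ValueError("uMARS subscale scores must lie between 1 and 5")
    return mean(values)


def csq_total(items: list) -> int:
    """CSQ total: sum of 8 items, each rated 1-4 (possible range 8-32)."""
    if len(items) != 8 or any(not 1 <= item <= 4 for item in items):
        raise ValueError("CSQ expects 8 items rated 1-4")
    return sum(items)


# Subscale means as reported in the Results section of this paper.
reported_subscales = {
    "engagement": 3.6,
    "functionality": 4.5,
    "aesthetics": 4.6,
    "information": 4.6,
}
print(round(umars_overall(reported_subscales), 2))
```

Averaging the reported subscale means is only a plausibility check on the reported overall rating; the study computed the overall score per participant.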

Clinician-Administered Measures
The primary measure of MDD symptom severity was the clinician-rated Hamilton Depression Rating Scale (HAM-D) [45]. Considered a gold standard means of assessing symptom severity in patients who are depressed, it contains 21 items that are rated on a mixture of 3- and 5-point Likert scales. The first 17 items are summed for the total score, which can range from 0 to 52. Higher scores indicate greater depression severity. The internal consistency of the HAM-D in this sample ranged from ω=0.79 at baseline to ω=0.93 at the 3-month follow-up (question 17 was necessarily omitted from internal consistency calculations due to the absence of variability in responses; all participants received a score of 0 for this "insight" item ["Acknowledges being depressed and ill"] at all assessment points with the exception of 1 participant at the 3-month follow-up). To evaluate treatment response and remission, we used criteria of HAM-D score reductions of ≥50% for treatment response, HAM-D score reductions of ≥25% but <50% for partial response, and HAM-D scores ≤7 to indicate remission [46-48]. An expert rater reviewed 18.9% (20 of the 106 assessments that were ultimately completed) of HAM-D assessments. Inter-rater reliability was excellent (HAM-D: intraclass correlation coefficient [1,1]=0.91).
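The response and remission criteria above translate directly into a small classification rule. The sketch below is illustrative rather than the study's analysis code: the function name and return structure are assumptions, the "nonresponse" label for reductions under 25% is an added convenience, and the example scores are hypothetical.

```python
def classify_hamd_outcome(baseline: int, post: int) -> dict:
    """Apply the stated HAM-D criteria to one participant's scores.

    Response: >=50% reduction from baseline.
    Partial response: >=25% but <50% reduction.
    Remission: posttreatment score <=7 (evaluated independently of response).
    """
    if baseline <= 0:
        raise ValueError("baseline HAM-D must be positive to compute a percent reduction")
    reduction = (baseline - post) / baseline
    if reduction >= 0.50:
        status = "response"
    elif reduction >= 0.25:
        status = "partial response"
    else:
        status = "nonresponse"  # label assumed here; not named in the text
    return {
        "percent_reduction": round(100 * reduction, 1),
        "status": status,
        "remission": post <= 7,
    }


# Hypothetical scores, for illustration only.
print(classify_hamd_outcome(baseline=20, post=9))
```

Keeping remission as a separate flag reflects that it is defined on the absolute posttreatment score, so a participant can be a responder without being in remission.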

Self-Reported Measures
Participants completed the following secondary measures of symptoms and functioning at each assessment: (1) The Work and Social Adjustment Scale (WSAS) [49] is a 5-item, self-reported measure of impairment in occupational, social, and family domains. Items are measured on 9-point Likert scales ranging from 0 (no impairment at all) to 8 (very severe impairment). The items are summed for a total score ranging from 0 to 40, where higher scores mean higher functional impairment. The internal consistency of the WSAS ranged from ω=0.84 at baseline to ω=0.93 at the 3-month follow-up. (2) The Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (Q-LES-Q-SF) [50] is a 16-item self-reported measure of subjective quality of life. Each question is rated on a 5-point Likert scale ranging from 1 (very poor) to 5 (very good). Questions 1-14 are then summed to a total score, and the total score is reported as a percentage of the maximum possible score, such that the final percent score range is 0% to 100%; higher scores correspond to greater ratings of quality of life. The internal consistency of the Q-LES-Q-SF ranged from ω=0.84 at baseline to ω=0.90 at the 3-month follow-up. (3) The PHQ-9 [51] is a self-reported measure of the past week's depression severity. It includes 9 Likert scale items mapping onto DSM-5 symptom criteria and ranging from 0 (not at all) to 3 (every day). The internal consistency of the PHQ-9 ranged from ω=0.70 at baseline to ω=0.89 at week 8.
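The three scoring rules above can be written out as follows. This is a sketch under stated assumptions: the function names and validation are illustrative, and the Q-LES-Q-SF percentage uses the min-max rescaling (raw total minus 14, divided by 56) that is consistent with the stated 0% to 100% range, although the text does not spell out the formula.

```python
def _check(items, n_items, low, high, name):
    """Validate item count and per-item range before scoring."""
    if len(items) != n_items:
        raise ValueError(f"{name} expects {n_items} items, got {len(items)}")
    if any(not low <= item <= high for item in items):
        raise ValueError(f"{name} items must lie between {low} and {high}")


def score_wsas(items):
    """WSAS: 5 items rated 0-8, summed (range 0-40)."""
    _check(items, 5, 0, 8, "WSAS")
    return sum(items)


def score_qlesq_sf(items):
    """Q-LES-Q-SF: 16 items rated 1-5; items 1-14 are summed (raw range 14-70)
    and rescaled to a 0-100 percentage (rescaling formula assumed)."""
    _check(items, 16, 1, 5, "Q-LES-Q-SF")
    raw = sum(items[:14])
    return 100 * (raw - 14) / (70 - 14)


def score_phq9(items):
    """PHQ-9: 9 items rated 0-3, summed (range 0-27)."""
    _check(items, 9, 0, 3, "PHQ-9")
    return sum(items)


# Hypothetical responses, for illustration only.
print(score_wsas([4, 3, 5, 2, 4]), score_qlesq_sf([3] * 16), score_phq9([1] * 9))
```

Items 15 and 16 of the Q-LES-Q-SF are validated but deliberately excluded from the total, matching the description that only questions 1-14 are summed.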

Power Analysis
With 28 participants enrolled, we had >80% power to detect pre- to posttreatment effect sizes of d≥1.37 (very large effect sizes), assuming 30% dropout, a pre- to posttreatment correlation of 0.18, and a doubling of the SD from pre- to posttreatment. The pre- to posttreatment correlation estimate was based on the mean pooled correlation between pretest and posttest HAM-D scores in 14 CBT trials for adult depression [52], and the estimate of the detectable effect size was based on a single degree of freedom contrast in a paired means test implemented in SAS for Windows (version 9.4; SAS Institute).
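The stated assumptions can be checked with a rough calculation. The sketch below is not the SAS procedure used in the study: it assumes d is standardized by the baseline SD, rounds the expected number of completers to 20, and replaces the noncentral t distribution with a normal approximation, which slightly overstates power at this sample size. All function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def sd_of_change(sd_pre: float, sd_post: float, r: float) -> float:
    """SD of paired difference scores given both SDs and their correlation."""
    return sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)


def approx_paired_power(d: float, n: int, r: float, sd_ratio: float,
                        alpha: float = 0.05) -> float:
    """Normal-approximation power for a two-sided paired means test.

    d is the mean change in baseline-SD units (assumption); sd_ratio is
    SD_post / SD_pre. The negligible opposite-tail probability is ignored.
    """
    dz = d / sd_of_change(1.0, sd_ratio, r)  # change in SD-of-difference units
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(dz * sqrt(n) - z_crit)


enrolled = 28
completers = round(enrolled * (1 - 0.30))  # 30% dropout assumed in the text
power = approx_paired_power(d=1.37, n=completers, r=0.18, sd_ratio=2.0)
print(completers, round(power, 2))
```

Under these assumptions the approximation lands slightly above the 80% threshold, which is in line with the stated figure. The doubling of the SD and the low pre-post correlation inflate the SD of the change scores to roughly twice the baseline SD, which is why the detectable d is so large.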

Feasibility and Acceptability
We examined feasibility and acceptability by reporting (1) dropout rates and reasons (defined as participants not completing an end point HAM-D), (2) patient satisfaction (CSQ), (3) patient feedback (uMARS), (4) patient credibility and expectancy ratings (CEQ), and (5) treatment use. We computed means and SDs for the number of app steps completed, the number of days on which participants completed any actions in the app, the number of messages participants sent their therapist in the app, the number of sessions participants completed with their therapists, and session time spent per patient per week by therapists. For measures collected at least twice (ie, CSQ, CEQ credibility and expectancy, and treatment use), we used generalized linear mixed models with repeated measures to examine if these self-reported ratings changed over time.

Preliminary Efficacy and Secondary Outcomes
Analyses were first completed using our intent-to-treat sample (participants who completed a baseline assessment) and then repeated with our "per-protocol" sample (participants who completed posttreatment assessments and did not change psychiatric medications or begin psychotherapy during the study; n=24, 86%). We examined the preliminary efficacy of Mindset for Depression plus brief video-conferencing appointments with a therapist on symptoms and well-being outcomes using mixed model analyses with repeated measures (baseline, mid-treatment, posttreatment, and 3-month follow-up) modeled using an unstructured covariance matrix. We then compared pre- to posttreatment differences using a 2-tailed α of .05 to evaluate preliminary efficacy. We similarly compared pretreatment to end of follow-up estimates to estimate whether changes remained significant by the end of the follow-up. Means are presented as raw means with SDs, while differences between assessments are presented as model-estimated means with CIs (LSM differences [95% CI]) unless otherwise specified. Effect sizes were calculated as Hedges g_ave, which takes the correlation of within-participant scores into account [53]. Analyses were conducted for changes in depression symptoms (HAM-D and PHQ-9), functional impairment (WSAS), and quality of life (Q-LES-Q-SF). All analyses were completed using the SAS software (version 9.4, SAS Institute Inc).
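For readers unfamiliar with this effect size, one common formulation divides the mean change by the average of the two SDs and then applies a small-sample correction. The sketch below follows that formulation and is not the authors' SAS code: the function names, the correction factor with n-1 degrees of freedom, and the sample size of 27 are assumptions. The example inputs are the CEQ expectancy means and SDs reported in the Results.

```python
def cohens_d_av(mean_pre: float, sd_pre: float,
                mean_post: float, sd_post: float) -> float:
    """Mean change divided by the average of the pre and post SDs."""
    return (mean_post - mean_pre) / ((sd_pre + sd_post) / 2)


def small_sample_correction(n: int) -> float:
    """Approximate Hedges correction, using n - 1 degrees of freedom (assumed)."""
    df = n - 1
    return 1 - 3 / (4 * df - 1)


def hedges_g_av(mean_pre: float, sd_pre: float,
                mean_post: float, sd_post: float, n: int) -> float:
    """Bias-corrected within-group effect size."""
    return cohens_d_av(mean_pre, sd_pre, mean_post, sd_post) * small_sample_correction(n)


# CEQ expectancy: 13.8 (SD 3.3) at pretreatment, 15.0 (SD 5.4) at mid-treatment.
# n=27 is an assumed number of participants with both assessments.
g = hedges_g_av(13.8, 3.3, 15.0, 5.4, n=27)
print(round(g, 2))
```

Because the published values were computed from completers' raw data, a recomputation from rounded summary statistics should be expected to match only approximately.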

Overview
Study participants (N=28) were predominantly female (n=21, 75%), White (n=20, 71%), and single (n=19, 70%), with a mean age of 33.5 (SD 10.9) years. The majority of participants had college or advanced degrees, were employed full time, and came from urban or suburban locations (Tables 2 and 3). Nearly half of the participants (n=13, 46%) had one or more comorbid psychiatric diagnoses, and the average duration of MDD was 15.5 (SD 12.7) years.

Feasibility and Acceptability
Credibility and expectancy scores were moderate to high at pre- and mid-treatment. The ratings did not differ significantly between timepoints. Mean credibility ratings were 18.9 (SD 3.1) at pretreatment and 19.3 (SD 3.8) at mid-treatment (LSM difference 0.4, 95% CI -1.4 to 2.2; P=.63; g_ave=0.13). Mean expectancy ratings were 13.8 (SD 3.3) at pretreatment and 15.0 (SD 5.4) at mid-treatment (LSM difference 1.2, 95% CI -0.7 to 3.0; P=.22; g_ave=0.28). Of the 28 participants, 2 (7%) dropped out of the study prior to their posttreatment assessment at week 8 (Figure 2): 1 during the first week of treatment and 1 after the midpoint assessment; both participants were lost to follow-up despite repeated contact attempts, and neither provided a reason for dropping out. One more participant was lost to follow-up after the posttreatment assessment. Among the 26 participants who completed the posttreatment assessment, patient satisfaction was high and did not change significantly from mid-treatment (CSQ total score mean 26.3, SD 4.0) to posttreatment (mean 27.2, SD 3.3; LSM difference 1.0, 95% CI -0.1 to 2.2; P=.07; g_ave=0.24). Conservatively counting the 2 dropouts as not satisfied, 89% (25/28) were very (16/28) or mostly (9/28) satisfied and 93% (26/28) would recommend the Mindset for Depression program.
With respect to app use and satisfaction, participants reported practicing skills from the app on their smartphone and offline for a median of 50 (IQR 30-60) minutes per week up to mid-treatment and 60 (IQR 30-90) minutes per week between mid- and posttreatment. Based on passively collected app use data (n=26), participants accessed the app on a mean of 36.8 (SD 10.0) days, completed a median of 7 (IQR 6-8) of the 8 steps by the week 8 assessment, and sent a median of 0 (IQR 0-4) between-session messages to their therapist through the app. Five participants completed the last assigned step after the week 8 assessment, bringing step completion to a median of 8 (IQR 6-8) by the end of follow-up (3 months), with 58% (15/26) of participants completing the final step by then. Participants' overall ratings of the app quality, rated on the 1 (inadequate) to 5 (excellent) scale of the uMARS, were high (mean 4.3, SD 0.4); ratings of the app's functionality (mean 4.5, SD 0.6), aesthetics (mean 4.6, SD 0.4), and information (mean 4.6, SD 0.4) were higher than those of the engagement subscale (mean 3.6, SD 0.6). Participants reported a mean overall star rating of 4.0 (SD 0.5) but were less inclined to endorse that they would be willing to pay for the app (mean 2.5, SD 1.2).
With respect to therapist support, participants attended an average of 7.6 (SD 1.5) of the possible 8 brief sessions, each of which lasted a mean of 24.5 (SD 1.1) minutes. In the sessions, therapists mainly covered behavioral strategies (mean 10.9, SD 3.4 minutes), cognitive strategies (mean 6.9, SD 2.7 minutes), psychoeducation (mean 2.5, SD 2.9 minutes), and mindfulness strategies (mean 2.4, SD 1.2 minutes), with only a little time (<2 minutes on average) spent on explicit motivational strategies, risk management, and technical issues. Participants had a mean homework completion rate (per therapist report) of 82.7% (SD 13.8%).

Preliminary Efficacy and Secondary Outcomes
Over the course of the 8-week treatment, participants' depression severity decreased significantly on both the clinician-rated (HAM-D: P<.001; g_ave=1.47) and self-reported measures (PHQ-9: P<.001; g_ave=1.89; Table 4). Concurrently, participants' self-rated functional impairment decreased (WSAS: P<.001; g_ave=1.29), and their self-rated quality of life increased (Q-LES-Q-SF: P<.001; g_ave=1.74). These changes persisted through the 3-month follow-up, with effect sizes remaining largely the same (Table 4). The results did not differ meaningfully in the per-protocol analyses; HAM-D, PHQ-9, WSAS, and Q-LES-Q-SF scores all improved with statistically significant and large effect sizes (Table 5).

b Within-group effect sizes were calculated as Hedges g_ave for differences from baseline to week 8 or 20, respectively, using raw means for completers only.
c HAM-D: Hamilton Depression Rating Scale (score range 0 to 52, where higher scores indicate greater depression severity).
d PHQ-9: Patient Health Questionnaire-9 item (score range 0 to 27, where higher scores indicate greater depression severity).
e WSAS: Work and Social Adjustment Scale (score range 0 to 40, where higher scores mean higher functional impairment).
f Q-LES-Q-SF: Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form (percent score range 0% to 100%, where higher scores correspond to greater ratings of quality of life).

Adverse Events and Medication Changes
Overall, 17 out of 28 participants reported a total of 33 adverse events during the 8-week treatment phase of the trial in categories such as psychiatric symptoms (n=21, 64%; eg, increased suicidal ideation, depression, or anxiety, sleep difficulties, and emotional distress), infections (n=6, 18%; eg, COVID-19, shingles, and illness), physical injuries (n=3, 9%), and other (n=3, 9%). All adverse events were identified as either mild (new event that did not interfere with activities of daily living; 25/33, 75.8%) or moderate (new event that posed some interference or required intervention to prevent interference; 8/33, 24.2%). No serious adverse events occurred in this trial. When adverse events were rated for how likely they were to be related to treatment, most were found to be definitely unrelated (19/33, 57.6%), followed by possibly related (9/33, 27.3%) and unlikely to be related (5/33, 15.2%). A waxing and waning course of symptoms and suicidal ideation is common in MDD. However, there were no adverse events indicating significant clinical deterioration in the trial and no principal investigator-initiated withdrawals. Two participants changed psychotropic medications during the treatment phase: 1 participant increased the dosage of their medication for anxiety and 1 participant discontinued their medication for depression. There were no reported therapy changes during the treatment phase. In the 3-month follow-up phase of the trial, 7 participants reported an additional 10 adverse events in the categories of psychiatric symptoms (4/10, 40%), general disorders (2/10, 20%), and other (4/10, 40%), which were found to be definitely unrelated (5/10, 50%), unlikely to be related (1/10, 10%), and possibly related (4/10, 40%) to treatment. Also during the follow-up period, 5 participants changed psychotropic medications and 3 participants started individual therapy or counseling (non-CBT; for mood or anxiety, and traumatic event or PTSD).

Principal Findings
In this study, we examined the feasibility, acceptability, and preliminary clinical impact of Mindset for Depression, an 8-week app-based CBT with therapist support. The results support Mindset for Depression as a viable treatment option for individuals with moderate to severe MDD. Treatment was feasible to deliver in a remote setting and acceptable to patients who varied widely in age, severity of the symptoms, and other clinical and demographic dimensions, as indicated by high retention rates (26/28, 93%), favorable satisfaction ratings (CSQ), and positive user feedback (uMARS). The results also provided preliminary evidence that the treatment was efficacious. There was a significant reduction in clinician-rated (HAM-D) depression severity with a large effect size as well as significant improvement in functioning and quality of life. After just 8 weeks, about half of the participants were rated as treatment responders and a third were in remission, and these changes were maintained throughout the 3-month follow-up. These results are similar to those of face-to-face psychotherapy [54] and comparable to those of guided internet-delivered CBTs, which is notable given the short treatment duration, younger age, and relatively higher severity of the sample, all of which are associated with lower odds of response and remission in digital treatment [55].
Encouragingly, app ratings were above average for mental health apps [56]. Compared to other mHealth interventions for depression or anxiety [57][58][59][60] and treatment-as-usual for depression [61], overall treatment satisfaction scores (CSQ) were also excellent. This was achieved with less clinician time than typical courses of CBT for depression require (8 sessions averaging 24 minutes vs upwards of 20 sessions lasting on average 45-50 minutes) [62]. Indeed, much of the time-consuming didactic content (eg, psychoeducation about the CBT model) was administered by the app through readings, videos, and practice questions, conserving clinician time for more personalized skills review, practice, and tailoring and for addressing risk issues. App usage was strong, with most participants reaching the final step by the end of 8 weeks as intended and reporting regularly practicing skills on or off the app each week. Moreover, participants rarely used the messaging function between sessions (eg, seeking additional clarification or encouragement), which would be unbillable clinician time. In this way, Mindset for Depression has the potential to improve the reach of CBT therapists and hopefully reduce treatment gaps, particularly for underserved communities [63,64].
These results are important because the high cost and limited availability of trained clinicians are major barriers to the dissemination of traditional psychotherapy [65]. Our findings add to an emerging literature demonstrating the potential of guided smartphone-based CBT to mitigate these challenges. However, although numerous seemingly efficacious therapist-supported digital treatments have been created for depression [10], few are available outside of research settings or integrated into a health care system [25]. Created in collaboration with an industry partner to accelerate the dissemination pipeline and allow for ongoing technical maintenance and improvements, Mindset for Depression is commercially available and poised to be scalable and successful in real clinical settings. What sets Mindset for Depression apart is that it was developed collaboratively by a design team, clinicians, and people with lived experience, and that it rigorously applied user interface and user experience best practices for mobile platforms. Critically, "users" in the user-centered design process included both patients and clinicians. This approach aligns clinical and engagement incentives so that one is not delivered at the expense of the other and yields an easy-to-use, streamlined, and effective treatment program. Concretely, this translated to pacing content (delivering or unlocking intervention components in a stepwise manner to encourage practice and mastery of concepts and skills that build on one another), shortening activities, creating a professional and approachable tone, and including feedback loops. As standardized content is delivered via the app in each step, supporting clinicians are able to prioritize personalizing treatment and use their specialized skill sets, such as addressing unique barriers to motivation, engagement, or response. Critical next steps would be to directly evaluate the program and its readiness to scale in larger-scale effectiveness trials and real-world settings.
The results also provide important guidance for improving the program. First, participant feedback (eg, uMARS engagement subscale) indicates that increased customization and interactivity could improve the app's appeal. This is consistent with the larger literature showing user preferences for apps with such features and negative reactions to apps whose content is repetitive and not personally relevant [66]. Although most participants shared positive views of the Mindset app, including indicating that they would recommend the app to friends, there was a mixed response regarding their willingness to pay for the Mindset app. It is unclear to what extent this reflects (1) the timing of the question, which was asked after treatment, when patients, who were meant to complete all therapeutic content within the 8-week treatment period, may no longer have felt the need to use the app; (2) a gap between what patients find beneficial and what they are willing to pay for, as other studies have similarly found a reluctance to pay for mental health apps [66,67]; or (3) a perception that the app and concurrent therapist support were critically linked, so that the app alone was not as valued. Indeed, half of the participants indicated that weekly brief sessions were exactly the right amount of therapist contact and only 2 would have preferred less contact. Resolving this question will be necessary for developing a commercially sustainable implementation plan. Moreover, future iterations would benefit from broadening outcomes of interest. For example, beyond reducing symptoms of depression or other mental health concerns, an optimal intervention would also foster positive emotions and thriving and perhaps target common comorbidities, such as sleep difficulties or substance use. These outcomes should be captured in future studies and additional treatment components integrated as appropriate.

Limitations
The study has limitations that should be considered. First, the study has the inherent limitation of an open trial. Without a control group, we cannot conclusively determine that the treatment causes improvements in symptoms. Future controlled large-scale trials are needed. Second, the patient sample was self-selected, recruitment platforms were diverse, and our study therapists were trained in the use of digital therapeutics. Thus, patient and clinician stakeholders might have been biased toward individuals who are motivated by app-based therapy. Future research in real-world clinical settings is warranted. Third, although the large proportion of White women in the sample is consistent with past work and higher MDD prevalence and rate of treatment seeking in women [10], greater representation of patients with other racial and gender identities would strengthen our conclusions and ongoing treatment improvements. Fourth, we had adequate power to detect moderate to large treatment effects; a larger replication is needed to explore moderators and mediators, which are important for tiered care models. Finally, a longer follow-up period and health economics metrics would be required to see the full time and cost-savings potential of Mindset for Depression.

Conclusions
Mindset for Depression offers flexible app-led psychoeducation, skills practice, and support to patients with complementary clinician guidance to promote sustained engagement, monitor safety, and tailor treatment further to individual patient needs. The findings show that Mindset for Depression is a feasible, acceptable, and efficacious tool for adults with MDD. The hope is that such a program could be one cost-effective solution to barriers to psychotherapy dissemination and significantly increase access to evidence-based care. Although these initial results are very promising, more work remains to personalize the amount of therapist support and dose of treatment individuals receive to optimize treatment and increase rates of response and remission. The next steps include testing Mindset for Depression in a fully powered randomized controlled trial as well as in the real-world clinical settings in which it is deployed.

Figure 1. Screenshots from the Mindset for Depression smartphone app. CBT: cognitive behavioral therapy; SMART: specific, measurable, achievable, relevant, and time-bound.

Table 3. Baseline clinical characteristics of participants enrolled in the Mindset open trial.
Characteristics: Values
Duration of MDD (a) (years), mean (SD): 15.5 (12.7)
(a) MDD: major depressive disorder.
(b) DSM-5: Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition.
(c) Percentage sums may exceed 100% because participants could report more than 1 diagnosis or be on more than 1 psychotropic medication.
(d) SRI: serotonin reuptake inhibitor.
(e) This included anticonvulsants; no participant reported taking antipsychotics.

Figure 2. Flow of participants through the 8-week open trial of the Mindset for Depression app with brief therapist visits on the web for people with a primary diagnosis of major depressive disorder. Reasons for ineligibility include diagnosis of bipolar disorder or severe substance use disorder, PHQ-9 score <10, past CBT for MDD, acute, active suicidal ideation, and MDD not being the primary diagnosis. CBT: cognitive behavioral therapy.

Table 4. Baseline, mid-treatment (week 4), end-of-treatment (week 8), and follow-up (week 20) estimated mean scores on key clinical outcome measures.

Table 5. Baseline, mid-treatment (week 4), end-of-treatment (week 8), and follow-up (week 20) estimated mean scores on key clinical outcome measures in the per-protocol sample (n=24).
(a) LSM: least squares mean.

Table 1. Summary of steps in the Mindset for Depression program.

Table 2. Baseline demographics of participants enrolled in the Mindset open trial.