Predictors, Outcomes, and Statistical Solutions of Missing Cases in Web-Based Psychotherapy: Methodological Replication and Elaboration Study

Background: Missing cases present a challenge to our ability to evaluate the effects of web-based psychotherapy trials. As missing cases are often lost to follow-up, less is known about their characteristics, their likely clinical outcomes, or the likely effect of the treatment being trialed. Objective: The aim of this study is to explore the characteristics of missing cases, their likely treatment outcomes, and the ability of different statistical models to approximate missing posttreatment data. Methods: A sample of internet-delivered cognitive behavioral therapy participants in routine care (n=6701, with 36.26% missing cases at posttreatment) was used to identify predictors of dropping out of treatment and predictors that moderated clinical outcomes, such as symptoms of psychological distress, anxiety, and depression. These variables were then incorporated into a range of statistical models that approximated replacement outcomes for missing cases, and the results were compared using sensitivity and cross-validation analyses. Results: Treatment adherence, as measured by the rate of progress of an individual through the treatment modules, and higher pretreatment symptom scores were identified as the dominant predictors of missing cases probability (Nagelkerke R²=60.8%) and the rate of symptom change. Low treatment adherence, in particular, was associated with increased odds of presenting as missing cases during posttreatment assessment (eg, odds ratio 161.1:1) and, at the same time, attenuated the rate of symptom change across anxiety (up to 28% of the total 48% symptom reduction effect), depression (up to 41% of the total 48% symptom reduction effect), and psychological distress symptom outcomes (up to 52% of the total 37% symptom reduction effect) at the end of the 8-week window.
Reflecting this pattern of results, statistical replacement methods that overlooked the features of treatment adherence and baseline severity underestimated missing case symptom outcomes by as much as 39% at posttreatment. Conclusions: The treatment outcomes of the cases that were missing at posttreatment were distinct from those of the remaining observed sample. Thus, overlooking the features of missing cases is likely to result in an inaccurate estimate of the effect of treatment. (JMIR Ment Health 2021;8(2):e22700) doi: 10.2196/22700


Introduction

Background
The ability to evaluate the effect of psychotherapy often depends on the measurement of outcomes before and after an intervention. However, many participants do not complete measurement questionnaires and become missing cases, thus threatening the validity of conclusions drawn from trials. Missing cases are frequently reported in psychotherapy trials [1,2] and pose a risk to the validity of the evidence base for some treatments [3,4]. Overlooking the causes and outcomes of missing cases can lead to systematic measurement bias and misrepresentation of treatment outcomes and, therefore, risks compromising the validity of clinical research [5,6]. For this reason, careful analysis of the effect of missing cases is now considered an important part of the process of measuring and reporting clinical evidence [3].
Although the importance of handling missing cases is well understood [3,7], accounting for the outcomes of missing cases is a challenging task, as researchers can never verify whether the replacement values they generate accurately captured patient outcomes. Thus, researchers must rely on statistical approximation and the assumption that any replacement outcomes are suitable [8].
A key requirement for handling missing data is to ensure that the outcomes of missing cases are represented within statistical analyses [8]; typically, this involves using a statistical solution that generates replacement values for missing cases [5,8,9]. Researchers rely on statistical methods that explore the characteristics of missing cases to determine whether a statistical solution is suitable for missing cases and whether these features could also be associated with distinct clinical outcomes. This is typically achieved through analyses that identify variables that predict both the probability that participants will become missing cases and the clinical outcome of such missing cases [4,8,10]. Identifying such variables enables researchers to generate replacement scores that are likely to capture the outcomes of treatment for missing cases [7,10]. For example, if older age is associated with a decreased probability of becoming a missing case and an increased rate of symptom change, a statistical model that can adjust for participants' age will be considered to create replacement outcomes that are more accurate and representative of the effects of treatment than models that overlook age. In statistical terms, variables that predict both the likelihood of becoming a missing case and the outcome of missing cases are known as mechanisms of nonignorable missing cases [6,10,11].
Although statistical models that incorporate replacement values for missing cases have been in use for decades [7,8,12], relatively few published studies have reported the characteristics of missing cases in psychotherapy trials or research that identified nonignorable mechanisms of noncompletion that might influence the reported outcomes [2,13]. This gap in methodological research may result from (1) the limited knowledge about missing cases and the patient features that may generalize across clinical trials [2] and (2) the scarcity of large and comparable treatment samples that are statistically powered to explore nonignorable mechanisms of noncompletion.
Preliminary evidence from trials of internet-delivered cognitive behavioral therapy (iCBT) suggests that common patient variables, such as treatment completion and baseline depressive symptom severity, both predict the likelihood of patients dropping out of treatment and moderate the clinical effect [2,4]. These findings suggested that (1) the symptom outcomes of missing cases were not comparable with those of the patients who provided their data following treatment and (2) missing cases can be characterized through key features that shape the likelihood of a case presenting as missing during posttreatment assessment. In particular, minimal treatment adherence, as measured by the partial progress of an individual through the treatment modules, was associated with increased odds of presenting as a missing case during posttreatment assessment (eg, odds ratio 70.6, 95% CI 34.5 to 145.1) and a lower rate of symptom change (eg, 21% for low treatment adherence vs 49% for high adherence) [4]. Without accounting for these variables, web-based psychotherapy researchers risk overlooking a systematic pattern of worse treatment outcomes for missing cases and generating estimates of treatment effects that are unrealistically optimistic. However, the evidence regarding the effect of missing cases in internet-delivered psychotherapy is limited to a single study that focused on symptoms of depression using data from a highly controlled clinical trial with high participant retention (87%) [4]. Replicating this study in an additional therapeutic context and with additional clinical outcomes is needed before conclusions can be drawn regarding the characteristics and effect of missing cases in internet-delivered psychotherapy and the appropriate statistical methods for handling missing cases.

Objectives
The main aim of this study is to examine the characteristics and possible clinical outcomes of missing cases in a large sample in routine care and compare different statistical methods for estimating those outcomes. This study examined the outcomes of a large sample of patients enrolled in treatment courses provided by an established digital mental health service (DMHS) offering internet psychotherapy based on cognitive behavior therapy (n=6701), in which the patients were administered validated self-report questionnaires to measure symptoms of depression, anxiety, and psychological distress at baseline, at intervals during treatment, and at follow-up. It was hypothesized that (1) lower treatment completion and increased baseline depressive symptoms would predict both increased likelihood of noncompletion and higher symptoms of depression posttreatment and that (2) statistical models that account for these features will result in higher posttreatment symptom replacement scores compared with the statistical models that assume missing cases occur as a random event.

The Sample
This study examined the outcome of routine care provided by an Australian national DMHS, the MindSpot Clinic [14]. All participants provided consent for their deidentified data to be used in evaluation and quality improvement activities. Approval for this research was provided by the Macquarie University Human Research Ethics Committee. Further information about the sample, the course content and delivery protocols, and the outcomes of the iCBT can be found in a study by Titov et al [15]. The standardized nature of clinical engagement and treatment delivery in iCBT reduces the likelihood that differences in outcomes are due to differences in therapist approach.
The 6701 participants who commenced treatment during a 30-month period completed self-report symptom scales and provided other information pretreatment and completed symptom scales midtreatment (surveyed at Week 4), posttreatment (Week 8), and at follow-up (Week 20).
In this study, the emphasis was on the prediction of posttreatment symptom outcomes, as posttreatment was considered the main time point for evaluating the effects of treatment [15]. Of the participants who initiated treatment, 63.7% (4271/6701) provided data posttreatment; the remaining 36.3% (2430/6701) were considered missing cases, that is, individuals who did not comply with weekly email and telephone prompts to complete a posttreatment evaluation assessment. For cross-replication analysis, the sample was randomly allocated into 5 subgroups, each with more than 1340 participants pretreatment and more than 840 completed measurements posttreatment. Tables 1 and 2 collate the demographic information of the samples, including chi-square values, to confirm adequate randomization.
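The random allocation into 5 cross-replication subgroups can be sketched as follows. This is a minimal illustration only: the function name, the fixed seed, and the round-robin dealing are assumptions made for the example and are not the procedure reported by the authors.

```python
import random


def allocate_subsamples(participant_ids, n_groups=5, seed=2021):
    """Shuffle participant ids and deal them into n_groups near-equal subsamples."""
    ids = list(participant_ids)
    rng = random.Random(seed)  # fixed seed is an assumption, used only for reproducibility
    rng.shuffle(ids)
    # Dealing round-robin keeps group sizes within one participant of each other.
    return [ids[g::n_groups] for g in range(n_groups)]


# Hypothetical ids 1..6701 standing in for the participants who commenced treatment.
subsamples = allocate_subsamples(range(1, 6702))
sizes = [len(s) for s in subsamples]
print(sizes)
```

With 6701 participants and 5 groups, this dealing scheme yields groups of 1340 or 1341 participants, which is in line with the subgroup sizes described above.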

Intervention
The participants enrolled in the Wellbeing Course [15], a 5-lesson course delivered over 8 weeks to patients experiencing depression and anxiety. The lessons covered (1) the cognitive behavioral model and symptom identification, (2) thought monitoring and challenging, (3) de-arousal strategies and pleasant activity scheduling, (4) graduated exposure, and (5) relapse prevention. Additional material included downloadable lesson summaries, patient stories, and a range of resources on topics such as improved sleep, problem solving, and communication.
Each of the lessons provided homework assignments to assist participants in learning and applying the skills described in the lessons to their everyday lives.

Measures
The primary outcome measures for this study were standardized symptom scales for anxiety, depression, and psychological distress.

Patient Health Questionnaire-9
Patient Health Questionnaire-9 (PHQ-9) is a 9-item measure of depressive symptoms. Total scores range from 0 to 27 with higher scores indicating more severe depressive symptoms. PHQ-9 has demonstrated excellent reliability and validity in previous studies [16,17] and high internal reliability (Cronbach α=.848) and stability over time (assessment to pretreatment intraclass correlation=.72) within this sample.

Generalized Anxiety Disorder Scale-7 Item
Generalized Anxiety Disorder Scale-7 Item (GAD-7) is a 7-item measure of generalized anxiety. Total scores range from 0 to 21, with higher scores indicating more severe symptoms of anxiety. GAD-7 has shown excellent reliability and validity in previous studies [17,18] and high internal reliability (Cronbach α=.85) and stability over time (assessment to pretreatment intraclass correlation=.74) within this sample.

Kessler 10 Item
Kessler 10 Item (K-10) is a widely used 10-item measure of psychological distress. The scale has demonstrated adequate reliability and validity in previous studies [17,19] and within this sample (Cronbach α=.83; intraclass correlation=.71). Total scores range from 10 to 50 with higher scores indicating greater levels of psychological distress. The 10 to 50 score range was converted into a 0 to 40 range within the analysis of longitudinal symptom change.
The following measures were also included as possible independent variables or predictors that might predict clinical trajectory through treatment and noncompletion.

Comorbidity
Individuals were considered to have comorbidity if they demonstrated scores of both anxiety and depression above predetermined clinical thresholds (GAD-7 ≥8 and PHQ-9 ≥10 at baseline [17]).
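The comorbidity rule can be written as a small check. The sketch below assumes that both thresholds are inclusive (GAD-7 of 8 or more and PHQ-9 of 10 or more); the function name and the range validation are illustrative additions.

```python
GAD7_CUTOFF = 8   # anxiety threshold given in the text
PHQ9_CUTOFF = 10  # depression threshold, read here as "10 or more" (assumption)


def has_comorbidity(gad7_baseline, phq9_baseline):
    """Return True when both baseline scores reach their clinical thresholds."""
    if not 0 <= gad7_baseline <= 21:
        raise ValueError("GAD-7 total scores range from 0 to 21")
    if not 0 <= phq9_baseline <= 27:
        raise ValueError("PHQ-9 total scores range from 0 to 27")
    return gad7_baseline >= GAD7_CUTOFF and phq9_baseline >= PHQ9_CUTOFF


print(has_comorbidity(12, 14))  # both thresholds reached
print(has_comorbidity(12, 6))   # anxiety only
```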

Demographic Measures
These included age (in years at the start of treatment), gender, relationship status, pretreatment symptom scores, pretreatment anxiety scores, and educational attainment (Tables 3 and 4).

Treatment Completion
Treatment completion was measured by the progression of participants through the 5 modules of the course, consistent with definitions of treatment progression and adherence in eHealth interventions [20]. Completion was measured by (1) logging in to the assigned secured website and (2) accessing the lesson modules, either online, where the duration of participation could be recorded, or by downloading the lessons.

Identifying Predictors of Missing Cases and the Rate of Clinical Change
The characteristics of missing cases and the estimates of their likely outcomes were examined in 3 steps. All analyses were conducted using SPSS (IBM Corporation) version 25 and a dedicated R software package [21] for longitudinal power [22].

Missing Cases Probability
The first step aimed to identify the relative importance of variables that predicted the probability of becoming a missing case. Testing and modeling of the probability of missing cases followed the variable selection strategy outlined by Harrell [23] for logistic regression modeling. In this strategy, potential predictors were first tested through separate (univariate) logistic regression models, with the missing case status of the patient at posttreatment as the binary dependent variable. Subsequently, a stepwise variable selection analysis was used to identify the factors to include in the multivariate model; the candidates were treatment completion; baseline depression score; baseline anxiety score; and demographic variables, such as gender, age, employment status, educational attainment, and relationship status. Variables that significantly predicted the probability of becoming a missing case were retained in the final model of predictors of missing cases probability. Additional forward and backward model building techniques were also employed to replicate the findings of the stepwise variable selection analysis. Each possible predictor of missing cases was assessed for statistical significance at a more conservative P value of .01. In addition, the ability of each predictor to account for the variance in missing cases probability was represented with Nagelkerke R-squared values, which illustrate the predictive contribution of each variable and the variance it can account for in comparison with a model with no predictors [24]. The potential of each variable to differentiate between missing and nonmissing cases was evaluated with sensitivity (prediction of true positives; noncompletion), specificity (prediction of true negatives; observed), and overall prediction accuracy statistics, such as the receiver operating characteristic.
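The model summary statistics named above can be sketched as follows. This is an illustrative sketch only: it uses the standard definition of the Nagelkerke R-squared (the Cox and Snell value divided by its maximum), the function names are hypothetical, and a fitted model log-likelihood would have to come from the actual logistic regression, which is not reproduced here.

```python
import math


def null_log_likelihood(n_missing, n_total):
    """Log-likelihood of an intercept-only logistic model for a binary outcome."""
    p = n_missing / n_total
    return n_missing * math.log(p) + (n_total - n_missing) * math.log(1 - p)


def nagelkerke_r2(ll_null, ll_model, n):
    """Cox and Snell R-squared rescaled by its maximum attainable value."""
    cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
    max_cox_snell = 1 - math.exp(2 * ll_null / n)
    return cox_snell / max_cox_snell


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    return tp / (tp + fn), tn / (tn + fp)


# Null model for the reported sample: 2430 missing cases out of 6701 participants.
ll0 = null_log_likelihood(2430, 6701)
print(round(ll0, 1))
```

A model that fits no better than the null model gives a value of 0, and a model that predicts every case perfectly (log-likelihood of 0) gives a value of 1, which is the property that makes the statistic readable as a proportion.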

Moderators of Clinical Change
Longitudinal statistical models were also employed to test the influence of baseline and treatment variables on the rate of symptom change. Together, these models sought to identify variables that jointly predicted missing cases and the rate of symptom change, where a significant result on both outcomes would imply a mechanism of missing cases. Longitudinal predictors of symptom change were examined using generalized estimating equation (GEE) models [25] that included a time covariate, each of the predictors as a main effect, and a time by predictor interaction. In these models, the coefficient of change between pre- and posttreatment (β_time) represents the average rate of pre-post symptom change (longitudinal change from baseline) after accounting for within-subject variance (repeated individual scores over time). The moderation of symptom change following treatment was tested by examining the time by covariate interaction (eg, β_time × β_gender). All models included a gamma scale, an unstructured within-subject correlation matrix, and a log link function to account for positive skewness and the proportional pattern of symptom change from baseline [26]. These models were tested with the overall sample and retested within each of the 5 subsamples. Cross-replication sought to test whether the characteristics of missing cases could be observed reliably within cross-validation subsamples.
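Because the models use a log link, the time coefficient and the time by predictor interaction act multiplicatively on the symptom score. The following sketch shows how such coefficients translate into a proportional change from baseline. All coefficient values are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from this study.

```python
import math


def predicted_mean(b0, b_time, b_x, b_time_x, time, x):
    """Mean symptom score under a log link: the exponent of the linear predictor."""
    return math.exp(b0 + b_time * time + b_x * x + b_time_x * time * x)


def percent_change(b_time, b_time_x=0.0, x=0):
    """Pre-post change (in percent) implied by the time and interaction coefficients."""
    return (math.exp(b_time + b_time_x * x) - 1) * 100


# Hypothetical coefficients (assumptions for illustration only).
b0 = math.log(12.0)        # baseline mean of 12 points in the reference group
b_time = math.log(0.52)    # posttreatment mean is 52% of baseline: a 48% reduction
b_x = 0.0                  # no baseline difference between groups
b_time_x = math.log(1.5)   # interaction: change is attenuated when x = 1

reference = percent_change(b_time)              # reference group (x = 0)
moderated = percent_change(b_time, b_time_x, 1)  # group with x = 1
print(round(reference, 1), round(moderated, 1))
```

In this illustration, a positive interaction coefficient shrinks the reduction from 48% to 22%, which is the sense in which a predictor is said to moderate the rate of symptom change.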

Power Analyses
A power analysis was conducted for both the GEE longitudinal models of symptom change, and the binary logistic regression models of missing cases probability at posttreatment [27]. To estimate power, these analyses used the observed statistical parameters from pilot GEE models, such as the rate of change over time, the variance of symptom scores at each time point, and within-subject correlation. This information was then used to determine the minimal differences in the rate of longitudinal change (moderation of longitudinal change) that could be refuted as false negatives [22]. The pilot data used to determine the overall rate of change were replication sample 1 (n=1341), and the differences from the overall rate of symptom change, or missing cases likelihood, were calculated as the relative difference (expβ) from the overall rate of change. These power analyses determine whether nonsignificant tests of symptom change variance, or missing cases probability, are genuine nonsignificant results or whether certain nonsignificant results could be masked by the size of the sample. Separate power estimates were created for the GEE models of symptom change and the binary logistic regression models of missing cases probability. All analyses also specified the probability of power at 80% and a probability of Type I error of .05. The resulting power estimates are further described in the Results section.

Comparison of Different Missing Cases Outcome Approximation Models
Approximated missing cases replacement scores were generated using several types of stratified longitudinal models and evaluated side by side. Models differed from one another by the inclusion of different covariates and a covariate by time interaction term. For example, by including a covariate such as gender and a time-by-gender interaction term, the replacement outcome score predicted for a missing case is considered to approximate the clinical outcome expected for that individual as a male or a female. The inclusion of different covariates in the models is thought to test different assumptions about why patients were missing and to lead to an adjusted prediction of their likely outcomes [5,8]. In statistical terms, the conditional adjustment of missing cases outcomes by different variables is often referred to as the replacement of missing cases under a conditional missing at random (MAR) assumption [5,8].
In contrast to the adjusted models, unadjusted models assumed that posttreatment missingness occurred as a completely random event. In these models, the probability of missingness was assumed to be without any systematic characteristics and was unrelated to the patient's outcome [5,8]. These models included no individual patient covariates, other than the time coefficient, and were labeled as missing completely at random (MCAR). Under such MCAR models, the average replacement of missing cases would reflect the average outcome of the remaining sample of completers, given that missing cases are not assumed to differ from their completer peers.
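The contrast between unadjusted (MCAR) and adjusted (MAR) replacement can be illustrated with a deliberately simplified sketch that uses stratum means in place of the longitudinal models described here. The data, the function names, and the use of lessons completed as the only stratifying variable are assumptions made for the example.

```python
from statistics import mean


def mcar_replacement(completer_scores):
    """Single replacement value: the overall posttreatment mean of completers."""
    return mean(score for _, score in completer_scores)


def mar_replacement(completer_scores, missing_strata):
    """Replace each missing case with the completer mean of its own stratum.

    Raises KeyError if a missing case belongs to a stratum with no completers.
    """
    by_stratum = {}
    for stratum, score in completer_scores:
        by_stratum.setdefault(stratum, []).append(score)
    stratum_means = {s: mean(v) for s, v in by_stratum.items()}
    return [stratum_means[s] for s in missing_strata]


# Hypothetical completers: (lessons completed, posttreatment PHQ-9 score).
completers = [(5, 4), (5, 6), (5, 5), (1, 10), (1, 12)]
# Hypothetical missing cases, described only by lessons completed.
missing = [1, 1, 1, 5]

unadjusted = mcar_replacement(completers)
adjusted = mar_replacement(completers, missing)
print(unadjusted, adjusted, mean(adjusted))
```

When missing cases are concentrated in the low-completion stratum, as in this toy example, the adjusted replacement scores average higher than the single unadjusted value.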
Missing cases were also replaced through statistical methods such as multiple imputation and a predictive longitudinal mixed model, which included random slopes and random intercepts [9]. The replacement outcomes from such models were used to compare the estimation of missing cases replacement across different types of statistical methods. This addition was intended to establish that the impact of nonignorable missing case mechanisms would be observed regardless of the statistical technique used. Finally, the results of nonstatistical methods for missing cases replacement, such as the last observation carried forward (LOCF) method and the baseline observation carried forward (BOCF) method, were compared.
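The two carried-forward methods can be sketched in a few lines. The example scores and the use of None to mark a missing assessment are assumptions made for illustration.

```python
def locf(scores):
    """Last observation carried forward: fill gaps with the latest observed score."""
    filled, last = [], None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


def bocf(scores):
    """Baseline observation carried forward: fill gaps with the first (baseline) score."""
    baseline = scores[0]
    return [baseline if score is None else score for score in scores]


# Hypothetical PHQ-9 scores at pretreatment, Week 4, and Week 8 (None = missing).
case = [15, 11, None]
print(locf(case))  # Week 8 takes the Week 4 score
print(bocf(case))  # Week 8 takes the pretreatment score
```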
To gauge the accuracy and impact associated with the different replacement models, adjusted models (MAR) were compared and interpreted as either overestimating, underestimating, or being equivalent to models that overlook the features of missing cases (MCAR models). Specifically, if the CI of the mean from an adjusted model was within the CI of the mean from an unadjusted model, evidence of statistical equivalence was concluded [28]. If the CI of the mean replacement scores from an adjusted model was outside the mean of the scores from the unadjusted models, the models were considered to approximate distinct (statistically significant) symptom outcomes.
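The comparison rule described above can be expressed as a small decision function. The sketch follows one reading of the rule; the label for cases that satisfy neither condition ("inconclusive") and all numeric values are assumptions added for the example.

```python
def compare_replacement(adjusted_ci, unadjusted_mean, unadjusted_ci):
    """Classify an adjusted (MAR) estimate against an unadjusted (MCAR) estimate."""
    lower, upper = adjusted_ci
    u_lower, u_upper = unadjusted_ci
    # Adjusted CI lies entirely inside the unadjusted CI: treated as equivalent.
    if u_lower <= lower and upper <= u_upper:
        return "equivalent"
    # Adjusted CI excludes the unadjusted mean: treated as a distinct outcome.
    if lower > unadjusted_mean:
        return "higher than unadjusted"
    if upper < unadjusted_mean:
        return "lower than unadjusted"
    return "inconclusive"  # neither condition met (label is an assumption)


# Hypothetical unadjusted estimate: mean 6.0, 95% CI 5.8 to 6.2.
print(compare_replacement((7.5, 9.0), 6.0, (5.8, 6.2)))
print(compare_replacement((5.9, 6.1), 6.0, (5.8, 6.2)))
```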

Predictors of Missing Cases and the Rate of Clinical Change
Results from the logistic regression models and testing for predictors of missing cases at posttreatment are presented in Table 3.
The effect of increased baseline severity demonstrated that for every additional PHQ-9 point at baseline, the odds of a participant becoming a missing case at posttreatment increased by 2% or 0.7% as a measure of relative risk (eg, 2% of 36%). Similarly, a 1-point increase in psychological distress at baseline, as measured by K-10, increased the odds of an individual becoming missing by 1.6% or 0.56% as a measure of relative risk.
The age of the participant was associated with a reduced probability of presenting as a missing case, with each additional year of age reducing the odds of becoming a missing case by 3.3% or 1.2% as a measure of relative risk. However, treatment completion, which is the number of lessons completed during treatment, was the dominant predictor of missing cases and accounted for 60.3% of the total 60.8% probability variance of missing cases. The disparity among different rates of treatment completion demonstrated that only 9.80% of participants who completed the entire program did not complete the posttreatment assessment, whereas more than 95% of those who completed only one lesson were missing cases posttreatment.
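The rescaling used above for baseline depressive severity and age, in which a relative change in odds is expressed against the overall missing cases rate of about 36%, amounts to a single multiplication. The sketch below only mirrors that arithmetic; the function name is hypothetical, and the calculation is a simple rescaling rather than a formal relative risk estimate.

```python
BASE_MISSING_RATE = 0.36  # overall share of missing cases at posttreatment (rounded)


def scaled_to_base_rate(relative_change, base_rate=BASE_MISSING_RATE):
    """Express a relative change in odds as a share of the overall missing rate."""
    return relative_change * base_rate


phq9_effect = scaled_to_base_rate(0.02)   # 2% per additional PHQ-9 point
age_effect = scaled_to_base_rate(0.033)   # 3.3% per additional year of age
print(round(phq9_effect * 100, 2), round(age_effect * 100, 2))
```

These products, roughly 0.72% and 1.19%, correspond to the rounded values of 0.7% and 1.2% quoted in the text.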
An interaction between the severity of depressive symptoms at baseline and treatment completion was found to be nonsignificant (Wald χ²(1,6701)=2.2, P=.71 for treatment completion × baseline symptoms), as was the age by treatment completion interaction (Wald χ²(1,6701)=4.9, P=.30 for age × treatment completion). These nonsignificant interactions imply that baseline symptom severity, age, and treatment completion were distinct predictors of missing cases probability and were independently impacting missingness (eg, additive effects that are not conditional on one another). Table 4 provides estimates of different missing cases predictors and the replication of these results within each of the 5 subsamples.

Power Analyses of Missing Cases Probability Models
Post hoc power analyses of the missing cases models illustrated that the 5 replication subsamples were powered to refute false-negative effects that were as little as 10% of the overall sample probability of missing cases. For example, sample 1 (n=1341) was powered to refute false-negative predictors that moderated the probability rate of missing cases by 3.6% or more (10% of the 36% who did not complete the posttreatment assessment). Refuting nonsignificant tests of predictors that were smaller than 3.6% required a sample larger than the sample available (1341). The power to refute nonsignificant results can be illustrated with the test of the gender predictor in Table 5, where missing cases of males were estimated as 33% and that of females at 37%. The difference between males and females was not statistically significant, and the sample in this study was large enough to refute this difference as a genuine nonsignificant (true negative) result, with a power of at least 80%.

Predictors of the Rate of Clinical Improvement
Variables that moderated the rate of symptom improvement were also tested to determine whether the variables identified as predictors of missingness also moderated the rate of symptom change over time. The coefficient statistics in Tables 5 to 7 illustrate the symptom change moderation associated with each independent variable for each of the 3 symptom outcomes, with the results presented in separate tables for depressive symptoms (Table 6), anxiety symptoms (Table 5), and psychological distress symptoms (Table 7).
Table 6 shows that posttreatment depressive symptoms were moderated by treatment completion, all 3 baseline symptom levels, and relationship status, all presenting with significant predictor by time interactions. Thus, increased baseline symptom severity, increased treatment completion, and relationship status significantly increased the rate of depressive symptom improvement in therapy.
Significant predictors of the rate of change in anxiety symptoms were similarly identified. Specifically, increased baseline anxiety symptoms, increased treatment completion, and the relationship status in treatment seemed to increase the rate of symptom change. The results of the anxiety moderators are presented in Table 5.
Analyses exploring moderators of general psychological distress (K-10) yielded the same pattern, with the results presented in Table 7, showing treatment completion, baseline severity, and relationship status to significantly moderate changes in psychological distress.

Power Analyses of Symptom Change Rate Models
Post hoc power analyses of the GEE symptom change models demonstrated that each of the 5 replication subsamples was adequately powered to determine which variables were nonsignificant if they moderated the rate of symptom change by as little as 12% of the total depression symptom change effect (5.7% of 48%). Within the anxiety symptom change models, the sample was powered to refute nonsignificant predictors that moderated 12% of the total anxiety symptom reduction (5.7% of 48%) and 12% of the total psychological distress symptom reduction (4.4% of 37%). Refuting predictor effects that were smaller than 5.7% (PHQ-9 and GAD-7) and 4.4% (K-10) required a sample that was larger than the 842 participants available in each of the subsamples.

Identified Mechanisms of Nonignorable Missing Cases
The predictors of treatment completion, baseline symptoms, and, to a lesser extent, relationship status demonstrated an association with both the likelihood of missing data at posttreatment and the rate of symptom change over time. These results confirm that treatment completion and, to a lesser extent, baseline symptoms were significantly associated with both noncompletion and the rate of symptom change.
The association of treatment completion and baseline symptoms with both clinical improvement and the risk of presenting as missing cases is illustrated in Figure 1 (missing cases probability at posttreatment and symptom change, associated with program completion) and Figure 2 (missing cases and symptom change trends associated with depressive symptom baseline severity and depressive symptom outcomes). These figures illustrate how the probability of missing cases is likely to increase for those individuals who also experience higher depressive symptoms at the end of the treatment period (8 weeks), as a result of low treatment completion (Figure 1) and increased baseline symptoms (Figure 2).

Comparison of Replacement Outcomes From Different Statistical Models
In this step, the statistical approximation of replacement symptom outcomes was compared among 3 types of statistical models: (1) models that adjust for the predictors that form missing cases mechanisms (eg, treatment completion), (2) models that adjust only for time (Completer's analysis), and (3) models that adjust for predictors that are not considered to be a cause of missing cases (eg, gender, age, education). These models differ from one another by the inclusion of different covariates that adjust the projected outcomes of missing cases. Tables 8 to 10 present the approximated mean PHQ-9, GAD-7, and K-10 scores and the CIs for the replacement scores for the various models.

[Table fragment, truncated in the source: models adjusted for predictors that form nonignorable missing cases mechanisms (missingness and GAD-7 outcome moderators). The recoverable rows, including baseline anxiety symptoms (GAD-7) and baseline depressive symptoms (PHQ-9) under MAR, are marked "significant scores above MCAR."]

Tables 8 to 10 demonstrate that the statistical models that adjust their estimates of missing cases outcome according to the prominent characteristics of missing cases resulted in the prediction of increased symptom outcomes and a more restrained estimation of the treatment effect. For example, missing cases replacement models that account for the rate of treatment completion resulted in PHQ-9 estimates that were 29% higher than the outcomes from the Completer's analysis (Table 8). Similarly, missing cases replacement models that adjusted for both baseline and treatment completion resulted in outcomes that were 39% higher than the average treatment effect. In contrast, the application of models that adjust missing cases replacement scores by covariates that only predict missing cases (eg, age) or the rate of symptom change (eg, relationship status) did not result in missing cases symptom estimates that were different than average (nonadjusted MCAR models).
The influence of nonignorable mechanisms of missing cases is repeated in Table 9 (GAD-7) and Table 10 (K-10). Accounting for the role of low treatment completion in missing cases increased the projected symptom scores for missing cases by 20%. When the role of baseline symptom severity was also included in the replacement procedure, the predicted missing cases outcomes increased to nearly 30% above the average symptom outcome scores. In contrast, models that adjust their predicted outcome by variables that do not jointly predict missing cases and symptom change have resulted in outcomes that were very close to those of the completers.
A comparison between the GEE replacement estimation, multiple imputation, and mixed model-based replacement also demonstrates that the effect of treatment completion could be reliably observed across different statistical techniques. For example, the multiple imputation and mixed model replacement methods that accounted for a measure of treatment completion (stratified) resulted in symptom replacement outcomes that were higher than the unadjusted estimates and comparable with the GEE estimates across all symptom outcomes: depression (PHQ-9), anxiety (GAD-7), and psychological distress (K-10).
Finally, the LOCF and BOCF replacement methodologies were compared with the other methods. Tables 8 to 10 show that, under BOCF and LOCF, replacement scores for missing cases were higher than the statistically approximated outcomes for completers.
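The two carry-forward rules compared above can be illustrated with a minimal sketch. This is not the code used in the study: the function names and the example scores are hypothetical, and each participant is assumed to have an ordered list of scores with the baseline first and `None` marking a missed assessment.

```python
from typing import Optional, Sequence


def locf(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Last observation carried forward: the most recent non-missing score."""
    for value in reversed(scores):
        if value is not None:
            return value
    return None  # no assessment was ever observed


def bocf(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Baseline observation carried forward: the first (pretreatment) score."""
    return scores[0] if scores else None


# Hypothetical PHQ-9 scores: baseline, midtreatment, posttreatment (None = missing).
participant = [16.0, 12.0, None]
post_locf = locf(participant)  # carries the midtreatment score forward
post_bocf = bocf(participant)  # carries the baseline score forward
```

Both rules assume no further change after the carried-forward assessment, which is why they tend to yield higher (more conservative) replacement scores than model-based approximations.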

Principal Findings
The aim of this study was to better understand the characteristics of missing cases and to compare methods for estimating the symptom outcomes of missing cases in psychotherapy. The results identified two variables, (1) the treatment adherence rate, defined as the rate of module progression through a treatment protocol, and (2) the severity of symptom scores before treatment, that moderated both the probability of a case presenting as missing at posttreatment assessment and the rate of symptom reduction. Low treatment adherence in particular was the dominant predictor of a case presenting as missing at posttreatment evaluation (odds ratio 162.1:1) and, at the same time, dulled the rate of symptom reduction by up to 29%, 41%, and 52% for anxiety, depression, and psychological distress symptoms, respectively. These results are congruent with preliminary research [4] and suggest that the effect of missing cases is fundamental to the measurement of clinical evidence and is of vital importance to anyone interested in a complete and unbiased account of the efficacy of psychological treatment.
With regard to the hypotheses stated, the first hypothesis, that treatment completion and the severity of symptoms at baseline would predict both the likelihood of missing cases and symptom outcomes, was supported. Treatment completion accounted for most of the missing case probability variance at posttreatment (R²<60%). More than 95% of participants who completed all of the intervention provided symptom data at posttreatment, compared with 5% of those who completed a single module. Consistent with previous research in psychotherapy, treatment completion also moderated the rate of symptom improvement for depression, anxiety, and distress, suggesting a positive dose-response relationship in the efficacy of iCBT [29,30]. Specifically, individuals who completed more of the treatment modules demonstrated up to double the rate of symptom change for psychological distress, depression, and anxiety within the same 8-week period.
The identification of treatment completion, noncompletion, and clinical outcomes as related concepts, in a very large sample and across multiple outcomes, confirms findings from earlier studies of factors associated with outcomes in psychotherapy [1,29,30]. However, comparatively few studies of psychotherapy outcomes have examined the relationship between these variables; instead, treatment completion, reasons for dropping out of treatment, and clinical outcomes have been defined as distinct outcomes [20] and explored as parallel outcomes in meta-analyses of noncompletion [2] or in studies of predictors of noncompletion [13].
The findings of this study are also consistent with those of previous studies [4], which suggested that noncompleters were likely to have significantly worse treatment outcomes that would be overlooked without adjusting for the rate of treatment completion and the severity of a patient's symptoms at baseline. The comparison of statistical techniques demonstrated the effect of these variables on the replacement outcomes, regardless of the statistical technique employed. For this reason, to produce accurate and representative replacement estimates for missing cases, it is recommended that researchers account for the relationship between treatment completion, the probability of completion, and the rate of improvement of symptoms.
The key recommendation arising from these findings concerns the measurement and evaluation of treatment outcomes in both clinical trials and routine care. At present, missing case patterns are mostly overlooked [9] despite being common and comprising a substantial portion of samples examined in psychotherapy research [1]. To date, there has been comparatively little research attempting to examine the suitability of different statistical methods to handle missing cases.
The second aim of this study was to explore the suitability of different statistical solutions for replacing the outcomes of missing cases and to identify methodological opportunities for psychotherapy researchers. From the range of patient characteristics, 2 types of models were identified: (1) models that included the key nonignorable mechanisms of treatment completion and (2) models that included alternative, less dominant predictors, such as age, gender, and education. For example, the analyses of psychotherapy patient characteristics demonstrated that higher psychological distress symptoms at baseline, higher depressive symptoms at baseline, and relatively younger age also predicted an increased probability of noncompletion. However, this study found that age, gender, and baseline symptoms are limited in their ability to account for the variance in missing cases (R²<5%) or for the outcomes of missing cases. In contrast, treatment completion far outweighed other competing explanations for missing cases. In this manner, the study results supported the second hypothesis, which postulated that models that adjust for treatment completion and baseline severity would be more representative of the outcomes of missing cases.
In technical statistical terms, the joint association of the treatment adherence variable with both the probability of missingness and the rate of symptom change is considered to demonstrate a nonignorable mechanism of missing cases. Simply put, the results show that missing cases do not occur as a random event and that the outcomes of missing cases are not comparable with those of the remaining sample. This study, together with previous research [4], demonstrated that the inclusion of a single key treatment adherence covariate is enough to substantially improve the prediction and replacement of missing cases outcomes. Such findings support the proposed recommendation to use treatment completion as a key mechanism of missing cases and as an adjustment variable in the process of approximating missing cases outcomes [5,31].
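The consequence of ignoring a covariate that is jointly associated with missingness and outcome can be sketched with a deliberately simplified stand-in for the models in this study. The example below uses plain stratum means rather than GEE, multiple imputation, or mixed models, and all data, names, and strata are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def unadjusted_replacement(records):
    """Replace every missing score with the overall observed mean (MCAR-style)."""
    overall = mean(score for _, score in records if score is not None)
    return [overall for _, score in records if score is None]


def stratified_replacement(records):
    """Replace each missing score with the observed mean of its adherence stratum.

    Assumes every stratum has at least one observed score.
    """
    by_stratum = defaultdict(list)
    for stratum, score in records:
        if score is not None:
            by_stratum[stratum].append(score)
    stratum_means = {s: mean(v) for s, v in by_stratum.items()}
    return [stratum_means[s] for s, score in records if score is None]


# Hypothetical (adherence stratum, posttreatment score) pairs; None = missing.
# Low adherence is both more often missing and associated with higher scores.
records = (
    [("low", 12.0), ("low", 14.0)] + [("low", None)] * 4
    + [("high", s) for s in (4.0, 5.0, 6.0, 5.0, 4.0, 6.0)] + [("high", None)]
)

unadjusted_mean = mean(unadjusted_replacement(records))  # overall observed mean
adjusted_mean = mean(stratified_replacement(records))    # weighted toward "low"
```

In this toy data set, the unadjusted replacement assigns every missing case the average of the observed sample, whereas the stratified replacement assigns most missing cases the higher low-adherence mean, mirroring the direction of the difference reported in Tables 8 to 10.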

Limitations and Future Directions
The findings must be considered in light of several key limitations. First, the demonstration of missing cases, their characteristics and outcomes, and the suitability of replacing missing cases through adjusted models can only be considered preliminary and, at this time, relevant to iCBT [15]. Given that missing cases estimates vary between treatments [2,9], it is possible that the patterns, predictors, and outcomes of missing cases also vary between treatment models. Although this study employed extensive cross-validation efforts, the trajectories of missing cases identified in this sample should be considered preliminary and experimental. Replication of these findings using different treatments could affirm the generalizability of early treatment completion as a key mechanism of missing cases and the importance of treatment completion for clinical improvement in psychotherapy. Specifically, additional and more detailed replications across different clinical contexts, such as trials with differing outcome measurement methodologies (eg, self-reported vs clinical diagnosis [32]), differing levels of treatment intensity [30], and differing timelines within study methodology [33], are needed to further verify the validity of treatment adherence as a mechanism that shapes the prediction of missing cases outcomes in psychotherapy research.
Second, this study was unable to examine other variables influencing the trajectories of missing cases or to test all of the theoretical causes of missing cases, for example, the effect of the interaction between a participant and an individual therapist (despite the regimented nature of iCBT) or the intervention of external events affecting participation. Other possible variables include the presence of major depression [32,34], perception of treatment credibility [35], and motivation [13], which can also affect treatment completion and the trajectory of participants in psychotherapy. Future studies may consider a more direct or more sophisticated measurement of participant engagement, such as motivation and time spent engaged with treatment, and even directed follow-up surveys to explore why patients dropped out of treatment and lapsed out of the assessment protocol.
In addition, although not a limitation of this study, it is important to note that statistical replacement models adjusted by treatment completion and baseline symptoms may not be realistic in studies involving small samples [27]; many psychotherapy trials involve samples of fewer than 50 patients and do not have the statistical power to confirm the associations found in this study. In smaller studies, LOCF for cases that do not complete treatment (eg, less than 80% adherence) could be combined with replacement values from unadjusted (MCAR) models for cases that complete treatment in full. Such an approach could result in a less statistically demanding procedure that balances overly conservative LOCF statistics with overly liberal unadjusted model approximation [1].
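A minimal sketch of such a combined rule is shown below. It is an illustration only: the field names are hypothetical, the observed posttreatment mean stands in for the estimate of an unadjusted model, and the 80% cutoff is applied as a simple threshold, so cases at or above it are treated as completers.

```python
from statistics import mean

ADHERENCE_THRESHOLD = 0.8  # "less than 80% adherence", as suggested in the text


def hybrid_replacement(cases, threshold=ADHERENCE_THRESHOLD):
    """Fill missing posttreatment scores with LOCF or an unadjusted estimate.

    Each case is a dict with hypothetical keys:
      'adherence'     - proportion of treatment completed (0 to 1)
      'last_observed' - most recent observed symptom score
      'post'          - posttreatment score, or None if missing
    """
    # Stand-in for an unadjusted (MCAR) model: mean of observed posttreatment scores.
    unadjusted = mean(c["post"] for c in cases if c["post"] is not None)
    filled = []
    for case in cases:
        if case["post"] is not None:
            filled.append(case["post"])           # observed, keep as is
        elif case["adherence"] < threshold:
            filled.append(case["last_observed"])  # low adherence: LOCF
        else:
            filled.append(unadjusted)             # completer: unadjusted estimate
    return filled


# Hypothetical small sample.
cases = [
    {"adherence": 1.0, "last_observed": 8.0, "post": 5.0},
    {"adherence": 1.0, "last_observed": 9.0, "post": 7.0},
    {"adherence": 0.2, "last_observed": 15.0, "post": None},
    {"adherence": 1.0, "last_observed": 10.0, "post": None},
]
filled_scores = hybrid_replacement(cases)
```

Here the low-adherence case keeps its last observed score, whereas the missing completer receives the average of the observed posttreatment scores.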
In conclusion, this study aimed to explore the characteristics of missing cases, the possible clinical outcomes of missing cases in internet-delivered psychotherapy, and the suitability of different strategies for accounting for the outcomes of missing cases in psychotherapy trials. The findings of this study suggest that (1) missing cases are associated with lower treatment completion, (2) the clinical trajectories of missing cases are not likely to be similar to those of the average participant, and (3) overlooking the nonignorable mechanisms of missing cases is likely to result in erroneous replacement of missing cases outcomes and inflated estimates of treatment effects. Researchers therefore need to consider how they account for the outcomes of missing cases in psychotherapy trials where nonignorable missing cases mechanisms are likely to occur. Accounting for missing cases in this manner provides a more realistic estimate of treatment effects in the real world, where some participants are expected to drop out. More complete estimates of this kind can contribute toward more accurate psychotherapy evaluation and outcome modeling.