The Relationships of Deteriorating Depression and Anxiety With Longitudinal Behavioral Changes in Google and YouTube Use During COVID-19: Observational Study

doi:10.2196/24012

Original Paper

¹Department of Computer Science, University of Rochester, Rochester, NY, United States

²Department of Urban-Global Public Health, Rutgers University, Piscataway and Newark, NJ, United States

*these authors contributed equally

Corresponding Author:

Anis Zaman, PhD

Department of Computer Science

University of Rochester

2513 Wegmans Hall

Rochester, NY, 14627

United States

Phone: 1 6262981861

Email: azaman2@cs.rochester.edu

Background: Depression and anxiety disorders among the global population have worsened during the COVID-19 pandemic. Yet, current methods for screening these two issues rely on in-person interviews, which can be expensive, time-consuming, and blocked by social stigma and quarantines. Meanwhile, how individuals engage with online platforms such as Google Search and YouTube has undergone drastic shifts due to COVID-19 and subsequent lockdowns. Such ubiquitous daily behaviors on online platforms have the potential to capture and correlate with clinically alarming deteriorations in depression and anxiety profiles of users in a noninvasive manner.

Objective: The goal of this study is to examine, among college students in the United States, the relationships of deteriorating depression and anxiety conditions with the changes in user behaviors when engaging with Google Search and YouTube during COVID-19.

Methods: This study recruited a cohort of undergraduate students (N=49) from a US college campus during January 2020 (prior to the pandemic) and measured the anxiety and depression levels of each participant. The anxiety level was assessed via the Generalized Anxiety Disorder-7 (GAD-7). The depression level was assessed via the Patient Health Questionnaire-9 (PHQ-9). This study followed up with the same cohort during May 2020 (during the pandemic), and the anxiety and depression levels were assessed again. The longitudinal Google Search and YouTube history data of all participants were anonymized and collected. From individual-level Google Search and YouTube histories, we developed 5 features that can quantify shifts in online behaviors during the pandemic. We then assessed the correlations of deteriorating depression and anxiety profiles with each of these features. We finally demonstrated the feasibility of using the proposed features to build predictive machine learning models.

Results: Of the 49 participants, 49% (n=24) of them reported an increase in the PHQ-9 depression scores; 53% (n=26) of them reported an increase in the GAD-7 anxiety scores. The results showed that a number of online behavior features were significantly correlated with deteriorations in the PHQ-9 scores (r ranging between –0.37 and 0.75, all P values less than or equal to .03) and the GAD-7 scores (r ranging between –0.47 and 0.74, all P values less than or equal to .03). Simple machine learning models were shown to be useful in predicting the change in anxiety and depression scores (mean squared error ranging between 2.37 and 4.22, R² ranging between 0.68 and 0.84) with the proposed features.

Conclusions: The results suggested that deteriorating depression and anxiety conditions have strong correlations with behavioral changes in Google Search and YouTube use during the COVID-19 pandemic. Though further studies are required, our results demonstrate the feasibility of using pervasive online data to establish noninvasive surveillance systems for mental health conditions that bypass many disadvantages of existing screening methods.

JMIR Ment Health 2020;7(11):e24012


Keywords

mental health; anxiety; depression; Google Search; YouTube; pandemic; COVID-19

Background

Worldwide, mental health problems such as depression, anxiety, and suicidal ideation have severely worsened during the COVID-19 pandemic [1-3], particularly among college students [4-7]. Yet, current methods for screening mental health issues and identifying vulnerable individuals rely on in-person interviews. Such assessments can be expensive, time-consuming, and blocked by social stigma, not to mention the reluctance induced by travel restrictions and exposure risks. It has been reported that few patients in need were correctly identified and received proper mental health treatments on time under the current health care system [8,9]. Even with emerging telehealth technologies and online surveys, the screening requires patients to actively reach out to care providers.

At the same time, because of the lockdown caused by the global pandemic outbreak, people’s engagements with online platforms underwent notable changes, particularly in search engine trends [10-12], exposures to media reports [13,14], and everyday smartphone use for COVID-19 information [5]. Reliance on the internet has significantly increased due to the overnight change in lifestyles, for example, remote working and learning, imposed by the pandemic on society. The sort of content consumed, the time and duration spent online, and the purpose of online engagements may be influenced by COVID-19. Furthermore, the digital footprints left by online interactions may reveal information about these changes in user behaviors.

Most importantly, such ubiquitous online footprints may provide useful signals of deteriorating mental health profiles (eg, depression and anxiety) of users during COVID-19. They may offer noninvasive insight into what was going on in the mind of the user, especially since Google and YouTube searches are short and succinct and can be rich indicators of the real-time cognitive state of a person. On one hand, online engagements can cause fluctuations in mental health. On the other hand, having certain mental health conditions can cause certain types of online behaviors. This opens up possibilities for potential health care frameworks that leverage pervasive computing approaches to monitor mental health conditions and deliver interventions on time. However, the findings of this study do not imply any causal relationship between specific types of online activities and one’s level of anxiety or depression at a given point in time.

Prior Work

Extensive research has been conducted on a population level, correlating mental health problems with user behaviors on social platforms [15,16], especially among young adolescents. Researchers monitored Twitter to understand mental health profiles of the general population, such as suicidal ideation [17] and depression [18]. Similar research has been done with Reddit, where anxiety [19], suicidal ideation [17], and other general disorders were studied [20,21]. Another popular public platform is Facebook, and experiments have been done studying anxiety, depression, body-shaming, and stress online [22,23]. In addition, it has been shown that college student communities rely heavily on YouTube for both academic and entertainment purposes [24,25]. Yet, abundant use may lead to compulsive YouTube engagements [26], and researchers have found that social anxiety is associated with YouTube consumption in a complex way [27].

During COVID-19, multiple studies have reported deteriorating mental health conditions in various communities [1-3,28], such as nationwide populations [29,30], the health care industry [31,32], and existing mental health patients [33]. Recently, it has been shown that greater use of social media during COVID-19 may induce increasing levels of anxiety and depression at both population and individual levels [14,34]. In addition, online behaviors during COVID-19 have been explored, especially for web searches related to the pandemic [10-12] and abnormal TV consumption during the lockdown [13]. Many of the behavioral studies also discussed the effects of online interactions on the spread, misinformation, knowledge, and protective measures of COVID-19, including the roles of YouTube [35-37] and other platforms [38]. Lyu et al [39] investigated hate speech targeting the Chinese and Asian communities on Twitter during COVID-19. A study of the 2009 Red River Flood showed the opposite effect on mental health risk factors: a communitywide crisis may reduce self-harm ideation behaviors [40].

Ubiquitous data have proven useful in detecting mental health conditions. Mobile sensor data such as GPS logs [41,42]; electrodermal activity; and sleep behavior, motion, and phone use patterns [43,44] have been applied in investigating depressive symptoms. Zaman et al [45] found that individual private Google Search histories can be used to detect low self-esteem conditions among college students. Huckins et al [5] examined the longitudinal changes in mental health and smartphone use through ecological momentary assessments during COVID-19 among college populations. Although studies exploring anxiety and depression have been conducted in the past, none of them have leveraged individual-level Google Search and YouTube activity logs to examine the effect of COVID-19 on college students.

Goal of This Study

It has been shown that online platforms preserve useful information about the mental health conditions of users, and COVID-19 is jeopardizing the mental well-being of the global community. Thus, we demonstrate the richness of online engagement logs and how they can be leveraged to uncover alarming mental health conditions during COVID-19. In this study, we aim to examine whether the changes in user behaviors during COVID-19 have a relationship with deteriorating depression and anxiety profiles. We focused on Google Search and YouTube use, and we investigated if the behavior shifts when engaging with these two platforms signify worsened mental health conditions.

The scope of the study covers undergraduate students in the United States. We envision this project as a pilot study; it may lay a foundation for mental health surveillance and help delivery frameworks based on pervasive computing and ubiquitous online data. Compared to traditional interviews and surveys, such a noninvasive system may be cheaper and more efficient, and may avoid being blocked by social stigma while notifying caregivers on time about individuals at risk.

Recruitment and Study Design

We recruited a cohort of undergraduate students, all of whom were at least 18 years of age and had held an active Google account for at least 2 years, from the University of Rochester River Campus, Rochester, NY. Participation was voluntary, and individuals had the option to opt out of the study at any time, although we did not encounter any such cases. We collected individual-level longitudinal online data (Google Search and YouTube) in the form of private history logs from the participants. For every participant, we measured the depression and anxiety levels via the clinically validated Patient Health Questionnaire-9 (PHQ-9) and Generalized Anxiety Disorder-7 (GAD-7), respectively. Basic demographic information was also recorded. There were two rounds of data collection in total: the first round during January 2020 (prior to the pandemic) and the second round during May 2020 (during the pandemic). During each round and for each participant, the anxiety and depression scores were assessed, and the change in mental health conditions was calculated at the end. The entire individual online history data up until the date of participation was also collected in both rounds from the participants. Figure 1 illustrates the recruitment timeline and the two rounds of data collection. All individuals participated in both rounds and were compensated with US $10 Amazon gift cards during each round of participation.

Given the sensitivity and proprietary nature of private Google Search and YouTube histories, we leveraged the Google Takeout web interface [46] to share the data with the research team. Prior to any data cleaning and analysis, all sensitive information such as the name, email, phone number, social security number, and credit card information was automatically removed via the Data Loss Prevention application programming interface (API) [47] of Google Cloud. For online data and survey response storage, we used a Health Insurance Portability and Accountability Act–compliant cloud-based secure storing pipeline. The whole study design, pipelines, and survey measurements involved were similar to our previous setup [45] and have been approved by the Institutional Review Board of the University of Rochester.

To address participation bias, the study was advertised among the college population via campuswide digital announcements. The text in the study advertisements and consent materials was generic, with wording such as “help uncover mental health understanding via your online activities.” There was no explicit mention of anxiety or depression in the advertisement. Participation was voluntary, and participants could opt out of the study at any time, in which case their data would not be part of the research study. The intent of the study was clearly explained at the beginning of the recruitment process via one-on-one interviews with the recruiter. No one declined to participate or withdrew in the middle of the study.

**Figure 1.** The study recruitment procedure and feature development process. All of the participants moved to remote learning on March 7, 2020, the same day a state of emergency was declared in New York State. To avoid any acute behavior during the transition to remote learning, we excluded the data from March 1 to 28. GAD-7: Generalized Anxiety Disorder-7; LIWC: Linguistic Inquiry and Word Count; PHQ-9: Patient Health Questionnaire-9; SEI: short event interval.

Online Data Processing and Feature Extractions

The Google Takeout platform enables users to share the entire private history logs associated with their Google accounts, and as long as the user was logged in to the account, all histories would be recorded regardless of which device the individual was using. Each activity in the Google Search and YouTube engagement logs was time stamped, signifying when the activity happened to the precision of seconds. Furthermore, for each Google Search, the history log contained the query text input by the user. It also recorded the URL if the user directly entered a website address into the search engine. For each YouTube video watched by the user, the history log contained the URL to the video. If the individual directly searched with one or more keywords on the YouTube platform, the history log also recorded the URL to the search results.

To capture the change in online behaviors for the participants, we first introduced a set of features that quantified certain aspects of how individuals interact with Google Search and YouTube. The set of features was calculated for each participant separately. Individual-level behavior changes were then obtained by comparing each feature between two periods: January to March 1, 2020 (a week before the state of emergency in New York State), and March 28 to May 2020 (after the outbreak, following the lockdown and mandated social distancing).

We excluded the online data generated between March 1, 2020, and March 28, 2020, to account for any acute or temporal behavior changes concentrated around the initial lockdown or due to adapting to remote work. We focused on the persistent and stabilized online behaviors throughout the time after the lockdown. Furthermore, the spring break at our institution started on March 7, and the state of emergency in New York State was issued on the same day. All students were asked to leave campus at the start of spring break and complete the rest of the semester remotely.

Concretely, we defined 5 features and cut the longitudinal data of each participant into two segments: (1) from January 1 to February 29, 2020, and (2) from March 29 to May 31. Each segment spanned 2 months. We excluded online data from March 1 to March 28 to account for the fact that not all individuals transitioned into work from home on a specific date or practiced a strict social distancing lifestyle, although all of our participants are residents of New York State. The same feature was extracted from both segments of data, and the change was calculated. Such change was referred to as the behavior shifts during the pandemic and lockdown. Figure 1 gives an illustration of data segmentations and feature development pipelines.

Online Activity Distributions

We considered, for each participant, how the Google Search and YouTube activities were distributed across the 24 hours of a day before and after the lockdown, given the previously defined dates. For each trimmed data segment, we counted the total number of activities, whether from Google Search or YouTube, that happened in each of the 24 hours. Thus, we obtained two 24-bin histograms, representing the activity distributions before (D_before) and after (D_after) the lockdown.

Figure 2 showcases the normalized distributions before and after the outbreak for two participants, each accumulated over 2 months of data. For participant one (PHQ-9 increased by 8 and GAD-7 increased by 3), a few activities started to appear at 8 AM before the outbreak. After the outbreak, these early morning activities disappeared. In addition, a considerable amount of online activity appeared during late-night hours. These patterns most likely indicated a delay in bedtime. For participant two (PHQ-9 decreased by 2 and GAD-7 decreased by 6), there were several activities during late-night hours before the lockdown. After a long absence from Google Search and YouTube, the next event usually appeared around noon. After the lockdown, the first activity of the day started to appear in the early morning, and those late-night activities disappeared. Similarly, participant two may also have had afternoon classes at around 3 PM-4 PM. Note that these two cases were chosen at random simply to illustrate that study participants reacted nonuniformly to the lockdown.

**Figure 2.** The normalized activity distributions over 24 hours before and after the outbreak of COVID-19 for two example participants.

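After that, for each of the 24 hours (h) of a user, we calculated the percentage (ie, relative) change of online activities before and after the lockdown:

$$\Delta D(h) = \frac{D_{\text{after}}(h) - D_{\text{before}}(h)}{D_{\text{before}}(h)} \quad (1)$$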

For the rest of the study, any mentioned percentage or relative changes of features were calculated in this way.
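
As an illustration, the hourly activity histograms and their relative change could be computed along the following lines. This is a minimal sketch rather than the pipeline used in the study: the function names are hypothetical, raw (unnormalized) counts are used, relative change is taken as (after − before) / before, and returning `None` for hours with no baseline activity is an assumption made here to avoid division by zero.

```python
from collections import Counter
from datetime import datetime

# Segment boundaries from the text; the end of each range is exclusive.
BEFORE = (datetime(2020, 1, 1), datetime(2020, 3, 1))   # Jan 1 to Feb 29
AFTER = (datetime(2020, 3, 29), datetime(2020, 6, 1))   # Mar 29 to May 31


def hourly_histogram(timestamps, start, end):
    """Count events in each hour of the day (0-23) for start <= t < end."""
    counts = Counter(t.hour for t in timestamps if start <= t < end)
    return [counts.get(h, 0) for h in range(24)]


def relative_change(before, after):
    """Relative change; None when the baseline is zero (an assumption)."""
    if before == 0:
        return None
    return (after - before) / before


def activity_shift(timestamps):
    """Per-hour relative change between the two segments for one participant."""
    d_before = hourly_histogram(timestamps, *BEFORE)
    d_after = hourly_histogram(timestamps, *AFTER)
    return [relative_change(b, a) for b, a in zip(d_before, d_after)]


# Made-up events for one hypothetical participant.
events = [
    datetime(2020, 1, 10, 8, 15),
    datetime(2020, 2, 3, 8, 40),
    datetime(2020, 3, 15, 8, 0),    # inside the excluded March 1-28 window
    datetime(2020, 4, 2, 8, 5),
    datetime(2020, 4, 20, 23, 30),
]
shift = activity_shift(events)
print(shift[8])    # -0.5: two 8 AM events before, one after
print(shift[23])   # None: no 11 PM baseline activity
```

How zero-baseline hours should be handled is a design choice that the text does not specify; smoothing the counts or normalizing each histogram first would be reasonable alternatives.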

Last Seen Activities

We further considered the last seen activity, whether from Google Search or YouTube, of each user in a day. Given the nature of our college student population, it is reasonable to assume that the last event before they go to bed does not necessarily happen before midnight. Strictly speaking, our goal was to capture the last event before they went to bed. Therefore, we set a threshold at late night or early morning and considered the last online activity before it. Since a discrete threshold was used, we tried several cutoff hours to perform sensitivity analyses. For our study population, we observed that the hourly volume of Google Search and YouTube activities started to decrease after midnight and reached its minimum at 5 AM. This pattern was periodic and persistent across our longitudinal data. Motivated by this observation, we tried cutoff hours of midnight, 1 AM, 2 AM, 3 AM, 4 AM, and 5 AM, and identified the last event before each of these thresholds for each participant. Unlike the aforementioned online activity distributions, which measure the hourly volume of activities on Google Search and YouTube, the last seen events focus solely on participants staying up late. An example illustration of the last seen activities is provided in Figure 3.

With each threshold, we obtained two distributions of the last seen event time stamps from each participant, one before and one after the lockdown. On a continuous scale, we then took the median of each distribution and computed the difference between the two. For example, a difference of 1.5 hours means that the median time of last seen events shifted 1.5 hours later after the lockdown. A difference of –0.3 hours means the median time of last seen events shifted 0.3 hours earlier after the lockdown. All time differences are expressed in hours. There is no need to distinguish between Google Search and YouTube for this feature, as we are merely looking for the last event, which could be either.
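
One possible reading of this procedure is sketched below. It is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, a "day" is assumed to run from one cutoff hour to the next, and times after midnight are expressed as hours beyond 24 so that the medians sit on a continuous scale.

```python
from datetime import datetime, timedelta
from statistics import median


def last_seen_hours(timestamps, cutoff_hour):
    """Time of the last event before the cutoff, one value per day.

    A day is assumed to run from cutoff_hour to cutoff_hour on the next
    calendar date. Values are hours since midnight at the start of that
    day, so an event at 1:30 AM the following morning is reported as 25.5.
    """
    last = {}
    for t in timestamps:
        day = (t - timedelta(hours=cutoff_hour)).date()
        if day not in last or t > last[day]:
            last[day] = t
    hours = []
    for day, t in last.items():
        midnight = datetime.combine(day, datetime.min.time())
        hours.append((t - midnight).total_seconds() / 3600)
    return hours


def last_seen_shift(before, after, cutoff_hour):
    """Median last seen time after the lockdown minus the median before."""
    return (median(last_seen_hours(after, cutoff_hour))
            - median(last_seen_hours(before, cutoff_hour)))


# Made-up events for one hypothetical participant, with a 3 AM cutoff.
before = [
    datetime(2020, 1, 10, 23, 0),
    datetime(2020, 1, 11, 9, 0),
    datetime(2020, 1, 11, 22, 0),
]
after = [
    datetime(2020, 4, 2, 0, 30),    # counts toward the night of April 1
    datetime(2020, 4, 2, 23, 30),
    datetime(2020, 4, 3, 1, 30),    # counts toward the night of April 2
]
print(last_seen_shift(before, after, cutoff_hour=3))   # 2.5 (hours later)
```

Repeating the call with `cutoff_hour` set to each value from 0 to 5 would give the sensitivity analysis across the six thresholds.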

**Figure 3.** An example to demonstrate how the last seen activities are selected for different threshold hours.

Short Event Intervals

We defined a short event interval (SEI) as a period of time between two adjacent events that is shorter than a certain threshold (eg, 5 minutes). It usually occurs when one is consuming several related YouTube videos or is searching for similar content. Taking into consideration that YouTube and Google Search may have different thresholds to define a user session, we adapted the method of Halfaker et al [Halfaker A, Keyes O, Kluver D, Thebault-Spieker J, Nguyen T, Shores K, et al. User session identification based on strong regularities in inter-activity time. In: Proceedings of the 24th International Conference on World Wide Web. 2015. Presented at: WWW '15; May 2015:410-418; Florence, Italy. 48] to identify proper thresholds for consecutive activities on each of the platforms. After obtaining the session thresholds through mixture models, we counted the total numbers of such SEIs for each participant before (SEI_before) and after (SEI_after) the outbreak. We calculated the relative change of SEI the same way as in Equation 1 and used it as a behavioral feature.
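As a rough illustration of the SEI feature, the sketch below counts adjacent-event gaps shorter than a threshold and then computes a relative change. Two assumptions are made here: a fixed 5-minute threshold stands in for the thresholds that the study fitted with mixture models, and the relative change is taken to be (after − before) / before, which is our reading of Equation 1 (not shown in this section). The function names and event times are hypothetical.

```python
def count_short_intervals(event_times_sec, threshold_sec=300):
    """Count gaps between adjacent events that are shorter than the threshold.

    event_times_sec: event times in seconds (any common origin).
    threshold_sec: 300 s (5 minutes) is only the example value from the text.
    """
    ordered = sorted(event_times_sec)
    return sum(1 for a, b in zip(ordered, ordered[1:]) if b - a < threshold_sec)

def relative_change(before, after):
    """Assumed form of Equation 1: (after - before) / before.

    Returns None when the baseline is zero, since the ratio is undefined.
    """
    if before == 0:
        return None
    return (after - before) / before

# Hypothetical event times for one participant on one platform.
events_before = [0, 60, 120, 1000, 1100]
events_after = [0, 30, 60, 90, 120, 150, 5000]

sei_before = count_short_intervals(events_before)
sei_after = count_short_intervals(events_after)
print(sei_before, sei_after, relative_change(sei_before, sei_after))
```

Because Google Search and YouTube have their own session thresholds, the same function would be called once per platform with the corresponding `threshold_sec`.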

Linguistic Inquiry and Word Count Attributes

The Linguistic Inquiry and Word Count (LIWC) is a toolkit used to analyze various emotions, cognitive processes, social concerns, and psychological dimensions in a given text by counting the numbers of specific words [Pennebaker JW, Boyd RL, Jordan K, Blackburn K. The development and psychometric properties of LIWC2015. The University of Texas at Austin. 2015. URL: https://repositories.lib.utexas.edu/handle/2152/31333 [accessed 2020-08-02] 49]. It has been widely applied in research involving social media and mental health. For the complete list of linguistic and psychological dimensions that LIWC measures, see [49]. We segmented the data log for each participant at the previously mentioned dates into two blobs of text and analyzed the words using LIWC.

Since the contexts and linguistic properties of Google Search and YouTube may be distinct, we extracted the LIWC features from them separately. For Google Search, we inputted the raw query text; for YouTube, we inputted the video title and the YouTube query text, if any. There were in total 51 different LIWC attributes. LIWC outputted the count of words falling in each dimension among the whole text. We quantified the shift in behavior by calculating the percentage change of words in each dimension after the outbreak.

Google Search and YouTube Categories

We labeled each Google Search query with a category using the Google natural language processing (NLP) API [Classifying content. Google Cloud. URL: https://cloud.google.com/natural-language/docs/classifying-text [accessed 2020-08-07] 50]. We used the official YouTube API to retrieve the information of videos watched by the participants, including the title, duration, number of likes and dislikes, and default YouTube category tags. For a comprehensive list of Google NLP category labels and default YouTube category tags, please refer to [Content categories. Google Cloud. URL: https://cloud.google.com/natural-language/docs/categories [accessed 2020-08-07] 51, VideoCategories. Google Developers. URL: https://developers.google.com/youtube/v3/docs/videoCategories [accessed 2020-08-07] 52]. Several categories, such as health and finance, overlapped with the LIWC dimensions; because we regarded the LIWC dimensions as the better-studied standard, we did not analyze those categories separately. Instead, we focused on the number of activities belonging to the adult (specifically originating from Google Search logs) and news categories, which were not represented in the LIWC.

Concretely, activities such as visiting a porn site (identified via the URL) and searching explicitly for information related to porn and mature content were labeled as adult. No ambiguous nonpornographic material was categorized as adult. We used the Google Cloud Content Classification API for labeling the search queries and used the Webshrinker [Domain APIs. Webshrinker. URL: https://www.webshrinker.com/apis/ [accessed 2020-10-16] 53] API to categorize the domain of every URL an individual visited. We calculated the relative changes of activities in these two categories as the behavior shifts for each participant (the same as Equation 1).

We now present a qualitative example of behavior changes in Google Search and YouTube categories. Textbox 1 showcases, for a single example participant, the top five Google Search and YouTube categories before and after the lockdown, defined by the percentages out of the total activity volume. For Google Search, we observed the disappearance of food and drinks (including searching for restaurants) and shopping from the list. In contrast, the numbers of searches related to the beauty and fitness, home (including kitchen and cooking subcategories), and health topics increased during the quarantine. The reference category was largely composed of academic content such as dictionaries, humanity and history references, and scientific proceedings. For YouTube, videos belonging to the education category increased during the remote learning period after the lockdown, as did film and animation. The travel and events, and sports topics vanished from the list. Note that this was merely a single example, and the traits reflected here may be personal, uncorrelated to anxiety or depression, or prevalent among everyone.

Textbox 1. The top five Google Search and YouTube categories for an example participant before and after the lockdown.

Top five Google Search categories

Before lockdown
- Art and entertainment
- Reference
- Food and drinks
- Shopping
- Beauty and fitness
- Finance
After lockdown
- Art and entertainment
- Reference
- Beauty and fitness
- Home
- Health
- Food and drinks

Top five YouTube video categories

Before lockdown
- Music
- Travel and events
- Sports
- News and politics
- Education
- Film and animation
After lockdown
- Music
- Education
- Film and animation
- News and politics
- Pets and animals
- Comedy

Measurement Outcomes

Measurements for Changes in Online Behaviors

There were in total 5 scalar continuous dependent variables measuring various aspects of the changes in online behavior for each participant, as previously defined. These variables were extracted from two segments of the online data logs, namely, the data before and after the pandemic outbreak. All of the measurements were in percentage changes. For the online activity distributions, there were 24 measurements for each hour of a day. For the last seen events, there were several thresholds for sensitivity analyses. For the SEIs, Google Search and YouTube activities were considered separately with their own fitted session intervals.

Measurements for Mental Health Conditions

For both rounds of the data collection, anxiety levels were assessed using the GAD-7 survey, and depression levels were assessed using the PHQ-9 survey. With two rounds of surveys reported before and after the outbreak, the change in mental health conditions of each participant was obtained. According to Spitzer et al [Spitzer RL, Kroenke K, Williams JBW, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. May 22, 2006;166(10):1092-1097. 54] and Rutter and Brown [Rutter LA, Brown TA. Psychometric properties of the Generalized Anxiety Disorder Scale-7 (GAD-7) in outpatients with anxiety and mood disorders. J Psychopathol Behav Assess. Mar 2017;39(1):140-146. 55], an increase greater than or equal to 5 in the GAD-7 score may be clinically alarming. Similarly, as stated by Kroenke [Kroenke K. Enhancing the clinical utility of depression screening. CMAJ. Feb 21, 2012;184(3):281-282. 56], an increase greater than or equal to 5 in the PHQ-9 score may indicate the need for medical interventions.
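The change score and the 5-point threshold can be expressed in a few lines. This is only an illustrative sketch: the function names and the example scores are hypothetical, and the score ranges (0 to 27 for the PHQ-9 and 0 to 21 for the GAD-7) are the standard ranges of the two instruments.

```python
# Maximum total scores of the two questionnaires.
MAX_SCORE = {"PHQ-9": 27, "GAD-7": 21}

def score_change(baseline, follow_up, scale):
    """Follow-up score minus baseline score, after a basic range check."""
    for score in (baseline, follow_up):
        if not 0 <= score <= MAX_SCORE[scale]:
            raise ValueError(f"{scale} score out of range: {score}")
    return follow_up - baseline

def is_alarming(baseline, follow_up, scale, min_increase=5):
    """True when the score rose by at least `min_increase` points (5 in the text)."""
    return score_change(baseline, follow_up, scale) >= min_increase

# Hypothetical participant: PHQ-9 rises from 6 to 12, GAD-7 from 8 to 10.
print(is_alarming(6, 12, "PHQ-9"))
print(is_alarming(8, 10, "GAD-7"))
```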

Demographics

In addition to the online data and mental health surveys, we also collected basic demographic information such as school year, gender, and nationality.

Statistical Analysis

Before any analysis of mental health conditions, two-tailed paired t tests were performed to eliminate the possibility of annual confounding factors interfering with the shifts in online behaviors. We inspected, in terms of the five quantitative features, whether the online behavior changes happened every year (eg, because of seasonal factors) or only during COVID-19 for the whole study population. As previously mentioned, we collected the entire Google history log back to the registration date of the Google accounts of all participants.

We now use the example of SEIs to illustrate the idea. For each participant, we obtained 4 SEI counts from 4 periods of time: January 1, 2019, to February 28, 2019; March 29, 2019, to May 31, 2019; January 1, 2020, to February 29, 2020; and March 29, 2020, to May 31, 2020. These counts are represented as 4 points on a Cartesian coordinate plane, where the y-axis represents the counts and the x-axis represents the time. We then calculated the slope (S) of the line connecting the two points from the same year. With this process, we obtained two measurements for each participant, namely, S₂₀₁₉ and S₂₀₂₀. Viewing all the participants as a cohort, we computed S₂₀₁₉ and S₂₀₂₀ for all features and performed multiple paired t tests. This enabled us to estimate the seasonal confounding factors. We could not perform the paired t tests on the changes of behavioral features directly because doing so may only validate a change in the intercept (baseline) amount of activity while ignoring the slope for each feature.
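The slope comparison can be sketched as below. This is a simplified illustration under stated assumptions: the two periods within a year are placed one unit apart on the time axis (so the slope reduces to the difference in counts), the counts are invented, and only the paired t statistic and its degrees of freedom are computed. Obtaining the two-tailed P value requires the t distribution, for example through a statistics library.

```python
from math import sqrt
from statistics import mean, stdev

def slope(count_early, count_late, dx=1.0):
    """Slope of the line joining the two same-year points.

    dx is the distance between the two periods on the time axis;
    a unit step is an assumption made for this sketch.
    """
    return (count_late - count_early) / dx

def paired_t(xs, ys):
    """Paired t statistic for ys versus xs, with its degrees of freedom."""
    diffs = [y - x for x, y in zip(xs, ys)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1

# Hypothetical (early period, late period) SEI counts for 4 participants.
counts_2019 = [(100, 110), (80, 85), (120, 118), (90, 95)]
counts_2020 = [(105, 160), (82, 130), (115, 170), (95, 150)]

s_2019 = [slope(a, b) for a, b in counts_2019]
s_2020 = [slope(a, b) for a, b in counts_2020]

t_stat, dof = paired_t(s_2019, s_2020)
print(s_2019, s_2020, t_stat, dof)
```

A large absolute t statistic indicates that the 2020 slopes differ from the 2019 slopes across the cohort, that is, the change in 2020 is not simply repeating the previous year's seasonal pattern.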

In the main experiments, with each of the aforementioned 5 features and various thresholds, we investigated the correlation of online behavioral changes with deteriorations in the GAD-7 and PHQ-9 scores, which did not require arbitrary discretization decisions. The dependent variables were the 5 behavior changes extracted from the longitudinal individual online data. Experiments were carried out in a one-on-one fashion: anxiety or depression condition was the single independent variable, and one of the five online behavior changes was the single dependent variable each time. Both of them were continuous variables.

Study Population Statistics

We recruited 49 participants in total, and all of them participated in both rounds of the study (100% response rate). On average, each participant made 2357 (95% CI 2106.28 to 2433.45) Google Searches and 2901 (95% CI 2556.92 to 3248.67) YouTube interactions from January 1 to February 29, 2020, and 2497 (95% CI 2069.45 to 2901.34) Google Searches and 3105 (95% CI 2702.48 to 3487.56) YouTube interactions from March 29 to May 31, 2020. Of the 49 participants, 49% (n=24) reported an increase in the PHQ-9 score, and 53% (n=26) reported an increase in the GAD-7 score. An increase of ≥5 in the PHQ-9 score was reported by 41% (n=20) of participants, and 45% (n=22) reported an increase of ≥5 in the GAD-7 score.

Figure 4 shows the baseline (collected on January 1, 2020) and follow-up postlockdown (collected on May 31, 2020) distributions of depression and anxiety scores in our sample student population. The PHQ-9 scores are shown on the left, ranging from 0 to 27. The GAD-7 scores are shown on the right, ranging from 0 to 21. Each dot represents a participant. The x-axis represents the baseline score in January, and the y-axis represents the follow-up score during the lockdown in May. Figure 5 shows the distributions of the change in PHQ-9 depression and GAD-7 anxiety scores before and after the lockdown. Again, the PHQ-9 scores are shown on the left, and the GAD-7 scores are shown on the right. The changes were calculated as the follow-up scores in May minus the baseline scores in January. Putting the pandemic into context, the deterioration in anxiety or depression levels may have been triggered by the fear of getting infected, loss of jobs, the death of family members or friends, and many other negative impacts from COVID-19. Particularly for college students, other major reasons may be the pressure of online learning, loss of financial aid, and living alone. In contrast, students who quarantined safely with their families may not have shown signals of deteriorating anxiety or depression, compared to the high stress levels during normal school days.

**Figure 4.** The distributions of PHQ-9 depression and GAD-7 anxiety scores before and after the lockdown. The PHQ-9 scores are shown on the left, and the GAD-7 scores are shown on the right. GAD-7: General Anxiety Disorder-7; PHQ-9: Patient Health Questionnaire-9.

**Figure 5.** The distributions of the changes in PHQ-9 depression and GAD-7 anxiety scores before and after the lockdown. The PHQ-9 scores are shown on the left, and the GAD-7 scores are shown on the right. GAD-7: General Anxiety Disorder-7; PHQ-9: Patient Health Questionnaire-9.

Of the 49 participants, 61% (n=30) were female, 35% (n=17) were male, and 4% (n=2) reported nonbinary genders. First-, second-, third-, and fourth-year students made up 22% (n=11), 41% (n=20), 31% (n=15), and 6% (n=3) of the whole cohort, respectively. A total of 80% (n=39) of the participants were US citizens, and the rest (n=10) were international students. A complete breakdown of demographics with respect to the deteriorating anxiety and depressive disorders is given in Table 1.

Table 1. Demographics of the study population.

Demographic	Increased PHQ-9^a (n=24), n (%)	Increased GAD-7^b (n=26), n (%)
Female	18 (75)	22 (85)
US citizen	18 (75)	20 (77)
First-year students	5 (21)	4 (15)
Second-year students	11 (46)	12 (46)
Third-year students	7 (29)	8 (31)
Fourth-year students	1 (4)	2 (8)

^aPHQ-9: Patient Health Questionnaire-9.

^bGAD-7: General Anxiety Disorder-7.

Evaluation Outcomes

The two-tailed paired t tests mentioned at the beginning of the Statistical Analysis section were designed to rule out seasonal factors in online behavior changes and focus on COVID-19 before any of the main experiments. All features had P values less than .003. Hence, the contribution of annual or seasonal factors to the online behavior changes was negligible, and it was safe to carry out the following main experiment. This is consistent with one of the main conclusions of Huckins et al [Huckins J, daSilva AW, Wang W, Hedlund E, Rogers C, Nepal SK, et al. Mental health and behavior of college students during the early phases of the COVID-19 pandemic: longitudinal smartphone and Ecological Momentary Assessment study. J Med Internet Res. Jun 17, 2020;22(6):e20185. 5] that, when comparing the longitudinal data between different years, behaviors during COVID-19 shifted drastically.

We calculated the Pearson product-moment correlations, r, of online behavior shifts with deteriorations in anxiety and depression levels. We reported the correlation coefficients with P values and 95% CIs obtained for each of the aforementioned features.
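For readers who want to reproduce this kind of analysis, the following sketch computes a Pearson correlation and an approximate 95% CI. The CI uses the Fisher z transformation, which is an assumption on our part because the text does not state how the intervals were obtained. The function names and the data are hypothetical, and the P value is omitted because it requires the t distribution.

```python
from math import atanh, sqrt, tanh

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z transformation (needs n > 3, |r| < 1)."""
    z = atanh(r)
    se = 1 / sqrt(n - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

# Hypothetical data: change in a questionnaire score and relative change
# of one behavioral feature for 6 participants (not study data).
score_changes = [-2, 0, 3, 5, 7, 1]
feature_changes = [-0.1, 0.0, 0.2, 0.5, 0.4, 0.1]

r = pearson_r(score_changes, feature_changes)
low, high = fisher_ci(r, len(score_changes))
print(r, low, high)
```

In the one-on-one design described above, this computation would be repeated for each pairing of a mental health change score with a single behavioral feature.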