Uncertainty surrounds the ethical and legal implications of algorithmic and data-driven technologies in the mental health context, including technologies characterized as artificial intelligence, machine learning, deep learning, and other forms of automation.
This study aims to survey empirical scholarly literature on the application of algorithmic and data-driven technologies in mental health initiatives to identify the legal and ethical issues that have been raised.
We searched for peer-reviewed empirical studies on the application of algorithmic technologies in mental health care in the Scopus, Embase, and Association for Computing Machinery databases. A total of 1078 relevant peer-reviewed applied studies were identified, which were narrowed to 132 empirical research papers for review based on selection criteria. Conventional content analysis was undertaken to address our aims, and this was supplemented by a keyword-in-context analysis.
We grouped the findings into the following five categories of technology: social media (53/132, 40.2%), smartphones (37/132, 28%), sensing technology (20/132, 15.2%), chatbots (5/132, 3.8%), and miscellaneous (17/132, 12.9%). Most initiatives were directed toward detection and diagnosis. Most papers discussed privacy, mainly in terms of respecting the privacy of research participants; there was relatively little discussion of privacy in terms of the real-world use of these technologies. A small number of studies discussed ethics directly (10/132, 7.6%) or indirectly (10/132, 7.6%). Legal issues were not substantively discussed in any study, although some were raised in passing (7/132, 5.3%), such as the rights of user subjects and compliance with privacy law.
Ethical and legal issues tend not to be explicitly addressed in empirical studies of algorithmic and data-driven technologies in mental health initiatives. Scholars may have considered ethical or legal matters at the ethics committee or institutional review board stage; if so, this consideration seldom appears in any detail in the published reports of applied research. The very form of peer-reviewed papers reporting applied research in this field may preclude a substantial focus on ethics and law. Regardless, we identified several concerns, including the near-complete lack of involvement of mental health service users, the scant consideration of algorithmic accountability, and the potential for overmedicalization and techno-solutionism. Most papers were published in the computer science field at the pilot or exploratory stage. Thus, these technologies could be appropriated into practice in rarely acknowledged ways, with serious legal and ethical implications.
Data-driven technologies for mental health have expanded in recent years [
Throughout this paper, we use the term algorithmic and data-driven technologies to describe various technologies that rely on complex information processing to analyze large amounts of personal data and other information deemed useful to making decisions [
In the mental health context, algorithmic and data-driven technologies are generally used to make inferences, predictions, recommendations, or decisions about individuals and populations. Predictive analysis is largely aimed at assessing a person's health condition. Data collection may occur in a range of settings, including services concerned with mental health, suicide prevention, or addiction support, but it may also occur beyond these typical domains. For example, web-based platforms can draw on users' posts or purchasing habits to flag their potential risk of suicide [
Some prominent mental health professionals have argued that digital technologies, including algorithmic and data-driven technologies, hold the potential to bridge the “global mental health treatment gap” [
This study set out to identify to what extent and on what matters legal and ethical issues were considered in the empirical research literature on algorithmic and data-driven technologies in mental health care.
Ethics refers to guiding principles, whereas laws, which may be based on ethical or moral principles, are enforceable rules and regulations that carry penalties for those who violate them. Scholarship on the ethical and legal dimensions of algorithmic and data-driven technologies in mental health care is relatively scant but growing [
According to Lederman et al [
The broader ethical and legal dimensions of algorithmic technologies have been the subject of a much larger scholarship [
This study adapted a scoping review methodology to undertake a broad exploration of the literature. Scoping reviews are particularly useful for surveying a potentially large and interdisciplinary field that has not yet been comprehensively reviewed and for which clarification of concepts is required [
We adapted the Arksey and O’Malley framework for scoping reviews [
Each step is described below.
We drew on elements of the Joanna Briggs Institute scoping review methodology [
We sought to identify all studies within a selective sampling frame [
In what ways are algorithmic and data-driven technologies being used in the mental health context?
How and to what extent are issues of law and ethics being addressed in these studies?
These questions were deliberately broad to generate breadth of coverage [
A rapid or streamlined literature search was conducted. We started with a search string that emerged from our initial literature review (noted in the
The following search strings emerged through an iterative process (
(TITLE-ABS-KEY ('mental (health OR ill* OR disability OR impair*)' OR 'psychiatr*' OR 'psycholog*' OR 'behavioral health') AND TITLE-ABS-KEY ('algorithm*' OR 'artificial intelligence' OR 'machine learning') AND TITLE-ABS-KEY ('internet' OR 'social media' OR 'chatbot' OR 'smartphone' OR 'tracking'))
('mental (health OR ill* OR disability OR impair*)' or 'psychiatr*' or 'behavioral health').mp. [mp=title, abstract, heading word, drug trade name, original title, device manufacturer, drug manufacturer, device trade name, keyword, floating subheading word, candidate term word]
“mental illness”.mp. or mental disease/
algorithm/ or machine learning/ or artificial intelligence/
('algorithm*' or 'artificial intelligence' or 'machine learning').mp. [mp=title, abstract, heading word, drug trade name, original title, device manufacturer, drug manufacturer, device trade name, keyword, floating subheading word, candidate term word]
Internet/ or “web-based”.mp.
('internet' or 'social media' or 'chatbot' or 'smartphone' or 'tracking').mp. [mp=title, abstract, heading word, drug trade name, original title, device manufacturer, drug manufacturer, device trade name, keyword, floating subheading word, candidate term word]
The above search strings were applied in various combinations.
('mental health' OR 'mental ill*' OR 'psychiatr*' OR 'behavio* health') AND ('algorithm*' OR 'artificial intelligence' OR 'machine learning') AND ('internet' OR 'social media' OR 'chatbot' OR 'smartphone' OR 'tracking')
No date limit was applied, although the search was conducted iteratively between August 2019 and February 2020. An English-language filter was applied for pragmatic reasons, to reduce the search scope and complexity (for more on limitations, including terms we appear to have overlooked, see the
After an extensive search, 1078 relevant peer-reviewed research studies were identified in the study selection stage. From these, we excluded duplicates, papers not available in English, and papers for which the full text was unavailable.
The process of identifying relevant studies among the 1078 papers was iterative, involving several discussions between coauthors. Unlike systematic reviews, where inclusion and exclusion criteria for studies are established at the outset, this study developed these criteria during the search process (
Study undertaken in a mental health context or with application to a mental health context
Text available in English
Study related broadly to the use of big data, internet technology, artificial intelligence, sensors, smart technology, and other contemporary algorithmic technologies
Commentary pieces
Studies focused on other health conditions
Application of data science methods to clinical data collected via clinical technologies (eg, application of data science methods to magnetic resonance imaging data)
Data science methods paper with no specific real-world application or objective
Application of data science methods to psychiatric research in general
Studies applied to animals or animal models
Owing to the large number of studies identified at step 2, we did not undertake a full-text review. Instead, we reviewed only the abstract and title according to our inclusion criteria. According to the PRISMA criteria described by Moher et al [
Study selection for review.
This adaptation enabled us to review a large body of work in a rapidly expanding field. Our broad inclusion approach was also chosen to avoid excluding studies from disciplines whose research designs do not conform to those traditionally expected in reviews with stricter inclusion and exclusion criteria (eg, reviews that treat an insufficient study design description as an exclusion criterion). For example, we found that many computer science papers were published in conference proceedings [
This process resulted in 132 empirical research papers included in the review.
Through initial deductive analysis of the abstracts and discussions between the researchers, we identified several key issues and themes through which to consider the broad research field. We settled on a typology that considered both the form of technology used in the study (eg, social media, sensors, or smartphones) and the stated purpose for the mental health initiative (eg, detection and diagnosis, prognosis, treatment, and support).
The second step involved analyzing the data to determine how legal and ethical issues were discussed. The material was analyzed using the computer software package NVivo 12 (QSR International) [
We sought a uniform approach to the 132 studies included in this review. However, in practice, it was often impossible to extract all the information required where research reports used varying terminology and concepts and potentially failed to include relevant material.
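For readers unfamiliar with the technique, a keyword-in-context pass simply surfaces every occurrence of a search term together with its surrounding words so that coders can judge how the term is used. Our analysis was performed in NVivo 12; the following Python fragment is only a hedged sketch of the general mechanics, and the keyword stems, context window, and corpus structure are illustrative assumptions rather than our actual coding scheme.

```python
import re

def kwic(text: str, pattern: str, window: int = 8) -> list:
    """Return each match of `pattern` with `window` words of
    context on either side (a keyword-in-context concordance)."""
    words = text.split()
    snippets = []
    for i, word in enumerate(words):
        if re.match(pattern, word, flags=re.IGNORECASE):
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            snippets.append(f"...{left} [{word}] {right}...")
    return snippets

# Illustrative keyword stems only; the review's coding frame for
# law and ethics themes lived in NVivo, not in this script.
KEYWORDS = [r"ethic\w*", r"legal\w*", r"privacy", r"consent"]

corpus = {"paper_001": "All participants provided informed consent before ..."}
for paper_id, full_text in corpus.items():
    for stem in KEYWORDS:
        for snippet in kwic(full_text, stem):
            print(paper_id, snippet)
```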
Several typologies can be used to categorize the algorithmic and data-driven technologies identified in these studies. As noted, we integrate two here: (1) the primary form of technology used in the study and (2) the stated purpose of the mental health initiative.
We derived five major categories of technology (
Social media:
Detection and diagnosis (26/132, 19.7%) [
Prognosis, treatment, and support (4/132, 3%) [
Public health (22/132, 16.7%) [
Research and clinical administration (1/132, 0.8%) [
Smartphones:
Detection and diagnosis (17/132, 12.9%) [
Prognosis, treatment, and support (20/132, 15.2%) [
Public health (0/132, 0%)
Research and clinical administration (0/132, 0%)
Sensing technology:
Detection and diagnosis (6/132, 4.5%) [
Prognosis, treatment, and support (12/132, 9.1%) [
Public health (2/132, 1.5%) [
Research and clinical administration (0/132, 0%)
Chatbots:
Detection and diagnosis (0/132, 0%)
Prognosis, treatment, and support (5/132, 3.8%) [
Public health (0/132, 0%)
Research and clinical administration (0/132, 0%)
Miscellaneous:
Detection and diagnosis (8/132, 6.1%) [
Prognosis, treatment, and support (8/132, 6.1%) [
Public health (1/132, 0.8%) [
Research and clinical administration (0/132, 0%)
Neat distinctions were not always possible. For example, Nambisan et al [
The four categories by Shatte et al [
We determined that 37.1% (49/132) of the studies broadly concerned technology aimed primarily at prognosis, treatment, and support, which includes initiatives for personalized or tailored treatment and technologies used in services where treatment is provided. An example is the use of a smartphone app to provide personalized education based on psychometric data generated by the app.
A total of 18.9% (25/132) of studies concerned public health. These papers used large epidemiological or public data sets (eg, social media data and usage data from Wi-Fi infrastructure) to monitor or respond to persons who appear to be experiencing, or self-disclosing an experience of, distress, mental health crisis, or treatment. However, this category was difficult to apply, as many studies were borderline cases that could also fit the detection and diagnosis category. This ambiguity may be because many studies were based in the field of computer science and were contemplated at a higher level of generality, with limited discussion of the specific setting in which they might be used (eg, a social media analytical tool could be used in population-wide prevalence studies or to identify and direct support to specific users of a particular web-based platform).
Our search uncovered only 1 study related to research and clinical administration; this particular study focused on the triage of patients in health care settings.
Finally, it is noteworthy that despite the reasonably large volume of studies, all but a few were at an exploratory and piloting stage. This is not surprising given the predominance in our survey of scholarship from computer science journals in databases such as ACM. A key issue in this area of inquiry is the large context gap between the design of these technological innovations and the context of implementation. In many papers from the computer science discipline, the authors made assumptions or guesses as to how their innovations could be implemented, with seemingly little input from end users. This is not a critique of individual researchers; instead, as we shall discuss later, it reflects the need for interdisciplinary and consultative forms of research at the early stages of ideation and piloting. This matter also raises questions as to whether there is a strong enough signal or feedback loop from practice settings back to designers and computer scientists in terms of what
We found 53 studies concerning social media, in which data were collected through social media platforms. Two major platform types were identified: mass social media, including mainstream platforms such as Facebook, Twitter, and Reddit; and specialized social media, comprising platforms focused on documenting health or mental health experiences. Studies of both forms of social media collected textual data shared by users, self-reported mental health conditions, and sometimes expert opinion on diagnoses attributed to users (identified through information shared on the web). For example, researchers may examine whether the content of shared posts correlates with, and can therefore help predict, people's self-reported mental health diagnoses. Most studies concerned mass social media platforms (Twitter: 17/53, 32%; Reddit: 14/53, 26%; Facebook: 6/53, 11%), with a small number concerning specialized social media (
The largest subcategory in the social media group (26/53, 49%) focused on predicting or detecting depression, with a smaller number concerning other diagnostic categories. Some studies attempted to capture multiple diagnostic categories or aimed to detect broad signs of mental ill health.
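To make the prediction task concrete, the sketch below pairs post text with self-reported diagnosis labels and fits a standard text classifier. This is a minimal illustration under our own assumptions (toy posts, TF-IDF features, logistic regression), not a reproduction of any reviewed study's model, which typically drew on far larger corpora and richer features.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: posts paired with self-reported diagnosis labels
# (1 = self-reported depression, 0 = control). The reviewed studies
# used thousands of posts scraped from Twitter, Reddit, or Facebook.
posts = [
    "i can't sleep and nothing feels worth doing anymore",
    "haven't replied to anyone in weeks, too exhausted",
    "great run this morning, training for the 10k in june",
    "made pasta from scratch tonight and it actually worked",
]
labels = [1, 1, 0, 0]

# Bag-of-words features plus a linear classifier: the simplest
# version of the text-to-diagnosis correlation these papers test.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(posts, labels)

# Predicted probability that an unseen post resembles the
# self-reported depression class.
print(model.predict_proba(["no energy to get out of bed today"])[0][1])
```

The point of the illustration is the pipeline shape: user-generated text in, a probability of class membership out, which is what makes the privacy and ethics questions discussed below so pressing.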
In total, 38 studies concerned mobile apps used to collect and process data from participants, of which two main subcategories emerged. The first included apps that required active data input by participants (27/38, 71%), which either took the form of validated surveys (eg, Patient Health Questionnaire-9) or an experience sampling method; the second included those that passively collected data from inbuilt smartphone sensors (15/38, 39%). Some papers were counted twice as they had methods that covered both subcategories. Contemporary smartphones include a range of sensors related to sleep patterns, activity (movement), location data (GPS, Wi-Fi, and Bluetooth), communication or in-person human interaction (microphones), web-based activity (phone or text logs and app usage), and psychomotor data (typing and screen taps).
Apps that draw on these data sources can be considered passive sensing because the individual generally does not have to input data actively. Data collection generally requires participants to install an app that collects data from smartphone sensors and sends it to the researchers.
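As a hedged illustration of how passively sensed streams become behavioral features, the sketch below derives a daily "mobility radius" from hypothetical GPS fixes. The record schema and the feature itself are assumptions chosen for demonstration, not the method of any particular study.

```python
import math
from collections import defaultdict

# Hypothetical passively sensed GPS fixes: (day, latitude, longitude).
# In the reviewed studies, such records were uploaded by a research app.
gps_log = [
    ("2020-01-01", 37.7749, -122.4194),
    ("2020-01-01", 37.7790, -122.4312),
    ("2020-01-02", 37.7749, -122.4194),
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Daily mobility radius: farthest distance from the day's first fix,
# one of many behavioral features such studies derive.
by_day = defaultdict(list)
for day, lat, lon in gps_log:
    by_day[day].append((lat, lon))

for day, fixes in by_day.items():
    lat0, lon0 = fixes[0]
    radius = max(haversine_km(lat0, lon0, lat, lon) for lat, lon in fixes)
    print(day, round(radius, 2), "km")
```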
In total, 20 studies focused on broader sensor technology designed to continuously collect data on a person’s activity or environment. We differentiated this category from smartphone passive sensing, although there is a clear crossover with some personal wearables that fall under the sensing technology category. As we use it here,
The list of sensing technologies includes personal wearables (9/20, 45%), smart-home sensors, automated home devices, internet of things (3/20, 15%), Microsoft Kinect (a software developer kit that includes computer vision, speech models, and algorithmic sensors; 1/20, 5%), skin conductance technology (1/20, 5%), portable electroencephalogram (1/20, 5%), radio-frequency identification tags (2/20, 10%), the use of Wi-Fi metadata (2/20, 10%), and data collected via care robots (eg, Paro Robot; 1/20, 5%).
Some studies have examined sensor systems for use in psychiatric settings. For example, Cheng et al [
The fourth group comprised 5 studies exploring the use of chatbots and conversational agents in web-based mental health contexts. This group includes studies of chatbots used both by people experiencing mental health conditions or psychosocial disabilities and by those who provide them with care or support. For example, D'Alfonso et al [
This final group (17/132, 12.9%) included a range of studies that did not fit the previous categories, such as the collection of data from video games and from data sources for which there was no explicit outline of how the data would be collected in practice (eg, facial expression data). We included the video game data in this miscellaneous category.
As noted, we conducted a thematic analysis supplemented by a keyword-in-context analysis to identify themes related to law and ethics, as discussed in the Methods section.
Using smartphones to collect large amounts of data on personal behavioral aspects leads to possible issues on privacy, security, storage of data, safety, legal and cultural differences between nations that all should be considered, addressed and reported accordingly.
A passing reference was made to the rights of user subjects in some studies (eg, Manikonda and De Choudhury [
In terms of explicit reference to ethics, 10 studies included a specific section on the ethical issues raised by their work [
Privacy was discussed in several ways across all the included studies but was primarily addressed as part of the research method rather than in the real-world implementation of the technology. Approximately 19.7% (26/132) of papers sought to address user privacy through anonymization, deidentification, or paraphrasing of personal information. For example, Li et al [
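What anonymization or deidentification involved varied across studies and was rarely spelled out. As a minimal sketch, assuming simple pattern-based redaction, the fragment below scrubs handles, email addresses, phone numbers, and URLs from a post; real pipelines (including whatever specific approach Li et al used, which is not reproduced here) are generally more thorough, covering names, locations, and rare identifying details.

```python
import re

# Illustrative patterns only; production deidentification also
# handles names, locations, dates, and rare identifying details.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),          # emails first,
    (re.compile(r"@\w+"), "[USER]"),                              # then handles
    (re.compile(r"\b\d{3}[-. ]?\d{3}[-. ]?\d{4}\b"), "[PHONE]"),
    (re.compile(r"https?://\S+"), "[URL]"),
]

def deidentify(post: str) -> str:
    """Replace direct identifiers in a social media post with tokens."""
    for pattern, token in PATTERNS:
        post = pattern.sub(token, post)
    return post

print(deidentify("dm @jane_doe or email jane@example.com, 555-123-4567"))
# -> "dm [USER] or email [EMAIL], [PHONE]"
```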
The second major approach concerns what we have referred to as
A range of privacy engineering approaches was used, including hashing, encryption, and management of where data were processed. Approaches varied as to when and where data were processed and how this aligned with conceptions of privacy. Some studies encrypted data before sending them to servers for processing, whereas others analyzed data locally on the smartphone or did not store specific data after processing. Wang et al [
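One such pattern, keyed hashing of identifiers so that raw device or participant IDs never leave the phone, can be sketched as follows. The field names and payload are hypothetical, and this illustrates the general technique rather than the approach of Wang et al or any other specific study.

```python
import hashlib
import hmac
import json

# Secret key held by the research team; a real deployment would
# provision it from a key store, never hard-code it in source.
RESEARCH_KEY = b"replace-with-a-securely-generated-secret"

def pseudonymize(identifier: str) -> str:
    """One-way keyed hash (HMAC-SHA256) applied before transmission."""
    return hmac.new(RESEARCH_KEY, identifier.encode(), hashlib.sha256).hexdigest()

# Hypothetical daily summary: only the hashed ID and derived
# features are uploaded; raw sensor traces stay on the device.
record = {
    "subject": pseudonymize("android-device-8f3a"),
    "screen_unlocks": 42,
    "minutes_of_estimated_sleep": 371,
}
print(json.dumps(record))  # the payload that would leave the phone
```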
Some authors referred to the tension between privacy and data quality—framed, for example, as “[p]rivacy versus lives saved” [
The final point in the privacy theme concerned expectations. Very few studies considered the expectations people may have about how their data are used. Where this was considered, authors acknowledged that using data from sources such as social media or video games to make predictions about people's mental health changes the meaning of these data and could have unintended consequences. Eichstaedt et al [
To summarize, we identified five major types of technology (social media, mobile apps, sensing technology, chatbots, and miscellaneous) to which algorithmic and data-driven approaches were applied in the mental health context. The primary stated purpose of these technologies was broadly to detect and diagnose mental health conditions (approximately 57/132, 43.2% of studies). Only 15.2% (20/132) of papers discussed ethical implications, with a primary focus on the individual privacy of research participants.
As noted, the privacy of participants was addressed in the studies primarily with reference to engineering methods and, in some instances, concerning regulatory compliance. In the smartphone group, notice and consent combined with engineering methods were used to address user-subject privacy concerns. In the social media group, privacy was discussed in terms of how data were managed and the technical elements of the algorithms used, including privacy-preserving algorithms [
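The reviewed papers rarely specified which privacy-preserving algorithms were meant. One canonical example of the genre is the Laplace mechanism from differential privacy, sketched below for a simple counting query (eg, how many users a screening model flagged); the epsilon value is illustrative, and this example is ours rather than one drawn from the reviewed studies.

```python
import numpy as np

def dp_count(true_count: int, epsilon: float = 1.0) -> float:
    """Release a count with Laplace noise scaled to sensitivity 1,
    the textbook mechanism for epsilon-differential privacy."""
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Smaller epsilon -> more noise -> stronger privacy guarantee.
print(dp_count(128, epsilon=0.5))
```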
Questions may be raised about how privacy is (or should be) conceptualized and how the technologies will fare in real-world settings. Taking a strictly legal approach to privacy, for example, may not necessarily confer a social license to operate. An example of a failure to align law and social license is the United Kingdom’s proposed
Privacy as a concept is an expression of claims to dignity and self-determination. These more expansive concerns of dignity and autonomy were not explicitly considered in the studies examined in this review, which points to possible gaps in the literature.
It is difficult to discuss what did not appear in the studies under review.
Notwithstanding the common interest in matters of privacy across almost all papers, there was relatively little engagement with the broader ethical dimensions of the algorithmic and data-driven technologies in question, a finding that appears to support the view of some scholars in the field [
However, an important distinction should be made between empirical studies designed to validate or explore a particular technology and (as we discussed in the
Furthermore, the gap between applied research, on the one hand, and research that is specifically focused on ethics, on the other hand, does not appear to be unique to the mental health context. For example, Hübner et al [
A minority of the studies in our review discussed these challenges. Birnbaum et al [
New critical questions are required. For example, several studies in the social media category, the largest group of studies, eschewed institutional review board approval based on claims that their data sets were publicly available.
Very few studies (4/132, 3%) in this survey appear to have involved people who have used mental health services, those who have experienced mental health conditions or psychosocial disability, or even those envisaged as end-beneficiaries of the particular algorithmic and data-driven technology in the design, evaluation, or implementation of the proposals in any substantive way (except as research participants). In studies where service users were involved, this tended to comprise research participants taking part in the co-design of content or the codevelopment of user interfaces. D'Alfonso et al [
With very few exceptions, however, the survey indicated a near-complete exclusion of service users in the conceptualization or development of algorithmic and data-driven technologies and their application to mental health initiatives. It is also noteworthy that even mental health practitioners, who may well be end users envisaged by technologists, were involved in relatively few studies.
The active involvement of mental health service users and representative groups for persons with psychosocial disabilities has become a prominent ethos in mental health and disability policies worldwide [
From a pragmatic perspective alone, the involvement of service users and others with psychosocial disabilities is generally agreed to increase the likelihood of “viable and effective—rather than disruptive and short-lived—advances” in digital technologies in the mental health context [
In the scant commentary and research in the field by persons with psychosocial disabilities and service users, commentators have raised concerns about the potential need for a
As discussed in the
For some researchers who are developing mental health apps, the first-wave algorithmic accountability concerns will focus on whether a linguistic corpus of stimuli and responses adequately covers diverse communities with distinct accents and modes of self-presentation. Second-wave critics...may bring in a more law and political economy approach, questioning whether the apps are prematurely disrupting markets for (and the profession of) mental health care in order to accelerate the substitution of cheap (if limited) software for more expensive, expert, and empathetic professionals.
Second-wave concerns give rise to questions as to who is benefiting from (and burdened by) data collection, analysis, and use [
The issues the studies aimed to address were presented in medical terms and framed as problems that are amenable to digital technological solutions. This is not surprising. However, some scholars have raised concerns regarding this framing. In their survey of the messaging of mental health apps, Parker et al [
most forms of mental distress are inextricably linked to problems of poverty, precarity, violence, exclusion, and other forms of adversity in people’s personal and social experiences, and are best addressed not by medicalization, but by low intensity but committed and durable social interventions guided by outcomes that are not measured in terms of symptom reduction, but by the capacities that people themselves desire in their everyday lives.
This argument raises broader questions about the politics of mental health, which it would be unrealistic to expect empirical studies of algorithmic and data-driven technologies in mental health care to resolve. Nevertheless, there is an argument that such political considerations and value choices are currently overlooked, with an overwhelming emphasis on scientific methods and measurements of risk and benefit.
Reviews such as those conducted by Shatte et al [
A disadvantage of using a rapid scoping review method is the difficulty in reproducing the results, given the use of numerous search strings in multiple combinations. This is exacerbated by our aim to cover multiple technology types across several cross-disciplinary databases (resulting in 1078 potential studies reduced manually to 132). There are trade-offs in this broad, exploratory approach. In addition to the challenges of replicability, we cannot claim to have achieved an exhaustive review, as may be possible in systematic reviews of specific technologies or subtypes (such as machine learning). Furthermore, the wide range of new and emerging technologies in our scope poses terminological challenges; hence, we undoubtedly missed studies that used terms overlooked in our search strings (as a peer reviewer pointed out, we did not use the term
Despite these limitations, a survey of empirical studies offers valuable information. The principal strength of a scoping review is its
Our findings suggest that the disciplines undertaking applied research in this field do not generally prioritize ethical and legal analysis.
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
Funding for this research was obtained from the Mozilla Foundation and the Australian Research Council (Project ID: DE200100483).
None declared.