Published on 30.08.18 in Vol 5, No 3 (2018): Jul-Sept
Preprints (earlier versions) of this paper are available at http://preprints.jmir.org/preprint/10104, first published Feb 12, 2018.
Recognition of Emotions Conveyed by Touch Through Force-Sensitive Screens: Observational Study of Humans and Machine Learning Techniques
Background: Emotions affect our mental health: they influence our perception, alter our physical strength, and interfere with our reason. Emotions modulate our face, voice, and movements. When emotions are expressed through the voice or face, they are difficult to measure because cameras and microphones are not often used in real life in the same laboratory conditions where emotion detection algorithms perform well. With the increasing use of smartphones, the fact that we touch our phones, on average, thousands of times a day, and that emotions modulate our movements, we have an opportunity to explore emotional patterns in passive expressive touches and detect emotions, enabling us to empower smartphone apps with emotional intelligence.
Objective: In this study, we asked 2 questions. (1) As emotions modulate our finger movements, will humans be able to recognize emotions by only looking at passive expressive touches? (2) Can we teach machines how to accurately recognize emotions from passive expressive touches?
Methods: We were interested in 8 emotions: anger, awe, desire, fear, hate, grief, laughter, love (and no emotion). We conducted 2 experiments with 2 groups of participants: good imagers and emotionally aware participants formed group A, with the remainder forming group B. In the first experiment, we video recorded, for a few seconds, the expressive touches of group A, and we asked group B to guess the emotion of every expressive touch. In the second experiment, we trained group A to express every emotion on a force-sensitive smartphone. We then collected hundreds of thousands of their touches, and applied feature selection and machine learning techniques to detect emotions from the coordinates of participant’ finger touches, amount of force, and skin area, all as functions of time.
Results: We recruited 117 volunteers: 15 were good imagers and emotionally aware (group A); the other 102 participants formed group B. In the first experiment, group B was able to successfully recognize all emotions (and no emotion) with a high 83.8% (769/918) accuracy: 49.0% (50/102) of them were 100% (450/450) correct and 25.5% (26/102) were 77.8% (182/234) correct. In the second experiment, we achieved a high 91.11% (2110/2316) classification accuracy in detecting all emotions (and no emotion) from 9 spatiotemporal features of group A touches.
Conclusions: Emotions modulate our touches on force-sensitive screens, and humans have a natural ability to recognize other people’s emotions by watching prerecorded videos of their expressive touches. Machines can learn the same emotion recognition ability and do better than humans if they are allowed to continue learning on new data. It is possible to enable force-sensitive screens to recognize users’ emotions and share this emotional insight with users, increasing users’ emotional awareness and allowing researchers to design better technologies for well-being.
JMIR Ment Health 2018;5(3):e10104
- emotional artificial intelligence;
- human-computer interaction;
- force-sensitive screens;
- mental health;
- positive computing;
- artificial intelligence;
- emotional intelligence
Emotions are distinct natural entities that involve the mind and the body. They modulate our physiology by increasing or decreasing variables such as heart rate, respiration, and body temperature, and our psychology by altering perception, beliefs, and virtual images [, ]. Emotions seek expression, and they use one or many motor outputs (sequentially or simultaneously) to fulfill their need to be expressed: face, voice, arms, and legs are used in a combination of one or many outputs to express emotions depending on the context, the physical condition of the body, the available means of communication, and the choice of the expresser [ ].
Touch is a profound form of communicating emotions, and many researchers have studied its use in and impact on health and well-being [, - ]. A few minutes of daily touches not only enhance growth and weight gain in children, but also lead to emotional, physical, and cognitive improvements in adults [ , ]. Touch releases hormones and neuropeptides, and stimulates our bodies to react in very specific ways: the levels of blood pressure, heart rate, and cortisol change, and the hippocampus area of the brain is activated for memory [ ]. Humans can easily communicate and sense emotions conveyed through touch [ , ]; babies respond well to touch [ ] and loving touches are critical to the health of premature infants [ ].
We use touch to communicate with many devices in our daily lives. As machine interfaces are engaging users more frequently and tend to mimic human-human interactions to facilitate natural communications, it is becoming key to develop new algorithms for the new interactive and sensitive touch screens to capture and recognize users’ emotions and increase the level of emotional intelligence of both users and their devices.
Emotional intelligence, or our ability to recognize and regulate our own emotions and those of others, is key to communicating well. Recognizing emotions when they are expressed and regulating them in ourselves and others helps us achieve effective communications and maintain good mental health [- ].
Emotional artificial intelligence, or a machine’s ability to express emotions and recognize and regulate users’ emotions, has also become a key capability of an intelligent machine enabling effective interactions with its users . Various methods are being used and technologies are being built to detect users’ emotions as they interact passively or actively with machines. The most commonly used techniques for emotion recognition are one or a combination of many of the following approaches: text, facial expressions, voice tones, biosensors and body movements, and gestures.
Many technologies have implemented facial coding and voice analysis theories to recognize emotions, but none have explored the touch theory . Software technologies analyze the face, tone of voice, and textual natural language to recognize emotions [ ]. It is difficult to fulfill the requirements of these technologies and measure emotions accurately in real-life situations. It is essential to find and add other more accessible ways to measure emotions: that is, mobile phone use and touch screen behavior.
Emotion recognition techniques using text process words and sentences in a particular language. The most common techniques process natural language and extract emotions and sentiments from writings and conversations found in books, blogs, chat rooms, and social media platforms . One of the biggest challenges of this method is to recognize emotions in the context of the text. An emotion can be expressed without using the word that denotes it or any of its synonyms. New words and expressions can be created, or words from other languages or sarcasm can be used to express emotions, which makes the task of recognizing emotions very difficult [ ].
Usually, facial emotion recognition techniques segment images of the face into specific regions and analyze their movement. Regions of interest include cheek, chin, wrinkles, eyes, eyebrows, and mouth . Different classification techniques are then applied to recognize emotions [ ]. New studies have claimed that faces alone do not universally communicate emotions, and that conceptual knowledge supported by language is necessary to distinguish and recognize emotions [ , ]. In addition, a static image does not convey the information related to the dynamic form of the expression to accurately recognize emotions.
Speech emotion recognition techniques analyze the tone of voice and other speech features to detect emotions and their dimensions. Various methods and interfaces have been designed to extract various features from speech signals, and they are usually adapted to a particular language. The tone of speech varies with different cultures. A person from a certain area, talking in a normal tone, might sound angry to someone from another culture due to differences in normal speed and volume between the two cultures. Someone talking in a low and slow tone might appear as sad for some and polite for others. Additionally, speaking in real-life conditions is corrupted with various noises. This makes it hard for a machine to isolate a particular voice and recognize emotions .
Emotion recognition using biosensors monitors the physiological variables of the autonomic nervous system (ANS) that are affected by emotions. Biosensors can be invasive or noninvasive and collect variables in the ANS including heart rate, skin conductance, heart rhythm, blood volume, and temperature to recognize emotions through changes in their patterns. Spatial and temporal analysis of the brain’s activity are two other techniques that are used to recognize emotion . Functional magnetic resonance imaging focusses on identifying the regions of the brain involved in expressing emotions, and electroencephalography monitors the electrical activity of the brain to recognize emotions. Different clustering and evaluating techniques are then used on these physiological changes to detect emotions. But consistent and universal patterns in the ANS in relation to emotions have not been found, and many technical challenges are still to be solved [ , , ].
Body movements and finger gestures are other ways of expressing emotions. Features such as amplitude, speed, fluidity, shape of movements, and motion direction are being extracted from expressive children and adults, and various methods and techniques have been applied to recognize emotions [- ]. A body action and posture coding system has been developed recently to enhance the understanding of the role of body movements in expressing emotions [ ]. But emotion recognition based on body movements and gestures is the least popular way of evaluating emotions. Tracking body movements and gestures in 3 dimensions is difficult and requires many sensors, which is one of the major drawbacks of this method.
Multimodal approaches have also been used to recognize emotions. By analyzing two or many measures from the face, voice, text, or ANS, these approaches provide better accuracy than do individual modalities but are complex and not easy to replicate or scale [- ].
In the last few years, researchers have started to explore correlations between emotions, features from smartphones, and gestures [, ]. Analyzing data from a smartphone’s accelerometer predicted the emotional dimension of arousal with an accuracy of 75% [ ]. Analyzing the length, time, velocity, and pressure of finger strokes on a smartphone predicted the emotional dimension of valence with 84.9% accuracy [ ]. Analyzing features extracted from textual contents and user typing predicted anger, disgust, happiness, sadness, neutrality, surprise, and fear with 72% accuracy [ ].
The worldwide number of mobile phone users is expected to pass the 5 billion mark by 2019 . Most of the mobile market growth can be attributed to the increasing popularity of smartphones. Users touch their smartphones thousands of times a day [ ]: they play games, purchase products, and interact with other users in chat rooms and social media platforms. Users spend 54 minutes to 3.8 hours per day on their smartphone. Among 16- to 24-year-olds, 94% possess a smartphone and spend up to 4 hours per day on it. They open an app every 15 minutes because they feel the urge to do so: it is difficult for most of them to reduce the time spent on a smartphone or control the frequency of its use [ ]. Addictive behaviors are dictated by uncontrolled emotions, where reasoning and logical thinking is not applied [ , ]. The World Health Organization classifies addictive behaviors for technology as a mental health problem [ ].
It is thus essential to develop technologies for emotional awareness and help users establish a healthy communication between reason and emotions to lower addictive behaviors and suffering, improve decision making, and increase well-being .
This study explored the power of touch and its potential to convey distinct emotions when it is used as a means to communicate with apps and distant users via force-sensitive smartphones. Our ultimate goal is to increase users’ emotional awareness by powering smartphone apps with touchscreen emotional intelligence. We hope to open new opportunities for designers to create new interactive emotional experiences, provide emotional inputs for developers to enrich their apps, and offer insight for researchers to better understand emotions in the context of human-computer interactions.
We conducted 2 experiments in this study to recognize anger, awe, desire, fear, hate, grief, laughter, love, and no emotion. Experiment 1 examined whether humans are able to recognize the emotions by looking at passive expressive touches. Experiment 2 collected features of expressive touches of emotionally aware participants with good imagery ability and used machine learning techniques to predict emotions.
Emotions are complex entities that often cannot be defined with just one single word. Naming an emotion is labelling the qualities of its psychophysiological manifestation, and these qualities are not precisely known. Naming by language the experience of anger is not a guarantee of the existence of a simple and clear psychophysiological pattern. Anger is not a simple entity, and complex and mixed patterns may have simple names. Sometimes the naming is, to a degree, confused and confusing. Some emotions remain nameless.
In this study, we were interested in what we define as biological emotions or nonverbal emotions for which language is not required when they are communicated. We listed 8 emotions (and no emotion) that we considered to be biological, and we hypothesized that they are easily recognizable by a perceiver if expressed authentically: anger, awe, desire, fear, grief, hate, laughter, and love, as well as no emotion. The main word we coined for each emotion is approximate and not unique. Translating our words to other languages and explaining them to our participants required that we define every individual word by describing how people react and what they say and do when they are under the influence of that emotion.describes our 8 biological emotions.
In addition to these descriptions, we added a set of images and videos showing people’s faces, speech, and body movements for every emotion. We collected this multimedia content from the internet. We were inspired by the International Affective Picture System  and the affective videos of the Laboratoire d’Informatique en Image et Systèmes d’information Annotated Creative Commons Emotional Database [ ], and standardized in terms of its audio and visual characteristics (brightness, loudness, distance, color, and size).
We recruited and screened volunteers and smartphone users before assigning them into 2 groups to participate in 2 experiments. We contacted a local volunteer center and posted advertisements on websites asking people above 16 years old who were interested in emotions and technology to participate in our study for free.
|Anger||When you are angry, you boil, react, object, yell, or swear. You say words or expressions like “fuck,” “shit,” “no,” “stop,” or other synonyms silently in your head or loudly in your own language, usually your native language.|
|Awe||When you are in awe, you freeze or slow in contemplation. You disconnect from distractions and get absorbed by the object of your awe. You are speechless and cannot link what you discover with what you already know.|
|Desire||When you desire, you want, crave, need, and starve for. You say words or interjections like “yummy,” “come,” “tasty,” “want you,” or other synonyms silently in your head or loudly in your own language, usually your native language.|
|Fear||When you fear, you withdraw, hide, freeze, or tremble. You remain silent or say words or expressions like “no” or other synonyms silently in your head or loudly in your own language, usually your native language.|
|Grief||When you are in grief, you are very sad, and feel helpless and weak. You suffer and feel pain. You cry, moan, and whimper.|
|Hate||When you hate, you destroy, crush, and break. You say words or expressions like “perish” or “die” in your head or loudly in your own language, usually your native language.|
|Laughter||When you laugh, your breath and voice are chopped and your eyes twinkle and tear. You repeat “Ha ha” or other sounds while you move in the same rate as you laugh and emit sounds.|
|Love||When you love, you care, protect, comfort, and maintain the state of the loved object. You smile, remain silent, or say words or expressions like “dear,” “cute,” or “sweet” in your head or loudly in your own language, usually your native language.|
|No emotion||When you are not under the influence of an emotion, you reason with ease. Counting from 1 to 10 while seeing or visualizing the numbers in your head is an example of a very simple and unemotional state.|
We invited all volunteers to complete 2 tests: the Levels of Emotional Awareness Scale (LEAS) to assess their emotional awareness ; and the Questionnaire Upon Mental Imagery (QMI) to assess their imagery ability [ , ]. For the purpose of our study, and the requirement to express pure and authentic emotions in both experiments, we needed all participants in group A to be not only emotionally aware, but also able to physiologically react to imaginary emotional situations. Only 20 participants among the 117 volunteers accepted to take the tests and apply to be part of group A.
The LEAS is an open-ended test in which we assessed the ability of volunteers to use emotion words in various situations to describe their feelings and the feelings of others . A total of 20 volunteers answered 20 questions describing 20 emotionally evocative situations, and we hand-scored their responses. The scoring was as follow: 0 was given to nonemotional responses or when a response described thought instead of feeling; 1 was given when participants described physical awareness (eg, “I feel tired”); 2 was given when a response described an undifferentiated emotion (eg, “I feel bad”) or when the response described an action (eg, “I feel like I’m going to punch him in the face”); 3 was given when feelings were described using discrete emotions (we have a glossary of more than 600 discrete emotions collected from prior studies); 4 was given when many different discrete emotions were used to describe mixed or complex feelings; and 5 was given when participants described and differentiated their own feeling from someone else’s feelings using discrete emotions [ ].
A higher score in the LEAS correlates positively with empathy, understanding of others, and openness to experience , as well as the ability to recognize emotions [ , ].
The QMI is a 600-item measure of mental imagery ability for 7 sensory modalities (visual, auditory, cutaneous, kinesthetic, gustatory, olfactory, and organic). In our study, we focused on the visual, cutaneous, and kinesthetic sensors only (the scoring of each of the 7 modalities is independent, and the emotion induction protocol requires participants to focus on 3 senses). We thus asked our volunteers 80 selected questions similar to the original questions of the test to indicate how clearly they could imagine a series of situations (eg, the form and movement of an assassin approaching the bed, touching silk, running fast to catch a car). Scoring was on a 7-point vividness rating scale varying from 1 (“Perfectly clear and as vivid as the actual experience”) to 7 (“I think about it but I cannot imagine it”), with a total score ranging from 80 to 560. Higher total scores indicate weaker imagery ability. Physiological activity in response to emotional imagery varies as a function of imagery ability. This means that good imagers show greater emotion-specific physiological activity than poor imagers .
We conducted experiments 1 and 2 in the laboratory on the same day for group A. Group B participated in experiment 1 remotely. We asked group A participants to follow some instructions (see below) 1 day before the day of the experiments. On the day of the experiments, they participated in an interview session and a training session before experiments 1 and 2 .
Instructions for the Day Before
We asked group A to rest, sleep, and eat well but not too much, because emotions are subject to physical and psychological states and can be difficult to induce under conditions of stress, fatigue, lack of sleep, hunger, or heavy meals . For the needs of the training session, we asked them to come, if possible, with a voluntary partner with whom they were able to be emotionally intimate. Alternatively, we set up a time when 2 single group A participants could come together in the same time and partner each other during the training session. We asked all of them to bring their smartphone.
We excluded group A participants if they had a history of any of the following: current alcohol and drug abuse or dependence, neurological disease or trauma, and other medical or psychiatric complications. To share our definition of emotions clearly with everyone in group A, we gave them the list of emotions described in, and they viewed our collected multimedia content for each emotion. We encouraged them to ask questions and translate the words we gave as examples to their own words and language. All group A participants signed a consent form where a strict ethics code was applied, including their right to ask questions, withdraw from the experiment, keep their personal information private, and understand the scope and the purpose of the study.
Our protocol in experiment 2 required a minimal granularity in the measurement of finger pressure and area to collect enough change in data for our machine learning algorithms. To test whether a smartphone is sensitive enough, we designed an app that can be set up on any smartphone to check our requirements. After installing and launching the app on their smartphone, we asked participants to press on their screen as hard as possible for 5 seconds. We required a minimum granularity level of 10 for both area and force, a minimum pixel density of 100 pixels per inch, and a maximum temporal resolution of 1 millisecond. We used the device model Motorola XT1023 (Motorola Mobility LLC, Libertyville, IL, USA) with all of those who did not have a sensitive enough smartphone.
The purpose of this session was to train participants from group A to express each emotion as precisely as possible by using only their finger and touching the palm of the partner to communicate each emotion. We called group A participants “expressers” and their partners “perceivers.” We encouraged but did not oblige expressers to use their middle finger during the expression because it is the first finger to reach the target of touch and the least cumbersome to use.
We asked the expressers to read the description of each emotion (see) and use our inductive material (images and videos) to stimulate their imagination and remember an emotion-provoking life situation. We asked them to describe the situation as if they were actively involved and emotional. They were encouraged to close their eyes and imagine the specific situation so as to experience it more accurately. A blank paper was provided, and participants were required to write down a description of the situation, their thoughts, and their words for each emotion. They were encouraged to use their own language. lists the instructions given to the expressers.
We gave the perceivers, in a separate room, the list of emotions (see) and asked them to memorize them. They were informed that they were to be blindfolded and seated on a chair with their palm resting on their knee, and that an expresser would try to communicate an emotion from the list by only touching the perceiver’s palm with their finger. We asked the perceivers to try to guess the emotion conveyed by the finger and respond with 1 of the following 3 options: 1: “I don’t know;” 2: “I hesitate between...;” and 3: “I know; it is...”
We asked expressers to express each emotion once or many times until the perceiver guessed their emotion correctly. They were free to move to another emotion before coming back to an emotion they had already tried to express without being successful and try again until they succeeded. Expressers were given a paper on which the emotions were listed and were also asked to put 1 or 0 at every trial (1: their partner successfully recognized the emotion; 0: their partner was confused or not able to recognize the emotion).shows the emotional expression of participants on the palms of their partners.
We asked expressers who completed the training session to participate in experiments 1 and 2, but only to express the emotions they were able to communicate successfully to their perceivers.
The objective of this experiment was to answer the following question: if emotions modulate our touches [, , ], will humans be able to recognize emotions by only looking at an expressive touch?
We asked participants from group A to express the emotions that their partners were able to recognize during the training session against a transparent glass under which we video recorded their touches for 30 seconds. Each session started with neutral (nonemotional) touches.
We asked participants to express each emotion with one or many successive touches. They were encouraged but not obliged to use their middle finger.shows 3 frames from 3 different videos in which a participant expressed fear, grief, and laughter, respectively (left to right).
We then showed the videos online to group B and asked them to guess the emotion expressed by the finger by classifying the expression as 1 of the 8 emotions (and no emotion) described in.
The objective of this experiment was to answer the following question: can we teach machines how to accurately recognize users’ emotions from their touches on force-sensitive screens?
We asked group A participants to express the emotions that their partners were able to recognize during the training session against an app that we installed on a smartphone. They were encouraged to use their middle finger because it is the least cumbersome finger and the first to reach and touch the screen, but they were allowed to use any other finger if it was easiest for them to use.
shows the app’s interface and a participant expressing an emotion. Participants followed the same instructions as those in the training session (see ).
Each session started with neutral (nonemotional) touches. We asked participants to express each emotion with one or many successive touches. They were encouraged to express each emotion at least 10 times, successively or not. After each emotional expression, participants took a 5-minute break to relax and resume a neutral state, thus preventing potential carryover effects of the previous emotional experience. During this period, we asked participants to rate on a single 5-level scale the purity of their emotional expression and the extent to which their finger expression conveyed the emotion. The rating varied from 0 to 5, where 0 was “not expressed” and 5 was “expressed well.” This allowed us to assess the accuracy of the induction procedure, as well as the subjective state of the participant after the emotional experience and expression.
Precision, Calibration, and Normalization
We recorded the coordinates (xt,yt) of finger touches, their amount of force (Ft), and skin area (St), all as functions of time with a resolution of 1 millisecond. Knowing the screen density of each device, we converted (xt,yt) from pixels to millimeters with a precision of a hundredth of a millimeter. We calibrated Ft and St, knowing the maximum amount of force and skin area a participant could apply on a device, and then normalized it on a scale of 0 to 1 with a precision of 2 digits after the decimal, because the maximum values Ft and St differ not only between devices but also between participants. Some devices are more sensitive than others, and the maximum values of Ft and St are not always 1. Participants had different finger sizes, and large fingers record larger areas.
Data Filtering, Touches, and Expressions
Among the data we collected, and based on the subjective rating, we retained only the expressions that were rated 3 or more on the scale of purity. We grouped successive touches with a delay of less than 1000 milliseconds and considered them to be part of one single emotional expression. Expressions comprised one or many touches depending on the emotion, its intensity, and the need to express it.
To set up an initial set of relevant features and observe the expression of each emotion in time, we designed a visualization tool using Python.
Based on our intuitive assumptions and observations and the experimental context, we calculated for every expression an initial set of the following 17 spatiotemporal features: the coordinates (x1, y1, x2, y2) of the beginning and end of an expression; the distance, angle, duration, velocity, and acceleration of the expression; the total number of touches and the mean duration of touches of an expression; and the mean, maximum, and velocity of both force and size of an expression. Further statistical analysis and feature selection techniques including principal component analysis  allowed us to reduce the number of features to 9 independent features. We retained the first 9 eigenvectors because they captured 95% of the total variance in the original data. The distribution of variance among the 9 components was 20.1% (roughly corresponds to the duration), 18.2% (the amount of force), 12.7% (the number of touches), 11.3% (the spatial extent on the x,y axis), 10.4% (the angle), 8.2% (the distance), 5.1% (the size), 4.6% (the velocity), and 4.4% (the acceleration).
We used the parametric paired t test to measure the degree of variance and dependency between the normally distributed features.displays the comparison results.
We see that the mean differences 5.139 (in pair 1), 175.118 (in pair 2), 0.756 (in pair 3), and –284.59 (in pair 4) between features are not equal to zero. With 95% confidence (P=.05), we could conclude that there was a significant statistical score to indicate that our selected features were independent.
We recruited and screened 117 volunteers and smartphone users between the ages of 16 and 63 years (47 male and 70 female participants). Their mean age was 32.5 (SD 13.9) years.
Only 15 participants were rated as good imagers and highly emotionally aware enough to be part of group A (15/117, 12.8%; 7 male and 8 female participants). They were all right-handed and scored above 70 (very much above average) in the LEAS test and below 107 (above average) in the QMI.
The other 5 participants were either not enough emotionally aware or not good enough imagers, or both. We assigned them to group B, which comprised 102 volunteers (40 male and 62 female participants), and we applied no additional tests or requirements to them. Both groups participated in experiment 1, and only group A participated in experiment 2.shows the distribution of the participants by group and sex.
Smartphone Sensitivity Tests
We tested 11 different smartphone devices: 63% (7/11) were eligible to be used in experiment 2 (see).
Experiment 1: Human Recognition for Emotions
We collected 102 responses for each emotion. The results proved to be highly successful and significant, with all 8 emotions (and no emotion) correctly recognized with great accuracy with only 1 attempt. Errors of recognition were mainly choosing hate for anger (7/918, 0.8%) and vice versa, or choosing awe for no emotion (11/918, 1.2%) and vice versa. A total of 12 of the 918 participants (1.3%) confused hate with laughter.
|Device model||Screen density (pixels/inch)||Force granularity||Area granularity||Sensitive enough?|
|Apple iPhone 6S||326||121||95||Yes|
|Sony Xperia XZ||424||35||21||Yes|
However, 49.0% (50/102) of the participants were 100% (450/450) accurate in recognizing all 8 emotions (and no emotion) and 25.5% (26/102) were 77.8% (182/234) accurate. Male and female participants did equally well.shows the results obtained for each emotion and details the classification results for each emotion. This experiment confirmed that emotions modulate our fingers as we force-touch a sensitive surface, as well as that humans have a high ability to recognize emotions from the pattern of the emotion in the expressive force-touch.
We can see distinct patterns between emotions in terms of their beginning and end, as well as the shape, speed, acceleration, and space occupation of their respective expressions.shows more details in the variation of the amount of force and skin area for each emotion. The horizontal axis is the time in milliseconds; the vertical axis represents the variation of force from 0 to 1, and the dot size on the curves correlates with the variation of the skin area over time.
shows a single frame taken from participants’ finger expressions of each emotion at a random instant t on the (x,y) axis.
Emotional expressions have distinct amounts of force and skin area. Hate is characterized by a very high amount of force and fear is very fast and short.
|Emotion expressed in the video||Participants’ classification|
Experiment 2: Machine Learning Classification
shows the results of overall subjective ratings of group A for each emotion.
All participants expressed anger and no emotion very well (486/486, 100% of expressions were rated 5). Grief and laughter were the most challenging (20% of expressions: 25/126 for grief and 46/232 for laughter were rated below 3).
Our dataset comprised 2316 instances (or emotional expressions).shows the distribution of instances among emotions. Anger was the most expressed emotion with 486 expressions, and grief was the least expressed with 101 instances.
The number of touches per emotional expression varied between 1 and 25 (mean 1.94, SD 2.88); 83.46% (1933/2316) of expressions had an average of 1.38 touches and 3.45% (80/2316) of expressions had a maximum of 2.5 touches (see). We saw more touches in laughter and anger than in the other emotions.
We designed an experimental framework in which we tested various machine learning techniques in supervised learning, including naive Bayes, nearest neighbor, neural networks, meta, and decision tree classifiers. We obtained the best classification results in random committee and 2 decision tree algorithms: random tree and random forest.
Random committee is a type of meta algorithm that takes classifiers and converts them into more powerful learners. Random committee builds an ensemble of base classifiers and averages their predictions . Each one is based on the same data but uses a different random number seed. Decision trees are treelike structures; they start from root attributes and end with leaf nodes. Decision tree algorithms describe the relationships among attributes and the relative importance of attributes. Random tree chooses a test based on a given number of random features at each node, performing no pruning. Random forest constructs random forests by bagging ensembles of random trees [ ]. shows the classification results using the 10-fold cross-validation test option.
The percentage of correctly classified instances was very high, and varying between 86.14% (1995/2316) and 91.11% (2110/2316). Kappa statistics is a chance-corrected measure of agreement between the classifications and the true classes. It is calculated by taking the agreement expected by chance away from the observed agreement and dividing by the maximum possible agreement. Kappa being higher than .81 demonstrates an almost perfect agreement for all the classifiers. Random forest produced the best results.shows its detailed accuracy per emotion and shows the confusion matrix.
The rate of true positives (or recall) varied from .75 for awe to 1.00 for fear. Most instances were correctly classified. The rate of false positive was insignificant and was highest in love, with only .03 instances falsely classified. The proportion of instances that were truly of a class divided by the total instances classified as that class (precision) varied from .82 to 1.00. A combined measure for precision and recall calculated as 2 × precision × recall / (precision + recall) is presented as the F measure. The area under the receiver operating characteristic curve approach 1.00 for all classes (>.98), which demonstrates the optimality of our model.
The confusion matrix inshows the raw numbers, with anger, are, desire, fear, hate, grief, laughter, love, and no emotion being the class labels.
To test the performance of the classifier, we trained it on 15 subsets where we excluded 1 different participant in each run to be tested on the sample of the excluded participant (leave-one-run-out cross-validation and leave-one-sample-out cross-validation). The average performance of the classifier was 86.36%, kappa=.84: a drop of 4.6% compared with the result obtained with 10-fold cross-validation.
Human Versus Machines
In recognizing our emotions from finger force-touches, our algorithm did better than group B participants.shows a comparison of accuracy between our algorithm and group B participants in recognizing emotions.
For group B participants, anger was the easiest recognizable emotion, while awe and laughter were the most difficult to guess. Our algorithms were best in detecting fear, with almost 100% accuracy.
|Emotion and rating||Participants who chose the rating, n (%)|
|No emotion (n=216)|
|Algorithm||Correctly classified, |
|Kappa statistic||Mean absolute error||Root mean |
|Relative absolute |
|Root relative square |
|Random tree||1995 (86.14)||.84||.03||.18||15.90||56.40|
|Random committee||2082 (89.90)||.88||.03||.13||16.43||42.62|
|Random forest||2110 (91.11)||.90||.04||.13||19.54||40.77|
|Class||True positive |
|False positive |
|Precision||Recall||F measure||Matthews |
|Area under the receiver |
|Area under the |
The results of the first experiment of this study demonstrated the ability of humans to recognize emotions when expressed through finger force-touch, and the results of the second experiment demonstrated clear finger force-touch patterning of emotions.
Our findings in experiment 1 indicated the ability of group B participants to recognize 8 emotions: anger, awe, desire, fear, grief, hate, laughter, love (and no emotion) as described in. Group B had a high accuracy of 83.8% (769/918) in recognizing the emotions of the emotionally expressive participants (group A) by only looking at the movement of their fingers when pressing a transparent glass to express emotions. Clynes [ ] reported a similar ability of humans to recognize emotions by only looking at the movement in space of an arm expressing emotions. Clynes [ ] and Hertenstein and colleagues [ , ] demonstrated the ability of humans to communicate and perceive distinct emotions via touch. Our finding in experiment 2 indicated higher accuracy for our algorithm, recognizing clear patterns of 8 emotions (and no emotion) with a 91.11% (2110/2316) classification accuracy. Following the training sessions, recording finger force-touch and skin area in the expression of each emotion and filtering the collected data based on subjective ratings revealed highly significant and accurate classification of group A’s emotions.
An interesting insight in our data was the correlation between the 8 emotions (and no emotion) and the spatiotemporal features: coordinates of the beginning and end of an expression, the distance, angle, duration, velocity, and acceleration of the expression, the total number of touches, the mean duration of touches of an expression, and the mean, maximum, and velocity of both force and size of an expression.
Using a strict participant selection process and a personalized emotion induction protocol with human validation and subjective rating allowed us to state that anger, awe, desire, fear, grief, hate, laughter, love, and no emotion produce specific response patterns in finger force-touch expression of emotions. However, there are several considerations (related to the selection process of group A participants) that limit generalization of our findings: group A participants (1) were highly emotionally aware, (2) had good imagery ability, (3) were used to smartphones, (4) had good touch dexterity on their smartphones, and (5) were willing and able to communicate emotions authentically using touch. Future replication of these findings is needed in participants who are poor imagers, less emotionally aware, and nonusers of smartphones.
Comparison With Prior Work
According to cognitive-physiological network models, ideas, memories, and verbal and subjective descriptors are very important components to induce emotions. Thus, when we scrupulously described our emotions to our participants and then asked them to remember and personalize meaningful emotion-evoking scenes and use their finger as the motor output to communicate the emotion to another participant before communicating the expression to a smartphone, the finger force-touch alone discriminated between anger, awe, desire, fear, grief, hate, laughter, love, and no emotion in 91.11% (2110/2316) of the cases. This indicates that the 9 spatiotemporal features extracted from finger force-touch measures were part of the network that was activated when participants expressed their emotions. Our accuracy rates were higher than those reported in earlier similar studies where emotions or emotional dimensions were detected from force-touch, typing, and words or other sensors in the smartphone [, - ]. This may be related to clearer descriptions of our target emotions, the use of personalized multimedia material and imagery of real-life situations to induce emotions, and the use of another participant to validate the communication of the emotion via finger force-touch. Also, our higher accuracy may be due to the strict conditions in the selection process of participants for group A (they were all good imagers and highly emotionally aware).
Emotions are unique and important entities with built-in windows across the mind-body barrier that need to be understood. They convey great power in the development and mental healing of the individual, of society, and even for the now self-conscious evolution of human beings.
In this study, we described a protocol and implemented methods that allowed us to validate the human ability to express and perceive distinct emotions, and a machine’s ability to recognize those emotions on force-sensitive smartphones. Much remains to be done to build more comprehensive, accurate, and scalable emotion detection algorithms in real-life contexts.
Touch is one of the most powerful of human senses, and we use it passively to communicate emotions. As we continue interacting with devices through touch, it is becoming essential to analyze these patterns and sense emotion. Enabling smartphone apps to capture, discern, and communicate the emotions of expressive touches will completely change the way users perceive and touch their devices and facilitate spontaneous emotional expressions. Human-computer-human interactions will get better, clearer, and emotionally intelligent.
Conflicts of Interest
AH is the founder, Chief Executive Officer, and Chief Technology Officer of Emaww Inc, a software startup based in Montreal, Canada. Emaww specializes in emotions, human-computer interactions, generative art, and artificial intelligence. It aims to build innovative solutions to empower users, professionals, and developers with emotional insight and improve human-computer-human communications.
- Clynes M. Sentics: The Touch of Emotions. Garden City, NY: Anchor Press/Doubleday; 1977.
- De Spinoza B. Ethics. London, UK: Penguin Classics; 1996.
- Ardiel EL, Rankin CH. The importance of touch in development. Paediatr Child Health 2010 Mar;15(3):153-156 [FREE Full text] [Medline]
- Fairhurst MT, Löken L, Grossmann T. Physiological and behavioral responses reveal 9-month-old infants' sensitivity to pleasant touch. Psychol Sci 2014 May 01;25(5):1124-1131 [FREE Full text] [CrossRef] [Medline]
- Feldman R, Rosenthal Z, Eidelman AI. Maternal-preterm skin-to-skin contact enhances child physiologic organization and cognitive control across the first 10 years of life. Biol Psychiatry 2014 Jan 01;75(1):56-64. [CrossRef] [Medline]
- Field T. Touch. Boston, MA: MIT Press; 2003.
- Field T. Touch for socioemotional and physical well-being: a review. J Dev Rev 2010 Dec;30(4):367-383. [CrossRef]
- Hertenstein MJ, Keltner D. Gender and the communication of emotion via touch. Sex Roles 2011 Jan;64(1-2):70-80 [FREE Full text] [CrossRef] [Medline]
- Hertenstein MJ, Keltner D, App B, Bulleit BA, Jaskolka AR. Touch communicates distinct emotions. Emotion 2006 Aug;6(3):528-533. [CrossRef] [Medline]
- Calvo R, Peters D. Positive Computing: Technology for Wellbeing and Human Potential. Cambridge, MA: MIT Press; 2014.
- DeSteno D. Emotional Success: The Power of Gratitude, Compassion, and Pride. Boston, MA: Eamon Dolan/Houghton Mifflin Harcourt; 2018.
- Feldman-Barrett L. How Emotions are Made: The Secret Life of the Brain. Boston, MA: Houghton Mifflin Harcourt; 2017.
- Yonck R. The Heart of the Machine: Our Future in a World of Artificial Emotional Intelligence. New York, NY: Arcade Publishing; 2017.
- Doerrfeld B. 20+ Emotion recognition APIs that will leave you impressed, and concerned. Stockholm, Sweden: Nordic APIs AB; 2015. URL: https://nordicapis.com/20-emotion-recognition-apis-that-will-leave-you-impressed-and-concerned/ [accessed 2018-02-05] [WebCite Cache]
- Medhat W, Hassan A, Korashy H. Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J 2014 Dec;5(4):1093-1113. [CrossRef]
- Ekman P. The atlas of emotions. 2018. URL: http://atlasofemotions.org/ [accessed 2018-02-05] [WebCite Cache]
- Hjortsjo C. Man's Face and Mimic Language. Lund, Sweden: Studen Litteratur; 1969.
- Neeru N, Kaur L. Performance comparison of recognition techniques to recognize a face under different emotions. Am J Eng Res 2016;5(12):213-217.
- Doyle CM, Lindquist KA. Language and emotion: hypotheses on the constructed nature of emotion perception. In: Fernandez-Dols JM, Russell JA, editors. The Science of Facial Expression. New York, NY: Oxford University Press; 2017.
- Doyle CM, Lindquist KA. When a word is worth a thousand pictures: language shapes perceptual memory for emotion. J Exp Psychol Gen 2018 Jan;147(1):62-73. [CrossRef] [Medline]
- Chenchah F, Lachiri Z. Speech emotion recognition in noisy environment. 2016 Presented at: 2nd International Conference on Advanced Technologies for Signal and Image Processing; Mar 21-24, 2016; Monastir, Tunisia.
- Heraz A, Frasson C. Towards a brain sensitive intelligent tutoring system: detecting emotions from brainwaves. Adv Artif Intell 2011;2011:1-13. [CrossRef]
- Kreibig SD. Autonomic nervous system activity in emotion: a review. Biol Psychol 2010 Jul;84(3):394-421. [CrossRef] [Medline]
- Levenson RW. The autonomic nervous system and emotion. J Emot Rev 2014;6(2):100-112. [CrossRef]
- Castellano G, Villalba SD, Camurri A. Recognising human emotions from body movement and gesture dynamics. 2007 Presented at: ACII 2007: Affective Computing and Intelligent Interaction; Sep 12-14, 2007; Lisbon, Portugal p. 71-82.
- Kapur A, Babul NV, Tzanetakis G, Driessen PF. Gesture-based affective computing on motion capture data. In: Tao J, Tan T, Picard RW, editors. Affective Computing and Intelligent Interaction, First International Conference. Lecture Notes in Computer Science 3784. Cham, Switzerland: Springer; 2005:1-7.
- Kipp M, Martin JC. Gesture and emotion: can basic gestural form features discriminate emotions? 2009 Presented at: 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops; Sep 10-12, 2009; Amsterdam, Netherlands.
- Dael N, Bianchi-Berthouze N, Kleinsmith A, Mohr C. Measuring body movement: currentfuture directions in proxemicskinesics. In: Matsumoto D, Hwang HC, Frank MG, editors. APA Handbook of Nonverbal Communication. Washington, DC: American Psychological Association; 2015.
- Anil KK, Kiran B, Shreyas B, Sylvester J. A multimodal approach to detect user's emotion. Procedia Comput Sci 2015;70:2015-2303. [CrossRef]
- Kapoor A, Picard RW. Multimodal affect recognition in learning environment. 2005 Presented at: 13th Annual ACM International Conference on Multimedia; Nov 6-11, 2005; Singapore p. 677-682. [CrossRef]
- Shah M, Chakrabarti C, Spanias A. A multi-modal approach to emotion recognition using undirected topic models. 2014 Presented at: IEEE International Symposium on Circuits and Systems (ISCAS); Jun 1-5, 2014; Melbourne, Australia p. 754-757.
- Ghosh S. Emotion-aware computing using smartphone. 2017 Presented at: 9th International Conference on Communication Systems and Networks (COMSNETS); Jan 4-8, 2017; Bangalore, India.
- Olsen AF, Torresen J. Smartphone accelerometer data used for detecting human emotions. 2016 Presented at: 3rd International Conference on Systems and Informatics (ICSAI ); Nov 19-21, 2016; Shanghai, China.
- Dai D, Liu Q, Meng H. Can your smartphone detect your emotion? 2016 Presented at: 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD); Aug 15, 2016; Changsha, China.
- Sneha H, Annappa B, Manoj Kumar MV, Likewin T. Smart phone based emotion recognition and classification. 2017 Presented at: Second International Conference on Electrical, Computer and Communication Technologies; Feb 22-24, 2017; Coimbatore, India.
- Number of mobile phone users worldwide from 2015 to 2020 (in billions). New York, NY: Statista, Inc; 2018.
- Winnick M, Zolna R. Putting a finger on our phone obsession: mobile touches: a study on humans and their tech. Chicago, IL: dscout, Inc; 2016 Jun 16. URL: https://blog.dscout.com/mobile-touches [accessed 2018-02-05] [WebCite Cache]
- Nicholas M, Samuels M. The trust divide for brands online. London, UK: Kantar TNS; 2017. URL: http://www.tnsglobal.com/intelligence-applied/the-trust-divide-for-brands-online [accessed 2018-02-05] [WebCite Cache]
- Grant J, Odlaug B, Chamberlain S. Why can't I Stop? Reclaiming Your Life From a Behavioral Addiction. Baltimore, MD: Johns Hopkins Press; 2016.
- Higuchi S, Nakayama H, Mihara S, Maezono M, Kitayuguchi T, Hashimoto T. Inclusion of gaming disorder criteria in ICD-11: a clinical perspective in favor. J Behav Addict 2017 Sep 01;6(3):293-295 [FREE Full text] [CrossRef] [Medline]
- Florence J. Mindfulness Meets Emotional Awareness, 7 Steps to Learn the Language of Your Emotions. 1st edition. London, UK: A-Z of Emotional Health Ltd; 2017.
- Lang P, Bradley M, Cuthbert B. International Affective Picture System (IAPS): Affective Ratings of Pictures and Instruction Manual. Technical Report A-6. Gainesville, FL: University of Florida; 2005.
- Baveye Y, Bettinelli JN, Dellandréa E, Chen L, Chamaret C. A large video database for computational models of induced emotion. 2013 Presented at: ACII th International Conference on Affective Computing and Intelligent Interaction; Sep 2-5, 2013; Geneva, Switzerland.
- Lane RD, Quinlan DM, Schwartz GE, Walker PA, Zeitlin SB. The Levels of Emotional Awareness Scale: a cognitive-developmental measure of emotion. J Pers Assess 1990;55(1-2):124-134. [CrossRef] [Medline]
- Betts G. The Distribution and Functions of Mental Imagery [doctoral thesis]. New York, NY: Teachers College, Columbia University; 1909.
- Sheehan PW. A shortened form of Betts' questionnaire upon mental imagery. J Clin Psychol 1967 Jul;23(3):386-389. [Medline]
- Lane RD. Neural substrates of implicit and explicit emotional processes: a unifying framework for psychosomatic medicine. Psychosom Med 2008 Feb;70(2):214-231. [CrossRef] [Medline]
- Barchard KA, Bajgar J, Leaf DE, Lane RD. Computer scoring of the Levels of Emotional Awareness Scale. Behav Res Methods 2010 May;42(2):586-595. [CrossRef] [Medline]
- Lane RD, Sechrest L, Reidel R, Weldon V, Kaszniak A, Schwartz GE. Impaired verbal and nonverbal emotion recognition in alexithymia. Psychosom Med 1996;58(3):203-210. [Medline]
- Lane RD, Sechrest L, Riedel R, Shapiro DE, Kaszniak AW. Pervasive emotion recognition deficit common to alexithymia and the repressive coping style. Psychosom Med 2000;62(4):492-501. [Medline]
- Miller GA, Levin DN, Kozak MJ, Cook EW, McLean A, Lang PJ. Individual differences in imagery and the psychophysiology of emotion. Cognit Emot 1987 Oct;1(4):367-390. [CrossRef]
- Coan J, Allen J. Handbook of Emotion Elicitation and Assessment. Series in Affective Science. Oxford, UK: Oxford University Press; 2007.
- Peres-Neto PR, Jackson DA, Somers KM. How many principal components? Stopping rules for determining the number of non-trivial axes revisited. Comput Stat Data Anal 2005 Jun;49(4):974-997. [CrossRef]
- Witten I, Frank E. Data Mining: Practical Machine Learning Tools and Techniques. 2nd edition. San Francisco, CA: Morgan Kaufmann; 2005.
- Breiman L. Random forests. Mach Learn 2001;45:5-32.
|ANS: autonomic nervous system|
|LEAS: Levels of Emotional Awareness Scale|
|QMI: Questionnaire Upon Mental Imagery|
Edited by R Calvo, M Czerwinski, G Wadley, J Torous; submitted 12.02.18; peer-reviewed by MH Yong, S Hussain; comments to author 03.04.18; revised version received 09.05.18; accepted 18.06.18; published 30.08.18
©Alicia Heraz, Manfred Clynes. Originally published in JMIR Mental Health (http://mental.jmir.org), 30.08.2018.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Mental Health, is properly cited. The complete bibliographic information, a link to the original publication on http://mental.jmir.org/, as well as this copyright and license information must be included.