Artificial neuro fuzzy logic system for detecting human emotions
Human-centric Computing and Information Sciences, volume 3, Article number: 3 (2013)
This paper presents an adaptive neuro/fuzzy system which can be trained to detect the current human emotions from a set of measured responses. Six models are built using different types of input/output membership functions and trained by different kinds of input arrays. The models are compared based on their ability to train with lowest error values. Many factors impact the error values such as input/output membership functions, the training data arrays, and the number of epochs required to train the model. ANFIS editor in MATLAB is used to build the models.
The problem of emotion detection based on measured physiological changes in the human body has received significant attention lately (Nie et al. 2011). However, the instant detection of each individual's emotions has not been thoroughly studied. The difficulty is that each person manifests emotions in a manner different from others. In social networks, for example, a person sending a message across the net may experience a certain emotional status which needs to be transmitted to the other party in the same manner that voice or images are transmitted.
The human emotional status is rather intangible, and therefore cannot be directly measured. However, these emotions can be correlated to external and/or internal factors, which are rather tangible things, and hence they can be measured and analyzed. The internal factors come from different parts of the body in several forms such as electroencephalography (EEG), heart rate (HR), heart rate variability (HRV), pre-ejection period (PEP), stroke volume (SV), systolic blood pressure (SBP), diastolic blood pressure (DBP), skin conductance response (SCR), tidal volume (Vt), oscillatory resistance (Ros), respiration rate (RR), nonspecific skin conductance response rate (nSRR), skin conductance level (SCL), finger temperature (FT), and others (Kreibig 2010).
Measurements of these factors span wide ranges, and their impact often varies from person to person and across different postures of the same person. For example, a given measurement of some factors may indicate that one person is happy, while the same measurement may reveal a rather “sad” status for another person. This kind of behavior lends itself naturally to fuzzy sets and fuzzy logic, since crisp values (zero and one, true and false, black and white) cannot represent this kind of data.
In this paper, we will use fuzzy operations to represent the knowledge about each factor. This will enable us to detect the emotion of a person using fuzzy inputs of the various factors. For example, we can use a fuzzy rule such as “IF (Temperature is High) AND (Heart Rate is High) THEN (Person is Excited).” Although fuzzy sets and operations are useful for representing the knowledge base, they fail to model the individual behavior of each and every person. Obviously, a model that is able to adapt to various categories of human responses would be preferred. Consequently, an adaptive learning mechanism is required to adjust the model if we were to cater for the differences in emotions between various humans. This requirement calls for the use of an adaptive learning system such as artificial neural networks (ANN) (Abraham 2005). However, the ANN model does not allow the use of fuzzy sets or rules, which is the more natural way of representing the relation between human emotions and human physical and physiological parameters. ANN uses exact and crisp values for representing the model’s input.
In order to utilize the benefits of both fuzzy logic and artificial neural networks, we will use the hybrid approach, which combines fuzzy logic and artificial neural networks in a single model.
The analysis and detection of human emotions using an expert system has a direct impact on several fields of human life such as health, security, social networks, gaming, entertainment, commercials and others [3, 7]. The system will enable social network (SN) participants to exchange emotions in addition to text, images, and videos.
In health-related applications, for example, the interaction between a patient and a doctor may become difficult or impossible in some critical cases where the patient cannot explain his/her feelings to the doctor (e.g. coma, infants, autism). The proposed system would enable the doctor to analyze and detect the patient’s emotions, even when the patient is unable to correctly define his/her emotional status.
In social networks, people exchange all types of information such as text, video, images, and audio. The proposed model would enable communicating parties to detect the emotional status of their partners in a seamless, automatic manner. In essence, a person chatting with a friend on the social network would be able to tell whether the other partner is sad, angry, embarrassed, afraid or happy without the need for the partner to explicitly state the emotional status.
Security is another area where the proposed model can be of significant impact. The model can be used to predict a crime before it occurs by detecting criminal behavior based on the emotional status of the person attempting to commit a crime or breach the security of a given facility. This is based on the psychological status of the criminal before committing a crime. At an airport facility, for example, the system can identify individuals with certain emotional postures based on perceived measures of the individual’s heart rate, EEG frequencies, body temperature and other measurable factors.
Gaming and entertainment is yet another area where the prediction of a person’s current emotional status is very useful. The system can detect the moods of customers based on the various factors studied and analyzed in this paper.
The rest of this paper is organized as follows. Related work is presented in Section 2. Section 3 provides an overview of the various factors which impact the human emotions. Section 4 presents the ANFIS model, used to build the neuro/fuzzy model. Section 5 presents and analyzes the model results. Conclusions are presented in Section 6.
Human emotion detection and analysis is an important field of study. Some scientists have focused on external effects on human emotions for commercial objectives, such as the use of Electroencephalogram (EEG) measurements for determining the level of attention of a subject to a visual stimulus such as a television commercial displayed on a screen.
Timmons et al. introduced a medical instrument that allows doctors to monitor their patients using sensors such as insulin and blood pressure sensors. Ohtaki et al. developed wearable instruments capable of indoor movement tracking and monitoring of concurrent psycho–physiologically indicated mental activity. The instrument used electrodermal activity (EDA), heart rate, and vascular change sensors for emotion detection. EDA can be used to detect the human emotional response by measuring the skin humidity, which reflects the activity of the eccrine sweat glands.
In another related study, Petrushin & Grove provided a method for detecting emotional state using statistical analysis of perceived measurements. In their study, they use parameters extracted from speech as input to an artificial neural network (ANN) to obtain the related emotion; the ANN was used as an adaptive classifier which is taught to recognize one emotional state from a finite number of states. Affectiva introduced a wearable sensor which is capable of quantifying human emotions such as fear, excitement, stress, boredom etc. The device can be used by doctors to analyze the emotions of autism patients by monitoring their motion and temperature.
Santosh and Scott proposed the use of a wearable wireless sensor system for continuous assessment of personal exposure to addictive substances and psychosocial stress as experienced by human participants in their natural environments. It was observed that physiological stress and response vary from person to person, and for the same person with respect to posture and physical activity; it has also been observed that human emotions can be correlated to behavioral patterns such as smoking and speech [12, 13].
Nie et al. evaluated the relationship between electroencephalography (EEG) and human emotions and concluded that EEG can be used to classify two kinds of emotions: negative and positive. Yuen et al. believe that the states of the brain change as feelings change; therefore, EEG is suitable for recording the changes in brain waves, which vary according to feelings or emotions. A neural network was used to train their model. Leupoldt et al. observed the influence of emotion on respiratory sensation, skin conductance response and EEG. Kreibig conducted a survey of several research studies to find the relationship between the autonomic nervous system (ANS) and human emotions. ANS responses include the cardiovascular, the electrodermal, and the respiratory responses. The survey shows that the ANS response appears more clearly in negative emotions than in positive emotions. Kreibig summarized the results of the survey in one table which shows the impact of several measurable factors on both negative and positive emotions. We rely on this data for the construction of our model and choose fourteen factors out of the factors listed by Kreibig. The factors are selected on the basis of their measurability and the availability of sensors for these factors. The model includes all 22 different emotions (11 positive and 11 negative emotions).
Human emotions analysis and detection
Human emotions are intangible; however, there are several measurable factors which can be used to detect them. These factors affect the emotions of different people in different ways. The amount of information presented by the various factors is enormous, which drastically increases the complexity of any model used to correlate the factors to the emotions. In order to simplify the model by reducing the amount of data required to evaluate it, we make use of fuzzy logic, where the input parameters are quantified with linguistic variables such as low, normal, and high, each of which represents a wide range of input values. Following is a brief description of the factors used in our model and their impact on human emotions [15–17].
Electroencephalography (EEG): EEG measurements [18, 19] are given in frequency ranges and can be represented with four linguistic variables, namely alpha, beta, theta and delta, with ranges 13–15, 7.5–13, 2.5–8, and <4 Hz respectively (Figure 1).
Heart Rate (HR): Three heart rate ranges are identified and categorized with the fuzzy linguistic variables low (LHR) from 20 to 70 bpm, normal (NHR) from 45 to 100 bpm, and high (HHR) from 84 to 120 bpm, as shown in Figure 2.
Heart Rate Variability (HRV): There are several frequency-domain measures which pertain to HR variability at certain frequency ranges, and these measures are associated with specific physiological processes. HRV has three ranges of frequencies, high, low and very low, with ranges 0.15–0.4, 0.04–0.15 and 0.003–0.04 Hz respectively. HRV is known to decrease with anxiety and increase with amusement. Three linguistic variables (very low, low and high) with Gaussian functions are used to represent HRV.
Pre-Ejection Period (PEP): The PEP is defined as the period between the onset of ventricular contraction and the opening of the semilunar valves, when blood ejection into the aorta commences. Three linguistic variables are used to implement PEP, namely low (LP) from 0 to 800 ms, normal (NP) from 0 to 1000 ms, and high (HP) from 500 to 1100 ms. PEP is known to increase with acute sadness, while it may either increase or decrease with joy.
Stroke Volume (SV): Stroke volume is defined as the amount of blood pumped by the left ventricle of the heart in one contraction, and its normal range is from 0 ml to 250 ml. SV remains almost invariant for the positive emotions, while it responds actively to negative emotions, e.g. it decreases with disgust, fear, and sadness. Three linguistic variables are used to implement SV, namely low (LSV) from 10 to 144 ml, normal (NSV) from 10 to 250 ml, and high (HSV) from 240 to 400 ml.
Systolic Blood Pressure (SBP): Three linguistic variables are used to implement SBP, namely low (100–121), normal (110–134), and high (120–147). SBP is known to increase with fear and anxiety.
Diastolic Blood Pressure (DBP): Three linguistic variables are used to implement DBP, namely low (LDBP) from 77 to 87, normal (NDBP) from 81 to 91 and high (HDBP) from 81 to 91. DBP increases with anger, anxiety, and disgust, while it decreases with acute sadness.
Skin Conductance Response (SCR): SCR is the phenomenon that the skin momentarily becomes a better conductor of electricity when either external or internal stimuli occur that are physiologically arousing. Three linguistic variables are used to implement SCR, namely low (0–0.2 ms), normal (0.1–1 ms) and high (0.85–1.5 ms) [20, 21].
Tidal Volume (Vt): Tidal volume represents the normal volume of air displaced between normal inspiration and expiration when extra effort is not applied. Three linguistic variables are used to implement Vt, namely rapid breath (100–150 ml/breath), quiet breath (200–750 ml/breath) and deep breath (600–1200 ml/breath) [15, 20].
Oscillatory Resistance (Ros): Three linguistic variables are used to implement Ros, namely low (0–0.49), normal (0.4–0.88) and high (0.5–1).
Respiration Rate (RR): Three linguistic variables are used to implement RR, namely low (5–10), normal (7–23) and high (15–24) breaths/min.
Nonspecific Skin Conductance Response (nSRR): nSRR is used to measure the moisture level of the skin and is implemented with three linguistic variables, low (0–2), normal (1–3) and high (2–5) per min.
Skin Conductance Level (SCL): SCL measures the electrical conductance of the skin and is used as an indication of psychological or physiological arousal. Three linguistic variables are used to implement SCL, namely low (0–2), normal (2–25) and high (20–25) ms.
Finger Temperature (FT): Three linguistic variables are used to implement FT, namely low (65–75°F), normal (75–85°F) and high (80–90°F).
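The overlapping ranges listed for each factor can be turned into membership degrees in several ways. The following is a minimal Python sketch for the heart rate ranges only. It assumes Gaussian membership functions whose center is the midpoint of each stated range and whose width σ is one quarter of the range; both choices, and the function names, are illustrative assumptions rather than the parameters used in the paper.

```python
import math

# Heart-rate ranges (bpm) as stated in the text.
HR_RANGES = {"LHR": (20, 70), "NHR": (45, 100), "HHR": (84, 120)}


def gaussmf(x, sigma, c):
    """Gaussian membership function with center c and width sigma."""
    return math.exp(-((x - c) ** 2) / (2 * sigma ** 2))


def hr_memberships(bpm):
    """Return the membership degree of a heart rate in each fuzzy set.

    Assumption (not from the paper): center = midpoint of the range,
    sigma = range width / 4.
    """
    degrees = {}
    for name, (low, high) in HR_RANGES.items():
        center = (low + high) / 2
        sigma = (high - low) / 4
        degrees[name] = gaussmf(bpm, sigma, center)
    return degrees


# A heart rate of 90 bpm belongs partly to NHR and partly to HHR.
print(hr_memberships(90))
```

Because the ranges overlap, a single reading usually has a non-zero degree in more than one set, which is what allows the rules to fire partially.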
Output membership functions
There are twenty-two different emotions impacted by the factors described above. In the neuro-fuzzy system, the output can be based on the Mamdani or the Sugeno model. We use the Sugeno model in this study. This choice does not reduce the accuracy or generality of the overall results, since both models represent the real system equally well and differ only in performance, where the Sugeno model performs better than the Mamdani model. For the Sugeno model, we use both single-point constant and linear output functions to represent each of the emotions.
Using the Sugeno model, each output is represented by exactly one fuzzy rule and one constant value. The initial distribution can be uniform across all output emotions, since the final values will be adjusted after training. In our model, we use a constant representation ranging from 1 to 22, as shown in Figure 3.
The initial choice of the output values does not have an impact on the accuracy of the model, because the training part of the model adjusts the final output values based on the training data. For example, the initial value for the anger emotion is 1 and for anxiety is 2. After training with the ANFIS model, the output values will be adjusted based on a set of training data. The Sugeno linear output model has the general form:
Y = a1X1 + a2X2 + ... + anXn + z
where ai are constant parameters and Xi are the input variables. For constant functions, the values of ai are equal to zero and hence Y = z (a constant value). For the linear function model, the ai values are entered through the ANFIS editor, while Xi are the values of the input factors.
The correlation between the input and the output variables is done through a set of fuzzy rules. Each rule uses AND/OR connectors to connect various input factors with a particular output emotion. Five of the 22 different rules used in our model are listed below for illustration. Each rule corresponds to one and only one output emotion. For example, rule 1 shows all the input factors which produce the anger emotion; the initial value of the anger emotion is set to 1. In the ANFIS model, rules are also assigned weights. The initial weights for all rules are set to 1 and will be adjusted after training the system.
If (EEG is Beta) and (HR is HHR) and (HRV is LF) and (PEP is LP) and (SV is LSV) and (SBP is HSBP) and (DBP is HBP) and (SCR is HSCR) and (Vt is RapidL) and (Ros is HRos) and (RR is HRR) and (nSRR is HnSRR) and (SCL is HSCL) and (FT is LFT) THEN (Emotion is Anger - 1) (1)
If (EEG is not Alpha) and (HR is HHR) and (HRV is VLF) and (SV is NSV) and (SBP is HSBP) and (DBP is HBP) and (SCR is HSCR) and (Vt is RapidL) and (Ros is HRos) and (RR is HRR) and (nSRR is HnSRR) and (SCL is HSCL) and (FT is LFT) THEN (Emotion is Anxiety - 2) (1)
If (EEG is not Alpha) and (HR is HHR) and (HRV is HF) and (PEP is LP) and (SV is LSV) and (SBP is HSBP) and (DBP is HBP) and (SCR is HSCR) and (Vt is RapidL) and (Ros is HRos) and (RR is HRR) and (nSRR is HnSRR) and (SCL is HSCL) and (FT is LFT) THEN (Emotion is Disgust_contamination −3) (1)
If (EEG is not Alpha) and (HR is LHR) and (PEP is LP) and (SV is NSV) and (SBP is HSBP) and (DBP is HBP) and (SCR is HSCR) and (Vt is RapidL) and (Ros is NRos) and (RR is HRR) and (nSRR is HnSRR) and (SCL is HSCL) and (FT is HFT) THEN (Emotion is Disgust_Mutilation −4) (1)
If (EEG is not Alpha) and (HR is HHR) and (HRV is VLF) and (PEP is LP) and (SBP is HSBP) and (DBP is HBP) and (SCL is HSCL) THEN (Emotion is Embarrassment −5) (1)
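To illustrate how such a rule can be evaluated, the sketch below encodes the embarrassment rule listed last and computes its firing strength. It assumes that AND is taken as the minimum of the antecedent degrees and that "not" is the complement 1 − μ; the product is another common choice for AND, and the paper does not state which operator was configured. The membership degrees are hypothetical values chosen for illustration.

```python
# The embarrassment rule from the text, as (factor, fuzzy set, negated?) triples.
RULE_5 = [
    ("EEG", "Alpha", True),   # "EEG is not Alpha"
    ("HR", "HHR", False),
    ("HRV", "VLF", False),
    ("PEP", "LP", False),
    ("SBP", "HSBP", False),
    ("DBP", "HBP", False),
    ("SCL", "HSCL", False),
]


def firing_strength(rule, degrees, weight=1.0):
    """Combine antecedent degrees with min (AND) and 1 - mu (NOT).

    Assumption: min is used for AND; the rule weight scales the result.
    """
    values = []
    for factor, fuzzy_set, negated in rule:
        mu = degrees[(factor, fuzzy_set)]
        values.append(1.0 - mu if negated else mu)
    return weight * min(values)


# Hypothetical membership degrees for one set of measurements.
degrees = {
    ("EEG", "Alpha"): 0.2,
    ("HR", "HHR"): 0.9,
    ("HRV", "VLF"): 0.7,
    ("PEP", "LP"): 0.6,
    ("SBP", "HSBP"): 0.8,
    ("DBP", "HBP"): 0.75,
    ("SCL", "HSCL"): 0.85,
}

strength = firing_strength(RULE_5, degrees)
print(strength)  # 0.6, limited by the weakest antecedent (PEP is LP)
```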
Experimental results and discussions
In this study, six models are built using the ANFIS editor in MATLAB, which supports building hybrid neuro-fuzzy systems. The models use three input membership functions, namely the Gaussian membership function (gaussmf), a combination of two Gaussian functions (gauss2mf), and a product of two sigmoid-shaped membership functions (psigmf). We use the Sugeno constant and linear output functions. A rule in the Sugeno fuzzy model has the form:
If (input 1 = x) and (input 2 = y)
THEN output z = ax + by + c.
For the constant Sugeno model, the output level z is constant c, where a = b = 0. The output level zi of each rule is weighted by firing strength wi of the rule.
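As a small worked example of this weighting, the following Python sketch combines the outputs of several rules into one crisp value. It assumes the usual Sugeno weighted-average combination, sum(wi·zi) / sum(wi); the function name and the firing strengths are hypothetical.

```python
def sugeno_output(rules, x, y):
    """Weighted average of first-order Sugeno rule outputs.

    Each rule is a tuple (w, a, b, c): the firing strength w and the
    coefficients of the rule output z = a*x + b*y + c.
    """
    total_weight = sum(w for w, _, _, _ in rules)
    if total_weight == 0:
        raise ValueError("no rule fired for this input")
    weighted_sum = sum(w * (a * x + b * y + c) for w, a, b, c in rules)
    return weighted_sum / total_weight


# Two constant rules (a = b = 0) with hypothetical firing strengths.
rules = [
    (0.75, 0.0, 0.0, 1.0),  # output level 1 (anger)
    (0.25, 0.0, 0.0, 2.0),  # output level 2 (anxiety)
]
print(sugeno_output(rules, x=0.0, y=0.0))  # 1.25
```

With constant outputs the result always lies between the smallest and largest output level of the rules that fired, which is why a trained model can return non-integer values.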
The performance metrics of the models include the trainability, the training time, and the training error. The characteristics of the six models are given in Table 1 (Model Characteristics).
For each of the models shown in Table 1, we build a neuro/fuzzy structure with 5 layers, as shown in Figure 4. The general structure of the ANFIS model is the same for all models; the models differ only in the specifications of the membership functions and the outputs.
Training is used to adjust the model parameters, particularly the input membership function parameters and the corresponding output values. For example, after training, the center (c) and the width (σ) of the Gaussian function curve are adjusted to produce the desired output. The adjustment and tuning depend on the accuracy of the training data.
Training requires two kinds of data arrays, a training array and a testing array. A training array is a two-dimensional array [m × n], where m is the number of rows containing input values, and n = a + 1, where a is the number of input factors; in our model, n = 15. Each row contains values for each of the 14 input factors; the last column holds the value of the corresponding emotion output. The testing array holds the data in the same way as the training array, but it is smaller and its data is more accurate than that of the training array. We use three sets of training arrays. The correct values training array (CTA) has 815 records; each is selected based on the rules given in Table 1. The noisy training array (NTA) has 815 records; the noise is introduced by violating the rules of Table 1. The small training array (STA) has 465 records.
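The array layout described above can be sketched in Python as follows. The records are filled with random placeholder values purely to show the [m × 15] shape; they are not the CTA, NTA or STA data, and the helper names are hypothetical.

```python
import random

N_FACTORS = 14               # input factors per record
N_COLUMNS = N_FACTORS + 1    # the last column holds the emotion output


def make_record(rng, emotion):
    """Build one placeholder record: 14 input values plus the emotion code."""
    inputs = [rng.random() for _ in range(N_FACTORS)]
    return inputs + [float(emotion)]


def split(array):
    """Separate an [m x 15] array into inputs [m x 14] and targets [m]."""
    inputs = [row[:-1] for row in array]
    targets = [row[-1] for row in array]
    return inputs, targets


rng = random.Random(0)
# 815 rows, matching the stated size of the CTA and NTA arrays.
training_array = [make_record(rng, emotion=1 + i % 22) for i in range(815)]
inputs, targets = split(training_array)
print(len(training_array), len(training_array[0]))  # 815 15
```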
Table 2 shows the parameters for the SCR factor before (σ BT, c BT) and after (σ AT, c AT) training with NTA array.
We noticed similar behavior for all factors using the three different functions and the different arrays, although the magnitude of change is different for each experiment. We also noticed that the output values of the emotions had changed from the initial values. Figures 5 and 6 show the Gaussian membership function before and after training respectively.
Training under CTA data arrays produced smaller error values than those for NTA. The STA training data produced the lowest error values. The choice of training data set has an impact on the outcome of the emotions. Using different training data sets may very well produce different emotion outputs. This is consistent, however, with the fact that different categories of people may respond differently to emotional stimuli. Table 3 and Figure 7 show the human emotions response to 10 different trials with different input values for each trial under a model that had been trained for 10,000 epochs with CTA, STA, and NTA using psigmf/Linear functions. The table demonstrates how the models behave under different training sets.
The first experiment (column 2 HE-BT) indicates that the human emotion was 22 (suspense) before training and after training it became 15.4 (happiness), 6.48 (fear) and 8.3 (sadness crying) for NTA, CTA and STA respectively. Note that different training sets produce different models. In reality, there could be different training sets representing different human behavior and different human responses. So it is essential to know which category (or training set) an individual belongs to before attempting to define his/her current emotion.
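The mapping from a numeric model output to an emotion label can be sketched as below. It assumes that the output is rounded to the nearest emotion code, which is consistent with the values quoted above but is not stated explicitly in the paper. Only the codes that can be read from the text are included; the codes for fear, sadness (crying) and happiness are inferred from the reported outputs, and the remaining codes are left unlisted.

```python
# Emotion codes that can be read from the text; codes 6, 8 and 15 are
# inferred from the reported outputs (6.48, 8.3, 15.4) and are assumptions.
KNOWN_EMOTIONS = {
    1: "anger",
    2: "anxiety",
    3: "disgust (contamination)",
    4: "disgust (mutilation)",
    5: "embarrassment",
    6: "fear",
    8: "sadness (crying)",
    15: "happiness",
    22: "suspense",
}


def nearest_emotion(output, n_emotions=22):
    """Round a crisp model output to the nearest emotion code and label it."""
    code = min(max(int(output + 0.5), 1), n_emotions)
    return code, KNOWN_EMOTIONS.get(code, "unlisted")


for value in (22, 15.4, 6.48, 8.3):
    print(value, nearest_emotion(value))
```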
In this paper, we presented a neuro/fuzzy model for the detection of human emotions using fourteen measurable human factors which are known to impact human emotions in varying degrees. The factors are converted into fuzzy variables and used in a set of rules to detect one of twenty-two different emotions. The model is trained and the output parameters representing emotions are adjusted using a set of training data. The developed models can be used in social networks such as Facebook and Twitter, in health organizations (especially for coma, infant or autism patients), in security systems for airports and other critical places, and in the gaming industry. The experiments show that the model is sensitive to the choice of input membership functions as well as the output function.
In this research, we developed a neuro-fuzzy system that deals with 14 human factors that impact human emotions. These human factors are used as input data for the system. These input factors are entered into some specific rules which correlate human emotions to one or more of these factors. A training mechanism is developed, which allows the neuro-fuzzy system to be trained in a manner to detect current human emotions. The developed system benefits the advancement of social networks, security systems, gaming industry and others.
Nie D, Wang X, Shi L, Lu B: “EEG-based emotion recognition during watching movies”. Mexico: International IEEE EMBS Conference on Neural Engineering Cancun; 2011:667–670.
Owaied H, Abu-Arr’a M: ‘Functional model of human system an knowledge base System’, the international conference on information & knowledge engineering. 2007, 158–161.
Kreibig S: ‘Autonomic nervous system activity in emotion: a review’, Biological Psychology. 2010, 84: 394–421. 10.1016/j.biopsycho.2010.03.010
Zadeh L: Fuzzy sets. U.S: National Science Foundation under Grant; 1965.
Negnevitsky M: Fuzzy expert system. In Artificial intelligent a guide to intelligent systems. 2nd edition. England: Pearson Education; 2005.
Abraham A: ‘Artificial neural networks’, handbook of measuring system design. John Wiley & Sons, Ltd: USA; 2005:901–908.
Silberstein R: Electroencephalographic attention monitor. US patent 4,955,388; 1990.
Timmons N, Scanlon W: ‘Analysis of the performance of IEEE 802.15.4 for medical sensor body area networking’, sensor and Ad Hoc communications and networks, 2004. IEEE SECON. Ireland: First Annual IEEE Communications Society Conference; 2004:1–4.
Ohtaki Y, Suzuki A, Papatetanou : ‘Integration of psycho – physiological and behavioral indicators with ambulatory tracking of momentary indoor activity assessment’. Japan: ICROS – SIC, International Joint conference 2009; 2009:499–502.
Petrushin V, Gove B: ‘Detecting emotions using voice signal analysis’. US patent 7,222,075 B2; 2002.
Affectiva: Affectiva. 2011. http://www.affectiva.com/q-sensor, viewed 11 June 2011
Santoch K, Scott M: “Auto sense: a wireless sensor system to quantify personal exposure to Psychological stress and addictive substances in natural environment”. http://sites.google.com/site/autosenseproject/research, viewed 13 June 2011
Yuen C, San W, Rizon M, Seong C: “Classification of human emotions from EEG signals using statistical features and neural network”. International Journal of Integrated Engineering 2009, 71–79.
Leupoldt A, Vovl A, Bradley M, Keil A, Lang P, Davenport P: ‘The impact of emotion on respiratory-related evoked potentials’. USA: Society for Psychophysiological Research; 2010:579–586.
Cacioppo J, Tassinary L, Berntson G: The handbook of psychophysiology. 3rd edition. 1999.
Sherwood L: Human physiology from cells to systems. 7th edition. 2010.
Osumi T, Ohira H: The positive side of psychopathy: emotional detachment in psychopathy and rational decision-making in the ultimatum game. Personality and Individual Differences 2010, 49: 451–456. 10.1016/j.paid.2010.04.016
Niedermeyer E, da Silva FL: Electroencephalography: basic principles, clinical applications, and related fields. Lippincott Williams & Wilkins; 2004.
Bos D: ‘EEG-based emotion recognition’. The influence of visual and auditory stimuli University. 2006, 734–743. Online ISBN 978-3-642-24955-6
Kaluer K, Voss A, Stahl C (Eds): In Cognitive method in social psychology. 2011.
Seo S, Lee J: ‘Stress and EEG’. Convergence and Hybrid Information Technologies Marius Crisan, In Tech; 2010:413–426.
The authors declare that they have no competing interest.
MM and OM carried a neuro-fuzzy system for detecting human emotions, participated in the sequence alignment and drafted the manuscript. MM and OM read and approved the final manuscript.