A Survey on Robust Technique for Human Facial Expression Recognition

Published on July 2016 | Categories: Types, Presentations | Downloads: 42 | Comments: 0 | Views: 345


International Journal of Emerging Technologies and Engineering (IJETE) Volume 1 Issue 1 January 2014

A Survey on Robust Technique for Human Facial Expression Recognition
E.M.Uma selvi and Professor Mr.P.Kannan

Abstract
The recognition of facial expressions from images or videos, even under varying illumination conditions and geometrical attacks, has attracted great attention in the computer vision field. Facial activities can be characterized at three levels to obtain better tracking performance. First, at the bottom level, facial feature points around each facial component (e.g., mouth, nose) are considered. Second, at the middle level, facial action units (e.g., nose wrinkler, cheek raiser) are represented. Finally, at the top level, six prototypical facial expressions are represented; these are commonly used to describe human emotional states. Many techniques have been proposed to track and recognize facial expressions. This paper investigates five facial expression tracking techniques. From the comparison, detection using the proposed technique performs better than the existing method.

Keywords: Dynamic Bayesian network, classification, facial feature extraction, local binary pattern, Gabor wavelet.

I. INTRODUCTION
Face Recognition (FR) has received significant interest in pattern recognition and computer vision because of its wide range of applications, including video surveillance, biometric identification and face indexing in multimedia content. As in any classification task, feature extraction is of great importance in the FR process. Recently, local texture features have gained a reputation as powerful face descriptors because they are believed to be more robust to variations in facial pose, expression, occlusion, etc. In particular, Gabor wavelets and local binary pattern (LBP) texture features have proven to be highly discriminative for FR due to their different levels of locality, and they can be used to enhance classification and recognition performance. For a facial expression recognition system that must work under varying illumination conditions and recognize the expression even when the image suffers from geometrical attacks such as scaling and rotation, our proposed approach gives a higher recognition rate and a lower false recognition rate.

II. LITERATURE REVIEW
1. Classifying facial action
In 1999 [1], Gianluca Donato et al. explored and compared techniques for automatically recognizing facial actions in sequences of images. A database of image sequences was collected; each sequence contained six images, starting with a neutral expression and ending with a high-magnitude contraction. In this method, 12 facial actions are classified: six upper face actions and six lower face actions.

Action Unit   UPPER FACE           Action Unit   LOWER FACE
1             Inner brow raiser    17            Chin raiser
2             Outer brow raiser    18            Lip pucker
4             Brow lowerer         9             Nose wrinkler
5             Upper lid raiser     10            Upper lip raiser
6             Cheek raiser         16            Lower lip depressor
7             Lid tightener        20            Lip stretcher

1.1 Feature extraction: The located feature points are estimated using optic flow [15] and classified using a discriminant function. The optic flow computation comprises two main components:
1. Local velocity extraction using the luminance conservation constraint.
2. Local smoothing.

1.2 Classification: Classification is used for facial action recognition. A simple nearest neighbor classifier is used. The similarity measure and classifier are indicated for each technique.
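The nearest neighbor classification described in Section 1.2 can be illustrated with a minimal sketch. It assumes the cosine similarity measure that the survey reports for some of the compared representations; the function names, the toy feature vectors and the labels are hypothetical and are not taken from the surveyed paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def nearest_neighbor(query, train_features, train_labels):
    """Return the label of the training vector most similar to the query."""
    best_index = max(
        range(len(train_features)),
        key=lambda i: cosine_similarity(query, train_features[i]),
    )
    return train_labels[best_index]


# Toy data (hypothetical): two facial-action classes in a 3-D feature space.
train_features = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.0], [0.0, 1.0, 0.8]]
train_labels = ["AU1", "AU1", "AU17"]
predicted = nearest_neighbor([0.1, 0.9, 0.7], train_features, train_labels)
```

Cosine similarity ignores the length of the feature vectors, so two images that differ only by an overall scaling of their features are treated as identical.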

www.ijete.org


For each technique, the algorithms were trained and tested using a jack-knife procedure, which makes maximum use of the data available for training. This procedure was repeated for each of the 20 subjects, with one subject held out for testing in each round, and the mean classification accuracy was calculated. A number of different image analysis methods are compared for facial expression analysis.

Holistic analysis: The holistic image representations are based on Principal Component Analysis (PCA), Local Feature Analysis (LFA), Fisher's Linear Discriminants (FLD) and Independent Component Analysis (ICA).

Advantages:
1. Robustness to changes in illumination.
2. Removal of surface variation between subjects.
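The jack-knife procedure described above can be sketched as a leave-one-subject-out loop. This is a minimal illustration under stated assumptions: each round holds out all samples of one subject, and the mean of the per-subject accuracies is reported. The helper names, the squared-distance classifier and the toy data are hypothetical.

```python
def leave_one_subject_out(samples, classify):
    """Mean per-subject accuracy of a jack-knife (leave-one-subject-out) test.

    samples:  list of (subject_id, features, label) tuples.
    classify: function (train, features) -> predicted label, where train is
              a list of (features, label) pairs from the other subjects.
    """
    subjects = sorted({subject for subject, _, _ in samples})
    accuracies = []
    for held_out in subjects:
        train = [(f, y) for s, f, y in samples if s != held_out]
        test = [(f, y) for s, f, y in samples if s == held_out]
        correct = sum(1 for f, y in test if classify(train, f) == y)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


def nearest_by_squared_distance(train, features):
    """Toy classifier: label of the closest training vector."""
    best = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], features)),
    )
    return best[1]


# Hypothetical data: three subjects, two samples each.
samples = [
    ("s1", [0.0, 0.1], "upper"), ("s1", [1.0, 0.9], "lower"),
    ("s2", [0.1, 0.0], "upper"), ("s2", [0.9, 1.0], "lower"),
    ("s3", [0.2, 0.1], "upper"), ("s3", [0.2, 0.2], "lower"),
]
mean_accuracy = leave_one_subject_out(samples, nearest_by_squared_distance)
```

Because the held-out subject never appears in the training set, the reported accuracy reflects generalization to new faces rather than memorization of a particular person.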
Image analysis method   Classification performance   Similarity measure and classifier
PCA [18]                79.3%                        Euclidean distance similarity measure & template matching classifier
LFA [19]                81.1%                        Cosine similarity measure & nearest neighbor classifier
FLD [20]                75.7%                        Euclidean distance similarity measure & template matching classifier
ICA [21]                95.5%                        Cosine similarity measure & nearest neighbor classifier

1.3 Gabor filter: The steps involved in the Gabor filter representation are:
1. Gabor filters are applied to the images.
2. The outputs of the 40 Gabor filters are downsampled to reduce dimensionality.
3. The outputs are then normalized to unit length, which performs a divisive contrast normalization.

The classification accuracy is 95.5%, which is higher than that of all other approaches except ICA. This was tested using the cosine similarity measure and a nearest neighbor classifier. The investigation compared facial action classification by optic flow, local spatial representations and holistic analysis. The best performance was obtained with the Gabor filter representation and ICA, which both achieved 96% correct classification. Facial motion extraction is done using optic flow analysis, feature extraction using the Gabor wavelet filter, and face recognition using PCA. The computational complexity is reduced and illumination variations are handled well by the Gabor wavelet, but the method is not robust against rotation attacks.

2. Recognizing Action Units for Facial Expression Analysis
In 2001 [9], Jeffrey F. Cohn et al. introduced an Automatic Face Analysis (AFA) system to analyze facial expressions based on both permanent facial features (brows, eyes, mouth) and transient facial features (deepening of facial furrows) in a nearly frontal-view face image sequence. The AFA system recognizes fine-grained changes in facial expression as Action Units (AUs) of the Facial Action Coding System (FACS), instead of a few prototypic expressions. Multistate face and facial component models are proposed for tracking and modelling the various facial features, including lips, eyes, brows, cheeks, and furrows. During tracking, detailed parametric descriptions of the facial features are extracted. With these parameters as inputs, a group of action units (neutral expression, six upper face AUs and 10 lower face AUs) is recognized, whether they occur alone or in combination. The system achieved average recognition rates of 96.4 percent (95.4 percent if neutral expressions are excluded) for upper face AUs and 96.7 percent (95.6 percent with neutral expressions excluded) for lower face AUs. The recognition rate is computed in two ways: based on input samples and based on AU components.
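The two ways of counting a recognition rate mentioned for the AFA system (based on input samples and based on AU components) can be illustrated with a short sketch. The definitions used here are assumptions, since the formulas did not survive in this text: a sample counts as correct only when its whole predicted AU set matches, while the component rate counts each annotated AU separately. All names and the toy annotations are hypothetical.

```python
def recognition_rates(true_sets, predicted_sets):
    """Return (sample_rate, component_rate) for per-sample AU label sets.

    sample_rate:    fraction of samples whose predicted AU set equals the
                    annotated set exactly (assumed definition).
    component_rate: fraction of annotated AUs that were also predicted
                    (assumed definition).
    """
    exact = sum(1 for t, p in zip(true_sets, predicted_sets) if t == p)
    sample_rate = exact / len(true_sets)

    total_aus = sum(len(t) for t in true_sets)
    correct_aus = sum(len(t & p) for t, p in zip(true_sets, predicted_sets))
    component_rate = correct_aus / total_aus
    return sample_rate, component_rate


# Hypothetical annotations: the third sample misses one of its two AUs.
true_sets = [{"AU1", "AU2"}, {"AU4"}, {"AU6", "AU7"}, {"neutral"}]
predicted_sets = [{"AU1", "AU2"}, {"AU4"}, {"AU6"}, {"neutral"}]
sample_rate, component_rate = recognition_rates(true_sets, predicted_sets)
```

Under these assumptions the component-based rate is never lower than the sample-based rate on AU combinations, because a partially correct combination still earns credit for the AUs it got right.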


Feature extraction is done using permanent features and transient features. Classification is done using neural networks. This method works efficiently even with complex databases because it uses multiple features, but it requires high processing time for large neural networks, and the neural network needs training to operate.

3. Facial Action Unit Recognition by Exploiting Their Dynamic and Semantic Relationships
In 2007 [2], Yan Tong et al. noted that AU recognition is usually addressed by improving either the facial feature extraction technique or the AU classification technique. In this paper, a Dynamic Bayesian Network (DBN) is used to model the relationships among various AUs. The experiments cover spontaneous facial expressions under more realistic conditions, including illumination variation, face pose variation and occlusion. According to the Facial Action Coding System (FACS), facial behaviour is decomposed into 46 action units (AUs), so FACS is a powerful means of detecting and measuring a large number of facial expressions. An AU recognition system has two key stages:
1. Facial feature extraction stage
2. AU classification stage
In the facial feature extraction stage, holistic and local techniques are considered. In the AU classification stage, spatial and spatial-temporal approaches are considered.

3.1 Dynamic Bayesian Network: The dynamic and semantic co-occurrence/co-absence relationships are modelled well with a DBN, which is capable of representing the relationships among the different AUs used in the AU recognition process. Psychological experiments suggest that facial behaviour is recognized more accurately from an image sequence than from a still image. This experiment is based on spatial-temporal analysis that starts from a neutral expression.

3.2 AU measurement:

The feature points are extracted using Gabor wavelets. An AdaBoost classifier is then used to obtain the measurement for each AU. The AdaBoost classifier output is discretized into binary values and used by the DBN to model the dynamics of the AUs. AU relationships are observed from co-occurrence and co-absence. The co-occurrence dependency between two AUs is computed as

$$P(AU_i = 1 \mid AU_j = 1) = \frac{N_{AU_i + AU_j}}{N_{AU_j}}$$

where $N_{AU_i + AU_j}$ is the total number of positive examples of the AU combination $(AU_i + AU_j)$ and $N_{AU_j}$ is the total number of positive examples of $AU_j$. The co-absence dependency between two AUs is computed as

$$P(AU_i = 0 \mid AU_j = 0) = \frac{M_{\neg AU_i + \neg AU_j}}{M_{\neg AU_j}}$$

where $M_{\neg AU_i + \neg AU_j}$ is the total number of events in which neither $AU_i$ nor $AU_j$ occurs and $M_{\neg AU_j}$ is the total number of negative examples of $AU_j$.

In this paper, instead of recognizing each AU or AU combination individually or statically, a DBN is employed to model both the semantic and temporal relationships among various AUs. This technique is applied to human emotion recognition. Feature extraction is done using the Gabor wavelet filter, feature selection using AdaBoost, and classification using Dynamic Bayesian Networks. Illumination variations are handled well by the Gabor wavelet filter, but a lack of training images increases the false recognition rate.

4. A Statistical Method for 2-D Facial Landmarking
In 2012 [4], Hamdi Dibeklioğlu et al. described a statistical method for automatic facial-landmark localization. Facial landmarking is an important component of face registration, recognition and analysis methods. The landmarks of a face include the eye and eyebrow corners, mouth corners, centres of the irises, tip of the chin and nose tip. Landmarks are usually required for expression analysis.
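The co-occurrence and co-absence dependencies defined in Section 3.2 are simple ratios of counts, and a minimal sketch can make the counting explicit. It assumes binary per-frame AU annotations stored as dictionaries; the function names and the toy annotations are hypothetical and do not come from the surveyed paper.

```python
def cooccurrence(frames, au_i, au_j):
    """Estimate P(AU_i = 1 | AU_j = 1) from binary per-frame annotations."""
    n_j = sum(1 for frame in frames if frame[au_j] == 1)
    n_ij = sum(1 for frame in frames if frame[au_i] == 1 and frame[au_j] == 1)
    return n_ij / n_j if n_j else None  # undefined when AU_j never occurs


def coabsence(frames, au_i, au_j):
    """Estimate P(AU_i = 0 | AU_j = 0) from binary per-frame annotations."""
    m_j = sum(1 for frame in frames if frame[au_j] == 0)
    m_ij = sum(1 for frame in frames if frame[au_i] == 0 and frame[au_j] == 0)
    return m_ij / m_j if m_j else None  # undefined when AU_j is always present


# Hypothetical annotations for six frames.
frames = [
    {"AU1": 1, "AU2": 1},
    {"AU1": 1, "AU2": 1},
    {"AU1": 0, "AU2": 1},
    {"AU1": 0, "AU2": 0},
    {"AU1": 0, "AU2": 0},
    {"AU1": 1, "AU2": 0},
]
p_cooccur = cooccurrence(frames, "AU1", "AU2")  # 2 of the 3 frames with AU2
p_coabsent = coabsence(frames, "AU1", "AU2")    # 2 of the 3 frames without AU2
```

Values close to 1 suggest that the two AUs tend to appear (or be absent) together, which is the kind of relationship the DBN is meant to encode.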


Initially, facial landmark points are marked on a neutral face, and then each landmark is tracked while the face is deformed under the influence of an expression. The accuracy and robustness of this method are not significantly affected by low-resolution images, small rotations, facial expressions, and natural occlusions such as beards and moustaches.

Landmarking algorithm: The main steps in facial landmarking are face detection and illumination compensation. To reduce computational complexity, a 3-level image pyramid is prepared from the cropped high-resolution face images. The pyramid has 160 X 224, 80 X 112 and 40 X 56 pixel images, each level halving the previous one. A coarse-to-fine strategy is used for landmark detection. This 3-level strategy localizes landmarks more accurately than one-shot detection.

Accuracy of the landmarking algorithm on the Bosphorus database
LANDMARKS            SUCCESS (%)
Outer Eye Corners    97.98
Inner Eye Corners    98.46
Nose Tip             94.68
Mouth Corners        90.74
Outer Eyebrows       91.42
Inner Eyebrows       94.79
Pupils               98.84
Nose Saddles         86.51
Nostrils             99.03
Lip Outer Middles    88.76
Lip Inner Middles    89.69
Tip of chin          61.47
Mean                 92.21
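The 3-level image pyramid used by the landmarking algorithm can be sketched as repeated halving of the image. This is an illustration under stated assumptions: the sizes are read as width x height, each level is produced by averaging 2 x 2 pixel blocks (the surveyed paper may use a different smoothing filter), and a position found at a coarse level is mapped to the next finer level by doubling its coordinates. Halving 160 x 224 twice gives 80 x 112 and then 40 x 56. The function names are hypothetical.

```python
def downsample_by_two(image):
    """Halve width and height by averaging each 2x2 block (image: list of rows)."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def build_pyramid(image, levels=3):
    """Return [finest, ..., coarsest] with each level half the size of the last."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample_by_two(pyramid[-1]))
    return pyramid


def to_finer_level(x, y):
    """Map a landmark estimate from one level to the next finer level."""
    return 2 * x, 2 * y


# Synthetic 160 x 224 (width x height) image standing in for a cropped face.
base = [[float((r + c) % 256) for c in range(160)] for r in range(224)]
pyramid = build_pyramid(base)
sizes = [(len(level[0]), len(level)) for level in pyramid]
```

In a coarse-to-fine search, a landmark would first be located on the smallest level and the estimate would then be refined within a small window around `to_finer_level(x, y)` on each larger level.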

Accurate facial landmarks are essential for expression analysis. The expression recognition application is evaluated on various databases such as the Cohn-Kanade set and the BU-4DFE dataset. Illumination variations are handled well by the Gabor wavelet filter, but a lack of training images increases the false recognition rate.

5. Facial movement features
In 2011 [6], Ligang Zhang et al. proposed that Facial Expression Recognition (FER) can be performed accurately by extracting emotional features. Facial movement features, which include feature position and shape changes, are generally caused by the movements of facial elements and muscles during the course of an emotional expression.

5.1 Framework: The framework is composed of three stages:
1. Pre-processing stage
2. Training stage
3. Test stage

At the pre-processing stage, by taking the nose as the centre and keeping the main facial components inclusive, facial regions are manually cropped from the database images and scaled to a resolution of 48 X 48 pixels. No further processing is conducted, in order to imitate the results of real face detectors. Then, multi-resolution Gabor images are obtained by convolving eight-scale, four-orientation Gabor filters with the scaled facial regions.

During the training stage, a whole set of patches is extracted by moving a series of patches with different sizes across the training Gabor images. Then, a patch matching operation is proposed to convert the extracted patches to distance features. To capture facial movement features, the matching area and matching scale are defined to increase the matching space, whereas the minimum rule is used to find the best matching feature in this space. Based on the converted distance features, a set of "salient" patches is selected by an AdaBoost classifier.

At the test stage, the same patch matching operation is performed on a new image using the "salient" patches. The resulting distance features are fed into a multiclass support vector machine (SVM) to recognize six basic emotions: anger (AN), disgust (DI), fear (FE), happiness (HA), sadness (SA), and surprise (SU).

Comparison with state-of-the-art performance
Feature: Patch-based Gabor [6]; Gabor [8]; Gabor + Haar [7]; Gabor + FSLP [10]; Boosted-LBP, LBP [22]; SFRCS [23]; FEETS + PRNN [24]
JAFFE Database: 92.93% (6), 91.0% (7), 81.0% (7), 85.92% (7), 83.84% (7)
CK Database: 94.48% (6), 93.3% (7), 93.1% (7), 95.1% (6), 92.6% (6), 95.87% (5)

The figures in parentheses stand for the number of facial expressions tested. The results indicate that patch-based Gabor features perform better than point-based Gabor features in terms of extracting regional features, keeping the position information, achieving a better recognition performance, and requiring a smaller number.
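The eight-scale, four-orientation Gabor filtering applied to the 48 X 48 facial regions in Section 5.1 can be sketched with the standard real-valued Gabor function. The parameter choices below (half-octave wavelength spacing starting at 2 pixels, sigma = 0.56 x wavelength, aspect ratio 0.5, kernel radius of two sigma) are assumptions made for illustration, not values taken from the surveyed paper, and the function names are hypothetical.

```python
import math


def gabor_kernel(wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor function sampled on a square grid."""
    half = int(math.ceil(2.0 * sigma))  # assumed kernel radius: two sigma
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier wave runs along theta.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            y_rot = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(
                -(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2)
            )
            carrier = math.cos(2.0 * math.pi * x_rot / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(num_scales=8, num_orientations=4):
    """Build num_scales x num_orientations kernels (32 with the defaults)."""
    bank = []
    for s in range(num_scales):
        wavelength = 2.0 * 2.0 ** (s / 2.0)  # assumed half-octave spacing
        sigma = 0.56 * wavelength            # assumed bandwidth setting
        for o in range(num_orientations):
            theta = o * math.pi / num_orientations
            bank.append(gabor_kernel(wavelength, theta, sigma))
    return bank


def response_at(image, kernel, row, col):
    """Filter response centred at (row, col); pixels outside the image count as 0."""
    half = len(kernel) // 2
    height, width = len(image), len(image[0])
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row + dy, col + dx
            if 0 <= r < height and 0 <= c < width:
                total += image[r][c] * kernel[dy + half][dx + half]
    return total


bank = gabor_bank()
# Synthetic 48 x 48 "face" with a vertical edge down the middle.
face = [[1.0 if c >= 24 else 0.0 for c in range(48)] for r in range(48)]
features = [response_at(face, kernel, 24, 24) for kernel in bank]
```

Evaluating `response_at` at every pixel would give one Gabor image per kernel, that is, 32 Gabor images per face with the defaults; patches for the matching step would then be cut from those images.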

III. CONCLUSION
In this paper, a brief literature survey on the tracking and recognition of facial activities from images or video is presented. From the comparison, detection using a hierarchical framework based on both local binary pattern and Gabor wavelet features for facial feature tracking, and a Dynamic Bayesian Network for facial expression recognition, performs better than the existing method.

IV. ACKNOWLEDGEMENT
Apart from the efforts of the authors, the success of any work depends largely on the encouragement and guidelines of many others. We take this opportunity to express our gratitude to the people who have been instrumental in the successful completion of this work. We would like to extend our sincere thanks to all of them. We owe a sincere prayer to the LORD ALMIGHTY for his kind blessings and for giving us full support to do this work, without which this would not have been possible. We wish to take this opportunity to express our gratitude to all who helped us directly or indirectly to complete this paper.

REFERENCES
[1] Gianluca Donato, Marian Stewart Bartlett, Joseph C. Hager, Paul Ekman, and Terrence J. Sejnowski, "Classifying Facial Actions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 10, October 1999.
[2] Yan Tong, Wenhui Liao, and Qiang Ji, "Facial Action Unit Recognition by Exploiting Their Dynamic and Semantic Relationships," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 10, October 2007.
[3] Maja Pantic and Ioannis Patras, "Dynamics of Facial Expression: Recognition of Facial Actions and Their Temporal Segments from Face Profile Image Sequences," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 36, no. 2, April 2006.
[4] Hamdi Dibeklioğlu, Albert Ali Salah, and Theo Gevers, "A Statistical Method for 2-D Facial Landmarking," IEEE Transactions on Image Processing, vol. 21, no. 2, February 2012.
[5] Guoying Zhao and Matti Pietikäinen, "Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, June 2007.
[6] Ligang Zhang and Dian Tjondronegoro, "Facial Expression Recognition Using Facial Movement Features," IEEE Transactions on Affective Computing, vol. 2, no. 4, October-December 2011.
[7] H.Y. Chen, C.L. Huang, and C.M. Fu, "HybridBoost Learning for Multi-Pose Face Detection and Facial Expression Recognition," Pattern Recognition, vol. 41, pp. 1173-1185, 2008.
[8] G. Littlewort, M.S. Bartlett, I. Fasel, J. Susskind, and J. Movellan, "Dynamics of Facial Expression Extracted Automatically from Video," Image and Vision Computing, vol. 24, pp. 615-625, 2006.
[9] Ying-Li Tian, Takeo Kanade, and Jeffrey F. Cohn, "Recognizing Action Units for Facial Expression Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, February 2001.
[10] G. Guo and C.R. Dyer, "Learning from Examples in the Small Sample Case: Face Expression Recognition," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 35, no. 3, pp. 477-488, June 2005.
[11] M.S. Bartlett, H.M. Lades, and T.J. Sejnowski, "Independent Component Representations for Face Recognition," Proc. SPIE Symp. Electronic Imaging: Science and Technology; Human Vision and Electronic Imaging III, T. Rogowitz and B. Pappas, eds., vol. 3299, pp. 528-539, San Jose, Calif., 1998.
[12] M.S. Bartlett and T.J. Sejnowski, "Viewpoint Invariant Face Recognition Using Independent Component Analysis and Attractor Networks," Advances in Neural Information Processing Systems, M. Mozer, M. Jordan, and T. Petsche, eds., vol. 9, pp. 817-823, Cambridge, Mass., 1997.
[13] M.S. Bartlett, P.A. Viola, T.J. Sejnowski, J. Larsen, J. Hager, and P. Ekman, "Classifying Facial Action," Advances in Neural Information Processing Systems, D. Touretski, M. Mozer, and M. Hasselmo, eds., vol. 8, pp. 823-829, 1996.
[14] J. Bassili, "Emotion Recognition: The Role of Facial Movement and the Relative Importance of Upper and Lower Areas of the Face," J. Personality and Social Psychology, vol. 37, pp. 2049-2059, 1979.
[15] K. Mase, "Recognition of Facial Expression from Optical Flow," IEICE Trans. E, vol. 74, no. 10, pp. 3474-3483, 1991.
[16] C. Zhengdong, S. Bin, F. Xiang, and Z. Yu-Jin, "Automatic Coefficient Selection in Weighted Maximum Margin Criterion," Proc. 19th Int'l Conf. Pattern Recognition, pp. 1-4, 2008.
[17] W. Yuwen, L. Hong, and Z. Hongbin, "Modelling Facial Expression Space for Recognition," Proc. IEEE/RSJ Int'l Conf. Intelligent Robots and Systems, pp. 1968-1973, 2005.
[18] M. Turk and A. Pentland, "Eigenfaces for Recognition," J. Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[19] P.S. Penev and J.J. Atick, "Local Feature Analysis: A General Statistical Theory for Object Representation," Network: Computation in Neural Systems, vol. 7, no. 3, pp. 477-500, 1996.
[20] R.A. Fisher, "The Use of Multiple Measures in Taxonomic Problems," Ann. Eugenics, vol. 7, pp. 179-188, 1936.
[21] M.S. Bartlett and T.J. Sejnowski, "Viewpoint Invariant Face Recognition Using Independent Component Analysis and Attractor Networks," Advances in Neural Information Processing Systems, M. Mozer, M. Jordan, and T. Petsche, eds., vol. 9, pp. 817-823, Cambridge, Mass., 1997.
[22] C. Shan, S. Gong, and P.W. McOwan, "Facial Expression Recognition Based on Local Binary Patterns: A Comprehensive Study," Image and Vision Computing, vol. 27, pp. 803-816, 2009.
[23] M. Kyperountas, A. Tefas, and I. Pitas, "Salient Feature and Reliable Classifier Selection for Facial Expression Classification," Pattern Recognition, vol. 43, pp. 972-986, 2010.
[24] J.-J. Wong and S.-Y. Cho, "A Face Emotion Tree Structure Representation with Probabilistic Recursive Neural Network Modeling," Neural Computing and Applications, vol. 19, pp. 33-54, 2010.

