Time evolution of face recognition in accessible scenarios
Human-centric Computing and Information Sciences volume 5, Article number: 24 (2015)
Biometric recognition has shown significant advantages and is now considered a reliable solution for security systems in mobile environments. Nevertheless, because biometrics have only recently reached mobile devices, a handful of concerns regarding usability and accessibility need to be addressed in order to meet users’ requirements. This work analyses the usability and accessibility of a face recognition system used by visually impaired people, focusing on the time spent in the process, which is a critical aspect. Specifically, we cover key questions including which kind of feedback is more useful for visually impaired users and more beneficial for performance, and how performance evolves with the time spent in the recognition. Our findings suggest that several parameters, including performance, improve with the time spent in the process. Audio feedback provided in real time also yields better performance and user experience than instructions given beforehand.
Biometrics in mobile environments
The amount of sensitive data that needs to be protected, not only at the level of institutions and companies but also for ordinary people, is increasing exponentially. Nowadays, it is common to use a smartphone to access bank accounts, make payments or handle important information in general [7, 8], which creates the need to increase the security of those devices [17, 24]. The methods usually applied to secure mobile devices are based on PINs or passwords, which can be easy to forget and to forge, so other approaches are emerging. In particular, embedding biometric recognition in mobile devices has been suggested for several reasons. The first is the large number of devices already deployed: it is now difficult to find someone who does not own and use a smartphone or tablet daily. The second is that, for some biometric modalities, the capture device is already included in the mobile device (e.g., camera for face recognition, touch screen for handwritten signature recognition, microphone for speaker recognition, or swipe sensors for fingerprint verification). Handwritten signature, voice and face recognition have been suggested as the most suitable modalities.
This leads to an important reduction in deployment cost, as users already have those devices and only need to acquire the application. Other important factors are the need of security forces for portable ID devices (e.g., for identifying suspects) and the need to sign documents on the spot. Also, as users are already familiar with this kind of device, the usability level achieved could be improved although, as mentioned below, mobility also creates new usability challenges. Driven by market demand, mobile devices are improving every day, which will allow them to run powerful biometric algorithms in the near future. As an important drawback, mobile devices present security concerns related to how the operating system controls the way installed applications access memory data and communication buffers. A lack of strict control compromises the integration of biometrics, as sensitive data may be endangered.
Usability and accessibility concerns
One of the major drawbacks of using biometrics in mobile devices is the lack of usability, as this technology is a challenge for users in many cases. Almost all the work done in biometrics is devoted to improving algorithm performance and bringing the Equal Error Rate (EER) close to zero. While this kind of research is necessary, improving user interaction with systems is also extremely important, as a lack of usability could mean not only rejection of the system by users but also a reduction in the expected performance of the biometric system. In order to make biometrics easier to use and encourage their adoption, it is necessary to improve their usability and accessibility, making them reachable for a wider percentage of the population.
One of the groups usually excluded when designing security systems is disabled people, who make up around 15 % of the world population. Furthermore, it is important to highlight that every individual is potentially dependent (illness, age, pregnancy, etc.). Improving biometric designs would be beneficial not only for disabled people but for many others who find the technology complicated to use. It might be thought that biometric recognition is challenging for disabled people, but we show in this work that a correct design can make the process easy for everyone. Specifically, we focus on face recognition for visually impaired users, providing audio feedback and instructions. Face recognition has shown good acceptance and is one of the least intrusive modalities for users. Several works in the literature embed face recognition in mobile environments [4, 29, 30]. Nevertheless, work on biometrics accessibility is still scarce. One of the first approaches is a universal access control for mobile devices through fingerprint and handwritten signature developed by the authors. Other recent works on biometrics accessibility show its advantages over alternatives such as PINs or passwords. These studies point to the need for reliable solutions that could ease several procedures for people with accessibility concerns.
State of the art in biometrics usability
There are several usability works in biometrics in the literature, and most of them start from the usability definition given by ISO 9241:2010: “The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use”. The National Institute of Standards and Technology (NIST) carried out experiments emphasizing ergonomics to better capture user traits. For instance, one experiment analysed the optimal device position regarding height, and another measured the usability of the face image capturing system at the US ports of entry. One of the first usability studies in biometrics was an enrolment trial in the UK conducted by Atos. Kukula et al. designed a model, the HBSI (Human Biometric System Interaction), in which the interaction between the user and the system is studied through ergonomics, usability and signal processing. Kukula et al. also analysed the different kinds of possible errors when applying hand geometry recognition. Another example is an extensive analysis of the ergonomics of fingerprint devices.
The authors have carried out several experiments analysing various factors that affect usability in biometrics. The use of different styluses in dynamic signature verification on an iPad was studied, reaching interesting outcomes regarding ergonomics. Stress as a key usability factor in biometric recognition was also analysed, showing that mobile biometrics are reliable even when the user is in stressful situations (banks, shops, post office, etc.). The most relevant works on providing feedback to users in biometrics are the experiments made by NIST, which present a quality-driven interactive real-time user feedback mechanism for an unattended fingerprint kiosk, where the application shows pictures (visual feedback) to help users better place their fingertips on the sensor. In our case, we suggest a new feedback mode, providing audio feedback in order to guide visually impaired users through the biometric process.
Study of the time evolution
Several visually impaired people participated in this usability evaluation of face recognition, where they were asked to take self-photos with a mobile device. We prepared different experiments with different kinds of feedback, divided into two sessions. Once all the images had been taken, we analysed the face recognition performance against the time employed in the process. In mobile environments, one of the most critical aspects is the time employed in the authentication process, because long times would lead to user rejection and/or security concerns. Conversely, very quick interactions could involve misuse and recognition errors. Therefore, in this work we have focused on the efficiency (as defined by ISO 9241:2010, the time spent in tasks) of biometrics in mobile devices.
Few works address accessibility in biometrics and there is no standard methodology yet. In this work, we compare prior instructions with real-time audio feedback, following the state of the art in interface accessibility and applying it to biometrics. This work comprises an extensive analysis of the influence of time on face recognition for visually impaired users, both in performance and usability, obtaining several important outcomes regarding the importance of the received feedback, the variability of the performance and the rate of usable images as a function of the time spent in the process. This paper is organized as follows: “Evaluation set up” describes the evaluation set up. “Methodology and experimentation” explains the methodology and the experimentation. The results are presented in “Results and discussion”, and the conclusions and future work in “Conclusions and future work”.
Evaluation set up
This evaluation was carried out in the Saint Nicholas Home centre for visually impaired people (Penang, Malaysia). The scenario was a quiet room where the user was accompanied only by the operator and the ambient conditions were normal (e.g. lighting, temperature and humidity). In all the experiments the user was seated at a table and was asked to hold the camera himself and try to point it at his face in order to obtain a face image as centred as possible. The process was scheduled in two sessions, each consisting of four different experiments (from this point forward “E”) completed in this order: E1, E2, E3 and E4 (explained in “Experiments”). The camera used for capturing the face images was an Advent Slim 300K webcam connected to a PC, easily manageable for users. The images were taken in grayscale at a resolution of 640 × 480. Each experiment took as much time as the user needed (up to a timeout of 45 s) to obtain a good image of his face (quality being given by the face detector confidence). The evaluation crew was composed of 40 users (29 men and 11 women) with different ages and degrees of impairment (Fig. 1). The users’ general degree of vision was very low, as many of them could barely distinguish light. The users had different levels of knowledge about the technology and none of them had used a biometric system previously.
The experiments were planned in order to meet the initial requirements regarding usability and accessibility. The users receive a different kind of feedback in each experiment, and the final goal in all of them is to point the camera at their face. The experiments are:
E1: The user does not receive any information or feedback about how to take the self-photo.
E2: An audio feedback consisting of a sinusoidal wave sound is given by the computer. The sound is set at three different frequencies according to the face detection confidence of the acquired image. Higher face confidence involves higher frequency.
E3: The user receives information before starting about how to take the self-photo, regarding the correct distance and camera handling.
E4: The user receives the audio feedback and information about how to take the self-photo (E2+E3).
Each experiment was recorded as a video and each video comprises several images (some of them contain a face and others do not, depending on the user’s skill during the time employed to take the self-photo). An example of the images from the experiments is shown in Fig. 2.
An ideal way to carry out the experiments is by using an RCT (Randomized Controlled Trial), whereby each user is subjected to one of the four modes of feedback. However, we were concerned with two issues with this approach. The first is the fact that biometric performance is subject-dependent. Therefore, if one conducts the same experiment (the same mode of feedback) on two disjoint populations, one will get two different results. The effect of subject variability might be higher than the effect due to the mode of feedback. The second concern is that the number of blind subjects is limited to about 40. For this reason, deploying an RCT would mean dividing the population into smaller sets of 10 subjects for each mode of feedback. This is arguably not the best use of the limited samples available to us.
We have therefore opted for subjecting every volunteer to all four modes of feedback, but doing so carefully so that the effect of one mode does not influence that of another. One way to achieve this is by exploiting the natural ordering of the modes of feedback. For example, the E1 setting does not have any feedback and so should be carried out first. E2 and E3 are independent of each other because the audio feedback does not convey any information about the instructions. However, the instruction mode (E3) should take place afterwards. A potential weakness of this approach is that the volunteer may become more familiar with the device after each of the experiments, which are conducted sequentially. After a post-experimental analysis, we found that this is not a concern.
Methodology and experimentation
The main target of this work is to test several hypotheses related to efficiency in the biometric recognition context; to do so, a whole biometric recognition process was designed. Next, the suggested hypotheses, the algorithms used and the methodology followed to process the data are described.
Performance results are better in the later experiments (i.e. E4 > E3 > E2 > E1) because (a) the user has become more habituated and (b) the user has received more information about the process. This should lead to better image quality and, therefore, better results.
The audio feedback is more useful than the prior instructions because it is provided in real time and users can correct bad postures on the spot.
There are fewer performance variations in the later experiments (i.e. E4 < E3 < E2 < E1) because the user is more habituated to the system and makes fewer mistakes.
The number of valid images is higher in the later experiments because (a) the user has become more habituated and (b) the user has received more information about the process.
We describe the face recognition system, designed according to the well-known biometric recognition schema (Fig. 3). These algorithms were applied once all the images had been collected, except for the face detector, which was used to give audio feedback in real time. The acquired database contains self-images of visually impaired individuals, so many of them are rotated, blurred or not centred. In order to overcome this problem we have normalized the images and applied a face detection algorithm. Furthermore, we have used an alignment-free face recognition algorithm.
The audio feedback was provided to the users as a function of the face detection confidence, applying a Viola-Jones based algorithm (an example is shown in Fig. 4). The feedback consisted of three different sinusoidal wave sounds with three different frequencies: the lowest (1.5 kHz) indicates non face/partial face detected, the medium (4.5 kHz) indicates non frontal face and the highest (7.5 kHz) indicates frontal face. These sounds are intended to alert the users about the current quality of image and encourage them to better locate the face.
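As an illustration of this mapping, the following minimal Python sketch selects the tone frequency for each detector outcome and synthesizes the corresponding sinusoid. The three frequencies come from the text above; the class labels, the 44.1 kHz sample rate, the tone duration and the amplitude are assumptions made for the example. The step that assigns a frame to one of the three classes from the Viola-Jones confidence is not shown, because its thresholds are not given here.

```python
import math

# Frequencies (Hz) as stated in the text; the dictionary keys are
# illustrative labels chosen for this sketch.
FEEDBACK_HZ = {
    "no_or_partial_face": 1500.0,
    "non_frontal_face": 4500.0,
    "frontal_face": 7500.0,
}

# Assumed sample rate; it must exceed twice the highest tone (15 kHz)
# so that the 7.5 kHz sinusoid is represented without aliasing.
SAMPLE_RATE = 44100


def feedback_frequency(detection_class):
    """Return the feedback tone frequency for a detector outcome."""
    return FEEDBACK_HZ[detection_class]


def sine_tone(frequency_hz, duration_ms=200, sample_rate=SAMPLE_RATE,
              amplitude=0.5):
    """Generate the samples of a sinusoidal tone as a list of floats."""
    n_samples = sample_rate * duration_ms // 1000
    step = 2.0 * math.pi * frequency_hz / sample_rate
    return [amplitude * math.sin(step * i) for i in range(n_samples)]


# Example: the tone a user would hear when a frontal face is detected.
tone = sine_tone(feedback_frequency("frontal_face"))
```

In a live system the samples would be sent to an audio output while frames are being captured; that part depends on the platform and is left out of the sketch.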
One of the most detrimental factors for face recognition is lighting, so we have applied a lighting normalization to improve the performance. Then, in order to use only the images that contain faces, we have used a Viola-Jones based implementation for face detection, and finally the face images were cropped to remove noise and normalize the size. An example of the final images is in Fig. 3.
Applying classic alignment-based algorithms (e.g. PCA, LDA, etc.) could give poor results with this particular set of images because the landmarks used as references for the alignment are in many cases occluded or distorted. For instance, in many images it could be difficult to differentiate the eyes from the eyebrows. Thus, we have used SIFT, an alignment-free method initially developed for object recognition but also used in face recognition approaches. This algorithm extracts several keypoints from the high-contrast regions of the image, reducing errors due to local variations and making it resistant to distortion, orientation, changes in image scale and noise (thus meeting all our requirements). Once SIFT had been applied, each face image was represented by several descriptors, which make up the template.
The SIFT algorithm returns the number of matches between two images. A match is obtained when the distance between the first and the second nearest neighbours of a descriptor is under a given threshold. Notice that comparing image A with image B can return a different number of matches than comparing image B with image A. Once SIFT had returned the number of matches between two images, this number was normalized to the number of descriptors of both images to obtain a score:
With this kind of normalization the score will be low, as the number of matches between two images is always much lower than the number of descriptors. In order to calculate genuine and impostor rates we obtained the templates from session 1, taking the image with the highest confidence as the template for each user-experiment combination. The genuine rates were computed by matching each user-experiment template with all the images from the same user-experiment in session 2.
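The matching step can be illustrated with a small, self-contained Python sketch. It is not the implementation used in this work. It assumes that the nearest-neighbour criterion is the ratio test commonly used with SIFT (nearest distance below a fraction of the second-nearest distance), and that the score divides the number of matches by the total number of descriptors in both images. Both choices, the 0.8 ratio and the function names are assumptions for illustration only, and the toy descriptors are 2-D rather than 128-D.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_matches(desc_a, desc_b, ratio=0.8):
    """Count descriptors of image A that pass the ratio test against image B."""
    if len(desc_b) < 2:
        return 0  # a second nearest neighbour is needed for the test
    matches = 0
    for d in desc_a:
        dists = sorted(euclidean(d, other) for other in desc_b)
        nearest, second = dists[0], dists[1]
        if nearest < ratio * second:
            matches += 1
    return matches


def normalized_score(desc_a, desc_b, ratio=0.8):
    """Matches from A to B divided by the descriptor count of both images.

    ASSUMPTION: the exact normalization is not reproduced here; this
    form is one plausible reading of the description in the text.
    """
    total = len(desc_a) + len(desc_b)
    if total == 0:
        return 0.0
    return count_matches(desc_a, desc_b, ratio) / total


# Toy descriptors: comparing A with B and B with A gives different counts.
image_a = [(0.0, 0.0), (10.0, 10.0)]
image_b = [(0.0, 0.1), (10.0, 10.1), (0.0, 0.15)]
matches_ab = count_matches(image_a, image_b)
matches_ba = count_matches(image_b, image_a)
```

With these toy values the two directions return different match counts, which is the asymmetry noted above. Because the denominator counts every descriptor of both images, the score stays well below 1 on real images, where only a small share of descriptors match.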
Methodology applied to process the data
A first look at the genuine scores showed meaningful differences between users, with too little consistency to draw broad conclusions (Fig. 5, left). Thus it is necessary to normalize the results on a user-by-user basis against the baseline in order to obtain reliable results. To obtain the evolution of the scores in time, we divided each user interaction (each user-experiment) into three parts of the same length, namely t1 (first part), t2 (second part) and t3 (third part). Then, we calculated the mean score for each user-part and normalized it (μ′ ∈ R) to the same part in the baseline (E1) (Fig. 5, right).
Then, we have measured the differences in the genuine scores between t1 and t3 (the beginning and the end of the interactions).
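The procedure described above can be sketched in Python as follows. The sketch assumes that normalizing to the baseline means dividing each part's mean score by the mean of the same part in E1; the exact normalization is not reproduced here, so this is only one possible reading. The score values are invented for the example and are not data from the study.

```python
from statistics import mean


def split_in_three(scores):
    """Split a time-ordered list of scores into three near-equal parts."""
    n = len(scores)
    if n < 3:
        raise ValueError("at least three scores are needed")
    first_cut, second_cut = n // 3, 2 * n // 3
    return scores[:first_cut], scores[first_cut:second_cut], scores[second_cut:]


def part_means(scores):
    """Mean score of t1, t2 and t3 for one user-experiment."""
    return [mean(part) for part in split_in_three(scores)]


def normalize_to_baseline(experiment_scores, baseline_scores):
    """Mean of each part relative to the same part of the baseline (E1).

    ASSUMPTION: the normalization is taken to be a ratio.
    """
    exp_means = part_means(experiment_scores)
    base_means = part_means(baseline_scores)
    return [e / b for e, b in zip(exp_means, base_means)]


# Invented genuine scores for one user, ordered in time.
e1_scores = [0.02, 0.02, 0.04, 0.04, 0.05, 0.05]  # baseline
e2_scores = [0.02, 0.04, 0.06, 0.06, 0.10, 0.10]

t1, t2, t3 = normalize_to_baseline(e2_scores, e1_scores)
evolution = t3 - t1  # positive when the end of the interaction is better
```

Normalizing per user in this way removes the large differences in absolute score between users, so that only the change relative to each user's own baseline is compared.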
Results and discussion
In this section we test and discuss all the suggested hypotheses against the results obtained. First, we used the t test between the pairs t1–t3 for each experiment (except for Experiment 1, which is not normalized) to validate the data. The t test gave these p values: pExp2 = 0.12, pExp3 = 0.09, pExp4 = 0.25. None of these reaches the usual 0.05 significance level, so we conclude that the database size should be increased in order to obtain more reliable and statistically significant results. Using the current database and following the initial order of the hypotheses:
According to the results obtained, the order of the experiments (according to the performance evolution in time) is not as expected: the observed order is E4 > E2 > E3 > E1, as shown in Fig. 6. The fact that the results in E2 outperform those of E3 suggests that the audio feedback could be more useful than the prior instructions. As E2 obtained better scores, we also suggest that the audio feedback is even more effective than habituation.
The audio feedback was more successful than the instructions given: as shown in Fig. 6, E2 and E4 (the experiments where the audio feedback was provided) perform better.
Figure 6, where the variance is represented by the boxplots, shows that the performance variance does not depend on the experiment, as it does not tend to change from one experiment to another. This shows that users did not become habituated enough to the system to gain consistency in the results.
The order of the experiments with regard to the number of valid images is as expected: E4 (6598) > E3 (6216) > E2 (5538) > E1 (5238).
Figure 6 also shows that performance increases in time in all cases. Therefore, although the performance evolution is not consistent across all users, its tendency is to increase in time.
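The t test between the pairs t1–t3 mentioned at the start of this section can be illustrated with the sketch below. It assumes a paired (dependent-samples) t test, since each user contributes one t1 value and one t3 value; this is an interpretation of the description, not a confirmed detail. The input values are invented for the example. Only the t statistic and the degrees of freedom are computed, because turning them into a p value requires the Student t distribution (available, for instance, through `scipy.stats.ttest_rel`).

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, last):
    """Return (t, degrees of freedom) for a paired t test.

    `first` and `last` hold one value per user, in the same user order.
    """
    if len(first) != len(last) or len(first) < 2:
        raise ValueError("need two samples of equal length, at least 2 users")
    diffs = [b - a for a, b in zip(first, last)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (spread / math.sqrt(n))
    return t, n - 1


# Invented normalized scores for four users at t1 and t3.
t1_values = [1.0, 1.2, 0.9, 1.1]
t3_values = [1.3, 1.2, 1.1, 1.5]
t_stat, dof = paired_t_statistic(t1_values, t3_values)
```

With only a few users the degrees of freedom are small, so even a visible average improvement may not reach significance, which is in line with the call above for a larger database.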
Conclusions and future work
This work shows the strong influence of time on usability concerns in biometrics, specifically for visually impaired people: the longer the interaction, the better the performance. It also covers a gap, as we did not find in the literature any other study of the time spent in face recognition by disabled people (accessibility studies are scarce in biometrics). Regarding the feedback, previous works in biometrics (mainly carried out by NIST) were based on providing images to users. In this experiment, we have successfully applied audio feedback in real time and we suggest that it is more effective than prior instructions, and that using both modes of feedback jointly is even more effective.
Regarding accessibility, we have found large variations from one user to another when processing the received feedback: for some users the audio is more helpful, while other users process the prior instructions better. This strengthens our suggestion of providing both feedback modes at the same time (E4). We have also found that users acquired skill in taking self-photos during the process, because the number of valid images increases from one experiment to the next. Nevertheless, this is not consistent with the performance, as we obtained more valid images in E3 than in E2 but the performance in E2 is better. This reinforces our suggestion that audio feedback is more effective than instructions and even than habituation. The performance results obtained are below the state of the art, as expected given the rotated, blurred and off-centre images described above.
Our findings suggest that continued use of this technology by visually impaired people would improve the final results. It is necessary to extend this work to include other kinds of feedback (e.g. mobile vibration, different sounds, etc.). Another future project could be to implement this work in a real application for common mobile devices, including both feedback modes.
International Organization for Standardization (1999) ISO 13407:1999. Human-centred design processes for interactive systems
Abidin A, Xie H, Wong KW (2013) Touch screen with audio feedback: Content analysis and the effect of spatial ability on blind people’s sense of position of web pages. In: Research and Innovation in Information Systems (ICRIIS), 2013 International Conference, pp 548–553
Atos (2005) UKPS biometric enrolment trial
Barra S, De Marsico M, Galdi C, Riccio D, Wechsler H (2013) Fame: face authentication for mobile encounter. In: Biometric Measurements and Systems for Security and Medical Applications (BIOMS), IEEE Workshop, pp 1–7
Blanco-Gonzalo R, Diaz-Fernandez L, Miguel-Hurtado O, Sanchez-Reillo R (2014) Usability evaluation of biometrics in mobile environments. In: Hippe ZS, Kulikowski JL, Mroczek T, Wtorek J (eds.) Human-computer systems interaction: backgrounds and applications 3, advances in intelligent systems and computing, vol 300. Springer International Publishing, New York, pp 289–300. doi:10.1007/978-3-319-08491-6_24
Blanco-Gonzalo R, Sanchez-Reillo R, Miguel-Hurtado O, Bella-Pulgarin E (2014) Automatic usability and stress analysis in mobile biometrics, Image Vision Comput 32(12):1173–1180
Chagnaadorj O, Tanaka J (2014) Gesture input as an out-of-band channel. J Inform Process Syst 10(1):92–102
Cho H, Choy M (2014) Personal mobile album/diary application development. J Converg 5(1):32–37
Choong YY, Theofanos M, Guan H (2012) Fingerprint self-captures: Usability of a fingerprint system with real-time feedback. In: Biometrics: theory, applications and systems (BTAS), 2012 IEEE Fifth International Conference, pp 16–22
Cisco: Cisco visual networking index (2014) Global mobile data traffic forecast update, 2013–2018. http://www.cisco.com/c/en/us/solutions/collateral/serviceprovider/visual-networking-index-vni/white_paper_c11-520862.html. Accessed 5 Aug 2015
Guan H, Theofanos M, Choong YY, Stanton B (2011) Real-time feedback for usable fingerprint systems. In: Biometrics (IJCB), 2011 International Joint Conference, pp 1–8
Jain AK, Li SZ (2005) Handbook of Face Recognition. Springer-Verlag New York Inc, Secaucus
Kamencay P, Breznan M, Jelsovka D, Zachariasova M (2012) Improved face recognition method based on segmentation algorithm using sift-pca. In: Telecommunications and signal processing (TSP). 2012 35th International Conference, pp 758–762
Kim H, Lee SH, Sohn MK, Kim DJ (2014) Illumination invariant head pose estimation using random forests classifier and binary pattern run length matrix. Human-Centric Comput Inform Sci 4(1):9
Kukula E (2008) Design and evaluation of the human-biometric sensor interaction method. PhD thesis, Purdue University, USA
Kukula E, Sutton M, Elliott S (2010) The human biometric-sensor interaction evaluation method: Biometric performance and usability measurements. Instrum Measurement IEEE Trans 59(4):784–791
Lee IY, Park SW (2013) Anonymous authentication scheme based on NTRU for the protection of payment information in NFC mobile environment. J Inform Process Syst 9(3):461–476
Lowe D (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60(2):91–110
Marcel S, McCool C, Matějka P, Ahonen T, Černocký J, Chakraborty S et al (2010) On the results of the first mobile biometry (MOBIO) face and speaker verification evaluation. In: Ünay D, Çataltepe Z, Aksoy S (eds) Recognizing Patterns in Signals, Speech, Images and Videos, vol 6388, Lecture notes in computer science. Springer, Berlin Heidelberg, pp 210–225
Ravi V, Nishanth K (2013) A computational intelligence based online data imputation method: An application for banking. J Converg 9(4):633–650
Riley C, McCracken H, Buckner K (2007) Fingers, veins and the grey pound: Accessibility of biometric technology. In: Proceedings of the 14th European Conference on Cognitive Ergonomics: Invent! Explore!, ECCE’07, ACM, New York, pp 149–152
Sanchez-Reillo R, Blanco-Gonzalo R, Liu-Jimenez J, Lopez M, Canto E (2013) Universal access through biometrics in mobile scenarios. In: Proceedings ICCST2013: 47th annual IEEE International Carnahan Conference on Security Technology
Seo H, Choy Y (2014) Id credit scoring system based on application scoring system: Conceptual online id credit for id integrated environment. J Converg 5(1):38–42
Singh R, Singh P, Duhan M (2014) An effective implementation of security based algorithmic approach in mobile adhoc networks. Human-Centric Comput Inform Sci 4(1):7
Tan X, Triggs B (2010) Enhanced local texture feature sets for face recognition under difficult lighting conditions. Image Process IEEE Trans 19(6):1635–1650
Theofanos MF, Stanton BC, Wolfson C (2008) Usability and biometrics: Ensuring successful biometric systems. In: International Workshop on Usability and Biometrics, p 83
Theofanos MF, Stanton BC, Orandi S, Micheals R (2007) Effects of scanner height on fingerprint capture. National Institute of Standards and Technology, Gaithersburg NISTIR 7382 (2006)
Theofanos MF, Stanton BC, Micheals R Usability testing of face image capture for US ports of entry. In: 2nd IEEE International Conference on Biometrics: Theory, Applications and Systems. BTAS, pp 1–6
Tresadern P, Cootes T, Poh N, Matejka P, Hadid A, Levy C (2013) Mobile biometrics: Combined face and voice verification for a mobile platform. Pervasive Comput IEEE 12(1):79–87
Vazquez-Fernandez E, Garcia-Pardo H, Gonzalez-Jimenez D, Perez-Freire L (2011) Built-in face recognition for smart photo sharing in mobile devices. In: Multimedia and Expo (ICME), 2011 IEEE International Conference, pp 1–4
Verma P, Singh R, Singh A (2013) A framework to integrate speech based interface for blind web users on the websites of public interest. Human-Centric Comput Inform Sci 3(1):21
Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference, vol. 1, pp I-511–I-518
Wamsley A, Elliott S, Dunkelberger C, Mershon M (2011) Analysis of slap segmentation and HBSI errors across different force levels. In: Security Technology (ICCST), 2011 IEEE International Carnahan Conference, pp 1–5
Wong R, Poh N, Kittler J, Frohlich D (2010) Interactive quality-driven feedback for biometric systems. In: Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference, pp 1–7
World Health Organization (2011) World report on disability, 2013–2018
Yang X, Peng G, Cai Z, Zeng K (2013) Occluded and low resolution face detection with hierarchical deformable model. J Converg 4(2):11–14
Kowtko MA (2014) Biometric authentication for older adults. Systems, Applications and Technology Conference (LISAT), 2014 IEEE Long Island, pp 1–6
RB carried out the data analysis and the experiment planning, and wrote the paper. NP directed the whole experiment. RW gathered the database. RS reviewed the paper and gave final approval. All authors read and approved the final manuscript.
This work has been supported by the Spanish Ministry of Economy and Competitiveness by the project TEC2012-38329: “URBE-Universal Access through Biometrics in Mobile Scenarios”.
Compliance with ethical guidelines
Competing interests
The authors declare that they have no competing interests.
Blanco-Gonzalo, R., Poh, N., Wong, R. et al. Time evolution of face recognition in accessible scenarios. Hum. Cent. Comput. Inf. Sci. 5, 24 (2015). https://doi.org/10.1186/s13673-015-0043-0
- Mobile Device
- Face Recognition
- Face Image
- Disable People
- Equal Error Rate