Procedia Computer Science 92 (2016) 63 – 71
Available online at www.sciencedirect.com

1877-0509 © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Organizing Committee of ICCC 2016.
doi: 10.1016/j.procs.2016.07.324

2nd International Conference on Intelligent Computing, Communication & Convergence (ICCC-2016)
Srikanta Patnaik, Editor in Chief
Conference Organized by Interscience Institute of Management and Technology, Bhubaneswar, Odisha, India

A Two Fold Expert System for Yawning Detection

Anitha Ca,*, M K Venkateshab, B Suryanarayana Adigac

aResearch Scholar, BMS College of Engineering / Asst. Prof., NIE, Mysuru – 570 008, INDIA (*corresponding author)
bPrincipal, RN Shetty Institute of Technology, Bengaluru – 560 098, INDIA
cFormer Chief Consultant, TCS Ltd., Bengaluru – 560 066, INDIA

Abstract

One of the prominent indicators of drowsiness is yawning. The main requirement for a real-time application such as detecting a driver's yawning is that the response of the detector be as quick as possible. A novel yawning detection system based on a two-agent expert system is proposed. The features of the face have to be extracted to detect yawning in the driver's face. In the proposed system, the first part of the detection uses the skin-detection stage of the face detection algorithm: the skin region is extracted, and boundaries are defined for all the detected skin-region blocks. The segmented face is then divided into two halves, and the lower half is considered for mouth region extraction. The presence of yawning would be indicated by a black blob in the mouth region of the binary image. However, multiple blobs may be present in the image due to the presence of non-skin-like regions around the driver's face.
So, identifying the exact position of the mouth and checking for its containment inside the face is necessary. The features extracted for yawning detection are the histogram values taken from the vertical projection of the lower part of the face.

Keywords: Yawning detection; two-fold expert system; mouth region extraction; histogram; containment; vertical projection; non-frontal face images; blob detection

1. Introduction

A driver behavior monitoring system finds application in many real-time situations, including drowsiness detection and surveillance. One such application is yawning detection while driving. Yawning is a vital indicator of drowsiness or sleepiness, and it is crucial to identify sleepiness at an early stage. Repeated yawning, most of the time, leads to a lack of alertness, and when the driver is not alert, mishaps tend to occur. To avoid mishaps caused by non-alert drivers, these vital indicators have to be detected as early as possible. As per statistics provided by the National Highway Traffic Safety Administration [1] (NHTSA), about 72,000 crashes, 800 fatalities and 44,000 injuries occurred in 2013 due to drowsy driving. Hence, there is a need to automatically alert the driver in advance. In this paper, we propose a two-fold expert system for yawning detection.
The proposed system is computationally less complex, robust, and has a short reaction time. In the next section, state-of-the-art yawning detection systems are described. Section 3 gives a detailed description of the proposed system. The results are discussed in Section 4. The last section concludes the proposed work.

2. Related Work

In [4], three different approaches for yawning detection are described: a color segmentation technique, the snake contour method, and the use of the Viola-Jones framework [9] for face and mouth detection to detect and locate the mouth region. Yawning detection was based on the openness of the mouth and the number of frames for which it remained open; a comparative histogram was used between the closed-mouth and yawning conditions. In [5], an SVM classifier is used to detect the yawning condition. The classifier is given the width-to-height ratios of the eyes and mouth, and both eye and mouth features are considered when concluding the yawning condition. In [6], the locations of the chin and nostrils are used to detect yawning: according to the authors, the distance between the chin and the nostrils increases as a person yawns. The localization of the chin and nostrils is done using a directional integral projection method. The Viola-Jones technique [9] for face and mouth detection is used in [7]: an SVM classifier is trained with mouth and yawning images, the mouth region is detected from the face using a cascade of classifiers, and the SVM classifies the mouth regions as yawning or alert. In [8], mouth corners are detected by grey-level projections and features are extracted using Gabor wavelets; Linear Discriminant Analysis (LDA) is applied to classify these Gabor features into yawning and not-yawning conditions. The authors of [8] also detect yawning based on the mouth's height-to-width ratio. Most of the techniques discussed for yawning detection make use of classifiers.
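As a concrete illustration of the ratio-based features used in [5] and [8], a mouth width-to-height ratio check might be sketched as follows. The landmark coordinates, the threshold value, and the function names here are illustrative assumptions, not the cited authors' implementations:

```python
# Illustrative sketch: yawning as a drop in the mouth's width-to-height ratio.
# The landmark points and RATIO_THRESHOLD are assumed values for illustration.

RATIO_THRESHOLD = 1.5  # assumed: an open (yawning) mouth is taller, so the ratio falls

def mouth_ratio(left, right, top, bottom):
    """Width-to-height ratio from four mouth landmark points (x, y)."""
    width = abs(right[0] - left[0])
    height = abs(bottom[1] - top[1])
    return width / height if height > 0 else float("inf")

def is_yawning(left, right, top, bottom):
    """Flag a yawn when the mouth is tall relative to its width."""
    return mouth_ratio(left, right, top, bottom) < RATIO_THRESHOLD

# Closed mouth: wide and flat, high ratio -> not yawning.
print(is_yawning((10, 50), (50, 50), (30, 48), (30, 55)))  # False
# Open mouth: height grows, ratio drops below threshold -> yawning.
print(is_yawning((10, 50), (50, 50), (30, 40), (30, 80)))  # True
```

In the cited systems this ratio is fed to a classifier over many frames; the fixed threshold above only serves to make the feature itself concrete.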
The use of classifiers adds to the complexity of the system. In this paper we propose a system that does not use any classifiers for yawning detection, which reduces the computational time and complexity. Another feature of the proposed system is that yawning detection is performed in two folds, which improves the accuracy.

3. Proposed System

A two-fold yawning detection system is proposed in this paper. The term two-fold is used because yawning is detected in two ways: first, based on skin tone detection, and second, based on blob dimensions and face containment. As the first step towards yawning detection, we separate the face from the background.

3.1. Feature Extraction

The features of the driver's face have to be extracted to detect yawning. In the proposed system, the first part of the detection uses the skin-detection stage of the face detection algorithm. The skin region is extracted, and boundaries are defined for all the detected skin-region blocks. The extracted face is then divided into two halves, and the lower half is considered for mouth region extraction. The presence of yawning would be indicated by a black blob in the mouth region of the binary image. However, multiple blobs may be present in the image due to the presence of non-skin-like regions around the driver's face. So, identifying the exact position of the mouth and checking for its containment inside the face is necessary. The features extracted for yawning detection are the histogram values taken from the vertical projection of the lower part of the face.

3.2. Design

The most important assumptions made for yawning detection are:
• The driver's face is expected to be frontal to the camera (located on the dashboard of the car).
• Head rotation up to a maximum of 45° is permitted, but not beyond (it is practically not feasible to drive with the head turned beyond 45°).

Figure 1 depicts the steps involved in the proposed yawning detection system.

Fig 1: Block diagram of the yawning detection approach.

3.3. The Approach

The method adopted is computationally simple, yet efficient and accurate: simple because no classifiers are used for yawning detection. The proposed system is a two-agent expert system to detect yawning. The first agent detects yawning based on skin tone detection; the second agent detects yawning based on blob dimensions and containment within the face region. The design steps are elaborated below:

• Detect skin tone: The technique discussed in [3] is used for skin detection. The input is resized to standard dimensions. This algorithm displays a blob at regions containing non-skin values. All the detected skin regions are bounded by rectangles and indexed, and the minimum and maximum index values are extracted. The maximum index value is used to define the region of interest (ROI) rectangle, which is the candidate face.

• Segment the detected face: To minimize noise and reduce the amount of computation, only the lower part of the face is considered. This segment contains the mouth region, which is analysed further.

• Verify the presence of blobs: The non-skin regions in the segmented part of the face are represented by black blobs. Check whether the blobs that appear are due to features inside the face or not, because noise from the surroundings should not be mistaken for non-skin regions inside the face.

• Bound the blobs: The blobs confirmed to be inside the face are bounded by rectangles as per their dimensions. The histogram is then taken from the vertical projection of the bounded blobs.

• Verify yawning through the histogram: The histogram of the blobs is obtained through the vertical projection of the segmented face.
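The projection and threshold checks in the steps above can be sketched in code. The following is a minimal illustration in Python/NumPy, assuming a binary lower-face image where skin pixels are 1 and non-skin (blob) pixels are 0; the centre band and the threshold values are assumed for illustration, not the authors' exact parameters:

```python
import numpy as np

def vertical_projection(lower_face):
    """Per-column count of non-skin pixels: the blob's vertical projection."""
    return np.sum(lower_face == 0, axis=0)

def detect_yawn(lower_face, min_length, min_sum, centre_band=0.5):
    """Confirm a yawn from the projection histogram of the lower face.

    Only columns in the central band are considered, so blobs away from the
    centre (background noise outside the mouth area) are ignored. Both the
    histogram's length (columns covered by the blob) and its sum (blob area)
    must reach their thresholds."""
    proj = vertical_projection(lower_face)
    w = len(proj)
    lo = int(w * (1 - centre_band) / 2)
    centre = proj[lo:w - lo]
    length = int(np.count_nonzero(centre))  # histogram length
    total = int(centre.sum())               # histogram sum
    return length >= min_length and total >= min_sum

# Wide open mouth: a large central blob satisfies both thresholds.
open_mouth = np.ones((10, 20), dtype=int)
open_mouth[3:9, 7:13] = 0
print(detect_yawn(open_mouth, min_length=4, min_sum=20))    # True

# Closed mouth: a thin lip line covers columns but has too small an area.
closed_mouth = np.ones((10, 20), dtype=int)
closed_mouth[5, 7:13] = 0
print(detect_yawn(closed_mouth, min_length=4, min_sum=20))  # False
```

In the full system the binary input comes from the first agent's skin detection, and the containment check described above would additionally reject blobs whose bounding rectangles fall outside the candidate face ROI.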
Histograms are considered only for blobs present at the centre and near-centre regions of the segmented face, in accordance with the assumptions stated earlier in this section. The histogram's length and sum values are verified against the threshold values; if both thresholds are satisfied, yawning is confirmed.

3.4. Dealing with Occlusions

There can be many occlusions on the driver's face, such as the mouth covered by a hand while yawning, sunglasses worn while driving, or hair covering the face. Among these, the mouth covered by a hand while yawning is the one that cannot be handled directly: the frames with the hand covering the mouth would indicate not yawning even though the driver is yawning. This can be overcome by monitoring the frames' output before and after the occluded frames. Wearing sunglasses or any other type of glasses does not affect yawning detection, since the face is segmented and only its lower part is considered. Hair covering the face also causes occlusion to some extent, as the covered skin region cannot be detected; however, this is not serious, as it does not affect the final yawning decision.

4. Results and Discussion

The methodology adopted for yawning detection was tested on the YawDD dataset [2]. Different cases were considered for evaluation, including image sequences with frontal faces, sequences with sunglasses, sequences with prescribed spectacles, sequences with mouth occlusion, and sequences with non-frontal faces. This section discusses the results in the following categories:
• Non-occlusive, frontal faces with/without glasses
• Occlusion with hand, frontal with/without glasses
• Non-frontal faces with/without glasses

4.1. Non-occlusive, frontal faces with/without glasses

The detection is accurate for non-occlusive, frontal faces.
The driver's sunglasses do not affect the response of the yawn detector, since only the lower part of the face is considered.

Fig. 2 True Positives: (a) Input image, (b) Segmented region, (c) Yawn detector's output, (d) Histogram

The histogram values in the rightmost column of Figure 2 indicate the blob measurements in the lower segmented region of the face. The presence of multiple bins indicates the presence of other blob regions, but the mouth blob is identified based on its location and containment inside the face.

4.2. Occlusion with Hand

Detection was not possible in the frames where the hand completely covered the mouth region; however, because yawning is detected in the preceding and following frames, this occlusion could be handled easily. Figure 3 shows the input and output for this case. When the driver occludes the mouth region during yawning, the blob in the mouth region is covered, the vertical projection does not reflect any value, and hence the histogram shows no variation. During these frames, the yawning condition goes undetected. Our system increments a count value for every frame in which yawning is detected based on the histogram values, so yawning is detected the moment the hand is moved away from the mouth region. The proposed system could not overcome this occlusion by itself, as the feature used for yawning detection is based on the mouth region.

Fig. 3 Occlusion with hand, Yawn Undetected: (a) Input image, (b) Segmented region, (c) Yawn detector output, (d) Histogram

4.3. Non-frontal faces

Frames with the subject completely frontal to the camera represent the typical way a driver drives a car, but some cases where the face was not completely in front of the camera during yawning were also considered.
The success rate was good for faces inclined up to 45°, but degraded for faces inclined beyond that. Figure 4 shows some of the non-frontal frames from the database. As per the assumptions made in Section 3.2, the proposed system detects yawning when the driver's face is completely facing the camera or slightly rotated. Yawning is detected with head rotation of up to 45°; beyond this, the blob's containment in the face becomes false, leading to an undetected yawning condition. Moreover, the driver cannot afford to keep his head turned away from the windshield all the time.

Fig. 4 Non-frontal frames, less than 45° detectable: (a) Input image, (b) Segmented region, (c) Yawn detector output, (d) Histogram

5. Conclusion

A robust, computationally less complex yawning detection system is proposed. The two-fold expert system detected yawning with an accuracy of 94% when evaluated on the YawDD yawning detection database [2]. The first clue to the yawning condition appears at the skin tone detection phase, where the presence of a blob indicates a possible yawn. The yawning condition is confirmed when the blob is tested for its containment within the face and the histogram values of the vertical projection of the blobs in the lower segmented face region are above the threshold value; only then does the system indicate yawning. The system successfully detects yawning when the driver's face is completely frontal or slightly rotated, but is unable to detect yawning when the driver's head is rotated beyond 45°. Yawning with a hand occluding the mouth cannot be detected while the entire mouth region is covered by the hand; the previous and next frames can be used to detect yawning in such situations. The proposed system deals with head movement of up to 45°.
The yawning detection systems discussed in [5], [6], [7] and [8] consider only frontal images. The proposed system is evaluated on the publicly available yawning detection database YawDD [2], which consists of 322 video sequences shot in a car under varying light conditions, whereas [5], [6], [7] and [8] use small private data sets captured under controlled conditions for evaluation.

Acknowledgement

The authors express their deepest gratitude to BMS College of Engineering, Bengaluru for providing the required help in conducting this research work, and to The National Institute of Engineering, Mysuru for their continuous encouragement and support.

References

1. NHTSA – National Highway Traffic Safety Administration, Washington DC. Online: http://www.nhtsa.gov/Driving+Safety/Drowsy+Driving
2. S Abtahi, M Omidyeganeh, S Shirmohammadi, B Hariri, "YawDD: A Yawning Detection Dataset", in Proceedings of ACM Multimedia Systems, Singapore, pp. 24-28, March 2014.
3. Anitha C, M K Venkatesha, B Suryanarayana Adiga, "Real Time Detection and Tracking of Mouth Region of Single Human Face", in Proceedings of the IEEE Third International Conference on Artificial Intelligence, Modelling and Simulation (AIMS2015), Malaysia, pp. 297-304, December 2015.
4. S Abtahi, B Hariri, S Shirmohammadi, "Driver Drowsiness Monitoring Based on Yawning Detection", in Proceedings of the IEEE International Instrumentation and Measurement Technology Conference, Binjiang (Hangzhou), China, pp. 1-4, May 2011.
5. T Azim, M Jaffar, A Mirza, "Automatic Fatigue Detection of Drivers through Pupil Detection and Yawning Analysis", in Proceedings of the Fourth International Conference on Innovative Computing, Information and Control, pp. 441-445, 2009.
6. L Yufeng, Z Wang, "Detecting Driver Yawning in Successive Images", in Proceedings of the First International Conference on Bioinformatics and Biomedical Engineering, 2007.
7. M Saradadevi, P Bajaj, "Driver Fatigue Detection Using Mouth and Yawning Analysis", IJCSNS International Journal of Computer Science and Network Security, Vol. 8, No. 6, pp. 183-188, June 2008.
8. X Fan, B Yin, Y Sun, "Yawning Detection for Monitoring Driver Fatigue", in Proceedings of the International Conference on Machine Learning and Cybernetics, 2007.
9. P Viola, M Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features", in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001.