key: cord-0487073-blm7ywel
authors: Pillay, Ashwin; Kale, Aditya; Anchan, Raj; Bhadricha, Aniket; Ram, Sangeetha Prasanna
title: Real-Time Detection of Drowsiness Among Vehicle Drivers: A Machine Learning Algorithm for Embedded Systems
date: 2021-11-04
journal: nan
DOI: nan
sha: 41adbfaede946892e950256dd517acb6bc9ae110
doc_id: 487073
cord_uid: blm7ywel

Numerous studies have established the need for safety equipment that detects drowsiness among vehicle drivers. For reliable implementations, however, such systems must employ dependable sources of stimuli; through electrooculography (EOG), the tendencies of drowsiness can be directly sensed by measuring blinks of prolonged durations. While conventional machine learning (ML) algorithms can be utilized for the detection and classification of these prolonged blinks (PBs), executing them on microcontroller units (MCUs) may prove to be a laborious task. Hence, keeping resource constraints and practicality in mind, this study proposes an ML algorithm that identifies PBs executed by an individual with desirable accuracy and precision while being efficient enough to be deployed on portable wearables using economical MCUs. Furthermore, the suggested algorithm is subjected to multiple rounds of testing, thereby establishing its feasibility as a drowsiness detection measure for wearable systems.

The eyes are among the most vital organs in the human body, serving as a sensing element in a plethora of everyday tasks, including spatial awareness and vehicular driving. More remarkably, the eyes are as much an indicator of the human state as they are a sensor: the ocular activity of a person may indicate numerous parameters such as their alertness, expressions, indications, and mental and physical conditions. To this end, several technological applications have been developed based on the analysis of various eye movements and actions [1] [2] [3].

In fact, gauging the alertness of a subject based on their eye activity is among the most elementary tasks performed by humanity. This procedure can be extended to situations where identifying a subject's drowsiness is indispensable, like driving vehicles or operating heavy machinery. Existing literature has successfully identified a high correlation between drowsiness and blinks of prolonged durations, referred to as Prolonged Blinks (PBs) in this study [4]. By detecting about two to three such PBs in a ten to fifteen-second interval using eye-activity measurement techniques such as electrooculography (EOG) [5], the probability of a driver being drowsy may be ascertained. Consequently, they can be alerted before an impending mishap.

However, vehicular driving is an activity where the eye is employed for multiple services, ranging from achieving spatial awareness in the environment to gauging various instruments on the vehicle to control it as desired. EOG will naturally account for all these eye movements; hence, isolating only those electrooculograms that correspond to PBs becomes essential for reliable detections of drowsiness. Furthermore, for dependability and affordability, such isolation and ensuing detection procedures should be carried out by portable, isolated embedded systems. However, they are generally disadvantaged in both data processing power and storage.

With these considerations, an efficient MCU algorithm for the isolation of EOG waveforms corresponding to PBs and onward detection of drowsiness is discussed in this paper, which is organised as follows: section II describes the instrument setup for detection of PBs. Section III introduces the initial preprocessing scheme employed on the EOG waves and expounds on the distinct properties of PB EOG waveforms used by the algorithm to differentiate them from other eye movements. Section IV outlines the execution of the drowsiness detection algorithm, emphasising its machine-learning (ML) phase. Section V presents the results of executing this algorithm both as a proof of concept and as a practical implementation. Finally, section VI concludes the study by presenting the conclusions and future scope of the algorithm based on the observed results.

For this study, a three-electrode assembly (two measuring electrodes and one reference electrode) was fastened onto the optimal positions of the subject's face to detect PBs [6]. These electrodes were then interfaced with a custom-made signal-conditioning PCB that selectively amplifies signals lying in the EOG spectrum while filtering out all other unwanted waveforms. The processed output was then provided as input to a 12-bit microcontroller unit (MCU) to carry out the PB isolation and detection operations. [7]. Hence, any analysis based on their amplitudes alone is ineffective. Moreover, these waves are also prone to being superimposed by noise from sources like motion artefacts, EMI and improper attachment of the electrodes to the face. In any case, reliable detection of PBs depends more on the tangible properties of the received EOG wave than on its subtler details. Hence, there is an opportunity to improve detection results by pre-processing the incoming signal before analyzing it further.

For ease of analysis and optimal utilization of MCU resources, the EOG signal is therefore first processed by a modified moving-average filter M{x(n)} of window size N = 25 such that, for every sample x(n) of the original signal, the filter yields a smoothed equivalent x_avg(n):

where:

x(n) = n-th sample of the EOG signal
r_thresh = regularization threshold; a constant that controls the degree of smoothing obtained
x_N(n) = mean of the previous N samples, stored in a buffer of size N
x_avg(n) = filtered (smoothed) equivalent of x(n)

In equation 1, r_thresh denotes the regularization threshold: a positive, floating-point, controllable parameter that determines the extent of smoothing achieved by the filter. As r_thresh is increased from zero, the degree of smoothing applied to x_avg increases. For applications requiring subtler details in the smoothed waveform, a low value of r_thresh must be set. After a round of experimental trials, r_thresh was set to 1 for this study.
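Since the expression of equation 1 is not reproduced in this text, the following Python sketch shows only one plausible reading of such a thresholded moving-average filter, offered as an assumption rather than as the authors' exact formula: a sample is replaced by the mean of the previous (up to) N samples when it lies within r_thresh of that mean, and is passed through unchanged otherwise. The function name `smooth` and this replacement rule are hypothetical.

```python
from collections import deque


def smooth(samples, window=25, r_thresh=1.0):
    """Thresholded moving average (assumed form, not the paper's equation 1).

    A sample within r_thresh of the mean of the previous samples is replaced
    by that mean; a larger deviation is kept as-is, so sharp excursions such
    as blinks survive while small ripples are flattened.
    """
    history = deque(maxlen=window)  # fixed-size buffer of raw samples
    smoothed = []
    for x in samples:
        if history:
            mean = sum(history) / len(history)
            y = mean if abs(x - mean) <= r_thresh else x
        else:
            y = x  # nothing to average against yet
        smoothed.append(y)
        history.append(x)
    return smoothed


if __name__ == "__main__":
    raw = [10, 10.5, 9.8, 10.2, 20, 10.1]
    print(smooth(raw, window=3, r_thresh=1.0))
```

Under this reading, raising r_thresh makes more samples collapse onto the running mean, which matches the behaviour described for the regularization threshold.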

Once the wave is smoothed, its first-order derivative (FOD) is obtained to remove the complications posed by the wandering baseline. 

FOD_clearance_threshold = clearance threshold that controls the sensitivity of the FOD calculation; the lower the threshold, the higher the sensitivity
N = min(n, 25)
x_FOD(n) = FOD of the filtered sample x_avg(n)

The adjustable, positive-valued FOD_clearance_threshold in equation 2 sets the sensitivity of F(n). Higher values of this threshold cause only the more sudden and significant changes of the EOG signal to appear in x_FOD. In comparison, smaller values (closer to zero) also admit the more gradual deviations into the output signal. After a round of experimental trials, FOD_clearance_threshold was set to 0.1 for this study.
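Equation 2 is likewise not reproduced here, so the sketch below is only an assumed illustration of a derivative with a clearance (dead-band) threshold: it takes a one-sample backward difference of the smoothed signal and zeroes any change whose magnitude does not exceed the threshold. The function name `first_order_derivative` is hypothetical, and the role that N = min(n, 25) plays in the original expression is not modelled.

```python
def first_order_derivative(smoothed, clearance_threshold=0.1):
    """Backward difference with a dead band (assumed form of the FOD step).

    Changes no larger than clearance_threshold are reported as zero, so a
    slowly wandering baseline produces a flat output while abrupt excursions
    remain visible.
    """
    fod = []
    previous = None
    for x in smoothed:
        delta = 0.0 if previous is None else x - previous
        fod.append(delta if abs(delta) > clearance_threshold else 0.0)
        previous = x
    return fod


if __name__ == "__main__":
    print(first_order_derivative([0, 0.05, 0.05, 1.0, 0.5]))
```

A constant offset in the input yields an all-zero output, which is the baseline-removal property the text attributes to the FOD.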

For the convenience of this research, consider equation 3

The signal r is the resultant obtained from the pre-processing stage; all ensuing analysis will be performed on r.

To isolate exclusively those sections of the EOG FOD signal r that correspond to PBs, their unique traits are identified and defined by specifying a set of states that the PB wave is observed to strictly pass through. Initially, when the subject executes no eye movements, the signal amplitude lies on the zero line (the reference line) and is designated to be in State 0. Then, as observed in fig. 2, the FOD of a PB is characterized by an initial negative excursion that rises back to the zero line after reaching its minimum. This entire negative half cycle (NHC) of the PB is defined as State 1. After traversing the NHC, there is a short period where the value of r is zero; this region is labelled the inter-half cycle (IHC) zone and is represented by State 2. Corresponding to the NHC, the PB wave also passes through a positive half cycle (PHC) in which the amplitude reaches the wave maximum; the entirety of this PHC is demarcated as State 3. Finally, the FOD drops back to the zero line after State 3, assuming the final State 4.

Since all PBs generated by the subject are bound to cause the FOD signal to pass from States 0-4 sequentially, it is a trait that can be used to distinguish and isolate PBs from all other eye movements, except for the vigorously exhibited upward gazes. An algorithm that ensures that all PB candidate waveforms pass through these defined states may, therefore, act as an initial filter to prevent non-PB eye movements from being analyzed further. This facilitates the optimum use of MCU resources and improves overall detection efficiency. The algorithm carries out the State evaluation procedure till the wave remains in State 4 continuously for 200 ms, rejecting non-candidate waves as soon as they disobey the sequential transition scheme.
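The sequential State check described above can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the signal r is taken to be exactly zero on the reference line (for instance after a dead-band derivative), the sample rate is a caller-supplied parameter, and a direct jump from the negative to the positive half cycle without an IHC zone is treated as a violation. The function name `is_pb_candidate` is hypothetical.

```python
def is_pb_candidate(r, sample_rate_hz, hold_ms=200):
    """Return True if r passes through States 0-4 in order and then stays
    in State 4 (zero line) for hold_ms; reject on any out-of-order step."""
    hold_samples = max(1, round(sample_rate_hz * hold_ms / 1000))
    state = 0
    zeros_in_state4 = 0
    for value in r:
        sign = (value > 0) - (value < 0)  # -1, 0 or +1
        if state == 0:            # resting on the zero line
            if sign < 0:
                state = 1
            elif sign > 0:
                return False      # positive excursion first: not a PB
        elif state == 1:          # negative half cycle (NHC)
            if sign == 0:
                state = 2
            elif sign > 0:
                return False      # skipped the inter-half-cycle zone
        elif state == 2:          # inter-half-cycle (IHC) zone
            if sign > 0:
                state = 3
            elif sign < 0:
                return False
        elif state == 3:          # positive half cycle (PHC)
            if sign == 0:
                state = 4
                zeros_in_state4 = 1
            elif sign < 0:
                return False
        else:                     # State 4: must stay on the zero line
            if sign != 0:
                return False
            zeros_in_state4 += 1
        if state == 4 and zeros_in_state4 >= hold_samples:
            return True
    return False                  # wave ended before State 4 was held


if __name__ == "__main__":
    wave = [0] * 5 + [-1, -2, -1] + [0, 0] + [1, 2, 1] + [0] * 20
    print(is_pb_candidate(wave, sample_rate_hz=100))
```

As the text notes, a vigorous upward gaze can follow the same State sequence, so a check of this kind only acts as a first filter ahead of the feature-based classification.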

Like most physiological signals [8], the characteristics of EOG signals vary from one individual to another. Such variations are the result of multiple factors, including differences in muscle strength, age, fatigue or stress level, and gender [9] [10] [11]. As a result, while PBs generated by different subjects have the same skeleton, as described by the four States defined previously, their subtleties may differ. Any outright classification scheme that does not account for these distinctions may hence not yield the reliability expected in practical applications; for example, it may misclassify eye movements with similar EOG waveforms, like the upward gaze and the PB.

Consequently, to understand and analyze subject-specific signals, the classification algorithm must employ artificial intelligence (AI). However, for an isolated embedded system, executing conventional machine learning (ML) algorithms like multilayer artificial neural networks (ANNs) may prove to be a strenuous task. Therefore, to suit the portable detection systems in focus of this study, a novel algorithm is developed. It first learns the distinct features of the EOG signals generated by a given user and subsequently employs this knowledge to classify PBs in real time without exhausting the available resources at any point during its execution.

The learning scheme employed is inspired by the widely established and efficient strategy used to train fingerprint detectors in smartphones [12]. Initially, wavelet_buffer, a matrix that will act as a record of the most identical PBs, is defined in the MCU memory. As a first step, the user is expected to execute a set number of training repetitions of PBs (PB_training_reps). In this period (lperiod_1), the algorithm implemented by the MCU assumes all eye movements that undergo a sequential transition from State 0-4 to be PBs, and the samples of the corresponding EOG wave are stored in an array wavelet. If wavelet_buffer is empty, wavelet is added to it without further analysis; however, if it is populated, the most similar wave among its members is compared with wavelet, and the similarity or correlation coefficient between them is calculated and assigned to similarity_measure. The most similar wave is obtained conveniently by determining the medoid wave of wavelet_buffer [13], while the comparison is quantified by DSP algorithms like the maximum of the normalised cross-correlation [14] or normalised DDTW with a Sakoe-Chiba based constraint [10] [15] [16], depending on the suitable compromise between computational complexity and accuracy [17].
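The medoid lookup and the cross-correlation comparison described above can be sketched in plain Python as follows. This is an illustrative, brute-force version: the medoid is found by exhaustive pairwise comparison rather than by the sub-quadratic method cited in the text, the similarity is the maximum of the mean-removed normalised cross-correlation over all lags, and the helper names `ncc_max`, `medoid_index` and `learn_wave` are hypothetical.

```python
import math


def ncc_max(a, b):
    """Maximum over all lags of the normalised cross-correlation of a and b."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    a0 = [v - mean_a for v in a]
    b0 = [v - mean_b for v in b]
    norm = math.sqrt(sum(v * v for v in a0) * sum(v * v for v in b0))
    if norm == 0:
        return 0.0  # a flat wave carries no shape information
    best = -1.0
    for lag in range(-(len(b0) - 1), len(a0)):
        total = 0.0
        for j, bv in enumerate(b0):
            i = j + lag
            if 0 <= i < len(a0):
                total += a0[i] * bv
        best = max(best, total / norm)
    return best


def medoid_index(waves):
    """Index of the wave with the highest total similarity to all others."""
    def total_similarity(i):
        return sum(ncc_max(waves[i], w) for j, w in enumerate(waves) if j != i)
    return max(range(len(waves)), key=total_similarity)


def learn_wave(wavelet_buffer, wavelet):
    """Compare wavelet with the buffer medoid (if any), then store it."""
    similarity_measure = None
    if wavelet_buffer:
        medoid = wavelet_buffer[medoid_index(wavelet_buffer)]
        similarity_measure = ncc_max(medoid, wavelet)
    wavelet_buffer.append(wavelet)
    return similarity_measure


if __name__ == "__main__":
    buffer = []
    blink = [0, -1, -2, -1, 0, 1, 2, 1, 0]
    print(learn_wave(buffer, blink))             # None: buffer was empty
    print(learn_wave(buffer, [0, 0] + blink))    # close to 1.0: same shape, shifted
```

Taking the maximum over lags makes the measure insensitive to where the wave starts within its recording, which matters because real-time waves are not aligned.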

The correlation coefficient is only one among six features (total_features = 6) analysed in wave r for this study; these are outlined in table I. Depending on the strictness needed for the classification of waves, the number of features analysed can be varied as required. For each feature during lperiod_1, the mean and standard deviation (SD) parameters are updated as per equations 4 and 5. The updated values of these statistics are subsequently maintained in their respective buffers.

For each feature,

where; N = current reading number 
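Equations 4 and 5 are not reproduced in this text. As a stand-in, the sketch below uses Welford's online algorithm, a standard way to update a mean and standard deviation one reading at a time without storing past readings; it is not claimed to be the authors' exact update rule. The class name `RunningStats` is hypothetical, and the population form of the SD (division by N) is an assumption.

```python
import math


class RunningStats:
    """Incrementally maintained mean and SD of one feature (Welford's method)."""

    def __init__(self):
        self.n = 0          # current reading number N
        self.mean = 0.0
        self._m2 = 0.0      # running sum of squared deviations from the mean

    def update(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (value - self.mean)

    @property
    def sd(self):
        # Population SD; use (self.n - 1) in the divisor for the sample SD.
        return math.sqrt(self._m2 / self.n) if self.n else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    for reading in [2, 4, 4, 4, 5, 5, 7, 9]:
        stats.update(reading)
    print(stats.mean, stats.sd)
```

Only three numbers per feature need to be kept in memory, which suits the RAM constraints discussed later in the paper.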

Finally, wavelet is appended as a new entry of wavelet_buffer, and the wave counter total_readings is incremented before analyzing the next eye movement. Following lperiod_1, the user must generate a predefined number, up_training_reps, of upward gazes. In this phase (lperiod_2), the same set of features is analyzed in wave r. Here, the statistical parameters anti_mean and anti_SD for each feature are updated similarly to the mean and variance of PBs, respectively (referring to equations 4 and 5). While the analysis of PBs in lperiod_1 is used to set the thresholds for their accurate real-time classification, the data acquired in lperiod_2 is used to identify the idiosyncrasies of up movements and thereby make a more comprehensive differentiation between them and PB waveforms. After a round of experimental trials, the values of both PB_training_reps and up_training_reps were set to ten.

Together, lperiod_1 and lperiod_2 constitute the Learning Period of the algorithm, which should last around one to two minutes depending upon the subject's consistency. After this period, the thresholds for classifying PBs from other eye movements can be determined. For each feature examined, the upper (ut) and lower (lt) thresholds and the upper (uat) and lower (lat) anti-thresholds are calculated as per equations 6a and 6b using the obtained mean, variance, anti_mean and anti_SD. The threshold adjusting scheme also accounts for situations where the threshold bands of the PB and up waves of a feature merge, assigning the midpoint of the merging region as the corresponding threshold for that PB feature, as proposed by equations 7a and 7b.
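Because equations 6a to 7b are not reproduced in this text, the following sketch illustrates the described idea under explicit assumptions: each band is taken as mean ± k·SD with a hypothetical width factor k = 2, and when the PB band and the upward-gaze band overlap, the PB threshold facing the gaze band is moved to the midpoint of the overlapping region. The function name `feature_thresholds` and the value of k are not from the paper.

```python
def feature_thresholds(mean, sd, anti_mean, anti_sd, k=2.0):
    """Return (lt, ut) for one feature, adjusted for overlap with the anti band.

    k is an assumed band half-width in units of SD.
    """
    lt, ut = mean - k * sd, mean + k * sd                    # PB band
    lat, uat = anti_mean - k * anti_sd, anti_mean + k * anti_sd  # up-gaze band

    overlap_low, overlap_high = max(lt, lat), min(ut, uat)
    if overlap_low < overlap_high:                           # the bands merge
        midpoint = (overlap_low + overlap_high) / 2.0
        if anti_mean <= mean:
            lt = midpoint    # gaze band sits below: raise the lower threshold
        else:
            ut = midpoint    # gaze band sits above: lower the upper threshold
    return lt, ut


if __name__ == "__main__":
    print(feature_thresholds(mean=10, sd=1, anti_mean=6, anti_sd=1.5))
```

With the values in the usage line, the PB band [8, 12] overlaps the gaze band [3, 9] on [8, 9], so the lower threshold moves to 8.5.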

Once the Learning Period is completed and the thresholds are set, the system is ready to detect the occurrence of ensuing PBs in real time.

The thresholds lt and ut set in the Learning Period are used to classify any PBs generated by the same subject in real time, referred to as the Operational Period. For any generated EOG wave to be considered a PB, it should first pass through State 0-4 and then have all the aforementioned features within the tolerable limits set by the corresponding thresholds for each feature. In practice, it is observed that true PBs generated by the user may, in some instances, have some of the considered attributes outside their thresholds. To prevent such PBs from being misclassified, fuzzy logic is employed over crisp logic, ensuring that a holistic classification is made based on the closeness of all the features of the EOG wave to the existing dataset.

For an EOG wave obtained during the Operational Period, the membership degree of each feature on its corresponding PB set is obtained using a Gaussian membership function given by equation 8. The membership degrees of all the features considered are then added up and assigned to pass_sum. The defuzzification necessary for the classification of PBs is finally performed by equation 9, resulting in a binary output for the entire analysis that answers whether the eye movement generated was a PB or not.

Fig. 3. Gaussian function used for determining the membership of a candidate EOG wave's feature on its corresponding PB set.

For each feature,
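Equations 8 and 9 are not reproduced in this text, so the sketch below is an assumed illustration of the described scheme: each feature receives a Gaussian membership degree centred on its learned mean, the degrees are summed into pass_sum, and the wave is declared a PB when pass_sum reaches a cutoff. The cutoff, expressed here as a hypothetical fraction `pass_fraction` of the number of features, and the function names are assumptions.

```python
import math


def gaussian_membership(value, mean, sd):
    """Degree (0..1) to which value belongs to the PB set of one feature."""
    if sd == 0:
        return 1.0 if value == mean else 0.0
    return math.exp(-((value - mean) ** 2) / (2.0 * sd ** 2))


def classify_pb(features, means, sds, pass_fraction=0.6):
    """Sum the memberships and defuzzify into a binary PB / not-PB decision."""
    pass_sum = sum(
        gaussian_membership(x, m, s) for x, m, s in zip(features, means, sds)
    )
    return pass_sum >= pass_fraction * len(features), pass_sum


if __name__ == "__main__":
    means = [10.0, 5.0, 0.9, 120.0, 3.0, 40.0]
    sds = [1.0, 0.5, 0.05, 10.0, 0.4, 5.0]
    candidate = [10.2, 5.1, 0.7, 118.0, 3.1, 41.0]  # one feature far off
    print(classify_pb(candidate, means, sds))
```

In the usage example one feature lies four SDs from its mean and contributes almost nothing, yet the wave is still accepted because the other five agree closely; this is the tolerance that a crisp all-features-in-range rule would not provide.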

V. OPTIMIZING THE DETECTION ALGORITHM FOR MCUS

The algorithm described in sections III and IV was executed by the TIVA TM4C123G, a 32-bit ARM® Cortex® M4 MCU operating at 80 MHz, coupled with 32 kB of SRAM and 2 kB of EEPROM [18]. Considering the EOG wave samples to be of the two-byte short datatype and each EOG wave to amass about 500 samples on average, the MCU (executing the unoptimized program) was observed to run out of available RAM after recording just 8-15 readings. Such performance is unsuitable, since the accuracy in determining the medoid wave (for the Similarity Coefficient calculation) is sufficient only when enough PB waves are maintained in the buffer. Moreover, this calculation occurs for every new reading, so the MCU must always keep this buffer in memory for quick detection of drowsiness.
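The memory figures quoted above can be checked with a few lines of arithmetic. The snippet assumes that 32 kB means 32 × 1024 bytes and ignores everything else that competes for RAM (stack, filter buffers, statistics), so the ceiling it prints is an upper bound rather than an achievable figure.

```python
SRAM_BYTES = 32 * 1024        # assumed interpretation of "32 kB"
BYTES_PER_SAMPLE = 2          # two-byte short
SAMPLES_PER_WAVE = 500        # average wave length quoted in the text

bytes_per_wave = BYTES_PER_SAMPLE * SAMPLES_PER_WAVE
max_waves_if_ram_were_empty = SRAM_BYTES // bytes_per_wave


def ram_fraction(wave_count):
    """Share of total SRAM occupied by wave_count average-length waves."""
    return wave_count * bytes_per_wave / SRAM_BYTES


if __name__ == "__main__":
    print(bytes_per_wave, max_waves_if_ram_were_empty)
    print(round(ram_fraction(8), 3), round(ram_fraction(15), 3))
```

Each wave costs about 1000 bytes, so even an otherwise empty RAM could hold at most 32 of them; the 8-15 waves observed before exhaustion correspond to roughly a quarter to just under a half of the SRAM being consumed by raw wave storage alone.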

Memory fragmentation is another issue that needs to be addressed to ensure the long-term viability of the MCU algorithm. Considering the limited RAM, non-contiguous data structures like linked lists are not practical due to the 4 bytes utilized per pointer. For contiguous arrays, creating space for new waves by freeing the memory occupied by another wave is efficient only when such waves have the same length. This is a rare occurrence, since real-time PBs can have any number of samples per wave. Fragmentation may also occur while using moving buffers for mean filtering and FOD calculation.

One of the solutions identified for the aforementioned problems was to optimize the datatype used for each variable. For example, most variables were assigned the short datatype, which occupies two bytes, instead of the default int datatype, which occupies four bytes. Apart from minimizing the number of float calculations that are part of the algorithm, all data requiring continuously moving buffers are stored in circular buffers that occupy a predefined range of RAM. Furthermore, as a compromise between linked lists and plain arrays, a hashed array tree (HAT) with leaves of a hundred elements was used to store EOG waves. Using HATs with a fixed leaf size prevents memory fragmentation while ensuring that no more than 49 memory elements are left unutilized at any time.
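The storage idea can be illustrated with a simplified two-level structure in Python. It is only a sketch of the fixed-leaf principle, not a full hashed array tree (the top-level directory here is an ordinary growable list, and no resizing policy is modelled); the class name `LeafStore` is hypothetical. Samples are held in `array('h')` leaves, i.e. signed two-byte shorts, mirroring the datatype choice described above.

```python
from array import array


class LeafStore:
    """Append-only sample store built from equally sized leaves."""

    def __init__(self, leaf_size=100):
        self.leaf_size = leaf_size
        self.leaves = []      # top-level directory of leaves
        self.length = 0       # number of samples stored

    def append(self, sample):
        leaf_index, offset = divmod(self.length, self.leaf_size)
        if leaf_index == len(self.leaves):
            # Every leaf has the same size, so a freed leaf can always be
            # reused by any other wave without leaving odd-sized holes.
            self.leaves.append(array("h", [0]) * self.leaf_size)
        self.leaves[leaf_index][offset] = sample
        self.length += 1

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError("sample index out of range")
        leaf_index, offset = divmod(index, self.leaf_size)
        return self.leaves[leaf_index][offset]

    def unused_slots(self):
        return len(self.leaves) * self.leaf_size - self.length


if __name__ == "__main__":
    wave = LeafStore(leaf_size=100)
    for n in range(250):
        wave.append(n)
    print(len(wave.leaves), wave.unused_slots(), wave[249])
```

Indexing costs one division and two lookups, which keeps random access close to that of a plain array while avoiding a pointer per sample.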

Another crucial factor governing the suitability of a candidate MCU program for time-critical applications like drowsiness detection and alerting is its execution time. Hence, the various calculations performed by the code were optimized to the greatest extent, involving minimal copying of data among the sub-modules. For example, phase calculations were omitted in the cross-correlation algorithm used for EOG wave comparison. The calculations were also rewritten to use integer-based arithmetic only. Such optimizations ensure that the user is alerted as soon as the established criteria of drowsiness are satisfied.

While laboratory evaluations of the algorithm yielded acceptable results, physically testing it on vehicle drivers became difficult due to the COVID-19 related lockdowns in India. However, to demonstrate its efficacy, a simulator was used to initially train the algorithm with PBs and upward gazes and then randomly generate any common eye movement to evaluate its performance.

The simulated signals for each eye movement have more than ninety per cent correlation with the actual EOG signals for that movement. Additionally, the simulator was programmed to introduce minor variations in the simulated signals to resemble the fluctuations in vigour and speed that a human subject is likely to exhibit while using the wearable. For each eye movement that the simulator generated, the algorithm is expected to correctly classify it as a PB or not. The result of each comparison was then recorded as one of a correct classification, a false positive, a true negative or an unclassified detection (when the algorithm did not detect the simulated wave). To demonstrate the algorithm's suitability for different kinds of subjects, the simulator was used to recreate the behaviour of fifteen different subjects. The algorithm was individually trained with ten PBs and ten upward gazes for each subject profile and then executed for more than 300 readings. To ensure that the results resembled real conditions as far as possible, the entire simulation process and the detection algorithm were run on the TIVA TM4C123G MCU. The results of this experiment are summarized in table II.

From the aforementioned observations, it can be concluded that the algorithm discussed in this study can detect PBs with an average accuracy of 86%. Moreover, the experiments conducted were documented for detecting individual PBs. Since drowsiness as a phenomenon is associated with the repeated exhibition of PBs (about two to three such PBs in a ten to fifteen-second interval), the overall effect of true negatives is averaged out, further reducing their repercussions on reliable drowsiness detection.
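The step from individual PB detections to a drowsiness alert can be sketched as a sliding-window counter. The text gives the criterion only as a range (about two to three PBs in ten to fifteen seconds), so the defaults below are hypothetical choices within that range, and the class name `DrowsinessMonitor` is illustrative.

```python
from collections import deque


class DrowsinessMonitor:
    """Flags drowsiness when enough PBs fall inside a sliding time window."""

    def __init__(self, min_pbs=3, window_s=15.0):
        self.min_pbs = min_pbs
        self.window_s = window_s
        self._times = deque()   # timestamps (seconds) of recent PBs

    def on_pb(self, timestamp_s):
        """Record a detected PB; return True if the alert criterion is met."""
        self._times.append(timestamp_s)
        # Discard PBs that have slid out of the window.
        while timestamp_s - self._times[0] > self.window_s:
            self._times.popleft()
        return len(self._times) >= self.min_pbs


if __name__ == "__main__":
    monitor = DrowsinessMonitor()
    for t in (0.0, 5.0, 12.0, 40.0):
        print(t, monitor.on_pb(t))
```

Because an alert needs several PBs in quick succession, a single missed or spurious classification rarely changes the outcome, which is the averaging effect described above.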

Moreover, unlike PBs, the upward gaze is an eye movement that a vehicle driver is far more likely to perform. This may be due to actions like periodically glancing at the rear-view mirror or looking at the signal indicated by traffic lights. Hence, it is necessary to ensure that no upward gazes are incorrectly classified as PBs (false positives); failing to do so might repeatedly trigger the safety mechanism of the wearable, which can affect the efficiency and concentration of the user while driving. From the results obtained, it can be seen that the probability of false positives being generated by the algorithm is low. Hence, it is unlikely that the system falsely classifies consecutive upward gazes as PBs.

Additionally, one of the most crucial metrics that may be used to benchmark safety systems is the time it requires to detect a hazardous situation. In this case, the algorithm manages to make its decisions within half a second when executed on the TIVA TM4C123G. Hence, even while using an economical MCU, the driver is provided with sufficient time to recover from their drowsiness and prevent an impending mishap from happening.

While the detection algorithm incorporates multiple optimizations and tradeoffs for the calculation and storage of data on isolated embedded systems, it still manages to use ML to address the subject-specific characteristics present in EOG signals. Additionally, considering the high accuracy achieved and the low time required for detecting PBs, it can be concluded that the algorithm proposed in this study may be used as part of an economical wearable system to detect drowsiness among vehicle drivers.

There are some aspects where the algorithm could be modified to enhance its performance. Over time, an individual may exhibit varying physical and mental health conditions due to causes like fever, fatigue, age, distraction and stress. In these conditions, the eye movements they exhibit may not be of the same vigour and speed as those exhibited during the Learning Period [8] [9]. If the algorithm can fine-tune the feature thresholds according to the short-time EOG wave trends observed during these situations, reliability in detecting PBs can be ensured irrespective of user conditions. Theoretically, such algorithms would run indefinitely and may be developed using statistical means similar to those discussed in section IV-A a but gathered over a short-time window.

Additionally, it must be noted that the algorithm discussed in this study assumes a single wearable unit to be used by a single user only (for whom it is trained). However, adding a user profiling system (availing auxiliary permanent memory storage for the MCU) allows multiple user data to be maintained simultaneously. If implemented, a single wearable can then be used interchangeably by numerous individuals simply by switching to their user profile through button-based menu toggling.

[1] Eye movement driven head-mounted camera: it looks where the eyes look

[2] EOG signal processing module for medical assistive systems

[3] A device controlled using eye movement

[4] Narrative review: Do spontaneous eye blink parameters provide a useful assessment of state drowsiness?

[5] Electrooculography-based continuous eye-writing recognition system for efficient assistive communication systems

[6] Optimal Bipolar Lead Placement in Electrooculography (EOG): A Comparative Study with an Emphasis on Prolonged Blinks

[7] Baseline wandering removal from ECG signal by wandering path finding algorithm

[8] Physiological Signal Based Biometrics for Securing Body Sensor Network

[9] Effects of aging on eye movements in the real world

[10] Eye Movements Discriminate Fatigue Due to Chronotypical Factors and Time Spent on Task - A Double Dissociation

[11] Gender-based eye movement differences in passive indoor picture viewing: An eye-tracking study

[12] Security Analysis and Improvement of Fingerprint Authentication for Smartphones

[13] A Sub-Quadratic Exact Medoid Algorithm

[14] Image Matching by Normalized Cross-Correlation

[15] Derivative Dynamic Time Warping

[16] Dynamic programming algorithm optimization for spoken word recognition

[17] A real-time spike classification method based on dynamic time warping for extracellular enteric neural recording with large waveform variability

[18] Tiva™ TM4C123GH6PM Microcontroller