key: cord-0959255-jgw1uqea
authors: Joshi, Ashwini; Procter, Teresa; Kulesz, Paulina A.
title: COVID-19: Acoustic Measures of Voice in Individuals Wearing Different Facemasks
date: 2021-06-19
journal: J Voice
DOI: 10.1016/j.jvoice.2021.06.015
sha: 22b5921c448bd677bf90b3c4351f07454116d7f3
doc_id: 959255
cord_uid: jgw1uqea

AIM: The global health pandemic caused by SARS-coronavirus 2 (COVID-19) has led to the adoption of facemasks as a necessary safety precaution. Depending on the level of risk for exposure to the virus, the facemasks that are used can vary. The aim of this study was to examine the effect of different types of facemasks, typically used by healthcare professionals and the public during the COVID-19 pandemic, on measures of voice. METHODS: Nineteen adults (ten females, nine males) with a normal voice quality completed sustained vowel tasks. All tasks were performed for each of the six mask conditions: no mask, cloth mask, surgical mask, KN95 mask, and surgical mask over a KN95 mask with and without a face shield. Intensity measurements were obtained at 1ft and 6ft from the speaker with sound level meters. Tasks were recorded with a 1ft mouth-to-microphone distance. Acoustic variables of interest were fundamental frequency (F0), formant frequencies (F1, F2) for /a/ and /i/, and smoothed cepstral peak prominence (CPPs) for /a/. RESULTS: Data were analyzed to compare differences between sexes and mask types. Differences between males and females were statistically significant for intensity measures and all acoustic variables except F2 for /a/ and F1 for /i/. Few pairwise comparisons between masks reached significance even though main effects for mask type were observed. These are further discussed in the article. CONCLUSION: The masks tested in this study did not have a significant impact on intensity, fundamental frequency, CPPs, or first or second formant frequency compared to voice output without a mask. Use of a face shield seemed to affect intensity and CPPs to some extent. Implications of these findings are discussed further in the article.

A global health and socio-economic pandemic caused by SARS-CoV-2 (COVID-19), an infectious respiratory virus, has existed since January 2020 [1]. As of May 7, 2021, 155,506,494 people worldwide have contracted COVID-19, with 3,247,228 confirmed deaths; in the United States, infections and deaths total 33,374,726 and 594,103, respectively [1]. As a respiratory virus, SARS-CoV-2 may be transmitted by direct contact (person to person, person to object), by respiratory droplets, and by airborne routes. SARS-CoV-2 infection is understood to transmit via respiratory droplets (larger droplets, > 5µm, that fall near the source) expelled during exhalatory events such as coughing, sneezing, speaking, singing, and breathing, and via fine-particle aerosol droplets (less than or equal to 5µm in aerodynamic diameter) [2, 3]. The virus may then remain airborne indefinitely in indoor environments without adequate air ventilation [3] [4] [5]. Transmission of COVID-19 from asymptomatic and presymptomatic individuals is documented. A public level of infection control is necessary to curtail transmission [6].

[...] individuals not from the same household. The CDC makes particular recommendations for mask wearing: masks secured under the chin and covering the mouth and nose, homemade/cloth masks with at least two layers of fabric, N95 respirators left for healthcare workers, masks for anyone above the age of 2 years, and use of face shields with an additional face covering [7]. Recent evidence from the CDC confirmed that tightly fitted masks (using a cloth mask over a surgical mask or using a surgical mask with knotted ear loops) decreased exposure to potentially infectious aerosols by approximately 95% (CDC, 2021).

Universal masking contributes to a multipronged infection control strategy [8] and encourages a community-level response albeit with communication difficulties.

Recommendations by public health officials for social distancing of 6ft or 1-2 meters additionally extend the typical distance between two conversation partners [9]. Per the inverse square law, a doubling of distance from the sound source reduces the intensity of speech by approximately 6dB. With distance controlled, the intensity level of speech sounds produced with a surgical mask at 3ft approximated that of speech without a mask at 6ft [10]. Maintaining the CDC-recommended 6ft distance for protection against COVID-19 while masked therefore simulates a 12ft distance between unmasked speakers and consequently affects speech intelligibility. As can then be expected, speech intelligibility in mask-wearing speakers improves with increased proximity [10, 11].
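As a rough illustration of the inverse square law mentioned above, the following sketch computes the expected drop in sound level between two distances from a source. The function name and the free-field, point-source assumption are ours and are not part of the study; in a real room, reflections usually make the measured difference smaller.

```python
import math

def level_drop_db(near_ft: float, far_ft: float) -> float:
    """Free-field level drop (dB) when moving from near_ft to far_ft from a point source."""
    if near_ft <= 0 or far_ft <= 0:
        raise ValueError("distances must be positive")
    # Intensity falls with the square of distance, so the level falls by 20*log10(ratio).
    return 20.0 * math.log10(far_ft / near_ft)

# Doubling the distance (3 ft -> 6 ft) costs about 6 dB.
print(round(level_drop_db(3, 6), 2))   # 6.02
# The two sound level meter distances used in this study (1 ft and 6 ft).
print(round(level_drop_db(1, 6), 2))   # 15.56
```

Under these idealized assumptions, a reading at 6ft would be expected to sit roughly 15-16dB below the reading at 1ft, independent of any effect of the mask itself.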

The implications of these safety measures for communication go beyond needing repetition in casual conversational settings. The use of masks during a socially distanced conversation places an added physiological and cognitive load on the communication partners.

The listener, in the absence of visual cues, needs a higher degree of focus to complete the speech perception, which is particularly difficult for interlocutors with hearing loss, voice disorders or deficits in auditory and/or cognitive processing.

Prior data show how fabric porosity, viscosity-resistance, and resonant features of facial coverings may influence the amplitude of a signal or the transmission loss of a material [12] [13] [14] [15] [16].

Higher frequencies of speech may be more impacted by viscosity-resistance (friction between air molecules and fabric pores), and lower frequencies by resonance [12, 13], with resonant characteristics of fabric type augmenting amplitude at certain frequencies [14, 15]. A low-pass filter effect (a reduction of intensity in high-frequency sounds) was seen in all mask conditions by Giuliani (2020) [10], with the least reduction in the surgical mask (loss of 3-4dB above 1.5kHz), and in the frequency ranges 2.5kHz to 12.5kHz and 14kHz to 24kHz in a surgical mask by Llamas et al. (2008) [17]. Viewer-listeners rated the surgical mask the easiest face covering to understand. Fabric or cloth face coverings attenuated transmission and had negative effects on speech intelligibility measures compared to surgical masks [10, 14, 17, 18, 19]. Generalization to connected speech is limited, as most results were obtained from mechanical experimental designs or from human subjects reading a standardized word list.

Prior to the COVID-19 pandemic, about one third of the United States population was found to encounter a vocal impairment during their lifetime [20]. Standard care of voice includes instrumental assessment of acoustics by speech-language pathologists, as delineated by the American Speech-Language-Hearing Association Special Interest Group 3: Voice and Upper Airway Disorders [21]. With the rapid changes in clinical practice protocols, physicians and speech-language pathologists had to scramble to create new protocols for voice assessment and treatment. While many clinical visits were moved to the telehealth platform, there was and continues to be a need for in-person visits. The lack of information on the transmission of COVID-19, especially in the early months of the pandemic, made it difficult to know which voice measures and tasks were safe. This uncertainty was compounded by the need to wear masks without an understanding of the effects of the different masks on voice quality and the associated acoustic measures. Given mask-wearing recommendations and obligations, identifying masks with either a minimal or a significant effect on voice measures can be of great use to clinicians, not only in the medical setting but also in schools, private practice and other nonmedical settings. Systematic measurement of the effects of masks on the acoustic signal from individuals, as opposed to a simulated environment using a physical model, will have more real-life and generalizable applications.

In addition to measures of voice quality, resonance measures are important in voice assessment and voice therapy. While measurement of formant frequencies is not typically included in the five domains of voice assessment or in the recommended protocol for instrumental assessment [21], assessment of resonance is critical to success in voice therapy. The balance of the respiratory, laryngeal and resonance systems is important for a healthy voice as well as for the perception of a normal voice by a clinician [22]. As detailed by the source-filter theory [23], the interaction of the sound produced at the vocal folds and the filtering properties of the vocal tract is responsible for the range of resonant features seen on the spectrogram across various consonants, vowels and persons. A small change in the shape of the vocal tract can make a perceivable change in the vocal output. This physiological phenomenon is vital to many voice therapy approaches based on a semi-occluded vocal tract [24]. Given the sensitivity of vocal output to changes in the vocal tract, masks can be expected to alter resonance during speech as well.

Examining the effects of masks in the context of face-to-face voice assessment and therapy is important for accurate data and treatment course.

The purpose of this study was to measure the effects of masks used during the COVID-19 pandemic on measures of voice. We chose the most commonly used masks and mask combinations for examination: cloth, surgical, KN95, and a surgical mask worn over a KN95 with and without a face shield. Cloth masks are typically 2-3 ply, woven cotton material. The surgical mask is synthetic and 3-ply, with a looser fit than a KN95. The KN95, due to its shape, adds to the length of the vocal tract, but is a tighter fit than the other 2 masks. We hypothesized that the effects of the masks on voice quality would differ based on their individual filtering properties. In addition, we measured whether there was a differential effect of mask type on sound radiation at 1ft and 6ft mouth-to-microphone distances.

The Institutional Review Board at the University of Houston approved the study. Special considerations were made for testing during COVID-19 including disinfecting the testing space, hand hygiene, masks for the study personnel, and maintaining a 6ft distance between participants and study personnel. The data collection was planned in a way that allowed us to keep the minimum number of individuals in the room at any given time.

Nineteen adults (ten female, nine male) participated in the study. The mean age and age range were 30.5 years and 18-56 years, respectively, for the female participants, and 39.4 years and 21-67 years for the male participants. All participants had a normal voice quality with no current complaints involving the respiratory, cognitive, neurological or auditory systems.

Participants were non-smokers and were all native speakers of Standard American English.

A repeated-measures study design was implemented for measurement of acoustics under different mask conditions. Each participant had to perform speech tasks for 6 mask conditions.

The order of these conditions was counterbalanced to minimize an order effect. The 6 mask conditions included no mask, a cloth mask (3-ply, woven, 100% cotton poplin, www.oldnavy.gap.com), a surgical mask (3-ply, unwoven, www.amazon.com), a KN95 mask (filtering facepiece respirator, www.amazon.com), and a surgical mask over a KN95 mask with and without a safety face shield (www.amazon.com) (referred to as KSF and KS, respectively, from here on).

A KN95 mask was used instead of an N95 mask due to the short supply of the N95 masks. The masks used are shown in Figure 1 . Participants were instructed to wear the mask to cover the nose up to the bridge, mouth and chin.
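The article does not say which counterbalancing scheme was used. Purely as an illustration, the sketch below builds one common option, a cyclic Latin square, in which each of the six conditions appears once in every serial position across six orders. The condition labels follow the text; the function names and the round-robin assignment of participants to orders are assumptions for this example.

```python
CONDITIONS = ["no mask", "cloth", "surgical", "KN95", "KS", "KSF"]

def cyclic_orders(conditions):
    """Return one rotated order per condition (a cyclic Latin square)."""
    n = len(conditions)
    return [conditions[i:] + conditions[:i] for i in range(n)]

def order_for(participant_index, conditions=CONDITIONS):
    """Assign orders round-robin so each order is used about equally often."""
    orders = cyclic_orders(conditions)
    return orders[participant_index % len(orders)]

# Show the (hypothetical) orders for the first three participants.
for p in range(3):
    print(p + 1, order_for(p))
```

A cyclic square balances serial position only; it does not balance which condition immediately precedes which, so other designs may be preferable when carryover effects are a concern.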

Tasks

Under each condition (mask, mask combination or no mask), participants performed three trials of sustained /a/ and /i/ vowels at comfortable pitch and loudness, each trial lasting approximately 4 seconds as recommended by Patel et al. (2018) [21]. The /a/ vowel was used for F0, intensity and CPPs. The /a/ and /i/ vowels were used for formant measurements due to the difference in how they are articulated in the vocal tract.

Recordings were performed in a quiet room with ambient noise below 60 dB SPL. A sound-treated booth was not utilized as a precautionary measure for COVID-19. A Shure SM48 dynamic cardioid microphone on a desk stand at a 1ft mouth-to-microphone distance was paired with a Marantz Professional PMD661 recorder. Intensity measurements were obtained for the sustained /a/ with two dB-C weighted sound level meters (SLM, Reed Instruments, R8050), at 1ft and at 6ft. One member of the study personnel sat 6ft away from the participant (in keeping with COVID-19 safety precautions) and recorded the readings on the SLM placed at 1ft from the participant. A second member noted the readings on the SLM at 6ft from the participant.

Intensity measurements from the SLM were averaged across three trials. When using the SLM, the intensity level was noted approximately mid-vowel. The Praat software [25] was used to obtain fundamental and formant frequencies and smoothed cepstral peak prominence (CPPs).

The central 3 seconds of each /a/ trial was selected and measurements for F0 were averaged across the three trials for each mask. The parameter settings recommended by Maryn & Weenink for CPPs measurement with Praat were used [26] . Praat settings for formant analyses were modified after formant tracking errors were found with the default settings. The formant ceiling was adjusted to 7500Hz for recordings of female participants to appropriately track F1 and F2.

Formants identified by the program were confirmed manually by visually inspecting the spectrogram. Recorded data for the KN95 mask condition were unavailable for one female participant due to a technical error with the recorder.
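The trial selection and averaging described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names are ours, the F0 values are made up, and in practice the measurements themselves came from Praat.

```python
from statistics import mean

def central_window(duration_s, window_s=3.0):
    """Start and end times (s) of a window centred in a recording."""
    if window_s > duration_s:
        raise ValueError("window is longer than the recording")
    start = (duration_s - window_s) / 2.0
    return start, start + window_s

def mean_across_trials(values):
    """Average one measure (e.g., F0 in Hz) across a participant's trials."""
    return mean(values)

# A 4 s sustained vowel keeps the span from 0.5 s to 3.5 s.
print(central_window(4.0))                        # (0.5, 3.5)
# Hypothetical F0 values (Hz) for three trials; not data from the study.
print(mean_across_trials([200.0, 205.0, 210.0]))  # 205.0
```

Trimming the edges in this way excludes the onset and offset of phonation, where pitch and intensity are least stable.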

We computed a two-way mixed analysis of variance with sex as a between-subjects factor and mask type as a within-subjects factor, using an unstructured covariance matrix for the error term. The outcomes of interest included intensity levels for /a/ at 1ft and 6ft from the speaker, F0, F1 and F2 for /a/ and /i/, and CPPs for /a/. Statistically significant main effects of mask type were followed up with ten planned comparisons, maintaining family-wise alpha at .05 by using the Bonferroni correction method (in our case, the critical p-value was .05/10 = .005). The comparisons of interest included: all masks and mask combinations to the no mask condition, the cloth mask with the surgical and KN95 masks, and the KN95 mask alone compared to the KN95 layered with the face shield with/without the surgical mask. The no mask condition served as a baseline comparison to all the other mask conditions. Comparisons of the surgical and KN95 masks to the cloth mask were of particular interest due to the common use of the cloth mask by the general public and the relative thickness of the cloth, which could create a dampening effect. The KN95 alone as compared to its combination with other layers was of interest due to the recommendation for healthcare professionals to layer the surgical mask with an N95 mask to reuse N95s during shortages of PPE, and the use of the face shield for eye protection from aerosolized particles. Analyses were computed in SAS 9.4 using the PROC MIXED procedure [27]. In the interest of space, we report only statistically significant main and interaction effects and planned comparisons in the Results section.

[Tables 1a, 1b, and 2]

Understanding the effect of masks on overall communication and at the recommended 6ft distance is important at many levels. Anecdotally, masks make it harder to be understood and may require the speaker to repeat. This has implications for all communication settings (patient-physician interactions, the classroom, speech and language evaluations, treatment, etc.), especially for those with hearing loss, voice or communication disorders. Identifying a mask that minimizes the negative effect on speech and voice would be useful in providing recommendations for mask use in these settings.

Intensity levels were measured at 1ft and 6ft using two sound level meters. Male participants had higher intensity levels than females for all mask types at both the 1ft and 6ft distance, consistent with established normative data. Statistically significant sex differences were found for all variables during the acoustic analyses.

When comparing the performance of the 3 masks (cloth, surgical, KN95) and 2 mask combinations (KS, KSF) to each other and to voice without a mask, there were some statistically significant differences in the output for all of the variables. Overall, there were no significant differences in any variable between the no mask and all other mask conditions. All participants were instructed to produce each /a/ in the same manner irrespective of mask type; however, it is likely that participants adjusted their effort to keep the output constant based on their own perception. It is difficult to parse out the effects of the masks in isolation, but this behavior is in keeping with what we would expect in the real world.

For intensity measures at 1ft, following post-hoc comparisons, there were statistically significant differences between the KSF and KN95, with the addition of the face shield in the KSF mask condition having greater intensity in /a/ production at 1 ft compared to the KN95. This may have been due to a change in effort when wearing 3 protective layers (KSF) as compared to 1 layer and/or the face shield acting as a potential reflective surface, changing some of the properties of the vocal output.

Compared to the no mask condition, there were no significant differences while wearing the facemasks for CPPs, F0, and F1. These results are consistent with Magee et al. (2020), who found no significant differences between no mask and three different facemasks for CPPs, intensity, harmonics-to-noise ratio, jitter and shimmer. These findings are meaningful for clinical evaluations and suggest that clinicians could obtain voice evaluation measures from patients wearing a mask as long as the mask type is consistent across measurement time points. Between mask conditions KSF and KS, the CPPs value for /a/ at 1ft was significantly lower in the KSF combination, that is, after addition of the face shield. This may imply that the face shield altered the vocal output in some way, adding noise/turbulence to the vocal signal.

The effect of mask type differed between sexes only for F1 for /i/ as seen by the interaction effect between mask type and sexes. The most obvious difference was for the cloth mask. For females, mean values for all mask types including the cloth mask were higher, though not statistically significant, than the no mask condition. For the males, mean F1 for /i/ for the cloth and surgical mask was lower than the no mask condition. There is an overlap in range of F1 for all mask types with both sexes, so while it is unclear why we saw an interaction effect for this particular variable, a larger sample size might aid in making this effect clearer.

There were no other significant differences for F0 and F1, but we saw differential filtering effects of the masks on F2 of /a/ by sex. As seen in Figure 4, F2 frequency was stable for the no mask condition and all masks except KSF for males, with a drop in F2 for KSF. In the female participants there was a drop in F2 for the cloth mask and an increase for the surgical mask, with mean F2 values for the KN95, KS and KSF comparable to the no mask condition. The differential filtering effects can be explained by the energy transmission loss in the higher frequencies as seen in the study by Giuliani (2020) [10]. In that study, intensity with the face shield increased below 1kHz and decreased, compared to no mask, at frequencies higher than approximately 1.4kHz. Corey et al. (2020) [19] also found an attenuation in energy at frequencies greater than 1kHz when using masks. The range of F2 for /a/ for the participants in this study without a mask was 1053 Hz-1473 Hz, and the masks may have caused a transmission loss at these frequencies. F2 is critical to vowel identification and is dependent on the size of the oral cavity anterior to the constriction in the oral cavity. A change in the size of this cavity secondary to a mask may be responsible for altering F2, therefore impacting approximation of vowel identity.

The study methodology was limited by the small sample size and by the adaptations made to meet the recommended safety precautions during the COVID-19 pandemic. Differences in output as a function of age were not examined.

The masks tested in this study did not have a significant impact on intensity, fundamental frequency, or first formant frequency compared to the measures without a mask. Per these results and previous literature, clinicians may collect complete voice assessments with masks but should keep the mask type consistent across different measurement days. However, assessments and treatment involving articulation, receptive and expressive skills need to be considered carefully in the context of the facemask used.

The results obtained in this study may vary for masks of different brands and materials, especially for cloth masks. A summary of the recommendations based on our findings is provided for clinicians in Figure 5. Further research is required to study the effects of masks on conversational speech and in the presence of background noise, the impact of formant alterations on speech perception, and the measurement of effort for the speaker and listener during mask use.

[1] Coronavirus disease (COVID-19) pandemic

[2] Aerosol and Surface Distribution of Severe Acute Respiratory Syndrome Coronavirus 2 in Hospital Wards

[3] Aerosol and Surface Transmission Potential of SARS-CoV-2. medRxiv

[4] Particle sizes of infectious aerosols: implications for infection control. The Lancet Respiratory Medicine

[5] Aerodynamic analysis of SARS-CoV-2 in two Wuhan hospitals

[6] Temporal dynamics in viral shedding and transmissibility of COVID-19

[7] Things To Know About the COVID-19 Pandemic

[8] Association Between Universal Masking in a Health Care System and SARS-CoV-2 Positivity Among Health Care Workers

[9] Influence of culture, language, and sex on conversational distance

[10] For speech sounds, 6 feet with a mask is like 12 feet without

[11] Diminished Speech Intelligibility Associated with Certain Types of Respirators Worn by Healthcare Workers

[12] Absorption of Sound Wave by Fabrics Part 2: Acoustic Impedance Density

[13] Absorption of Sound Wave by Fabrics Part 3: Flow Resistance

[14] Evaluation of transmission loss induced by stretched fabric treatments

[15] The effect of fabric parameters on sound-transmission loss. The Journal of The Textile Institute

[16] Filtration mechanisms and manufacturing methods of face masks: An overview

[17] Effects of different types of face coverings on speech acoustics and intelligibility

[18] Effect of masks on speech intelligibility in auralized classrooms

[19] Acoustic effects of medical, cloth, and transparent face masks on speech signals

[20] Voice Disorders in the General Population: Prevalence, Risk Factors, and Occupational Impact. The Laryngoscope

[21] Recommended Protocols for Instrumental Assessment of Voice: American Speech-Language-Hearing Association Expert Panel to Develop a Protocol for Instrumental Assessment of Vocal Function

[22] Clinical Voice Pathology: Theory and Management

[23] Acoustic theory of speech production with calculations based on X-ray studies of Russian articulations. The Hague: Mouton

[24] Voice training and therapy with a semi-occluded vocal tract: rationale and scientific underpinnings

[25] Praat: doing phonetics by computer

[26] Objective Dysphonia Measures in the Program Praat: Smoothed Cepstral Peak Prominence and Acoustic Voice Quality Index

[27] Language Reference: Concepts

We would like to thank all our participants for volunteering for this study during the COVID-19 pandemic. We would also like to thank our student research assistants Neha Alex and Ruiquing Fan for expediting the data collection and analyses for this study.