key: cord-0586891-ar42p62b
authors: Vivas, David R.; Sánchez, Estiven; Reina, John H.
title: Deep learning the atmospheric boundary layer height
date: 2020-04-09
journal: nan
DOI: nan
sha: aa1c79fbd60c7b7270b8bdaad66d86ee31993c83
doc_id: 586891
cord_uid: ar42p62b

A question of global concern regarding the sustainable future of humankind is the effect of aerosols on the global climate. The quantification of atmospheric aerosols and their relationship to climatic impacts are key to understanding the dynamics of climate forcing and to improving our knowledge about climate change. Due to its response to precipitation, temperature, topography and human activity, one of the most dynamical atmospheric regions is the atmospheric boundary layer (ABL): ABL aerosols have a sizable impact on the evolution of the radiative forcing of climate change, human health, food security, and, ultimately, on the local and global economy. The identification of ABL pattern behaviour requires constant monitoring and the application of instrumental and computational methods for its detection and analysis. Here, we show a new method for the retrieval of the ABL top from light detection and ranging (LiDAR) signals: a convolutional neural network is trained in a supervised manner, so that it learns to retrieve this dynamical parameter under real, non-ideal conditions and in a fully automated, unsupervised way. Our findings pave the way for a full integration of LiDAR elastic, inelastic, and depolarisation signal processing, and provide a novel approach for real-time quantitative sensing of aerosols.

Atmospheric particulate matter (PM) is a mixture of microscopic solid particles and liquid droplets suspended in the air, which can be of varied chemical composition and size distribution 1 . It includes coarse and fine particles such as PM 10 and PM 2.5 , comprising nitrates, sulfites, organic carbon and black carbon [2] [3] [4] [5] .
Also referred to as atmospheric aerosols, they are a key indicator of air pollution, and their heterogeneity arises from the numerous sources and varying formation mechanisms, as they can be either directly emitted to the atmosphere or produced in the atmosphere from precursor gases [6] [7] [8] [9] [10] [11] [12] . Sustained high levels of atmospheric pollution pose many crucial challenges and have important adverse effects, ranging from the impact on global public health, mortality and morbidity due to increased cases of cardiovascular and cardio-respiratory diseases linked to long-term exposure to fine particles [13] [14] [15] , through to direct implications for climate change [2] [3] [4] [6] [7] [8] [9] [10] [11] [12] 16, 17 , and agriculture 14, [18] [19] [20] [21] . It has become necessary to accurately monitor the atmospheric variables that allow for the prediction of air pollution behaviour, in order to issue early alarms for the protection of the population. One of the variables of greatest interest is the dynamical height of the atmospheric boundary layer (ABL), the lowest layer of the troposphere, which is directly influenced by the Earth's surface by means of both natural and anthropogenic emissions 22 . The ABL top (ABLT) is defined as the midpoint of the sharp transition zone between the ABL and the free troposphere 23 . In an ideal LiDAR signal, it is identifiable as the midpoint of the first sharp reduction in intensity, the so-called entrainment zone. In a real LiDAR signal, several intensity peaks can be detected before or after the actual ABLT, which leads to an erroneous estimate of its location.
The fully automated and unsupervised detection of the ABLT has been a challenging topic for the LiDAR community for more than two decades, with numerous contributions such as the gradient 24 and second derivative 25 methods, the wavelet covariance transform (WCT) method [26] [27] [28] [29] , several fitting methods [30] [31] [32] , and some approaches based on the statistical analysis of the time evolution of successive LiDAR signals 33, 34 . Of the aforementioned, the gradient, second derivative, and WCT methods have been widely adopted 23,24,26-29 , probably due to their simplicity and ease of implementation. However, these methods are heavily limited by the weather conditions that determine the shape of the signal, and usually fail to retrieve plausible predictions in the presence of clouds, heavy noise and layer overlapping phenomena. In order to avoid these error sources, a search threshold can be defined for each specific case study, at the cost of sacrificing the automation of the process. Fitting methods attempt to circumvent these limitations via the automatic fitting of the experimental signal to an ideal one, from which an ABLT value 30 or a search threshold 32 can be extracted. Nonetheless, these methods are still limited by the shape of the signal, given that under non-ideal conditions such a shape can be substantially different from the ideal one. This translates into an invalid fit and, consequently, an inaccurate ABLT value. The above-mentioned methods assume that the aerosol concentration inside the ABL is significantly higher than the one in the free troposphere 23 . These methods process each signal independently, and are computationally inexpensive (assuming an adequate implementation), which allows for real-time detection.
Methods based on statistical analysis usually rely on information about the signal's time evolution over a set of measurements, so they are outside the scope of this work.

In this article, we present a new method for real-time, fully automated and unsupervised ABLT detection from atmospheric LiDAR signals. Our method uses a deep learning model, specifically a convolutional neural network (NN), trained to detect the atmospheric boundary layer top under real, non-ideal conditions. The results here reported have been obtained from data acquired from in-situ measurements performed at the LALINET network LiDAR-CIBioFi station (Cali, Colombia), which has been operational since 2018, and constitutes a regional strategy that contributes to the analysis and prediction of climate, weather and air quality 32 .

Figure 1. Neural Network Architecture. The input signal is fed to three convolutional layers, each one with five kernels that decompose the signal into low-level features. This new representation of the signal is then fed to a three-layer multilayer perceptron (MLP), which defines a non-linear transformation that finally returns the detected ABLT. Initially only dense architectures (MLPs) were considered, but their tendency to overfit led to poor performance on validation and test data. Progressively increasing the number of convolutional layers at the start of the model significantly increased its generalisation capabilities.

Extracting the atmospheric boundary layer height from LiDAR retrievals: A deep learning approach. Given a range-corrected LiDAR signal (RCS), an ABLT value is obtained by forward-propagating the signal through the layers of a convolutional network 35 . The architecture of the implemented neural network is illustrated (by means of PlotNeuralNet 36 ) in figure 1.
The dataset used for the training and tuning of the model comprises 15000 signals labeled via WCT, as explained in Supplementary Material, with a custom search threshold for each individual signal; this was done in order to ensure the quality of the labels and, consequently, of the predictions of the neural network. WCT was chosen as the labeling method both for its ease of implementation and for its well-known robustness and performance [27] [28] [29] 37 . Provided the model is trained correctly, it is expected to replicate the predictions of a signal-by-signal fine-tuned WCT, but in a completely automated and unsupervised manner. The convolutional neural network here proposed for ABLT detection is compared to a supervised WCT variant (a single custom search threshold for the whole time evolution) and to an unsupervised WCT (full signal as input during the whole time evolution). WCT has been chosen as the baseline because, for the past two decades, it has been used to measure the ABLT in numerous case studies 27,29,37 , one of these being Wuhan 28 , a region of growing environmental interest during the 2020 SARS-CoV-2 virus pandemic outbreak 38 .

Over the past six months, fires of unprecedented impact on people's wellbeing, biodiversity, wildlife, and infrastructure have dramatically affected the Australian continent (January 2020) 8, 16 and the Amazon rainforest (August 2019) 9, 39 . Their possible connection to climate change 16, 17, 40 has raised serious alarms, and a recurrent need for global environmental policy and timely law enforcement, as well as for more accurate climate dynamics and atmospheric predictions, all of which constitute a crucial challenge for both adaptation and mitigation 16, 17, 40 . A detailed comparison of the results for ABLT prediction on different days (14, 15, 16 and 21) of August 2019, from data obtained at the LiDAR-CIBioFi station, is presented in figures 2 and 3.
During this period, strong aerosol transport from the Amazon rainforest wildfires was reported by NASA's Atmospheric Infrared Sounder (AIRS) instrument aboard the Aqua satellite 41 , which showed the aerosol propagation from the source (the Brazilian northwest Amazon region) through to the north of South America during 8-22 August 2019, thus providing a challenging test bench for the above-described methods. The fires reported by NASA were so large that they could be spotted from space, and they spread over several large Amazon states in northwest Brazil 9,39,40 .

The results are depicted in graphs of temporal evolution (see figures 2 and 3) that exhibit very different LiDAR results throughout the day. The dynamics are presented in 2D graphs, with the signal intensity plotted in a colour scale; in figure 2a, each dashed vertical line corresponds to a single LiDAR measurement profile, and these profiles are in turn shown in figures 2b and 2c. On August 14 (figure 2), clouds were detected around 4 km height, with some formations around 2 km; the latter were very close to the height of the boundary layer, thus posing a challenge for accurate ABLT detection. In addition, some cases of turbulence were detected after 14 h. The first single LiDAR measurement profile (figure 2b), taken at about 12:30 h, exhibits a stable and well-mixed layer, which makes it easy to discriminate between ABL and free troposphere and allows a straightforward evaluation of the predictions: the NN gives very similar results to the supervised WCT (about 1.8 km), while the unsupervised WCT located the ABLT in a cloud formation above 4 km height. In contrast, the second profile (figure 2c), taken at 15:40 h, gives very different results for the supervised (sup.) WCT, the unsupervised (unsup.) WCT, and the NN. The sup. WCT located the ABLT at 500 m, below the expected result. The unsup.
WCT placed the result at around 4 km, in a cloud formation pattern, while the NN located the ABLT at about 2.6 km, following its actual behaviour. The temporal evolution of the LiDAR measurement profiles of figure 2 makes it clear that the unsup. WCT locates the boundary layer at cloud formation height, erroneously placing the ABLT in most cases. The sup. WCT shows ABLT detection problems, severely underestimating the ABLT in cases of proximity to clouds (see e.g., around 12 h) and in cases of greater turbulence (e.g., after 14 h). Despite the drawbacks presented by sup. WCT and unsup. WCT, figure 2 clearly shows that our convolutional NN estimation of the ABLT is resilient to turbulence and proximity of clouds, and correctly follows its evolution.

Aerosol transport and biomass-burning emissions identification: The case of 2019 Brazilian rainforest wildfires. Following our analysis, we next introduce the results for the case study of the Amazon rainforest fires (August 2019), as plotted in figure 3 for August 15, 16, and 21. August 15 (figure 3a) is a day of particular interest, since we detected ABLT levels around 0.5 km in the morning, well below the 1 km height that is approximately the average value in the study region. This decrease in ABLT corresponds to an increase in the PM concentration, possibly due to a temperature decrease in the troposphere. This was corroborated by the local authority for air quality monitoring (DAGMA), and alerts due to local high PM concentration were issued on this date 42 . The concerns already pointed out for the ABLT detection methods were encountered again in this case, for both supervised and unsupervised WCT. The supervised WCT, for example, registers ABLT values close to 400 m at about 14:00 h, which would imply a PM concentration higher than that reported during the morning, a prediction that is inconsistent with the measurement results.
This shows how critical a correct ABLT prediction is when assessing the need to issue early alarms due to high PM concentration, thus avoiding false positives. Figure 3b (August 16) exhibits a stable and well-mixed layer, which allows for a straightforward discrimination between ABL and free troposphere; in this case, the supervised WCT provides similar predictions to the NN. The unsupervised WCT predictions, as expected, are located in a layer of cloud formation around 4 km.

On August 21 (figure 3c), an unusual local scenario was detected: ABLT levels of about 4 km, well above the 1 km average, with small variations in the PM concentration during the day. A sharp ABLT increase of this kind is expected to be followed by a decrease in the PM concentration: if the latter is not observed, it is most likely that aerosol transport from a different source, say another region, is occurring. This hypothesis is supported in this case by analysing the wind and aerosol transport by means of the HYSPLIT backward trajectories model (figure 4), from which we were able to trace, on this date, the burning of biomass due to the 2019 Amazon rainforest fire 41 . In addition, and to corroborate this result, data reported by NASA's Atmospheric Infrared Sounder (AIRS) instrument aboard the Aqua satellite were examined (see Supplementary Material).

The ABLT predictions of the three methods here considered are quite different from each other in the latter scenario (figure 3c). The unsupervised WCT is only able to locate values near the ABLT during an interrupted time window of about two hours; otherwise its predictions are above 5 km, in upper clouds (not shown). The supervised WCT shows ABLT values well below the actual ones, detecting most of them below 1 km height during extended time windows, which is a severe underestimation.
In contrast to the previous results, the proposed NN-based method follows the visually apparent contour of the ABLT evolution during the full measured time frame (9:30-19:00 h) and without difficulty. The results show that the high expressive power of the neural network makes it possible to obtain physically plausible ABLT values, even in the presence of unusual atmospheric phenomena that are difficult to predict for traditional methods such as WCT. We stress that the propagation time through the NN is of the order of milliseconds on a CPU, and even faster on a GPU, which allows for real-time detection.

Besides the prediction analysis of every particular case study, we have measured the standard deviation of the ABLT predictions. The standard deviation of the full time evolution can be used as a reliable metric of dispersion, but not of continuity. This is illustrated in figure 5a for August 21, where the full std. dev. of supervised WCT is smaller than that of the NN, in stark contrast to what is shown in figure 3c. Given this, the mean distance between each point and its immediate neighbour was considered as a continuity metric, as it provides results that are consistent with those of figures 2 and 3. The mean standard deviation over blocks of 30 minutes acts as a middle-ground metric between the aforementioned ones, as it quantifies dispersion on a small time window, in accordance with the high temporal variability exhibited by the ABL 43 . The results show that, for all the case studies here treated, the tested NN achieves the highest continuity of time evolution and the least dispersion in its ABLT predictions, followed by sup. WCT. This coincides with the dynamical behaviour observed in figures 2 and 3.

We conclude by summarising our findings and perspectives.
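As an aside on the dispersion and continuity metrics used above, the following is a minimal sketch of how the three quantities could be computed for one day of ABLT predictions. The function names, the 5-minute profile spacing and the two synthetic series are illustrative assumptions, not the authors' code or data.

```python
import numpy as np

def full_std(ablt):
    """Standard deviation over the full time evolution (dispersion only)."""
    return float(np.std(np.asarray(ablt, dtype=float)))

def mean_neighbour_distance(ablt):
    """Mean absolute jump between consecutive predictions (continuity)."""
    return float(np.mean(np.abs(np.diff(np.asarray(ablt, dtype=float)))))

def mean_block_std(ablt, times_min, block_min=30):
    """Mean of the standard deviations computed over consecutive time blocks."""
    ablt = np.asarray(ablt, dtype=float)
    blocks = (np.asarray(times_min) // block_min).astype(int)
    stds = [np.std(ablt[blocks == b]) for b in np.unique(blocks)]
    return float(np.mean(stds))

# Hypothetical example: one profile every 5 minutes (assumed spacing).
times = np.arange(120) * 5
smooth = np.linspace(1000.0, 3000.0, 120)           # steadily rising ABLT
jumpy = np.where(np.arange(120) % 2 == 0, 900.0, 1100.0)  # oscillating estimate

for name, series in [("smooth", smooth), ("jumpy", jumpy)]:
    print(name, full_std(series), mean_neighbour_distance(series),
          mean_block_std(series, times))
```

In this toy case the oscillating series has the smaller full standard deviation but the larger neighbour distance and block standard deviation, which is the kind of disagreement between dispersion and continuity that the text describes for August 21.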
A novel technique for active atmospheric remote sensing and, in particular, for atmospheric boundary layer estimation has been presented. It uses convolutional neural networks, processes the data in real time, and requires no supervision or postprocessing. A large set of LiDAR data was collected in situ with the instrumentation available at our station, and the corresponding boundary layer values were labeled in a controlled manner with WCT. To validate and evaluate the performance of our model, we have analysed different scenarios under very different atmospheric conditions. Compared to the WCT approaches, the proposed NN results were more robust and could readily be used as an ABLT estimator. We found that, for quantifying the ABLT during these experiments, the NN provided better regression results than WCT. Our network architecture adapts easily to different ranges of behaviour of the mixing layer, and differentiates turbulence and cloud proximity phenomena. As a test bed for our model, we have considered an example of considerable complexity: a scenario spanning several days of August 2019, each with markedly different behaviour and particularities. This time frame coincided with the summer wildfires in the northwest Brazilian Amazon rainforest 9,39-41 . We have shown how the reported NN model allowed us to quantify the influence of such wildfires on our local aerosol dynamics: despite turbulence in the mixing layer and the close presence of clouds, we were able to successfully identify the ABLT by means of the proposed neural network. Our model has been specifically trained with 532 nm LiDAR data in elastic configuration, but future work will consider its extension to additional information channels, such as other elastic wavelengths and inelastic scattering, as well as the depolarisation channel 37 .
As neural networks excel at learning highly complex correlations between channels of data, the implementation of such extensions would lead to an improved accuracy, since these channels can be used both as an additional input for the neural network and as a complementary means of refining the labels fed to the model during training.

The experimental setup that provided the LiDAR retrievals used in this work is shown in figure 6 . It consists of a LiDAR system located at CIBioFi-Universidad del Valle (3.37° N, 76.53° W). The LiDAR configuration used operates in elastic mode to receive single-scattered signals at a wavelength of 532 nm, with a coaxial static alignment. The emission component comprises a single-frequency Nd:YAG pulsed laser (Q-smart 450 mJ, by Quantel) with a pulse width of ≤ 6 ns at 1064 nm, and a pulse repetition frequency of 10 Hz. A second harmonic generator (SHG) was implemented in order to double the central frequency, resulting in a wavelength of 532 nm with an average energy ≥ 220 mJ. The output beam has a diameter of 6.5 mm, which is expanded three times (3×) through a Galilean system. A Newtonian telescope with a focal length of 1.0 m and a primary mirror with a diameter of 0.3 m constitutes the reception component. The telescope, with a field of view of 1.47 mrad, collects the light backscattered by the atmosphere. The ocular is coupled to a diaphragm with variable aperture, then to an interferential filter at 532 nm, and finally to a photomultiplier tube (Hamamatsu, R9880U series) to guarantee the reception of the elastically scattered light with a quantum efficiency of 50% at 532 nm. The transient recorder system (Licel, TR20-160) constitutes the detection unit.
The transient system allows the discrimination of signals from different altitudes through synchronisation with a periodic signal (the trigger), which generally comes from the Q-switch control of the laser system, achieving a spatial resolution of 3.75 m in the vertical atmospheric column. The acquisition protocol consists of routine observations on different days, storing LiDAR signals with at least 2000 pulses per stored signal from 8:00 until 18:00, local time, on a weekly basis since 2018.

Hyperparameter tuning was achieved via importance sampling over progressively smaller hyperparameter grids. Log-cosh was chosen as the loss function and Adam 44 as the optimizer, as they exhibited the fastest convergence for this particular problem. Batch normalisation 45 was used between layers for even faster convergence. Neither L2 regularisation nor dropout was used (see Supplementary Material). Training was performed for 400 epochs with a linear learning rate decay from an initial value of 0.1 to a final value of 0.001. 12000 signals were used as a training set, 3000 signals as a cross-validation set, and another 2000 unlabeled signals as a qualitative test set, all normalised via standardisation. This computational method was implemented in the TensorFlow library 46 via its Python 3 API. Additional information about the ABLT neural network (ABLT-NN) code can be found in Supplementary Code. The data reported in this article are available from the corresponding authors upon reasonable request.

The wavelet covariance transform (WCT) method 1,2 is based on the convolution between the LiDAR range-corrected signal $B(z)$ and a Haar wavelet $h$, which is defined as 3 :

$$h\left(\frac{z}{a}\right) = \begin{cases} +1, & -\frac{a}{2} \le z < 0, \\ -1, & 0 \le z \le \frac{a}{2}, \\ 0, & \text{elsewhere}, \end{cases}$$

where the parameter $a$ is known as the amplitude or dilation of the wavelet. The convolved signal $\mathrm{WCT}(z, a)$ is then:

$$\mathrm{WCT}(z, a) = \left(B * h\left(\tfrac{z}{a}\right)\right)(z).$$

Convolving $B(z)$ with $h(z/a)$ results in a new signal in which each point indicates the degree of similarity between both.
Given that $h(z/a)$ is, essentially, an abrupt gradient, each point $z$ of the convolved signal quantifies the gradients present on an interval of amplitude $a$ around the point $z$ of the original signal. Thus, the minimum of this convolution represents the point of highest similarity between $B(z)$ and $h(z/a)$, and its altitude $z^{\mathrm{wct}}_{\mathrm{ABL}}$ is then, assuming that an adequate interval around the entrainment zone was chosen, taken as the atmospheric boundary layer top (ABLT):

$$z^{\mathrm{wct}}_{\mathrm{ABL}} = z^{\mathrm{res}}_{\mathrm{lidar}} \cdot \underset{z}{\arg\min}\; \mathrm{WCT}(z, a),$$

where $z^{\mathrm{res}}_{\mathrm{lidar}}$ denotes the vertical spatial resolution of the LiDAR setup (3.75 m in our configuration). The dilation $a$ defines the width of each convolution window, so if such dilation matches the entrainment zone width, the convolution region corresponding to it will be maximised relative to gradients of smaller amplitude, such as the ones associated with instrumental noise 2,4 , thus providing an ABLT value close to the midpoint of the indicated entrainment zone (assuming an adequate search interval was chosen). Nevertheless, the presence of clouds is a failure mode for this method, given that the optical width of clouds can show up as gradients larger than the one associated with the entrainment zone, even for dilations close to the entrainment zone width 2 .

The method regarded in this work as supervised WCT differs only in the fact that the search interval for $z^{\mathrm{wct}}_{\mathrm{ABL}}$ is limited to a threshold that encompasses the entrainment zone observed for the entire time evolution of each particular case study, in order to avoid, as far as possible, clouds or large gradients that can negatively affect the performance of the method; this is what requires supervision. In contrast, unsupervised WCT defines a search threshold that goes from the start of the LiDAR signal to an upper bound defined by the observed interpretable portion of the signal, which in our particular experimental setup corresponds to around 6 km of altitude.
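To make the difference between the supervised and unsupervised WCT variants concrete, here is a minimal NumPy sketch under stated assumptions: the function names, the 40-bin (150 m) dilation, the search interval and the synthetic step-plus-cloud profile are all hypothetical choices for illustration, and the wavelet sign convention is the one that makes the ABLT a minimum of the convolution, as described above. It is not the authors' implementation.

```python
import numpy as np

def haar_kernel(a_bins):
    """Discrete Haar wavelet of dilation a_bins: +1 on the first half, -1 on the second."""
    half = a_bins // 2
    return np.concatenate([np.ones(half), -np.ones(half)])

def wct_ablt(rcs, z_res=3.75, a_bins=40, search_m=None):
    """Return the ABLT altitude (m) as the argmin of the convolution of rcs with the wavelet.

    search_m is an optional (z_min, z_max) interval in metres; None means the full
    signal is searched, which mimics the unsupervised variant.
    """
    rcs = np.asarray(rcs, dtype=float)
    # np.convolve flips the kernel, i.e. it is a convolution in the mathematical sense.
    wct = np.convolve(rcs, haar_kernel(a_bins), mode="same") / a_bins
    half = a_bins // 2
    lo, hi = 0, len(rcs)
    if search_m is not None:
        lo, hi = int(search_m[0] / z_res), int(search_m[1] / z_res)
    # Discard the borders, where the window runs past the end of the signal.
    lo, hi = max(lo, half), min(hi, len(rcs) - half)
    index = lo + int(np.argmin(wct[lo:hi]))
    return index * z_res

# Synthetic profile (hypothetical): aerosol layer up to 1.8 km and a cloud near 4 km.
z = np.arange(1600) * 3.75
rcs = np.where(z < 1800.0, 1.0, 0.2)
rcs[(z >= 4000.0) & (z < 4100.0)] = 3.0

unsup = wct_ablt(rcs)                             # full signal: attracted by the cloud top
sup = wct_ablt(rcs, search_m=(1000.0, 3000.0))    # restricted search interval
print(unsup, sup)
```

With this toy profile the unrestricted search lands on the cloud edge above 4 km, while the restricted search recovers the step at 1.8 km, which mirrors the behaviour reported for the two WCT variants.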
It is worth mentioning that the convolution operation involved in this method is treated in its traditional mathematical sense (similar to that of equation (4), but flipping the kernel along its dimensions before the operation), so it is not equivalent to the operation treated as a convolution in section 2.2 of this supplementary material. If the convolution were performed as a cross-correlation operation, the resulting signal would be flipped around the x-axis, so argmax should be used instead.

Artificial neural networks (NNs) are computational systems partially inspired by the biological neuron. The main components of the architecture of a NN are computing units called Artificial Neurons (ANs), which are interconnected in a way that allows the propagation of information through the structure of the NN. The branch of artificial intelligence that concerns the design, architecture and optimisation of NNs is known as deep learning 5,6 . The quintessence of deep learning is the perceptron. Its more general version, the multi-layer perceptron (MLP) 5, [7] [8] [9] , is a sequential model composed of one or more layers of ANs. An MLP defines a transformation $y = f(X, \theta)$ and is trained to learn the parameters $\theta = \{W, b\}$ that best approximate $f$ to an objective function $f^{*}$ which is, in general, not known explicitly, but is contained in the inner structure of a set of data 6, 9 . MLPs are also called feedforward neural networks, because in these, information flows from an input layer to the hidden layers of ANs, and finally to an output layer; there are no feedback connections in which the output of a layer returns to itself as input.
The propagation of information between two sequential layers, say the $(l-1)$-th and the $l$-th, of an MLP is done via the following expression:

$$A^{[l]} = g^{[l]}\left(W^{[l]} A^{[l-1]} + b^{[l]}\right), \qquad (3)$$

where $A^{[l]}$ is the vector containing the artificial neurons of the $l$-th layer, $W^{[l]}$ is a dense weight matrix which defines the connections between the neurons of both layers, $b^{[l]}$ is a weight vector called bias, and $g^{[l]}$ is a non-linear function that is applied element-wise to induce non-linearity in the transformation defined by the MLP. The initial layer $A^{[0]}$ of an MLP corresponds to the input layer $X$. Given that $W$ is a dense matrix, the computational cost of MLPs scales poorly with the dimensionality of data. To address this problem, a new NN architecture inspired by the biological visual cortex was conceived, the so-called convolutional neural network.

In deep learning, convolution refers to the operation usually regarded as cross-correlation in mathematics. For the two-dimensional case, this operation involves a matrix $M$ and a filter or kernel $K$, and returns the degree of similarity $C$ between $M$ and $K$ at each of the points of $M$ at which the convolution is possible. Mathematically, this operation is denoted by a star "$\star$" and is defined as:

$$C_{i,j} = (M \star K)_{i,j} = \sum_{m} \sum_{n} M_{i+m,\, j+n}\, K_{m,n}, \qquad (4)$$

where $(i, j)$ are the matrix indices of $M$ and $(m, n)$ those of $K$. For the one-dimensional case of a given signal $S$, the operation is reduced to:

$$C_{i} = (S \star K)_{i} = \sum_{m} S_{i+m}\, K_{m}.$$

In deep learning, additional parameters are usually defined on the convolution operation, such that the dimensions $m^{\mathrm{conv}}_{1} \times m^{\mathrm{conv}}_{2}$ of the two-dimensional convolution between an array of dimensions $m_1 \times m_2$ and a filter with dimensions $k_1 \times k_2$ are given by:

$$m^{\mathrm{conv}}_{i} = \left\lfloor \frac{m_i + 2 p_i - k_i}{s_i} \right\rfloor + 1, \qquad i = 1, 2,$$

where the parameters $(s_1, s_2)$ are known as stride, and indicate the step, in elements, between consecutive filter applications. The $(p_1, p_2)$ parameters are known as padding, and indicate the number of zeros that are added around the edges of the input, typically in order to preserve its original dimensions after convolution.
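The relation between input size, kernel size, stride and padding can be checked numerically. The sketch below assumes the standard output-size formula, floor((m + 2p - k) / s) + 1, and a plain one-dimensional cross-correlation; both function names are hypothetical and introduced only for illustration.

```python
import numpy as np

def conv_output_size(m, k, s=1, p=0):
    """Length of a strided, zero-padded cross-correlation along one dimension."""
    return (m + 2 * p - k) // s + 1

def cross_correlate_1d(signal, kernel, stride=1, padding=0):
    """Deep-learning style 'convolution': slide the kernel without flipping it."""
    x = np.pad(np.asarray(signal, dtype=float), padding)  # zeros on both edges
    kernel = np.asarray(kernel, dtype=float)
    k = len(kernel)
    n_out = conv_output_size(len(signal), k, stride, padding)
    return np.array([np.dot(x[i * stride:i * stride + k], kernel)
                     for i in range(n_out)])

# A 2000-point signal and a 5-point kernel (both sizes chosen for illustration).
print(conv_output_size(2000, 5))            # no padding, unit stride
print(conv_output_size(2000, 5, p=2))       # padding that preserves the length
print(conv_output_size(2000, 5, s=2, p=2))  # stride 2 halves the length

ramp = np.arange(10)
print(cross_correlate_1d(ramp, [1, 0, -1]))  # constant slope gives a constant output
```

Choosing the padding as (k - 1) / 2 for an odd kernel size k and unit stride keeps the output length equal to the input length, which is the usual reason for padding mentioned above.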
When a NN layer uses the MLP propagation function, equation (3), it is said to be a dense layer. In contrast, when a layer of a NN uses

$$A^{[l]} = g^{[l]}\left(W^{[l]} \star A^{[l-1]} + b^{[l]}\right)$$

as its propagation function, it is said to be a convolutional layer. Here $W^{[l]}$ is a multidimensional array of filters that are convolved in batch with the input of the layer. A Convolutional Neural Network (CNN, ConvNet or simply NN) is usually defined as a neural network with at least one convolutional layer 6,10,11 , although MLPs are sometimes regarded as CNNs. Unlike dense layers, convolutional layers operate by applying the same set of weights (filters) over each region of the input, and are said to be spatially regularized. It is the application of multiple low-dimensional filters that allows a convolutional layer to abstract features of a high-dimensional input (such as high-resolution images) at a low computational cost 6,10,11 .

The parameters of a NN architecture that can only be varied by the programmer (such as the number of hidden layers or the number of neurons per layer) are regarded as hyperparameters. The remaining parameters, which can be varied by the machine (such as the weights and biases), are adjusted via a training process performed by means of an algorithm called backpropagation 9 . Given that backpropagation is a delicate and challenging algorithm to implement, especially for deeper architectures, training of neural networks is usually performed via linear algebra and symbolic differentiation libraries such as TensorFlow 12 that automatically implement and perform the backpropagation process for a given neural network architecture.
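The two propagation rules can be compared side by side in a few lines of NumPy. The sketch below is a forward pass only, with random weights; the ReLU activation, the kernel size of 9 and the MLP widths (16, 8, 1) are assumptions made for illustration, since only the number of layers and kernels is stated for the network of figure 1. It is not the trained model.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def dense_layer(a_prev, W, b, g=relu):
    """Dense propagation: g(W A + b), one weight per input-output pair."""
    return g(W @ a_prev + b)

def conv1d_layer(a_prev, filters, b, g=relu):
    """Convolutional propagation: the same small filters slide over the input.

    a_prev: (channels_in, length); filters: (n_filters, channels_in, k); b: (n_filters,)
    """
    n_filters, _, k = filters.shape
    length_out = a_prev.shape[1] - k + 1
    out = np.empty((n_filters, length_out))
    for f in range(n_filters):
        for i in range(length_out):
            out[f, i] = np.sum(a_prev[:, i:i + k] * filters[f]) + b[f]
    return g(out)

rng = np.random.default_rng(0)
a = rng.normal(size=(1, 2000))   # stand-in for a standardised 2000-point signal

channels_in = 1
for _ in range(3):               # three convolutional layers, five kernels each
    filt = rng.normal(scale=0.1, size=(5, channels_in, 9))
    a = conv1d_layer(a, filt, np.zeros(5))
    channels_in = 5

a = a.reshape(-1)                # flatten the feature maps for the MLP
for units, g in [(16, relu), (8, relu), (1, lambda x: x)]:
    W = rng.normal(scale=0.01, size=(units, a.size))
    a = dense_layer(a, W, np.zeros(units), g)

print(a.shape)                   # a single continuous output, as in a regression
```

Under these assumed sizes the first convolutional layer holds 5 x 9 + 5 = 50 trainable parameters, whereas a dense layer mapping the same 2000 inputs to only 16 units would need 2000 x 16 + 16 = 32016, which illustrates the cost argument made above.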
Once trained, the neural network can be treated as a black box that receives a portion of a LiDAR signal $B(z)$ as input (e.g., the first 2000 points of the signal exhibited a good performance in our experiments), and outputs an approximate ABLT vector index $\mathrm{NN}(B(z))$ such that the ABLT altitude $z^{\mathrm{NN}}_{\mathrm{ABL}}$ is given by:

$$z^{\mathrm{NN}}_{\mathrm{ABL}} = \mathrm{NN}(B(z)) \cdot z^{\mathrm{res}}_{\mathrm{lidar}},$$

where $z^{\mathrm{res}}_{\mathrm{lidar}}$ denotes the vertical spatial resolution of the LiDAR setup. The ABLT detection was approached as a regression problem, as the NN outputs a single, continuous value on its final layer. A classification approach was also attempted, where the output layer of the NN consisted of a 2000-unit softmax layer, with each element indicating an ABLT occurrence probability at its corresponding altitude. Nevertheless, the performance of this second approach was significantly poorer than that of the direct regression treatment.

During August 2019, one of the largest forest fires of recent times was recorded in the Brazilian Amazon forest. The transport of aerosols due to the burning of biomass was registered by satellite information, e.g., from Aqua 13 . This anomalous increase in the aerosol layer reached southwestern Colombia, and was detected at the LiDAR-CIBioFi on-site station, recording the most consistent behaviour on August 21, as reported in figure 3c of the main paper. Data from NASA's Atmospheric Infrared Sounder (AIRS) instrument, aboard the Aqua satellite 13 , as shown in figure 1 of supplementary material (SM), serve as a guide to examine our results. Figure 1-SM shows the transport of carbon monoxide on August a 14, b 15, c 16, and d 21, at a height of around 5.5 km, being dragged towards northwestern South America. Carbon monoxide is a pollutant that contributes to both air pollution and climate change; the concentrations depicted in colour in figure 1-SM range between green (100 parts per billion by volume, ppbv) and dark red (160 ppbv) 13 .
In these plots, we can indeed appreciate the massive aerosol cloud distribution over a vast region of subcontinental South America, and, in particular, the impact on southwestern Colombia, the region where the measurements here reported were taken. The pollutant transport dynamics portrayed in figure 1 of supplementary material has been cross-checked by means of the HYSPLIT model in order to account for wind transport profiles during the five days previous to each of the dates plotted in figure 1.

A European aerosol phenomenology-3: Physical and chemical characteristics of particulate matter from 60 rural, urban, and kerbside sites across Europe
Atmospheric composition change: Climate-Chemistry interactions. Atmospheric Environment
Global and regional climate changes due to black carbon
Climatology of aerosol radiative properties in the free troposphere
Climate Vulnerability
Strong present-day aerosol cooling implies a hot future
Decadal attribution of historic temperature and ocean heat content change to anthropogenic emissions
In the line of fire
When will the Amazon hit a tipping point
An overview of ACE-Asia: Strategies for quantifying the relationships between Asian aerosols and their climatic impacts
Organic aerosol and global climate modelling: a review
The equilibrium sensitivity of the Earth's temperature to radiation changes
Long-term air pollution exposure and cardio-respiratory mortality: a review
Air quality co-benefits for human health and agriculture counterbalance costs to meet Paris Agreement pledges
The contribution of outdoor air pollution sources to premature mortality on a global scale
The race to decipher how climate change influenced Australia's record fires
Climate Change Data. (Consulted Mar. 2020). data.worldbank.org/topic/climate-change
Climate-smart agriculture for food security
Detection and analysis of microfronts and associated coherent events using localized transforms
Finding boundary layer top: Application of a wavelet covariance transform to lidar backscatter profiles
Wavelet correlation transform method and gradient method to determine aerosol layering from lidar returns: Some comments
The detection of mixed layer depth and entrainment zone thickness from lidar backscatter profiles
Deep learning
The perceptron: A probabilistic model for information storage and organization in the brain
Learning representations by back-propagating errors
Contour and Grouping in Computer Vision
Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org
NASA's AIRS maps carbon monoxide from Brazil fires. www.nasa.gov/feature/jpl/nasas-airs-mapscarbon-monoxide-from-brazil-fires

This work was funded by the Colombian Science, Technology and Innovation Fund-General Royalties System (Fondo CTeI-Sistema General de Regalías) and Gobernación del Valle del Cauca (Grant BPIN 2013000100007), and by Fundación para la Promoción de la Investigación y la Tecnología (Grant 201921). We are grateful to the Laboratory for Atmospheric Physics (LAFA) and to the LiDAR station at the Centre for Bioinformatics and Photonics (CIBioFi) for the provided data.

D.R.V. wrote the convolutional neural network code; D.R.V. and E.S. ran the model simulations, collected data and prepared the figures. J.H.R. conceived the study and supervised the experiment and simulations. All authors contributed to the analysis and interpretation of the results, and to the writing of the manuscript.

Supplementary information is available in the online version of the paper. Correspondence and requests for materials should be addressed to J.H.R.
The authors declare no competing interests.