key: cord-0079016-pdqkyphd
authors: Martín, Alejandro; Camacho, David
title: Recent advances on effective and efficient deep learning-based solutions
date: 2022-05-25
journal: Neural Comput Appl
DOI: 10.1007/s00521-022-07344-9
sha: 93241b1aaaa7e1173bd257f399386b7b66ff4693
doc_id: 79016
cord_uid: pdqkyphd

This editorial briefly describes and analyses a set of selected papers published in a special issue focused on deep learning methods and architectures and their application to several domains and research areas. The selected articles address two basic aspects of deep learning (DL) methods: the efficiency of the models and the effectiveness of the architectures. These papers revolve around different application domains such as health (e.g. cancer, polyps, melanoma, mental health), wearable technologies, solar irradiance, social networks, cloud computing, wind turbines, object detection, music, and electricity, among others. This editorial provides a short description of each published article and a brief analysis of its main contributions.

Deep learning has received a lot of attention in the last two decades. Although references to deep models appeared years before, it is in this century that computational resources have made it possible to use these models to solve very complex tasks. This process has not stopped. New advances are continuously presented to address even more complicated and challenging undertakings. However, the size and sophistication of recent models and tasks are constantly growing. In this special issue, we have sought high-quality contributions focusing on two fundamental characteristics necessary to extend and improve the use of deep learning architectures. On the one hand, we have received research addressing the efficiency of the models: novel paradigms and approaches to tackle the computational resource restrictions that limit the application of these architectures. On the other hand, we have also pursued research targeting the effectiveness of these architectures: novel and extended applications of deep learning that go one step further and solve problems with robust performance. The submitted manuscripts were reviewed by a large number of experts from both academia and industry. After two to three rounds of reviewing, the highest quality papers were accepted for this special issue. In total, 20 papers were selected from 69 submissions to be published in the Neural Computing and Applications journal.

Given the large number of accepted papers and the diversity of topics addressed in these works, we have generated a word cloud (see Fig. 1) using the 60 most frequent words appearing in the abstracts of all the papers. As can be seen in this figure, several keywords are highlighted (excluding some common keywords related to deep learning, CNN, neural networks, etc.), such as: diagnoses, cancer or polyp, related to Health; embedding, text processing or summarisation, related to NLP; electricity, wind, fuzzy or control, related to industry; and others such as echo state networks, reservoir computing, performance, efficient, federated, explainable, cloud systems or workload, related to the improvement of deep learning methods and architectures.
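As an illustration of the counting step behind such a word cloud, the following is a minimal sketch that assumes the abstracts are available as plain strings. The stop-word list, the tokenisation rule, and the function name `top_words` are assumptions made for this example; the exact procedure used to produce Fig. 1 is not described in this editorial.

```python
import re
from collections import Counter

# Hypothetical stop-word list: the editorial only states that common
# deep learning terms were excluded, not which ones.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on",
              "deep", "learning", "neural", "network", "networks", "cnn"}


def top_words(abstracts, n=60):
    """Return the n most frequent non-stop words across all abstracts."""
    counts = Counter()
    for text in abstracts:
        # Lower-case the text and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


example = [
    "Deep learning for cancer diagnosis",
    "Federated learning for cancer and polyp detection",
]
print(top_words(example, n=3))
```

The resulting (word, count) pairs are the kind of input that a word-cloud renderer typically expects.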

Therefore, we have classified the papers of this special issue into two main categories: the first contributes to the state of the art in basic research on deep learning models and architectures, and the second focuses mainly on the application of deep learning techniques in several domains (we have highlighted a few keywords for each paper to easily categorise its main contributions):

• Improvement of deep learning architectures (3 papers):

Al-Asaly et al. [ 

This section presents a summary of the main contributions of each published paper, organised following the previous categories.

A number of submissions focus on improving current architectures or integrating new capabilities. In this line, Al-Asaly et al. [1] provide a new workload forecasting method in the context of cloud resource provisioning by using a deep learning approach. The proposal includes a new deep learning solution to establish how to respond to the different workload changes. The architecture, a Diffusion Convolutional Recurrent Neural Network, tackles problems that, according to the authors, previous models present due to inconsistent or nonlinear workload in cloud computing systems. Thus, the proposed technique improves forecasting accuracy while reducing the error, as shown in the experiments with CPU usage traces. Also focused on improving current deep learning-based solutions, Arrieta et al. [4] elaborate on a framework to produce explanations for reservoir computing approaches, in particular deep echo state networks. The proposed framework is capable of generating different insights about the knowledge captured by these randomisation-based recurrent neural networks, informing the user about the memory depth of the trained reservoir, the existence of different state dynamics in the reservoir, and the importance attributed to the different parts of the modelled sequence. These explainability techniques are evaluated over different datasets, including new experiments for video classification using these models, showcasing the output of the proposed framework and the insights it can yield in practical contexts (e.g. presence of hidden biases).

A third paper, focused on a very important feature of recent models, privacy, is also included in this subsection. Li and Wang [10] propose a privacy-preserving spatial-temporal prediction technique that, in combination with federated learning (FL), produces accurate spatial-temporal predictions while preserving privacy. Given the characteristics of spatial-temporal data, it is necessary to adopt new approaches to successfully address data heterogeneity.

From all the submissions received, five were focused on the Natural Language Processing domain. In Ali et al. [2], the authors propose a heterogeneous network embedding model that jointly learns node representations by exploiting semantics corresponding to the author, time, context, field of study, citations, and topics. This network is designed to exploit the semantic relations and contextual information between the objects of bibliographic paper networks, whose neglect can result in inadequate citation recommendations. It thereby reduces problems such as lack of personalisation, cold start, and network sparsity when a massive number of research articles need to be recommended or identified. Compared to baseline models, the results produced by the proposed model over the DBLP datasets show 10% and 12% improvements in mean average precision (MAP) and normalised discounted cumulative gain, respectively. The effectiveness of this model is also analysed on the cold-start paper and network sparsity problems, where it gains 12% and 9% better MAP and recall@10 scores, respectively.

In Ji et al. [7], a text representation is enhanced with lexicon-based sentiment scores and latent topics, and relation networks are proposed to detect suicidal ideation and mental disorders with related risk indicators. This work addresses highly relevant problems related to mental health through effective deep learning methods, using a relation module that is further equipped with an attention mechanism to prioritise the more critical relational features. Three real-world datasets are used to carry out an experimental evaluation of the authors' approach.

Pandelea et al. [12] address retrieval-based dialogue systems, which pose additional difficulties when deployed in environments characterised by limited resources. In this work, the authors propose a new framework for hardware-aware retrieval-based dialogue systems based on the dual-encoder architecture, coupled with a clustering method that groups candidates pertaining to the same conversation and reduces storage capacity and computational power requirements.

Paul et al. [13] propose a classifier ensemble of deep transfer learning models with a support vector machine (SVM) as the aggregator for handwritten music symbol recognition. The authors develop a method that can digitally store music sheets containing handwritten music scores using optical music recognition (OMR) systems, which automatically interpret scanned handwritten music scores. The ensemble is built on three pre-trained deep learning models, namely ResNet50, GoogleNet and DenseNet161 (each trained on ImageNet), which are fine-tuned on the target datasets, i.e. music symbol image datasets. The experimental evaluation shows that the ensemble technique can capture a more complex association of the base classifiers, thus improving the overall performance.

Zhang et al. [20] propose a two-stage transformer-based abstractive summarisation model to improve factual correctness, denoted as FCSF-TABS. This novel model has been designed to take a step further in machine summarisation models for massive text data processing. In the first stage, the authors use a fine-tuned BERT classifier to perform content selection, choosing summary-worthy single sentences or adjacent sentence pairs in the input document. In the second stage, the authors feed the selected sentences into the transformer-based summarisation model to generate summary sentences. Furthermore, during training, the authors also introduce reinforcement learning to jointly optimise a mixed-objective loss function.

Four relevant papers related to the application of deep learning techniques to health have been selected. In Amor et al. [3], the authors propose a novel deep embedded refined clustering method for breast cancer differentiation based on DNA methylation. Specifically, the deep learning system presented in this paper uses the levels of CpG island methylation, which range between 0 and 1. The proposed approach is composed of two main stages. The first stage consists of the dimensionality reduction of the methylation data based on an autoencoder. The second stage is a clustering algorithm based on the soft assignment of the latent space provided by the autoencoder. The whole method is optimised through a weighted loss function composed of two terms: reconstruction and classification terms. One of the most relevant contributions of this work is the novelty of linking dimensionality reduction algorithms to classification, trained end-to-end, for DNA methylation analysis.
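To make the two-stage idea more concrete, the following minimal sketch shows one way a soft assignment over a latent space and a weighted two-term loss can be computed. The Student's t kernel is an assumption borrowed from the general deep embedded clustering literature, and the weight `gamma` and all function names are hypothetical; the exact formulation used by Amor et al. [3] is not given in this editorial.

```python
def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment of latent point z to each centroid.

    Uses a Student's t kernel (an assumption for this sketch): closer
    centroids receive a higher share, and the shares sum to one.
    """
    weights = []
    for c in centroids:
        sq_dist = sum((zi - ci) ** 2 for zi, ci in zip(z, c))
        weights.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
    total = sum(weights)
    return [w / total for w in weights]


def weighted_loss(reconstruction_loss, classification_loss, gamma=0.1):
    """Combine the two terms; gamma is a hypothetical weighting factor."""
    return reconstruction_loss + gamma * classification_loss


# A latent point sitting on the first centroid is assigned mostly to it.
q = soft_assign([0.0, 0.0], [[0.0, 0.0], [3.0, 4.0]])
print(q)
```

In an end-to-end setting, both terms would be minimised jointly so that the autoencoder's latent space is shaped by the clustering objective as well as by reconstruction.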

Nogueira-Rodríguez et al. [11] design a new deep learning model for real-time polyp detection, aimed at detecting colorectal cancer, based on a pre-trained YOLOv3 (You Only Look Once) architecture and complemented with a post-processing step based on an object-tracking algorithm to reduce false positives. The base YOLOv3 network was fine-tuned using a dataset composed of 28,576 images labelled with locations of 941 polyps that will be made public soon. The object-tracking algorithm has demonstrated a significant improvement in specificity while maintaining sensitivity, with only a marginal impact on computational performance.

Pérez and Ventura [14] propose a convolutional neural network architecture for melanoma diagnosis inspired by ensemble learning and genetic algorithms. The architecture is designed by a genetic algorithm that finds optimal members of the ensemble. Additionally, the abstract features of all models are merged and, as a result, additional prediction capabilities are obtained. The diagnosis is achieved by combining all individual predictions. In this manner, the training process is implicitly regularised, showing better convergence, mitigating the overfitting of the model, and improving the generalisation performance. The aim of this work is to find the models that best contribute to the ensemble. The proposed approach also leverages data augmentation, transfer learning, and a segmentation algorithm. The segmentation can be performed without training and on a central processing unit, thus avoiding the need for a significant amount of computational power while maintaining competitive performance.

In Qureshi et al. [15], a systematic literature review details the literature on ambient assisted living solutions and helps to understand how ambient assisted living helps and motivates patients with cardiovascular diseases to self-manage, reducing the associated morbidity and mortality. The paper is divided into four main themes: self-monitoring wearable systems, ambient assisted living in aged populations, clinician management systems, and deep learning-based systems for cardiovascular diagnosis. For each theme, a detailed and comprehensive analysis shows: (1) how these new technologies are nowadays integrated into diagnostic systems, and (2) how new technologies like IoT sensors, cloud models, and machine and deep learning strategies can be used to improve medical services.

Related to the area of image and audio processing, six high-quality papers were finally selected. In Fenza et al. [5], the authors address the problem of extracting valuable insights from unstructured multimedia content. This work focuses on image understanding, specifically on associating people's names with faces in images, which is still an open issue. The proposed solution tries to improve the name-face association by defining a cognitive layer for a deep learning architecture that embeds the surrounding context of the entities in the caption or the image. The method mainly focuses on name-face association as an enabling technology for people recognition in open-source intelligence frameworks, which mostly investigate little-known (or unknown) people. Given a face, the proposed system predicts the most likely corresponding name leveraging image features, caption, and context. The learning model embeds the knowledge graph of the context elicited in the open sources through a graph neural network.

Huertas-Tato et al. [6] present a multi-view convolutional neural network architecture to estimate solar irradiance from ground-level Total Sky Images (TSIs). Pairing Total Sky Images and Convolutional Neural Networks can effectively estimate Global Horizontal Irradiance (GHI) in photovoltaic (PV) systems, replacing expensive equipment with off-the-shelf cameras. This network combines three different views with accurate results, proving to be more effective than a single-camera approach and strict baselines (feature extraction and cloud fraction). Their results show the advantages of early fusion of multiple perspectives in the solar irradiance monitoring domain.

Tarasiuk and Szczepaniak [18] presented a novel method for improving the invariance of convolutional neural networks (CNNs) to selected geometric transformations in order to obtain more efficient image classifiers. A common strategy employed to achieve this aim is to train the network using data augmentation. Such a method alone, however, increases the complexity of the neural network model, as any change in the rotation or size of the input image results in the activation of different CNN feature maps. This problem can be resolved by the proposed novel convolutional neural network models with geometric transformations embedded into the network architecture. The evaluation of the proposed CNN model is performed on the image classification task with the use of diverse representative data sets.

Rodriguez-Conde et al. [16] provide an up-to-date review of the most relevant scientific research related to the object detection problem. The on-device paradigm has emerged as a recent alternative, pursuing more compact and efficient networks to ultimately enable the execution of the models directly on resource-constrained client devices. In particular, this work contributes to the field with a comprehensive architectural overview of both the existing lightweight object detection frameworks targeted to mobile and embedded devices, and the underlying convolutional neural networks that make up their internal structure.

Leroux et al. [8] propose a new architecture, which replaces sequential layers in a CNN with an iterative structure where weights are reused multiple times for a single input image, reducing the storage requirements drastically. This new architecture, built upon deep residual networks (ResNets), is used to reduce the computational cost of evaluating a neural network, which usually depends only on design choices such as the number of layers or the number of units in each layer and not on the actual input. In addition, the authors incorporate an adaptive computation module that allows the network to adjust its computational cost at run time for each input sample independently.
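The weight-reuse idea can be illustrated with a small sketch in which a single residual block is applied repeatedly to the same input, so only one set of weights needs to be stored. The early-exit rule used here (stop when the update becomes small) is a hypothetical stand-in for the adaptive computation module, whose actual design is not described in this editorial; the function names and the `tol` parameter are likewise assumptions.

```python
import math


def residual_block(x, weights):
    """One residual update x + tanh(W x) using a shared weight matrix."""
    out = []
    for row, xi in zip(weights, x):
        pre_activation = sum(w * xj for w, xj in zip(row, x))
        out.append(xi + math.tanh(pre_activation))
    return out


def iterative_forward(x, weights, max_iters=8, tol=1e-3):
    """Apply the same block repeatedly, stopping early on small updates.

    Returns the final state and the number of iterations actually used,
    so the cost can differ from one input sample to the next.
    """
    steps = 0
    for _ in range(max_iters):
        new_x = residual_block(x, weights)
        steps += 1
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_x, x)))
        x = new_x
        if change < tol:
            break
    return x, steps


identity = [[1.0, 0.0], [0.0, 1.0]]
state, used = iterative_forward([1.0, 1.0], identity)
print(used, state)
```

A sequential network with the same depth would store one weight matrix per layer, whereas this loop stores a single matrix regardless of how many iterations are run.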

Li et al. [9] propose a Combined Angular Margin and Cosine Margin Softmax Loss (AMCM-Softmax) approach to enhance intra-class compactness and inter-class discrepancy simultaneously. This new approach is used to overcome one of the problems of the Softmax loss commonly used in existing CNNs, which lacks sufficient power to discriminate deep features in domains such as music classification. Normalisation of the weight vectors and feature vectors is adopted to eliminate radial variations. Then, an angular margin parameter and a cosine margin parameter are introduced to maximise the decision margin by enforcing angular and cosine margin constraints. Consequently, the discrimination of features is enhanced by normalisation and margin maximisation. The decision boundary and the target logit curve of AMCM-Softmax provide a clear geometric interpretation, and the experimental evaluation shows excellent results on music datasets.
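The steps described above (normalisation, then margins applied to the target class) can be sketched as follows. The target logit used here, `scale * (cos(theta + m_ang) - m_cos)`, is one common way of combining an angular and a cosine margin and is an assumption of this example; the exact AMCM-Softmax formulation should be taken from Li et al. [9]. The parameter names and default values are hypothetical.

```python
import math


def _normalise(v):
    """Scale a vector to unit length, removing radial variation."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def margin_softmax_loss(feature, class_weights, target,
                        m_ang=0.2, m_cos=0.1, scale=16.0):
    """Cross-entropy over cosine logits with margins on the target class.

    Note: this sketch does not handle the case theta + m_ang > pi.
    """
    f = _normalise(feature)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(f, _normalise(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
        if j == target:
            theta = math.acos(cos_theta)
            # Assumed combination: angular margin inside, cosine margin outside.
            cos_theta = math.cos(theta + m_ang) - m_cos
        logits.append(scale * cos_theta)
    # Numerically stable log-sum-exp for the softmax denominator.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum - logits[target]


weights = [[1.0, 0.0], [0.0, 1.0]]
print(margin_softmax_loss([1.0, 0.0], weights, target=0))
```

Because the margins lower the target logit, the same sample incurs a larger loss than under a plain normalised softmax, which pushes features of one class closer together and further from other classes.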

Finally, two interesting and novel works related to Industry were selected for publication. In Sierra-Garcia and Santos [17], the research focuses on wind energy, a topic of great relevance and interest. Specifically, it deals with the control of the blade angle of wind turbines, a renewable energy system that is complex, nonlinear, and has coupled dynamics. The authors develop a hybrid intelligent system that combines fuzzy logic and deep learning. Deep learning techniques are used to estimate the current wind that impacts the rotor and to forecast the future wind. Estimation and forecasting are combined to obtain the effective wind which feeds the fuzzy controller. Simulation results show how including the effective wind improves the efficiency of the turbine for different disturbances and wind speeds.

Torres et al. [19] propose a deep neural network to address short-term electricity consumption forecasting, namely a long short-term memory (LSTM) network, chosen for its ability to deal with sequential data such as time series. The optimal values for certain hyperparameters have been obtained by a random search and by a metaheuristic called the coronavirus optimization algorithm (CVOA), based on the propagation of the SARS-CoV-2 virus. Then, the optimal LSTM has been applied to predict the electricity demand with a 4-h forecast horizon. Results using nine and a half years of Spanish electricity data measured at 10-min frequency are presented and discussed.
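With 10-min sampling, a 4-h horizon corresponds to 24 future values per prediction. The following minimal sketch shows how a consumption series could be cut into (history, horizon) training pairs for a sequence model under that setting. The history length and the function name `make_windows` are assumptions for this example, since the editorial does not state the input window used by Torres et al. [19].

```python
SAMPLES_PER_HOUR = 6            # 10-minute sampling, as stated in the text
HORIZON = 4 * SAMPLES_PER_HOUR  # 4-hour horizon -> 24 future values


def make_windows(series, history, horizon=HORIZON):
    """Slice a series into (past values, future values) training pairs.

    history is a hypothetical input length; each pair uses `history`
    consecutive values as input and the next `horizon` values as target.
    """
    pairs = []
    last_start = len(series) - history - horizon
    for start in range(last_start + 1):
        x = series[start:start + history]
        y = series[start + history:start + history + horizon]
        pairs.append((x, y))
    return pairs


demand = list(range(100))  # placeholder values, not real consumption data
windows = make_windows(demand, history=48)
print(len(windows), len(windows[0][0]), len(windows[0][1]))
```

Each pair can then be fed to a recurrent model that reads the history and outputs the whole horizon at once.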

A deep learning-based resource usage prediction model for resource provisioning in an autonomic cloud computing environment

Citation recommendation employing heterogeneous bibliographic network embedding

A deep embedded refined clustering approach for breast cancer distinction based on DNA methylation

On the post-hoc explainability of deep echo state networks for time series forecasting, image and video classification

Cognitive name-face association through context-aware graph neural network

Using a multi-view convolutional neural network to monitor solar irradiance

Suicidal ideation and mental disorder detection with attentive relation networks

Iterative neural networks for adaptive inference on resource-constrained devices

Combined angular margin and cosine margin softmax loss for music classification based on spectrograms

Federated meta-learning for spatial-temporal prediction

Real-time polyp detection model using convolutional neural networks

Toward hardware-aware deep-learning-based dialogue systems

An ensemble of deep transfer learning models for handwritten music symbol recognition

An ensemble-based convolutional neural network model powered by a genetic algorithm for melanoma diagnosis

Deep learning-based ambient assisted living for self-management of cardiovascular conditions

Optimized convolutional neural network architectures for efficient on-device vision-based object detection

Deep learning and fuzzy logic to implement a hybrid wind turbine pitch control

Novel convolutional neural networks for efficient classification of rotated and scaled images

A deep LSTM network for the Spanish electricity consumption forecasting

FCSF-TABS: two-stage abstractive summarization with fact-aware reinforced content selection and fusion

Acknowledgements The guest editors would like to thank Prof. John MacIntyre, editor-in-chief of Neural Computing and Applications. The guest editors would like to thank the reviewers for their high-quality reviews, which provided insightful and constructive feedback to the authors of the papers. The guest editors would also like to thank journal editors Rachel Moriarty, Annette Hinze, Amin Fatemi, and Rashmi Jenna for their help with submission and publication. This work has been supported by the BBVA Foundation under the CIVIC project grant, and by the Spanish Ministry of Science and Innovation under FightDIS (PID2020-117263GB-100) and XAI-Disinfodemics (PLEC2021-007681) grants.

Conflict of interest The authors declare that there is no conflict of interests regarding the publication of this paper.