Abstract
In this work, we evaluate the You Only Look Once (YOLOv3) architecture for real-time detection of insect pests in soybean. Soybean crop images were collected on different days, locations, and weather conditions between the phenological stages R1 to R6, considered the period of highest occurrence of soybean pests. For training and testing the neural network, we used 5-fold cross-validation, analyzing four metrics to evaluate the classification results: precision, recall, F-score, and accuracy; and three metrics to evaluate the detection results: mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R\(^2\)). The experimental results showed that the YOLOv3 architecture trained with batch size 32 leads to higher classification and detection rates than batch sizes 4 and 16. The results indicate that the evaluated architecture can support experts and farmers in monitoring pest control action levels in soybean fields.
We thank the Centro Nacional de Desenvolvimento Científico e Tecnológico (CNPq), the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), NVIDIA Corporation for the graphics card donation, and the Fundação de Apoio ao Desenvolvimento do Ensino, Ciência e Tecnologia do estado de Mato Grosso do Sul (FUNDECT).
1 Introduction
Soy (Glycine max) is a legume crop belonging to the Fabaceae family. It is mainly used for human nutrition (in the form of soybean oil, tofu, soy sauce, soy milk, soy protein, and soybeans) and animal nutrition (in the preparation of animal feed). Additionally, it has an excellent nutritional profile and provides essential economic support [1]. In the 2019–2020 season, Brazil once again became the world’s largest producer of soybeans, surpassing its main competitor, the United States. A record production of 134.45 million tons is expected for the 2020–2021 Brazilian crop, an increase of 7.7% over the previous season [2]. The state of Mato Grosso is the largest Brazilian producer, followed by Rio Grande do Sul, Paraná, Goiás, and Mato Grosso do Sul [3]. It is estimated that most of the direct costs of soybean production are concentrated in fertilizers (27.82%), pesticides (18.24%), machinery operations (9.10%), seeds (7.35%), and depreciation of agricultural machinery and equipment (6.76%) [4]. Therefore, one of the highest costs of soybean production is the application of pesticides to control pests that can damage the end product.
Monitoring pests from the beginning of soybean development allows a more efficient application of pesticides. Agricultural inputs can be applied in the correct amounts and places, reducing production costs and the environmental impact of the overuse of chemical control, while also contributing to human health and food safety [5]. Sampling methods, visual examination of plants, and examination of soil samples have been used to monitor pests and their damage in the crop [6].
Regardless of the method used to assess field pest infestation, the number of insects should be recorded at each sampling point, making it possible to determine the level of infestation in each area of the field. The higher the number of samples taken in an area, the greater the confidence in predicting the level of pest infestation. At least six sampling points are recommended for plots up to 10 ha, eight for plots up to 30 ha, and ten for plots up to 100 ha. For larger farms, a division into 100 ha plots is recommended [6].
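As a concrete illustration, the sampling guideline above can be encoded as a small lookup function. This is only a sketch: the function name is ours, and applying ten samplings per 100 ha plot for larger farms is our reading of the recommendation in [6].

```python
import math

def recommended_samplings(area_ha: float) -> int:
    """Minimum number of sampling points for a plot, per the guideline:
    at least six samplings up to 10 ha, eight up to 30 ha, ten up to
    100 ha; larger farms are divided into 100 ha plots."""
    if area_ha <= 10:
        return 6
    if area_ha <= 30:
        return 8
    if area_ha <= 100:
        return 10
    # Larger farms: divide into 100 ha plots, each sampled ten times.
    return 10 * math.ceil(area_ha / 100)
```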
Digital images collected from the field are an increasingly used alternative to manual sampling methods. Computer vision systems fed with these digital images perform real-time detection of pest insects. Automatically analyzing images is crucial for making these applications viable since many images are captured. In addition, high spatial resolution cameras can be mounted on Unmanned Aerial Vehicles (UAVs) to capture images of the crop on data collection missions, allowing experts and farmers to make better pest management decisions [7]. UAVs can also be programmed to perform automatic flight plan operations, with an autonomy of over 1 h depending on the model, covering average areas that can exceed 1,000 ha per flight [8]. However, few works address the use of digital images to identify insect pests in soybeans. Furthermore, no reports have been found in the literature of pest monitoring in soybean fields using state-of-the-art object detection methods such as YOLOv3 [9], Faster R-CNN [10], and RetinaNet [11].
This paper proposes to evaluate the You Only Look Once (YOLOv3) architecture to detect pest insects in an actual field environment under different lighting conditions, object sizes, and complex background variations. YOLOv3 is one of the most popular series of object detection models. Its advantage has been to provide real-time detections while approaching the accuracy of state-of-the-art object detection models. Soybean crop images were collected on different days, locations, and weather conditions between the phenological stages R1 to R6, considered the period of highest occurrence of soybean pests. For training and testing the neural network, we used 5-fold cross-validation, analyzing four metrics to evaluate the classification results: accuracy, recall, F-score, and precision; and three metrics to evaluate the detection results: mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R\(^2\)). To the best of our knowledge, no papers have addressed state-of-the-art object detection methods for pest detection in soybean.
2 Related Work
By using deep learning, the authors in [12] created a framework that identifies ten types of pests of different crops using a manually collected dataset. The dataset was collected by downloading 5,629 images from search engines (Google, Baidu, Yahoo, and Bing) and taking outdoor footage using an iPhone 7 Plus. In the pre-processing phase, data augmentation was used to expand the dataset. Further, the GrabCut and Watershed algorithms were implemented to remove complex background from the images. In the training phase, VGG-16, VGG-19, ResNet50, ResNet152 and GoogLeNet were used. The GoogLeNet model outperformed other models in terms of accuracy, model complexity, and robustness in the experiments. The original model achieved an accuracy of 93%, while the fine-tuned model achieved an accuracy of 98.91%. On the other hand, GoogLeNet required more computational resources. As a drawback, the complex architecture of Inception-V3 makes it challenging to adjust the layer structure for a specific dataset.
A novel pest detection approach called DeepPest, based on two-stage mobile vision, was proposed in [13] for small-sized pest recognition and multiple species classification in an unbalanced dataset. First, DeepPest extracts multi-scale contextual information from images. Next, a multi-projection pest detection model (MDM) is proposed and trained by crop-related pest images. The MDM function can combine pest contextual information from low-level convolutional layers with high-level layers to generate the super-resolved feature of DeepPest. Experimental results show better performance of DeepPest compared with other state-of-the-art methods. Although the dataset contains 17,192 pest images captured in a field environment with 76,595 pest annotations, the dataset is restricted to wheat and rice crops.
Through the use of Convolutional Neural Network (CNN), the authors in [14] developed a mobile application for detecting pests and diseases in rice. An Inception-v3 model was trained using transfer learning with the weights obtained from Imagenet. The final layer of the model was retrained from scratch using the provided images. Experiments showed that the model achieved an accuracy of 90.9%. The mobile app can display the appropriate description of the pest, its possible damage, and how to control it. Nonetheless, no information about the level of infestation or the number of pests in the crop can be found.
The authors in [15] created and made available a large-scale dataset, called IP102, for pest insect recognition, including over 75,000 images of 102 species. Initially, the performance of classification with shallow approaches was evaluated, comparing six methods: color histogram, Gabor filter, GIST, Scale Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF), and local color histogram; and two classifiers: Support Vector Machine (SVM) and k-Nearest Neighbors (k-NN). Next, four deep learning approaches were compared: AlexNet, GoogLeNet, VGGNet, and ResNet. As a result, the best shallow approach achieved 19.5% accuracy using the combination of SURF and the SVM classifier. ResNet stood out among the deep approaches. Last, five state-of-the-art detection methods were evaluated on IP102: Faster R-CNN, FPN, SSD300, RefineDet, and YOLOv3. The results show that the performance of the two-stage detector (FPN) was better than that of the one-stage detectors (SSD300, RefineDet, and YOLOv3). The experimental results showed low accuracy (<50%) in almost all scenarios, indicating that shallow and deep methods cannot yet handle pest recognition well on datasets with inter-class and intra-class variance along with data imbalance.
Similarly, using Convolutional Neural Networks (CNN), the authors in [16] evaluated three detection and classification methods to identify five species of insects collected in traps in a factory environment. With a set of 200 original photos containing 3,026 insects, the following methods were compared: (1) Adaptive Thresholding combined with VGG-16, (2) Single Shot MultiBox Detector (SSD) built on the VGG16 network but replacing the fully connected layers, and (3) VGG16 with a sliding-window approach. SSD produced the best results among the three methods, achieving detection and classification rates of 84% and 86%, respectively. In the paper, data augmentation was performed on both the training set and the test set. Note, however, that data augmentation should be applied to the training set only; otherwise, the visual patterns of an original image learned during training may be the same as those of an augmented image in the test set, leaking information that inflates the results and, ultimately, makes the evaluation unreliable.
In Reference [17], an improved network architecture based on VGG19 was implemented for the detection and classification of 24 insect species collected in crop fields such as rice and soybean. The proposed method was implemented by combining VGG19 models and a particular network called Region Proposal Network (RPN). The experimental results showed that the proposed method achieved an accuracy of 89.2% on the MPest dataset and is superior to the traditional state-of-the-art SSD (85.3%) and Fast R-CNN (79.6%) methods. However, there are still some problems with this method, such as target detection errors. Although the paper proposes a CNN to solve the problem of multi-classification of agricultural insects, species frequently found on soybean, such as Euschistus heros, Edessa meditabunda, Nezara viridula, Anticarsia gemmatalis, Spodoptera spp., and Diabrotica speciosa, were not present in the MPest dataset.
The deep residual learning method was used in [18] for identifying ten classes of agricultural pests in images with complex backgrounds. The method’s performance was improved by optimizing, with deep residual learning, the ResNet101 and ResNet50 models pre-trained on ImageNet. Compared with traditional Support Vector Machines (SVM) and Convolutional Neural Networks (CNN), such as AlexNet, the ResNet-101 model based on deep residual learning achieved 98.67% accuracy in agricultural pest recognition, significantly higher than the traditional SVM and CNN approaches. The deep residual network could be combined with object detection methods such as Faster R-CNN or R-FCN to track pest targets in real time, providing practical value in agricultural pest control tasks.
The work in [19] aims to select the best approach for detection and identification of the most harmful pests in greenhouse tomato and pepper crops, Bemisia tabaci and Trialeurodes vaporariorum. A dataset with many images of infected tomato plants was created to train and evaluate machine learning models (MLP and k-NN) and deep learning models (SSD and Faster R-CNN). Results show that the deep learning models offer a better solution as they achieve the highest accuracy. Results also indicate that egg detection and identification are a significant challenge, as eggs are similar in color and shape at an early stage. Moreover, it would be interesting to evaluate the models with additional metrics, such as precision and F-score.
In order to achieve accurate detection and identification of Pyralidae pests, the authors in [20] designed an intelligent robot on an autonomous vehicle to acquire images in the natural farm scene and also presented a Pyralidae pest recognition algorithm to be used on the robot. Precisely, by employing the color and shape features of Pyralidae pests, they proposed a segmentation algorithm using Inverse Histogram Mapping and spatial-constrained OTSU (SC-OTSU) to segment pests. Further, they designed a recognition approach based on Hu’s Invariant Moment. The experimental results show that the robot vehicle can automatically capture images of Pyralidae pests in the natural farm scene, achieving 94.3% accuracy in pest recognition. This approach, however, is limited by the time complexity required. The average processing time for each frame is longer than 1 s, leading to a delay in responding to the observation results. In addition, the detection of pests under non-uniform illumination was also not efficient. The local color of the images being too light or too dark made it difficult to determine a stable threshold range in the algorithm.
With the purpose of rapid detection and recognition of ten insect species affecting tea fields in China, the authors in [21] created the SIFT-HMAX model inspired by the mechanism of human visual attention. First, the Saliency Using Natural statistics (SUN) model generated saliency maps and detected the region of interest (ROI) in a pest image. To extract the attributes representing the appearance of the pest, the Hierarchical Model and X (HMAX) was extended. SIFT was integrated into the HMAX model to increase invariance to rotational changes. Meanwhile, Non-negative Sparse Coding (NNSC) was used to simulate the simple cell responses. Texture attributes were extracted based on the Local Configuration Pattern (LCP) algorithm. Finally, the extracted attributes were fed into an SVM to perform the recognition task. The proposed method performed well with a recognition rate of 85.5%, showing an advantage over the compared methods HMAX, Sparse Coding, and NIMBLE, but being slightly inferior to a CNN such as MatConvNet (86.9%).
In the work [22], a method based on SVM, MSER, and HOG was developed for aphid identification in wheat fields. The method uses a Maximally Stable Extremal Regions (MSER) descriptor to simplify the complex background of the images containing the aphids, and then Histogram of Oriented Gradients (HOG) attribute vectors are used in an SVM for aphid classification. This method was compared with other commonly used methods for aphid detection; its performance was analyzed using images with different aphid densities, colors, and plant locations. The results showed that the method provides average identification and error rates of 86.81% and 8.91%, respectively, outperforming the compared methods, including K-means, thresholding, SVM-HOG, SVM, SVM-HOG-Haar, and AdaBoost-Haar-HOG. The authors did not compare the proposed method with state-of-the-art deep learning models.
The authors in [7] evaluated deep learning models trained with different fine-tuning and transfer learning parameters for the tasks of classification and counting of pest insects in soybean. First, an image segmentation stage with the SLIC Superpixels method was considered to segment the insects in the images. In the classification stage, three deep learning models were compared: Inception-ResNet-v2, ResNet-50, and DenseNet-201. The approach with the best result was DenseNet-201 (94.89%) using 100% fine-tuning. In the counting experiment, the authors used the weights of the best-performing CNN in a computer vision system to classify each superpixel segment of the image. The count of pest insects was obtained by adding up the superpixels of each class, thus calculating the infestation level of a crop area. The count results showed that the accuracy decreases to 90.86% when an insect is segmented into two different superpixels. It should be noted that this is an area of research that has been explored but not yet resolved. Moreover, the authors did not compare the results with other state-of-the-art detection methods, such as YOLOv3, Faster R-CNN, and RetinaNet.
3 Materials and Methods
3.1 Image Acquisition
An experimental area of 2 ha was sown with conventional soybean cultivars and no pesticide application to create a reference collection of soybean pest insects. The agricultural area shown in Fig. 1 is located at the experimental farm of the Federal University of Grande Dourados, located in the municipality of Dourados-MS, Brazil, with geographic coordinates 22\(^{\circ }\)13’57.52" South latitude and 54\(^{\circ }\)59’17.93" West longitude.
In order to collect images of insects present in the experimental area, a Sony DSC-HX300 camera with 20.4-megapixel resolution and a SM-G930F camera with 12.2-megapixel resolution were used. A total of 1,800 images (600 \(\times \) 800 pixels) in JPG format were collected during the Sep/2017 to Feb/2018 crop season, in the phenological stages R1 to R6 of the soybean reproductive phase, on different days and weather conditions, between 8 am to 10 am and 5 pm to 6:30 pm. The exposure of pests at the top of the plant usually occurs early in the day or late in the afternoon. The recommendation is that sampling of insects in the aerial part of the plant be performed, preferably, during the cooler and more humid periods of the day, as reported in [6].
The researcher captured images on-site with the camera 50 cm away from the target of interest, at a 90-degree angle to the ground. The targets, in this case, correspond to the defoliating insects that cause economic damage when found at high levels in soybean fields. Next, each image was annotated in LabelImg (Footnote 1) with support from an entomologist. This produced a reference collection for the system’s training and test image sets (see Fig. 2), named INSECT12C-Dataset and available at [23]. Table 1 presents the total number of insects annotated by species. The unequal number of samples in the training and test image sets reflects the number of occurrences of each pest species under actual field conditions.
3.2 You Only Look Once (YOLOv3)
The YOLOv3 detection method evaluated in this paper is briefly described in the following. The source code used as a basis for our implementation can be found at https://github.com/qqwweee/keras-yolo3.
Unlike Faster R-CNN, which has a region proposal stage, YOLOv3 [9] approaches object detection as a direct regression problem from pixels to bounding box coordinates and class probabilities. The input image is divided into \(S \times S\) blocks. For each block, YOLOv3 predicts bounding boxes using dimension clusters as anchor boxes. For each bounding box, an objectness score is predicted using logistic regression, indicating the chance that the bounding box contains an object of interest. In addition, the C class probabilities are estimated for each bounding box, indicating the classes it may contain. In our case, each bounding box may contain the species of an insect or the plantation background (an object of no interest). Thus, each prediction in YOLOv3 comprises four bounding box parameters (coordinates), the objectness score, and the C class probabilities. YOLOv3 uses Darknet-53 as its backbone to provide high accuracy while requiring fewer operations than other architectures.
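To make the output size concrete, the prediction structure described above can be sketched as a small calculation. This is only an illustration: the grid size and anchor count used in the example below are typical YOLOv3 values, not necessarily this paper's exact configuration.

```python
# Number of values in a YOLOv3 prediction tensor at one scale:
# for each of the S x S grid cells and each of A anchor boxes, the
# network outputs 4 box coordinates, 1 objectness score, and C class
# probabilities.
def predictions_per_scale(S: int, A: int, C: int) -> int:
    values_per_box = 4 + 1 + C  # coordinates + objectness + classes
    return S * S * A * values_per_box
```

For example, with a 19 \(\times \) 19 grid, 3 anchors per cell, and the 12 insect classes of INSECT12C, one scale emits 19 \(\cdot \) 19 \(\cdot \) 3 \(\cdot \) 17 values.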
3.3 Experimental Design
We adopted 5-fold cross-validation with separate validation and test sets on the INSECT12C dataset [23]. In each fold, we set aside 20% of the data for testing. We then split the remaining 80% into two subsets: 60% of the full dataset as the actual training set and 20% as the validation set. Thus, in each fold, the dataset was divided into 60% for training, 20% for validation, and 20% for testing, allowing the deep learning model to be iteratively trained and validated on different sets. Finally, the classifier’s output is given by the average over the five test folds.
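The fold protocol above can be sketched as follows. This is a minimal stdlib-only illustration assuming a simple contiguous partition; the authors' exact splitting procedure (shuffling, stratification) is not specified in the paper.

```python
# Sketch of the 5-fold protocol: each fold holds out 20% for testing,
# and the remaining 80% is split into 60% training and 20% validation
# (all fractions relative to the full dataset).
def five_fold_splits(n_items: int):
    indices = list(range(n_items))
    fold_size = n_items // 5
    splits = []
    for k in range(5):
        test = indices[k * fold_size:(k + 1) * fold_size]
        rest = [i for i in indices if i not in test]
        # The first quarter of the remaining 80% becomes the validation
        # set (20% of the full dataset); the rest (60%) is for training.
        val = rest[:fold_size]
        train = rest[fold_size:]
        splits.append((train, val, test))
    return splits
```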
The classification task consists of predicting the class of the object in the image. The problem of identifying the location of the object along with its class is called object detection. Instead of predicting the object class from an image, we now have to train a multi-label classifier to determine the location of objects and predict the class of each object (circumscribed in a rectangle called the bounding box). We consider four metrics to evaluate the classification results: precision, recall, F-score, and accuracy; and three metrics to evaluate the detection results: mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R\(^2\)).
We calculated the average and standard deviation of the evaluation metrics for the studied model. We used the ANOVA hypothesis test to determine whether the mean performance differs significantly between groups with different batch sizes. We report the p-value found for each metric; the significance level was set at 5%.
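For reference, the one-way ANOVA F-statistic underlying this test can be computed as in the stdlib-only sketch below; in practice a library routine such as scipy.stats.f_oneway would be used, and the groups would hold the per-fold metric values for each batch size.

```python
# One-way ANOVA F-statistic: ratio of between-group to within-group
# mean squares. Assumes at least two groups with non-identical values
# within groups (otherwise the within-group variance is zero).
def anova_f(*groups):
    all_vals = [v for g in groups for v in g]
    grand = sum(all_vals) / len(all_vals)
    # Between-group sum of squares (k - 1 degrees of freedom).
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares (N - k degrees of freedom).
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```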
In our experiments, we used the open-source implementation of YOLOv3 [9] with weights pre-trained on ImageNet [24]. The following input parameters were used: input image width and height set to 608 \(\times \) 800 pixels, batch sizes of 4, 16, and 32 samples, and training with 24,000 iterations. We also used the SGD optimizer [25] with the learning rate set to 0.001 and momentum to 0.9. We employed data augmentation to supplement the training data, applying random rotations between 0\(^{\circ }\) and 90\(^{\circ }\) clockwise/anticlockwise and changing image brightness from –10% to 10%. This procedure aims to enhance rotation and illumination invariance during detection.
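The augmentation parameters described above can be sampled as in this sketch. The function name is ours, and the actual pixel-level transforms (rotation and brightness adjustment) would be applied by the training framework.

```python
import random

# Sample one set of augmentation parameters: a rotation between 0 and
# 90 degrees in either direction, and a brightness change between
# -10% and +10% (expressed as a multiplicative factor delta).
def sample_augmentation(rng: random.Random):
    angle = rng.uniform(0.0, 90.0) * rng.choice([-1.0, 1.0])
    brightness = rng.uniform(-0.10, 0.10)
    return angle, brightness
```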
We used Colab, a Google Research service that allows writing code through a browser while providing free GPU resources, in all of our experiments. We used the Google Colab Pro version, which provides priority access to more powerful GPUs and high-memory virtual machines compared to the free Colab. Additionally, the lifecycle of virtual machines is increased from 12 to 24 h.
3.4 Evaluation Metrics
To evaluate the classification results, we use four metrics: Accuracy Eq. (1), Precision Eq. (2), Recall Eq. (3) and F-score Eq. (4). We account for true positive (TP) when the insect is detected in the correct class with a detection score \(\ge \) 0.3; and false positive (FP) when some object that is not of interest (leaves or background) is identified as an insect. In cases where the insect was not detected or was detected in the incorrect class, we say it is a false-negative (FN). A true negative (TN) occurs when the network does not detect any insect in the image where there is no insect. Since all images have at least one insect, we have no true negative case in our dataset.
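For reference, these four metrics have the standard definitions:

```latex
\begin{align*}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1}\\
\text{Precision} &= \frac{TP}{TP + FP} \tag{2}\\
\text{Recall}    &= \frac{TP}{TP + FN} \tag{3}\\
\text{F-score}   &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}
\end{align*}
```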
Furthermore, we consider three metrics to evaluate the detection results: mean absolute error (MAE) Eq. (5), root mean square error (RMSE) Eq. (6) and coefficient of determination (R\(^2\)) Eq. (7).
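These take the standard forms:

```latex
\begin{align*}
\text{MAE}  &= \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{5}\\
\text{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{6}\\
R^2         &= 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \tag{7}
\end{align*}
```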
where \(y_{i}\) is the observed value, \(\hat{y}_{i}\) is the predicted value, \(\bar{y}\) is the mean of the observation, and n is the number of observations.
4 Results and Discussion
Classification Evaluation. The classification results obtained by YOLOv3 for precision, recall, F-score, and accuracy are presented in Table 2. The values, in percentages, represent the average of the five folds on the test set. Regarding accuracy, the experiment with batch size 4 obtained a higher result than size 32, producing fewer false positives during detection. However, YOLOv3 achieved the best results with batch size 32 for the other metrics, showing that most predicted insects were true positives.
Note that batch size 32 results in fewer false negatives than batch sizes 4 and 16. This is the most critical measure for the problem studied, since a false negative implies that no pesticide is applied in an area where pest insects are present.
The results of the ANOVA test indicate evidence of a statistically significant difference in mean performance between the batch-size groups at a significance level of 5%, using precision (p-value = \(2.46e^{-04}\)), recall (p-value = \(7.99e^{-09}\)), F-score (p-value = \(2.3e^{-08}\)), and accuracy (p-value = \(1.37e^{-08}\)) as metrics.
Insect Pest Detection in Soybeans. Table 3 presents the measured values for the three proposed configurations. In the experiments, batch size 32 produced the best results among the sizes evaluated, obtaining MAE and RMSE rates of 0.41 and 0.83, respectively. The results show low mean error rates in almost all scenarios, indicating that the YOLOv3 architecture can handle multi-species pest detection well, even in unbalanced datasets with interclass and intraclass variance. Some examples of insect-pest detection in YOLOv3 are shown in Fig. 3. The bounding boxes are true positives with a detection score \(\ge \) 0.3.
However, YOLOv3 presented false negatives (insects not detected or detected incorrectly) and false positives (objects that are not of interest identified as insects), showing that detections fail under certain field conditions, such as: non-uniform illumination on the insect, complex insect-like background (e.g., herbivory and lesions on leaves), partial occlusion of the insect, and mainly low representativity of species with few samples (Fig. 4).
5 Conclusion
In this paper, we evaluated the performance of the YOLOv3 architecture for the real-time detection of soybean pest insects. We also defined a baseline for soybean pest detection by publishing a new dataset, INSECT12C, consisting of 2,758 annotated insects from 12 species. Experimental results showed that the YOLOv3 architecture trained with batch size 32 leads to higher classification and detection rates than batch sizes 4 and 16; however, the method failed in some cases: areas with complex lighting conditions, herbivory and leaf injuries, partial insect occlusion, and species represented by few samples. The ratio of true positives to total predicted positives did not show significant distortions, indicating that YOLOv3 can track pest targets in real time, offering practical value in agricultural pest control tasks.
As future work, we plan to evaluate new state-of-the-art object detection architectures with higher-resolution cameras onboard a UAV. We also intend to use oversampling to address class imbalance. Finally, we plan to implement automatic counting of detected insects by species.
Notes
- 1.
LabelImg is a graphical image annotation tool for labeling object bounding boxes in images.
References
Hou, J., et al.: Association analysis of vegetable soybean quality traits with SSR markers. Plant Breed. 130(4), 444–449 (2011). https://doi.org/10.1111/j.1439-0523.2011.01852.x
CONAB, Acompanhamento da safra brasileira grãos V. 8 - SAFRA 2020/21 - N. 3 - Terceiro levantamento (2020). https://www.conab.gov.br/info-agro/safras/graos/boletim-da-safra-de-graos. ISSN: 2318–6852
—, Acompanhamento da safra brasileira grãos V. 6 - SAFRA 2018/19 - N. 12 - Décimo segundo levantamento (2019). https://www.conab.gov.br/info-agro/safras/graos/boletim-da-safra-de-graos/item/download/28484_9a9ee12328baa359b3708d64e774e5d8. ISSN 2318–6852
—, Compêndio de estudos conab: Evolução dos custos de produção de soja no brasil, 2 (2016). https://www.conab.gov.br/institucional/publicacoes/compendio-de-estudos-da-conab/item/download/2512_c2638f76696e3b926ab22e93f9549d21. ISSN 2448–3710
Tetila, E.C., Machado, B.B., Belete, N.A., Guimarães, D.A., Pistori, H.: Identification of soybean foliar diseases using unmanned aerial vehicle images. IEEE Geosci. Remote Sens. Lett 14(12), 2190–2194 (2017). https://doi.org/10.1109/LGRS.2017.2743715
Hoffmann-Campo, C.B., Corrêa-Ferreira, B.S., Moscardi, F.: Soja: manejo integrado de insetos e outros Artrópodes-praga., Embrapa Soja (2012). http://www.cnpso.embrapa.br/artropodes/Capitulo9.pdf. ISBN 978-85-7035-139-5
Tetila, E.C., Machado, B.B., Menezes, G.V., de Souza Belete, N.A., Astolfi, G., Pistori, H.: A deep-learning approach for automatic counting of soybean insect pests. IEEE Geosci. Remote Sens. Lett 17, 1–5 (2019). https://doi.org/10.1109/LGRS.2019.2954735
Tetila, E.C.: Detecção e classificação de doenças e pragas da soja usando imagens de veículos aéreos não tripulados e técnicas de visão computacional. Ph.D. dissertation, Universidade Católica Dom Bosco (2019). http://repositorio.ufgd.edu.br/jspui/handle/prefix/2385
Redmon, J., Farhadi, A.: Yolov3: an incremental improvement (2018). CoRR, vol. abs/1804.02767, http://arxiv.org/abs/1804.02767
Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks (2015). CoRR, vol. abs/1506.01497, http://arxiv.org/abs/1506.01497
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell. 42(2), 318–327 (2020)
Li, Y., Wang, H., Dang, L.M., Sadeghi-Niaraki, A., Moon, H.: Crop pest recognition in natural scenes using convolutional neural networks. Comput. Electron. Agric. 169 (2020). https://doi.org/10.1016/j.compag.2019.105174. ISSN 0168–1699
Wang, F., Wang, R., Xie, C., Yang, P., Liu, L.: Fusing multi-scale context-aware information representation for automatic in-field pest detection and recognition. Comput. Electron. Agric. 169 (2020). https://doi.org/10.1016/j.compag.2020.105222. ISSN 0168–1699
Mique, E.L., Palaoag, T.D.: Rice pest and disease detection using convolutional neural network. In: Proceedings of the 2018 International Conference on Information Science and System, Series ICISS ’18, pp. 147–151. Association for Computing Machinery, New York (2018). https://doi.org/10.1145/3209914.3209945. ISBN 9781450364218
Wu, X., Zhan, C., Lai, Y., Cheng, M., Yang, J.: Ip102: a large-scale benchmark dataset for insect pest recognition. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8779–8788 (2019). https://doi.org/10.1109/CVPR.2019.00899
Nam, N.T., Hung, P.D.: Pest detection on traps using deep convolutional neural networks. In: Proceedings of the 2018 International Conference on Control and Computer Vision, Series ICCCV ’18, pp. 33–38. Association for Computing Machinery, New York (2018). https://doi.org/10.1145/3232651.3232661
Xia, D., Chen, P., Wang, B., Zhang, J., Xie, C.: Insect detection and classification based on an improved convolutional neural network. Sensors 18(12), 4169 (2018). https://doi.org/10.3390/s18124169. ISSN 1424–8220
Cheng, X., Zhang, Y., Chen, Y., Wu, Y., Yue, Y.: Pest identification via deep residual learning in complex background. Comput. Electron. Agric. 141, 351–356 (2017). https://doi.org/10.1016/j.compag.2017.08.005
Gutierrez, A., Ansuategi, A., Susperregi, L., Tubío, C., Rankić, I., Lenža, L.: A benchmarking of learning strategies for pest detection and identification on tomato plants for autonomous scouting robots using internal databases (2019). https://doi.org/10.1155/2019/5219471
Liu, B., Hu, Z., Zhao, Y., Bai, Y., Wang, Y.: Recognition of pyralidae insects using intelligent monitoring autonomous robot vehicle in natural farm scene (2019). CoRR, vol. abs/1903.10827, http://arxiv.org/abs/1903.10827
Deng, L., Wang, Y., Han, Z., Yu, R.: Research on insect pest image detection and recognition based on bio-inspired methods. Biosyst. Eng. 169, 139–148 (2018). https://doi.org/10.1016/j.biosystemseng.2018.02.008. ISSN 1537–5110
Liu, T., Chen, W., Wu, W., Sun, C., Guo, W., Zhu, X.: Detection of aphids in wheat fields using a computer vision technique. Biosyst. Eng. 141, 82–93 (2016). https://doi.org/10.1016/j.biosystemseng.2015.11.005. ISSN 1537–5110
Tetila, E.C.: INSECT12C-Dataset - Conjunto de Imagens de Insetos e outros Invertebrados da Cultura da Soja, UFGD (2021). http://evertontetila.ws.ufgd.edu.br/INSECT12C-Dataset.zip
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vision 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y
Bottou, L., Bousquet, O.: The tradeoffs of large scale learning. In: Platt, J., Koller, D., Singer, Y., Roweis, S. (eds.) Advances in Neural Information Processing Systems, vol. 20. Curran Associates Inc. (2008). https://proceedings.neurips.cc/paper/2007/file/0d3180d672e08b4c5312dcdafdf6ef36-Paper.pdf
Silveira, F.A.G.d., Tetila, E.C., Astolfi, G., Costa, A.B.d., Amorim, W.P. (2021). Performance Analysis of YOLOv3 for Real-Time Detection of Pests in Soybeans. In: Britto, A., Valdivia Delgado, K. (eds) Intelligent Systems. BRACIS 2021. Lecture Notes in Computer Science(), vol 13074. Springer, Cham. https://doi.org/10.1007/978-3-030-91699-2_19