Abstract
Over 470 million dogs are kept as pets around the world, with an average of 1.6 dogs per dog-owning household. The US has the largest pet dog population, with about \(68\%\) of households owning at least one pet. Lost and missing dogs are a severe source of suffering and problems for their families, so this paper addresses the problem of facial identification in dogs. This technology can benefit many applications, such as handling the missing pet problem, granting pets access to their homes, more intelligent zoonosis control, pet health care, and tracking stray pets. We evaluate a residual convolutional neural network, specifically ResNet-34, for facial identification in dogs. We tested it on the DogFaceNet and Flickr-dog datasets, with and without two face preprocessing techniques: a central crop and an aligned facial extraction. Experimental results surpass the state of the art: \(97.6\%\) and \(82.8\%\) accuracy on DogFaceNet and Flickr-dog, respectively. Moreover, we also provide recall metrics for the best models.
Supported by organization FUNCAP.
1 Introduction
Effective identification and tracking of pets is a valuable technology for modern society. For example, many countries have more dogs than children, according to recent official data cited in [15]. Stray animals in general can pose a serious hazard to other species and to human health in urban areas, especially to children. On the other hand, pets going missing is a frequent problem that takes very little to occur. Since dogs are autonomous animals, a moment of carelessness is enough for them to run away from home, especially during festive events with fireworks. On top of that, the human ability to discern animals by their traits is reasonably limited; consequently, it is difficult to identify and track animals effectively with the naked eye [4].
In [7], the authors sought to understand the grieving process experienced by people who have lost their pets. They concluded that these processes have characteristics similar to the mourning processes arising from the loss of significant people. That said, canine recognition can be an attractive alternative for smart cities to deal with complex problems related to stray or lost animals, access control for pets, dog catching, zoonosis control, and others.
As evidence of the interest in identifying dogs, some works seek to identify dogs using the dog's nose print, an even more challenging problem since the images are further reduced to just the dog's nose. [13] reaches an AUC of \(88.81\%\) with a model based on a dual global descriptor, which exploits multi-scale image features, working on a dataset of 6,000 dogs with 20,000 photographs of their nose prints. A recent study, [23], motivated by the fact that in Sri Lanka, the authors' country, lost dogs are rarely found, presents a Convolutional Neural Network (CNN) model for image classification and facial recognition of dogs. The CNN is a customized VGG-16 model, obtaining about \(94\%\) accuracy in the dog recognition task.
It is noteworthy that facial recognition is one of the least invasive options and is less subject to operational problems when the animal does not cooperate with the use of devices or collars [15]. Of course, in the case of dogs, the effectiveness of this approach may vary if the animal is not cared for similarly after it is lost (hair and other characteristics may change, for example). There are algorithms in the literature that deal with this problem, but several of them are arguably overly complex or may present a still limited performance for such a challenging but significant problem. Thus, the present work seeks to consolidate results in this area by investigating the use of deep feature embedding vectors for dog face recognition, focusing on the facial identification task.
These are the main contributions of this paper:
-
We applied ResNet-34, a residual CNN, to the face identification task in dogs on two datasets, each evaluated in its standard form and with two preprocessing techniques. We evaluate this computation-intensive method, its benefits, and whether they are justified, taking its accuracy into account.
-
We compare three dog face preprocessing methods using two public datasets [15, 16] found in similar works. While most papers only provide performance analysis considering accuracy, this paper also includes recall.
-
We obtained state-of-the-art performance using a fast approach that outperforms previous methods, which usually resort to more complex, computationally expensive models.
The remainder of this paper is organized as follows. Section 2 presents related work. The datasets that address the problem of facial identification in dogs are described in Sect. 3. Section 4 describes the approach presented in this work, and Sect. 5 presents the methodology for its experimental evaluation. Section 6 presents the results. Finally, Sect. 7 presents final remarks and future work.
2 Related Work
It is worth mentioning that most of the works found in the literature are related to the task of dog breed classification. This is a challenging problem due to the diversity of the existing native breeds and the breeds resulting from crossbreeding and genetic modifications, where many of these are visually similar to each other.
[14] presented a method for fine-grained image classification based on using key regions during the feature extraction process to improve classification performance. This method was evaluated on the dog breed classification problem using a dataset of images downloaded from Google, ImageNet, and Flickr, comprising 133 dog breeds and 8,351 images. The proposed method achieved a recognition rate of \(67\%\) and used conventional tools such as SIFT and color histograms of the animal's face region, excluding everything else in the image.
[17] used Dense-SIFT to extract features over the input image. These features are split and combined before feeding a CNN classifier. This method resulted in \(94.2\%\) accuracy for 121 dog breeds, using a database obtained by crawl image data methods on the web. Similarly, [4] created a (closed) dataset for dog breed and wild wolf classification. These authors compared different CNN architectures using transfer learning and obtained \(92.48\%\) accuracy.
[19] proposed another method based on deep learning, using supervised clustering based on a multi-part convolutional neural network and expectation-maximization. The method was tested on many datasets, including the Snapshot Serengeti dataset [20], which contains approximately 78,000 labeled images of animals in Africa, obtaining an accuracy of \(98.4\%\). The authors reinforce the idea that CNNs with supervised training work very well at extracting features from images in this type of problem.
Other works use different techniques to track and find lost pets, with many approaches resorting to installing GPS and other kinds of sensors. [21] proposed Bokk Meow, a mobile application whose goal is to help users locate their animals using GPS, provided the animals have this type of sensor installed on their collars. The authors argue that their application helps reduce the number of stray/lost pets. However, Bokk Meow demands an active Wi-Fi connection, and its tracking features do not work appropriately in dense urban building areas.
[1] proposed a methodology for noseprint-based dog recognition using a deep Siamese convolutional network called DNNet. This CNN is composed of feature extraction and (expensive) self-attention modules. Their method seems promising, presenting an accuracy of \(98.97\%\) while resorting to affordable equipment for image acquisition. However, such an approach seems hard to automate. Moreover, human operators must handle dogs physically, which may cause risks or discomfort to both the operator and the dog when the animal does not cooperate.
The work of [22] presents a smartphone application for dog face detection. The system uses the YOLO deep convolutional neural network to simultaneously predict face bounding boxes and class confidences. The Doggie Smile app helps to take pictures of the dogs as they face the camera. The proposed mobile app can simultaneously assess the gaze directions of three dogs in a scene more than 13 times per second, measured on the iPhone XR. The reported average accuracy of the dog face detection system is \(92\%\).
3 Dog Face Recognition Datasets and Methods
There are datasets for many tasks related to fine-grained dog image classification, such as Columbia Dogs and Stanford Dogs, which are used in [9, 14], respectively. Those datasets were used in previous works for augmenting data and for fine-tuning pre-trained models before carrying out the dog recognition task. For example, [16] also resorted to these datasets to train and evaluate dog face detectors and aligners.
3.1 Dog Face Recognition Datasets
Table 1 summarizes the datasets known in the literature for the face identification task in dogs, their main characteristics, and which ones we used in our experiments. Our only motivation for choosing these datasets was that they were public and freely available for download.
The DogFaceNet dataset [16] contains 8,363 images from 1,393 dog identities. Each dog has at least two JPEG images with \(224\times 224\) RGB pixels. Despite the previous work of [15], the authors of [16] claim that DogFaceNet is the first public dataset for dog face recognition and verification.
Examples from DogFaceNet are shown at the top of Fig. 1: 3 identities with 5 images each. It is essential to highlight that the images were cropped and aligned beforehand. However, this process introduces black borders in the images. Moreover, there are many samples where the dog appears in profile or its face is partially occluded.
The Flickr-dog dataset [15] is 22.3 times smaller than DogFaceNet, containing only 374 images of dogs from two breeds: pugs and huskies. The dataset includes 21 identities for each breed, totaling 42 individual dogs with at least 5 images per identity. This means that Flickr-dog provides 33.1 times fewer identities than DogFaceNet, so the face recognition task is also related to the challenge of discerning multiple dogs of the same breed.
Each image in the dataset has \(250\times 250\) RGB pixels containing a horizontally aligned dog face. The images are normalized: all are cropped, (loosely) aligned, and resized. Five sample images for three individuals are shown at the bottom of Fig. 1: lighting conditions vary significantly, there are artifacts, and black borders are evident.
3.2 Dog Face Recognition Methods
In the work of [23], a VGG-16-based [18] CNN is trained to classify dogs into 5 categories using transfer learning. Then, 128-dimensional embeddings are extracted from this CNN and used for identification by training an unsupervised model for distance calculation. According to the authors, experimental evaluation resulted in more than \(90\%\) accuracy on their dataset, where dogs appear in various poses and sizes, but the work lacks details on how dog faces are processed during evaluation. Moreover, the dataset is private.
[15] presented pioneering work in dog face recognition. The authors evaluated four classic approaches (EigenFaces, FisherFaces, LBPH, and Sparse) against two CNNs, BARK and WOOF, for retrieving lost dogs based on facial features. These CNNs significantly outperformed the other methods, obtaining accuracies of \(89.4\%\) and \(81.1\%\) for WOOF and BARK, respectively, on the single-breed Snoopybook dataset, which is not publicly available. Moreover, the accuracy is considerably lower on the Flickr-dog dataset: \(66.9\%\) and \(67.6\%\) for WOOF and BARK, respectively.
In [3], in order to identify animals, the authors compared two different CNN architectures (VGG-16 and Inception-V3) originally applied to object recognition. These two architectures are applied to the Flickr-dog and DogsVsCats datasets. The accuracy achieved by VGG-16 was \(95\%\) on Flickr-dog and \(98\%\) on DogsVsCats, while Inception-V3 reached \(87\%\) on Flickr-dog and \(85\%\) on DogsVsCats. The DogsVsCats dataset serves a different purpose and includes images of cats, so it was not considered among the datasets in this work.
[16] proposed a deep learning approach for dog face verification and introduced the DogFaceNet dataset. They evaluated two models based on VGG and ResNet using \(224\times 224\) RGB images as input. Deep feature embedding vectors were generated using Triplet Loss with soft and hard triplet mining, generated both online and offline. These authors obtained a verification accuracy of \(86\%\), considered poor compared to the typical performance of similar approaches for human face recognition. They stated that errors arose from frequent occlusions in the test dataset, dog’s pose, similarity of different dogs of the same breed, and light exposure.
[12] proposed the use of soft biometrics for improving identification. Based on a Fast R-CNN, their method performs a coarse-to-fine identification, i.e., starting from the breed filtering, then carrying out the identification. Their results suggest that breed information improved accuracy for different dog breeds, and overall accuracy is \(76.53\%\) for the Flickr-dog using an Xception back-end [5]. The authors improved this result to \(84.94\%\) by using a likelihood-adjusted decision network.
[2] proposed and evaluated a deep learning-based approach to pet detection and recognition, named Pet2NetID, based on a dataset provided by Pet2Net, a social network for pets and their owners. They achieved \(94.59\%\) accuracy by combining transfer learning and object detection approaches with Inception V3 and SSD Inception V2 on the Pet2Net dataset, and \(77.19\%\) when testing on the Flickr-dog dataset after training with Pet2Net. The approach works with wild images without any preprocessing, including other objects besides the pets themselves, and identification uses information from the whole animal, not only the face.
In general, as seen in Table 2, these works use deep learning methods without experimenting with preprocessing (e.g., face detection) and report only accuracy, not recall. We address both points in our experiments, described in the following sections, providing a more thorough evaluation than the related works in this section.
4 Proposed Dog Face Recognition Method
Since many works in the literature have already resorted to transfer learning and fine-tuning existing models, our investigation prioritized a different strategy: we trained models from scratch. As stated before, the incoming images are subjected to optional preprocessing, which can occur in two ways. However, only cropping is allowed in this step; no further filtering or image enhancement is applied, to ensure a fair comparison against other methods.
4.1 Preprocessing and Training Pipeline
The pipeline of evaluation using the methods is illustrated in Fig. 2. We took three different preprocessing paths and evaluated them separately, these being: (A) the dataset in its pure form; (B) a version where each image was cropped in the center, which we call Central Crop (CC); and (C) another path using Facial Detection (FD). The use of each technique is detailed below.
The central crop assumes the images are satisfactorily aligned, i.e., the dog's eyes lie near the center of the image, and that there is a border whose usefulness for facial recognition is questionable. The CC preprocessing algorithm extracts the central pixels at a given dimension, thus reducing the input image size.
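As a minimal sketch, the CC step described above can be implemented as follows; the function name and array layout are illustrative, not the paper's actual code:

```python
import numpy as np

def central_crop(image: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Extract the central out_h x out_w region of an H x W (x C) image."""
    h, w = image.shape[:2]
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return image[top:top + out_h, left:left + out_w]

# Example: crop a 224x224 image to its central 180x180 pixels,
# matching the sizes used for the DogFaceNet CC version.
img = np.zeros((224, 224, 3), dtype=np.uint8)
cropped = central_crop(img, 180, 180)
assert cropped.shape == (180, 180, 3)
```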
FD includes alignment and proceeds as follows. First, the dog face is detected using a CNN-based object detector built with dlib's [10] Max-Margin Object Detection (MMOD) loss layer, which implements the object detection scheme described in [11]. This CNN uses a concise representation (its weights require only about 700 KB); even so, MMOD inference is the most time-consuming operation in this pipeline. Second, six facial landmarks are estimated using an ensemble of regression trees [8] with cascade depth \(=20\) and tree depth \(=5\). Finally, these points (shown in Fig. 3) are converted into another layout before being used to extract aligned dog faces with padding \(=0.2\), since mapping the resulting 5-point layout into an aligned box is less prone to errors. This conversion is based on the proportions of the eye and nose landmarks.
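The geometric idea behind the alignment step (rotating so that the eye landmarks become horizontal) can be illustrated with the simplified NumPy sketch below. This is not dlib's actual face-chip extraction, and the landmark coordinates are made up:

```python
import numpy as np

def eye_alignment_rotation(left_eye, right_eye):
    """Return a 2x2 rotation matrix that makes the eye line horizontal.

    Simplified illustration of the alignment idea; the paper's pipeline
    relies on dlib's landmark-based aligned face extraction instead."""
    dx, dy = np.subtract(right_eye, left_eye)
    angle = np.arctan2(dy, dx)            # tilt of the eye line
    c, s = np.cos(-angle), np.sin(-angle)
    return np.array([[c, -s], [s, c]])    # rotation by -angle

# Hypothetical eye landmarks in pixel coordinates.
left, right = np.array([40.0, 60.0]), np.array([90.0, 80.0])
R = eye_alignment_rotation(left, right)
l2, r2 = R @ left, R @ right
# After rotation, both eyes lie on the same horizontal line.
assert abs(l2[1] - r2[1]) < 1e-9
```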
4.2 Preprocessing the Datasets
Having clarified the two preprocessing techniques, this subsection details how the preprocessed versions of each dataset were generated and the particulars of each technique and dataset.
DogFaceNet CC results from a central crop of \(180\times 180\) pixels of each image in DogFaceNet, so all individuals and images are kept. DogFaceNet FD corresponds to the DogFaceNet dataset after face detection and alignment. We then used the same \(80-20\) split as the original DogFaceNet and DogFaceNet CC, but some images are missing due to face detection errors. This resulted in 1,108 identities for training, corresponding to 5,978 images, and 276 identities for testing, corresponding to 1,694 images. This process lost 9 identities, corresponding to \(0.65\%\) of individuals (ids 44, 67, 287, 449, 494, 948, 1131, 1274, and 1310), and 691 images, corresponding to \(8.26\%\) of the images in the original dataset. It is worth highlighting that missing images cannot be included in the train-test split.
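An identity-disjoint \(80-20\) split like the one above can be sketched as follows. This is illustrative Python under the assumption that the data is organized as a mapping from identity to image paths; the paper does not publish its split code:

```python
import random

def identity_split(images_by_id: dict, test_frac: float = 0.2, seed: int = 42):
    """Split {identity: [image_paths]} into identity-disjoint train/test
    sets, so no dog appears in both partitions."""
    ids = sorted(images_by_id)
    random.Random(seed).shuffle(ids)
    n_test = int(round(len(ids) * test_frac))
    test_ids = set(ids[:n_test])
    train = {i: images_by_id[i] for i in ids if i not in test_ids}
    test = {i: images_by_id[i] for i in test_ids}
    return train, test
```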
Flickr-dog CC corresponds to the Flickr-dog dataset after a central crop to the \(200\times 200\) inner pixels. In turn, Flickr-dog FD corresponds to Flickr-dog after face detection and alignment. This process retained 41 of the 42 identities, totaling 218 images. One identity, corresponding to \(2.38\%\) of individuals, was lost because the face of "Wilco" could not be detected in any of its images (first row of Fig. 1). The 156 lost images correspond to \(41.71\%\) of the original dataset, which is a considerable loss. However, it is essential to remember that all 218 remaining images are used for model evaluation.
In Fig. 4, there are some examples of removed images where the algorithm could not detect the face and examples of dogs that lost all images and consequently lost their identity for both datasets. The final numbers for the datasets are described in Table 3.
5 Experimental Evaluation
In order to evaluate the use of the residual network on each dataset and its generated versions, we performed identification experiments evaluating different metrics of interest in this context. We carried out the experiments on a computer running Linux (Ubuntu 20.04.2 LTS), equipped with an Intel(R) Core(TM) i7-9700K CPU @ 3.60 GHz, 32 GB RAM, a 1 TB SSD, and an NVIDIA GeForce RTX 2080 Ti graphics card.
The ResNet-34 convolutional neural network was used for the experiments, as it offers a good trade-off between performance and computational cost. It is a residual neural network with about 21,282,000 parameters, requiring about 4 GFLOPs per forward pass. Figure 5 shows the ResNet-34 architecture. We used the ResNet-34 implementation from the DLIB C++ machine learning library [10].
The train-test split procedures were explained individually for each dataset. We fixed all training sessions at 80,000 steps to standardize the procedure across datasets. We chose this number because it is approximately the point at which the loss curve has stabilized.
Fig. 5. ResNet-34 architecture (adapted from [6]).
5.1 Performance Metrics
The metrics used in this work are accuracy and recall. Both are defined in terms of true positive (tp), false positive (fp), true negative (tn), and false negative (fn) samples. Accuracy is defined in Eq. 1 and recall in Eq. 2:

\(accuracy = \frac{tp + tn}{tp + fp + tn + fn}\) (1)

\(recall = \frac{tp}{tp + fn}\) (2)
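These pair-level metrics reduce to simple counting; the example values below are illustrative, not results from the paper:

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of correct decisions among all pair comparisons."""
    return (tp + tn) / (tp + fp + tn + fn)

def recall(tp, fn):
    """Fraction of genuine (same-dog) pairs correctly accepted."""
    return tp / (tp + fn)

# Hypothetical confusion counts: 80 genuine pairs accepted, 20 rejected;
# 890 impostor pairs rejected, 10 wrongly accepted.
assert abs(accuracy(80, 10, 890, 20) - 0.97) < 1e-12
assert abs(recall(80, 20) - 0.8) < 1e-12
```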
Precision and F1-score are largely uninformative for our experiments, since the evaluation considers an upper triangular matrix in which most elements represent pairs of different individuals. In other words, we would roughly be counting the main diagonal against the rest of the matrix, so these metrics would be dominated by the model's ability to discard dissimilar individuals.
5.2 Preparation of the Datasets
Notice that the baseline work [16] adopts a \(90-10\) split, i.e., only 697 images (\(8.33\%\) of the dataset) are used for testing. To better assess generalization, we adopted an \(80-20\) split, making the problem even more challenging by training on fewer images and testing on a larger set. For DogFaceNet, 5,978 images are used for training in the FD case, excluding all images in which the FD algorithm did not detect a face, with 1,794 test images for the original DogFaceNet, 1,704 for CC, and 1,520 for DogFaceNet FD.
The experiment with Flickr-dog proceeded as follows. Since this is a small-scale dataset, the method heavily depends on a data augmentation strategy or external data. Without external training data and with data augmentation only, the results are inferior: accuracy is limited to \(76.55\%\) and recall to \(65.68\%\), also using the \(80-20\) split. We therefore used the models trained on the different versions of the DogFaceNet dataset (original, CC, and FD) and tested on a portion corresponding to \(20\%\) of the Flickr-dog dataset, which was also processed to generate CC and FD versions. In this case, the original version contains 65 test images, Flickr-dog CC 69, and FD 45.
6 Results
Preliminarily, Fig. 6 shows the loss for each training variation of the DogFaceNet dataset. The curves are very similar, reaching very close values at the end of training. The FD training curve stabilized faster, showing that this case would not require such extensive training, unlike the other two runs, which stabilized closer to the end.
Still regarding training, Fig. 7 shows, through the embeddings extracted from each dataset, that ResNet-34 produced an effective separation between the "same dog" and "different dogs" classes, with Euclidean distance and relative frequency as dimensions. These distributions also show that the overlap area for CC is visually smaller than for the other two runs.
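This separation supports a simple distance-threshold verification rule between embedding pairs. The sketch below is illustrative: the embedding values and the threshold are made up, whereas in practice the threshold would be tuned on held-out pairs:

```python
import numpy as np

def same_dog(emb_a, emb_b, threshold=0.6):
    """Decide whether two embeddings belong to the same dog by
    thresholding their Euclidean distance (illustrative threshold)."""
    dist = float(np.linalg.norm(np.asarray(emb_a) - np.asarray(emb_b)))
    return dist < threshold

a = np.array([0.10, 0.20, 0.30])
b = np.array([0.12, 0.19, 0.31])   # close pair: likely the same dog
c = np.array([0.90, -0.40, 0.50])  # distant pair: a different dog
assert same_dog(a, b)
assert not same_dog(a, c)
```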
The results are presented in Table 4, containing accuracy and recall for each dataset variation. The best result on DogFaceNet is an accuracy of \(98.43\%\), using facial detection for both training and testing. On the central-cropped version, the method achieves a recall of \(85.64\%\) when trained without preprocessing. Furthermore, the best result on Flickr-dog is an accuracy of \(91.53\%\) with facial detection in training and testing.
Overall, facial alignment (FD) achieved the best accuracy on both datasets used in this work, probably because it makes the network more focused, ignoring details in the image that are not part of the dog's face.
For a fairer comparison, we do not compare our test results on the FD version against other works in the literature, since the modification and the loss of images effectively make it a different dataset. Using the CC technique, which loses no images, we demonstrated better results than all other works by using FD for training and CC for testing, as shown in Table 5.
7 Conclusion
This work investigated the problem of dog face recognition. We compared three strategies to preprocess the input images: no preprocessing, central crop, and face detection with alignment. We explored the performance of a robust baseline model, ResNet-34, showing that it can match or even outperform results found in the literature. Moreover, this paper also includes recall as an essential performance metric for verification, which is missing in most related works, especially considering the method's ability not to falsely reject the dog.
Experimental results show that face alignment plays only a minor role in improving recall for dog face recognition: results on the preprocessed datasets were only slightly better than the others.
Finally, it is worth emphasizing that the alignment process pays off on Flickr-dog, since (1) this strategy improved the state of the art in both accuracy and recall, and (2) our experiments evaluated the entire Flickr-dog dataset, as the model was trained with external data. The latter observation is consistent with the hypothesis in previous work [23] that augmenting across dog breeds yields better accuracy, although the literature focuses on classification and lacks a proper analysis for identification.
Future work includes extending the existing dog identification datasets in terms of both new identities and additional annotations, adding support for cat face recognition, investigating effective approaches for animal identification and re-identification, and the assessment of other techniques for the problem of animal biometric recognition.
References
Bae, H.B., Pak, D., Lee, S.: Dog nose-print identification using deep neural networks. IEEE Access 9, 49141–49153 (2021)
Batic, D., Culibrk, D.: Identifying individual dogs in social media images. arXiv preprint arXiv:2003.06705 (2020)
Capone, V., Figueiredo, C., Valle, E., Andaló, F.: CrowdPet: deep learning applied to the detection of dogs in the wild. Multimedia Tools Appl. 76(14), 15325–15340 (2017)
Chaturvedi, K.: Wolf and dog breed image classification using deep learning techniques. Ph.D. thesis, Dublin, National College of Ireland (2020)
Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Karasu, S., Alkar, Ö., et al.: A qualitative investigation on the mourning period that occurs after the loss of pets. Veteriner Hekimler Derneği Dergisi/J. Turk. Vet. Med. Soc. 91(2), 86–97 (2020)
Kazemi, V., Sullivan, J.: One millisecond face alignment with an ensemble of regression trees. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1867–1874 (2014)
Khosla, A., Jayadevaprakash, N., Yao, B., Li, F.F.: Novel dataset for fine-grained image categorization: Stanford dogs. In: Proceedings of CVPR Workshop on Fine-Grained Visual Categorization (FGVC), vol. 2. Citeseer (2011)
King, D.E.: Dlib-ML: a machine learning toolkit. J. Mach. Learn. Res. 10, 1755–1758 (2009)
King, D.E.: Max-margin object detection. CoRR abs/1502.00046 (2015). http://arxiv.org/abs/1502.00046
Lai, K., Tu, X., Yanushkevich, S.: Dog identification using soft biometrics and neural networks. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2019)
Li, B., Wang, Z., Wu, N., Shi, S., Ma, Q.: Dog nose print matching with dual global descriptor based on contrastive learning. arXiv preprint arXiv:2206.00580 (2022)
Liu, J., Kanazawa, A., Jacobs, D., Belhumeur, P.: Dog breed classification using part localization. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7572, pp. 172–185. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33718-5_13
Moreira, T.P., Perez, M.L., de Oliveira Werneck, R., Valle, E.: Where is my puppy? Retrieving lost dogs by facial features. Multimedia Tools Appl. 76(14), 15325–15340 (2017)
Mougeot, G., Li, D., Jia, S.: A deep learning approach for dog face verification and recognition. In: Nayak, A.C., Sharma, A. (eds.) PRICAI 2019. LNCS (LNAI), vol. 11672, pp. 418–430. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29894-4_34
Ouyang, J., He, H., He, Y., Tang, H.: Dog recognition in public places based on convolutional neural network. Int. J. Distrib. Sens. Netw. 15(5), 1550147719829675 (2019)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Sundaram, D.M., Loganathan, A.: A new supervised clustering framework using multi discriminative parts and expectation-maximization approach for a fine-grained animal breed classification (SC-MPEM). Neural Process. Lett. 52(1), 727–766 (2020)
Swanson, A., Kosmala, M., Lintott, C., Simpson, R., Smith, A., Packer, C.: Data from: snapshot serengeti, high-frequency annotated camera trap images of 40 mammalian species in an African savanna (2015). https://doi.org/10.5061/dryad.5pt92
Tangsripairoj, S., Kittirattanaviwat, P., Koophiran, K., Raksaithong, L.: Bokk meow: a mobile application for finding and tracking pets. In: 2018 15th International Joint Conference on Computer Science and Software Engineering (JCSSE), pp. 1–6. IEEE (2018)
Turečková, A., Holík, T., Komínková Oplatková, Z.: Dog face detection using yolo network. In: Mendel. Brno University of Technology (2020)
Weerasekara, D., Gamage, M., Kulasooriya, K.: Combined approach of supervised and unsupervised learning for dog face recognition. In: 2021 6th International Conference for Convergence in Technology (I2CT), pp. 1–5. IEEE (2021)
Yoon, B., So, H., Rhee, J.: A methodology for utilizing vector space to improve the performance of a dog face identification model. Appl. Sci. 11(5), 2074 (2021). https://doi.org/10.3390/app11052074
Acknowledgements
The authors would like to thank The Ceará State Foundation for the Support of Scientific and Technological Development (FUNCAP) for the financial support (6945087/2019).
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Andrade, J.P.B., Costa, L.F., Fernandes, L.S., Rego, P.A.L., Maia, J.G.R. (2023). Dog Face Recognition Using Deep Features Embeddings. In: Naldi, M.C., Bianchi, R.A.C. (eds) Intelligent Systems. BRACIS 2023. Lecture Notes in Computer Science(), vol 14196. Springer, Cham. https://doi.org/10.1007/978-3-031-45389-2_9
Print ISBN: 978-3-031-45388-5
Online ISBN: 978-3-031-45389-2