Spatially Recalibrated Convolutional Neural Network for Vehicle Type Recognition

Document Type

Article

Publication Date

1-1-2023

Abstract

Vehicle Type Recognition (VTR) is a significant segment within the vehicle recognition field. It provides an alternative identification method aside from license plate recognition and vehicle make and model recognition. Most recent studies use Convolutional Neural Networks (CNNs) to perform VTR. However, the feature responses obtained from CNNs are not recalibrated based on saliency, and this hinders classification performance. In this study, we propose a Spatial Attention Module (SAM) that is compatible with existing CNNs. We aim to exploit the spatial relationships among feature responses by scaling them according to their relative importance, thereby increasing classification accuracy. The results reveal the exceptional performance of SAM on Beijing Institute of Technology (BIT)-Vehicle, Stanford Cars and web-nature Comprehensive Cars (CompCarsWeb), with accuracies of 96.92%, 84.48% and 95.96%, respectively. A qualitative inspection of the learned feature embedding suggests high cohesion of features within each class. Furthermore, an ablation study is conducted to justify the hyperparameter choices for SAM. SAM is also modular: it is highly compatible with other CNNs and leads to considerable performance improvements. A comparison with existing attention modules suggests our proposal prevails in the VTR application. The inference times of 1 ms and 10 ms for CaffeNet-SAM and ResNet-SAM, respectively, also make them suitable for real-time classification tasks.
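The core idea described above, scaling each spatial location of a CNN feature map by its relative importance, can be illustrated with a minimal sketch. This record does not specify SAM's actual architecture, so the saliency computation and gating below (channel-mean saliency, sigmoid gate, `spatial_attention` function) are hypothetical illustrations of spatial recalibration in general, not the authors' method.

```python
import numpy as np

def spatial_attention(features):
    """Recalibrate a CNN feature map of shape (C, H, W) by spatial saliency.

    Illustrative sketch only: derive a per-location importance score,
    squash it to (0, 1), and scale every channel's response at that
    location accordingly.
    """
    # Per-location saliency: mean response across channels -> (H, W)
    saliency = features.mean(axis=0)
    # Sigmoid gate so each spatial position gets a weight in (0, 1)
    gate = 1.0 / (1.0 + np.exp(-saliency))
    # Broadcast the (H, W) gate over all C channels
    return features * gate[np.newaxis, :, :]

# Toy feature map: 4 channels over an 8x8 spatial grid
fmap = np.random.randn(4, 8, 8)
recalibrated = spatial_attention(fmap)
print(recalibrated.shape)  # (4, 8, 8)
```

Because the gate lies in (0, 1), salient locations keep most of their response while less important locations are attenuated; in a real module the saliency map would be learned rather than fixed.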

Keywords

Convolutional neural network, multi-head self-attention, spatial attention module, transformer, vehicle type recognition

Divisions

sch_ecs

Publication Title

IEEE Access

Volume

11

Publisher

Institute of Electrical and Electronics Engineers

Publisher Location

445 HOES LANE, PISCATAWAY, NJ 08855-4141 USA
