Research Publications (2021 to 2025)

A deep learning framework for multi-object tracking in team sports videos

Document Type

Article

Publication Date

8-1-2024

Abstract

Multi-Object Tracking (MOT) in sports scenes faces severe occlusions, similar appearances, drastic pose changes, and complex motion patterns. To address these challenges, a deep-learning framework, CTGMOT (CNN-Transformer-GNN-based MOT), is proposed specifically for tracking multiple athletes in sports videos; it jointly models detection, appearance, and motion features. First, a detection network that combines Convolutional Neural Networks (CNN) and Transformers is constructed to extract both local and global features from images, and appearance and motion features are fused through parallel dual-branch decoders. Second, graph models are built using Graph Neural Networks (GNN) to capture the spatio-temporal correlations between object and trajectory features from inter-frame and intra-frame associations. Experimental results on the public sports tracking dataset SportsMOT show that the proposed framework outperforms other state-of-the-art MOT methods in complex sports scenes. In addition, the proposed framework generalises well to the benchmark datasets MOT17 and MOT20.

The authors propose a deep-learning framework, CTGMOT, for multi-object tracking (MOT) in complex team sports videos. The backbone network of the framework combines CNN and Transformers to extract local and global features, and uses parallel decoders to fuse appearance and motion features. To capture spatio-temporal correlations, the framework adopts GNN and an attention mechanism to fuse the spatial tracking features of objects within frames with the temporal tracking features across different frames, which better distinguishes fast-moving and occluded targets and improves the performance of online MOT.
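To make the idea of fusing appearance and motion cues concrete, the following is a minimal sketch of frame-to-frame association. It is not the authors' method: CTGMOT learns these associations with a GNN and an attention mechanism, whereas this example substitutes a hand-weighted score (cosine similarity of appearance vectors plus box overlap) and a greedy one-to-one matcher. The function names, the dictionary keys `box` and `feat`, and the parameters `w_app` and `min_score` are all hypothetical and chosen only for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two appearance vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def associate(tracks, detections, w_app=0.5, min_score=0.3):
    """Greedily match tracks to detections; returns {track_index: detection_index}.

    Assumption: a fixed weight w_app blends appearance similarity with box
    overlap (a crude stand-in for a motion cue). Pairs scoring below
    min_score are never matched.
    """
    scored = []
    for ti, t in enumerate(tracks):
        for di, d in enumerate(detections):
            score = (w_app * cosine(t["feat"], d["feat"])
                     + (1.0 - w_app) * iou(t["box"], d["box"]))
            if score >= min_score:
                scored.append((score, ti, di))
    scored.sort(reverse=True)  # best-scoring pairs are claimed first

    matches, used_detections = {}, set()
    for _, ti, di in scored:
        if ti in matches or di in used_detections:
            continue
        matches[ti] = di
        used_detections.add(di)
    return matches


# Two athletes from the previous frame and two detections in the current
# frame, listed in the opposite order.
tracks = [
    {"box": (0, 0, 10, 20), "feat": (1.0, 0.0)},
    {"box": (50, 0, 60, 20), "feat": (0.0, 1.0)},
]
detections = [
    {"box": (51, 0, 61, 20), "feat": (0.1, 0.9)},
    {"box": (1, 0, 11, 20), "feat": (0.9, 0.1)},
]
print(associate(tracks, detections))  # {0: 1, 1: 0}
```

A fixed weight like this tends to fail in exactly the cases the abstract lists (near-identical team kits, heavy occlusion, abrupt motion), which is the motivation for learning the association from inter-frame and intra-frame graph structure instead.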

Keywords

feature extraction, feature selection, image motion analysis, neural net architecture, object detection, object tracking, sport

Publication Title

IET Computer Vision

Recommended Citation

Cao, Wei; Wang, Xiaoyong; Liu, Xianxiang; and Xu, Yishuai, "A deep learning framework for multi-object tracking in team sports videos" (2024). Research Publications (2021 to 2025). 5433.
https://knova.um.edu.my/research_publications_2021_2025/5433

Divisions

library

Funders

Anhui Provincial Department of Education

Volume

Issue

Publisher

Institution of Engineering and Technology (IET)

Publisher Location

111 RIVER ST, HOBOKEN 07030-5774, NJ USA

