Reducing class overlap in feature space using Kalman filtering with particle swarm optimization

Document Type

Article

Publication Date

2026

Abstract

Class overlap is a significant obstacle in supervised learning: it produces ambiguous decision surfaces and degrades classification performance. Conventional overlap-cleaning techniques rely heavily on either deleting observations or duplicating observations among the classes, which causes loss of information or biased reallocation among the classes. This research proposes a new hybrid framework that builds on principal component analysis, Kalman filtering (KF), and particle swarm optimization (PSO) to mitigate overlap in the feature space while retaining all data samples. The proposed method treats each feature vector as a noisy observation of an underlying class state. The KF iteratively refines the observed features, and PSO adaptively tunes the process and measurement noise covariances to improve class separability robustly. The procedure was validated with multiple classifiers, such as support vector machines, naive Bayes, and linear discriminant analysis, across four benchmark datasets (dogs vs. cats, lithium battery, Wisconsin diagnostic breast cancer (WDBC), and extended Yale face). All classifiers show significant increases in classification accuracy and significantly reduced overlap ratios compared with traditional methods such as edited nearest neighbor, multi-class combined cleaning and resampling, one-sided selection, reduced noise synthetic minority oversampling technique, neighborhood cleaning rule, and Tomek links removal. Across all datasets, the overlap percentage decreases consistently and significantly: from 25% to 2.5% in the dogs vs. cats dataset, 8.89% to 0.37% in the lithium battery dataset, 8.44% to 0.7% in the WDBC dataset, and 9.52% to 0% in the extended Yale face dataset.
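
The idea of treating each feature value as a noisy observation of an underlying class state can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a scalar Kalman filter with a constant-state model applied to one feature dimension of one class, with samples processed in list order, and it fixes the process noise `q` and measurement noise `r` by hand where the abstract says PSO tunes them. The names `kalman_refine` and `variance`, the initial variance of 1.0, and the synthetic data are all hypothetical.

```python
import random


def kalman_refine(values, q, r):
    """Scalar Kalman filter with a constant-state model (illustrative only).

    values: observed values of one feature for one class
    q: process noise variance (assumed fixed here; PSO-tuned in the paper)
    r: measurement noise variance (assumed fixed here; PSO-tuned in the paper)
    """
    x = values[0]  # initial state estimate: the first observation
    p = 1.0        # initial estimate variance (assumption)
    refined = []
    for z in values:
        p = p + q            # predict: state assumed constant, uncertainty grows by q
        k = p / (p + r)      # Kalman gain
        x = x + k * (z - x)  # update the estimate with the observed value
        p = (1.0 - k) * p    # posterior estimate variance
        refined.append(x)
    return refined


def variance(values):
    """Population variance, used here as a rough spread measure."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# Synthetic two-class, one-feature data (hypothetical, not from the paper).
rng = random.Random(0)
class_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
class_b = [rng.gauss(2.0, 1.0) for _ in range(200)]

refined_a = kalman_refine(class_a, q=1e-3, r=1.0)
refined_b = kalman_refine(class_b, q=1e-3, r=1.0)

print("spread of class A before/after:", variance(class_a), variance(refined_a))
print("spread of class B before/after:", variance(class_b), variance(refined_b))
```

Every sample is kept and only its value is refined, in line with the abstract's claim that no data is deleted or duplicated. In this sketch a small `q` relative to `r` pulls the refined values toward a per-class state and shrinks within-class spread, which is the kind of trade-off an optimizer such as PSO could search over.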

Keywords

Kalman filter (KF), Particle swarm optimization (PSO), k-Nearest neighbor (kNN)

Publication Title

Pattern Recognition

ISSN

0031-3203

DOI

10.1016/j.patcog.2025.112952

Divisions

Faculty of Engineering

Volume

174

First Page

112952

Publisher

Elsevier
