Date of Award

1-1-2024

Thesis Type

masters

Document Type

Thesis (Restricted Access)

Divisions

science

Department

Institute of Mathematical Sciences

Institution

Universiti Malaya

Abstract

The ease and affordability of image data acquisition have made whole-image analysis an attractive analytical approach in biological research. Coupled with machine learning, whole-image analysis has the potential to complement or even supplant traditional morphometric approaches for species identification in medical, veterinary, and forensic entomology. Here, I used a substantially expanded dataset (n = 759; 13 species and a species variant; 3 families) to consolidate findings from a pilot study (n = 74; 15 species; 2 families) on automated identification of fly species based on their wing venation patterns, using classical Krawtchouk moment invariants coupled with a random forest model. To leverage state-of-the-art methods in image analysis, I also conducted a comparative analysis using ResNet, a deep learning model. Five-fold cross-validation results show mean identification accuracies of 98.56 ± 0.38% and 99.60 ± 0.27% at the family level, and 91.04 ± 1.33% and 97.87 ± 1.01% at the species level, for the classical and deep learning approaches, respectively. Additionally, the mean F1-scores of 0.89 ± 0.02 and 0.97 ± 0.01, respectively, indicate a good balance of precision and recall for both models. Importantly, the regions of the fly wings that ResNet uses for species identification were successfully visualised using Grad-CAM heatmaps, which facilitates interpretation of the putative biological bases of the ResNet identifications. In summary, this study demonstrates the extent to which species differences in the studied dipteran species can be expressed in wing morphology, both quantitatively and qualitatively, through image data. Specifically, the findings from interpretable deep learning are potentially useful for generating hypotheses about putative wing anatomies that hold taxonomic value.

Note

Thesis (M.A) – Faculty of Science, Universiti Malaya, 2024.
