Contextualizing injury severity from occupational accident reports using an optimized deep learning prediction model

Document Type

Article

Publication Date

4-1-2024

Abstract

Background: This study introduced a novel approach for predicting occupational injury severity by leveraging deep learning-based text classification techniques to analyze unstructured narratives. Unlike conventional methods that rely on structured data, our approach recognizes the richness of information within injury narrative descriptions, with the aim of extracting valuable insights for improved occupational injury severity assessment.
Methods: Natural language processing (NLP) techniques were used to preprocess occupational injury narratives obtained from the US Occupational Safety and Health Administration (OSHA), covering January 2015 to June 2023. The narratives were preprocessed to standardize the text and eliminate noise, then represented by integrating Term Frequency-Inverse Document Frequency (TF-IDF) with Global Vectors (GloVe) word embeddings. The proposed predictive model adopts a novel Bidirectional Long Short-Term Memory (Bi-LSTM) architecture and is further refined through model optimization, including random search over hyperparameters and in-depth feature importance analysis. The optimized Bi-LSTM model was compared and validated against other machine learning classifiers: naïve Bayes, support vector machine, random forest, decision tree, and K-nearest neighbor.
Results: The proposed optimized Bi-LSTM model showed superior predictive performance, achieving an accuracy of 0.95 for hospitalization and 0.98 for amputation cases, with faster model processing times. The feature importance analysis also revealed predictive keywords related to the causal factors of occupational injuries, providing valuable insights that enhance model interpretability.
Conclusion: Our proposed optimized Bi-LSTM model offers safety and health practitioners an effective tool to support proactive workplace safety measures, thereby contributing to business productivity and sustainability. This study lays the foundation for further exploration of predictive analytics in the occupational safety and health domain.
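
As a rough illustration of the TF-IDF weighting mentioned in the Methods, the following minimal sketch computes term weights for a few short narratives using only the Python standard library. It is not the authors' implementation: the tokenizer, the plain `ln(N / df)` form of the inverse document frequency, the function names `tokenize` and `tf_idf`, and the example narratives are all assumptions made for illustration. The abstract does not specify how the TF-IDF weights are combined with the GloVe embeddings, so that step is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simple noise filter)."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = ln(N / df), the plain textbook form (an assumption; libraries
          often use smoothed variants)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter(term for toks in tokenized for term in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical narratives, invented for illustration only (not OSHA data)
narratives = [
    "Employee's finger was amputated by a table saw.",
    "Employee fell from a ladder and was hospitalized.",
    "Employee was hospitalized after a fall from a scaffold.",
]
weights = tf_idf(narratives)
```

Under this weighting, a word that appears in every narrative (such as "employee") receives a weight of zero, while a word confined to one narrative (such as "amputated") receives a positive weight, which is the property that makes TF-IDF useful for surfacing severity-related keywords.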

Keywords

Natural language processing, Machine learning, Deep learning, Text classification, Occupational injury, Occupational safety and health

Divisions

biomedengine

Funders

Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia (4118)

Publication Title

PeerJ Computer Science

Volume

10

Publisher

PeerJ

Publisher Location

341-345 OLD ST, THIRD FLR, LONDON, EC1V 9LL, ENGLAND
