Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports

The time-intensive extraction of insights from textual safety documents using conventional methods causes delays and inaccuracies, hindering proactive incident prevention in construction projects. While the architecture of large language models (LLMs) has been well studied, their deployment efficiencies...

Bibliographic Details
Main Authors: Vedat Togan, Fatemeh Mostofi, Onur Behzat Tokdemir, Fethi Kadioglu
Format: Article
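Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Construction industry; decision making; machine learning; natural language processing; project management; safety management
Online Access: https://ieeexplore.ieee.org/document/11023522/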
author Vedat Togan
Fatemeh Mostofi
Onur Behzat Tokdemir
Fethi Kadioglu
collection DOAJ
description The time-intensive extraction of insights from textual safety documents using conventional methods causes delays and inaccuracies, hindering proactive incident prevention in construction projects. While the architecture of large language models (LLMs) has been well studied, their deployment efficiency has often been overlooked. This study proposes DistilBERT as a more efficient text management method for extracting safety text from construction safety documents. To maintain the relevance of the extracted safety text, a dataset of 5,224 construction accident cases from 73 projects across the Euro-Asia region was compiled. Incidents were analyzed through detailed questionnaires to identify safety attributes, and term frequency-inverse document frequency (TF-IDF) analysis was applied for validation. When benchmarked against conventional NLP methods and state-of-the-art LLMs such as BERT, RoBERTa, and XLNet, DistilBERT demonstrated comparable accuracy with significantly reduced computational time. Specifically, DistilBERT achieved an accuracy of 79% in severity scale classification with an F1 score of 0.72, while reducing processing time by approximately 50% compared to BERT (from 2,918.28 seconds to 1,492.08 seconds). By offering rapid inference with negligible accuracy trade-offs, DistilBERT emerges as a practical tool for automating safety text extraction, well suited to settings with limited computational capabilities and urgent decision-making requirements. The study also examines how DistilBERT can be integrated into construction safety management systems without modifying the underlying platforms. Future work should focus on API creation, secure machine learning pipelines, and optimized deployment of LLMs, particularly in complex contexts.
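The "approximately 50%" figure in the description can be checked directly from the two timings it quotes. The following is a minimal sketch of that arithmetic only; the function names are illustrative and are not taken from the article, and nothing here reproduces the authors' benchmark.

```python
# Timings quoted in the record's description, in seconds.
BERT_SECONDS = 2918.28
DISTILBERT_SECONDS = 1492.08


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline time saved by the faster model."""
    return (baseline - improved) / baseline


def speedup(baseline: float, improved: float) -> float:
    """How many times faster the improved run is than the baseline."""
    return baseline / improved


reduction = relative_reduction(BERT_SECONDS, DISTILBERT_SECONDS)
factor = speedup(BERT_SECONDS, DISTILBERT_SECONDS)

print(f"time saved: {reduction:.1%}")
print(f"speedup: {factor:.2f}x")
```

The saving works out to a little under half of BERT's processing time, which is consistent with the rounded figure given in the description.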
format Article
id doaj-art-c48c73c007eb43bd88d8110bc3eb1f6f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-c48c73c007eb43bd88d8110bc3eb1f6f
2025-08-20T03:44:51Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
Volume 13, pages 99758-99777
10.1109/ACCESS.2025.3576442
11023522
Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports
Vedat Togan (https://orcid.org/0000-0001-8734-6300), Civil Engineering Department, Karadeniz Technical University, Trabzon, Türkiye
Fatemeh Mostofi (https://orcid.org/0000-0003-0974-1270), Civil Engineering Department, Karadeniz Technical University, Trabzon, Türkiye
Onur Behzat Tokdemir (https://orcid.org/0000-0002-4101-8560), Civil Engineering Department, Istanbul Technical University, Istanbul, Türkiye
Fethi Kadioglu (https://orcid.org/0000-0001-7049-1704), Civil Engineering Department, Istanbul Technical University, Istanbul, Türkiye
https://ieeexplore.ieee.org/document/11023522/
Construction industry; decision making; machine learning; natural language processing; project management; safety management
title Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports
topic Construction industry
decision making
machine learning
natural language processing
project management
safety management
url https://ieeexplore.ieee.org/document/11023522/