Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports
The time-intensive extraction of insights from textual safety documents using conventional methods causes delays and inaccuracies, hindering proactive incident prevention in construction projects. While the architecture of large language models (LLMs) has been well studied, their deployment efficiencies...
| Main Authors: | Vedat Togan, Fatemeh Mostofi, Onur Behzat Tokdemir, Fethi Kadioglu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
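| Subjects: | Construction industry; decision making; machine learning; natural language processing; project management; safety management |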
| Online Access: | https://ieeexplore.ieee.org/document/11023522/ |
| _version_ | 1849336886925983744 |
|---|---|
| author | Vedat Togan Fatemeh Mostofi Onur Behzat Tokdemir Fethi Kadioglu |
| author_sort | Vedat Togan |
| collection | DOAJ |
| description | The time-intensive extraction of insights from textual safety documents using conventional methods causes delays and inaccuracies, hindering proactive incident prevention in construction projects. While the architecture of large language models (LLMs) has been well studied, their deployment efficiency has often been overlooked. This study proposes DistilBERT as a more efficient text management method for extracting safety text from construction safety documents. To maintain the relevance of the extracted safety text, a dataset of 5,224 construction accident cases from 73 projects across the Euro-Asia region was compiled; incidents were analyzed through detailed questionnaires to identify safety attributes, and term frequency-inverse document frequency (TF-IDF) analysis was applied for validation. When benchmarked against conventional NLP methods and state-of-the-art LLMs such as BERT, RoBERTa, and XLNet, DistilBERT demonstrated comparable accuracy with significantly reduced computational time. Specifically, DistilBERT achieved an accuracy of 79% in severity scale classification with an F1 score of 0.72, while reducing processing time by approximately 50% compared to BERT (from 2,918.28 seconds to 1,492.08 seconds). By offering rapid inference speeds with negligible accuracy trade-offs, DistilBERT emerges as a practical tool for automating safety text extraction, making it well suited to settings with limited computational capabilities and urgent decision-making requirements. This study examines how DistilBERT can be integrated into construction safety management systems without modifying the underlying platforms. Future work should focus on API creation, secure machine learning pipelines, and optimized deployment of LLMs, particularly in complex contexts. |
| format | Article |
| id | doaj-art-c48c73c007eb43bd88d8110bc3eb1f6f |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-c48c73c007eb43bd88d8110bc3eb1f6f; 2025-08-20T03:44:51Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 99758-99777; DOI 10.1109/ACCESS.2025.3576442; document 11023522; Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports; Vedat Togan (https://orcid.org/0000-0001-8734-6300), Fatemeh Mostofi (https://orcid.org/0000-0003-0974-1270), Onur Behzat Tokdemir (https://orcid.org/0000-0002-4101-8560), Fethi Kadioglu (https://orcid.org/0000-0001-7049-1704); affiliations in author order: Civil Engineering Department, Karadeniz Technical University, Trabzon, Türkiye; Civil Engineering Department, Karadeniz Technical University, Trabzon, Türkiye; Civil Engineering Department, Istanbul Technical University, Istanbul, Türkiye; Civil Engineering Department, Istanbul Technical University, Istanbul, Türkiye; https://ieeexplore.ieee.org/document/11023522/; Construction industry; decision making; machine learning; natural language processing; project management; safety management |
| title | Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports |
| topic | Construction industry decision making machine learning natural language processing project management safety management |
| url | https://ieeexplore.ieee.org/document/11023522/ |
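
The description reports that DistilBERT cut processing time by "approximately 50%" relative to BERT (from 2,918.28 seconds to 1,492.08 seconds). The arithmetic behind that figure can be checked with a minimal sketch like the one below. The function and constant names are hypothetical and chosen only for illustration; only the two timings come from the record.

```python
# Timings quoted in the record's description, in seconds.
BERT_SECONDS = 2918.28
DISTILBERT_SECONDS = 1492.08


def time_reduction(baseline_s: float, candidate_s: float) -> float:
    """Fractional reduction in processing time relative to the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (baseline_s - candidate_s) / baseline_s


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate_s must be positive")
    return baseline_s / candidate_s


reduction = time_reduction(BERT_SECONDS, DISTILBERT_SECONDS)
factor = speedup(BERT_SECONDS, DISTILBERT_SECONDS)

print(f"time reduction: {reduction:.1%}")  # prints "time reduction: 48.9%"
print(f"speedup: {factor:.2f}x")           # prints "speedup: 1.96x"
```

Under these two numbers the reduction is about 48.9%, which is consistent with the "approximately 50%" stated in the description.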