Developing Data Workflows: From Conceptual Blueprints to Physical Implementation

Bibliographic Details
Main Authors: Bruno Oliveira, Óscar Oliveira
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Data
Online Access: https://www.mdpi.com/2306-5729/10/7/97
Description
Summary: Data workflows are an important component of modern analytical systems, enabling structured data extraction, transformation, integration, and delivery across diverse applications. Despite their importance, these workflows are often developed using ad hoc approaches, leading to scalability and maintenance challenges. This paper proposes a structured, three-level methodology (conceptual, logical, and physical) for modeling data workflows using Business Process Model and Notation (BPMN). A custom BPMN metamodel is introduced, along with a tool built on BPMN.io, that enforces modeling constraints and supports translation from high-level workflow designs to executable implementations. Logical models are further enriched through blueprint definitions, specified in a formal, implementation-agnostic JSON schema. The methodology is validated through a case study, demonstrating its applicability across ETL and machine learning domains, promoting clarity, reuse, and automation in data pipeline development.
ISSN: 2306-5729