Doubly Structured Data Synthesis for Time-Series Energy-Use Data
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-12-01 |
| Series: | Sensors |
| Subjects: | |
| Online Access: | https://www.mdpi.com/1424-8220/24/24/8033 |
| Summary: | As the demand for efficient energy management increases, the need for extensive, high-quality energy data becomes critical. However, privacy concerns and insufficient data volume pose significant challenges. To address these issues, data synthesis techniques are employed to augment and replace real data. This paper introduces Doubly Structured Data Synthesis (DS²), a novel method to tackle privacy concerns in time-series energy-use data. DS² synthesizes rate changes to maintain longitudinal information and uses calibration techniques to preserve the cross-sectional mean structure at each time point. Numerical analyses reveal that DS² surpasses existing methods, such as Conditional Tabular GAN (CTGAN) and Transformer-based Time-Series Generative Adversarial Network (TTS-GAN), in capturing both time-series and cross-sectional characteristics. We evaluated our proposed method using metrics for data similarity, utility, and privacy. The results indicate that DS² effectively retains the underlying characteristics of real datasets while ensuring adequate privacy protection. DS² is a valuable tool for sharing and utilizing energy data, significantly enhancing energy demand prediction and management. |
| ISSN: | 1424-8220 |
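
The summary names two ideas: synthesizing rate changes to keep longitudinal structure, and calibrating so the cross-sectional mean at each time point is preserved. The Python sketch below is only a toy illustration of those two ideas and is not the paper's algorithm. The function name `synthesize`, the donor-resampling of rate changes, the multiplicative (ratio) calibration, and the sample readings are all assumptions made for illustration; the record does not describe how DS² actually performs either step.

```python
import random


def synthesize(real, seed=0):
    """Toy sketch (hypothetical, not the published DS² procedure).

    real: list of equal-length series, one per unit (e.g. household),
          with strictly positive values so that ratios are defined.
    """
    rng = random.Random(seed)
    n, length = len(real), len(real[0])

    # Longitudinal step (assumed form): rate change x[t+1] / x[t] per unit.
    rates = [[row[t + 1] / row[t] for t in range(length - 1)] for row in real]

    synthetic = []
    for _ in range(n):
        # Start from a real starting value, then apply, at each step,
        # the rate change of a randomly chosen donor unit.
        value = rng.choice(real)[0]
        series = [value]
        for t in range(length - 1):
            value *= rng.choice(rates)[t]
            series.append(value)
        synthetic.append(series)

    # Cross-sectional step (assumed form): ratio calibration, rescaling each
    # time point so the synthetic mean equals the real mean at that point.
    for t in range(length):
        real_mean = sum(row[t] for row in real) / n
        synth_mean = sum(row[t] for row in synthetic) / n
        factor = real_mean / synth_mean
        for row in synthetic:
            row[t] *= factor

    return synthetic


# Made-up readings for three units over four time points (illustrative only).
real = [
    [1.0, 1.2, 0.9, 1.1],
    [2.0, 2.5, 2.2, 2.4],
    [0.5, 0.6, 0.7, 0.65],
]
synthetic = synthesize(real, seed=42)
for row in synthetic:
    print([round(v, 3) for v in row])
```

In this sketch the calibration step rescales each time point independently, so the synthetic rate changes are slightly altered after calibration. Whether and how DS² balances the two structures is not stated in this record.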