Academic Journal

Survey: Time-Series Data Preprocessing: A Survey and an Empirical Analysis

Bibliographic Details
Title: Survey: Time-Series Data Preprocessing: A Survey and an Empirical Analysis
Authors: Tawalkuli, Amal; Havers, Bastian, 1991; Gulisano, Vincenzo Massimiliano, 1984; Kaiser, Daniel; Engel, Thomas
Source: AutoSPADA (Automotive Stream Processing and Distributed Analytics) OODIDA Phase 2 Journal of Engineering Research. 13(2):674-711
Subject Terms: Data Preprocessing, Data Quality
Description: Data are naturally collected in their raw state and must undergo a series of preprocessing steps to obtain data in their input state for Artificial Intelligence (AI) and other applications. The data preprocessing phase is not only necessary to fit input requirements but also effective in improving AI training efficiency and output accuracy. Data preprocessing is a time-consuming and complex phase that lacks a unified and structured approach. We survey data preprocessing techniques under different categories to provide an extended and structured scope of data preprocessing relevant to numerical time-series data. We also provide an empirical analysis of the impact of preprocessing techniques on the quality of the data and on the performance of AI algorithms. In addition, we discuss the feasibility of distributing some of the surveyed techniques to the edge. Leveraging edge computing to distribute data preprocessing reduces the workload on central systems, creates more manageable data lakes, reduces the consumption of resources (e.g., energy) and enables EdgeAI.
File Description: electronic
Access URL: https://research.chalmers.se/publication/540273
https://research.chalmers.se/publication/540495
https://research.chalmers.se/publication/540495/file/540495_Fulltext.pdf
Database: SwePub
FullText Text:
  Availability: 0
CustomLinks:
  – Url: https://research.chalmers.se/publication/540273#
    Name: EDS - SwePub (ns324271)
    Category: fullText
    Text: View record in SwePub
  – Url: https://www.doi.org/10.1016/j.jer.2024.02.018?
    Name: ScienceDirect (all content) (s7799221)
    Category: fullText
    Text: View record from ScienceDirect
    MouseOverText: View record from ScienceDirect
  – Url: https://dx.doi.org/doi:10.1016/j.jer.2024.02.018
    Name: EDS - Springer Nature Journals (s7799221)
    Category: fullText
    Text: View record at Springer
  – Url: https://resolver.ebsco.com/c/fiv2js/result?sid=EBSCO:edsswe&genre=article&issn=23071885&ISBN=&volume=13&issue=2&date=20250101&spage=674&pages=674-711&title=AutoSPADA (Automotive Stream Processing and Distributed Analytics) OODIDA Phase 2 Journal of Engineering Research&atitle=Survey%3A%20Time-Series%20Data%20Preprocessing%3A%20A%20Survey%20and%20an%20Empirical%20Analysis&aulast=Tawalkuli%2C%20Amal&id=DOI:10.1016/j.jer.2024.02.018
    Name: Full Text Finder (for New FTF UI) (ns324271)
    Category: fullText
    Text: Full Text Finder
    MouseOverText: Full Text Finder
Header DbId: edsswe
DbLabel: SwePub
An: edsswe.oai.research.chalmers.se.1d423331.6919.45dc.a4f1.a45b0f1013c8
RelevancyScore: 1115
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 1114.736328125
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Survey: Time-Series Data Preprocessing: A Survey and an Empirical Analysis
– Name: Author
  Label: Authors
  Group: Au
  Data: Tawalkuli, Amal; Havers, Bastian, 1991; Gulisano, Vincenzo Massimiliano, 1984; Kaiser, Daniel; Engel, Thomas
– Name: TitleSource
  Label: Source
  Group: Src
  Data: AutoSPADA (Automotive Stream Processing and Distributed Analytics) OODIDA Phase 2 Journal of Engineering Research. 13(2):674-711
– Name: Subject
  Label: Subject Terms
  Group: Su
  Data: Data Preprocessing, Data Quality
– Name: Abstract
  Label: Description
  Group: Ab
  Data: Data are naturally collected in their raw state and must undergo a series of preprocessing steps to obtain data in their input state for Artificial Intelligence (AI) and other applications. The data preprocessing phase is not only necessary to fit input requirements but also effective in improving AI training efficiency and output accuracy. Data preprocessing is a time-consuming and complex phase that lacks a unified and structured approach. We survey data preprocessing techniques under different categories to provide an extended and structured scope of data preprocessing relevant to numerical time-series data. We also provide an empirical analysis of the impact of preprocessing techniques on the quality of the data and on the performance of AI algorithms. In addition, we discuss the feasibility of distributing some of the surveyed techniques to the edge. Leveraging edge computing to distribute data preprocessing reduces the workload on central systems, creates more manageable data lakes, reduces the consumption of resources (e.g., energy) and enables EdgeAI.
– Name: Format
  Label: File Description
  Group: SrcInfo
  Data: electronic
– Name: URL
  Label: Access URL
  Group: URL
  Data: https://research.chalmers.se/publication/540273 ; https://research.chalmers.se/publication/540495 ; https://research.chalmers.se/publication/540495/file/540495_Fulltext.pdf
PLink https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsswe&AN=edsswe.oai.research.chalmers.se.1d423331.6919.45dc.a4f1.a45b0f1013c8
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1016/j.jer.2024.02.018
    Languages:
      – Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 38
        StartPage: 674
    Subjects:
      – SubjectFull: Data Preprocessing
        Type: general
      – SubjectFull: Data Quality
        Type: general
    Titles:
      – TitleFull: Survey: Time-Series Data Preprocessing: A Survey and an Empirical Analysis
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Tawalkuli, Amal
      – PersonEntity:
          Name:
            NameFull: Havers, Bastian
      – PersonEntity:
          Name:
            NameFull: Gulisano, Vincenzo Massimiliano
      – PersonEntity:
          Name:
            NameFull: Kaiser, Daniel
      – PersonEntity:
          Name:
            NameFull: Engel, Thomas
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Type: published
              Y: 2025
          Identifiers:
            – Type: issn-print
              Value: 23071885
            – Type: issn-print
              Value: 23071877
            – Type: issn-locals
              Value: SWEPUB_FREE
            – Type: issn-locals
              Value: CTH_SWEPUB
          Numbering:
            – Type: volume
              Value: 13
            – Type: issue
              Value: 2
          Titles:
            – TitleFull: AutoSPADA (Automotive Stream Processing and Distributed Analytics) OODIDA Phase 2 Journal of Engineering Research
              Type: main
ResultId 1