Towards studying the effect of compiler optimizations and software randomization on GPU reliability

Bibliographic record details
Title: Towards studying the effect of compiler optimizations and software randomization on GPU reliability
Authors: Castillón, Pau López; Hernández, Xavier Caricchio; Kosmidis, Leonidas
Contributors: Pau López Castillón, Xavier Caricchio Hernández, and Leonidas Kosmidis
Source: UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Publisher Information: Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2025.
Publication Year: 2025
Subject Terms: Reliability, Àrees temàtiques de la UPC::Informàtica::Enginyeria del software, Software randomization, Error rate, ddc:004, Graphics processing units
Description: The evolution of Graphics Processing Unit (GPU) compilers has facilitated support for general-purpose programming languages across various architectures. The NVIDIA CUDA Compiler (NVCC) employs multiple compilation levels prior to generating machine code, implementing intricate optimizations to enhance performance. These optimizations influence how software is mapped to the underlying hardware, which can also impact GPU reliability. TASA is a source-to-source code randomization tool designed to alter the mapping of software onto the underlying hardware. It achieves this by generating random permutations of variable and function declarations and by introducing random padding between declarations of different types, thereby modifying the program memory layout. Since this changes where the declarations reside in memory, it also changes their cache placement, affecting both their execution time (the conflicts between them differ, resulting in a different number of cache misses in every execution) and their lifetime in the cache. In this work, which is part of the HiPEAC Student Challenge 2025, we first examine the reproducibility of a subset of the data presented in the ACM TACO paper "Assessing the Impact of Compiler Optimizations on GPU Reliability" [Santos et al., 2024], and second, we extend it by combining it with our proposal of software randomization. That paper indicates that the -O3 optimization flag allows a larger workload to be completed before failures occur within the application. By employing TASA, we investigate the impact of GPU randomization on reliability and performance metrics. By reproducing the results of the paper on a different GPU platform, we observe the same trend as reported in the original publication. Moreover, our preliminary results with the application of software randomization show in several cases an improved Mean Workload Between Failures (MWBF) compared to the original source code.
This work was supported by the ESA-funded project “Open Source Software Randomisation Framework for Probabilistic WCET Prediction and Security on (multicore) CPUs, GPUs and Accelerators” as well as the European Commission’s METASAT Horizon Europe project (grant agreement 101082622). Moreover, it was also partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) under the grant IJC2020-045931-I.
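The layout randomization summarized in the description can be illustrated with a toy model. The sketch below is not TASA's implementation and does not transform source code; it only shows, under stated assumptions, how shuffling declarations and inserting random padding changes the cache set each object starts in. The function names (`random_layout`, `cache_set`), the declaration list, and the cache geometry (128-byte lines, 64 sets) are all hypothetical choices made for illustration.

```python
import random

# Assumed cache geometry, chosen only for illustration.
CACHE_LINE = 128  # bytes per cache line
NUM_SETS = 64     # number of cache sets


def random_layout(decls, seed, max_pad=256, align=4):
    """Place (name, size) declarations in a random order with random padding.

    Returns a dict mapping each name to its starting byte offset.
    """
    rng = random.Random(seed)
    order = list(decls)
    rng.shuffle(order)  # random permutation of the declarations
    layout, offset = {}, 0
    for name, size in order:
        # Random padding before each object, kept a multiple of `align`.
        offset += rng.randrange(0, max_pad + 1, align)
        layout[name] = offset
        # Advance past the object and round up to the next aligned offset.
        offset = (offset + size + align - 1) // align * align
    return layout


def cache_set(offset):
    """Cache set index of the line that contains byte `offset`."""
    return (offset // CACHE_LINE) % NUM_SETS


if __name__ == "__main__":
    # Hypothetical declarations: (name, size in bytes).
    decls = [("input", 4096), ("weights", 2048), ("counter", 4)]
    for seed in range(3):
        layout = random_layout(decls, seed)
        print(seed, {name: cache_set(off) for name, off in layout.items()})
```

Each seed yields a different assignment of objects to cache sets, which is the mechanism the description credits for the change in cache conflicts and in how long data stays in the cache.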
Document Type: Conference object
Article
File Description: application/pdf
Language: English
DOI: 10.4230/oasics.parma-ditam.2025.4
Access URL: https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.4
Rights: CC BY
Accession Number: edsair.dedup.wf.002..cec53899b6e3cc6e48233c0eaef77bd3
Database: OpenAIRE