SlideShare a Scribd company logo
1 of 31
Download to read offline
Ajouter une autre dimension a la 
qualité des logiciels 
Exemple de mise en oeuvre d’un outil 
d’analyse statique 
Rene Brun, CERN
CERN 
• Organisation Européenne pour la Recherche 
Nucléaire (Physique des particules): Frontière 
Franco-Suisse près de Genève 
• Recherche fondamentale, mais aussi avec des 
retombées pratiques (électronique, vide, 
cryogénie, scanners et le WEB 
• Comprendre le big-bang et l'évolution de 
l’univers
Le LHC 
(Large Hadron Collider) 
• Un tunnel de 28km a 100m sous terre a la frontiere FR/CH 
• Collisioneur de protons ou d’ions lourds 
• 10000 physiciens
LHC 
LHC 
Large Hadron Collider
CERN en chiffres 
• Utilisé par 10000 physiciens de 608 universités 
dans 113 pays. 
• Plus de 1000 physiciens par expérience 
• Une expérience peut durer 40 ans! 
• The LHC génère 15,000,000 Gigabytes de 
données par an. 
• Budget: 1 milliard de CHF/an
Informatique au CERN 
• Approx 50 MLOC C++ par les physiciens 
• Beaucoup de données: code doit être rapide 
• The code doit être correct: résultats 
scientifiques; 
• Doit être stable: 1 Higgs / 10,000,000,000,000 
collisions;
Logiciels au CERN 
• Chaque expérience développe et maintient 
ses propres logiciels (2/3 du total environ). 
• The CERN développe et maintient des outils 
généralistes communs a toutes les 
expériences: ROOT: math / statistiques, 
stockage et management des données, 
simulation de détecteurs, graphiques, GUI, 
calculs en parallèle.
Logiciels des expériences 
• Plusieurs centaines de 
developpeurs/expérience: physiciens, 
quelques experts logiciels; actifs pour 
quelques années seulement (beaucoup de 
PostDoc). 
• Développement très souvent anarchique. 
• Beaucoup de dépendances sur des logiciels 
externes.
ROOT 
• http://root.cern.ch 
• 7 Developpeurs @ 2.6 MLOC 
• Début en 1995, 90% du code avant 2005. Doit 
être maintenu sur un grand nombre de systèmes 
durant au moins 30 ans. 
• Deux releases / an 
• “Core project”, utilise par toutes les expériences/ 
physiciens y compris en dehors du domaine du 
CERN: 95000 adresses IP distinctes/an.
Software Risks 
• Maintenance 
• Implosion of huge software stack 
• Integrity of years of LHC experiments’ data 
• Correctness of scientific results 
• Failing searches for new physics 
• Note: driven by reality, not ISO xyz
Challenges For QA 
• C++ 
• anarchy 
• few experts 
• multi-platform 
• huge programs 
• part-time coding 
• input data volume 
• software as a tool 
• interpreter binding 
• lack of reproducibility
QA Requirements 
• Results must be easy to understand: novices, 
spare time coding, complex code 
• Results must be relevant: motivate, don’t 
flood, developers 
• QA must handle rare code paths: don’t hide 
the Higgs 
• QA as part of nightly / continuous testing
QA Toolbox 
• Patch review (ROOT) 
• Continuous integration (ROOT) 
• Multi-Platform: compilers, versions, 
architectures 
• Unit + functionality test suites 100k LOC for 
ROOT 
• Regression tests on performance, results
QA Toolbox (2) 
• Nightly builds + tests + automatic reports 
• Memory: valgrind, massif, ROOT, igprof... 
• Performance: callgrind, shark, gprof... 
• Test coverage: gcov 
• Coding rule checkers 
• Automation + strong software management
Memory QA 
• C++ inherent: memory errors 
– use after delete 
– uninitialized pointers 
– buffer overruns 
– memory leaks 
• Amplified by novice coders, non-defined 
ownership rules,...
Performance QA 
• Huge amounts of data, limited computing, 
thus performance sensitive 
• Difficult to predict: caching, templates, 
inlining, call of external libraries 
• Monitoring difficult to present, interpret 
• Using advanced tools + techniques to handle 
>10 MLOC in a novice-friendly way
QA Toolbox: Full?
Open QA Issues 
• Those almost impossible code paths! 
• Problems that cannot be reproduced 
• Too many code paths for humans 
• Condition matrices 
• “You know what I mean” coding
Static Code Checkers 
• Check what could be run, not what is run 
• CERN did a study of open source options
Coverity @ CERN 
• ROOT traditionally (informally) spearheads 
new technologies at CERN: what works for 
ROOT (and scales!) works for CERN 
• Realized lack of static analysis at CERN 
• Discovered Coverity through famous Scan: 
Linux, Python, Apache, glibc, openSSL,... 
• ROOT as CERN’s guinea pig project
ROOT Meets Coverity 
• Plan: 
– Product discovered by a team member 
– Convince ROOT developers 
– Convince CERN 
• Set up as “idle time project” in 4 weeks: build 
integration, automation, reporting 
• 2.6 MLOC: 4 processes * 8 hours
First Coverity Results
Report Quality 
• Very readable reports embedded in source 
• Mostly easy to understand, though complex 
contexts make complex reports 
• Amazing Relevance: 12% false positives, 23% 
intentional thus 2/3 reports causing source 
changes - despite an already well-filled QA 
toolbox
Consequence 
• Relevant reports create motivated developers!
Results With ROOT 
• 4300 reports by 7 developers in 6 weeks 
• Noticeably better release 
– Instability due blind changes, e.g. “missing break” 
• Found several long-hunted bugs, e.g. hard to 
reproduce or track down 
• Systematic deficiencies triggered systematic 
fixes: buffer overflows / strlcpy
Coverity On 50MLOC 
• Excellent show-case for experiments 
• Immediate interest: 
– Allows for precise, relevant, accessible, large-scale, 
automated, quick turn-around reports for 
software management 
– Classification in high / low impact helps fixing 
critical reports first 
– Increased awareness of software quality issues
Setup 
• Separate build + analysis server for each 
experiment due to processing times: 4 
processes * 8 hours to 4 days - partially 
caused by build systems 
• Dedicated database / tomcat server for web 
front-end, for each experiment 
• Keeps databases reasonably small, user (web) 
interface responsive
Initial Numbers: Time 
5 
0 
20 
15 
10 
ALICE 
ATLAS 
CMS 
LHCb 
ROOT 
Units 
Translation 1000 
/ MLOC 
Time 
Days 
Analysis /
Initial Numbers: Issues 
8000 
6000 
2000 
0 
4000 
ALICE 
ATLAS 
CMS 
LHCb 
ROOT 
High Impact 
Defects
Has It Arrived? 
• ROOT, particle beam, data transfer, virtual 
machine etc: nightly / regular use 
• All LHC experiments use it, one even 
embedded in automatic QA 
• Coverity is known in all experiments as “the 
tool that doesn’t forgive lazy coding”
Summary 
• Software quality a key factor for CERN: 
influences quality of scientific results 
• Blind spot despite of large QA toolset 
• Coverity unique in abilities and quality 
– Improved software, happier users 
– Quick adoption at CERN (rare!): recognized to 
improve software quality dramatically

More Related Content

What's hot

Operational War Stories from 5 Years of Running OpenStack in Production
Operational War Stories from 5 Years of Running OpenStack in ProductionOperational War Stories from 5 Years of Running OpenStack in Production
Operational War Stories from 5 Years of Running OpenStack in ProductionArne Wiebalck
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseDataStax Academy
 
Juggling with Bits and Bytes - How Apache Flink operates on binary data
Juggling with Bits and Bytes - How Apache Flink operates on binary dataJuggling with Bits and Bytes - How Apache Flink operates on binary data
Juggling with Bits and Bytes - How Apache Flink operates on binary dataFabian Hueske
 
Case study: formal verification of the Brain Fuck Scheduler
Case study: formal verification of the Brain Fuck SchedulerCase study: formal verification of the Brain Fuck Scheduler
Case study: formal verification of the Brain Fuck SchedulerMengxuan Xia
 
Integrating Bare-metal Provisioning into CERN's Private Cloud
Integrating Bare-metal Provisioning into CERN's Private CloudIntegrating Bare-metal Provisioning into CERN's Private Cloud
Integrating Bare-metal Provisioning into CERN's Private CloudArne Wiebalck
 
OSMC 2021 | Monitoring Open Source Hardware
OSMC 2021 | Monitoring Open Source HardwareOSMC 2021 | Monitoring Open Source Hardware
OSMC 2021 | Monitoring Open Source HardwareNETWAYS
 
TRAP (transient detection pipeline) status update
TRAP (transient detection pipeline) status updateTRAP (transient detection pipeline) status update
TRAP (transient detection pipeline) status updateGijs Molenaar
 
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinMoon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinFlink Forward
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...SignalFx
 
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...Paul Brebner
 
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry confluent
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Sourceaspyker
 
K. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward KeynoteK. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward KeynoteFlink Forward
 
OSMC 2021 | Robotmk: You don’t run IT – you deliver services!
OSMC 2021 | Robotmk: You don’t run IT – you deliver services!OSMC 2021 | Robotmk: You don’t run IT – you deliver services!
OSMC 2021 | Robotmk: You don’t run IT – you deliver services!NETWAYS
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and visionStephan Ewen
 
Gossip & Key Value Store
Gossip & Key Value StoreGossip & Key Value Store
Gossip & Key Value StoreSajeev P
 
Eac integrations JS LiveStream
Eac integrations JS LiveStreamEac integrations JS LiveStream
Eac integrations JS LiveStreamChronoLogic
 
Upstream Testing Collaboration
Upstream Testing Collaboration Upstream Testing Collaboration
Upstream Testing Collaboration OPNFV
 

What's hot (20)

Operational War Stories from 5 Years of Running OpenStack in Production
Operational War Stories from 5 Years of Running OpenStack in ProductionOperational War Stories from 5 Years of Running OpenStack in Production
Operational War Stories from 5 Years of Running OpenStack in Production
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series Database
 
Juggling with Bits and Bytes - How Apache Flink operates on binary data
Juggling with Bits and Bytes - How Apache Flink operates on binary dataJuggling with Bits and Bytes - How Apache Flink operates on binary data
Juggling with Bits and Bytes - How Apache Flink operates on binary data
 
Case study: formal verification of the Brain Fuck Scheduler
Case study: formal verification of the Brain Fuck SchedulerCase study: formal verification of the Brain Fuck Scheduler
Case study: formal verification of the Brain Fuck Scheduler
 
Integrating Bare-metal Provisioning into CERN's Private Cloud
Integrating Bare-metal Provisioning into CERN's Private CloudIntegrating Bare-metal Provisioning into CERN's Private Cloud
Integrating Bare-metal Provisioning into CERN's Private Cloud
 
OSMC 2021 | Monitoring Open Source Hardware
OSMC 2021 | Monitoring Open Source HardwareOSMC 2021 | Monitoring Open Source Hardware
OSMC 2021 | Monitoring Open Source Hardware
 
Apache Flink Hands On
Apache Flink Hands OnApache Flink Hands On
Apache Flink Hands On
 
TRAP (transient detection pipeline) status update
TRAP (transient detection pipeline) status updateTRAP (transient detection pipeline) status update
TRAP (transient detection pipeline) status update
 
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinMoon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...
 
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
 
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Source
 
K. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward KeynoteK. Tzoumas & S. Ewen – Flink Forward Keynote
K. Tzoumas & S. Ewen – Flink Forward Keynote
 
OSMC 2021 | Robotmk: You don’t run IT – you deliver services!
OSMC 2021 | Robotmk: You don’t run IT – you deliver services!OSMC 2021 | Robotmk: You don’t run IT – you deliver services!
OSMC 2021 | Robotmk: You don’t run IT – you deliver services!
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and vision
 
Gossip & Key Value Store
Gossip & Key Value StoreGossip & Key Value Store
Gossip & Key Value Store
 
Eac integrations JS LiveStream
Eac integrations JS LiveStreamEac integrations JS LiveStream
Eac integrations JS LiveStream
 
Logstash and friends
Logstash and friendsLogstash and friends
Logstash and friends
 
Upstream Testing Collaboration
Upstream Testing Collaboration Upstream Testing Collaboration
Upstream Testing Collaboration
 

Viewers also liked

Comment développer un serveur métier en python/C++
Comment développer un serveur métier en python/C++Comment développer un serveur métier en python/C++
Comment développer un serveur métier en python/C++cppfrug
 
Devoxx france 2014 compteurs de perf
Devoxx france 2014 compteurs de perfDevoxx france 2014 compteurs de perf
Devoxx france 2014 compteurs de perfJean-Philippe BEMPEL
 
LLVM, clang & c++
LLVM, clang & c++LLVM, clang & c++
LLVM, clang & c++cppfrug
 
Du Polymorphisme dynamique au polymorphisme statique : Abstraction sans perte...
Du Polymorphisme dynamique au polymorphisme statique : Abstraction sans perte...Du Polymorphisme dynamique au polymorphisme statique : Abstraction sans perte...
Du Polymorphisme dynamique au polymorphisme statique : Abstraction sans perte...cppfrug
 
Skillshare - Using Kobo Toolbox for mobile data collection
Skillshare - Using Kobo Toolbox for mobile data collectionSkillshare - Using Kobo Toolbox for mobile data collection
Skillshare - Using Kobo Toolbox for mobile data collectionSchool of Data
 
12 Months of Learning about eBooks in 40 minutes
12 Months of Learning about eBooks in 40 minutes12 Months of Learning about eBooks in 40 minutes
12 Months of Learning about eBooks in 40 minutesKobo
 
C++ How I learned to stop worrying and love metaprogramming
C++ How I learned to stop worrying and love metaprogrammingC++ How I learned to stop worrying and love metaprogramming
C++ How I learned to stop worrying and love metaprogrammingcppfrug
 
Cerberus, un outil pour l'automatisation des tests fonctionnels
Cerberus, un outil pour l'automatisation des tests fonctionnelsCerberus, un outil pour l'automatisation des tests fonctionnels
Cerberus, un outil pour l'automatisation des tests fonctionnelsAurélien Bourdon
 
Big Data Paris 2015 - Cassandra chez Chronopost
Big Data Paris 2015 - Cassandra chez ChronopostBig Data Paris 2015 - Cassandra chez Chronopost
Big Data Paris 2015 - Cassandra chez ChronopostAlexander DEJANOVSKI
 
Async await in C++
Async await in C++Async await in C++
Async await in C++cppfrug
 
Training on using kobotoolbox for on line or off-line
Training on using kobotoolbox for on line or off-lineTraining on using kobotoolbox for on line or off-line
Training on using kobotoolbox for on line or off-lineEnamul Haque
 
JHipster à Devoxx 2015
JHipster à Devoxx 2015JHipster à Devoxx 2015
JHipster à Devoxx 2015Julien Dubois
 
Requêtes multi-critères avec Cassandra
Requêtes multi-critères avec CassandraRequêtes multi-critères avec Cassandra
Requêtes multi-critères avec CassandraJulien Dubois
 
Introduction à Apache Cassandra — IppEvent chez OVH 2017-03-02
Introduction à Apache Cassandra — IppEvent chez OVH 2017-03-02Introduction à Apache Cassandra — IppEvent chez OVH 2017-03-02
Introduction à Apache Cassandra — IppEvent chez OVH 2017-03-02Jérôme Mainaud
 
Monitoring Compteur EDF avec node.js
Monitoring Compteur EDF avec node.jsMonitoring Compteur EDF avec node.js
Monitoring Compteur EDF avec node.jslaurenthuet
 
Tools of data collection
Tools of data collectionTools of data collection
Tools of data collectionDr.Suresh Isave
 

Viewers also liked (20)

L’enfer des callbacks
L’enfer des callbacksL’enfer des callbacks
L’enfer des callbacks
 
Comment développer un serveur métier en python/C++
Comment développer un serveur métier en python/C++Comment développer un serveur métier en python/C++
Comment développer un serveur métier en python/C++
 
Devoxx france 2014 compteurs de perf
Devoxx france 2014 compteurs de perfDevoxx france 2014 compteurs de perf
Devoxx france 2014 compteurs de perf
 
LLVM, clang & c++
LLVM, clang & c++LLVM, clang & c++
LLVM, clang & c++
 
Kobo Collect Workshop
Kobo Collect WorkshopKobo Collect Workshop
Kobo Collect Workshop
 
KoBo Toolbox
KoBo ToolboxKoBo Toolbox
KoBo Toolbox
 
Du Polymorphisme dynamique au polymorphisme statique : Abstraction sans perte...
Du Polymorphisme dynamique au polymorphisme statique : Abstraction sans perte...Du Polymorphisme dynamique au polymorphisme statique : Abstraction sans perte...
Du Polymorphisme dynamique au polymorphisme statique : Abstraction sans perte...
 
Skillshare - Using Kobo Toolbox for mobile data collection
Skillshare - Using Kobo Toolbox for mobile data collectionSkillshare - Using Kobo Toolbox for mobile data collection
Skillshare - Using Kobo Toolbox for mobile data collection
 
12 Months of Learning about eBooks in 40 minutes
12 Months of Learning about eBooks in 40 minutes12 Months of Learning about eBooks in 40 minutes
12 Months of Learning about eBooks in 40 minutes
 
C++ How I learned to stop worrying and love metaprogramming
C++ How I learned to stop worrying and love metaprogrammingC++ How I learned to stop worrying and love metaprogramming
C++ How I learned to stop worrying and love metaprogramming
 
Cerberus, un outil pour l'automatisation des tests fonctionnels
Cerberus, un outil pour l'automatisation des tests fonctionnelsCerberus, un outil pour l'automatisation des tests fonctionnels
Cerberus, un outil pour l'automatisation des tests fonctionnels
 
Big Data Paris 2015 - Cassandra chez Chronopost
Big Data Paris 2015 - Cassandra chez ChronopostBig Data Paris 2015 - Cassandra chez Chronopost
Big Data Paris 2015 - Cassandra chez Chronopost
 
Async await in C++
Async await in C++Async await in C++
Async await in C++
 
Training on using kobotoolbox for on line or off-line
Training on using kobotoolbox for on line or off-lineTraining on using kobotoolbox for on line or off-line
Training on using kobotoolbox for on line or off-line
 
JHipster à Devoxx 2015
JHipster à Devoxx 2015JHipster à Devoxx 2015
JHipster à Devoxx 2015
 
Requêtes multi-critères avec Cassandra
Requêtes multi-critères avec CassandraRequêtes multi-critères avec Cassandra
Requêtes multi-critères avec Cassandra
 
Introduction à Apache Cassandra — IppEvent chez OVH 2017-03-02
Introduction à Apache Cassandra — IppEvent chez OVH 2017-03-02Introduction à Apache Cassandra — IppEvent chez OVH 2017-03-02
Introduction à Apache Cassandra — IppEvent chez OVH 2017-03-02
 
Chapter 9-METHODS OF DATA COLLECTION
Chapter 9-METHODS OF DATA COLLECTIONChapter 9-METHODS OF DATA COLLECTION
Chapter 9-METHODS OF DATA COLLECTION
 
Monitoring Compteur EDF avec node.js
Monitoring Compteur EDF avec node.jsMonitoring Compteur EDF avec node.js
Monitoring Compteur EDF avec node.js
 
Tools of data collection
Tools of data collectionTools of data collection
Tools of data collection
 

Similar to How static analysis supports quality over 50 million lines of C++ code

The Open Chemistry Project
The Open Chemistry ProjectThe Open Chemistry Project
The Open Chemistry ProjectMarcus Hanwell
 
Avogadro, Open Chemistry and Semantics
Avogadro, Open Chemistry and SemanticsAvogadro, Open Chemistry and Semantics
Avogadro, Open Chemistry and SemanticsMarcus Hanwell
 
open62541 - Open Source OPC UA on Steroids
open62541  - Open Source OPC UA on Steroidsopen62541  - Open Source OPC UA on Steroids
open62541 - Open Source OPC UA on SteroidsJulius Pfrommer
 
HPC Controls Future
HPC Controls FutureHPC Controls Future
HPC Controls Futurercastain
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the ApplicationAsh Winter
 
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...Spark Summit
 
Chemical Databases and Open Chemistry on the Desktop
Chemical Databases and Open Chemistry on the DesktopChemical Databases and Open Chemistry on the Desktop
Chemical Databases and Open Chemistry on the DesktopMarcus Hanwell
 
Open Source Visualization of Scientific Data
Open Source Visualization of Scientific DataOpen Source Visualization of Scientific Data
Open Source Visualization of Scientific DataMarcus Hanwell
 
Devtest: using Lean and Devops practices to bring QA and coders together by L...
Devtest: using Lean and Devops practices to bring QA and coders together by L...Devtest: using Lean and Devops practices to bring QA and coders together by L...
Devtest: using Lean and Devops practices to bring QA and coders together by L...Institut Lean France
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...Lucidworks
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging振东 刘
 
2012 sept 18_thug_biotech
2012 sept 18_thug_biotech2012 sept 18_thug_biotech
2012 sept 18_thug_biotechAdam Muise
 
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardUsing Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardDocker, Inc.
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)Tibo Beijen
 
Tuenti Release Workflow
Tuenti Release WorkflowTuenti Release Workflow
Tuenti Release WorkflowTuenti
 
iSense Java Summit 2017 - Microservices in action at the Dutch National Police
iSense Java Summit 2017 - Microservices in action at the Dutch National PoliceiSense Java Summit 2017 - Microservices in action at the Dutch National Police
iSense Java Summit 2017 - Microservices in action at the Dutch National PoliceBert Jan Schrijver
 
Towards "write once - run whenever possible" with Safety Critical Java af Ben...
Towards "write once - run whenever possible" with Safety Critical Java af Ben...Towards "write once - run whenever possible" with Safety Critical Java af Ben...
Towards "write once - run whenever possible" with Safety Critical Java af Ben...InfinIT - Innovationsnetværket for it
 

Similar to How static analysis supports quality over 50 million lines of C++ code (20)

The Open Chemistry Project
The Open Chemistry ProjectThe Open Chemistry Project
The Open Chemistry Project
 
Avogadro, Open Chemistry and Semantics
Avogadro, Open Chemistry and SemanticsAvogadro, Open Chemistry and Semantics
Avogadro, Open Chemistry and Semantics
 
open62541 - Open Source OPC UA on Steroids
open62541  - Open Source OPC UA on Steroidsopen62541  - Open Source OPC UA on Steroids
open62541 - Open Source OPC UA on Steroids
 
HPC Controls Future
HPC Controls FutureHPC Controls Future
HPC Controls Future
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the Application
 
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
 
Chemical Databases and Open Chemistry on the Desktop
Chemical Databases and Open Chemistry on the DesktopChemical Databases and Open Chemistry on the Desktop
Chemical Databases and Open Chemistry on the Desktop
 
Open Source Visualization of Scientific Data
Open Source Visualization of Scientific DataOpen Source Visualization of Scientific Data
Open Source Visualization of Scientific Data
 
Devtest: using Lean and Devops practices to bring QA and coders together by L...
Devtest: using Lean and Devops practices to bring QA and coders together by L...Devtest: using Lean and Devops practices to bring QA and coders together by L...
Devtest: using Lean and Devops practices to bring QA and coders together by L...
 
StarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
StarlingX - A Platform for the Distributed Edge | Ildiko VancsaStarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
StarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging
 
2012 sept 18_thug_biotech
2012 sept 18_thug_biotech2012 sept 18_thug_biotech
2012 sept 18_thug_biotech
 
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardUsing Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
 
Tuenti Release Workflow
Tuenti Release WorkflowTuenti Release Workflow
Tuenti Release Workflow
 
iSense Java Summit 2017 - Microservices in action at the Dutch National Police
iSense Java Summit 2017 - Microservices in action at the Dutch National PoliceiSense Java Summit 2017 - Microservices in action at the Dutch National Police
iSense Java Summit 2017 - Microservices in action at the Dutch National Police
 
Towards "write once - run whenever possible" with Safety Critical Java af Ben...
Towards "write once - run whenever possible" with Safety Critical Java af Ben...Towards "write once - run whenever possible" with Safety Critical Java af Ben...
Towards "write once - run whenever possible" with Safety Critical Java af Ben...
 
Scaling tappsi
Scaling tappsiScaling tappsi
Scaling tappsi
 

Recently uploaded

MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
How To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTROHow To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTROmotivationalword821
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...Akihiro Suda
 

Recently uploaded (20)

MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
How To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTROHow To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTRO
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
 

How static analysis supports quality over 50 million lines of C++ code

  • 1. Ajouter une autre dimension a la qualité des logiciels Exemple de mise en oeuvre d’un outil d’analyse statique Rene Brun, CERN
  • 2. CERN • Organisation Européenne pour la Recherche Nucléaire (Physique des particules): Frontière Franco-Suisse près de Genève • Recherche fondamentale, mais aussi avec des retombées pratiques (électronique, vide, cryogénie, scanners et le WEB • Comprendre le big-bang et l'évolution de l’univers
  • 3. Le LHC (Large Hadron Collider) • Un tunnel de 28km a 100m sous terre a la frontiere FR/CH • Collisioneur de protons ou d’ions lourds • 10000 physiciens
  • 4. LHC LHC Large Hadron Collider
  • 5. CERN en chiffres • Utilisé par 10000 physiciens de 608 universités dans 113 pays. • Plus de 1000 physiciens par expérience • Une expérience peut durer 40 ans! • The LHC génère 15,000,000 Gigabytes de données par an. • Budget: 1 milliard de CHF/an
  • 6. Informatique au CERN • Approx 50 MLOC C++ par les physiciens • Beaucoup de données: code doit être rapide • The code doit être correct: résultats scientifiques; • Doit être stable: 1 Higgs / 10,000,000,000,000 collisions;
  • 7. Logiciels au CERN • Chaque expérience développe et maintient ses propres logiciels (2/3 du total environ). • The CERN développe et maintient des outils généralistes communs a toutes les expériences: ROOT: math / statistiques, stockage et management des données, simulation de détecteurs, graphiques, GUI, calculs en parallèle.
  • 8. Logiciels des expériences • Plusieurs centaines de developpeurs/expérience: physiciens, quelques experts logiciels; actifs pour quelques années seulement (beaucoup de PostDoc). • Développement très souvent anarchique. • Beaucoup de dépendances sur des logiciels externes.
  • 9. ROOT • http://root.cern.ch • 7 Developpeurs @ 2.6 MLOC • Début en 1995, 90% du code avant 2005. Doit être maintenu sur un grand nombre de systèmes durant au moins 30 ans. • Deux releases / an • “Core project”, utilise par toutes les expériences/ physiciens y compris en dehors du domaine du CERN: 95000 adresses IP distinctes/an.
  • 10. Software Risks • Maintenance • Implosion of huge software stack • Integrity of years of LHC experiments’ data • Correctness of scientific results • Failing searches for new physics • Note: driven by reality, not ISO xyz
  • 11. Challenges For QA • C++ • anarchy • few experts • multi-platform • huge programs • part-time coding • input data volume • software as a tool • interpreter binding • lack of reproducibility
  • 12. QA Requirements • Results must be easy to understand: novices, spare time coding, complex code • Results must be relevant: motivate, don’t flood, developers • QA must handle rare code paths: don’t hide the Higgs • QA as part of nightly / continuous testing
  • 13. QA Toolbox • Patch review (ROOT) • Continuous integration (ROOT) • Multi-Platform: compilers, versions, architectures • Unit + functionality test suites 100k LOC for ROOT • Regression tests on performance, results
  • 14. QA Toolbox (2) • Nightly builds + tests + automatic reports • Memory: valgrind, massif, ROOT, igprof... • Performance: callgrind, shark, gprof... • Test coverage: gcov • Coding rule checkers • Automation + strong software management
  • 15. Memory QA • C++ inherent: memory errors – use after delete – uninitialized pointers – buffer overruns – memory leaks • Amplified by novice coders, non-defined ownership rules,...
  • 16. Performance QA • Huge amounts of data, limited computing, thus performance sensitive • Difficult to predict: caching, templates, inlining, call of external libraries • Monitoring difficult to present, interpret • Using advanced tools + techniques to handle >10 MLOC in a novice-friendly way
  • 18. Open QA Issues • Those almost impossible code paths! • Problems that cannot be reproduced • Too many code paths for humans • Condition matrices • “You know what I mean” coding
  • 19. Static Code Checkers • Check what could be run, not what is run • CERN did a study of open source options
  • 20. Coverity @ CERN • ROOT traditionally (informally) spearheads new technologies at CERN: what works for ROOT (and scales!) works for CERN • Realized lack of static analysis at CERN • Discovered Coverity through famous Scan: Linux, Python, Apache, glibc, openSSL,... • ROOT as CERN’s guinea pig project
  • 21. ROOT Meets Coverity • Plan: – Product discovered by a team member – Convince ROOT developers – Convince CERN • Set up as “idle time project” in 4 weeks: build integration, automation, reporting • 2.6 MLOC: 4 processes * 8 hours
  • 23. Report Quality • Very readable reports embedded in source • Mostly easy to understand, though complex contexts make complex reports • Amazing Relevance: 12% false positives, 23% intentional thus 2/3 reports causing source changes - despite an already well-filled QA toolbox
  • 24. Consequence • Relevant reports create motivated developers!
  • 25. Results With ROOT • 4300 reports by 7 developers in 6 weeks • Noticeably better release – Instability due blind changes, e.g. “missing break” • Found several long-hunted bugs, e.g. hard to reproduce or track down • Systematic deficiencies triggered systematic fixes: buffer overflows / strlcpy
  • 26. Coverity On 50MLOC • Excellent show-case for experiments • Immediate interest: – Allows for precise, relevant, accessible, large-scale, automated, quick turn-around reports for software management – Classification in high / low impact helps fixing critical reports first – Increased awareness of software quality issues
  • 27. Setup • Separate build + analysis server for each experiment due to processing times: 4 processes * 8 hours to 4 days - partially caused by build systems • Dedicated database / tomcat server for web front-end, for each experiment • Keeps databases reasonably small, user (web) interface responsive
  • 28. Initial Numbers: Time 5 0 20 15 10 ALICE ATLAS CMS LHCb ROOT Units Translation 1000 / MLOC Time Days Analysis /
  • 29. Initial Numbers: Issues 8000 6000 2000 0 4000 ALICE ATLAS CMS LHCb ROOT High Impact Defects
  • 30. Has It Arrived? • ROOT, particle beam, data transfer, virtual machine etc: nightly / regular use • All LHC experiments use it, one even embedded in automatic QA • Coverity is known in all experiments as “the tool that doesn’t forgive lazy coding”
  • 31. Summary • Software quality a key factor for CERN: influences quality of scientific results • Blind spot despite of large QA toolset • Coverity unique in abilities and quality – Improved software, happier users – Quick adoption at CERN (rare!): recognized to improve software quality dramatically