BY SAM COLLIE
UNIVERSITY OF ALABAMA
AT BIRMINGHAM
IPROGRESS LAB
Instrumentation of Unsafe C
Functions to Safe Functions Using
the RTC Tool
Slide: 1 of 21
Outline
 Background on RTC and ROSE
 My Extension
 Comparison to other tools
 Current Drawbacks and Future Goals
Slide: 2 of 21
What is RTC?
 RTC is a runtime checking tool for the C
programming language co-developed at
UAB.
 It’s built on the ROSE compiler to read in
source code, make changes to it, and output
new code.
Slide: 3 of 21
Background
 C is the second most popular programming
language (IEEE Spectrum ranking 2015).
 Thanks to manual memory management via
pointers, C programs can be highly efficient.
Slide: 4 of 21
Safe Languages
Slide: 5 of 21
 Java:
int[] numbers = new int[10];
numbers[11] = 5;
Exception in thread "main"
java.lang.ArrayIndexOutOfBoundsException: Index 11
out of bounds for length 10
Unsafe Languages
 C Language:
int numbers[10];
numbers[11] = 5;
 What will C do?
 Undefined Behaviour.
RTC: Instrumentation
RTC pipeline: Source Code → ROSE Parser → Preprocessing → Instrumentation → ROSE Unparser → Instrumented Source Code
Slide: 6 of 21
Pointer Metadata
Slide: 7 of 21
 Records Lower and Upper bounds of allocation.
 Records scope in which pointer is valid.
 Records whether the memory allocated to the
pointer has been freed.
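A minimal sketch of what such a metadata record could look like in C. The struct layout, the field names, and the helper `meta_create` are assumptions made for illustration; they are not RTC's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata record; field names are illustrative only. */
typedef struct {
    uintptr_t lower;     /* address of the first valid byte */
    uintptr_t upper;     /* address one past the last valid byte */
    unsigned  scope_id;  /* identifier of the scope in which the pointer is valid */
    bool      freed;     /* true once the allocation has been released */
} PtrMeta;

/* Build the metadata for an allocation of `size` bytes starting at `base`. */
static PtrMeta meta_create(const void *base, size_t size, unsigned scope_id) {
    assert(base != NULL);
    PtrMeta m;
    m.lower = (uintptr_t)base;
    m.upper = (uintptr_t)base + size;
    m.scope_id = scope_id;
    m.freed = false;
    return m;
}
```

For an array `int buf[10]`, calling `meta_create(buf, sizeof buf, 1u)` would record bounds that span exactly `sizeof buf` bytes.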
Checks Made by RTC
Slide: 8 of 21
 Uses the metadata to make spatial and temporal
checks
 Spatial: No out-of-bounds index access
 Temporal: No leaving a valid scope without freeing
the memory allocated to the pointer
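The two kinds of check can be sketched as small predicates over a metadata record. Everything here is hypothetical (the `Meta` struct and the names `spatial_ok`, `temporal_ok`, `access_ok`), and the temporal check is assumed to be a simple "has this allocation been freed?" test based on the recorded flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata record (illustrative field names). */
typedef struct {
    uintptr_t lower;  /* address of the first valid byte */
    uintptr_t upper;  /* address one past the last valid byte */
    bool      freed;  /* true once the allocation has been released */
} Meta;

/* Spatial check: do nbytes starting at addr fit inside [lower, upper)? */
static bool spatial_ok(const Meta *m, uintptr_t addr, size_t nbytes) {
    return addr >= m->lower && addr <= m->upper && nbytes <= m->upper - addr;
}

/* Temporal check (assumed form): the allocation must still be live. */
static bool temporal_ok(const Meta *m) {
    return !m->freed;
}

/* Combined guard that an instrumented dereference could call first. */
static bool access_ok(const Meta *m, uintptr_t addr, size_t nbytes) {
    assert(m != NULL);
    return temporal_ok(m) && spatial_ok(m, addr, nbytes);
}
```

With `int numbers[10]`, the address of element 9 passes the guard, while the address that element 11 would occupy fails the spatial check.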
RTC vs. Other Tools
Slide: 9 of 21
Metadata Stack
 Globally accessible stack
 Handles passing pointers from one function to
another
 Only handles the case where both functions can
access the same global scope.
Slide: 10 of 21
Passing Pointers
Slide: 11 of 21
Push Metadata → Pass Pointer → Pop Metadata → Perform Checks
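A rough sketch of the push, pass, pop, check sequence, assuming a fixed-size global stack. The names `meta_push`, `meta_pop`, `store_int`, and `caller_store`, as well as the stack capacity, are invented for this example and do not come from RTC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata record (illustrative field names). */
typedef struct {
    uintptr_t lower;  /* address of the first valid byte */
    uintptr_t upper;  /* address one past the last valid byte */
    bool      freed;  /* true once the allocation has been released */
} Meta;

/* Globally accessible metadata stack (capacity chosen arbitrarily). */
#define META_STACK_MAX 64
static Meta   meta_stack[META_STACK_MAX];
static size_t meta_top = 0;

static void meta_push(Meta m) {
    assert(meta_top < META_STACK_MAX);
    meta_stack[meta_top++] = m;
}

static Meta meta_pop(void) {
    assert(meta_top > 0);
    return meta_stack[--meta_top];
}

/* Callee: pop the metadata for its pointer argument, check, then write. */
static bool store_int(int *p, size_t index, int value) {
    Meta m = meta_pop();
    uintptr_t addr = (uintptr_t)p + index * sizeof(int);
    if (m.freed || addr < m.lower || addr > m.upper ||
        sizeof(int) > m.upper - addr) {
        return false;  /* check failed: refuse the unsafe write */
    }
    p[index] = value;
    return true;
}

/* Caller: push the metadata, then pass the pointer. */
static bool caller_store(int *buf, size_t count, size_t index, int value) {
    Meta m = { (uintptr_t)buf, (uintptr_t)buf + count * sizeof(int), false };
    meta_push(m);
    return store_int(buf, index, value);
}
```

Because the stack is a global, this only works when caller and callee are both compiled with access to it, which matches the limitation noted for the metadata stack.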
Specific Aims
 To instrument calls to unsafe C
Standard Library functions.
This is done by redirecting those calls to a
library whose source code is visible to
RTC during the instrumentation process.
Slide: 12 of 21
Take the following code:
BEFORE
int main(){ memcpy(dest, src, size); }
************************************************
AFTER
int main(){ rtcMemCpy(dest, src, size); }
void* rtcMemCpy(void* dest, const void* src, size_t size){
// rtcMemCpy source code, visible to RTC
}
************************************************
Slide: 14 of 21
Implementation
 Takes place before RTC instrumentation
 Calls to unsafe functions are renamed, and the
definitions of their replacements are added.
 Finally, RTC itself is run to insert the checks into
the added function definitions
Slide: 15 of 21
How is this helpful?
 There are several commonly used standard
library functions that operate on pointers.
 Some of these functions don't guarantee
spatial and temporal safety within the
function (unsafe).
Slide: 16 of 21
Memcpy
Slide: 17 of 21
void* memcpy(void* dest, const void* src, size_t size){
    char *dp = dest;
    const char *sp = src;
    while (size--)
        *dp++ = *sp++; //unsafe access
    return dest;
}
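For contrast, here is a hedged sketch of a copy routine that refuses to run past either buffer. It is not the RTC replacement function: `checked_copy` is a hypothetical name, and the bounds are passed as explicit parameters for simplicity, whereas RTC would take them from pointer metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copy n bytes from src to dest only if both buffers are large enough.
   dest_cap and src_len are the sizes of the two allocations in bytes. */
static bool checked_copy(void *dest, size_t dest_cap,
                         const void *src, size_t src_len, size_t n) {
    assert(dest != NULL && src != NULL);  /* null pointers are caller bugs */
    if (n > dest_cap || n > src_len) {
        return false;  /* would read or write out of bounds */
    }
    unsigned char *dp = dest;
    const unsigned char *sp = src;
    while (n--) {
        *dp++ = *sp++;  /* every access is now known to be in bounds */
    }
    return true;
}
```

Asking for more bytes than the source holds, for example 64 bytes from a 4-byte buffer, makes the function return false instead of reading adjacent memory.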
Comparison to Other Techniques
 ManagedC (Grimmer et al., 2015)
 All-or-nothing instrumentation
 Valgrind
 Can check library code because it works on binaries,
but has a much larger footprint
 AddressSanitizer (Serebryany et al., 2012)
 Handles “some” functions but excludes third-party
libraries
Slide: 18 of 21
Current Drawbacks
 Some standard library functions are dependent on
other standard library functions that are also unsafe.
 Can’t keep pointer information up to date when
pointers are passed to functions not in our provided
library.
Slide: 19 of 21
Future Goals
 Fix interdependency problems
 Optimize
 Add more functions to the safe library
 Let users easily add functions to the safe
library
Slide: 20 of 21



Editor's Notes

  1. Hi, my name is Sam Collie. I’m an undergraduate researcher at the University of Alabama at Birmingham in the IPROGRESS Lab. My topic is The Instrumentation of Unsafe C Methods to Safe Methods Using the RTC Tool.
  2. So first, we’ll be looking at background information concerning RTC and the compiler it’s built on, ROSE. Next we’ll discuss the extension I’m currently working on for RTC. Then we’ll compare the features I’ve added to other tools. Finally, we’ll discuss the current drawbacks of my extension and my future goals for improving it.
  3. That’s where RTC comes in. RTC is a runtime checking tool for the C programming language co-developed at UAB. It’s built on the ROSE source-to-source compiler to instrument code written in C. The ROSE compiler reads in source code and breaks it down into a data structure (an abstract syntax tree) that can be traversed and manipulated by other code.
  4. According to the IEEE Spectrum ranking, C is the second most popular programming language. This will probably not come as a shock to most of you, since C is a very old and efficient programming language. Unlike Java, C lets the programmer manage memory manually via pointers. Memory can be allocated to a pointer for general use, which makes C very flexible. This lets C operate at a lower level than most high-level languages and do without safety checks and a runtime environment. However, the speed C gains from its lack of a runtime environment is also one of its greatest weaknesses. C has no built-in exceptions that will be thrown if you access memory outside of a pointer’s allocation. If your program crashes, C will likely only tell you that the program crashed and nothing else. Even more fun, C might simply give you a random value, which can be even trickier to troubleshoot.
  5. First we run the source code through the ROSE parser, which gives us our AST. Then preprocessing normalizes the AST (e.g. converting all arrow expressions to dot expressions, moving termination conditions of for and while loops into their bodies, moving structs defined in functions out into the global scope). We then instrument the source code to include the code for the metadata and checks we want to impose.
  6. RTC implements three kinds of safety checks: arithmetic overflow/underflow, memory safety checks to find memory bugs on stack and heap, and run-time type-safety violations. For every type of pointer in the input program, RTC declares and defines a struct to hold pointers of that type, as well as functions that handle the creation of those structs.
  7. Average Execution overhead?
  8. Memcpy recently gained notoriety due to its role in the Heartbleed bug. Memcpy takes two pointers and an integer, and copies the number of bytes specified by the integer from the source pointer to the destination pointer. What happens if I specify more bytes than what’s available from the source pointer?
  9. ASAN: The current implementation of AddressSanitizer is based on compile-time instrumentation and thus does not handle system libraries (it does, however, handle some C library functions such as memset). For the open source libraries the best approach might be to create special instrumented builds. ManagedC: Managed allocations cannot be shared with precompiled native code. Therefore, ManagedC requires that the source code of the entire C program is available and is executed under ManagedC.