HTTP::Parser::XS
writing a fast & secure XS module


       Cybozu Labs, Inc.
         Kazuho Oku
Pros & Cons of XS

   Pros
        Faster (well, not always, but…)
   Cons
        Difficult and time-consuming
        Higher security risk
              Mistakes in C programming lead to remote code
               injection




May 28 2010             HTTP::Parser::XS - writing a fast & secure XS module   2
The Best Practice, in General

   Don’t use XS
        Until you hit a performance bottleneck
   Rewrite the bottleneck (only) using XS




HTTP::Parser::XS




Summary

   An HTTP request parser
   Designed and implemented to be simple,
    fast & secure
   Stateless
   PSGI-compatible
        Required or recommended by many Plack servers
              Starman, Starlet, Twiggy, etc.



Two-layered Approach

   picohttpparser
        HTTP request / response parser written in C
        has its own test suite
        does not parse request / response content
              request line (response line) and headers only
        copy-less
              faster, with a lower probability of security holes and
               memory leaks

   HTTP::Parser::XS
        glue code to access picohttpparser from perl
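
   To illustrate what "copy-less" can mean in practice, here is a minimal
   sketch in C: the parser returns pointer/length pairs that refer into the
   caller's buffer instead of allocating copies. The slice type and the
   next_token and slice_eq helpers are hypothetical names chosen for this
   illustration; they are not taken from picohttpparser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a view into the caller's buffer; nothing is copied. */
struct slice {
    const char *start;
    size_t len;
};

/* Hypothetical helper: returns the token that begins at buf and ends before
 * the next space (or at buf_end), and stores the position after it in *next.
 * The returned slice points into the original buffer. */
static struct slice next_token(const char *buf, const char *buf_end,
                               const char **next)
{
    struct slice tok;
    const char *p = buf;

    assert(buf <= buf_end);
    while (p != buf_end && *p != ' ')
        ++p;
    tok.start = buf;
    tok.len = (size_t)(p - buf);
    /* skip the single separating space, if there is one */
    *next = (p != buf_end) ? p + 1 : p;
    return tok;
}

/* Compare a slice with a NUL-terminated string without copying it. */
static int slice_eq(struct slice s, const char *lit)
{
    return strlen(lit) == s.len && memcmp(s.start, lit, s.len) == 0;
}
```

   Because no memory is allocated, there is nothing to free and no
   destination buffer to overflow, which is presumably why the slide ties
   this approach to fewer leaks and security holes. The trade-off is that
   the slices are only valid while the input buffer is alive.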
Speed and Complexity

   HTTP::Parser::XS is simple, and fast
        most of the time is not spent in picohttpparser,
         but in the glue code constructing hashref
              picohttpparser/trunk can handle >1M reqs/sec.

       Module                       reqs/sec.    lines of code
       HTTP::HeaderParser::XS         116,000            1,166
       HTTP::Parser::XS               140,000              487
       Plack::HTTPParser::PP           10,100              104


Why is it Stateless?

   Faster and simpler than a stateful parser
        lower security risks
   most HTTP requests / responses arrive
    in a single packet, anyway
   if written optimally in C, the cost of
    re-parsing is smaller than storing headers
    into a Perl hashref

Why is it Stateless? (cont’d)

   Easy to determine the end of a
    multi-packet HTTP request
        by looking for “\r\n\r\n” within the last packet
         (and the preceding three bytes)
        mainly as a countermeasure for Slowloris
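
   The end-of-request check described above can be sketched as follows.
   This is an illustration under stated assumptions, not the module's
   actual code: the function name headers_complete and its parameters are
   hypothetical, and it assumes the earlier part of the buffer was already
   scanned without finding the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: buf holds everything received so far (len bytes),
 * of which the final last_len bytes arrived in the most recent packet.
 * Returns 1 if the header terminator "\r\n\r\n" is found, otherwise 0. */
static int headers_complete(const char *buf, size_t len, size_t last_len)
{
    size_t prev, start, i;

    assert(last_len <= len);
    prev = len - last_len;            /* bytes that were scanned earlier */
    start = prev >= 3 ? prev - 3 : 0; /* terminator may straddle packets */

    for (i = start; i + 4 <= len; ++i) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return 1;
    }
    return 0;
}
```

   Backing up three bytes covers the case where the four-byte terminator
   is split across two packets. A caller would run the full parser only
   once this check succeeds, so a client that trickles data in slowly
   costs a scan of the new bytes per packet rather than a full re-parse.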




Consistent Design

   Functions take the same arguments
             buf – points to current char
             buf_end – points to end of buffer
             *ret – error value
             returns pointer to the next char
                or null on error (the reason will be stored in *ret)



   const char* parse_http_version(const char* buf, const char* buf_end,
   int* minor_version, int* ret)
   {
     EXPECT_CHAR('H'); EXPECT_CHAR('T'); EXPECT_CHAR('T');
     EXPECT_CHAR('P');EXPECT_CHAR('/'); EXPECT_CHAR('1');
     EXPECT_CHAR('.');
     return parse_int(buf, buf_end, minor_version, ret);
   }

Macros

   Consistent design is essential to heavy use of macros
        Good abstraction ⇒ safe code

   #define CHECK_EOF()     \
     if (buf == buf_end) { \
       *ret = -2;          \
       return NULL;        \
     }

   #define EXPECT_CHAR(ch) \
     CHECK_EOF();          \
     if (*buf++ != ch) {   \
       *ret = -1;          \
       return NULL;        \
     }


   const char* parse_http_version(const char* buf, const char* buf_end,
   int* minor_version, int* ret)
   {
     EXPECT_CHAR('H'); EXPECT_CHAR('T'); EXPECT_CHAR('T');
     EXPECT_CHAR('P');EXPECT_CHAR('/'); EXPECT_CHAR('1');
     EXPECT_CHAR('.');
     return parse_int(buf, buf_end, minor_version, ret);
   }
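
   For readers who want to compile the pieces together, here is a
   self-contained sketch. The two macros and parse_http_version follow the
   slides; the body of parse_int is an assumption, since the slides show
   only its call site. It follows the same convention: *ret is set to -2
   when the buffer ends and to -1 on an unexpected character. The overflow
   guard is an addition for this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define CHECK_EOF()        \
    if (buf == buf_end) {  \
        *ret = -2;         \
        return NULL;       \
    }

#define EXPECT_CHAR(ch)    \
    CHECK_EOF();           \
    if (*buf++ != ch) {    \
        *ret = -1;         \
        return NULL;       \
    }

/* Assumed implementation: parses one or more decimal digits. */
static const char *parse_int(const char *buf, const char *buf_end,
                             int *value, int *ret)
{
    int v = 0;

    assert(buf <= buf_end);
    CHECK_EOF();
    if (!('0' <= *buf && *buf <= '9')) {
        *ret = -1;
        return NULL;
    }
    for (;; ++buf) {
        CHECK_EOF(); /* more digits might still arrive */
        if (!('0' <= *buf && *buf <= '9'))
            break;
        if (v > (INT_MAX - 9) / 10) { /* next step would overflow int */
            *ret = -1;
            return NULL;
        }
        v = v * 10 + (*buf - '0');
    }
    *value = v;
    return buf;
}

static const char *parse_http_version(const char *buf, const char *buf_end,
                                      int *minor_version, int *ret)
{
    EXPECT_CHAR('H'); EXPECT_CHAR('T'); EXPECT_CHAR('T');
    EXPECT_CHAR('P'); EXPECT_CHAR('/'); EXPECT_CHAR('1');
    EXPECT_CHAR('.');
    return parse_int(buf, buf_end, minor_version, ret);
}
```

   The macros rely on every function using the names buf, buf_end and ret,
   which is why the consistent signature matters. Each byte is read only
   after a bounds check, so a function written with these macros cannot
   run past the end of the buffer.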

Micro-optimization (only in picohttpparser)

   Reduce # of conditional branches,
    optimize for pipeline (27% faster)

         #define unlikely(x) __builtin_expect(!!(x), 0)

         for (; ; ++buf) {
           CHECK_EOF();
           if (unlikely((unsigned char)*buf <= '\r')
               && (*buf == '\r' || *buf == '\n'))
             goto EOL_FOUND;
         }

   Unroll loops (36% faster)
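
   As a rough sketch of how the two techniques might combine, the function
   below searches for the first CR or LF using the same branch hint and a
   loop unrolled by four. The names find_eol and IS_EOL, the unroll factor
   and the fallback definition of unlikely for compilers without
   __builtin_expect are assumptions made for this example. The speedups
   quoted on the slide were measured on picohttpparser itself and should
   not be expected to carry over to this code.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x) /* fallback: no branch hint */
#endif

/* One cheap comparison rejects almost every byte; the exact test runs
 * only for the rare control characters at or below '\r'. */
#define IS_EOL(c) \
    (unlikely((unsigned char)(c) <= '\r') && ((c) == '\r' || (c) == '\n'))

/* Hypothetical helper: returns a pointer to the first CR or LF in
 * [buf, buf_end), or NULL if there is none. */
static const char *find_eol(const char *buf, const char *buf_end)
{
    assert(buf <= buf_end);
    /* Unrolled by four: one bounds check per four bytes. */
    while (buf_end - buf >= 4) {
        if (IS_EOL(buf[0])) return buf;
        if (IS_EOL(buf[1])) return buf + 1;
        if (IS_EOL(buf[2])) return buf + 2;
        if (IS_EOL(buf[3])) return buf + 3;
        buf += 4;
    }
    /* Remaining zero to three bytes. */
    for (; buf != buf_end; ++buf) {
        if (IS_EOL(*buf))
            return buf;
    }
    return NULL;
}
```

   Whether unrolling helps depends on the compiler and CPU, so changes of
   this kind are worth keeping only when a benchmark shows a gain.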
Release History of HTTP::Parser::XS

   Current version: 0.07
   No security hole found since initial
    release
   Two memory leaks were found and fixed

   Please let me know if you find any
    security holes (especially the ones that
    lead to arbitrary code execution)
Conclusion – KISS

   Keep it simple, stupid
        for fast development
        simple design leads to more secure code
        use perl whenever possible
              simple operations (like tokenization) are worth
               converting to XS
              complex operations (from handling of strings to
               database queries) are not so slow in perl



