2. Intro
• Researcher, bug-hunter, CEO
• Web application security in depth
• @d0znpp: personal Twitter
• lab.onsec.ru: our blog (@ONsec_lab)
3. What is normalization?
• Transferring and storing data always
involves formatting it
• First normalization, then formatting
• Encoding (different charsets)
• Truncation (limited sizes)
• Trims
• Canonicalization
• ...
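A minimal Python sketch of these normalization steps on sample values. The inputs and the choice of standard-library calls are illustrative assumptions, not taken from the slides:

```python
import posixpath

raw = "  /var/www/../www/./index.php  "

# Trim: strip surrounding whitespace.
trimmed = raw.strip()

# Canonicalization: collapse "." and ".." path segments.
canonical = posixpath.normpath(trimmed)

# Encoding: one character may become several bytes.
encoded = "\u00e9".encode("utf-8")

# Truncation: a size limit silently drops the tail.
truncated = "administrator"[:5]

print(canonical, len(encoded), truncated)
```

Any check that runs before one of these steps sees different data than the code that runs after it.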
5. Web application basics
• Client-Server model
• Client is browser (Chrome, Safari, IE, FF)
• Server is web server software (Nginx,
Apache)
• Application server (FastCGI, Tomcat)
• Database storage (SQL or noSQL)
10. Protocol level normalization
• URL encoding - what could be simpler?
• %22 to "
• %23 to #
• %25 to %
• Double URL encoding is a basic bypass for
many input validators, right?
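A minimal sketch of why double URL encoding slips past a validator that decodes only once. `naive_filter` is a hypothetical validator, and the second decode stands in for a later layer (proxy, framework, application code) that is assumed to decode again:

```python
from urllib.parse import unquote


def naive_filter(value: str) -> bool:
    """Hypothetical validator: accept input that has no double quote after ONE decode."""
    return '"' not in unquote(value)


payload = "%2522"        # "%25" is "%", so one decode leaves "%22"
once = unquote(payload)  # what the validator inspects
twice = unquote(once)    # what a second decoding layer produces

print(naive_filter(payload), repr(once), repr(twice))
```

The validator accepts the payload because it sees only `%22`; the quote appears after the second decode.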
30. PHP string encoding
http://www.php.net/manual/language.types.string.php#language.types.string.details
• String will be encoded in whatever fashion it is encoded in
the script file
• If Zend Multibyte is enabled, the script may be written in
an arbitrary encoding (which is explicitly declared or is
detected) and then converted to a certain internal
encoding, which is then the encoding that will be used for
the string literals
• State-dependent encodings where the same byte values can
be used in initial and non-initial shift states may be
problematic
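A small Python illustration of the state-dependent case, using ISO-2022-JP as an assumed example encoding (the slide does not name one). After the shift sequence, the byte 0x22 is the second half of a Japanese character, yet a byte-level filter would see a double quote:

```python
text = "\u3042"  # HIRAGANA LETTER A, contains no double quote

# ISO-2022-JP switches character sets with escape sequences (shift states).
encoded = text.encode("iso2022_jp")

quote_in_text = '"' in text
quote_byte_in_encoding = 0x22 in encoded  # 0x22 is the ASCII double quote

print(encoded, quote_in_text, quote_byte_in_encoding)
```

Decoding with the correct state restores the original character; scanning the raw bytes does not.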
31. Multibyte problems
• Lengths in chars or bytes?
• State-dependent encodings
• 0x0102 char
• 0x0203 char
• 0x01020203 two chars
• But what about the case when 0x0202 is
also a valid char?
• Try to find 0x0202 in this string ;)
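A sketch of the search problem above, assuming a hypothetical fixed-width two-byte encoding in which 0x0102, 0x0203 and 0x0202 are all valid chars:

```python
data = bytes([0x01, 0x02, 0x02, 0x03])  # two chars: 0x0102, 0x0203
needle = bytes([0x02, 0x02])            # the char 0x0202

# Lengths differ depending on the unit.
length_in_bytes = len(data)
chars = [data[i:i + 2] for i in range(0, len(data), 2)]
length_in_chars = len(chars)

# A byte-level search matches across the boundary between the two chars.
byte_level_offset = data.find(needle)

# A char-level search respects the boundaries and finds nothing.
char_level_match = needle in chars

print(length_in_bytes, length_in_chars, byte_level_offset, char_level_match)
```

The byte-level match is a false positive: 0x0202 is never present as a char, only as the tail of one char followed by the head of the next.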