Monitoring repositories for FUN and PROFIT
@snyff
About me
● Security consultant (C.T.O.) working for Securus Global in Melbourne

● PentesterLab (.com):
  ○ cool/awesome (web) *free* training/exercises
  ○ real life scenario
Disclaimer
● No code is going to be released today

● No repositories were harmed during the preparation of this talk

● I worked on Web and Open Source projects
● I worked on commits without using the entire
  project's source code
Why work on commits?
● Corporate development:
  ○   Cannot review all projects anymore
  ○   Nice to have a “what to check today”
  ○   Sort commits by criticality
  ○   Detect backdoors


● Agile development:
  ○ The code changes every day
  ○ Can’t rely on one time code review anymore
  ○ Current approach: daily scan
Why work on commits?



● You have vulnerabilities:
  ○ Detect patches affecting your bugs
  ○ Detect changes to sensitive functions
Why work on commits?



● You want vulnerabilities ($$):
   ○ Detect new features with dangerous functions
   ○ Detect changes to sensitive functions
Why work on commits?



● You want bugs (lulz):
  ○ Get bugs a few hours before the patch is available
  ○ Get a list of bad-practice examples
  ○ Detect silent patching
What's a repository?
● Developers

● Files

● Commits

● And all of these are constantly moving...
Developers
● Main developer(s):
  ○ Add features
  ○ Fix bugs


● Cosmetic committer(s):
  ○   Change comments (fix typo)
  ○   Change designs of the website
  ○   Change indentation
  ○   Add documentation


● External people
  ○ Do a bit of everything
Files
● README/LICENSE files

● Templates, HTML, CSS

● Images

● Code:
  ○ Libraries
  ○ Installation code
  ○ "normal" code
Commits
● Developer's name

● Code changes:
  ○ Changes: diff
  ○ Files changed
  ○ Number of deletions/additions


● Date/Time of the commit

● Message
Examples of projects monitored
Stats (on the last 5000 commits)
● Commits per week:
  ○ anywhere between 20 and 180 (phpmyadmin) per week
  ○ 40 commits per week seems to be the average for
    "normal/interesting" projects


● Authors:
  ○ between 1 and 140


● Average commit: 200 lines
  (insertions+deletions)
Goals...
Goals: counterexample
Goals: example
Goals: example
Goals: example
Filtering...
Filtering files
● General approach:
  ○ images
  ○ css
  ○ README


● Framework based:
  ○ tests (interesting to keep for some projects)
  ○ database migration/creation script


● Project based files
  ○ deployment
  ○ installation files
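
A minimal sketch of this kind of file filter, in Ruby. The extension list, the glob patterns and the helper name `interesting_file?` are assumptions for illustration, not SANZARU's actual code (none was released). Tests are filtered here, although the slide notes they are worth keeping for some projects.

```ruby
# Hypothetical noise filter: drop files that rarely carry security-relevant code.
IGNORED_EXTENSIONS = %w[.png .jpg .gif .css .md].freeze
IGNORED_PATTERNS   = ["README*", "LICENSE*", "*test*", "db/migrate/*"].freeze

# Returns true when a changed path is worth reading.
def interesting_file?(path)
  return false if IGNORED_EXTENSIONS.include?(File.extname(path).downcase)

  basename = File.basename(path)
  IGNORED_PATTERNS.none? do |pattern|
    File.fnmatch(pattern, path) || File.fnmatch(pattern, basename)
  end
end

changed = ["README", "public/logo.png", "app/models/user.rb",
           "test/models/user_test.rb", "db/migrate/001_create_users.rb"]
puts changed.select { |path| interesting_file?(path) }
```

With this sample list only `app/models/user.rb` survives the filter.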
Filtering developers
● For a given project find the "cosmetic
  developers"


● Don't get me wrong: they are not useless,
  they just do things I don't care about
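
One possible way to find the "cosmetic developers" of a project, sketched in Ruby: measure, per author, the share of commits that touch at least one code file. The `Commit` struct, the extension list and the 20% threshold are all assumptions made for this example.

```ruby
# Hypothetical commit records: author plus the paths touched by the commit.
Commit = Struct.new(:author, :files)

CODE_EXTENSIONS = %w[.rb .php .py .js].freeze

# A commit counts as "code" when at least one touched file has a code extension.
def code_commit?(commit)
  commit.files.any? { |file| CODE_EXTENSIONS.include?(File.extname(file)) }
end

# Authors whose share of code commits is below the threshold (assumed: 20%).
def cosmetic_authors(commits, threshold = 0.2)
  by_author = commits.group_by(&:author)
  cosmetic = by_author.select do |_author, list|
    list.count { |commit| code_commit?(commit) }.to_f / list.size < threshold
  end
  cosmetic.keys
end

history = [
  Commit.new("alice", ["lib/auth.rb"]),
  Commit.new("alice", ["lib/session.rb", "README.md"]),
  Commit.new("bob",   ["README.md"]),
  Commit.new("bob",   ["docs/install.md"]),
]
p cosmetic_authors(history)
```

Here only "bob" is flagged, since none of his commits touch a code file.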
Results
● Around 5-10% of commits have nothing to
  do with code...


● You can divide the size of most other
  commits by 2-3 if you ignore noise
  (files/comments/...):
  ○ new code with test cases
  ○ modification in comments
  ○ ...
Classification
Data mining
● Take your samples (commits)
   ○ Extract a vector from each sample
   ○ Classify each sample


● From a training set, learn to classify the data

● Apply what you learned:
   ○ to the same training set after splitting it (cross-validation)
   ○ to new samples
Data mining
● training set:
  [1,2,3,0,10,220 ] -> bugfix
  [2,4,3,0,1,0 ] -> boring
  [2,5,3,3,1,1 ] -> boring
  [20,1,0,100,0,10 ] -> new bug

● testing:
  [23,0,1,90,0,15 ] -> ???
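
To make the idea concrete, here is a small Ruby sketch that classifies the test vector with a 1-nearest-neighbour rule. This is a stand-in chosen for brevity, not the classifier used in the talk (the real work is done with Weka, see later); the function names are hypothetical, the vectors and labels are the ones on the slide.

```ruby
# Training vectors and labels copied from the slide.
TRAINING = [
  [[1, 2, 3, 0, 10, 220],  "bugfix"],
  [[2, 4, 3, 0, 1, 0],     "boring"],
  [[2, 5, 3, 3, 1, 1],     "boring"],
  [[20, 1, 0, 100, 0, 10], "new bug"],
].freeze

# Squared Euclidean distance between two vectors of equal length.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# 1-nearest-neighbour: return the label of the closest training vector.
def nearest_label(vector, training = TRAINING)
  training.min_by { |sample, _label| squared_distance(sample, vector) }.last
end

puts nearest_label([23, 0, 1, 90, 0, 15])
```

With these four samples the test vector lands closest to the "new bug" sample (squared distance 136, against more than 8000 for the others).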
Extracting a vector
● You can't really say a commit is close to
  another commit

● You need to generate a vector from each
  commit to compare them

● Once you have done that, everything else is
  just magic^W Maths
Extracting a vector: getting data
● Number of lines changed:
  ○ insertion vs deletion


● Number of words changed (--word-diff):
  ○ insertion vs deletion


● Authors:
  ○ rating of authors based on the project's history
    ■ "fixing" score
    ■ "vulnerability creator" score
  ○ new developers
  ○ known security researchers
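
The line counts are the easiest part of the vector. A rough Ruby sketch that counts inserted and deleted lines in unified diff text follows; the helper name and the sample diff are made up for illustration.

```ruby
# Count inserted and deleted lines in unified diff text (e.g. `git show` output).
# File header lines ("+++ b/file", "--- a/file") are not content changes.
# Caveat: a removed line whose content starts with "--" would be skipped too.
def line_counts(diff_text)
  counts = { insertions: 0, deletions: 0 }
  diff_text.each_line do |line|
    next if line.start_with?("+++", "---")

    counts[:insertions] += 1 if line.start_with?("+")
    counts[:deletions]  += 1 if line.start_with?("-")
  end
  counts
end

SAMPLE_DIFF = <<~'DIFF'
  --- a/app/views/show.php
  +++ b/app/views/show.php
  @@ -1,3 +1,3 @@
   <?php
  -echo $_GET['name'];
  +echo htmlentities($_GET['name']);
   ?>
DIFF

p line_counts(SAMPLE_DIFF)
```

The sample is a typical one-line substitution: one insertion, one deletion.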
Extracting a vector: getting data
● Number of "dangerous" functions:
   ○ insertion
   ○ deletion


● Number of "filtering" functions:
   ○ insertion
   ○ deletion


● commit date vs author date

● Keywords in the message and in the code
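
Git records both an author date and a commit date, so the gap between them can become one more number in the vector. A small Ruby sketch, assuming the log is produced with `git log --pretty=format:"%H %at %ct"` (hash, author timestamp, committer timestamp); the helper name and the sample line are made up.

```ruby
# Parses one line of `git log --pretty=format:"%H %at %ct"` output:
# commit hash, author timestamp, committer timestamp (both Unix seconds).
def date_gap_seconds(log_line)
  _hash, author_ts, commit_ts = log_line.split
  commit_ts.to_i - author_ts.to_i
end

# Made-up sample line; the hash and timestamps are illustrative only.
line = "0123456789abcdef0123456789abcdef01234567 1350000000 1350086400"
gap = date_gap_seconds(line)
puts "committed #{gap / 3600} hours after it was authored"
```

How much weight a large gap deserves is left to the classifier; the sketch only extracts the value.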
Extracting a vector: getting data
● Files modified:
  ○ already implicated in a bug fix
  ○ already implicated in a vulnerability
Filtering vs Dangerous
● Good list of "dangerous" signatures from
  graudit:
   ○ https://github.com/wireghoul/graudit/


● Weighting is *really* important:
   ○ echo -> potential XSS -> 1 point
   ○ system -> potential command execution -> 10 points


● Some functions are in both:
   ○ crypto functions for example
   ○ crypto can be dangerous but can filter as well
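
A sketch of weighted scoring in Ruby. Only the weights for `echo` (1) and `system` (10) come from the slide; the weight for `eval`, the helper name and the sample patch are assumptions. It is regex based, so it shares the false-alert problem mentioned later in the talk.

```ruby
# Hypothetical weights: only echo (1) and system (10) come from the slide.
DANGEROUS_WEIGHTS = { "echo" => 1, "system" => 10, "eval" => 10 }.freeze

# Sum the weights of dangerous names on added ("+") or removed ("-") diff lines.
def dangerous_score(diff_text, sign)
  diff_text.each_line.sum do |line|
    # Skip lines with the wrong sign and the "+++" / "---" file headers.
    next 0 unless line.start_with?(sign) && !line.start_with?(sign * 3)

    DANGEROUS_WEIGHTS.sum do |name, weight|
      line.scan(/\b#{Regexp.escape(name)}\b/).size * weight
    end
  end
end

SAMPLE_PATCH = <<~'DIFF'
  +system("convert " . $_GET['file']);
  +echo $result;
  -echo "done";
DIFF

puts dangerous_score(SAMPLE_PATCH, "+")
puts dangerous_score(SAMPLE_PATCH, "-")
```

The inserted side scores 11 (one `system`, one `echo`) and the deleted side 1, which gives two separate vector components. A "filtering" score could be computed the same way with its own table.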
Filtering vs Dangerous
● Function names shown on the slide (word cloud):
   ○ system, exec, eval, assert, create_function, preg_replace, popen, open3, send, echo, attr_protected, attr_accessible, htmlentities, File.basename, basename, intval, escape
Keywords
● Keywords shown on the slide (word cloud):
   ○ Code execution, vulnerability, SQL injection, Cross Site Scripting, XSS, Directory traversal, disclosure, CVE, Dangerous, Security, Command exec, CSRF, Risky
   ○ punctuation, CSS rules, CSS selector, Documentation, Version number, Typo, description, Changelog
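
Keyword matching on the commit message can be sketched in a few lines of Ruby. The keywords are taken from the slide; splitting them into a "security" list and a "boring" list, as well as the helper name and the sample message, are assumptions for this example.

```ruby
# Keywords taken from the slide; the split into two groups is an assumption.
SECURITY_KEYWORDS = ["sql injection", "xss", "cross site scripting", "csrf",
                     "cve", "directory traversal", "command exec",
                     "vulnerability", "security"].freeze
BORING_KEYWORDS   = ["typo", "changelog", "documentation", "version number",
                     "css"].freeze

# Count how many keywords of a list appear in a commit message.
def keyword_hits(message, keywords)
  text = message.downcase
  keywords.count { |keyword| text.include?(keyword) }
end

message = "Fix XSS in search form (CVE pending), update Changelog"
puts keyword_hits(message, SECURITY_KEYWORDS)
puts keyword_hits(message, BORING_KEYWORDS)
```

The sample message hits two security keywords ("xss", "cve") and one boring keyword ("changelog"). Plain substring matching is crude, so short keywords can match inside longer words.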
Classification
● Fixed bugs:
   ○ learn from dangerous keywords


● New bugs:
   ○ git blame
   ○ read the source code and classify manually


● Potentially interesting new feature:
   ○ read the source code
   ○ can be a new bug
Results
● Vector computation:
   ○ between 15 and 120 minutes for 5000 commits


● Classification:
   ○ less than a minute


● Scoring:
   ○ 90% success rate on bug fix (without using the
     message as part of the vector)
   ○ 50/50 between FP and FN on bug fix
   ○ 200 commits down to 5-10 bugs per day
My tool: SANZARU
● Japanese names for tools make you a Ninja ;)

● Ruby based (what else...)

● Data Mining done with Weka (thx Silvio)
SANZARU: virtuous circle
● Designed so that the more you learn about a
  project, the more effective it gets :)

● Score authors through learning

● Score files through learning

● Add functions used by the project
SANZARU: "learning mode"
● Takes the last 5k commits and gives you the list
  of impacted files and authors, each with a weight

● Still working on finding the initial bug's author,
  but it doesn't really give you more information
SANZARU: configuration file
configure({ :path => "/home/snyff/code/rails",
            :type => :git,
            :remote => "origin/master",
            :origin => "https://github.com/rails/rails",
            :languages => [ :ruby ] })

filter({ :extensions =>
            [ :html, :css, :jpg, :png, :md, :tpl ],
         :files => ["LICENSE", "*test*"] })

alert({ :keywords =>
            [keywords_default]
            ... })
SANZARU: configuration file
classify(:authors => { :default => 0,
                       "rafaelmfranca@gmail.com" => 19,
                       "guilleiguaran@gmail.com" => 15,
                       "fxn@hashref.com" => 8,
                       "lrodriguezsanc@gmail.com" => 11,
                       "vijaydev.cse@gmail.com" => 25, .... },

         :files   => { :default => 0,
                       "activemodel/lib/active_model/mass_assignment_security.rb" => 20,
                       "railties/lib/rails/application.rb" => 17,
                       "actionpack/lib/action_view/helpers/form_helper.rb" => 17,
                       "activerecord/lib/active_record/core.rb" => 17,
                       ...
})
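
How such weight tables might feed the vector can be sketched as follows. This is not SANZARU's code; the table contents, the helper name and the idea of simply adding the weights are assumptions. Unknown authors and files fall back to the default, like `:default => 0` in the configuration.

```ruby
# Hypothetical weight tables; names and numbers are made up for illustration.
AUTHOR_WEIGHTS = { "dev-a@example.com" => 19, "dev-b@example.com" => 8 }.freeze
FILE_WEIGHTS   = { "lib/auth.rb" => 20, "lib/session.rb" => 17 }.freeze
DEFAULT_WEIGHT = 0

# Author weight plus the weight of every touched file; unknown entries
# fall back to DEFAULT_WEIGHT.
def history_score(author, files)
  AUTHOR_WEIGHTS.fetch(author, DEFAULT_WEIGHT) +
    files.sum { |file| FILE_WEIGHTS.fetch(file, DEFAULT_WEIGHT) }
end

puts history_score("dev-a@example.com", ["lib/auth.rb", "README"])
puts history_score("newcomer@example.com", ["README"])
```

The first commit scores 39 (19 for the author, 20 for `lib/auth.rb`), the second scores 0 because neither the author nor the file is known yet.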
SANZARU: "classification mode"
● Using ruby to create all the vectors

● Using weka to classify the data

● Then manual review of the results:
   ○ New features to find security bugs
   ○ FP for possible silent patching
SANZARU: "daily mode"
● Cron job (every day)
  ○ update all repositories (hasn't been blacklisted by
    github...yet), ruby-git is *shit*
  ○ find alerts in new commits
  ○ classify new commits
  ○ give me a nice report with what to read
SANZARU: example of output
Example found this week (not exploitable... yet):

    esc_js escapes ' and "... this doesn't
Example found this week:
Example found this morning:
General observations
● Most fixes are:
  ○ small code insertion (less than 10 lines)
  ○ basic line substitution
  ○ easy to detect


● Most new bugs are:
  ○ details...
  ○ really hard to detect statistically
  ○ general approach: read all potentially interesting
    commits
  ○ working on important projects makes the creation of
    bugs far less likely
  ○ it's not going to rain 0dayz...
Possible improvements
● Integrating syntactic analysis:
   ○ regular expressions are just not enough
   ○ False alerts are time consuming...


● Retrieve information from external sources:
   ○ bug report
   ○ CVE


● Support for more languages/platforms:
   ○ Objective C libraries and applications?
   ○ Linux kernel?
   ○ ...
Conclusion
● Easy to detect:
   ○ (Silent) Security Fixes
   ○ New features with "interesting" functions


● Not so easy to detect
   ○ New security bugs


● Still worth the time
   ○ if you want bugs
   ○ if you are doing code review to have examples to
     learn from or share: vulnerability patterns
   ○ most frustrating thing you can do?
Questions?
            @snyff

● Have a great Ruxcon
● Play the CTF and Lock Picking
● Remember to check out:
  ○ PentesterLab.com
  ○ @PentesterLab
● Thx to everyone who helped me put this talk together

 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

Ln monitoring repositories

  • 1. Monitoring repositories for FUN and PROFIT @snyff [_]
  • 2. About me ● Security consultant (C.T.O.) working for Securus Global in Melbourne ● PentesterLab (.com): ○ cool/awesome (web) *free* training/exercises ○ real life scenario
  • 3. Disclaimer ● No code is going to be released today ● No repositories were harmed during the preparation of this talk ● I worked on Web and Open Source projects ● I worked on commits without using the entire project's source code
  • 4. Why work on commits? ● Corporate development: ○ Cannot review all projects anymore ○ Nice to have a “what to check today” ○ Sort commits by criticality ○ Detect backdoors ● Agile development: ○ The code changes every day ○ Can’t rely on one time code review anymore ○ Current approach: daily scan
  • 5. Why work on commits? ● You have vulnerabilities: ○ Detect patches affecting your bugs ○ Detect changes to sensitive functions
  • 6. Why work on commits? ● You want vulnerabilities ($$): ○ Detect new features with dangerous functions ○ Detect changes to sensitive functions
  • 7. Why work on commits? ● You want bugs (lulz): ○ Get bugs a few hours before the patch is available ○ Get a list of bad-practice examples ○ Detect silent patching
  • 8. What's a repository? ● Developers ● Files ● Commits ● And all of these are constantly moving...
  • 9. Developers ● Main developer(s): ○ Add features ○ Fix bugs ● Cosmetic committer(s): ○ Change comments (fix typo) ○ Change designs of the website ○ Change indentation ○ Add documentation ● External people ○ Do a bit of everything
  • 10. Files ● README/LICENSE files ● Templates, HTML, CSS ● Images ● Code: ○ Libraries ○ Installation code ○ "normal" code
  • 11. Commits ● Developer's name ● Code changes: ○ Changes: diff ○ Files changed ○ Number of deletions/additions ● Date/Time of the commit ● Message
  • 13. Examples of projects monitored
  • 14. Stats (on the last 5000 commits) ● Commits per week: ○ anywhere between 20 and 180 (phpmyadmin) per week ○ 40 commits per week seems to be the average for "normal/interesting" projects ● Authors: ○ between 1 and 140 ● Average commit: 200 lines (insertions+deletions)
  • 21. Filtering files ● General approach: ○ images ○ css ○ README ● Framework based: ○ tests (interesting to keep for some projects) ○ database migration/creation script ● Project based files ○ deployment ○ installation files
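The file filter described on slide 21 could be sketched as follows. This is a minimal illustration, not SANZARU's code: the names `IGNORED_EXTENSIONS`, `IGNORED_PATTERNS`, `noise?` and `interesting_files` are hypothetical, and the lists only mirror the categories named on the slide.

```ruby
# Hypothetical noise filter: the extension and pattern lists are
# illustrative, loosely modelled on the categories listed on the slide.
IGNORED_EXTENSIONS = %w[.png .jpg .gif .css .md].freeze
IGNORED_PATTERNS   = ["README*", "LICENSE*", "*test*", "db/migrate/*"].freeze

# Returns true when a changed path is unlikely to contain reviewable code.
def noise?(path)
  return true if IGNORED_EXTENSIONS.include?(File.extname(path).downcase)

  base = File.basename(path)
  IGNORED_PATTERNS.any? do |pattern|
    File.fnmatch?(pattern, path) || File.fnmatch?(pattern, base)
  end
end

# Keeps only the paths worth reading in a commit.
def interesting_files(paths)
  paths.reject { |path| noise?(path) }
end

changed = ["README.md", "app/models/user.rb", "public/logo.png",
           "test/user_test.rb", "db/migrate/001_create_users.rb"]
puts interesting_files(changed) # prints "app/models/user.rb"
```

A broad glob such as `*test*` also drops paths like `lib/latest.rb`, so per-project tuning of the patterns is probably needed, which matches the slide's note that tests are worth keeping for some projects.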
  • 22. Filtering developers ● For a given project find the "cosmetic developers" ● Don't get me wrong, they are not useless; they just do things I don't care about
  • 23. Results ● Around 5-10% of commits have nothing to do with code... ● You can divide the size of most other commits by 2-3 if you ignore noise (files/comments/...): ○ new code with test cases ○ modification in comments ○ ...
  • 25. Data mining ● Take your samples (commits) ○ Extract a vector from each sample ○ Classify each sample ● From a training set, learn to classify the data ● Apply what you learned: ○ to the same training set after splitting it (cross-validation) ○ to new samples
  • 26. Data mining ● training set: [1,2,3,0,10,220 ] -> bugfix [2,4,3,0,1,0 ] -> boring [2,5,3,3,1,1 ] -> boring [20,1,0,100,0,10 ] -> new bug ● testing: [23,0,1,90,0,15 ] -> ???
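To make the slide's example concrete, here is a small sketch that labels the test vector with the class of its nearest training sample. The talk does the classification with Weka and does not say which algorithm is used, so nearest-neighbour is a stand-in chosen for brevity; `TRAINING_SET`, `squared_distance` and `nearest_label` are hypothetical names.

```ruby
# Training samples copied from the slide: feature vector => label.
TRAINING_SET = [
  [[1, 2, 3, 0, 10, 220], "bugfix"],
  [[2, 4, 3, 0, 1, 0], "boring"],
  [[2, 5, 3, 3, 1, 1], "boring"],
  [[20, 1, 0, 100, 0, 10], "new bug"]
].freeze

# Squared Euclidean distance between two vectors of equal length.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# 1-nearest-neighbour: return the label of the closest training sample.
def nearest_label(vector, training_set = TRAINING_SET)
  _, label = training_set.min_by { |sample, _| squared_distance(sample, vector) }
  label
end

puts nearest_label([23, 0, 1, 90, 0, 15]) # prints "new bug"
```

The features here are unscaled, so the largest column dominates the distance; a real setup would likely normalise each feature first.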
  • 27. Extracting a vector ● You can't really say a commit is close to another commit ● You need to generate a vector from each commit to compare them ● Once you have done that, everything else is just magic^W Maths
  • 28. Extracting a vector: getting data ● Number of lines changed: ○ insertion vs deletion ● Number of words changed (--word-diff): ○ insertion vs deletion ● Authors: ○ rating of authors based on the project's history ■ "fixing" score ■ "vulnerability creator" score ○ new developers ○ known security researchers
  • 29. Extracting a vector: getting data ● Number of "dangerous" functions: ○ insertion ○ deletion ● Number of "filtering" functions: ○ insertion ○ deletion ● commit date vs author date ● Keywords in the message and in the code
  • 30. Extracting a vector: getting data ● Files modified: ○ already implicated in a bug fix ○ already implicated in a vulnerability
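A few of the features listed on slides 28 to 30 can be read straight from a unified diff. The sketch below counts inserted and deleted lines and the "dangerous" calls on each side. It is an assumption-laden illustration: `DANGEROUS`, `diff_vector` and `SAMPLE_DIFF` are hypothetical, and the function list is a tiny sample rather than the list used in the talk.

```ruby
# Illustrative name list; a real list would be much longer.
DANGEROUS = %w[system eval exec popen].freeze
DANGEROUS_RE = /\b(?:#{DANGEROUS.join("|")})\b/

# Turns a unified diff into
# [lines added, lines removed, dangerous calls added, dangerous calls removed].
def diff_vector(diff_text)
  added = removed = danger_added = danger_removed = 0
  diff_text.each_line do |line|
    # Skip file headers. Simplification: a changed line whose own content
    # begins with "--" or "++" would be skipped as well.
    next if line.start_with?("+++", "---")

    if line.start_with?("+")
      added += 1
      danger_added += line.scan(DANGEROUS_RE).size
    elsif line.start_with?("-")
      removed += 1
      danger_removed += line.scan(DANGEROUS_RE).size
    end
  end
  [added, removed, danger_added, danger_removed]
end

SAMPLE_DIFF = <<~DIFF
  --- a/lib/runner.rb
  +++ b/lib/runner.rb
  @@ -1,3 +1,4 @@
   def run(cmd)
  -  puts cmd
  +  log(cmd)
  +  system(cmd)
   end
DIFF

p diff_vector(SAMPLE_DIFF) # prints [2, 1, 1, 0]
```

Author and file history scores would be appended to the same array to form the full vector.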
  • 31. Filtering vs Dangerous ● Good list of "dangerous" signatures from graudit: ○ https://github.com/wireghoul/graudit/ ● Weighting is *really* important: ○ echo -> potential XSS -> 1 point ○ system -> potential command execution -> 10 points ● Some functions are in both: ○ crypto functions for example ○ crypto can be dangerous but can filter as well
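The weighting idea can be sketched as a lookup table applied to the lines a commit adds. Only the two weights shown on the slide (echo = 1, system = 10) come from the talk; `WEIGHTS` and `danger_score` are hypothetical names and the matching is a plain word-boundary regex.

```ruby
# Weights for echo and system are taken from the slide.
WEIGHTS = { "echo" => 1, "system" => 10 }.freeze

# Sums the weight of every dangerous call found in the added lines.
def danger_score(added_lines, weights = WEIGHTS)
  added_lines.sum do |line|
    weights.sum do |name, points|
      line.scan(/\b#{Regexp.escape(name)}\b/).size * points
    end
  end
end

lines = ['echo $_GET["name"];', 'system("ls " . $dir);', 'echo "done";']
puts danger_score(lines) # prints 12 (1 + 10 + 1)
```

A parallel table with negative or separate weights could cover the "filtering" functions, so that a commit adding `htmlentities` scores differently from one adding `system`.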
  • 32. Filtering vs Dangerous attr_protected system attr_protected htmlentities echo attr_accessible echo File.basename eval exec preg_replace assert preg_replace basename create_function intval open3 popen send escape
  • 33. Keywords Code vulnerability execution SQL injection Cross Site punctuation CSS rules Scripting XSS Directory disclosure Documentation CVE traversal Version number Dangerous CSS Security Typo selector Command exec description Changelog CSRF Risky
  • 34. Classification ● Fixed bugs: ○ learn from dangerous keywords ● New bugs: ○ git blame ○ read the source code and classify manually ● Potentially interesting new feature: ○ read the source code ○ can be a new bug
  • 35. Results ● Vector computation: ○ between 15 and 120 minutes for 5000 commits ● Classification: ○ less than a minute ● Scoring: ○ 90% success rate on bug fix (without using the message as part of the vector) ○ 50/50 between FP and FN on bug fix ○ 200 commits down to 5-10 bugs per day
  • 36. My tool: SANZARU ● Japanese names for tools make you a Ninja ;) ● Ruby based (what else...) ● Data Mining done with Weka (thx Silvio)
  • 37. SANZARU: virtuous circle ● Made in a way that the more you learn about a project, the more effective it gets :) ● Score authors through learning ● Score files through learning ● Add functions used by the project
  • 38. SANZARU: "learning mode" ● takes the last 5k commits and gives you the list of impacted files and authors with a weight ● still working on finding the initial bug's author but it doesn't really give you more information
  • 39. SANZARU: configuration file configure({ :path => "/home/snyff/code/rails", :type => :git, :remote => "origin/master", :origin => "https://github.com/rails/rails", :languages => [ :ruby ] }) filter({ :extensions => [ :html, :css, :jpg, :png, :md, :tpl ], :files => ["LICENSE", "*test*"] }) alert({ :keywords => [keywords_default] ... })
  • 40. SANZARU: configuration file classify(:authors => { :default => 0, "rafaelmfranca@gmail.com"=>19,"guilleiguaran@gmail.com" =>15,"fxn@hashref.com"=>8, "lrodriguezsanc@gmail.com" =>11, "vijaydev.cse@gmail.com"=>25, .... }, :files => { :default => 0, "activemodel/lib/active_model/mass_assignment_security.rb"=>20, "railties/lib/rails/application.rb"=>17, "actionpack/lib/action_view/helpers/form_helper.rb"=>17, "activerecord/lib/active_record/core.rb"=>17, ... })
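One way weight tables like the ones above could feed into a per-commit score is sketched below. The weights are a subset copied from the slide, but how SANZARU actually combines them is not stated in the talk; adding the author weight to the sum of the file weights is purely an assumption, and `weight_for` and `commit_weight` are hypothetical helpers.

```ruby
# Subset of the weights shown on the slide.
AUTHORS = {
  :default => 0,
  "fxn@hashref.com" => 8,
  "vijaydev.cse@gmail.com" => 25
}.freeze

FILES = {
  :default => 0,
  "railties/lib/rails/application.rb" => 17,
  "activerecord/lib/active_record/core.rb" => 17
}.freeze

# Looks up a key, falling back to the table's :default entry.
def weight_for(table, key)
  table.fetch(key, table.fetch(:default, 0))
end

# Assumed combination rule: author weight plus the sum of file weights.
def commit_weight(author, files, authors: AUTHORS, file_weights: FILES)
  weight_for(authors, author) + files.sum { |f| weight_for(file_weights, f) }
end

puts commit_weight("fxn@hashref.com",
                   ["railties/lib/rails/application.rb", "README.md"]) # prints 25
```

Unknown authors and files fall back to `:default`, which mirrors the `:default => 0` entries in the configuration.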
  • 41. SANZARU: "classification mode" ● Using ruby to create all the vectors ● Using weka to classify the data ● Then manual review of the results: ○ New features to find security bugs ○ FP for possible silent patching
  • 42. SANZARU: "daily mode" ● Cron job (every day) ○ update all repositories (hasn't been blacklisted by github...yet), ruby-git is *shit* ○ find alerts in new commits ○ classify new commits ○ give me a nice report with what to read
  • 44. Example found this week (not exploitable... yet): esc_js escapes ' and "... this doesn't
  • 46. Example found this morning:
  • 47. General observations ● Most fixes are: ○ small code insertion (less than 10 lines) ○ basic line substitution ○ easy to detect ● Most new bugs are: ○ details... ○ really hard to detect statistically ○ general approach: read all potentially interesting commits ○ working on important projects makes the creation of bugs far less likely ○ it's not going to rain 0dayz...
  • 48. Possible improvements ● Integrating syntactic analysis: ○ regular expressions are just not enough ○ False alerts are time consuming... ● Retrieve information from external sources: ○ bug report ○ CVE ● Support for more languages/platforms: ○ Objective C libraries and applications? ○ Linux kernel? ○ ...
  • 49. Conclusion ● Easy to detect: ○ (Silent) Security Fixes ○ New features with "interesting" functions ● Not so easy to detect ○ New security bugs ● Still worth the time ○ if you want bugs ○ if you are doing code review to have examples to learn from or share: vulnerability patterns ○ most frustrating thing you can do?
  • 50. Questions? @snyff ● Have a great Ruxcon ● Play the CTF and Lock Picking ● Remember to check out: ○ PentesterLab.com ○ @PentesterLab ● Thx to everyone who helped me put this talk together