FFRI,Inc.
Fourteenforty Research Institute, Inc.
FFRI, Inc
http://www.ffri.jp
Monthly Research
Consideration and evaluation of using fuzzy hashing
Ver2.00.01
Agenda
• Background and purpose
• Basis of fuzzy hashing
• An experiment
• The result
• Consideration
Background and purpose
• ‘Fuzzy hashing’ was introduced in 2006 by Jesse Kornblum
– http://dfrws.org/2006/proceedings/12-Kornblum.pdf
• In malware analysis, fuzzy hashing tools such as ssdeep have come into use in recent years
• (IMHO) However, how to use them effectively has not been considered enough
• In these slides, we evaluate how effectively fuzzy hashing classifies similar malware
Basis of fuzzy hashing (1/4)
• In general, cryptographic hashes such as MD5 are widely used and ideally have the following properties
(cf. http://en.wikipedia.org/wiki/Cryptographic_hash_function):
– it is easy to compute the hash value for any given message
– it is infeasible to generate a message that has a given hash
– it is infeasible to modify a message without changing the hash
– it is infeasible to find two different messages with the same hash
• Cryptographic hashing is often used to identify identical files
• On the other hand, it is unsuitable for identifying similar files, because the digests are completely different even if only 1 bit differs between the files
• In DFIR (digital forensics and incident response) this need exists, and fuzzy hashing was developed to address it
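The 1-bit sensitivity described above can be checked with a minimal Python sketch. The sample message and the helper names (`flip_bit`, `differing_hex_digits`) are illustrative assumptions and do not come from the original slides.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def differing_hex_digits(a: str, b: str) -> int:
    """Count the positions at which two equal-length hex digests differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


original = b"fuzzy hashing sample message"  # hypothetical sample input
altered = flip_bit(original, 0)             # differs from original by 1 bit

digest_original = md5_hex(original)
digest_altered = md5_hex(altered)

print(digest_original)
print(digest_altered)
print(differing_hex_digits(digest_original, digest_altered), "of 32 hex digits differ")
```

The two inputs are almost identical, yet the digests share no useful structure, which is why a digest comparison cannot express "similar".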
Basis of fuzzy hashing (2/4)
• Fuzzy hashing = Context Triggered Piecewise Hashing (CTPH) = piecewise hashing + rolling hashing
• Piecewise hashing
– Dividing the message into N blocks and calculating a hash value for each block
• Rolling hash
– A method for quickly calculating the hash values of successive sub-messages (positions 1-3, 2-4, …)
– In general, if the calculated values are the same, the original sub-messages are identical with high probability
Figure: calculating a hash value for each N characters (N = 3)
Position: 1 2 3 4 5 6 7 8
Message:  a b c d e f g h
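The two building blocks can be sketched in Python as follows. This is only an illustration: the additive rolling hash, the truncation of MD5 to 4 hex digits, and the function names are assumptions chosen for brevity, not what ssdeep actually uses.

```python
import hashlib


def piecewise_hashes(message: bytes, block_size: int) -> list:
    """Piecewise hashing: hash each fixed-size block independently."""
    return [
        hashlib.md5(message[i:i + block_size]).hexdigest()[:4]
        for i in range(0, len(message), block_size)
    ]


def rolling_sums(message: bytes, window: int) -> list:
    """Rolling hash: one value per window, updated in O(1) per step.

    The "hash" here is simply the sum of the bytes in the window.
    Sliding by one position adds the incoming byte and subtracts the
    outgoing one instead of re-reading the whole window.
    """
    if len(message) < window:
        return []
    value = sum(message[:window])
    values = [value]
    for i in range(window, len(message)):
        value += message[i] - message[i - window]
        values.append(value)
    return values


message = b"abcdefgh"
block_hashes = piecewise_hashes(message, 3)  # blocks "abc", "def", "gh"
window_values = rolling_sums(message, 3)     # windows 1-3, 2-4, ..., 6-8

print(block_hashes)
print(window_values)
```

Equal window values suggest, but do not prove, equal sub-messages: with a sum, "abc" and "cba" collide, which is the "high probability" caveat in practice.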
Basis of fuzzy hashing (3/4)
• Fuzzy hashing (CTPH)
– When the rolling hash produces a specific value (the trigger) at some position, a cryptographic hash is calculated over the partial message from the beginning of the current block to that position
– The final hash value is generated by concatenating all the (partial) hashes
Figure: determining blocks with a trigger
Position: 1 2 3 4 5 6 7 8
Message:  a a a b b b c c
① The rolling hash is calculated and the result is the specific value (trigger)
② The hash value of this block is calculated with a cryptographic hash
③ The rolling hash is calculated from position 4 onward (to determine the next block)
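Steps ① to ③ can be sketched as a toy CTPH in Python. All parameters here are assumptions picked so that the example message splits as in the figure: a 3-byte window, a byte-sum rolling hash, a trigger when the sum is divisible by 3, and MD5 truncated to 2 hex digits per block. The window also keeps sliding across block boundaries, which is a choice of this sketch. It is not ssdeep's actual algorithm.

```python
import hashlib


def ctph_blocks(message: bytes, window: int = 3, modulus: int = 3, trigger: int = 0) -> list:
    """Cut the message wherever the rolling hash of the last `window` bytes hits the trigger."""
    blocks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(message):
        rolling += byte
        if i >= window:
            rolling -= message[i - window]  # drop the byte leaving the window
        # Only full windows may trigger a cut.
        if i + 1 >= window and rolling % modulus == trigger:
            blocks.append(message[start:i + 1])
            start = i + 1
    if start < len(message):
        blocks.append(message[start:])  # trailing block without a trigger
    return blocks


def ctph_signature(message: bytes) -> str:
    """Concatenate a short hash of every block into one signature."""
    return "".join(hashlib.md5(b).hexdigest()[:2] for b in ctph_blocks(message))


print(ctph_blocks(b"aaabbbcc"))   # [b'aaa', b'bbb', b'cc']
print(ctph_blocks(b"xaaabbbcc"))  # [b'xaaa', b'bbb', b'cc']
print(ctph_signature(b"aaabbbcc"), ctph_signature(b"xaaabbbcc"))
```

Because the cut points depend on local content rather than on absolute offsets, inserting one byte at the front changes only the first block; the later blocks, and therefore the tail of the signature, stay the same.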
Basis of fuzzy hashing (4/4)
• Fuzzy hashing (CTPH)
– For almost identical messages, identical partial messages are likely to yield identical partial hash values
– This lets us identify partial matches between similar messages
Figure: block hashes of two similar messages
Position:  1 2 3 4 5 6 7 8
Message 1: a a a b b b c c
Message 2: b b b a a a d d
Hash values for each block
Message 1: a823 928c 817d
Message 2: 928c a823 1972
Hash value for entire message
Message 1: 238c7d
Message 2: 8c2372
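The figure's values can be reproduced with a short Python sketch. Taking the last two hex digits of each block hash is inferred from the numbers shown (a823, 928c, 817d become 238c7d), and the helper names are assumptions.

```python
def whole_signature(block_hashes: list) -> str:
    """Join the last two hex digits of every block hash, as the figure appears to do."""
    return "".join(h[-2:] for h in block_hashes)


def shared_blocks(a: list, b: list) -> list:
    """Block hashes that occur in both messages, regardless of position."""
    return sorted(set(a) & set(b))


message1_blocks = ["a823", "928c", "817d"]  # values from the figure
message2_blocks = ["928c", "a823", "1972"]

print(whole_signature(message1_blocks))                 # 238c7d
print(whole_signature(message2_blocks))                 # 8c2372
print(shared_blocks(message1_blocks, message2_blocks))  # ['928c', 'a823']
```

Two of the three blocks match even though the whole-message values differ, which is the partial match that a single cryptographic digest cannot reveal.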
An experiment (1/2)
• In general, the following uses of fuzzy hashing are proposed:
– Determining similar files (i.e. almost identical files whose MD5s do not match)
– Matching partial data in files
• This time, we evaluate the use of fuzzy hashing to determine similar malware
• We clarify “how effective it actually is” and “what we should consider when applying it”
An experiment (2/2)
• Preparation
– Preparing 2,036 malware files collected by ourselves, each unique by MD5
– Calculating the (fuzzy) hash value of each file with ssdeep, and the similarity of every pair of files
• nCr: 2,036C2 = 2,071,630 combinations
• Determining similar malware
– Extracting all the pairs whose similarity is 50%-100%
– For each similarity threshold, determining whether the detection names of the two files in a pair match
Figure: similarity of each pair of malware files (%)
    A   B   C   …
A       90  82  54
B           76  62
C               46
…
Pairs with similarity 80%+
・A-B
・A-C
Detection names of the 80%+ pairs
・(A)Trojan.XYZ – (B)Trojan.XYX: Matched
・(A)Trojan.XYZ – (C)WORM.DEA: Not matched
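The pair count and the threshold extraction can be sketched in Python. The `compare` argument stands for any 0-100 similarity function, such as the comparison ssdeep provides; the helper names are assumptions, and the small table holds the A, B and C values from the figure.

```python
from itertools import combinations
from math import comb

# Number of unordered pairs among 2,036 files.
total_pairs = comb(2036, 2)


def all_pairs_similarity(hashes: dict, compare) -> dict:
    """Score every unordered pair of files with the given comparison function."""
    return {
        (a, b): compare(hashes[a], hashes[b])
        for a, b in combinations(sorted(hashes), 2)
    }


def pairs_at_or_above(similarity: dict, threshold: int) -> list:
    """Return the pairs whose similarity (%) reaches the threshold."""
    return sorted(pair for pair, score in similarity.items() if score >= threshold)


# Values taken from the similarity table in the figure.
figure_similarity = {("A", "B"): 90, ("A", "C"): 82, ("B", "C"): 76}

print(total_pairs)                               # 2071630
print(pairs_at_or_above(figure_similarity, 80))  # [('A', 'B'), ('A', 'C')]
```

The quadratic growth of the pair count is worth noting: roughly two thousand files already require about two million comparisons.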
The result
• The higher the threshold, the higher the matching rate of detection names
– Up to a threshold of 90%, the matching rate stays around 50-60%
[Figure: number of pairs whose similarity is above the threshold, plotted against the threshold of similarity (%)]
The rate detected by the name “Generic”?
• Dividing the “matched” pairs into those with “generic” in the detection name and the others
• “matched(Generic)%” shows the same trend as the matching rate above
-> The higher the threshold, the more malware is detected as “generic”
[Figure: number of pairs whose similarity is above the threshold, plotted against the threshold of similarity (%)]
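How a matching rate and a “Generic” share per threshold could be computed is sketched below in Python. The sample pairs and detection names are made up for illustration, exact string equality is assumed as the matching rule, and both rates are taken relative to all pairs at or above the threshold; the slides do not state these details.

```python
def matching_stats(pairs: list, threshold: int) -> dict:
    """Summarize the pairs whose similarity is at or above the threshold.

    Each pair is (similarity_percent, detection_name_a, detection_name_b).
    """
    selected = [p for p in pairs if p[0] >= threshold]
    matched = [p for p in selected if p[1] == p[2]]
    generic = [p for p in matched if "generic" in p[1].lower()]
    count = len(selected)
    return {
        "pairs": count,
        "matched_rate": len(matched) / count if count else 0.0,
        "generic_rate": len(generic) / count if count else 0.0,
    }


# Hypothetical pairs, not data from the experiment.
sample_pairs = [
    (95, "Trojan.Generic", "Trojan.Generic"),
    (92, "Worm.Foo", "Worm.Foo"),
    (85, "Trojan.Generic", "Worm.Bar"),
    (70, "Worm.Foo", "Trojan.Baz"),
    (55, "Trojan.Generic", "Trojan.Generic"),
    (52, "Worm.Bar", "Trojan.Baz"),
]

for threshold in (50, 60, 70, 80, 90):
    print(threshold, matching_stats(sample_pairs, threshold))
```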
Consideration
• The meaning of the result depends on whether the anti-virus vendor (AVV) uses fuzzy hashing for generic detection
– If they use fuzzy hashing for generic detection
• The result is natural
– If not
• By using fuzzy hashing, we may obtain a result similar to generic detection
• If we use fuzzy hashing for generic detection, a similarity of 90%+ to the (fuzzy) hash values of known malware might be required
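A lookup against known malware hashes with a 90% cut-off might look like the following Python sketch. `stand_in_compare` uses `difflib` only to keep the example self-contained; it is not ssdeep's comparison, and the hash strings and detection names are hypothetical.

```python
from difflib import SequenceMatcher


def stand_in_compare(hash_a: str, hash_b: str) -> int:
    """Stand-in similarity score from 0 to 100 (NOT ssdeep's comparison)."""
    return round(SequenceMatcher(None, hash_a, hash_b).ratio() * 100)


def generic_detect(sample_hash: str, known_hashes: dict,
                   compare=stand_in_compare, threshold: int = 90):
    """Return (name, score) of the best known match at or above threshold, else None."""
    best_name, best_score = None, -1
    for name, known_hash in known_hashes.items():
        score = compare(sample_hash, known_hash)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return best_name, best_score
    return None


# Hypothetical known-malware table: detection name -> fuzzy hash string.
known = {"Trojan.Example": "abcdefghij", "Worm.Example": "zzzzzzzzzz"}

print(generic_detect("abcdefghiX", known))  # ('Trojan.Example', 90)
print(generic_detect("abcdeXXXXX", known))  # None
```

In a real deployment the `compare` argument would be replaced by the fuzzy hashing tool's own comparison, and the threshold would be tuned against the matching rates observed in the experiment.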
Contact Information
• E-Mail: research-feedback@ffri.jp
• twitter: @FFRI_Research

Designing IA for AI - Information Architecture Conference 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

FFRI Evaluation of Using Fuzzy Hashing to Classify Malware Similarity

  • 1. Monthly Research: Consideration and evaluation of using fuzzy hashing (Ver2.00.01). FFRI, Inc. (Fourteenforty Research Institute, Inc.), http://www.ffri.jp
  • 2. Agenda
      • Background and purpose
      • Basis of fuzzy hashing
      • An experiment
      • The result
      • Consideration
  • 3. Background and purpose
      • "Fuzzy hashing" was introduced in 2006 by Jesse Kornblum
          – http://dfrws.org/2006/proceedings/12-Kornblum.pdf
      • In malware analysis, fuzzy hashing algorithms such as ssdeep have come into use in recent years
      • However, (IMHO) the effective usage of these algorithms has not been considered enough
      • In these slides, we evaluate how effectively fuzzy hashing classifies malware by similarity
  • 4. Basis of fuzzy hashing (1/4)
      • In general, cryptographic hashing such as MD5 is popular, and it has the following attributes (cf. http://en.wikipedia.org/wiki/Cryptographic_hash_function):
          – it is easy to compute the hash value for any given message
          – it is infeasible to generate a message that has a given hash
          – it is infeasible to modify a message without changing the hash
          – it is infeasible to find two different messages with the same hash
      • Cryptographic hashing is often used to identify identical files
      • On the other hand, it is unsuitable for identifying similar files, because the digests are completely different even if a single bit differs between the two files
      • In DFIR (digital forensics and incident response) this need exists, and fuzzy hashing was developed to solve this problem
  • 5. Basis of fuzzy hashing (2/4)
      • Fuzzy hashing = Context Triggered Piecewise Hashing (CTPH) = piecewise hashing + rolling hashing
      • Piecewise hashing
          – Dividing a message into N blocks and calculating the hash value of each block
      • Rolling hash
          – A fast method for calculating the hash values of successive sub-messages (positions 1-3, 2-4, ...)
          – In general, if the calculated values are the same, there is a high probability that the original sub-messages are identical
      • Figure: the message "a b c d e f g h" at positions 1-8; a hash value is calculated for each N chars (N=3)
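The rolling hash on slide 5 can be illustrated with a minimal Python sketch. It assumes a toy hash, the plain sum of the byte values in the window, which is not the function ssdeep uses; the function names are hypothetical. The point it shows is why rolling is fast: each new window value is derived from the previous one by adding the incoming byte and subtracting the outgoing one, instead of rehashing all N bytes.

```python
def window_hashes_direct(message: bytes, n: int = 3) -> list[int]:
    """Hash every n-byte window from scratch (toy hash: sum of the bytes)."""
    return [sum(message[i:i + n]) for i in range(len(message) - n + 1)]


def window_hashes_rolling(message: bytes, n: int = 3) -> list[int]:
    """Same values as window_hashes_direct, but updated in O(1) per step."""
    if len(message) < n:
        return []
    current = sum(message[:n])          # hash of the first window (positions 1-3)
    hashes = [current]
    for i in range(n, len(message)):
        # Slide the window by one: add the new byte, drop the oldest one.
        current += message[i] - message[i - n]
        hashes.append(current)
    return hashes


if __name__ == "__main__":
    print(window_hashes_rolling(b"abcdefgh", 3))
```

For the slide's message "abcdefgh" with N=3, both functions return one value for each of the six windows (positions 1-3 through 6-8).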
  • 6. Basis of fuzzy hashing (3/4)
      • Fuzzy hashing (CTPH)
          – When the rolling hash produces a specific value (the trigger) at some position, a cryptographic hash is calculated over the partial message from the beginning of the current block to that position
          – The final hash value is generated by concatenating all the (partial) hashes
      • Figure: the message "a a a b b b c c" at positions 1-8
          ① calculating the rolling hash; the result is the specific value (trigger)
          ② calculating the hash value of this block with a cryptographic hash
          ③ calculating the rolling hash from position 4 (trying to determine the next block)
  • 7. Basis of fuzzy hashing (4/4)
      • Fuzzy hashing (CTPH)
          – For almost identical messages, the same partial messages are likely to be hashed, so some block hash values coincide
          – We can therefore identify partial matches between similar messages
      • Figure: hash values for each block and for the entire message
          – Message 1 (a a a b b b c c): block hashes a823, 928c, 817d; hash value for the entire message 238c7d
          – Message 2 (b b b a a a d d): block hashes 928c, a823, 1972; hash value for the entire message 8c2372
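The CTPH idea on slides 6 and 7 can be sketched in a few lines of Python. This is a toy under stated assumptions, not ssdeep's actual algorithm: the rolling hash is a byte sum over a 7-byte window, the trigger is "rolling value mod BLOCK_SIZE equals BLOCK_SIZE - 1", each block is hashed with SHA-256 and reduced to one character, and the two signatures are compared with `difflib.SequenceMatcher` rather than ssdeep's own scoring. All names and constants are hypothetical.

```python
import hashlib
import string
from difflib import SequenceMatcher

WINDOW = 7        # rolling-hash window in bytes (assumed value)
BLOCK_SIZE = 16   # trigger modulus (assumed value)
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def _block_char(block: bytes) -> str:
    """Reduce the cryptographic hash of one block to a single character."""
    digest = hashlib.sha256(block).digest()
    return ALPHABET[digest[0] % len(ALPHABET)]


def ctph_signature(data: bytes) -> str:
    """Concatenate one character per context-triggered block."""
    chars = []
    block_start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]      # keep only the last WINDOW bytes
        if rolling % BLOCK_SIZE == BLOCK_SIZE - 1:
            # Trigger: close the current block and start the next one.
            chars.append(_block_char(data[block_start:i + 1]))
            block_start = i + 1
    if block_start < len(data):
        chars.append(_block_char(data[block_start:]))  # trailing remainder
    return "".join(chars)


def signature_similarity(sig_a: str, sig_b: str) -> int:
    """Stand-in similarity score in the range 0-100."""
    ratio = SequenceMatcher(None, sig_a, sig_b, autojunk=False).ratio()
    return round(100 * ratio)
```

Because block boundaries depend only on the last few bytes, appending data to a message leaves the characters of all earlier triggered blocks unchanged; only the final character can differ. That is the property that lets similar messages share parts of their signatures.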
  • 8. An experiment (1/2)
      • In general, the following usages of fuzzy hashing are proposed:
          – Determining similar files (i.e. almost identical files whose MD5s do not match)
          – Matching partial data in files
      • This time, we evaluate determining similar malware by fuzzy hashing
      • For this usage, we clarify "how effective it actually is" and "what we should consider when applying it"
  • 9. An experiment (2/2)
      • Preparation
          – Preparing 2,036 malware files, unique by MD5, that we collected ourselves
          – Calculating the fuzzy hash values with ssdeep and the similarity of every pair of files
              • nCr: 2,036C2 = 2,071,630 combinations
      • Determining similar malware
          – Extracting all the pairs whose similarity is 50%-100%
          – For each similarity threshold, determining whether the detection names of the two files in a pair match
      • Figure: similarity of each pair of malware files (%)
              A    B    C    …
          A        90   82   54
          B             76   62
          C                  46
          …
          – Pairs with 80%+ similarity: A-B, A-C
          – (A)Trojan.XYZ – (B)Trojan.XYX: Matched
          – (A)Trojan.XYZ – (C)WORM.DEA: Not matched
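The procedure on slide 9 can be sketched as follows. This is a hedged illustration, not the authors' code: the similarity values are the ones in the slide's example table, file "D" stands in for the "…" column, the detection names are illustrative (A and B are assumed to carry the same name), and "matched" is assumed to mean exact string equality. The function names are hypothetical.

```python
from itertools import combinations
from math import comb

# Number of unordered pairs among 2,036 files, as stated on the slide.
TOTAL_PAIRS = comb(2036, 2)  # 2,071,630

# Example similarity scores (%), keyed by file pair.
similarity = {
    ("A", "B"): 90, ("A", "C"): 82, ("A", "D"): 54,
    ("B", "C"): 76, ("B", "D"): 62,
    ("C", "D"): 46,
}

# Illustrative detection names (assumption, not data from the experiment).
names = {"A": "Trojan.XYZ", "B": "Trojan.XYZ", "C": "WORM.DEA", "D": "Adware.QRS"}


def pairs_at_or_above(scores, threshold):
    """Return the pairs whose similarity is at least the threshold."""
    return [pair for pair, score in scores.items() if score >= threshold]


def matching_rate(scores, detection_names, threshold):
    """Fraction of pairs above the threshold whose detection names match."""
    pairs = pairs_at_or_above(scores, threshold)
    if not pairs:
        return None  # no pair reaches this threshold
    matched = sum(1 for a, b in pairs if detection_names[a] == detection_names[b])
    return matched / len(pairs)


if __name__ == "__main__":
    assert sorted(similarity) == list(combinations("ABCD", 2))
    for threshold in (50, 60, 70, 80, 90):
        print(threshold, matching_rate(similarity, names, threshold))
```

With these example values, the 80% threshold keeps the pairs A-B and A-C, of which only A-B has matching names.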
  • 10. The result
      • The higher the threshold, the higher the matching rate of detection names
          – Up to a threshold of 90%, the matching rate stays around 50-60%
      • Figure (axis labels): the number of pairs whose similarity is above the threshold; threshold of similarity (%)
  • 11. The rate detected by the name "Generic"?
      • Dividing the "matched" pairs into those that have "generic" in their detection name and the others
      • "matched(Generic)%" shows the same trend as the matching rate above
          -> The higher the threshold, the more malware is detected as "generic"
      • Figure (axis labels): the number of pairs whose similarity is above the threshold; threshold of similarity (%)
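The split described on slide 11 could look like the short sketch below. It assumes that each matched pair is represented by its shared detection name and that "generic" is matched case-insensitively as a substring; the slides do not state either detail, and the sample names are made up for illustration.

```python
def split_by_generic(matched_names):
    """Split matched detection names into 'generic' ones and the others."""
    generic = [name for name in matched_names if "generic" in name.lower()]
    other = [name for name in matched_names if "generic" not in name.lower()]
    return generic, other


def generic_rate(matched_names):
    """Fraction of matched pairs whose shared name contains 'generic'."""
    if not matched_names:
        return None
    generic, _ = split_by_generic(matched_names)
    return len(generic) / len(matched_names)


if __name__ == "__main__":
    sample = ["Trojan.Generic.A", "Trojan.XYZ", "GENERIC.Dropper", "WORM.DEA"]
    print(generic_rate(sample))
```

Computing this rate once per similarity threshold would give a curve comparable to the "matched(Generic)%" series mentioned on the slide.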
  • 12. Consideration
      • The meaning of the result depends on whether the AVV (anti-virus vendor) uses fuzzy hashing for generic detection
          – If they use fuzzy hashing for generic detection
              • The result is natural
          – If not
              • By using fuzzy hashing, we may obtain a result similar to generic detection
      • If we use fuzzy hashing for generic detection, 90%+ similarity to the (fuzzy) hash values of known malware might be required
  • 13. Contact Information
      • E-Mail: research-feedback@ffri.jp
      • twitter: @FFRI_Research