SlideShare a Scribd company logo
1 of 3
Download to read offline
How NetApp Dedupe works?
When you 'sis on' a volume, the behaviour of that volume changes, and the change takes place in
two phases:
TWO PHASE PROCESS:
PHASE 1 -> SIS enabled: Pre-process: Before the block is written to the array: Collecting Fingerprint
Note: This is true for new blocks, for the existing data blocks that were written before enabling SIS, you need to run the scan on the
existing data and pull those fingerprints into the catalogue.
PHASE 2 -> SIS Start : Post-process: After the block is written to the array: Sorting, Comparing and
deduping
PHASE 1:
The moment SIS is enabled:
 Every time SIS notices a block write request coming in, the sis process makes a call to Data
ONTAP to get a copy of the fingerprint for that block so that it can store this fingerprint in its
catalogue file.
Note: This request interrupts the write string and results in a 7% performance penalty for all writes
into any volume with sis enabled.
PHASE 2:
Now, at some point you'll want to dedupe the volume using the 'sis start' command manually/auto
or via schedule:
 SIS goes through the process of comparing fingerprints from the fingerprint database
catalogue file, validating data, and dedupe'ing blocks that pass the validation phase.
Note: In the end all we are really doing is adjusting some inode metadata to say "hey remember that
data that used to be here, well it’s over there now."
IMPORTANT: Nothing about the basic data structure of the WAFL file system has changed, except
you are traversing a different path in the file structure to get to your desired data block. That’s why
NetApp dedupe *usually* has no perceivable impact on read performance - all we've done is
redirect some block pointers. Accessing your data might go a little faster, a little slower, or more
likely not change at all - it all depends on the pattern of the file system data structure and the
pattern of requests coming from the application.
What is a fingerprint?
Fingerprint is a small digital representation of a larger data object. Basically, it is a checksum
character generated by WAFL for each BLOCK for the purpose of consistency checking (This generally
involves the creation of a hash).
Is fingerprint generated by SIS?
No. Each time a WAFL block is created, a 'checksum' character is generated for the purpose of
consistency checking. NetApp Deduplication (SIS) simply "borrows" a copy of this checksum and
stores it in a catalogue as fingerprint.
What happens during post-process dedupe?
A. The fingerprint catalogue is sorted and searched for identical fingerprints.
B. When a fingerprint "match" is made, the associated data blocks are retrieved and scanned byte-
for-byte.
C. Assuming successful validation, the inode pointer metadata of the duplicate block is redirected to
the original block.
D. The duplicate block is marked as "Free" and returned to the system, eligible for re-use.
When to use QSM vs. VSM on dedupe volumes?
Use QSM when you only want to dedupe the destination volume, and use VSM when you want to
dedupe both the source and destination volumes automatically, and save bandwidth during SM
transfers.
Courtesy: Dr. Dedupe, NetApp.
Prepared by:
ashwinwriter@gmail.com

More Related Content

More from Ashwin Pawar

Network port administrative speed does not display correctly on NetApp storage
Network port administrative speed does not display correctly on NetApp storageNetwork port administrative speed does not display correctly on NetApp storage
Network port administrative speed does not display correctly on NetApp storageAshwin Pawar
 
How to connect to NetApp FILER micro-USB console port
How to connect to NetApp FILER micro-USB console portHow to connect to NetApp FILER micro-USB console port
How to connect to NetApp FILER micro-USB console portAshwin Pawar
 
NDMP backup models
NDMP backup modelsNDMP backup models
NDMP backup modelsAshwin Pawar
 
How to use Active IQ tool to access filer information
How to use Active IQ tool to access filer informationHow to use Active IQ tool to access filer information
How to use Active IQ tool to access filer informationAshwin Pawar
 
San vs Nas fun series
San vs Nas fun seriesSan vs Nas fun series
San vs Nas fun seriesAshwin Pawar
 
Steps to identify ONTAP latency related issues
Steps to identify ONTAP latency related issuesSteps to identify ONTAP latency related issues
Steps to identify ONTAP latency related issuesAshwin Pawar
 
SnapDiff process flow chart
SnapDiff process flow chartSnapDiff process flow chart
SnapDiff process flow chartAshwin Pawar
 
SnapDiff performance issue
SnapDiff performance issueSnapDiff performance issue
SnapDiff performance issueAshwin Pawar
 
Volume level restore fails with error transient snapshot copy is not supported
Volume level restore fails with error transient snapshot copy is not supportedVolume level restore fails with error transient snapshot copy is not supported
Volume level restore fails with error transient snapshot copy is not supportedAshwin Pawar
 
Disk reports predicted failure event
Disk reports predicted failure eventDisk reports predicted failure event
Disk reports predicted failure eventAshwin Pawar
 
OCUM shows ONTAP cluster health degraded
OCUM shows ONTAP cluster health degradedOCUM shows ONTAP cluster health degraded
OCUM shows ONTAP cluster health degradedAshwin Pawar
 
NDMPCOPY lun from 7-mode NetApp to cDOT
NDMPCOPY lun from 7-mode NetApp to cDOTNDMPCOPY lun from 7-mode NetApp to cDOT
NDMPCOPY lun from 7-mode NetApp to cDOTAshwin Pawar
 
Latency in storage
Latency in storageLatency in storage
Latency in storageAshwin Pawar
 
NetApp storage layering
NetApp storage layeringNetApp storage layering
NetApp storage layeringAshwin Pawar
 
What is storage from client's perspective
What is storage from client's perspectiveWhat is storage from client's perspective
What is storage from client's perspectiveAshwin Pawar
 
Difference between cluster image package show-repository and system image get
Difference between cluster image package show-repository and system image getDifference between cluster image package show-repository and system image get
Difference between cluster image package show-repository and system image getAshwin Pawar
 
Cannot access NetApp 7-mode admin shares etc$
Cannot access NetApp 7-mode admin shares etc$Cannot access NetApp 7-mode admin shares etc$
Cannot access NetApp 7-mode admin shares etc$Ashwin Pawar
 

More from Ashwin Pawar (20)

Network port administrative speed does not display correctly on NetApp storage
Network port administrative speed does not display correctly on NetApp storageNetwork port administrative speed does not display correctly on NetApp storage
Network port administrative speed does not display correctly on NetApp storage
 
How to connect to NetApp FILER micro-USB console port
How to connect to NetApp FILER micro-USB console portHow to connect to NetApp FILER micro-USB console port
How to connect to NetApp FILER micro-USB console port
 
NDMP backup models
NDMP backup modelsNDMP backup models
NDMP backup models
 
How to use Active IQ tool to access filer information
How to use Active IQ tool to access filer informationHow to use Active IQ tool to access filer information
How to use Active IQ tool to access filer information
 
San vs Nas fun series
San vs Nas fun seriesSan vs Nas fun series
San vs Nas fun series
 
Steps to identify ONTAP latency related issues
Steps to identify ONTAP latency related issuesSteps to identify ONTAP latency related issues
Steps to identify ONTAP latency related issues
 
SnapDiff
SnapDiffSnapDiff
SnapDiff
 
SnapDiff process flow chart
SnapDiff process flow chartSnapDiff process flow chart
SnapDiff process flow chart
 
SnapDiff performance issue
SnapDiff performance issueSnapDiff performance issue
SnapDiff performance issue
 
Volume level restore fails with error transient snapshot copy is not supported
Volume level restore fails with error transient snapshot copy is not supportedVolume level restore fails with error transient snapshot copy is not supported
Volume level restore fails with error transient snapshot copy is not supported
 
Disk reports predicted failure event
Disk reports predicted failure eventDisk reports predicted failure event
Disk reports predicted failure event
 
OCUM shows ONTAP cluster health degraded
OCUM shows ONTAP cluster health degradedOCUM shows ONTAP cluster health degraded
OCUM shows ONTAP cluster health degraded
 
NDMPCOPY lun from 7-mode NetApp to cDOT
NDMPCOPY lun from 7-mode NetApp to cDOTNDMPCOPY lun from 7-mode NetApp to cDOT
NDMPCOPY lun from 7-mode NetApp to cDOT
 
Latency in storage
Latency in storageLatency in storage
Latency in storage
 
NVRAM vs NVMEM
NVRAM vs NVMEMNVRAM vs NVMEM
NVRAM vs NVMEM
 
NAS vs SAN
NAS vs SANNAS vs SAN
NAS vs SAN
 
NetApp storage layering
NetApp storage layeringNetApp storage layering
NetApp storage layering
 
What is storage from client's perspective
What is storage from client's perspectiveWhat is storage from client's perspective
What is storage from client's perspective
 
Difference between cluster image package show-repository and system image get
Difference between cluster image package show-repository and system image getDifference between cluster image package show-repository and system image get
Difference between cluster image package show-repository and system image get
 
Cannot access NetApp 7-mode admin shares etc$
Cannot access NetApp 7-mode admin shares etc$Cannot access NetApp 7-mode admin shares etc$
Cannot access NetApp 7-mode admin shares etc$
 

Recently uploaded

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 

Recently uploaded (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

How netapp dedupe works

  • 1. How NetApp Dedupe works? When you 'sis on' a volume, the behaviour of that volume changes, and the change takes place in two phases: TWO PHASE PROCESS: PHASE 1 -> SIS enabled: Pre-process: Before the block is written to the array: Collecting Fingerprint Note: This is true for new blocks, for the existing data blocks that were written before enabling SIS, you need to run the scan on the existing data and pull those fingerprints into the catalogue. PHASE 2 -> SIS Start : Post-process: After the block is written to the array: Sorting, Comparing and deduping PHASE 1: The moment SIS is enabled:  Every time SIS notices a block write request coming in, the sis process makes a call to Data ONTAP to get a copy of the fingerprint for that block so that it can store this fingerprint in its catalogue file.
  • 2. Note: This request interrupts the write string and results in a 7% performance penalty for all writes into any volume with sis enabled. PHASE 2: Now, at some point you'll want to dedupe the volume using the 'sis start' command manually/auto or via schedule:  SIS goes through the process of comparing fingerprints from the fingerprint database catalogue file, validating data, and dedupe'ing blocks that pass the validation phase. Note: In the end all we are really doing is adjusting some inode metadata to say "hey remember that data that used to be here, well it’s over there now." IMPORTANT: Nothing about the basic data structure of the WAFL file system has changed, except you are traversing a different path in the file structure to get to your desired data block. That’s why NetApp dedupe *usually* has no perceivable impact on read performance - all we've done is redirect some block pointers. Accessing your data might go a little faster, a little slower, or more likely not change at all - it all depends on the pattern of the file system data structure and the pattern of requests coming from the application.
  • 3. What is a fingerprint? Fingerprint is a small digital representation of a larger data object. Basically, it is a checksum character generated by WAFL for each BLOCK for the purpose of consistency checking (This generally involves the creation of a hash). Is fingerprint generated by SIS? No. Each time a WAFL block is created, a 'checksum' character is generated for the purpose of consistency checking. NetApp Deduplication (SIS) simply "borrows" a copy of this checksum and stores it in a catalogue as fingerprint. What happens during post-process dedupe? A. The fingerprint catalogue is sorted and searched for identical fingerprints. B. When a fingerprint "match" is made, the associated data blocks are retrieved and scanned byte- for-byte. C. Assuming successful validation, the inode pointer metadata of the duplicate block is redirected to the original block. D. The duplicate block is marked as "Free" and returned to the system, eligible for re-use. When to use QSM vs. VSM on dedupe volumes? Use QSM when you only want to dedupe the destination volume, and use VSM when you want to dedupe both the source and destination volumes automatically, and save bandwidth during SM transfers. Courtesy: Dr. Dedupe, NetApp. Prepared by: ashwinwriter@gmail.com