SlideShare a Scribd company logo
1 of 26
Download to read offline
Private and confidential© 2014 Delphix. All Rights Reserved Private and confidential© 2014 Delphix. All Rights Reserved
Improving the ZFS Userland-Kernel API:
Channel Programs
Matt Ahrens
Principal Engineer
Delphix
● Problems
● Performance
● Atomicity
● API sprawl
● Use case - Delphix database virtualization
● Solution: Channel Programs
● Real-world use cases
● Future work
● Exciting OpenZFS announcements!
Background: ZFS Administrative Operations
zfs snapshot rpool/fs@snap
Kernel
Userland starts
snapshot ioctl
Assigned to open txg (100)
wait for it to Sync
Actual work
happens in
syncing context
ioctl returns with
results of operation
Open txg
Syncing txg
TXG 100 TXG 101 TXG 102
Background: Dependent Operations
zfs snapshot rpool/fs@snap
zfs set zfs:p=test rpool/fs@snap
Snapshot ioctl
Snapshot assigned to open
txg (100); wait for it to Sync
Set property ioctl
Open txg
Syncing txg
TXG 100 TXG 101 TXG 102
Set-prop assigned to open
txg (102); wait for it to Sync
● Syncing TXG takes longer when pool processing lots of writes
● Each one can take seconds
● Userland sees massive delay for each operation
Background: spa_sync Time
Assigned to open txg
wait for it to Sync
Open txg
Syncing txg
TXG 100 TXG 101 TXG 102
Userland starts
snapshot ioctl
6.01 seconds later,
ioctl returns with
results of operation
Wait 1 second to
start syncing
Writing dirty user data
takes 5 seconds
Processing ioctl
takes 0.01 second
Background: Dependent Operations
zfs snapshot rpool/fs@snap
zfs set zfs:p=test rpool/fs@snap
Snapshot ioctl
Snapshot assigned to open
txg (100); wait for it to Sync
Set property ioctl
Open txg
Syncing txg
TXG 100 TXG 101 TXG 102
Set-prop assigned to open
txg (102); wait for it to Sync
For 2 dependent operations: 10 seconds
Background: Atomicity
zfs snapshot rpool/fs@snap
zfs set zfs:p=test rpool/fs@snap
Snapshot ioctl Set property ioctl
Open txg
Syncing txg
TXG 100 TXG 101 TXG 102
Snapshot exists without property set
Background: Atomicity
Open txg
Syncing txg
TXG 100 TXG 101 TXG 102
What if someone creates a new clone?
zfs promote rpool/clone
zfs destroy rpool/fs
promote ioctl destroy ioctl
How an ioctl Evolves: Snapshots
1. Start simple:
● snapshot(“rpool/fs@snap”)
2. Need atomicity/speed for multiple snapshots:
● snapshot(“rpool/fs@snap”, “rpool/fs2@snap”, …)
● All or nothing: if any snapshot fails none are created
3. ‘zfs snapshot -r’ doesn’t work with “all or nothing”:
● If any snapshot fails with something other than ENOENT
none are created
4. Want to set properties while creating snapshots:
● snapshot(“rpool/fs@snap”, “rpool/fs2@snap”, …,
props={map})
Why not just have an ioctl for ‘zfs snapshot -r’?
Delphix Proprietary and Confidential
Production
Database
Server
Delphix Engine
Oracle RMAN
NFS
Fiber
Channel
non-production
Database
servers for
development,
test, reporting
Create Virtual Database
zfs create pool/vdb1
zfs create pool/vdb1/version1
zfs create pool/vdb1/version1/temp
zfs clone pool/prod_db/datafile@today pool/vdb/version1/datafile
zfs snapshot pool/prod_db/logs pool/prod_db/logs@tempsnap
zfs clone pool/prod_db/logs@tempsnap pool/vdb/version1/logs
zfs destroy -d pool/prod_db/logs@tempsnap
zfs set com.delphix:user_prop=bla_bla pool/vdb/version1
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version1/datafile
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version1/temp
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version1/logs
Refresh Virtual Database
zfs create pool/vdb1/version2
zfs create pool/vdb1/version2/temp
zfs clone pool/prod_db/datafile@today pool/vdb/version2/datafile
zfs snapshot pool/prod_db/logs pool/prod_db/logs@tempsnap
zfs clone pool/prod_db/logs@tempsnap pool/vdb/version2/logs
zfs destroy -d pool/prod_db/logs@tempsnap
zfs set com.delphix:user_prop=bla_bla pool/vdb/version2
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/datafile
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/temp
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/logs
10+ TXG’s (>2 minutes)
. . .
zfs destroy -r pool/vdb1/version1
5+ TXG’s (>1 minute)
Streamline the user/kernel API: Channel Programs
● Core operations are not changing frequently:
● snapshot(“rpool/fs@onesnap”)
● create(“rpool/onefs”)
● destroy(“rpool/onefs”, defer=true/false)
● Stop creating a new ioctl for every possible combination of
core operations
● Have syncing context interpret “channel programs” that
describe what combination of operations to perform, how to
do iteration, and how to deal with errors
Channel Programs: use in Delphix product
● Create snap + clone
● Destroy multiple fs & snaps
● Snapshot MDS + list datasets & props
Refresh Virtual Database
zfs create pool/vdb1/version2
zfs create pool/vdb1/version2/temp
zfs clone pool/prod_db/datafile@today pool/vdb/version2/datafile
zfs snapshot pool/prod_db/logs pool/prod_db/logs@tempsnap
zfs clone pool/prod_db/logs@tempsnap pool/vdb/version2/logs
zfs destroy -d pool/prod_db/logs@tempsnap
zfs set com.delphix:user_prop=bla_bla pool/vdb/version2
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/datafile
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/temp
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/logs
10+ TXG’s (>2 minutes)
. . .
zfs destroy -r pool/vdb1/version1
5+ TXG’s (>1 minute)
Channel Programs: Clone filesystem
snap = input.fsname .. “@tempsnap”
err = zfs.sync.snapshot(snap)
if err ~= 0 then
return err
done
err = zfs.sync.clone(snap, input.clonename)
zfs.sync.destroy(snap, defer=true)
return err
● Clones the current state of a filesystem, creating a new
snapshot that is deferred-destroyed in the same transaction
Open txg
Syncing txg
TXG 100 TXG 101 TXG 102
program ioctl
Channel Programs: Clone filesystem
zfs program pool clone_fs.zcp 
pool/prod_db/logs pool/vdb/version1/logs
Create snapshot, clone, defer destroy in one TXG
● Safety
● Must be root (for now)
● Memory limit
● Instruction count limit
● Works best for programmatic consumers
● Not every ZFS operation fits this model
● e.g. zpool add, zfs send/receive
Refresh Virtual Database
zfs create pool/vdb1/version2
zfs create pool/vdb1/version2/temp
zfs clone pool/prod_db/datafile@today pool/vdb/version2/datafile
zfs snapshot pool/prod_db/logs pool/prod_db/logs@tempsnap
zfs clone pool/prod_db/logs@tempsnap pool/vdb/version2/logs
zfs destroy -d pool/prod_db/logs@tempsnap
zfs set com.delphix:user_prop=bla_bla pool/vdb/version2
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/datafile
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/temp
zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/logs
10+ TXG’s (>2 minutes)
. . .
zfs destroy -r pool/vdb1/version1
5+ TXG’s (>1 minute)
Channel Programs: destroy -r
function destroy_recursive(root)
for child in zfs.list.children(root) do
destroy_recursive(child)
end
for snap in zfs.list.snapshots(root) do
zfs.sync.destroy(snap)
end
zfs.sync.destroy(root)
end
destroy_recursive(args['fs'])
Channel Programs: destroy -r (w/error handling)
function destroy_recursive(root)
for child in zfs.list.children(root) do
destroy_recursive(child)
end
for snap in zfs.list.snapshots(root) do
err = zfs.sync.destroy(snap)
if (err ~= 0) then
zfs.debug('snap destroy failed: ' .. snap .. ' with error ' .. err)
assert(false)
end
end
err = zfs.sync.destroy(root)
if (err ~= 0) then
zfs.debug('fs destroy failed: ' .. root .. ' with error ' .. err)
assert(false)
end
end
args = ...
destroy_recursive(args['fs'])
Upgrade validation
filesystems/snaps in pool must match internal database
zfs list -t all -o name,used,origin,...
zfs snapshot pool/mds@upgrade_test
But there are concurrent zfs snapshot, zfs destroy...
Problems:
● List of filesystems/snapshots is not self-consistent
● Snapshot of MDS is not consistent with list of fs/snap
● Check spuriously fails
Solution:
● Add locking to application to prohibit concurrent zfs ops?
Channel Programs: snapshot, list fs, get props
args = ...
pool = args['pool']
snapName = args['snapName']
properties = args['properties']
datasets = {}
function get_all(root)
local snapshots = {}
local child_datasets = {}
datasets[root] = {}
datasets[root].properties = get_props(root)
for snap in zfs.list.snapshots(root) do
table.insert(snapshots, snap)
datasets[snap] = {}
datasets[snap].properties = get_props(snap)
end
for child in zfs.list.children(root) do
table.insert(child_datasets, child)
get_all(child)
end
datasets[root].snapshots = snapshots
datasets[root].children = child_datasets
end
function get_props(ds)
local property_table={}
for _, property in ipairs(properties) do
property_val = zfs.get_prop(ds, property)
property_table[property] = property_val
end
return property_table
end
get_all(pool)
if snapName ~= nil and snapName ~= '' then
err = zfs.sync.snapshot(snapName)
if (err ~= 0) then
zfs.debug('fs snapshot failed: ' .. snapName .. ' w
assert(false)
end
end
return datasets
zfs.list.snapshots(dataset)
zfs.list.clones(snapshot)
zfs.list.children(dataset)
zfs.sync.delete(<snap | fs>)
zfs.sync.promote(dataset)
zfs.sync.rollback(dataset)
zfs.sync.snapshot(snapshot)
zfs.get_prop(dataset, property)
Next up:
Integrate Pull Request!
Create filesystem / volume / clone
Set properties
News
● OpenZFS Pull Requests now tested on:
● Donations accepted via &
October 24-25th (Tues-Wed)
San Francisco
Talks; Hackathon
http://open-zfs.org/
Submit talks by September 4th
Registration now open!
Sponsorship opportunities
Thanks to early-bird sponsors:

More Related Content

What's hot

Troubleshooting containerized triple o deployment
Troubleshooting containerized triple o deploymentTroubleshooting containerized triple o deployment
Troubleshooting containerized triple o deploymentSadique Puthen
 
An Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersAn Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersKento Aoyama
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingScyllaDB
 
Stackless Python In Eve
Stackless Python In EveStackless Python In Eve
Stackless Python In Evel xf
 
DCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveDCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveMadhu Venugopal
 
Performance: Observe and Tune
Performance: Observe and TunePerformance: Observe and Tune
Performance: Observe and TunePaul V. Novarese
 
Spying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profitSpying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profitAndrea Righi
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversAngelo Failla
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugginglibfetion
 
SREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at FacebookSREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at FacebookAngelo Failla
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecturehugo lu
 
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructureDevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructureAngelo Failla
 
Kernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel TLV
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudAndrea Righi
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machineAlexei Starovoitov
 
Linux network namespaces
Linux network namespacesLinux network namespaces
Linux network namespacesMike Wilson
 
Testing Wi-Fi with OSS Tools
Testing Wi-Fi with OSS ToolsTesting Wi-Fi with OSS Tools
Testing Wi-Fi with OSS ToolsAll Things Open
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsBrendan Gregg
 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet FiltersKernel TLV
 

What's hot (20)

Troubleshooting containerized triple o deployment
Troubleshooting containerized triple o deploymentTroubleshooting containerized triple o deployment
Troubleshooting containerized triple o deployment
 
An Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersAn Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux Containers
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using Tracing
 
Docker.io
Docker.ioDocker.io
Docker.io
 
Stackless Python In Eve
Stackless Python In EveStackless Python In Eve
Stackless Python In Eve
 
DCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveDCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep dive
 
Performance: Observe and Tune
Performance: Observe and TunePerformance: Observe and Tune
Performance: Observe and Tune
 
Spying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profitSpying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profit
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp servers
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
 
SREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at FacebookSREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at Facebook
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecture
 
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructureDevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
 
Kernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel Proc Connector and Containers
Kernel Proc Connector and Containers
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloud
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
Linux network namespaces
Linux network namespacesLinux network namespaces
Linux network namespaces
 
Testing Wi-Fi with OSS Tools
Testing Wi-Fi with OSS ToolsTesting Wi-Fi with OSS Tools
Testing Wi-Fi with OSS Tools
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance Results
 
Berkeley Packet Filters
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
 

Similar to Improving the ZFS Userland-Kernel API with Channel Programs - BSDCAN 2017 - Matt Ahrens

Performance Profiling in Rust
Performance Profiling in RustPerformance Profiling in Rust
Performance Profiling in RustInfluxData
 
Ceph Day Melbourne - Troubleshooting Ceph
Ceph Day Melbourne - Troubleshooting Ceph Ceph Day Melbourne - Troubleshooting Ceph
Ceph Day Melbourne - Troubleshooting Ceph Ceph Community
 
Slackware Demystified [SELF 2011]
Slackware Demystified [SELF 2011]Slackware Demystified [SELF 2011]
Slackware Demystified [SELF 2011]Vincent Batts
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internalsKostas Tzoumas
 
Debugging node in prod
Debugging node in prodDebugging node in prod
Debugging node in prodYunong Xiao
 
OpenStack Tokyo Meeup - Gluster Storage Day
OpenStack Tokyo Meeup - Gluster Storage DayOpenStack Tokyo Meeup - Gluster Storage Day
OpenStack Tokyo Meeup - Gluster Storage DayDan Radez
 
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017Chris Fregly
 
Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learnedtcurdt
 
Docker and friends at Linux Days 2014 in Prague
Docker and friends at Linux Days 2014 in PragueDocker and friends at Linux Days 2014 in Prague
Docker and friends at Linux Days 2014 in Praguetomasbart
 
Advanced debugging  techniques in different environments
Advanced debugging  techniques in different environmentsAdvanced debugging  techniques in different environments
Advanced debugging  techniques in different environmentsAndrii Soldatenko
 
DevOps in PHP environment
DevOps in PHP environment DevOps in PHP environment
DevOps in PHP environment Evaldo Felipe
 
Simplest-Ownage-Human-Observed… - Routers
 Simplest-Ownage-Human-Observed… - Routers Simplest-Ownage-Human-Observed… - Routers
Simplest-Ownage-Human-Observed… - RoutersLogicaltrust pl
 
Filip palian mateuszkocielski. simplest ownage human observed… routers
Filip palian mateuszkocielski. simplest ownage human observed… routersFilip palian mateuszkocielski. simplest ownage human observed… routers
Filip palian mateuszkocielski. simplest ownage human observed… routersYury Chemerkin
 
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...DataWorks Summit
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...Chris Fregly
 
Testing Delphix: easy data virtualization
Testing Delphix: easy data virtualizationTesting Delphix: easy data virtualization
Testing Delphix: easy data virtualizationFranck Pachot
 
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Ontico
 

Similar to Improving the ZFS Userland-Kernel API with Channel Programs - BSDCAN 2017 - Matt Ahrens (20)

Performance Profiling in Rust
Performance Profiling in RustPerformance Profiling in Rust
Performance Profiling in Rust
 
Go Replicator
Go ReplicatorGo Replicator
Go Replicator
 
Ceph Day Melbourne - Troubleshooting Ceph
Ceph Day Melbourne - Troubleshooting Ceph Ceph Day Melbourne - Troubleshooting Ceph
Ceph Day Melbourne - Troubleshooting Ceph
 
Slackware Demystified [SELF 2011]
Slackware Demystified [SELF 2011]Slackware Demystified [SELF 2011]
Slackware Demystified [SELF 2011]
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
 
ZFS Talk Part 1
ZFS Talk Part 1ZFS Talk Part 1
ZFS Talk Part 1
 
Debugging node in prod
Debugging node in prodDebugging node in prod
Debugging node in prod
 
OpenStack Tokyo Meeup - Gluster Storage Day
OpenStack Tokyo Meeup - Gluster Storage DayOpenStack Tokyo Meeup - Gluster Storage Day
OpenStack Tokyo Meeup - Gluster Storage Day
 
Backups
BackupsBackups
Backups
 
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
High Performance Distributed TensorFlow with GPUs - NYC Workshop - July 9 2017
 
Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learned
 
Docker and friends at Linux Days 2014 in Prague
Docker and friends at Linux Days 2014 in PragueDocker and friends at Linux Days 2014 in Prague
Docker and friends at Linux Days 2014 in Prague
 
Advanced debugging  techniques in different environments
Advanced debugging  techniques in different environmentsAdvanced debugging  techniques in different environments
Advanced debugging  techniques in different environments
 
DevOps in PHP environment
DevOps in PHP environment DevOps in PHP environment
DevOps in PHP environment
 
Simplest-Ownage-Human-Observed… - Routers
 Simplest-Ownage-Human-Observed… - Routers Simplest-Ownage-Human-Observed… - Routers
Simplest-Ownage-Human-Observed… - Routers
 
Filip palian mateuszkocielski. simplest ownage human observed… routers
Filip palian mateuszkocielski. simplest ownage human observed… routersFilip palian mateuszkocielski. simplest ownage human observed… routers
Filip palian mateuszkocielski. simplest ownage human observed… routers
 
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...
Optimizing, profiling and deploying high performance Spark ML and TensorFlow ...
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
 
Testing Delphix: easy data virtualization
Testing Delphix: easy data virtualizationTesting Delphix: easy data virtualization
Testing Delphix: easy data virtualization
 
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
 

More from Matthew Ahrens

OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt Ahrens
OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt AhrensOpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt Ahrens
OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt AhrensMatthew Ahrens
 
OpenZFS at AsiaBSDcon FreeBSD Developer Summit
OpenZFS at AsiaBSDcon FreeBSD Developer SummitOpenZFS at AsiaBSDcon FreeBSD Developer Summit
OpenZFS at AsiaBSDcon FreeBSD Developer SummitMatthew Ahrens
 
OpenZFS Channel programs
OpenZFS Channel programsOpenZFS Channel programs
OpenZFS Channel programsMatthew Ahrens
 
OpenZFS code repository
OpenZFS code repositoryOpenZFS code repository
OpenZFS code repositoryMatthew Ahrens
 
OpenZFS Developer Summit Introduction
OpenZFS Developer Summit IntroductionOpenZFS Developer Summit Introduction
OpenZFS Developer Summit IntroductionMatthew Ahrens
 

More from Matthew Ahrens (11)

OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt Ahrens
OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt AhrensOpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt Ahrens
OpenZFS novel algorithms: snapshots, space allocation, RAID-Z - Matt Ahrens
 
Experimental dtrace
Experimental dtraceExperimental dtrace
Experimental dtrace
 
OpenZFS dotScale
OpenZFS dotScaleOpenZFS dotScale
OpenZFS dotScale
 
OpenZFS at AsiaBSDcon FreeBSD Developer Summit
OpenZFS at AsiaBSDcon FreeBSD Developer SummitOpenZFS at AsiaBSDcon FreeBSD Developer Summit
OpenZFS at AsiaBSDcon FreeBSD Developer Summit
 
OpenZFS - BSDcan 2014
OpenZFS - BSDcan 2014OpenZFS - BSDcan 2014
OpenZFS - BSDcan 2014
 
OpenZFS - AsiaBSDcon
OpenZFS - AsiaBSDconOpenZFS - AsiaBSDcon
OpenZFS - AsiaBSDcon
 
OpenZFS Channel programs
OpenZFS Channel programsOpenZFS Channel programs
OpenZFS Channel programs
 
OpenZFS code repository
OpenZFS code repositoryOpenZFS code repository
OpenZFS code repository
 
OpenZFS Developer Summit Introduction
OpenZFS Developer Summit IntroductionOpenZFS Developer Summit Introduction
OpenZFS Developer Summit Introduction
 
OpenZFS at EuroBSDcon
OpenZFS at EuroBSDconOpenZFS at EuroBSDcon
OpenZFS at EuroBSDcon
 
OpenZFS at LinuxCon
OpenZFS at LinuxConOpenZFS at LinuxCon
OpenZFS at LinuxCon
 

Recently uploaded

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 

Recently uploaded (20)

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 

Improving the ZFS Userland-Kernel API with Channel Programs - BSDCAN 2017 - Matt Ahrens

  • 1. Private and confidential© 2014 Delphix. All Rights Reserved Private and confidential© 2014 Delphix. All Rights Reserved Improving the ZFS Userland-Kernel API: Channel Programs Matt Ahrens Principal Engineer Delphix
  • 2. ● Problems ● Performance ● Atomicity ● API sprawl ● Use case - Delphix database virtualization ● Solution: Channel Programs ● Real-world use cases ● Future work ● Exciting OpenZFS announcements!
  • 3. Background: ZFS Administrative Operations zfs snapshot rpool/fs@snap Kernel Userland starts snapshot ioctl Assigned to open txg (100) wait for it to Sync Actual work happens in syncing context ioctl returns with results of operation Open txg Syncing txg TXG 100 TXG 101 TXG 102
  • 4. Background: Dependent Operations zfs snapshot rpool/fs@snap zfs set zfs:p=test rpool/fs@snap Snapshot ioctl Snapshot assigned to open txg (100); wait for it to Sync Set property ioctl Open txg Syncing txg TXG 100 TXG 101 TXG 102 Set-prop assigned to open txg (102); wait for it to Sync
  • 5. ● Syncing TXG takes longer when pool processing lots of writes ● Each one can take seconds ● Userland sees massive delay for each operation Background: spa_sync Time Assigned to open txg wait for it to Sync Open txg Syncing txg TXG 100 TXG 101 TXG 102 Userland starts snapshot ioctl 6.01 seconds later, ioctl returns with results of operation Wait 1 second to start syncing Writing dirty user data takes 5 seconds Processing ioctl takes 0.01 second
  • 6. Background: Dependent Operations zfs snapshot rpool/fs@snap zfs set zfs:p=test rpool/fs@snap Snapshot ioctl Snapshot assigned to open txg (100); wait for it to Sync Set property ioctl Open txg Syncing txg TXG 100 TXG 101 TXG 102 Set-prop assigned to open txg (102); wait for it to Sync For 2 dependent operations: 10 seconds
  • 7. Background: Atomicity zfs snapshot rpool/fs@snap zfs set zfs:p=test rpool/fs@snap Snapshot ioctl Set property ioctl Open txg Syncing txg TXG 100 TXG 101 TXG 102 Snapshot exists without property set
  • 8. Background: Atomicity Open txg Syncing txg TXG 100 TXG 101 TXG 102 What if someone creates a new clone? zfs promote rpool/clone zfs destroy rpool/fs promote ioctl destroy ioctl
  • 9. How an ioctl Evolves: Snapshots 1. Start simple: ● snapshot(“rpool/fs@snap”) 2. Need atomicity/speed for multiple snapshots: ● snapshot(“rpool/fs@snap”, “rpool/fs2@snap”, …) ● All or nothing: if any snapshot fails none are created 3. ‘zfs snapshot -r’ doesn’t work with “all or nothing”: ● If any snapshot fails with something other than ENOENT none are created 4. Want to set properties while creating snapshots: ● snapshot(“rpool/fs@snap”, “rpool/fs2@snap”, …, props={map}) Why not just have an ioctl for ‘zfs snapshot -r’?
  • 10. Delphix Proprietary and Confidential Production Database Server Delphix Engine Oracle RMAN NFS Fiber Channel non-production Database servers for development, test, reporting
  • 11. Create Virtual Database zfs create pool/vdb1 zfs create pool/vdb1/version1 zfs create pool/vdb1/version1/temp zfs clone pool/prod_db/datafile@today pool/vdb/version1/datafile zfs snapshot pool/prod_db/logs pool/prod_db/logs@tempsnap zfs clone pool/prod_db/logs@tempsnap pool/vdb/version1/logs zfs destroy -d pool/prod_db/logs@tempsnap zfs set com.delphix:user_prop=bla_bla pool/vdb/version1 zfs set sharenfs=rw=10.0.4.123 pool/vdb/version1/datafile zfs set sharenfs=rw=10.0.4.123 pool/vdb/version1/temp zfs set sharenfs=rw=10.0.4.123 pool/vdb/version1/logs
  • 12. Refresh Virtual Database zfs create pool/vdb1/version2 zfs create pool/vdb1/version2/temp zfs clone pool/prod_db/datafile@today pool/vdb/version2/datafile zfs snapshot pool/prod_db/logs pool/prod_db/logs@tempsnap zfs clone pool/prod_db/logs@tempsnap pool/vdb/version2/logs zfs destroy -d pool/prod_db/logs@tempsnap zfs set com.delphix:user_prop=bla_bla pool/vdb/version2 zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/datafile zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/temp zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/logs 10+ TXG’s (>2 minutes) . . . zfs destroy -r pool/vdb1/version1 5+ TXG’s (>1 minute)
  • 13. Streamline the user/kernel API: Channel Programs ● Core operations are not changing frequently: ● snapshot(“rpool/fs@onesnap”) ● create(“rpool/onefs”) ● destroy(“rpool/onefs”, defer=true/false) ● Stop creating a new ioctl for every possible combination of core operations ● Have syncing context interpret “channel programs” that describe what combination of operations to perform, how to do iteration, and how to deal with errors
  • 14. Channel Programs: use in Delphix product ● Create snap + clone ● Destroy multiple fs & snaps ● Snapshot MDS + list datasets & props
  • 15. Refresh Virtual Database zfs create pool/vdb1/version2 zfs create pool/vdb1/version2/temp zfs clone pool/prod_db/datafile@today pool/vdb/version2/datafile zfs snapshot pool/prod_db/logs pool/prod_db/logs@tempsnap zfs clone pool/prod_db/logs@tempsnap pool/vdb/version2/logs zfs destroy -d pool/prod_db/logs@tempsnap zfs set com.delphix:user_prop=bla_bla pool/vdb/version2 zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/datafile zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/temp zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/logs 10+ TXG’s (>2 minutes) . . . zfs destroy -r pool/vdb1/version1 5+ TXG’s (>1 minute)
  • 16. Channel Programs: Clone filesystem snap = input.fsname .. “@tempsnap” err = zfs.sync.snapshot(snap) if err ~= 0 then return err done err = zfs.sync.clone(snap, input.clonename) zfs.sync.destroy(snap, defer=true) return err ● Clones the current state of a filesystem, creating a new snapshot that is deferred-destroyed in the same transaction
  • 17. Open txg Syncing txg TXG 100 TXG 101 TXG 102 program ioctl Channel Programs: Clone filesystem zfs program pool clone_fs.zcp pool/prod_db/logs pool/vdb/version1/logs Create snapshot, clone, defer destroy in one TXG
  • 18. ● Safety ● Must be root (for now) ● Memory limit ● Instruction count limit ● Works best for programmatic consumers ● Not every ZFS operation fits this model ● e.g. zpool add, zfs send/receive
  • 19. Refresh Virtual Database zfs create pool/vdb1/version2 zfs create pool/vdb1/version2/temp zfs clone pool/prod_db/datafile@today pool/vdb/version2/datafile zfs snapshot pool/prod_db/logs pool/prod_db/logs@tempsnap zfs clone pool/prod_db/logs@tempsnap pool/vdb/version2/logs zfs destroy -d pool/prod_db/logs@tempsnap zfs set com.delphix:user_prop=bla_bla pool/vdb/version2 zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/datafile zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/temp zfs set sharenfs=rw=10.0.4.123 pool/vdb/version2/logs 10+ TXG’s (>2 minutes) . . . zfs destroy -r pool/vdb1/version1 5+ TXG’s (>1 minute)
  • 20. Channel Programs: destroy -r function destroy_recursive(root) for child in zfs.list.children(root) do destroy_recursive(child) end for snap in zfs.list.snapshots(root) do zfs.sync.destroy(snap) end zfs.sync.destroy(root) end destroy_recursive(args['fs'])
  • 21. Channel Programs: destroy -r (w/error handling) function destroy_recursive(root) for child in zfs.list.children(root) do destroy_recursive(child) end for snap in zfs.list.snapshots(root) do err = zfs.sync.destroy(snap) if (err ~= 0) then zfs.debug('snap destroy failed: ' .. snap .. ' with error ' .. err) assert(false) end end err = zfs.sync.destroy(root) if (err ~= 0) then zfs.debug('fs destroy failed: ' .. root .. ' with error ' .. err) assert(false) end end args = ... destroy_recursive(args['fs'])
  • 22. Upgrade validation filesystems/snaps in pool must match internal database zfs list -t all -o name,used,origin,... zfs snapshot pool/mds@upgrade_test But there are concurrent zfs snapshot, zfs destroy... Problems: ● List of filesystems/snapshots is not self-consistent ● Snapshot of MDS is not consistent with list of fs/snap ● Check spuriously fails Solution: ● Add locking to application to prohibit concurrent zfs ops?
  • 23. Channel Programs: snapshot, list fs, get props args = ... pool = args['pool'] snapName = args['snapName'] properties = args['properties'] datasets = {} function get_all(root) local snapshots = {} local child_datasets = {} datasets[root] = {} datasets[root].properties = get_props(root) for snap in zfs.list.snapshots(root) do table.insert(snapshots, snap) datasets[snap] = {} datasets[snap].properties = get_props(snap) end for child in zfs.list.children(root) do table.insert(child_datasets, child) get_all(child) end datasets[root].snapshots = snapshots datasets[root].children = child_datasets end function get_props(ds) local property_table={} for _, property in ipairs(properties) do property_val = zfs.get_prop(ds, property) property_table[property] = property_val end return property_table end get_all(pool) if snapName ~= nil and snapName ~= '' then err = zfs.sync.snapshot(snapName) if (err ~= 0) then zfs.debug('fs snapshot failed: ' .. snapName .. ' w assert(false) end end return datasets
  • 25. News ● OpenZFS Pull Requests now tested on: ● Donations accepted via &
  • 26. October 24-25th (Tues-Wed) San Francisco Talks; Hackathon http://open-zfs.org/ Submit talks by September 4th Registration now open! Sponsorship opportunities Thanks to early-bird sponsors: