IEEE eScience 2017
1. sciunits: Reusable Research Objects
School of Computing, College of Computing and Digital Media
Dai Hai Ton That, Gabe Fils, Zhihao Yuan, Tanu Malik
Presented by Gabe Fils
2. Problem Space
No easily creatable, readily reusable, efficiently versioned, discrete unit of computation exists.
[Figure: diagram relating Virtualization, Distributed Version Control, and Research Object, labeled portable, self-contained, repeatable, collaborative, versioned, documented]
3. Introducing: The sciunit
• Captures application executions
• Repeats executions
• Reproduces executions, changing input args
• Versioned executions stored as one sciunit
• Uses provenance for self-documentation
A reusable research object.
4. Sample Applications: FIE, VIC
FIE Application Workflow
• City of Chicago Food Inspections Evaluation Model
  • Four applications
  • Two languages
  • 130 files
  • 1580 dependencies
  • 908 MB
  https://github.com/chicago/food-inspections-evaluation
• Variable Infiltration Capacity
  • Four applications
  • Five languages
  • 7 GB
  https://vic.readthedocs.io
5. sciunit Architecture
https://bitbucket.org/geotrust/sciunit-cli
sciunit client
• Linux CLI App
• C / Python
• Minimal Dependencies
sciunit server
• Stores sciunits
• Visualizes sciunits
sciunit
• Lightweight
• Versioned
• Self-contained
6. Client package Command
Initialize or retrieve a sciunit:
Run FIE, capture into package:
7. Packaging Details
Package In Storage (133 MB):
2nd Version Of Package (133 MB):
FIE pkg-root Dir
1) Attach to process
2) Intercept system calls
3) Copy files / executables
4) Log system calls
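Steps 3 and 4 can be illustrated without a tracer. The sketch below assumes the list of paths touched by the application is already known (on the slide it comes from intercepted system calls). The function `copy_into_package` and the shape of its log are hypothetical names chosen for illustration, not the sciunit implementation.

```python
import os
import shutil
import tempfile


def copy_into_package(accessed_paths, pkg_root):
    """Copy each accessed file under pkg_root, mirroring its absolute path.

    Returns a list of (original_path, packaged_path) pairs that stands in
    for the system-call log. Paths that are not regular files are skipped.
    """
    log = []
    for path in accessed_paths:
        src = os.path.abspath(path)
        if not os.path.isfile(src):
            continue
        # Drop the drive (if any) and leading separator so the path nests under pkg_root.
        relative = os.path.splitdrive(src)[1].lstrip(os.sep)
        dst = os.path.join(pkg_root, relative)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)
        log.append((src, dst))
    return log


# Demo on a throwaway directory (hypothetical file names).
work = tempfile.mkdtemp()
data_file = os.path.join(work, "input.csv")
with open(data_file, "w") as fh:
    fh.write("a,b\n1,2\n")
pkg_root = os.path.join(work, "pkgroot")
package_log = copy_into_package(
    [data_file, os.path.join(work, "missing.txt")], pkg_root
)
```

Mirroring the absolute path under the package root keeps every dependency at a predictable location, which is what makes later path redirection straightforward.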
8. Client repeat Command
Repeat a package:
1) Attach to process
2) Replace system call args
original call
61021 exec “/usr/bin/python”
replaced call
61021 exec “/home/user1/pkgroot/usr/bin/python”
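The argument replacement in the example is, at its core, a path mapping. A minimal sketch follows, assuming a logged call is a `(pid, syscall, path)` tuple; `redirect_path` and `rewrite_exec` are hypothetical helper names, and real interception would happen inside the tracer rather than in Python.

```python
import posixpath


def redirect_path(path, pkg_root):
    """Map an absolute path to the same location under pkg_root."""
    if not path.startswith("/"):
        return path  # relative paths are left untouched in this sketch
    if path == pkg_root or path.startswith(pkg_root.rstrip("/") + "/"):
        return path  # already inside the package
    return posixpath.join(pkg_root, path.lstrip("/"))


def rewrite_exec(call, pkg_root):
    """Rewrite the path argument of a (pid, syscall, path) record."""
    pid, name, path = call
    return (pid, name, redirect_path(path, pkg_root))


original = (61021, "exec", "/usr/bin/python")
replaced = rewrite_exec(original, "/home/user1/pkgroot")
print(replaced)
```

The guard for paths already under the package root keeps the mapping idempotent, so a call that was rewritten once is not nested a second time.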
9. Versioning Solution
80G File, Fixed 4K Chunks:
Same File, 1 Byte Inserted At Start:
• When to store?
  • During packaging
  • After packaging
• How to store?
  • Line-based diffs
  • Fixed-size chunks
  • Content-defined
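The problem with fixed-size chunks can be reproduced at small scale. This sketch uses a 256 KiB pseudo-random buffer as a stand-in for the 80G file and SHA-256 as the chunk identifier; both choices are assumptions made for illustration.

```python
import hashlib
import random


def fixed_chunks(data, size=4096):
    """Split data into consecutive fixed-size chunks and hash each one."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(64 * 4096))
shifted = b"\x00" + original  # same content, 1 byte inserted at start

before = fixed_chunks(original)
after = fixed_chunks(shifted)
shared = len(set(before) & set(after))
print(f"{len(before)} chunks before, {shared} reusable after the insert")
```

Because every chunk boundary moves by one byte, no chunk of the new version matches a stored one, so the whole file would be stored again.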
10. Rabin Hash
• Hash of subset of file bytes (RH(B_1, B_2, …, B_n))
• Fixed-size sliding window n
• Hash at any position i (RH(X_(i,n)))
• Deduplicate chunk
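A minimal sketch of content-defined chunking with a sliding-window hash follows. It uses a simple polynomial rolling hash as a stand-in for a true Rabin fingerprint, and the window size, base, modulus, and boundary mask are all assumed values, not parameters taken from sciunit.

```python
import hashlib
import random

WINDOW = 48                       # sliding window size n (assumed)
BASE = 257                        # polynomial base (assumed)
MOD = (1 << 31) - 1               # prime modulus (assumed)
MASK = (1 << 11) - 1              # cut when the low 11 bits are zero
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def cdc_chunks(data):
    """Cut data wherever the hash of the last WINDOW bytes matches the mask."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # add the newest byte
        if i + 1 - start >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def digests(chunks):
    """Identify each chunk by its SHA-256 digest."""
    return {hashlib.sha256(c).hexdigest() for c in chunks}


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(256 * 1024))
shifted = b"\x00" + original  # same content, 1 byte inserted at start

chunks_a = cdc_chunks(original)
chunks_b = cdc_chunks(shifted)
shared = len(digests(chunks_a) & digests(chunks_b))
print(f"{shared} of {len(chunks_a)} chunks unchanged after a 1-byte insert")
```

Since a boundary depends only on the bytes inside the window, the cut points move together with the content, and chunks after the insertion point keep their digests and can be deduplicated.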
11. Storage And Retrieval
Deduplicated Container Storage
Store package:
1) Archive package-root
2) CDC on archive
3) Store manifest
Retrieve package:
1) Retrieve manifest
2) Concatenate chunks
3) Extract archive
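The store and retrieve steps can be sketched as a round trip. This is an illustration under stated assumptions: the archive is a tar file, the chunk store is an in-memory dictionary keyed by SHA-256 digest, and a fixed-size chunker stands in for CDC to keep the example short. None of these names come from the sciunit code base.

```python
import hashlib
import io
import os
import tarfile
import tempfile

chunk_store = {}  # digest -> chunk bytes, shared by all stored versions


def chunk(data, size=4096):
    """Stand-in chunker (fixed-size); the slide applies CDC at this step."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def store_package(pkg_root):
    """Archive pkg_root, chunk the archive, and return the manifest."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(os.listdir(pkg_root)):
            tar.add(os.path.join(pkg_root, name), arcname=name)
    manifest = []
    for piece in chunk(buf.getvalue()):
        digest = hashlib.sha256(piece).hexdigest()
        chunk_store.setdefault(digest, piece)  # a repeated chunk is kept once
        manifest.append(digest)
    return manifest


def retrieve_package(manifest, dest):
    """Concatenate the chunks listed in the manifest and extract the archive."""
    archive = b"".join(chunk_store[d] for d in manifest)
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        tar.extractall(dest)


# Demo with a throwaway package root (hypothetical file name).
src = tempfile.mkdtemp()
with open(os.path.join(src, "result.txt"), "w") as fh:
    fh.write("42\n")
manifest = store_package(src)
dest = tempfile.mkdtemp()
retrieve_package(manifest, dest)
with open(os.path.join(dest, "result.txt")) as fh:
    restored = fh.read()
```

The manifest is the only per-version data: it is an ordered list of digests, so a second version that shares most chunks with the first adds little to the store.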
12. Detailed Visualization
Part Of A Normal (Verbose) Provenance Log
Small Section Of Graph Built From Normal Provenance Log
13. Summarization: Group By Similarity
• Group vertices by type / connections
• Effect: group subprocesses, group files in directory
Similarity Rule
Type(u) = Type(v), Input(u) = Input(v), Output(u) = Output(v)
[Figures: Full Graph; Similarity Applied]
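The similarity rule can be sketched as a grouping key. The code assumes that Input(u) is the set of vertices with an edge into u and Output(u) is the set of vertices u has an edge to; the graph encoding and the file and process names are hypothetical.

```python
from collections import defaultdict


def group_by_similarity(types, edges):
    """Group vertices that share type, input set, and output set.

    types: dict mapping vertex -> "file" or "process"
    edges: iterable of (source, target) pairs
    """
    inputs, outputs = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        outputs[src].add(dst)
        inputs[dst].add(src)
    groups = defaultdict(list)
    for v in types:
        key = (types[v], frozenset(inputs[v]), frozenset(outputs[v]))
        groups[key].append(v)
    return sorted(sorted(g) for g in groups.values())


types = {"run.py": "process", "a.csv": "file", "b.csv": "file", "out.txt": "file"}
edges = [("a.csv", "run.py"), ("b.csv", "run.py"), ("run.py", "out.txt")]
groups = group_by_similarity(types, edges)
print(groups)
```

Here `a.csv` and `b.csv` fall into one group because both are files read only by `run.py`, which matches the stated effect of grouping files in a directory.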
14. Summarization: Pack
• Find min-connected nodes, pack into hubs
Packability Rules
1) Type(u) = file, { ∃!e | e ∈ E ∧ ( e=(u,v) ∨ e=(v,u) ) }
2) Type(u) = process, { ∃!e | e ∈ E ∧ e=(u,v) }
3) Type(u) = file, { ∃!(e_1,e_2) | ( ∃x ∈ V, v≠x ) ∧ ( e_1=(u,v) ∈ E, e_2=(x,u) ∈ E ) }
[Figures: Similarity Applied; Packability Applied]
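One possible reading of the packability rules is sketched below. It assumes rule 1 means a file with exactly one incident edge, rule 2 a process with exactly one outgoing edge, and rule 3 a file with exactly one outgoing and one incoming edge whose other endpoints differ. This interpretation and the example graph are assumptions, not the authors' code.

```python
def packable(u, types, edges):
    """Return the number of the first packability rule u satisfies, else None."""
    out_edges = [(s, t) for s, t in edges if s == u]
    in_edges = [(s, t) for s, t in edges if t == u]
    # Rule 1: a file touched by exactly one edge, in either direction.
    if types[u] == "file" and len(out_edges) + len(in_edges) == 1:
        return 1
    # Rule 2: a process with exactly one outgoing edge.
    if types[u] == "process" and len(out_edges) == 1:
        return 2
    # Rule 3: a file with one edge in and one edge out, to distinct neighbors.
    if (types[u] == "file" and len(out_edges) == 1 and len(in_edges) == 1
            and out_edges[0][1] != in_edges[0][0]):
        return 3
    return None


types = {"main": "process", "helper": "process",
         "cfg": "file", "tmp": "file", "log": "file"}
edges = [("cfg", "main"), ("main", "tmp"), ("tmp", "helper"),
         ("helper", "log"), ("main", "log")]
result = {v: packable(v, types, edges) for v in types}
print(result)
```

Vertices that satisfy a rule are the minimally connected ones; a summarizer could fold each of them into the neighbor it is attached to, which then acts as the hub.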
15. Summarization: Annotate
• Higher precedence to process nodes
• File with n > 1 edges → n annotations
[Figures: Packability Applied; Annotation Applied]
16. Package / Repeat Performance
• Added ptrace system calls
• I/O-intensive apps: VIC
• Non-I/O-intensive apps: FIE
Package/Repeat Runtimes
1) Run app normally
2) Run with package
3) Run with repeat
17. Versioning Performance
Commit/Reconstruct Times
Storage Sizes
Package/Repeat Runtimes
1) Size of several versions
2) Size after deduplication
3) CDC / concatenation time
18. Results From Provenance Summarization
Reduction Of Edges
Reduction Of File Nodes
Reduction Of Process Nodes
1) Full FIE graph
2) All techniques applied
3) Dynamic expansion
19. Conclusion And Current Work
• Graph summarization testing
• Database applications
• Exact partial repeatability
• Apps with network operations
• Parallel HPC applications
• Emerging reusable object formats
sciunit is a portable, self-contained, and inherently understandable versioned unit of computation.
20. Links And Acknowledgements
sciunit:
• https://sciunit.run
sciunit paper:
• https://arxiv.org
• Search for “sciunit”
National Science Foundation grants ICER-1639759, ICER-1661918, ICER-1440327, ICER-1343816

Sciunits: Resuable Research Object

  • 1. 1IEEE eScience 2017 sciunits:sciunits: Reusable Research ObjectsReusable Research Objects School of Computing, College of Computing and Digital MediaSchool of Computing, College of Computing and Digital Media Dai Hai Ton That, Gabe Fils,Dai Hai Ton That, Gabe Fils, Zhihao Yuan, Tanu MalikZhihao Yuan, Tanu Malik Presented by Gabe FilsPresented by Gabe Fils
  • 2. 2IEEE eScience 2017 Problem SpaceProblem Space No easily creatable, readily reusable, efficiently versioned,No easily creatable, readily reusable, efficiently versioned, discrete unit of computation existsdiscrete unit of computation exists Virtualization Distributed Version Control Research Object portable self-contained repeatable collaborative versioned documented
  • 3. 3IEEE eScience 2017 Introducing: TheIntroducing: The sciunitsciunit  Captures application executionsCaptures application executions  Repeats executionsRepeats executions  Reproduces executions, changing input argsReproduces executions, changing input args  Versioned executions stored as oneVersioned executions stored as one sciunitsciunit  Uses provenance for self-documentationUses provenance for self-documentation AA reusablereusable research object.research object.
  • 4. 4IEEE eScience 2017 Sample Applications: FIE, VICSample Applications: FIE, VIC FIE Application WorkflowFIE Application Workflow  City of Chicago Food InspectionsCity of Chicago Food Inspections Evaluation ModelEvaluation Model  Four applicationsFour applications  Two languagesTwo languages  130 files130 files  1580 dependencies1580 dependencies  908 MB908 MB https://github.com/chicago/food-inspections-https://github.com/chicago/food-inspections- evaluationevaluation  Variable Infiltration CapacityVariable Infiltration Capacity  Four applicationsFour applications  Five languagesFive languages  7 GB7 GB https://vic.readthedocs.iohttps://vic.readthedocs.io
  • 5. 5IEEE eScience 2017 sciunitsciunit ArchitectureArchitecture https://bitbucket.org/geotrust/sciunit-clihttps://bitbucket.org/geotrust/sciunit-cli sciunitsciunit clientclient sciunitsciunit serverserver sciunitsciunit  LightweightLightweight  VersionedVersioned  Self-containedSelf-contained  StoresStores sciunitssciunits  VisualizesVisualizes sciunitssciunits  Linux CLI AppLinux CLI App  C / PythonC / Python  Minimal DependenciesMinimal Dependencies
  • 6. 6IEEE eScience 2017 ClientClient packagepackage CommandCommand Initialize or retrieve aInitialize or retrieve a sciunitsciunit:: Run FIE, capture into package:Run FIE, capture into package:
  • 7. 7IEEE eScience 2017 Packaging DetailsPackaging Details Package In Storage (133 MB):Package In Storage (133 MB): 22ndnd Version Of Package (133 MB):Version Of Package (133 MB): FIE pkg-root DirFIE pkg-root Dir 1) Attach to process1) Attach to process 2) Intercept system calls2) Intercept system calls 3) Copy files / executables3) Copy files / executables 4) Log system calls4) Log system calls
  • 8. 8IEEE eScience 2017 ClientClient repeatrepeat CommandCommand original calloriginal call 61021 exec61021 exec “/usr/bin/python”“/usr/bin/python” replaced callreplaced call 61021 exec61021 exec “/home/user1/pkgroot/usr/bin/python”“/home/user1/pkgroot/usr/bin/python” Repeat a package:Repeat a package: 1) Attach to process1) Attach to process 2) Replace system call args2) Replace system call args
  • 9. 9IEEE eScience 2017 Versioning SolutionVersioning Solution 80G File, Fixed 4K Chunks:80G File, Fixed 4K Chunks: Same File, 1 Byte Inserted At Start:Same File, 1 Byte Inserted At Start:  When to store?When to store?  During packagingDuring packaging  After packagingAfter packaging  How to store?How to store?  Line-based diffsLine-based diffs  Fixed-size chunksFixed-size chunks  Content-definedContent-defined
  • 10. Rabin Hash. Hash of a subset of file bytes, RH(B_1, B_2, …, B_n); fixed-size sliding window n; hash at any position i, RH(X_(i,n)); deduplicate chunk.
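The sliding-window idea on slide 10 can be illustrated with a rolling hash that is updated in constant time as the window advances by one byte. This sketch uses a polynomial (Rabin-Karp style) hash modulo a prime as a stand-in; it is not a true Rabin fingerprint over GF(2), and the constants `BASE` and `MOD` are assumptions chosen for illustration.

```python
BASE = 257
MOD = (1 << 61) - 1  # a Mersenne prime, chosen for this sketch


def window_hash(window: bytes) -> int:
    """Hash a window from scratch: sum of b_k * BASE^(n-1-k) mod MOD."""
    h = 0
    for b in window:
        h = (h * BASE + b) % MOD
    return h


def rolling_hashes(data: bytes, n: int) -> list:
    """Return the hash of every n-byte window, updating in O(1) per step."""
    if len(data) < n:
        return []
    top = pow(BASE, n - 1, MOD)  # weight of the byte leaving the window
    h = window_hash(data[:n])
    hashes = [h]
    for i in range(n, len(data)):
        h = (h - data[i - n] * top) % MOD  # remove the outgoing byte
        h = (h * BASE + data[i]) % MOD     # shift and add the incoming byte
        hashes.append(h)
    return hashes


data = b"reusable research objects"
rolled = rolling_hashes(data, 8)
direct = [window_hash(data[i:i + 8]) for i in range(len(data) - 7)]
print(rolled == direct)
```

Because the hash at position i depends only on the n bytes in the window, the same content produces the same hash wherever it appears in the file, which is what makes content-defined chunk boundaries possible.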
  • 11. Storage And Retrieval (Deduplicated Container Storage). Store package: 1) archive package-root; 2) CDC on archive; 3) store manifest. Retrieve package: 1) retrieve manifest; 2) concatenate chunks; 3) extract archive.
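The store and retrieve steps on slide 11 can be sketched end to end with content-defined chunking (CDC) and a digest-keyed chunk store. Everything here is an assumption made for illustration: the archive is a plain byte string rather than a real archive of the package root, the boundary test uses a polynomial rolling hash rather than a Rabin fingerprint, and the window size, mask, and the names `cdc_chunks`, `store`, and `retrieve` are hypothetical.

```python
import hashlib
import random

BASE, MOD = 257, (1 << 61) - 1


def cdc_chunks(data: bytes, window: int = 48, mask: int = (1 << 10) - 1) -> list:
    """Cut a chunk wherever the hash of the last `window` bytes matches the mask."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, window - 1, MOD)
    for i, b in enumerate(data):
        if i >= window:
            h = (h - data[i - window] * top) % MOD  # drop the outgoing byte
        h = (h * BASE + b) % MOD
        # Require at least `window` bytes per chunk before allowing a cut.
        if i + 1 - start >= window and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def store(archive: bytes, chunk_store: dict) -> list:
    """Store unseen chunks by digest and return the manifest (ordered digests)."""
    manifest = []
    for chunk in cdc_chunks(archive):
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)  # deduplicate identical chunks
        manifest.append(digest)
    return manifest


def retrieve(manifest: list, chunk_store: dict) -> bytes:
    """Rebuild the archive by concatenating chunks in manifest order."""
    return b"".join(chunk_store[d] for d in manifest)


rng = random.Random(0)
v1 = bytes(rng.getrandbits(8) for _ in range(128 * 1024))
v2 = b"\x00" + v1  # second version: one byte inserted at the start

chunk_store = {}
m1 = store(v1, chunk_store)
m2 = store(v2, chunk_store)
print(len(m1), len(m2), len(chunk_store))
```

Because boundaries follow the content rather than fixed offsets, the inserted byte disturbs only the chunks near the start; the remaining chunks of the second version are already in the store and are referenced from its manifest rather than stored again.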
  • 12. Detailed Visualization. Figures: part of a normal (verbose) provenance log; small section of graph built from normal provenance log.
  • 13. Summarization: Group By Similarity. Group vertices by type / connections. Effect: group subprocesses, group files in directory. Similarity rule: Type(u) = Type(v), Input(u) = Input(v), Output(u) = Output(v). Figures: full graph; similarity applied.
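The similarity rule on slide 13 can be read as grouping vertices that share the same key (type, inputs, outputs). The sketch below assumes that Input(u) means the set of vertices with an edge into u and Output(u) the set of vertices u has an edge to; the graph, file names, and the function `group_by_similarity` are made up for illustration.

```python
from collections import defaultdict


def group_by_similarity(types: dict, edges: list) -> list:
    """Group vertices with equal type, equal input set, and equal output set."""
    inputs, outputs = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        outputs[src].add(dst)
        inputs[dst].add(src)

    groups = defaultdict(list)
    for v, vtype in types.items():
        key = (vtype, frozenset(inputs[v]), frozenset(outputs[v]))
        groups[key].append(v)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical provenance graph: one process reads two files, writes one.
types = {"run.py": "process", "a.csv": "file", "b.csv": "file", "out.rds": "file"}
edges = [("a.csv", "run.py"), ("b.csv", "run.py"), ("run.py", "out.rds")]

print(group_by_similarity(types, edges))
# → [['a.csv', 'b.csv'], ['out.rds'], ['run.py']]
```

The two input files collapse into one group because they have the same type and the same connections, which matches the stated effect of grouping files in a directory.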
  • 14. Summarization: Pack. Find min-connected nodes, pack into hubs. Packability rules: 1) Type(u) = file, { ∃!e | e ∈ E ∧ ( e=(u,v) ∨ e=(v,u) ) }; 2) Type(u) = process, { ∃!e | e ∈ E ∧ e=(u,v) }; 3) Type(u) = file, { ∃!(e_1,e_2) | ( ∃x ∈ V, v≠x ) ∧ ( e_1=(u,v) ∈ E, e_2=(x,u) ∈ E ) }. Figures: similarity applied; packability applied.
  • 15. Summarization: Annotate. Higher precedence to process nodes; file with n > 1 edges → n annotations. Figures: packability applied; annotation applied.
  • 16. Package / Repeat Performance. Added ptrace system calls; I/O-intensive apps: VIC; non-I/O-intensive apps: FIE. Package/Repeat Runtimes: 1) run app normally; 2) run with package; 3) run with repeat.
  • 17. Versioning Performance. Figures: Commit/Reconstruct Times; Storage Sizes. Package/Repeat Runtimes: 1) size of several versions; 2) size after deduplication; 3) CDC / concatenation time.
  • 18. Results From Provenance Summarization. Figures: Reduction Of Edges; Reduction Of File Nodes; Reduction Of Process Nodes. Compared: 1) full FIE graph; 2) all techniques applied; 3) dynamic expansion.
  • 19. Conclusion And Current Work. Graph summarization testing; database applications; exact partial repeatability; apps with network operations; parallel HPC applications; emerging reusable object formats. sciunit is a portable, self-contained, and inherently understandable versioned unit of computation.
  • 20. Links And Acknowledgements. sciunit: https://sciunit.run. sciunit paper: https://arxiv.org (search for "sciunit"). National Science Foundation grants ICER-1639759, ICER-1661918, ICER-1440327, ICER-1343816.