SlideShare una empresa de Scribd logo
1 de 26
Slinging Data: Data Loading and Cleanup in Evergreen Growing Evergreen Conference 22 April 2010
To migrate data … ,[object Object]
Whence ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
All over the map ,[object Object],[object Object],[object Object],[object Object]
All over the map ,[object Object],[object Object],[object Object],[object Object],[object Object]
All over the map ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
All over the map Legacy Item Type Circ Modifier 0 Regular 1 Media 45 AV 123 Reference 234 Reference
Cleaning up ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Don’t box me in! ,[object Object],[object Object]
Yes, those fixed fields really matter ,[object Object]
Yes, those fixed fields really matter ,[object Object]
Fixed fields
Oops! ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
On stage ,[object Object],[object Object],[object Object]
On stage ,[object Object],[object Object],[object Object],[object Object]
On stage ,[object Object],[object Object]
On stage ,[object Object]
On stage ,[object Object],[object Object],[object Object],[object Object]
On stage ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Counting ,[object Object],[object Object]
Counting ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Counting ,[object Object],[object Object]
Tools ,[object Object],[object Object],[object Object],[object Object],[object Object]
And now something new
Equinox Migration Tools ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object]

Más contenido relacionado

Similar a Slinging Data: Data Loading and Cleanup in Evergreen

Brief Intro to Apache Spark @ Stanford ICME
Brief Intro to Apache Spark @ Stanford ICMEBrief Intro to Apache Spark @ Stanford ICME
Brief Intro to Apache Spark @ Stanford ICMEPaco Nathan
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsJulien Le Dem
 
Think Like Spark
Think Like SparkThink Like Spark
Think Like SparkAlpine Data
 
Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Massimo Schenone
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimizationguest3eed30
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory OptimizationWei Lin
 
Mapreduce in Search
Mapreduce in SearchMapreduce in Search
Mapreduce in SearchAmund Tveit
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce AlgorithmsAmund Tveit
 
Think Like Spark: Some Spark Concepts and a Use Case
Think Like Spark: Some Spark Concepts and a Use CaseThink Like Spark: Some Spark Concepts and a Use Case
Think Like Spark: Some Spark Concepts and a Use CaseRachel Warren
 
R Spatial Analysis using SP
R Spatial Analysis using SPR Spatial Analysis using SP
R Spatial Analysis using SPtjagger
 
All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2goMoriyoshi Koizumi
 
Big data shim
Big data shimBig data shim
Big data shimtistrue
 
Introduction to Python , Overview
Introduction to Python , OverviewIntroduction to Python , Overview
Introduction to Python , OverviewNB Veeresh
 
Introduction to Dart
Introduction to DartIntroduction to Dart
Introduction to DartRamesh Nair
 
Strata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark communityStrata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark communityDatabricks
 
Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987乐群 陈
 
Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Konrad Malawski
 

Similar a Slinging Data: Data Loading and Cleanup in Evergreen (20)

Brief Intro to Apache Spark @ Stanford ICME
Brief Intro to Apache Spark @ Stanford ICMEBrief Intro to Apache Spark @ Stanford ICME
Brief Intro to Apache Spark @ Stanford ICME
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
 
Think Like Spark
Think Like SparkThink Like Spark
Think Like Spark
 
Apache Spark: What? Why? When?
Apache Spark: What? Why? When?Apache Spark: What? Why? When?
Apache Spark: What? Why? When?
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
 
Mapreduce in Search
Mapreduce in SearchMapreduce in Search
Mapreduce in Search
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce Algorithms
 
Think Like Spark: Some Spark Concepts and a Use Case
Think Like Spark: Some Spark Concepts and a Use CaseThink Like Spark: Some Spark Concepts and a Use Case
Think Like Spark: Some Spark Concepts and a Use Case
 
R Spatial Analysis using SP
R Spatial Analysis using SPR Spatial Analysis using SP
R Spatial Analysis using SP
 
All I know about rsc.io/c2go
All I know about rsc.io/c2goAll I know about rsc.io/c2go
All I know about rsc.io/c2go
 
Big data shim
Big data shimBig data shim
Big data shim
 
Introduction to Python , Overview
Introduction to Python , OverviewIntroduction to Python , Overview
Introduction to Python , Overview
 
Introduction to Dart
Introduction to DartIntroduction to Dart
Introduction to Dart
 
Strata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark communityStrata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark community
 
Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987
 
Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014
 
What is data structure
What is data structureWhat is data structure
What is data structure
 
Lexyacc
LexyaccLexyacc
Lexyacc
 
Easy R
Easy REasy R
Easy R
 

Más de Galen Charlton

Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For KohaPutting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For KohaGalen Charlton
 
Lovingly crafting a mountain, not by hand: managing piles of metadata
Lovingly crafting a mountain, not by hand: managing piles of metadataLovingly crafting a mountain, not by hand: managing piles of metadata
Lovingly crafting a mountain, not by hand: managing piles of metadataGalen Charlton
 
Calm at the center of the distributed whirlwinds
Calm at the center of the distributed whirlwindsCalm at the center of the distributed whirlwinds
Calm at the center of the distributed whirlwindsGalen Charlton
 
Scotty, I need more speed - Koha tuning
Scotty, I need more speed - Koha tuningScotty, I need more speed - Koha tuning
Scotty, I need more speed - Koha tuningGalen Charlton
 
Lightning Talk - pgTAP
Lightning Talk - pgTAPLightning Talk - pgTAP
Lightning Talk - pgTAPGalen Charlton
 
Stuff, stuff, and more stuff
Stuff, stuff, and more stuffStuff, stuff, and more stuff
Stuff, stuff, and more stuffGalen Charlton
 
Awash in a sea of connections
Awash in a sea of connectionsAwash in a sea of connections
Awash in a sea of connectionsGalen Charlton
 
Koha 3 2 Update @ Code4Lib 2010
Koha 3 2 Update @ Code4Lib 2010Koha 3 2 Update @ Code4Lib 2010
Koha 3 2 Update @ Code4Lib 2010Galen Charlton
 
‡biblios.net Web Services
‡biblios.net Web Services‡biblios.net Web Services
‡biblios.net Web ServicesGalen Charlton
 

Más de Galen Charlton (12)

Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For KohaPutting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
Putting the Cat in the Catalogue: A Feline-Inspired OPAC Theme For Koha
 
Lovingly crafting a mountain, not by hand: managing piles of metadata
Lovingly crafting a mountain, not by hand: managing piles of metadataLovingly crafting a mountain, not by hand: managing piles of metadata
Lovingly crafting a mountain, not by hand: managing piles of metadata
 
Calm at the center of the distributed whirlwinds
Calm at the center of the distributed whirlwindsCalm at the center of the distributed whirlwinds
Calm at the center of the distributed whirlwinds
 
MetaData, MetaKoha
MetaData, MetaKohaMetaData, MetaKoha
MetaData, MetaKoha
 
Scotty, I need more speed - Koha tuning
Scotty, I need more speed - Koha tuningScotty, I need more speed - Koha tuning
Scotty, I need more speed - Koha tuning
 
Lightning Talk - pgTAP
Lightning Talk - pgTAPLightning Talk - pgTAP
Lightning Talk - pgTAP
 
Stuff, stuff, and more stuff
Stuff, stuff, and more stuffStuff, stuff, and more stuff
Stuff, stuff, and more stuff
 
Awash in a sea of connections
Awash in a sea of connectionsAwash in a sea of connections
Awash in a sea of connections
 
Koha 3 2 Update @ Code4Lib 2010
Koha 3 2 Update @ Code4Lib 2010Koha 3 2 Update @ Code4Lib 2010
Koha 3 2 Update @ Code4Lib 2010
 
Koha 3.2 Status
Koha 3.2 StatusKoha 3.2 Status
Koha 3.2 Status
 
Beyond Koha 3.2
Beyond Koha 3.2Beyond Koha 3.2
Beyond Koha 3.2
 
‡biblios.net Web Services
‡biblios.net Web Services‡biblios.net Web Services
‡biblios.net Web Services
 

Último

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 

Último (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 

Slinging Data: Data Loading and Cleanup in Evergreen

  • 1. Slinging Data: Data Loading and Cleanup in Evergreen Growing Evergreen Conference 22 April 2010
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7. All over the map Legacy Item Type Circ Modifier 0 Regular 1 Media 45 AV 123 Reference 234 Reference
  • 8.
  • 9.
  • 10.
  • 11.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 25.
  • 26.