Data Management: The Current Landscape

Carly Strasser
California Digital Library
University of California Curation Center
2012 IASSIST Conference, June 2012

[Images: from Flickr by DW0825, Flickmor, and deltaMike]

Digital data
[Images: www.woodrow.org; C. Strasser; courtesy of WHOI; from Flickr by US Army Environmental Command]

Digital data + Complex analyses
[Diagram: Data; Models (maximum likelihood estimation, matrix models); Images; Tables; Paper]

UGLY TRUTH
Most Earth | Environmental | Ecological scientists…
     •  are not taught data management
     •  don’t know what metadata are
     •  can’t name data centers or repositories
     •  don’t share data publicly or store it in an archive
     •  aren’t convinced they should share data
[Image: 5shortessays.blogspot.com]

Where data end up
[Diagram, recreated from Klump et al. 2006: Data; Metadata; www]
[Images: from Flickr by diylibrarian and csessums; blog.order2disorder.com]

Who cares?
[Images: from Flickr by Redden-McAllister and AJC1; www.rba.gov.au]

Where data end up
[Diagram, recreated from Klump et al. 2006: Data; Metadata; www]
[Images: from Flickr by diylibrarian and torkildr]

Data Reuse
Data Sharing
Data Management

Trends in Data Archiving
     •  Journal publishers: Joint Data Archiving Agreement
     •  Data papers etc.: Ecological Archives, Beyond the PDF
     •  Funders: data management requirements

What is a data management plan?
A document that describes what you will do with your data during your research and after you complete your research

Why should a scientist prepare a DMP?
     •  Saves time
     •  Increases efficiency
     •  Easier to use data
     •  Others can understand & use data
     •  Credit for data products
     •  Funders require it

NSF DMP Requirements
From Grant Proposal Guidelines:
DMP supplement may include:
  1.  the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project
  2.  the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)
  3.  policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements
  4.  policies and provisions for re-use, re-distribution, and the production of derivatives
  5.  plans for archiving data, samples, and other research products, and for preservation of access to them

NSF’s Vision*
     •  DMPs and their evaluation will grow & change over time (similar to broader impacts)
     •  Peer review will determine next steps
     •  Community-driven guidelines
           –  Different disciplines have different definitions of acceptable data sharing
           –  Flexibility at the directorate and division levels
           –  Tailor implementation of DMP requirement
     •  Evaluation will vary with directorate, division, & program officer
*Unofficially
Help from Jennifer Schopf, NSF

dmp.cdlib.org
dmponline.dcc.ac.uk

Now called DataUp
     •  Open source add-in & web application
     •  Facilitate data management, sharing, archiving for scientists
     •  Focus on atmospheric, ecological, hydrological, and oceanographic data
     •  Collecting requirements for add-in from scientists, data centers, libraries
Funders: Gordon and Betty Moore Foundation, Microsoft Research

www.dataone.org
     •  Data Education Tutorials
     •  Database of best practices & software tools
     •  Primer on data management
     •  Investigator Toolkit
Now called DataUp

Data Management Best Practices

Best Practices for Data Management
   1.  Planning
   2.  Data collection & organization
   3.  Quality control & assurance
   4.  Metadata
   5.  Workflows
   6.  Data stewardship & reuse

2. Data collection & organization
Create unique identifiers
     •  Decide on naming scheme early
     •  Create a key
     •  Different for each sample
[Images: from Flickr by zebbie and sjbresnahan]
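
One way to apply the three points above is to generate identifiers from a fixed scheme and record the scheme in a key. The sketch below is only an illustration: the site code, the ID layout, and the names `make_sample_ids` and `ID_KEY` are assumptions, not something the slides prescribe.

```python
def make_sample_ids(site, year, n, width=3):
    """Return n unique sample IDs built as SITE-YEAR-SERIAL."""
    return [f"{site}-{year}-{i:0{width}d}" for i in range(1, n + 1)]


# The key documents what each part of an identifier means.
# The wording here is a hypothetical example.
ID_KEY = {
    "site": "short site code chosen at the start of the project",
    "year": "four-digit collection year",
    "serial": "zero-padded sample number within site and year",
}

ids = make_sample_ids("NAN", 2010, 3)
# Every sample gets a different identifier.
assert len(ids) == len(set(ids))
print(ids)
```

Deciding the scheme in code before collection starts makes it harder for two people to invent slightly different formats later.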
  
2. Data collection & organization
Standardize
     •  Consistent within columns
           –  only numbers, dates, or text
     •  Consistent names, codes, formats
Modified from K. Vanderbilt
[Image: from Pink Floyd, The Wall; themurkyfringe.com]
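
A column check of this kind can be automated. The following is a minimal sketch, assuming the table has already been read into a list of dictionaries of strings and that dates use the ISO form YYYY-MM-DD; `classify` and `mixed_columns` are hypothetical helper names.

```python
from datetime import datetime


def classify(value):
    """Label a cell as 'number', 'date' (YYYY-MM-DD), or 'text'."""
    try:
        float(value)
        return "number"
    except ValueError:
        pass
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "text"


def mixed_columns(rows):
    """Return {column: kinds} for columns that hold more than one kind."""
    kinds = {}
    for row in rows:
        for column, value in row.items():
            kinds.setdefault(column, set()).add(classify(value))
    return {c: k for c, k in kinds.items() if len(k) > 1}


rows = [
    {"count": "12", "date": "2010-06-01"},
    {"count": "none", "date": "2010-06-02"},  # text in a numeric column
]
print(mixed_columns(rows))
```

Any column reported by `mixed_columns` mixes numbers, dates, or text and is worth a closer look before analysis.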
  
2. Data collection & organization
Use descriptive file names
[Image: PhDcomics.com]

2. Data collection & organization
Use descriptive file names*
     •  Unique
     •  Reflect contents
Bad:     Mydata.xls, 2001_data.csv, best version.txt
Better:  Eaffinis_nanaimo_2010_counts.xls
         (study organism, site name, year, what was measured)
*Not for everyone
From R Cook, ESA Best Practices Workshop 2010
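
The "Better" pattern above can be produced by a small helper so that every file in a project follows the same order of parts. This is a sketch under that assumption; `data_file_name` is a hypothetical function, and the underscore separator simply mirrors the example on the slide.

```python
def data_file_name(organism, site, year, measured, ext="csv"):
    """Join the descriptive parts with underscores, in a fixed order."""
    parts = [organism, site, str(year), measured]
    # Remove spaces so the name stays a single token on any file system.
    cleaned = [p.strip().replace(" ", "") for p in parts]
    return "_".join(cleaned) + "." + ext


print(data_file_name("Eaffinis", "nanaimo", 2010, "counts", ext="xls"))
```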
  
2. Data collection & organization
Preserve information
     •  Keep raw data raw
     •  Use scripts to process data & save them with data
[Diagram: raw data as .csv; R script for processing & analysis]
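
The slide shows an R script; the same idea is sketched here in Python, as an illustration only. The script reads the raw CSV, never writes to it, and puts the processed copy in a separate file. The function name `process` and the whitespace-trimming step are assumptions chosen for the example, and the sketch assumes a rectangular table with a header row.

```python
import csv
from pathlib import Path


def process(raw_path, out_path):
    """Read the raw CSV (left untouched) and write a cleaned copy elsewhere."""
    raw_path, out_path = Path(raw_path), Path(out_path)
    if raw_path.resolve() == out_path.resolve():
        raise ValueError("refusing to overwrite the raw data file")
    with raw_path.open(newline="") as src, out_path.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            # Example processing step: trim stray whitespace from each cell.
            writer.writerow({k: (v or "").strip() for k, v in row.items()})
```

Saving this script next to the data means anyone can regenerate the processed file from the raw file.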
  
3. Quality control and quality assurance
Before data collection
     •  Define & enforce standards
     •  Assign responsibility for data quality
[Image: from Flickr by StacieBee]

3. Quality control and quality assurance
During data collection/entry
     •  Minimize manual entry
     •  Use double entry
     •  Use text-to-speech program to read data back
     •  Use a database
     •  Document changes
[Image: from Flickr by schock]
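
Double entry means keying the same sheet twice, independently, and comparing the results. A minimal sketch of the comparison step follows, assuming both entries are lists of dictionaries in the same row order; `double_entry_mismatches` is a hypothetical name.

```python
def double_entry_mismatches(first, second):
    """Compare two independent keyings of the same sheet, cell by cell.

    Returns (row_index, column, first_value, second_value) tuples.
    """
    if len(first) != len(second):
        raise ValueError("the two entries have different numbers of rows")
    mismatches = []
    for i, (a, b) in enumerate(zip(first, second)):
        for column in sorted(set(a) | set(b)):
            if a.get(column) != b.get(column):
                mismatches.append((i, column, a.get(column), b.get(column)))
    return mismatches


entry_1 = [{"id": "S1", "count": "12"}, {"id": "S2", "count": "7"}]
entry_2 = [{"id": "S1", "count": "12"}, {"id": "S2", "count": "1"}]
print(double_entry_mismatches(entry_1, entry_2))
```

Each reported cell is then checked against the original field sheet rather than resolved by guesswork.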
  
3. Quality control and quality assurance
After data entry
     •  Check for missing, impossible, anomalous values
     •  Perform statistical summaries
     •  Look for outliers
           •  Normal probability plots
           •  Regression
           •  Scatter plots
           •  Maps
[Figure: example scatter plot; x axis 0 to 40, y axis 0 to 60]
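
The first and third checks above can be sketched in a few lines. This is an illustration under stated assumptions: missing values are represented as `None`, the valid range is supplied by the user, and outliers are flagged with a simple z-score cutoff, which is one common choice rather than the method named on the slide. `qc_report` is a hypothetical name.

```python
from statistics import mean, stdev


def qc_report(values, lower, upper, z_cutoff=3.0):
    """Return indices of missing, impossible, and outlying values."""
    missing = [i for i, v in enumerate(values) if v is None]
    present = [(i, v) for i, v in enumerate(values) if v is not None]
    impossible = [i for i, v in present if not lower <= v <= upper]
    plausible = [(i, v) for i, v in present if lower <= v <= upper]
    outliers = []
    if len(plausible) >= 2:
        m = mean(v for _, v in plausible)
        s = stdev(v for _, v in plausible)
        if s > 0:
            # Flag plausible values far from the mean, in units of SD.
            outliers = [i for i, v in plausible if abs(v - m) / s > z_cutoff]
    return {"missing": missing, "impossible": impossible, "outliers": outliers}


print(qc_report([10.0, 11.0, None, -5.0, 10.5], lower=0.0, upper=40.0))
```

Flagged values still need a human decision; the report only says where to look.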
  




	
  
4. Metadata
Metadata = Data reporting
     WHO created the data?
     WHAT is the content of the data set?
     WHEN was it created?
     WHERE was it collected?
     HOW was it developed?
     WHY was it developed?
[Image: from Flickr by //ichael Patric|{]

4. Metadata
•    Digital context
      •     Name of the data set
      •     The name(s) of the data file(s) in the data set
      •     Date the data set was last modified
      •     Example data file records for each data type file
      •     Pertinent companion files
      •     List of related or ancillary data sets
      •     Software (including version number) used to prepare/read the data set
      •     Data processing that was performed
•    Personnel & stakeholders
      •     Who collected
      •     Who to contact with questions
      •     Funders
•    Scientific context
      •     Scientific reason why the data were collected
      •     What data were collected
      •     What instruments (including model & serial number) were used
      •     Environmental conditions during collection
      •     Where collected & spatial resolution
      •     When collected & temporal resolution
      •     Standards or calibrations used
•    Information about parameters
      •     How each was measured or produced
      •     Units of measure
      •     Format used in the data set
      •     Precision & accuracy if known
•    Information about data
      •     Definitions of codes used
      •     Quality assurance & control measures
      •     Known problems that limit data use (e.g. uncertainty, sampling problems)
•    How to cite the data set
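
Before choosing a formal standard, the six who/what/when/where/how/why questions can be kept as a simple record alongside the data. The sketch below is only an illustration: the field names, the JSON layout, and the helper `unanswered` are assumptions, and every value is a placeholder rather than a description of a real data set.

```python
import json

QUESTIONS = ("who", "what", "when", "where", "how", "why")


def unanswered(record):
    """Return the questions a metadata record leaves blank or omits."""
    return [q for q in QUESTIONS if not str(record.get(q) or "").strip()]


# Placeholder values only; none of these describe a real data set.
example = {
    "who": "A. Researcher",
    "what": "Counts per sample",
    "when": "2010-06-01",
    "where": "Hypothetical field site",
    "how": "Net tows, counted under a microscope",
    "why": "",
}

print(json.dumps(example, indent=2))
print("Still to document:", unanswered(example))
```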
  
4. Metadata
What is metadata?
Select the appropriate metadata standard
     •  Provides structure to describe data
           Common terms | definitions | language | structure
     •  Lots of different standards
           EML, FGDC, ISO 19115, DarwinCore, …
     •  Tools for creating metadata files
           Morpho (EML), Metavist (FGDC), NOAA MERMaid (CSDGM)

5. Workflows
Workflow: how you get from the raw data to the final products of your research
Simple workflows: flow charts
[Flow chart: temperature data and salinity data -> data import into R -> data in R format -> quality control & data cleaning -> “clean” T & S data -> analysis: mean, SD -> summary statistics -> graph production]
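
The flow chart translates naturally into a short script with one function per box. The slide uses R; this Python sketch is an illustration under assumptions: the valid ranges are hypothetical, missing readings are `None`, and graph production is left out.

```python
from statistics import mean, stdev


def clean(readings, lower, upper):
    """Quality control: drop missing readings and values outside the range."""
    return [r for r in readings if r is not None and lower <= r <= upper]


def summarize(readings):
    """Analysis step: sample size, mean, and standard deviation."""
    return {"n": len(readings), "mean": mean(readings), "sd": stdev(readings)}


def run_workflow(temperature, salinity):
    """Raw T and S data -> cleaned data -> summary statistics."""
    clean_t = clean(temperature, -2.0, 40.0)  # hypothetical valid range
    clean_s = clean(salinity, 0.0, 45.0)      # hypothetical valid range
    return {"temperature": summarize(clean_t), "salinity": summarize(clean_s)}


print(run_workflow([10.0, 12.0, None, 99.0], [30.0, 34.0]))
```

Because each step is a named function, the script documents the order of operations as well as performing them.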
  
5. Workflows
Workflow: how you get from the raw data to the final products of your research
Simple workflows: commented scripts
     •  R, SAS, MATLAB
     •  Well-documented code is…
           Easier to review
           Easier to share
           Easier to repeat analysis

5. Workflows
Fancy Schmancy workflows: Kepler
[Screenshot: Kepler workflow and resulting output]
https://kepler-project.org

5. Workflows
Workflows enable
     Reproducibility
           can someone independently validate findings?
     Transparency
           others can understand how you arrived at your results
     Executability
           others can re-run or re-use your analysis
[Image: from Flickr by merlinprincesse]

6. Data stewardship & reuse
The 20-Year Rule
The metadata accompanying a data set should be written for a user 20 years into the future (National Research Council 1991)
Document, document, document
[Image: from Flickr by greensambaman]

6. Data stewardship & reuse
Use stable formats
     csv, txt, tiff
Create back-up copies
     original, near, far
Periodically test back-ups
Modified from R. Cook
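
One simple way to test back-ups is to compare a checksum of each copy with the original. The sketch below assumes the near and far copies are reachable as ordinary file paths; `sha256_of` and `failed_backups` are hypothetical names, and SHA-256 is an assumption chosen for the example, not something the slide specifies.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def failed_backups(original, copies):
    """Return the copies that are missing or differ from the original."""
    expected = sha256_of(original)
    return [str(c) for c in copies
            if not Path(c).is_file() or sha256_of(c) != expected]
```

Running `failed_backups` on a schedule turns "periodically test back-ups" into a routine check; an empty list means every copy matched.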
  
6. Data stewardship & reuse
Store data in a repository
     Institutional archive
     Discipline/specialty archive
[Image: from Flickr by torkildr]

6. Data stewardship & reuse
Data Citation
     Allows readers to find data products
     Get credit for data and publications
     Promotes reproducibility
     Better measure of research impact
Example:
Sidlauskas, B. 2007. Data from: Testing for unequal rates of morphological diversification in the absence of a detailed phylogeny: a case study from characiform fishes. Dryad Digital Repository. doi:10.5061/dryad.20
Learn more at www.datacite.org
Modified from R. Cook
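
The example citation follows an author, year, title, repository, identifier pattern. As a small sketch, the function below assembles a citation in that order; `format_data_citation` is a hypothetical helper, and the layout is inferred from the single example on the slide rather than from any citation standard.

```python
def format_data_citation(author, year, title, repository, doi):
    """Assemble a data citation in the order used by the slide's example."""
    return f"{author} {year}. {title}. {repository}. doi:{doi}"


print(format_data_citation(
    "Sidlauskas, B.",
    2007,
    "Data from: Testing for unequal rates of morphological diversification "
    "in the absence of a detailed phylogeny: a case study from characiform fishes",
    "Dryad Digital Repository",
    "10.5061/dryad.20",
))
```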
  
Check out the blog:  dcxl.cdlib.org
or my website:       www.carlystrasser.net
Email me:            carlystrasser@gmail.com
Tweet me:            @carlystrasser | @dcxlCDL
DCXL on FB:          DCXLatCDL
More Related Content

What's hot

UC Merced: Data Management for Scientists
UC Merced: Data Management for ScientistsUC Merced: Data Management for Scientists
UC Merced: Data Management for ScientistsCarly Strasser
 
Landscape of Data Curation - Microsoft eScience 2012
Landscape of Data Curation - Microsoft eScience 2012Landscape of Data Curation - Microsoft eScience 2012
Landscape of Data Curation - Microsoft eScience 2012Carly Strasser
 
Data Management: Scientist Perspective - DLF 2012
Data Management: Scientist Perspective - DLF 2012Data Management: Scientist Perspective - DLF 2012
Data Management: Scientist Perspective - DLF 2012Carly Strasser
 
Open Data & Open Access - DLF 2012
Open Data & Open Access - DLF 2012Open Data & Open Access - DLF 2012
Open Data & Open Access - DLF 2012Carly Strasser
 
UC Santa Cruz: Data Management for Scientists
UC Santa Cruz: Data Management for ScientistsUC Santa Cruz: Data Management for Scientists
UC Santa Cruz: Data Management for ScientistsCarly Strasser
 
DMPTool Overview for UC Merced Research Week
DMPTool Overview for UC Merced Research WeekDMPTool Overview for UC Merced Research Week
DMPTool Overview for UC Merced Research WeekCarly Strasser
 
Data Management Solutions from Libraries at NSF Large Facilities Workshop
Data Management Solutions from Libraries at NSF Large Facilities WorkshopData Management Solutions from Libraries at NSF Large Facilities Workshop
Data Management Solutions from Libraries at NSF Large Facilities WorkshopCarly Strasser
 
Data Management Planning for ESA 2013
Data Management Planning for ESA 2013Data Management Planning for ESA 2013
Data Management Planning for ESA 2013Carly Strasser
 
Data Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekData Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekCarly Strasser
 
Cal Poly - Data Management for Researchers
Cal Poly - Data Management for ResearchersCal Poly - Data Management for Researchers
Cal Poly - Data Management for ResearchersCarly Strasser
 
UNT: Scientific Data Management and Sharing
UNT: Scientific Data Management and SharingUNT: Scientific Data Management and Sharing
UNT: Scientific Data Management and SharingCarly Strasser
 
Data Herding for Scientists - IGERT Symposium at UF
Data Herding for Scientists - IGERT Symposium at UFData Herding for Scientists - IGERT Symposium at UF
Data Herding for Scientists - IGERT Symposium at UFCarly Strasser
 
Cal Poly - Data Management and the DMPTool
Cal Poly - Data Management and the DMPToolCal Poly - Data Management and the DMPTool
Cal Poly - Data Management and the DMPToolCarly Strasser
 
DMPTool at NNLM Research Lifecycle: Partnering for Success
DMPTool at NNLM Research Lifecycle: Partnering for SuccessDMPTool at NNLM Research Lifecycle: Partnering for Success
DMPTool at NNLM Research Lifecycle: Partnering for SuccessCarly Strasser
 
DataUp Lightning Talk for #iEvoBio
DataUp Lightning Talk for #iEvoBioDataUp Lightning Talk for #iEvoBio
DataUp Lightning Talk for #iEvoBioCarly Strasser
 
Needs for Data Management & Citation Throughout the Information Lifecycle
Needs for Data Management & Citation Throughout  the Information LifecycleNeeds for Data Management & Citation Throughout  the Information Lifecycle
Needs for Data Management & Citation Throughout the Information LifecycleMicah Altman
 
Data Management: Scientist Perspective - UC3 Data Curation Workshop
Data Management: Scientist Perspective - UC3 Data Curation WorkshopData Management: Scientist Perspective - UC3 Data Curation Workshop
Data Management: Scientist Perspective - UC3 Data Curation WorkshopCarly Strasser
 
Supporting UC Research Data Management
Supporting UC Research Data ManagementSupporting UC Research Data Management
Supporting UC Research Data Managementslabrams
 

What's hot (20)

UC Merced: Data Management for Scientists
UC Merced: Data Management for ScientistsUC Merced: Data Management for Scientists
UC Merced: Data Management for Scientists
 
Landscape of Data Curation - Microsoft eScience 2012
Landscape of Data Curation - Microsoft eScience 2012Landscape of Data Curation - Microsoft eScience 2012
Landscape of Data Curation - Microsoft eScience 2012
 
Data Management: Scientist Perspective - DLF 2012
Data Management: Scientist Perspective - DLF 2012Data Management: Scientist Perspective - DLF 2012
Data Management: Scientist Perspective - DLF 2012
 
Open Data & Open Access - DLF 2012
Open Data & Open Access - DLF 2012Open Data & Open Access - DLF 2012
Open Data & Open Access - DLF 2012
 
UC Santa Cruz: Data Management for Scientists
UC Santa Cruz: Data Management for ScientistsUC Santa Cruz: Data Management for Scientists
UC Santa Cruz: Data Management for Scientists
 
DMPTool Overview for UC Merced Research Week
DMPTool Overview for UC Merced Research WeekDMPTool Overview for UC Merced Research Week
DMPTool Overview for UC Merced Research Week
 
Data Management Solutions from Libraries at NSF Large Facilities Workshop
Data Management Solutions from Libraries at NSF Large Facilities WorkshopData Management Solutions from Libraries at NSF Large Facilities Workshop
Data Management Solutions from Libraries at NSF Large Facilities Workshop
 
Data Management Planning for ESA 2013
Data Management Planning for ESA 2013Data Management Planning for ESA 2013
Data Management Planning for ESA 2013
 
Data Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekData Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA Week
 
Cal Poly - Data Management for Researchers
Cal Poly - Data Management for ResearchersCal Poly - Data Management for Researchers
Cal Poly - Data Management for Researchers
 
UNT: Scientific Data Management and Sharing
UNT: Scientific Data Management and SharingUNT: Scientific Data Management and Sharing
UNT: Scientific Data Management and Sharing
 
Data Herding for Scientists - IGERT Symposium at UF
Data Herding for Scientists - IGERT Symposium at UFData Herding for Scientists - IGERT Symposium at UF
Data Herding for Scientists - IGERT Symposium at UF
 
DataUp at ACRL 2013
DataUp at ACRL 2013DataUp at ACRL 2013
DataUp at ACRL 2013
 
Cal Poly - Data Management and the DMPTool
Cal Poly - Data Management and the DMPToolCal Poly - Data Management and the DMPTool
Cal Poly - Data Management and the DMPTool
 
DMPTool at NNLM Research Lifecycle: Partnering for Success
DMPTool at NNLM Research Lifecycle: Partnering for SuccessDMPTool at NNLM Research Lifecycle: Partnering for Success
DMPTool at NNLM Research Lifecycle: Partnering for Success
 
DataUp Lightning Talk for #iEvoBio
DataUp Lightning Talk for #iEvoBioDataUp Lightning Talk for #iEvoBio
DataUp Lightning Talk for #iEvoBio
 
Needs for Data Management & Citation Throughout the Information Lifecycle
Needs for Data Management & Citation Throughout  the Information LifecycleNeeds for Data Management & Citation Throughout  the Information Lifecycle
Needs for Data Management & Citation Throughout the Information Lifecycle
 
Data Management: Scientist Perspective - UC3 Data Curation Workshop
Data Management: Scientist Perspective - UC3 Data Curation WorkshopData Management: Scientist Perspective - UC3 Data Curation Workshop
Data Management: Scientist Perspective - UC3 Data Curation Workshop
 
NISO Forum, Denver, Sept. 24, 2012: Data Equivalence
NISO Forum, Denver, Sept. 24, 2012: Data EquivalenceNISO Forum, Denver, Sept. 24, 2012: Data Equivalence
NISO Forum, Denver, Sept. 24, 2012: Data Equivalence
 
Supporting UC Research Data Management
Supporting UC Research Data ManagementSupporting UC Research Data Management
Supporting UC Research Data Management
 

Data Management: The Current Landscape

  • 1. Data Management: The Current Landscape. Carly Strasser, California Digital Library, University of California Curation Center. 2012 IASSIST Conference, June 2012
  • 2.
  • 3. Digital data. Image credits: Flickr users DW0825, Flickmor, deltaMike, and US Army Environmental Command; www.woodrow.org; C. Strasser; courtesy of WHOI
  • 4. Digital data + Complex analyses
  • 5. Data; Models; Maximum Likelihood estimation; Matrix Models; Images; Tables; Paper
  • 6. UGLY TRUTH: Most Earth | Environmental | Ecological scientists…
      • are not taught data management
      • don’t know what metadata are
      • can’t name data centers or repositories
      • don’t share data publicly or store it in an archive
      • aren’t convinced they should share data
      Image credit: 5shortessays.blogspot.com
  • 7. Where data end up. Figure labels: www, Data, Metadata. Recreated from Klump et al. 2006. Image credits: Flickr users diylibrarian and csessums; blog.order2disorder.com
  • 8. Who cares? Image credits: Flickr users Redden-McAllister and AJC1; www.rba.gov.au
  • 9. Where data end up. Figure labels: www, Data, Metadata. Recreated from Klump et al. 2006. Image credits: Flickr users diylibrarian and torkildr
  • 10. Data Reuse, Data Sharing, Data Management
  • 11. Trends in Data Archiving
      • Journal publishers: Joint Data Archiving Agreement
      • Data Papers etc.: Ecological Archives, Beyond the PDF
      • Funders: Data management requirements
  • 12. What is a data management plan? A document that describes what you will do with your data during your research and after you complete your research
  • 13. Why should a scientist prepare a DMP?
      • Saves time
      • Increases efficiency
      • Easier to use data
      • Others can understand & use data
      • Credit for data products
      • Funders require it
  • 14. NSF DMP Requirements. From Grant Proposal Guidelines: DMP supplement may include:
      1. the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project
      2. the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)
      3. policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements
      4. policies and provisions for re-use, re-distribution, and the production of derivatives
      5. plans for archiving data, samples, and other research products, and for preservation of access to them
  • 15. NSF’s Vision*
      • DMPs and their evaluation will grow & change over time (similar to broader impacts)
      • Peer review will determine next steps
      • Community-driven guidelines
          – Different disciplines have different definitions of acceptable data sharing
          – Flexibility at the directorate and division levels
          – Tailor implementation of DMP requirement
      • Evaluation will vary with directorate, division, & program officer
      *Unofficially. Help from Jennifer Schopf, NSF
  • 16. dmp.cdlib.org, dmponline.dcc.ac.uk
  • 17. now called DataUp
      • Open source add-in & web application
      • Facilitate data management, sharing, archiving for scientists
      • Focus on atmospheric, ecological, hydrological, and oceanographic data
      • Collecting requirements for add-in from scientists, data centers, libraries
      Funders: Gordon and Betty Moore Foundation, Microsoft Research
  • 18. www.dataone.org
      • Data Education Tutorials
      • Database of best practices & software tools
      • Primer on data management
      • Investigator Toolkit
      now called DataUp
  • 19. Data Management Best Practices. Carly Strasser, California Digital Library, University of California Curation Center. 2012 IASSIST Conference, June 2012
  • 20. Best Practices for Data Management: 1. Planning; 2. Data collection & organization; 3. Quality control & assurance; 4. Metadata; 5. Workflows; 6. Data stewardship & reuse
  • 21. Best Practices for Data Management: 1. Planning; 2. Data collection & organization; 3. Quality control & assurance; 4. Metadata; 5. Workflows; 6. Data stewardship & reuse
  • 22. Best Practices for Data Management: 1. Planning; 2. Data collection & organization; 3. Quality control & assurance; 4. Metadata; 5. Workflows; 6. Data stewardship & reuse
  • 23. 2. Data collection & organization: Create unique identifiers
      • Decide on naming scheme early
      • Create a key
      • Different for each sample
      Image credits: Flickr users zebbie and sjbresnahan
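One way to apply the unique-identifier advice is to build each sample ID from a fixed scheme and check the list for repeats. This is a minimal sketch only: the `SITE-YEAR-NNN` scheme, the function names, and the sample values are assumptions for illustration, not something given in the slides.

```python
from collections import Counter

def make_sample_id(site: str, year: int, sequence: int) -> str:
    """Build an ID such as 'NAN-2010-007' (hypothetical naming scheme)."""
    return f"{site.upper()}-{year}-{sequence:03d}"

def duplicate_ids(ids: list[str]) -> list[str]:
    """Return every ID that appears more than once, sorted."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)

ids = [make_sample_id("nan", 2010, n) for n in (1, 2, 2, 3)]
print(duplicate_ids(ids))
```

Running a check like this before analysis helps catch a sequence number that was reused by mistake.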
  • 24. 2. Data collection & organization: Standardize
      • Consistent within columns
          – only numbers, dates, or text
      • Consistent names, codes, formats
      Modified from K. Vanderbilt. Image: from Pink Floyd, The Wall (themurkyfringe.com)
  • 25. 2. Data collection & organization: Use descriptive file names. Image credit: PhDcomics.com
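The "consistent within columns" rule can be checked mechanically. The sketch below labels each cell as a number, a date, or text and reports columns that mix types. The ISO date format (`YYYY-MM-DD`), the function names, and the sample rows are assumptions chosen for illustration.

```python
from datetime import datetime

def classify(value: str) -> str:
    """Label a raw cell as 'number', 'date', 'text', or 'empty'."""
    value = value.strip()
    if not value:
        return "empty"
    try:
        float(value)
        return "number"
    except ValueError:
        pass
    try:
        datetime.strptime(value, "%Y-%m-%d")  # assumed date convention
        return "date"
    except ValueError:
        return "text"

def mixed_columns(rows: list[dict[str, str]]) -> dict[str, set[str]]:
    """Return columns whose non-empty cells fall into more than one type."""
    seen: dict[str, set[str]] = {}
    for row in rows:
        for column, value in row.items():
            kind = classify(value)
            if kind != "empty":
                seen.setdefault(column, set()).add(kind)
    return {c: kinds for c, kinds in seen.items() if len(kinds) > 1}

# Made-up rows: the second one breaks the date and count conventions.
rows = [
    {"site": "A", "date": "2010-06-01", "count": "12"},
    {"site": "B", "date": "June 2", "count": "n/a"},
]
print(mixed_columns(rows))
```

Here the `date` and `count` columns are flagged because each holds a text value alongside a date or a number.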
  • 26. 2. Data collection & organization: Use descriptive file names*
      • Unique
      • Reflect contents
      Bad: Mydata.xls, 2001_data.csv, best version.txt
      Better: Eaffinis_nanaimo_2010_counts.xls (study organism, site name, year, what was measured)
      *Not for everyone. From R Cook, ESA Best Practices Workshop 2010
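The "Better" file name on this slide follows an organism, site, year, measurement pattern. A small helper can assemble names that way so they stay consistent. This is a sketch under that reading of the slide; the function name and the choice to strip every character that is not a letter or digit are assumptions.

```python
import re

def descriptive_name(organism: str, site: str, year: int,
                     measured: str, extension: str = "csv") -> str:
    """Join the name components with underscores, dropping other punctuation."""
    parts = [organism, site, str(year), measured]
    cleaned = [re.sub(r"[^A-Za-z0-9]+", "", part) for part in parts]
    if not all(cleaned):
        raise ValueError("every name component must contain letters or digits")
    return "_".join(cleaned) + "." + extension

print(descriptive_name("Eaffinis", "nanaimo", 2010, "counts", "xls"))
# prints Eaffinis_nanaimo_2010_counts.xls
```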
  • 27. 2. Data collection & organization: Preserve information
      • Keep raw data raw
      • Use scripts to process data & save them with data
      Figure labels: Raw data as .csv; R script for processing & analysis
  • 28. Best Practices for Data Management: 1. Planning; 2. Data collection & organization; 3. Quality control & assurance; 4. Metadata; 5. Workflows; 6. Data stewardship & reuse
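The slide shows an R script next to a raw .csv file. The same idea is sketched here in Python: the script reads the raw file, writes a separate cleaned copy, and refuses to overwrite the original. The column names (`site`, `count`), the cleaning steps, and the file names are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

def process(raw_path: Path, out_path: Path) -> int:
    """Write a cleaned copy of the raw file and return the number of rows kept."""
    if raw_path.resolve() == out_path.resolve():
        raise ValueError("refusing to overwrite the raw data file")
    kept = 0
    with raw_path.open(newline="") as src, out_path.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row["count"].strip():  # drop rows with no count recorded
                row["site"] = row["site"].strip().lower()
                writer.writerow(row)
                kept += 1
    return kept

# Demo in a throwaway directory; file names and contents are hypothetical.
with tempfile.TemporaryDirectory() as folder:
    raw = Path(folder) / "raw_counts.csv"
    raw.write_text("site,count\n Nanaimo ,12\nVictoria,\n")
    kept = process(raw, Path(folder) / "clean_counts.csv")
    print(kept, "row(s) kept; raw file left unchanged")
```

Saving a script like this alongside the data records exactly how the cleaned file was derived from the raw one.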
  • 29. 3.  Quality  control  and  quality  assurance   Before  data  collection   •  Define  &  enforce  standards   •  Assign  responsibility  for  data  quality   From  Flickr  by  StacieBee  
  • 30. 3.  Quality  control  and  quality  assurance   During  data  collection/entry   •  Minimize  manual  entry   •  Use  double  entry   •  Use  text-to-speech  program  to  read  data  back   •  Use  a  database   •  Document  changes   From  Flickr  by  schock
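Double entry means the same sheet is keyed in twice, independently, and the two versions are compared. A minimal sketch of that comparison, assuming each version is a list of rows of strings; the function name `double_entry_mismatches` is hypothetical.

```python
def double_entry_mismatches(first, second):
    """Compare two independent entries of the same data sheet.

    Returns a list of (row index, column index, first value, second value)
    for every cell where the two entries disagree.
    """
    if len(first) != len(second):
        raise ValueError("entries have different numbers of rows")
    mismatches = []
    for r, (row_a, row_b) in enumerate(zip(first, second)):
        if len(row_a) != len(row_b):
            raise ValueError(f"row {r} has different numbers of columns")
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            # Ignore leading/trailing whitespace, which is rarely meaningful.
            if a.strip() != b.strip():
                mismatches.append((r, c, a, b))
    return mismatches
```

Each reported cell is then checked against the original field sheet to decide which entry is correct.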
  • 31. 3.  Quality  control  and  quality  assurance   After  data  entry   •  Check  for  missing,  impossible,  anomalous  values   •  Perform  statistical  summaries   •  Look  for  outliers   •  Normal  probability  plots   •  Regression   •  Scatter  plots   •  Maps   [Figure:  example  plot]
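The checks for missing, impossible, and anomalous values can be sketched as a single function. The plausible range (`lower`, `upper`) and the z-score cutoff of 3 are assumptions chosen for illustration; appropriate limits depend on the variable being measured.

```python
import statistics


def qc_report(values, lower, upper, z_cutoff=3.0):
    """Return the indices of missing, impossible, and outlying values.

    values: list of floats, with None marking a missing value.
    lower, upper: assumed physically plausible range for the variable.
    """
    missing = [i for i, v in enumerate(values) if v is None]
    present = [(i, v) for i, v in enumerate(values) if v is not None]
    impossible = [i for i, v in present if not (lower <= v <= upper)]
    valid = [(i, v) for i, v in present if lower <= v <= upper]
    numbers = [v for _, v in valid]
    outliers = []
    if len(numbers) >= 2:
        mean = statistics.mean(numbers)
        sd = statistics.stdev(numbers)
        if sd > 0:
            # Flag values more than z_cutoff standard deviations from the mean.
            outliers = [i for i, v in valid if abs(v - mean) / sd > z_cutoff]
    return {"missing": missing, "impossible": impossible, "outliers": outliers}
```

A flagged value is a prompt to look at the record, not proof of an error; plots such as those listed on the slide help decide what to do with it.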
  • 32. Best  Practices  for  Data  Management   1.  Planning   2.  Data  collection  &  organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 33. 4.  Metadata   Metadata  =  Data  reporting   WHO  created  the  data?   WHAT  is  the  content  of  the  data  set?   WHEN  was  it  created?   WHERE  was  it  collected?   HOW  was  it  developed?   WHY  was  it  developed?   From  Flickr  by  //ichael  Patric|{
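The six questions can be captured as a small structured record saved alongside the data. This sketch is not a formal metadata standard; the field names and every value in `record` are placeholders invented for the example.

```python
import json

REQUIRED = ("who", "what", "when", "where", "how", "why")


def missing_fields(record):
    """Return the questions that the record leaves unanswered."""
    return [key for key in REQUIRED if not str(record.get(key, "")).strip()]


# Placeholder values only; replace with real information about the data set.
record = {
    "who": "A. Researcher (placeholder)",
    "what": "Zooplankton counts per sample (placeholder)",
    "when": "2010-06-01 to 2010-08-31 (placeholder)",
    "where": "Example estuary, site codes listed in the key (placeholder)",
    "how": "Net tows, counted under a microscope (placeholder)",
    "why": "",
}

print(json.dumps(record, indent=2))
print("Unanswered:", missing_fields(record))
```

Here the check reports `why` as unanswered, which is the kind of gap that is easy to fill now and hard to fill years later.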
  • 34. 4.  Metadata   •  Scientific  context   –  Scientific  reason  why  the  data  were  collected   –  What  data  were  collected   –  What  instruments  (including  model  &  serial  number)  were  used   –  Environmental  conditions  during  collection   –  Where  collected  &  spatial  resolution   –  When  collected  &  temporal  resolution   –  Standards  or  calibrations  used   •  Information  about  parameters   –  How  each  was  measured  or  produced   –  Units  of  measure   –  Format  used  in  the  data  set   –  Precision  &  accuracy  if  known   •  Information  about  data   –  Definitions  of  codes  used   –  Quality  assurance  &  control  measures   –  Known  problems  that  limit  data  use  (e.g.  uncertainty,  sampling  problems)   •  Digital  context   –  Name  of  the  data  set   –  The  name(s)  of  the  data  file(s)  in  the  data  set   –  Date  the  data  set  was  last  modified   –  Example  data  file  records  for  each  data  type  file   –  Pertinent  companion  files   –  List  of  related  or  ancillary  data  sets   –  Software  (including  version  number)  used  to  prepare/read  the  data  set   –  Data  processing  that  was  performed   •  Personnel  &  stakeholders   –  Who  collected   –  Who  to  contact  with  questions   –  Funders   •  How  to  cite  the  data  set
  • 35. 4.  Metadata   What  is  metadata?   Select  the  appropriate  metadata  standard   •  Provides  structure  to  describe  data:  common  terms  |  definitions  |  language  |  structure   •  Lots  of  different  standards:  EML,  FGDC,  ISO  19115,  DarwinCore,…   •  Tools  for  creating  metadata  files:  Morpho  (EML),  Metavist  (FGDC),  NOAA  MERMaid  (CSDGM)
  • 36. Best  Practices  for  Data  Management   1.  Planning   2.  Data  collection  &  organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse  
  • 37. 5.  Workflows   Workflow:  how  you  get  from  the  raw  data  to  the  final  products  of  your  research   Simple  workflows:  flow  charts   [Flow  chart:  Temperature  data  +  Salinity  data  →  Data  import  into  R  →  Data  in  R  format  →  Quality  control  &  data  cleaning  →  “Clean”  T  &  S  data  →  Analysis:  mean,  SD  →  Summary  statistics;  Graph  production]
  • 38. 5.  Workflows   Workflow:  how  you  get  from  the  raw  data  to  the  final  products  of  your  research   Simple  workflows:  commented  scripts   •  R,  SAS,  MATLAB   •  Well-documented  code  is…   Easier  to  review   Easier  to  share   Easier  to  repeat  analysis
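A commented script can serve as the whole workflow record. The sketch below uses Python rather than the R, SAS, or MATLAB named on the slide, and the temperature and salinity values in `RAW` are made up; it walks through import, quality control, and summary statistics (mean, SD) as separate, documented steps.

```python
import csv
import io
import statistics

# Hypothetical raw data; in practice this would be read from a raw .csv file.
RAW = """temperature,salinity
10.0,33.0
12.0,34.0
11.0,
14.0,35.0
"""

VARIABLES = ("temperature", "salinity")


def import_data(text):
    """Step 1: import the raw data as a list of dictionary rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Step 2: quality control. Drop rows with a missing value, convert to float."""
    cleaned = []
    for row in rows:
        if all(row[name].strip() for name in VARIABLES):
            cleaned.append({name: float(row[name]) for name in VARIABLES})
    return cleaned


def summarize(rows):
    """Step 3: analysis. Mean and sample standard deviation per variable."""
    summary = {}
    for name in VARIABLES:
        values = [row[name] for row in rows]
        summary[name] = (statistics.mean(values), statistics.stdev(values))
    return summary


summary = summarize(clean(import_data(RAW)))
for name, (mean, sd) in summary.items():
    print(f"{name}: mean={mean:.2f}, sd={sd:.2f}")
```

Because each step is a named function with a comment, a reviewer can see what was done to the data without re-deriving it from the results.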
  • 39. 5.  Workflows   Fancy  Schmancy  workflows:  Kepler   Resulting  output   https://kepler-project.org
  • 40. 5.  Workflows   Workflows  enable   Reproducibility:  can  someone  independently  validate  findings?   Transparency:  others  can  understand  how  you  arrived  at  your  results   Executability:  others  can  re-run  or  re-use  your  analysis   From  Flickr  by  merlinprincesse
  • 41. Best  Practices  for  Data  Management   1.  Planning   2.  Data  collection  &  organization   3.  Quality  control  &  assurance   4.  Metadata   5.  Workflows   6.  Data  stewardship  &  reuse    
  • 42. 6.  Data  stewardship  &  reuse   The  20-Year  Rule   The  metadata  accompanying  a  data  set  should  be  written  for  a  user  20  years  into  the  future  (National  Research  Council  1991)   Document  Document  Document   From  Flickr  by  greensambaman
  • 43. 6.  Data  stewardship  &  reuse   Use  stable  formats:  csv,  txt,  tiff   Create  back-up  copies:  original,  near,  far   Periodically  test  back-ups   Modified from R. Cook
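One simple way to test a back-up is to compare checksums of the original file and its copy. This is a sketch of that approach, assuming both files are reachable from the same machine; the function names are hypothetical and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(original, backup):
    """True if the back-up copy has the same contents as the original."""
    return sha256_of(original) == sha256_of(backup)
```

Running a check like this on a schedule catches silent corruption or incomplete copies before the back-up is actually needed.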
  • 44. 6.  Data  stewardship  &  reuse   Store  data  in  a  repository   Institutional  archive   Discipline/specialty  archive         From  Flickr  by  torkildr  
  • 45. 6.  Data  stewardship  &  reuse   Data  Citation   Allows  readers  to  find  data  products   Get  credit  for  data  and  publications   Promotes  reproducibility   Better  measure  of  research  impact   Example:   Sidlauskas,  B.  2007.  Data  from:  Testing  for  unequal  rates  of  morphological   diversification  in  the  absence  of  a  detailed  phylogeny:  a  case  study  from   characiform  fishes.  Dryad  Digital  Repository.  doi:10.5061/dryad.20     Learn  more  at  www.datacite.org   Modified from R. Cook  
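The example citation on this slide follows an author, year, title, repository, DOI pattern. The sketch below reproduces that pattern from its parts; the function name and argument names are assumptions, and the layout is inferred from the slide's example rather than taken from any official citation guideline.

```python
def format_data_citation(author, year, title, repository, doi):
    """Assemble a data citation in the pattern of the slide's example."""
    return f"{author} {year}. {title}. {repository}. doi:{doi}"


print(
    format_data_citation(
        author="Sidlauskas, B.",
        year=2007,
        title=(
            "Data from: Testing for unequal rates of morphological "
            "diversification in the absence of a detailed phylogeny: "
            "a case study from characiform fishes"
        ),
        repository="Dryad Digital Repository",
        doi="10.5061/dryad.20",
    )
)
```

Keeping the parts separate makes it easy to store them in the metadata and generate the citation text on demand.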
  • 46. Check  out  the  blog   dcxl.cdlib.org   or  my  website   www.carlystrasser.net   Email  me   carlystrasser@gmail.com   Tweet  me   @carlystrasser  |  @dcxlCDL   DCXL  on  FB   DCXLatCDL