DCXL: Digital Curation for Excel
Funders: Gordon & Betty Moore Foundation, Microsoft Research

Carly Strasser
UC3, California Digital Library
carly.strasser@ucop.edu

22 Sept 2011
UC3 Webinar Series
California Digital Library

Community Engagement
Build on existing cyberinfrastructure
Create new cyberinfrastructure
Support communities

Roadmap
1.  An overview: why is DCXL needed?
2.  Goals of DCXL project
3.  Progress & future plans
4.  How to get involved in DCXL

Digital data + Complex workflows

Data: Images, Tables, Paper
Models: Maximum Likelihood estimation, Matrix Models

UGLY TRUTH
Most Earth | Environmental | Ecological scientists…
    are not taught data management
    don’t know what metadata are
    can’t name data centers or repositories
    don’t share data publicly or store it in an archive
    aren’t convinced they should share data
Image credit: 5shortessays.blogspot.com

2 tables
Random notes
From Stephanie Hampton (2010), ESA Workshop on Best Practices

Wash Cres Lake Dec 15 Dont_Use.xls
From Stephanie Hampton (2010), ESA Workshop on Best Practices

Collaboration and Data Sharing

What is this?

The path of research products
Data
Metadata
Recreated from Klump et al. 2006
Image credits: noaa.gov, www.collectionconnection.alcts.ala.org, www.flickr.com/photos/csessums, blog.disorder2order.com, blog.seattlepi.com

Data Reuse
Data Sharing
Data Management

The path of research products
Data
Metadata
Recreated from Klump et al. 2006
Image credits: noaa.gov, www.collectionconnection.alcts.ala.org, digital-servers.com

Barriers
Cost
    Time
    Personnel
    Software, hardware
Image credits: ttatteredntornprims.blogspot.com/, cultblender.wordpress.com

Barriers
Cost: time, personnel, software, hardware
Culture of Science
  •  Not the norm
  •  Lack of training
  •  Disparate data
Image credit: free-photos.biz

Barriers
Cost: time, personnel, software, hardware
Culture of Science
Loss of rights or benefits
    Misuse of data
    Missed opportunities
    Conflict
Image credits: wattsupwiththat.com, colouringbook.org

Barriers
Cost: time, personnel, software, hardware
Culture of Science
Loss of rights or benefits
Lack of incentives
    Time consuming & expensive
    Reward structure
    Few requirements
Image credit: georgevanantwerp.com

Roadmap
1.  An overview: why is DCXL needed?
2.  DCXL project overview
3.  Progress & future plans
4.  How to get involved in DCXL

DCXL Project Goals
“A transformation in the conduct of a segment of scientific research by enabling and promoting publishing, sharing, and archiving of tabular data”

•  Increase interoperability = Sharing
            publishability = Publishing
            archivability = Archiving
•  Focus on atmospheric, ecological, hydrological, and oceanographic data

DCXL Project Goals
Open Source & Free Excel Add-in
    Software program that extends the capabilities of larger programs
    Complements basic Excel functionality
    From www.webopedia.com
Image credit: www.ablebits.com

DCXL Add-in Goals
Easier:  Archiving
         Sharing
Harder:  Publishing

DCXL Project Deliverables
•  Excel add-in
•  Publicly available source code
•  Technical documentation
•  End user documentation
•  Publicly available requirements
•  Community
Image credit: storageplusgulfport.com

DCXL Project Outcomes
    Enable citation & allow credit
    Enable policy enactment
    Enable re-use by eliminating barriers
    Save time for researcher
    Encourage creation of extensions

Process
Assess needs
•  Quantitative
   –  Surveys
   –  Quick poll
•  Qualitative
   –  Interviews

Process
Assess needs
Gather requirements
    Recruitment tools
        DCXL/data management seminars
        Listservs & email
        Blog, Facebook, Twitter
        Face-to-face interactions
        Flyers

Process
Assess needs
Gather requirements
    Locations
        Conferences
        UC campus visits
        Remote/web-based

Process
Assess needs
Gather requirements
    Stakeholders & contributors
        Libraries
        Scientists
        Repositories
        Experts: MSR, GBMF
        Personnel on related projects

Process
[Diagram labels]
CDL
Libraries: social media, emails, campus visits
Data Centers: social media, emails
Scientists: email, seminars, flyers, social media; quick poll, survey, interview
Related projects
Funders
Requirements

Implementation
Assess needs
Gather requirements
Build requirements document
Build community
    Libraries
    Scientists
    Repositories
    Programmers/Developers

Timeline	
  
            26 Sept     DCXL Kickoff Meeting

             7 Oct      Finalize Requirements Gathering Framework

             9 Nov      1st draft of Requirements to MSR

            30 Nov      2nd draft of Requirements to MSR

           5-9 Dec      AGU Meeting, San Francisco

            15 Dec      Final Requirements to MSR
         2012
            16 Jan      Receive Excel Add-in Version 1

            23 Jan      Rollout Excel Add-in Version 1

         16-19 Feb      AAAS meeting: Add-in user testing

         20-24 Feb      Ocean Sciences meeting: Add-in user testing

            26 Feb      1st Draft of updated Requirements based on Version 1 to MSR

             2 Apr      Deliver updated Requirements based on Version 1 to MSR

            28 May      Receive Excel Add-in Version 2

       29 May- 24 Jun   User testing of Version 2

            25 Jun      Rollout Excel Add-in Version 2

          7-10 July     CSEE meeting: Add-in debut & demo

            13 July     Final code, technical documentation, and requirements published

            31 July     End user documentation published

Roadmap
1.  An overview: why is DCXL needed?
2.  DCXL project overview
3.  Progress & future plans
4.  How to get involved in DCXL

Ecological Society of America
Summer 2011 Meeting

ESA Overview
•  Everyone uses Excel
    –  Most use Excel for organizing raw data
    –  Most import spreadsheets into other programs for analysis
    –  ~75% are embarrassed about using Excel
•  Excitement about open source
•  Minimal knowledge about data management, organization, and archiving
•  55 surveys from diverse group

Operating System
[Bar chart. Categories: Mac, PC, Linux. Vertical axis: 0 to 50.]

Use Excel for...
[Bar chart. Categories: Sharing, Other Analyses, Statistics, Visualization, Organization. Horizontal axis: # Respondents (out of 55).]

How often do you use Excel?
[Bar chart. Categories include Never, Rarely, and Every day. Vertical axis: # respondents.]

What features are used in Excel?
[Bar chart. Categories: Comments, Cell shading, Macros, Embedded formulas, Headers, Pivot Tables, Multiple Tabs, Multiple Tables. Horizontal axis: Percent.]

American Fisheries Society
Summer 2011 Meeting
Image credit: Ray Troll (trollart.com)

AFS Overview
•  Everyone uses Excel
•  Most use it only for data organization and sharing
•  36 surveys from diverse group
•  Heavy MS Access use
•  100% PC

How often do you use Excel?
[Bar chart. Categories: Rarely, Every day. Vertical axis: # respondents.]

Tasks performed in Excel?
[Bar chart. Categories: Sharing data, Simple Calculations, Statistics, Visualizing data, Organizing data. Horizontal axis: % respondents (n = 36).]

What should the add-in help you do?
[Bar chart. Categories: Organize my data for my own use; Organize my data for others to use more easily; Archive my data; Create metadata; Share my data publicly; No opinion. Vertical axis: % Respondents.]

AFS Overview
•  Everyone uses Excel
•  Most use it only for data organization and sharing
•  36 surveys from diverse group
•  Heavy MS Access use
•  100% PC
•  Data hoarders
Image credit: Myoverstuffedbookshelf.blogspot.com

Roadmap
1.  An overview: why is DCXL needed?
2.  DCXL project overview
3.  Progress & future plans
4.  How to get involved in DCXL

Get Involved
dcxl.cdlib.org
Now:
    General info
    Blog
    Forum
    Calendar
Later:
    Requirements
    Documentation

Get Involved
@dcxlCDL
www.facebook.com/DCXLatCDL

Acknowledgements
•  CDL: Rachael Hu, Trisha Cruse, John Kunze, Tracy Seneca
•  MSR: Lee Dirks
•  GBMF: Chris Mentzel

Carly Strasser
carly.strasser@ucop.edu
More Related Content

What's hot

Data Management for Scientists: Workshop at Ocean Sciences 2012
Data Management for Scientists: Workshop at Ocean Sciences 2012Data Management for Scientists: Workshop at Ocean Sciences 2012
Data Management for Scientists: Workshop at Ocean Sciences 2012Carly Strasser
 
Data Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekData Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekCarly Strasser
 
Open Data & Open Access
Open Data & Open AccessOpen Data & Open Access
Open Data & Open AccessCarly Strasser
 
Research Data and Scholarly Communication
Research Data and Scholarly CommunicationResearch Data and Scholarly Communication
Research Data and Scholarly CommunicationDorothea Salo
 
Manufacturing Serendipity
Manufacturing SerendipityManufacturing Serendipity
Manufacturing SerendipityDorothea Salo
 
A Cabinet Of Web2.0 Scientific Curiosities
A Cabinet Of Web2.0 Scientific CuriositiesA Cabinet Of Web2.0 Scientific Curiosities
A Cabinet Of Web2.0 Scientific CuriositiesIan Mulvany
 
DMPTool Overview for UC Merced Research Week
DMPTool Overview for UC Merced Research WeekDMPTool Overview for UC Merced Research Week
DMPTool Overview for UC Merced Research WeekCarly Strasser
 
A structured catalog of open educational datasets
A structured catalog of open educational datasetsA structured catalog of open educational datasets
A structured catalog of open educational datasetsStefan Dietze
 
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...Artificial Intelligence Institute at UofSC
 
UCLA: Data Management for Scientists
UCLA: Data Management for ScientistsUCLA: Data Management for Scientists
UCLA: Data Management for ScientistsCarly Strasser
 
Get the Most Out of Your Tools: Data Management Technologies
Get the Most Out of Your Tools: Data Management TechnologiesGet the Most Out of Your Tools: Data Management Technologies
Get the Most Out of Your Tools: Data Management TechnologiesDATAVERSITY
 
Semantic Web-based Knowledge Management in Distributed Systems
Semantic Web-based Knowledge Management in Distributed SystemsSemantic Web-based Knowledge Management in Distributed Systems
Semantic Web-based Knowledge Management in Distributed SystemsSabin Buraga
 
Data-Ed Online: Structuring Your Unstructured Data Document & Content Management
Data-Ed Online: Structuring Your Unstructured Data Document & Content ManagementData-Ed Online: Structuring Your Unstructured Data Document & Content Management
Data-Ed Online: Structuring Your Unstructured Data Document & Content ManagementDATAVERSITY
 
UC Santa Cruz: Data Management for Scientists
UC Santa Cruz: Data Management for ScientistsUC Santa Cruz: Data Management for Scientists
UC Santa Cruz: Data Management for ScientistsCarly Strasser
 
Data Herding for Scientists - IGERT Symposium at UF
Data Herding for Scientists - IGERT Symposium at UFData Herding for Scientists - IGERT Symposium at UF
Data Herding for Scientists - IGERT Symposium at UFCarly Strasser
 
From Data to Knowledge - Profiling & Interlinking Web Datasets
From Data to Knowledge - Profiling & Interlinking Web DatasetsFrom Data to Knowledge - Profiling & Interlinking Web Datasets
From Data to Knowledge - Profiling & Interlinking Web DatasetsStefan Dietze
 

What's hot (16)

Data Management for Scientists: Workshop at Ocean Sciences 2012
Data Management for Scientists: Workshop at Ocean Sciences 2012Data Management for Scientists: Workshop at Ocean Sciences 2012
Data Management for Scientists: Workshop at Ocean Sciences 2012
 
Data Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA WeekData Herding for Scientists - UC Davis OA Week
Data Herding for Scientists - UC Davis OA Week
 
Open Data & Open Access
Open Data & Open AccessOpen Data & Open Access
Open Data & Open Access
 
Research Data and Scholarly Communication
Research Data and Scholarly CommunicationResearch Data and Scholarly Communication
Research Data and Scholarly Communication
 
Manufacturing Serendipity
Manufacturing SerendipityManufacturing Serendipity
Manufacturing Serendipity
 
A Cabinet Of Web2.0 Scientific Curiosities
A Cabinet Of Web2.0 Scientific CuriositiesA Cabinet Of Web2.0 Scientific Curiosities
A Cabinet Of Web2.0 Scientific Curiosities
 
DMPTool Overview for UC Merced Research Week
DMPTool Overview for UC Merced Research WeekDMPTool Overview for UC Merced Research Week
DMPTool Overview for UC Merced Research Week
 
A structured catalog of open educational datasets
A structured catalog of open educational datasetsA structured catalog of open educational datasets
A structured catalog of open educational datasets
 
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
 
UCLA: Data Management for Scientists
UCLA: Data Management for ScientistsUCLA: Data Management for Scientists
UCLA: Data Management for Scientists
 
Get the Most Out of Your Tools: Data Management Technologies
Get the Most Out of Your Tools: Data Management TechnologiesGet the Most Out of Your Tools: Data Management Technologies
Get the Most Out of Your Tools: Data Management Technologies
 
Semantic Web-based Knowledge Management in Distributed Systems
Semantic Web-based Knowledge Management in Distributed SystemsSemantic Web-based Knowledge Management in Distributed Systems
Semantic Web-based Knowledge Management in Distributed Systems
 
Data-Ed Online: Structuring Your Unstructured Data Document & Content Management
Data-Ed Online: Structuring Your Unstructured Data Document & Content ManagementData-Ed Online: Structuring Your Unstructured Data Document & Content Management
Data-Ed Online: Structuring Your Unstructured Data Document & Content Management
 
UC Santa Cruz: Data Management for Scientists
UC Santa Cruz: Data Management for ScientistsUC Santa Cruz: Data Management for Scientists
UC Santa Cruz: Data Management for Scientists
 
Data Herding for Scientists - IGERT Symposium at UF
Data Herding for Scientists - IGERT Symposium at UFData Herding for Scientists - IGERT Symposium at UF
Data Herding for Scientists - IGERT Symposium at UF
 
From Data to Knowledge - Profiling & Interlinking Web Datasets
From Data to Knowledge - Profiling & Interlinking Web DatasetsFrom Data to Knowledge - Profiling & Interlinking Web Datasets
From Data to Knowledge - Profiling & Interlinking Web Datasets
 

Similar to Digital Curation for Excel (DCXL)

DataUp: An overview for the DataONE Users Group
DataUp: An overview for the DataONE Users GroupDataUp: An overview for the DataONE Users Group
DataUp: An overview for the DataONE Users GroupCarly Strasser
 
Why manage research data?
Why manage research data?Why manage research data?
Why manage research data?Graham Pryor
 
Graham Pryor
Graham PryorGraham Pryor
Graham PryorEduserv
 
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012Lee Dirks
 
DataUp Overview: AGU 2012
DataUp Overview: AGU 2012DataUp Overview: AGU 2012
DataUp Overview: AGU 2012Carly Strasser
 
Tragedy of the (Data) Commons
Tragedy of the (Data) CommonsTragedy of the (Data) Commons
Tragedy of the (Data) CommonsJames Hendler
 
Supporting Libraries in Leading the Way in Research Data Management
Supporting Libraries in Leading the Way in Research Data ManagementSupporting Libraries in Leading the Way in Research Data Management
Supporting Libraries in Leading the Way in Research Data ManagementMarieke Guy
 
How to clean data less through Linked (Open Data) approach?
How to clean data less through Linked (Open Data) approach?How to clean data less through Linked (Open Data) approach?
How to clean data less through Linked (Open Data) approach?andrea huang
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
 
EPA 2013 Air Sensors Meeting Big Data Talk
EPA 2013 Air Sensors Meeting Big Data TalkEPA 2013 Air Sensors Meeting Big Data Talk
EPA 2013 Air Sensors Meeting Big Data TalkAdina Chuang Howe
 
XLDB South America Keynote: eScience Institute and Myria
XLDB South America Keynote: eScience Institute and MyriaXLDB South America Keynote: eScience Institute and Myria
XLDB South America Keynote: eScience Institute and MyriaUniversity of Washington
 
accelerating-data-driven
accelerating-data-drivenaccelerating-data-driven
accelerating-data-drivenJoshua Chudy
 
Claudia Bauzer Medeiros Digital preservation – caring for our data to foster...
Claudia Bauzer Medeiros  Digital preservation – caring for our data to foster...Claudia Bauzer Medeiros  Digital preservation – caring for our data to foster...
Claudia Bauzer Medeiros Digital preservation – caring for our data to foster...Beniamino Murgante
 
Accelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsAccelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsArcadia Data
 
Accelerating Discovery via Science Services
Accelerating Discovery via Science ServicesAccelerating Discovery via Science Services
Accelerating Discovery via Science ServicesIan Foster
 
Data Management: The Current Landscape
Data Management: The Current LandscapeData Management: The Current Landscape
Data Management: The Current LandscapeCarly Strasser
 
Big and Small Web Data
Big and Small Web DataBig and Small Web Data
Big and Small Web DataMarieke Guy
 

Similar to Digital Curation for Excel (DCXL) (20)

DataUp: An overview for the DataONE Users Group
DataUp: An overview for the DataONE Users GroupDataUp: An overview for the DataONE Users Group
DataUp: An overview for the DataONE Users Group
 
DataUp at ACRL 2013
DataUp at ACRL 2013DataUp at ACRL 2013
DataUp at ACRL 2013
 
Why manage research data?
Why manage research data?Why manage research data?
Why manage research data?
 
Graham Pryor
Graham PryorGraham Pryor
Graham Pryor
 
CAEPIA 2011
CAEPIA 2011CAEPIA 2011
CAEPIA 2011
 
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
 
DataUp Overview: AGU 2012
DataUp Overview: AGU 2012DataUp Overview: AGU 2012
DataUp Overview: AGU 2012
 
Tragedy of the (Data) Commons
Tragedy of the (Data) CommonsTragedy of the (Data) Commons
Tragedy of the (Data) Commons
 
Intro to RDM
Intro to RDMIntro to RDM
Intro to RDM
 
Supporting Libraries in Leading the Way in Research Data Management
Supporting Libraries in Leading the Way in Research Data ManagementSupporting Libraries in Leading the Way in Research Data Management
Supporting Libraries in Leading the Way in Research Data Management
 
How to clean data less through Linked (Open Data) approach?
How to clean data less through Linked (Open Data) approach?How to clean data less through Linked (Open Data) approach?
How to clean data less through Linked (Open Data) approach?
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
EPA 2013 Air Sensors Meeting Big Data Talk
EPA 2013 Air Sensors Meeting Big Data TalkEPA 2013 Air Sensors Meeting Big Data Talk
EPA 2013 Air Sensors Meeting Big Data Talk
 
XLDB South America Keynote: eScience Institute and Myria
XLDB South America Keynote: eScience Institute and MyriaXLDB South America Keynote: eScience Institute and Myria
XLDB South America Keynote: eScience Institute and Myria
 
accelerating-data-driven
accelerating-data-drivenaccelerating-data-driven
accelerating-data-driven
 
Claudia Bauzer Medeiros Digital preservation – caring for our data to foster...
Claudia Bauzer Medeiros  Digital preservation – caring for our data to foster...Claudia Bauzer Medeiros  Digital preservation – caring for our data to foster...
Claudia Bauzer Medeiros Digital preservation – caring for our data to foster...
 
Accelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsAccelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time Analytics
 
Accelerating Discovery via Science Services
Accelerating Discovery via Science ServicesAccelerating Discovery via Science Services
Accelerating Discovery via Science Services
 
Data Management: The Current Landscape
Data Management: The Current LandscapeData Management: The Current Landscape
Data Management: The Current Landscape
 
Big and Small Web Data
Big and Small Web DataBig and Small Web Data
Big and Small Web Data
 

Digital Curation for Excel (DCXL)

  • 1. DCXL: Digital Curation for Excel. Funders: Gordon & Betty Moore Foundation, Microsoft Research. Carly Strasser, UC3, California Digital Library, carly.strasser@ucop.edu. 22 Sept 2011, UC3 Webinar Series, California Digital Library.
  • 2. Community Engagement; Build on existing cyberinfrastructure; Create new cyberinfrastructure; Support communities.
  • 3. Roadmap: 1. An overview: why is DCXL needed? 2. Goals of DCXL project. 3. Progress & future plans. 4. How to get involved in DCXL.
  • 4. Digital data + complex workflows
  • 5. Data; Models; Maximum likelihood estimation; Matrix models; Images; Tables; Paper.
  • 6. UGLY TRUTH: Most Earth | Environmental | Ecological scientists… are not taught data management; don’t know what metadata are; can’t name data centers or repositories; don’t share data publicly or store it in an archive; aren’t convinced they should share data. (5shortessays.blogspot.com)
  • 7. 2 tables; Random notes. From Stephanie Hampton (2010), ESA Workshop on Best Practices.
  • 8. Wash Cres Lake Dec 15 Dont_Use.xls. From Stephanie Hampton (2010), ESA Workshop on Best Practices.
  • 9. Collaboration and Data Sharing
  • 11. The path of research products: Data; Metadata. Recreated from Klump et al. 2006. (www.noaa.gov, www.collectionconnection.alcts.ala.org, www.flickr.com/photos/csessums, blog.disorder2order.com, blog.seattlepi.com)
  • 12. Data Reuse; Data Sharing; Data Management.
  • 13. The path of research products: Data; Metadata. Recreated from Klump et al. 2006. (www.noaa.gov, www.collectionconnection.alcts.ala.org, www.digital-servers.com)
  • 14. Barriers. Cost: time; personnel; software, hardware. (ttatteredntornprims.blogspot.com/, cultblender.wordpress.com)
  • 15. Barriers. Cost: time, personnel, software, hardware. Culture of Science: not the norm; lack of training; disparate data. (free-photos.biz)
  • 16. Barriers. Cost: time, personnel, software, hardware. Culture of Science. Loss of rights or benefits: misuse of data; missed opportunities; conflict. (wattsupwiththat.com, colouringbook.org)
  • 17. Barriers. Cost: time, personnel, software, hardware. Culture of Science. Loss of rights or benefits. Lack of incentives: time consuming & expensive; reward structure; few requirements. (georgevanantwerp.com)
  • 18. Roadmap: 1. An overview: why is DCXL needed? 2. DCXL project overview. 3. Progress & future plans. 4. How to get involved in DCXL.
  • 19. DCXL Project Goals: “A transformation in the conduct of a segment of scientific research by enabling and promoting publishing, sharing, and archiving of tabular data.” Increase interoperability (= sharing), publishability (= publishing), and archivability (= archiving). Focus on atmospheric, ecological, hydrological, and oceanographic data.
  • 20. DCXL Project Goals: Open Source & Free Excel Add-in. Add-in: software program that extends the capabilities of larger programs; complements basic Excel functionality. From www.webopedia.com. (www.ablebits.com)
  • 21. DCXL Add-in Goals: Archiving; Sharing; Publishing. (Diagram labels: Easier, Harder.)
  • 22. DCXL Project Deliverables: Excel add-in; publicly available source code; technical documentation; end user documentation; publicly available requirements; community. (storageplusgulfport.com)
  • 23. DCXL Project Outcomes: enable citation & allow credit; enable policy enactment; enable re-use by eliminating barriers; save time for researcher; encourage creation of extensions.
  • 24. Process: Assess needs. Quantitative: surveys.
  • 25. Process: Assess needs. Quantitative: surveys; quick poll.
  • 26. Process: Assess needs. Quantitative: surveys; quick poll. Qualitative: interviews.
  • 27. Process: Assess needs; Gather requirements. Recruitment tools: DCXL/data management seminars; listservs & email; blog, Facebook, Twitter; face-to-face interactions; flyers.
  • 28. Process: Assess needs; Gather requirements. Locations: conferences; UC campus visits; remote/web-based.
  • 29. Process: Assess needs; Gather requirements. Stakeholders & contributors: libraries; scientists; repositories; experts (MSR, GBMF); personnel on related projects.
  • 30. Process (diagram). Groups: CDL; Libraries; Data Centers; Scientists; Related projects; Funders. Channels: social media, emails, campus visits; email; seminars; flyers; quick poll; survey; interview. Output: Requirements.
  • 31. Implementation: Assess needs; Gather requirements; Build requirements document.
  • 32. Implementation: Assess needs; Gather requirements; Build requirements document; Build community (libraries, scientists, repositories, programmers/developers).
  • 33. Timeline
      26 Sept: DCXL Kickoff Meeting
      7 Oct: Finalize Requirements Gathering Framework
      9 Nov: 1st draft of Requirements to MSR
      30 Nov: 2nd draft of Requirements to MSR
      5-9 Dec: AGU Meeting, San Francisco
      15 Dec: Final Requirements to MSR
      2012
      16 Jan: Receive Excel Add-in Version 1
      23 Jan: Rollout Excel Add-in Version 1
      16-19 Feb: AAAS meeting: Add-in user testing
      20-24 Feb: Ocean Sciences meeting: Add-in user testing
      26 Feb: 1st draft of updated Requirements based on Version 1 to MSR
      2 Apr: Deliver updated Requirements based on Version 1 to MSR
      28 May: Receive Excel Add-in Version 2
      29 May-24 Jun: User testing of Version 2
      25 Jun: Rollout Excel Add-in Version 2
      7-10 July: CSEE meeting: Add-in debut & demo
      13 July: Final code, technical documentation, and requirements published
      31 July: End user documentation published
  • 34. Roadmap: 1. An overview: why is DCXL needed? 2. DCXL project overview. 3. Progress & future plans. 4. How to get involved in DCXL.
  • 35. Ecological Society of America, Summer 2011 Meeting
  • 36. ESA Overview. Everyone uses Excel: most use Excel for organizing raw data; most import spreadsheets into other programs for analysis; ~75% are embarrassed about using Excel. Excitement about open source. Minimal knowledge about data management, organization, and archiving. 55 surveys from diverse group.
  • 37. Operating System (bar chart; categories: Mac, PC, Linux)
  • 38. Use Excel for... (bar chart; categories: Sharing, Other Analyses, Statistics, Visualization, Organization; x-axis: # respondents, out of 55)
  • 39. How often do you use Excel? (bar chart; y-axis: # respondents; x-axis labels as extracted: Never, Rarely, Every day, Every day)
  • 40. What features are used in Excel? (bar chart; categories: Comments, Cell shading, Macros, Embedded formulas, Headers, Pivot Tables, Multiple Tabs, Multiple Tables; x-axis: percent)
  • 41. American Fisheries Society, Summer 2011 Meeting. (Ray Troll, trollart.com)
  • 42. AFS Overview: everyone uses Excel; most use it only for data organization and sharing; 36 surveys from diverse group; heavy MS Access use; 100% PC.
  • 43. How often do you use Excel? (bar chart; y-axis: # respondents; categories: Rarely, Every day)
  • 44. Tasks performed in Excel? (bar chart; categories: Sharing data, Simple Calculations, Statistics, Visualizing data, Organizing data; x-axis: % respondents, n = 36)
  • 45. What should the add-in help you do? (bar chart; y-axis: % respondents; categories: Organize my data for my own use, Organize my data for others to use more easily, Archive my data, Create metadata, Share my data publicly, No opinion)
  • 46. AFS Overview: everyone uses Excel; most use it only for data organization and sharing; 36 surveys from diverse group; heavy MS Access use; 100% PC; data hoarders. (Myoverstuffedbookshelf.blogspot.com)
  • 47. Roadmap: 1. An overview: why is DCXL needed? 2. DCXL project overview. 3. Progress & future plans. 4. How to get involved in DCXL.
  • 48. Get Involved: dcxl.cdlib.org. Now: general info; blog; forum; calendar. Later: requirements; documentation.
  • 49. Get Involved: @dcxlCDL; www.facebook.com/DCXLatCDL
  • 50. Acknowledgements. CDL: Rachael Hu, Trisha Cruse, John Kunze, Tracy Seneca. MSR: Lee Dirks. GBMF: Chris Mentzel. Carly Strasser, carly.strasser@ucop.edu