18. the status quo tolerates poor communication of findings
Ioannidis J.P.A. et al. Repeatability of published microarray gene expression analyses. Nature Genetics 41, 149-155 (2009) | doi:10.1038/ng.295
19. the status quo tolerates poor communication of findings
[Pie chart of reproducibility outcomes. Slice labels: can reproduce in principle; can reproduce w/ discrepancies; can reproduce from processed data w/ discrepancies; can reproduce partially; cannot reproduce. Slice values: 6%, 8%, 11%, 21%, 54%, with "cannot reproduce" at 54%.]
Ioannidis J.P.A. et al. Repeatability of published microarray gene expression analyses. Nature Genetics 41, 149-155 (2009) | doi:10.1038/ng.295
21. often what is in principle reproducible, is not practically reproducible
208,294,724 datapoints
124 pages supplemental material
?? lines unobtainable source code
?? version or architecture of statistical analysis program (R)
innumerable R packages and package dependencies
key R package “ClaNC” no longer available
442 citations
unidentified publication
‣ from journal with 5 year impact factor of 28
‣ article freely available for download
‣ data freely available for download
22. how are we to move science forward if we cannot understand what was done previously?
34. scientific method
1. define a question
2. gather information and resources (background research)
3. form a hypothesis
4. test hypothesis experimentally
5. analyze experimental data
6. draw conclusions based on data
7. publish results
8. retest (frequently done by other scientists)
36. scientific method
1. define a question
2. gather information and resources (background research)
3. form a hypothesis
4. test hypothesis experimentally
5. analyze experimental data
6. draw conclusions based on data
7. publish results
8. retest (frequently done by other scientists)
[In the slide layout, steps 4 through 7 appear a second time]
41. [Workflow diagram]
experimentally generate data @ the bench or from a clinical cohort
store on local server
analyze on local machine
write a document
submit to journal
sent to reviewers as pdf
accepted & digitally typeset
static pdf representation or static html representation
printed on paper
54. clearScience
re-imagining scientific communication
allow consumption of content at a variety of levels of complexity and abstraction
leverage (open) RESTful APIs
67. Acknowledgements
Sage Bionetworks
David Burdick - Rockstar Engineer
Stephen Friend - President and CEO
Erich S. Huang - Director of Cancer Research
Mike Kellen - Director of Technology
External Partners
Myles Axton - Nature Genetics
Phil Bourne - PLoS Computational Biology
Josh Greenberg - Alfred P. Sloan Foundation
Kelly LaMarco - Science Translational Medicine
Ian Mulvaney - eLife Sciences
Eric Schadt - Open Network Biology
Editor's notes
Welcome everyone -- and thanks for sticking around with me for the Tuesday afternoon sessions -- and for making the trek down into the dungeon where they squirrel away the scientists

Today I’d like to introduce you to a pilot program we have taken on at Sage Bionetworks called ‘clearScience’ -- and how we hope it will help positively disrupt scientific communication as we know it today
In order to set the stage, I’d like to start with a story
a story at the extreme of the spectrum that has become the status quo within scientific research

Dr. Anil Potti was a junior faculty member at Duke University where he was involved in studying genomic signatures of response to different cancer therapies

he spent 5 years carrying out this research

As explored in a 60 Minutes piece this spring, Dr. Potti has since been widely accused of scientific misconduct in his work, which has led to the retraction of dozens of research papers in prominent scientific journals

And the fallout has been massive

Dr. Potti’s fraudulent research led to 3 clinical trials which enrolled real patients ... patients who were already strapped with having to deal with a cancer diagnosis ... patients who were put at real risk ... unnecessarily
Patients like Juliet Jacobs. Juliet was a patient of Dr. Potti’s and was enrolled in one of the clinical trials that was launched in response to his research. In a chilling recording from the 60 Minutes piece, Dr. Potti is heard telling Mrs. Jacobs “I will help you. Trust me.”
Three months after being reassured by Potti, three months after being told “trust me,” Juliet Jacobs passed on. And her husband is now suing Duke.

Now, I realize that Dr. Potti is an exception -- but the fact that his research was allowed to travel as far as it did before it was finally caught is symptomatic of problems in the way our research is communicated

In fact ...
just this morning I ran across news coverage of another study released through PNAS.

This is the kind of press science has been getting recently. And some wonder why no one trusts scientific findings anymore.
In a 274-page report written specifically in response to the Duke scandal, scientific leadership at the Institute of Medicine called for researchers to be more “open” and “transparent” with their work

But what do we, and what do they, mean by ‘open’?
usually with respect to code

the term ‘open source’ was coined in 1998 ... but the concept stretches back to the ‘free software movement’ of the early 1980s
oftentimes open data is ill-defined and simply means that the data is available for download somewhere, under some unspecified restriction, and in some unspecified format
The UK Research Council has been a champion of this movement through its acceptance of the Finch Report earlier this year, which provided a number of suggestions regarding expansion of access to research publications
similarly, the access2research petition was a grassroots movement calling on the Obama Administration to provide free access to articles arising from US taxpayer-funded research

the petition needed 25k signatures and eclipsed that mark in less than 2 weeks

most science is publicly funded and it is meant to provide a base from which others can build

we’re still awaiting President Obama’s response, but at least we have been able to demonstrate a level of support for this issue and hopefully we will be following the UK’s lead in this matter

but just because we have access to something does not necessarily make it ... accessible
Venture capitalists believe that at least 50% of the studies published in top-tier life science journals cannot be repeated

This has resulted in massive underfunding of biomedical startups from venture capital funds

If this were just a perception of venture capitalists, that would be one thing ...
... but in an analysis published in Nature Genetics, it was found that over 50% of microarray studies (also published in Nature Genetics) were not reproducible

‘cannot reproduce’ includes
‣ data not available
‣ software not available
‣ methods unclear
‣ different result altogether

reiterates how the status quo tolerates poor communication of science

let’s explore one example
a colleague of mine -- an MD / PhD and self-described ‘former hacker’ -- tried to reproduce the results

even with the article and data available, the methods were not described in enough detail to recapitulate the findings

from a journal with an impact factor of 28

he was not able to reconstruct clusters that are now being cited widely
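The missing pieces in this example (the unknown version and architecture of the analysis program, the untracked package dependencies) are the kind of detail that can be recorded automatically next to the results. The sketch below is illustrative only: it is written in Python rather than the R used in the publication, and the function name `capture_environment` and the layout of the record are assumptions, not something shown in the talk.

```python
import json
import platform
from importlib import metadata


def capture_environment(packages):
    """Describe the interpreter, machine, and package versions in one dict.

    `packages` is a list of installed distribution names. A package that
    is not installed is recorded as None instead of raising an error.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "language": "python",
        "version": platform.python_version(),
        "architecture": platform.machine(),
        "system": platform.system(),
        "packages": versions,
    }


if __name__ == "__main__":
    # The package names here are placeholders; list whatever the analysis imports.
    record = capture_environment(["pip", "hypothetical-missing-package"])
    print(json.dumps(record, indent=2))
```

Saving a record like this alongside the data and code would at least replace the "??" entries on the slide with concrete values.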
great editorial in Nature discussing how too many sloppy mistakes are creeping into scientific papers.
this process is repeated until sufficient evidence is in place to move on

we are supposed to be able to build upon previous research

the iterative cycle is the most important part of the method

this is where innovation occurs
publishing is becoming the end-all-be-all

incentives drive researchers to this scientific ‘end game’
but science is not a finite game

as the scientific method spelled out years ago, it is not a finite game, but an infinite one.
Let’s take a closer look at our current common practice -- the status quo, if you will

Generate data
which is stored on a local server
and analyzed on a local machine
results are summarized in a document
submitted to a journal
and sent to reviewers as a static image
if the researcher is lucky enough to have their findings published, it is done so in a static representation of the work
and, we think, needlessly so

the technology is here -- it just hasn’t been put together (‘packaged’) in a compelling enough way to force its adoption

‘peer review’ in the data-intensive sciences is almost non-existent

the pieces needed to truly review the science are not currently there
in particular in biomedical research, where I have done most of my work.

The human body is an amazing machine

But it is a complex and uncontrolled system

There are 10 trillion cells in the body, each with its own genetic encoding

each interacting with its local environment

which is influenced by its macro environment

insights in our space are very interconnected with other experiments
many scientists do not have sufficient background in communication

if they were anything like me, they avoided those classes like the plague
so we need to make it easier
Mock-up of a clearScience interface

powered by Synapse -- which is an open source smart object repository through which we leverage APIs for
Compute (in this case AWS AMI with RStudio pre-installed)
Code (GitHub)
Data (Synapse)

All of the individual pieces are currently in place within Synapse, and we have engineering support from the Sloan Foundation for engineers to work on the rendering of pages like this

Enter an analysis on an abstract page -- a high level overview of the project just as in publications today
Display the building blocks that made the analysis happen
This includes full user-defined analysis provenance through Synapse
“Hand me your cloud computer”
Allows users to step through an analysis at whatever level of granularity they like
can consume a publication as they would now
can step through the analysis as conducted
can interactively explore data as they please -- even more formally ‘fork’ an entire project
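To make the idea of analysis provenance concrete, here is a minimal sketch of how the building blocks of an analysis (data, code, compute) could be tied together so a reader can step back through them. This is not the Synapse API: the `AnalysisStep` class, the `lineage` function, and every identifier in the example are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AnalysisStep:
    """One step of an analysis and the pieces needed to rerun it."""
    name: str
    data: str      # identifier of the input data (hypothetical)
    code: str      # e.g. a repository reference plus commit (hypothetical)
    compute: str   # e.g. a machine image name (hypothetical)
    depends_on: Tuple[str, ...] = ()


def lineage(steps: Dict[str, AnalysisStep], target: str) -> List[str]:
    """Return the step names needed to reproduce `target`, dependencies first."""
    ordered: List[str] = []
    seen = set()

    def visit(name: str) -> None:
        if name in seen:
            return
        seen.add(name)
        for parent in steps[name].depends_on:
            visit(parent)
        ordered.append(name)

    visit(target)
    return ordered


# Example with made-up identifiers: a figure that depends on a clustering
# step, which in turn depends on a normalization step.
example_steps = {
    "normalize": AnalysisStep("normalize", "data:raw-expression",
                              "example-repo@commit-1", "example-image"),
    "cluster": AnalysisStep("cluster", "data:normalized",
                            "example-repo@commit-2", "example-image",
                            depends_on=("normalize",)),
    "figure": AnalysisStep("figure", "data:clusters",
                           "example-repo@commit-3", "example-image",
                           depends_on=("cluster",)),
}
```

Calling `lineage(example_steps, "figure")` walks back from the published figure to the raw-data step, which is the "step through the analysis as conducted" view described above.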
again, the tools and technologies are there -- we simply need to make it easy to do the right thing. To do good science. To do clearScience.

Thank you