We present Caffe con Troll (CcT), a fully compatible end-to-end version of the popular framework Caffe with rebuilt internals. We built CcT to examine the performance characteristics of training and deploying general-purpose convolutional neural networks across different hardware architectures. We find that, by employing standard batching optimizations for CPU training, we achieve a 4.5x throughput improvement over Caffe on popular networks like CaffeNet. Moreover, with these improvements, the end-to-end training time for CNNs is directly proportional to the FLOPS delivered by the CPU, which enables us to efficiently train hybrid CPU-GPU systems for CNNs.
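As a rough illustration of the kind of batching optimization the paragraph above refers to, here is a minimal NumPy sketch (not CcT's actual implementation) of the standard im2col lowering applied to a whole mini-batch, so the convolution becomes one large GEMM that a multithreaded BLAS can drive close to peak FLOPS.

```python
# Minimal sketch of batched convolution lowering ("im2col" + one big GEMM).
# Naive Python loops are kept for clarity; real implementations do the
# lowering in optimized native code.
import numpy as np

def im2col(batch, k):
    """Unroll k x k patches of every image in `batch` (N, C, H, W) into rows."""
    n, c, h, w = batch.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = np.empty((n * out_h * out_w, c * k * k), dtype=batch.dtype)
    row = 0
    for img in batch:
        for i in range(out_h):
            for j in range(out_w):
                cols[row] = img[:, i:i + k, j:j + k].ravel()
                row += 1
    return cols, out_h, out_w

def conv_forward_batched(batch, kernels):
    """kernels: (num_filters, C, k, k). One GEMM covers the whole mini-batch."""
    num_f, c, k, _ = kernels.shape
    cols, out_h, out_w = im2col(batch, k)
    weights = kernels.reshape(num_f, -1)       # (num_f, C*k*k)
    out = cols @ weights.T                     # single large matrix multiply
    n = batch.shape[0]
    return out.reshape(n, out_h, out_w, num_f).transpose(0, 3, 1, 2)

# Example: 64 images of shape (3, 32, 32) convolved with 16 5x5 filters.
x = np.random.randn(64, 3, 32, 32).astype(np.float32)
w = np.random.randn(16, 3, 5, 5).astype(np.float32)
y = conv_forward_batched(x, w)                 # (64, 16, 28, 28)
```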
5. KBC Applications
Science is built up with facts, as a house is with stones. - Jules Henri Poincaré
Example: Paleontology
[Diagram: scientific facts (taxon, rock, age, location) are aggregated into a biodiversity curve, a macroscopic view that yields insights and knowledge, e.g. on the impact of climate change on biodiversity.]
7. KBC Applications
Example: Paleontology (continued)
[Diagram: input sources published from 1570 to 2015 feed a KB construction process that produces a knowledge base (KB), which supports the biodiversity analysis above.]
8. KBC Applications
[Diagram: KBC across domains. Paleontology (taxon, rock, age, location) for climate & biodiversity; Genomics (gene, drug, disease) for health & medicine; Dark Web (server, service, price, location) for social good. Each domain feeds its own knowledge base.]
10. Challenge of Manual KBC
Effort on Manual KBC (Paleontology: taxon, rock, age, location)
Sepkoski (1982) manually compiled a compendium of 3300 animal families with 396 references in his monograph.
300 professional volunteers (1998-present) spent 8 continuous human years to compile PaleoDB with 55,479 references.
[Chart: number of new paleontology references per year, 2010-2013. Roughly 100K new references per year, i.e. 16 continuous human years every year just to keep up to date.]
13. Case Study - PaleoDeepDive
The Goal: Extract paleobiological facts to build a higher-coverage fossil record.
Example input sentence: "T. Rex are found dating to the upper Cretaceous."
DeepDive extraction: Appears(“T. Rex”, “Cretaceous”)
14. Case Study - PaleoDeepDive
PaleoDB (human-created paleobiology database): 55K documents, 329 geoscientists, 8 years, 126K fossil mentions, 1M relations.
PaleoDeepDive (machine-created paleobiology database): 300K documents, 2000 machine cores, 46 machine years, 3M fossil mentions, 2.1M relations (>90% precision).
[Figure: biodiversity curves derived from both databases.]
On the same relation, PaleoDeepDive achieves precision equal to (or sometimes better than) that of professional human volunteers.
15. Validation on Real Applications
Domains: Paleontology, Geology, Pharmacogenomics, Genomics, Wikipedia-like Relations, Dark Web, Applied Physics.
"It's a little scary, the machines are getting that good."
Recall: 2-10x more extractions than humans. Precision: 92%-97% (human ~84%-92%).
Highest score out of 18 teams and 65 submissions (2nd highest is also DeepDive).
Goal: enable easy engineering of high-quality KBC systems by thinking about features, not algorithms.
16. Can we support more sophisticated image processing in DeepDive?
17. Go Beyond Text-Processing
[Images with questions: "What kind of dinosaur is this?" "Does this patient have short fingers?" "Is this sea star found in 2014 sick?" "What's the clinical outcome of this patient?"]
Images are important to many scientific questions.
[User] Can I run Deep Learning on my datasets with DeepDive?
18. Just before we start the run…
On which machine should we run? CPU or GPU?
"I have a GPU cluster." "I have 5000 CPU cores." "I have $100K to spend on the cloud."
EC2 c4.4xlarge: 8 cores @ 2.90GHz, 0.7 TFLOPS. EC2 g2.2xlarge: 1.5K cores @ 800MHz, 1.2 TFLOPS.
Not a 10x gap? Can we close this gap?
25. One of the four shallow ideas…
[Diagram: 3 CPU cores, 3 images; Strategy 1 vs. Strategy 2 for dividing the work across cores.]
If the amount of data is too small for each core, the process might not be CPU bound.
For AlexNet on Haswell CPUs, Strategy 2 is 3-4x faster.
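The slide does not spell out the two strategies, so the sketch below is only one plausible reading (the names and the partitioning scheme are my assumptions, not CcT's): giving each core its own small per-image GEMM versus stacking the images into one large GEMM that a multithreaded BLAS can split across all cores, which matches the "too small for each core" observation above.

```python
# Hedged sketch of two ways to split convolution GEMMs across CPU cores.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def strategy_1(lowered_images, weights, n_cores=3):
    """One small GEMM per image, one image per core (may underuse the cores)."""
    with ThreadPoolExecutor(max_workers=n_cores) as pool:
        return list(pool.map(lambda m: m @ weights.T, lowered_images))

def strategy_2(lowered_images, weights):
    """Stack all images and issue a single large GEMM (BLAS uses every core)."""
    stacked = np.vstack(lowered_images)
    out = stacked @ weights.T
    sizes = [m.shape[0] for m in lowered_images]
    return np.split(out, np.cumsum(sizes)[:-1])

# Example: 3 images already lowered by im2col, 16 filters of length 75.
imgs = [np.random.randn(784, 75).astype(np.float32) for _ in range(3)]
w = np.random.randn(16, 75).astype(np.float32)
out1 = strategy_1(imgs, w)
out2 = strategy_2(imgs, w)
```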
27. Application 1: Paleontology
Images without high-quality human labels also contain valuable information.
What can we learn from these images without human labels?
[Figure: a fossil image and the name of the fossil appearing in the same document.]
28. Application 1: Paleontology
We apply distant supervision!
Can we build a system that automatically "reads" a paleontology textbook and learns the difference between sponges (Porifera) and shells (Brachiopoda)?
[Diagram: document → classifier]
29. Application 1: Paleontology
Example caption: "Fig. 387,1a-c. *B. rara, Serpukhovian, Kazakhstan, Dzhezgazgan district; a,b, holotype, viewed ventrally, laterally, MGU 31/342, XI (Litvinovich, 1967);"
[Diagram: DeepDive extracts figure-name mentions (e.g., "Fig. 387") and taxon mentions from the text; these extractions provide labels for the figures, which are used to train a CNN and then tested against human labels.]
3K Brachiopoda images, 2K Porifera images; accuracy = 94%.
Hi everyone, thanks for coming.
Today, I am going to talk about a system that we have been building over the last couple of years called DeepDive, and the deep learning engine for it called Caffe con Troll.
DeepDive is a system we built to support a workload called knowledge base construction. I will talk more about DeepDive, but for now you can think of it as a system that takes as input unstructured documents, for example a text document containing natural language, and outputs relations extracted from the input document. For example, given a text like this, DeepDive can extract relations between rock formations and their age and location, as database relations.
For those of you who are familiar with databases, this might sound
like an information extraction task; and for those of you who are
familiar with machine learning and natural language processing,
this might sound like a relation extraction task. DeepDive learned
a lot from these two communities. On the other hand, DeepDive
tries to extend this ability to other sources of input, like tables,
document layout, and even figures.
This focus on the diversity of input sources requires DeepDive to consume a diverse set of features, whether they are produced by user-defined functions over parse trees of natural-language sentences or extracted from images by state-of-the-art image-processing algorithms like deep learning.
In this talk, I will first tell you more about DeepDive and a deep learning engine we built called Caffe con Troll. These two pieces of software are ready for you to download and play with today. Beyond these two pieces, I will also tell you about one of our ongoing directions, which tries to fuse them more tightly together.
Over the last couple of years, we have learned from many of our scientist collaborators that many pressing scientific questions are macroscopic. That is, to get insights and hints into these questions, one often needs to aggregate a huge amount of facts about a certain domain.
One such example can be found in paleontology. When scientists want to understand questions like the impact of climate change on biodiversity, one way to get some insight is to start from a collection of facts about which fossils appear in which rock formations, and the age and location of each rock formation. Given this collection of facts, they can aggregate it by time to get a biodiversity curve, which essentially tells us how many distinct species there are at a given time. From this macroscopic view, geoscientists can get insights that help them understand scientific questions like the one about climate change.
We can see that if we want to support this workflow from scientific facts to scientific insights, we first need to have this collection of facts ready for analysis, ideally in a structured form that we can query with different analytic tools.
However, in practice, many of these facts are not currently organized in a structured way like relational tables in a database; instead, many are published in journal articles or books that can be more than four hundred years old.
In this talk, I will call this collection of scientific facts organized in structured form a knowledge base, and the process of extracting these facts from input sources knowledge base construction (KBC). As we can see here, this KBC step provides one possible starting point for scientists to understand key scientific questions.
This kind of workload does not appear only in paleontology. In our past experience, we have found that a similar KBC process could be useful for a whole range of other domains. For example, in genomics, if we could build a knowledge base relating genes, drugs, and diseases from published journal articles, it could help genomicists better understand drug repurposing or personalized medicine. If we could build a knowledge base by extracting from bad guys' communications on the Web, we could provide opportunities to make this world a better place.
Now that we have seen that KBC could be a useful process, a natural question to ask is:
Can we just do KBC manually?
Let's still use our paleontology example, where our target knowledge is about fossils, rock formations, their age, and their location.
Actually, building such a knowledge base is so important that people have been trying to build it manually.
More than thirty years ago, Sepkoski manually compiled a knowledge base that contains more than three thousand animal families by manually extracting these facts from about four hundred references. This monograph alone has been cited hundreds of times and led to many discoveries.
The importance of compiling such a knowledge base was noticed by the community, and starting about twenty years ago, more than 300 professional paleontologists spent more than eight continuous human years to compile a knowledge base called PaleoDB that contains more than 55 thousand references. This project is also highly successful and has led to more than 200 papers, many of which were published in Nature or Science.
However, this effort of manually constructing knowledge bases has its own limitations. In one of the largest databases of geoscience-related publications, the number of references is growing at a rate of 100 thousand references per year. If we compare this rate with the total size of PaleoDB, it means it might take 16 continuous human years every year just to keep up to date. Although this estimate is pretty rough, it does show how expensive and time-consuming manual KBC can be given the amount of information produced in recent years. This huge number of publications is not unique to geoscience; every year there are millions of new papers published across all fields of science.
Motivated by the sheer amount of data that we need to extract information from, one question
that we are really interested in is
Can we build a machine to read for us?
Here the word "read" could mean a lot of things for human beings, but let's make it precise for a machine. By "read", we mean:
Can we build a machine that takes as input all these input sources, like journal articles, and automatically fills in a knowledge base stored as database tables?
DeepDive is the system we built to make this process easier. Over the last couple of years, we have been building this type of system for a range of domains, so let me tell you more.
One application we built is called PaleoDeepDive.
The goal is to extract paleobiological facts to build higher coverage fossil records.
The input of the system is a collection of journal articles, and the output is a knowledge base containing
information about fossils.
For example, if we see this sentence in a journal article, we expect the system to output a tuple encoding the fact that the dinosaur T. Rex appears in the Cretaceous.
One of the most interesting aspects of PaleoDeepDive is that it extracts relations in exactly the same schema as PaleoDB, the manually curated knowledge base that I just described. This enables us to compare the quality of PaleoDeepDive with that of professional volunteers on the KBC task.
As we just mentioned, PaleoDB is an effort by three hundred professional paleontologists who together spent 8 continuous human years. PaleoDB contains more than 55 thousand documents, 100 thousand fossil mentions, and more than 1 million relations.
On the other hand, PaleoDeepDive is a machine-curated knowledge base that used more than 2000 machine cores and 46 machine years. It processed more than 300 thousand journal articles and extracted three million fossil mentions and 2 million relations.
We can aggregate both knowledge bases into biodiversity curves, and the two curves are highly correlated.
We also conducted double-blind experiments in which scientists labeled the correctness of facts in PaleoDB and PaleoDeepDive, and found that on the same relation PaleoDeepDive achieves precision equal to, and sometimes better than, that of professional human volunteers.
DeepDive has been used to build similar applications across different domains.
At the early stage of developing DeepDive, we ourselves were the developers of the KBC systems and got a lot of help from domain scientists. Some of these systems achieve pretty high quality. The PaleoDeepDive work that I just mentioned was featured in a news article in the July issue of Nature, and we are pretty excited about it. According to that article, some geoscientists were impressed by how high the quality of our system is.
We also developed a KBC system to extract information from the Web, and it produced the highest score in a popular KBC competition among 18 teams.
Now, KBC systems are also developed by domain experts besides us, including applications in domains such as pharmacogenomics and applied physics.
To let these domain experts actually use our system by themselves, DeepDive models the whole process as a large inference problem, and the only thing the user needs to do is keep providing features to the system, without worrying about what algorithm is actually running inside DeepDive.
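To make the "features, not algorithms" contract concrete, here is a hedged, generic sketch of a user-written feature function over a candidate relation mention; the candidate format and the feature names are illustrative only, not DeepDive's exact UDF interface, and the learning and inference that consume these features are left to the system.

```python
# Generic sketch of a feature function for a candidate Appears(fossil, age)
# relation mention; the system, not the user, learns the feature weights.
def appears_features(words, fossil_idx, age_idx):
    """Yield string features describing the words between the two mentions."""
    lo, hi = sorted((fossil_idx, age_idx))
    between = words[lo + 1:hi]
    yield "num_words_between=%d" % len(between)
    yield "words_between=" + "_".join(between)
    for cue in ("found", "dating", "appears"):
        if cue in between:
            yield "contains_" + cue

# Toy candidate built from the running example sentence.
sentence = "T.Rex are found dating to the upper Cretaceous .".split()
print(list(appears_features(sentence, 0, 7)))
# ['num_words_between=6', 'words_between=are_found_dating_to_the_upper',
#  'contains_found', 'contains_dating']
```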
One thing these applications have in common is that in their early stages many of them could only extract information from text and tables. After we understood how to do these tasks for text and tables, we found that we needed to understand images better.
Although DeepDive can currently support some textual extraction from images, this functionality could be significantly extended and improved. We are interested in figures and images because such sources are important to many scientific questions.
For example, if we could build high-quality image recognition tools with DeepDive, we could automatically classify fossils into different classes or orders; it could also help genomicists automatically identify the phenotypes of a given patient and use this information together with the patient's genotype to decide on a treatment plan.
We talked with a lot of our scientist collaborators about what type of information they want out of images, and one of the most frequent first questions they ask is: can you guys just run deep learning on my set of images and classify them?
At first, we thought this requirement should be easy to support. There are a lot of awesome tools out there, like Caffe or Theano, that make it really easy to specify a deep learning task and run it.
So we started to set up the machines to run this for them, but just before we started the run, one question came up: on which machine should we run? Should we run on a lot of GPUs or just a lot of CPUs?
So we looked at existing papers and systems to try to understand this question, but what we found was a mixed set of information. For some systems it is not uncommon to run 10 times faster on a GPU than on a CPU; but there are also many successful systems built in industry that take advantage of a cluster of CPUs quite well. Moreover, users often have access to a diverse set of resources: some have a GPU cluster, some have a lot of CPU cores, and some have cloud credits they can spend. When we put all this information together, we only got more confused about the question.
So we decided to look more deeply into this question, and we started to investigate the difference between CPUs and GPUs for running deep learning workloads.
If we compare a CPU and a GPU that are available on Amazon's cloud, we can see that the difference in the number of floating-point operations they can do per second is not that large. Therefore, a natural question to ask is: since there is no terrible gap in peak FLOPS, can we actually achieve this peak? Or is there anything special about the CPU that prevents us from achieving it?
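For concreteness, a back-of-the-envelope check of the peak numbers quoted on the earlier slide (c4.4xlarge vs. g2.2xlarge); the FLOPs-per-cycle figures are my assumptions, not from the talk.

```python
# Peak-FLOPS estimate for the two EC2 instances on the slide.
# CPU figure assumes AVX2 FMA units (2 x 8 single-precision FLOPs per cycle
# per core); GPU figure assumes 1 FLOP per cycle per CUDA core.
cpu_cores, cpu_ghz, cpu_flops_per_cycle = 8, 2.90, 32      # c4.4xlarge
gpu_cores, gpu_ghz, gpu_flops_per_cycle = 1536, 0.80, 1    # g2.2xlarge

cpu_tflops = cpu_cores * cpu_ghz * cpu_flops_per_cycle / 1e3   # ~0.74
gpu_tflops = gpu_cores * gpu_ghz * gpu_flops_per_cycle / 1e3   # ~1.23

print("c4.4xlarge peak: %.2f TFLOPS" % cpu_tflops)
print("g2.2xlarge peak: %.2f TFLOPS" % gpu_tflops)
print("ratio: %.1fx (far from 10x)" % (gpu_tflops / cpu_tflops))
```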
Therefore, we built a very simple prototype system called Caffe con Troll to study this question.
We designed Caffe con Troll as a prototype that takes the same input as the popular deep learning framework Caffe and produces the same output.
We find that the performance of the CPU can be optimized to reach nearly 80% of peak FLOPS. With this, our implementation on a single CPU can be more than 5 times faster than Caffe.
Actually, when we add one more CPU to the machine, we get almost a 2x speedup, and these two 8-core Haswell CPUs can match the speed of a single GPU available on Amazon's cloud.
But a more interesting result is that, with our implementation, the speed of running deep learning on a CPU or a GPU is proportional to the FLOPS the device can deliver. This gives us a very simple rule of thumb to guide our users when they ask what type of device they need.
Also, because the CPU is not that slow compared with the GPU, it makes sense to ask how to run deep learning on a machine with both CPUs and GPUs together.
Surprisingly, achieving this speedup is actually pretty simple.
Recall that one of our motivations for building Caffe con Troll was to help our DeepDive users process their images. Now that we have a prototype implementation for running deep learning, how are we going to use it for knowledge base construction?
The answer to this question is still under exploration, so I will just tell you about a very preliminary application that we are building.
One interesting direction that we are exploring is how to combine deep learning with DeepDive. If we are able to run deep learning efficiently, how can it help DeepDive and our scientist users?
One observation is that state-of-the-art image recognition methods often require a corpus with human labels. However, images without high-quality human labels might also contain valuable information.
Take this page from a paleontology journal article, for example: although there are no human labels telling us the name of the fossil, the name actually appears in the same document.
Therefore, one question that we are interested in is: what can we learn from images that have no human labels but are surrounded by rich information?
More concretely, can we build a system that automatically reads a paleontology textbook and learns the difference between different classes of fossils, like sponges and shells? If such a system is possible, the input would be a set of journal articles, and the output would be a classifier that can distinguish between different classes of fossils.
Our hypothesis is that we can extend the idea of distant supervision from text processing to automatically generate labels for image applications.
To study this hypothesis, we have built a very simple prototype that uses DeepDive to extract relations between figure numbers and fossil names from the text and uses them to label the figures as training examples. We then train a convolutional neural network on these distantly-generated labels. Some early results show that we can achieve pretty high accuracy on this simple task.
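A minimal sketch (mine, not the prototype's code) of the distant-supervision join just described: text-side extractions link a figure number to a taxon, the extracted figure images carry the same figure number, and joining the two yields labeled training examples for the CNN. All identifiers and paths below are hypothetical.

```python
# Join text extractions with extracted figure images to build training labels.
def distant_labels(text_extractions, figure_images):
    """
    text_extractions: iterable of (doc_id, figure_no, taxon),
        e.g. ("treatise_vol2", "Fig. 387", "Brachiopoda")
    figure_images:    dict {(doc_id, figure_no): image_path}
    Returns a list of (image_path, taxon) training examples.
    """
    examples = []
    for doc_id, figure_no, taxon in text_extractions:
        img = figure_images.get((doc_id, figure_no))
        if img is not None:
            examples.append((img, taxon))
    return examples

# Toy usage with made-up identifiers.
text_rels = [("treatise_vol2", "Fig. 387", "Brachiopoda")]
figures = {("treatise_vol2", "Fig. 387"): "figs/treatise_vol2_387.png"}
train_set = distant_labels(text_rels, figures)   # feed this to a CNN trainer
```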
This result is pretty preliminary, but we hope to explore further in this direction to understand what framework we should provide to users so that they can build similar applications easily.
Thank you for your attention. Both systems that I talked about today can be downloaded from their websites, and we are actively working on understanding how to fuse these two systems together to support applications that require inference across text and images.