5. Accidental complexity in existing tools
Pig
Hive
The query language is different than the programming language
6. When the query tool is separate from the programming language
Friction when embedding custom operations
Interlacing queries with regular application logic is unnatural
Generating queries dynamically is difficult
7. Clojure
General purpose programming language
Dialect of Lisp that compiles to Java bytecode
“Programmable programming language”: Easy to build domain-specific languages (DSLs) in Clojure
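As a rough illustration of the “programmable programming language” idea, the sketch below uses a macro to add a tiny query-like form to plain Clojure. The names `select-where`, `people`, and the implicit `row` symbol are hypothetical, invented for this example; they are not part of Clojure or Cascalog.

```clojure
;; Hypothetical mini-DSL: keep the rows (maps) for which `condition` holds.
;; Inside `condition`, the symbol `row` refers to the current row.
(defmacro select-where
  [rows condition]
  `(filter (fn [~'row] ~condition) ~rows))

;; Hypothetical sample data.
(def people
  [{:name "alice" :age 28}
   {:name "david" :age 25}])

;; Reads like a query, but expands to an ordinary call to `filter`.
(select-where people (< (:age row) 27))
;; => ({:name "david", :age 25})
```

The macro rewrites the form at compile time, so the new syntax costs nothing at runtime and composes with every other Clojure function.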
10. Cascalog
Full power of a general-purpose programming language available at all times
Cascalog is a Clojure library
Example query: (?<- (stdout) [?p] (age ?p 25))
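A minimal sketch of running such a query from a REPL, assuming Cascalog is on the classpath and that a plain vector of tuples can serve as the data source (the approach the project's playground namespace takes). The `age` dataset and the `example.ages` namespace are hypothetical. Every output variable in the result vector has to be bound by some predicate in the query body.

```clojure
(ns example.ages
  (:use cascalog.api))

;; Hypothetical in-memory dataset of [person age] tuples.
(def age
  [["alice" 28]
   ["bob"   33]
   ["david" 25]
   ["emily" 25]])

;; Print every ?p whose age is exactly 25.
;; The constant 25 acts as a filter on the second field.
(?<- (stdout) [?p] (age ?p 25))

;; Print each person under 30 together with their age.
;; `<` is the ordinary Clojure function, used here as a filter.
(?<- (stdout) [?p ?a] (age ?p ?a) (< ?a 30))
```

`?<-` both defines and executes the query; `(stdout)` is the sink that prints the resulting tuples.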
12. Some of Cascalog’s features
Inner and outer joins
Aggregators
Functions
Subqueries
Sorting
Read from and write to arbitrary data sources
› HDFS
› HBase
› MySQL
› Etc.
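A few of these features can be sketched as follows. This assumes the Cascalog 1.x namespace layout, where the built-in aggregators live in `cascalog.ops` (later releases moved them, so the `:require` line may need adjusting). The `age` and `gender` datasets and the `example.features` namespace are hypothetical.

```clojure
(ns example.features
  (:use cascalog.api)
  (:require [cascalog.ops :as c]))

;; Hypothetical in-memory datasets; "david" has no gender record.
(def age    [["alice" 28] ["bob" 33] ["david" 25]])
(def gender [["alice" "f"] ["bob" "m"]])

;; Inner join: using ?p in both predicates joins the two sources on it.
(?<- (stdout) [?p ?a ?g]
     (age ?p ?a)
     (gender ?p ?g))

;; Outer join: a !! variable is left null when no matching tuple exists,
;; so "david" still appears in the output.
(?<- (stdout) [?p ?a !!g]
     (age ?p ?a)
     (gender ?p !!g))

;; Aggregator: count the people in each gender group.
;; The non-aggregated output variable ?g becomes the grouping key.
(?<- (stdout) [?g ?count]
     (gender _ ?g)
     (c/count ?count))
```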
13. When the query tool is separate from the programming language
Friction when embedding custom operations
Interlacing queries with regular application logic is unnatural
Generating queries dynamically is difficult
14. Cascalog, on the other hand...
Custom operations defined just like any other function
Interlacing queries with regular application logic is trivial
Generating queries dynamically is easy and idiomatic
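The three points above might look like this in practice. This is a sketch under assumptions: it uses `defmapop` from the Cascalog 1.x API (newer releases may name the macro differently), and `age`, `age-bracket`, `people-older-than`, and the `example.dynamic` namespace are hypothetical names chosen for illustration.

```clojure
(ns example.dynamic
  (:use cascalog.api))

;; Hypothetical in-memory dataset of [person age] tuples.
(def age [["alice" 28] ["bob" 33] ["david" 25]])

;; Custom operation: the body is ordinary Clojure, written like a defn.
(defmapop age-bracket [a]
  (if (< a 30) "under-30" "30-plus"))

;; Use the custom operation inside a query; :> names its output variable.
(?<- (stdout) [?p ?bracket]
     (age ?p ?a)
     (age-bracket ?a :> ?bracket))

;; Dynamic query generation: a regular function that builds and returns
;; a query. `min-age` is a normal Clojure argument, not a query variable.
(defn people-older-than [min-age]
  (<- [?p ?a]
      (age ?p ?a)
      (> ?a min-age)))

;; Application logic picks the threshold; ?- executes the generated query.
(?- (stdout) (people-older-than 26))
```

Because `<-` returns the query as a value, functions like `people-older-than` can be composed, passed around, and tested like any other Clojure code.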
15. Try Cascalog yourself!
Project Page
http://www.github.com/nathanmarz/cascalog
Introductory Tutorial
http://nathanmarz.com/blog/introducing-cascalog/
5 minutes to install Clojure, Hadoop, and Cascalog locally! See the project README