The REturn of
Clojure Data Science
Elise Huard @elise_huard - Euroclojure 2017
Thursday, 20 July 17
http://www.mastodonc.com/
Basic tooling
Data structures
2 Examples
Roadmap?
Credits
Your Mileage
May vary
Basic tooling
“Awkward-sized data”
Sampling
https://github.com/bigmlcom/sampling
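For data too large to explore in full, a fixed-size uniform sample is often enough. The following is a minimal sketch of reservoir sampling (Algorithm R) in plain Clojure, independent of the linked library's API; `reservoir-sample` is a hypothetical name, and the random generator is passed in so a run can be repeated.

```clojure
(defn reservoir-sample
  "Returns a vector of up to k items drawn uniformly from coll (Algorithm R).
  rng is a java.util.Random, passed in so that a run can be reproduced."
  [^java.util.Random rng k coll]
  (first
   (reduce (fn [[sample n] item]
             (let [n (inc n)]
               (if (<= n k)
                 ;; Fill the reservoir with the first k items.
                 [(conj sample item) n]
                 ;; Afterwards, item number n replaces a slot with probability k/n.
                 (let [j (.nextInt rng (int n))]
                   [(if (< j k) (assoc sample j item) sample) n]))))
           [[] 0]
           coll)))

;; Five items out of a hundred thousand, in a single pass.
(def sample (reservoir-sample (java.util.Random. 42) 5 (range 100000)))
```

The sample is built in one pass and only ever holds k items, so the input can be a lazy sequence read from disk.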
Making sense of data
Notebooks
Data structures
(deftype DataSet
  [^IPersistentVector column-names
   ^IPersistentVector columns
   ^IPersistentVector shape]
  clojure.lang.IMeta
  (meta [m]
    nil)
  clojure.lang.IObj
  (withMeta [m meta]
    (with-meta (mp/convert-to-nested-vectors m) meta))
  ...)
Plain old Clojure data structures
[{:vote "I would probably vote for it",
  :age 57,
  :rural "urban",
  :arguments-against "It might encourage people to stop working",
  :age_group "40_65",
  :country_code "AT",
  :weight "1.533.248.826",
  :arguments-for "It increases appreciation for household work and volunteering | It encourages financial independence and self-responsibility | It reduces anxiety about financing basic needs",
  :gender "male",
  :dem_has_children "yes",
  :dem_full_time_job "yes",
  :dem_education_level "high",
  :awareness "I understand it fully"}
 ...]
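Because each survey row is an ordinary map, the core sequence functions are enough for a first look at the data. A small sketch, assuming three made-up rows that reuse a few of the keys above (the values are invented for illustration):

```clojure
;; Hypothetical rows; only the key names come from the survey data.
(def rows
  [{:vote "I would vote for it"          :age 34 :country_code "FR"}
   {:vote "I would probably vote for it" :age 57 :country_code "AT"}
   {:vote "I would vote for it"          :age 45 :country_code "AT"}])

;; How often each answer occurs.
(def votes-by-answer
  (frequencies (map :vote rows)))

;; Mean age per country: group the rows, then average each group.
(def mean-age-by-country
  (into {}
        (map (fn [[country rs]]
               [country (/ (reduce + (map :age rs)) (count rs))]))
        (group-by :country_code rows)))
;; => {"FR" 34, "AT" 51}
```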
clojure.spec
(s/def ::acctid int?)
(s/def ::first-name string?)
(s/def ::last-name string?)
(s/def ::email ::email-type)
(s/def ::person (s/keys :req [::first-name ::last-name ::email]
                        :opt [::phone]))
clojure.spec
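;; str->int and double->int are helpers that are not shown on the slide.
;; ::s/invalid resolves against whichever spec namespace is aliased as s.
(defn -integer?
  [x]
  (cond (string? x) (str->int x)
        (clojure.core/integer? x) x
        (and (clojure.core/double? x)
             (double->int x)) (double->int x)
        :else ::s/invalid))
(def integer? (s/conformer -integer?))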
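A conformer lets a spec coerce a value while validating it, which is handy for CSV columns that arrive as strings. A self-contained sketch, assuming Clojure 1.9 or later with `clojure.spec.alpha`; `str->int` here is a hypothetical stand-in for the helper of the same name used above, and `->integer` and `::age` are made-up names.

```clojure
(require '[clojure.spec.alpha :as s])

(defn str->int
  "Hypothetical helper: parses text as a long, or returns nil if it is not a number."
  [^String text]
  (try (Long/parseLong text)
       (catch NumberFormatException _ nil)))

(defn ->integer
  "Returns x as an integer, or the spec invalid marker when x cannot be coerced."
  [x]
  (cond (integer? x) x
        (string? x)  (or (str->int x) ::s/invalid)
        :else        ::s/invalid))

(s/def ::age (s/conformer ->integer))

(s/conform ::age "57")  ;; => 57
(s/conform ::age 57)    ;; => 57
(s/valid? ::age "old")  ;; => false
```

Returning the invalid marker rather than nil keeps unparseable input from passing as a conformed value.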
shiny transducers:
kixi.stats
• Arithmetic mean
• Geometric mean
• Harmonic mean
• Variance
• Standard deviation
• Standard error
• Skewness
• Kurtosis
• Covariance
• Covariance matrix
• Correlation
• Correlation matrix
• Simple linear regression
...
https://github.com/MastodonC/kixi.stats
(->> [{:x 1 :y 3 :z 2} {:x 2 :y 2 :z 4} {:x 3 :y 1 :z 6}]
     (transduce identity (correlation-matrix {:x :x :y :y :z :z})))
;; => {[:x :y] -1.0, [:x :z] 1.0, [:y :z] -1.0,
;;     [:y :x] -1.0, [:z :x] 1.0, [:z :y] -1.0}
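The statistic is passed to `transduce` as a reducing function, so it composes with any transducer. A hand-rolled sketch of that shape in plain Clojure follows; it is not kixi.stats code, and `mean-rf` and `rows` are made up for illustration.

```clojure
(defn mean-rf
  "Reducing function for the arithmetic mean.
  Arity 0 creates the accumulator, arity 2 folds in one value,
  arity 1 turns the accumulator into the final result."
  ([] [0 0])
  ([[sum n]] (when (pos? n) (/ sum n)))
  ([[sum n] x] [(+ sum x) (inc n)]))

(def rows [{:x 1} {:x 2} {:x 6}])

;; Mean of every :x value.
(def mean-x
  (transduce (map :x) mean-rf rows))
;; => 3

;; Same reducing function, different pipeline: only the even values.
(def mean-even-x
  (transduce (comp (map :x) (filter even?)) mean-rf rows))
;; => 4
```

The selection and filtering live in the transducer and the statistic lives in the reducing function, so either can be swapped without touching the other.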
xforms https://github.com/cgrand/xforms
redux https://github.com/henrygarner/redux
huri https://github.com/sbelak/huri
Example:
universal basic income EU survey
(def in-favour (filter #(= (:vote %) "I would vote for it") data))

(defn comp-by-numbers
  [a b]
  (> (:how-many a) (:how-many b)))

(defn tally-numbers-fn
  [d]
  (fn [reason]
    (hash-map :reason reason
              :how-many (reduce + 0 (map #(get % reason) d)))))

(table-view (sort comp-by-numbers
                  (map (tally-numbers-fn in-favour) reasons-for)))
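The tally above sums a per-reason value over the filtered rows, which suggests each reason is stored as a numeric column. A runnable sketch under that assumption, with invented rows and reason keys (`:independence`, `:less-anxiety`) and `sort-by` in place of a custom comparator:

```clojure
;; Hypothetical rows: each reason key holds 1 if the respondent picked it, else 0.
(def data
  [{:vote "I would vote for it"     :independence 1 :less-anxiety 1}
   {:vote "I would vote for it"     :independence 0 :less-anxiety 1}
   {:vote "I would not vote for it" :independence 1 :less-anxiety 0}])

(def reasons-for [:independence :less-anxiety])

(def in-favour
  (filter #(= (:vote %) "I would vote for it") data))

(defn tally
  "Counts how many of the rows picked the given reason."
  [rows reason]
  {:reason reason
   :how-many (reduce + (map #(get % reason 0) rows))})

;; Most frequently chosen reason first.
(def ranked
  (sort-by :how-many > (map #(tally in-favour %) reasons-for)))
;; => ({:reason :less-anxiety, :how-many 2} {:reason :independence, :how-many 1})
```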
Example:
neural nets
Cortex https://github.com/thinktopic/cortex
deeplearning4j https://deeplearning4j.org/
http://yann.lecun.com/exdb/mnist/
(defn initial-description
  [input-w input-h num-classes]
  [(layers/input input-w input-h 1 :id :data)
   (layers/convolutional 5 0 1 20)
   (layers/max-pooling 2 0 2)
   (layers/dropout 0.9)
   (layers/relu)
   (layers/convolutional 5 0 1 50)
   (layers/max-pooling 2 0 2)
   (layers/batch-normalization)
   (layers/linear 1000)
   (layers/relu :center-loss {:label-indexes {:stream :labels}
                              :label-inverse-counts {:stream :labels}
                              :labels {:stream :labels}
                              :alpha 0.9
                              :lambda 1e-4})
   (layers/dropout 0.5)
   (layers/linear num-classes)
   (layers/softmax :id :labels)])
Java bindings
https://github.com/mastodonc/kixi.mallet
long wishlist ...
Roadmap?
• Notebooks: continue to improve/upgrade gorilla-repl, new alternative?
• Adding more to kixi.stats
• Mine the good bits of Incanter
• More Clojure bindings to Java libs
Credits
David Edgar Liebke
Michael Anderson
Jony Hudson
Carin Meier
Simon Belak
...
Thank you
Elise Huard @elise_huard - Euroclojure 2017
