Scala and Python
Integrating scikit-learn into a Scala Stack to build
realtime predictive models
Dan Chiao
VP Engineering
Why it was necessary
We pivoted
The original product
• Social data append
– PeopleGraph: match email addresses
to public demographics and social
profiles
– BrandGraph: match company URLs to
public firmographics and social
profiles
• Requirements
– Integrate a large (and expanding)
number of web data sources (REST,
SOAP, flat files)
– Realtime processing of large volumes
of contacts (60 queries/s)
The original technology stack
• Scala
– Best of both worlds
• Concise functional syntax
• Java libraries and deployment architecture
• Scala-specific libraries (Dispatch, Lift Web Framework)
• Twitter (soon to be Apache) Storm
– Streaming intake and normalization of large amounts of data
• MongoDB
– Expanding data sources = constantly updating schema
– Most sophisticated query syntax of NoSQL options
• AWS and Azure
– Well, duh
The new product
• Moving up the application stack
– Focus on the most compelling single-use case for our data
– Fliptop SpendScore
• Predictive analytics for sales and marketing teams
• “Machine learning for Salesforce”
The updated technology stack
• Still need to wrangle large amounts of data, so no changes
there
• New requirement: fast, scalable machine learning
Why not Scala (Java) native?
• The options
– Apache Mahout
• Only skeleton implementations of the most sophisticated machine
learning techniques (e.g. Random Forest, AdaBoost)
• Customer-specific models – don’t need Big Data
– Weka – GPL
– Scala-native libraries – Too early to use in production
Why Python?
• scikit-learn
– Mature – around since 2007
– Actively developed – last stable release Aug 2013
– Sophisticated – Random Forest and AdaBoost classifiers show
performance comparable to R
• Why not R? Not really production grade.
Requirements
• APIs to exploit Python’s modeling power
– Train, predict, model info query, etc.
• Scalability
– On demand Python serving nodes
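The API requirement above (train, predict, model info query) could be sketched in Python roughly as follows. This is only an illustration, not Fliptop's actual service: the class name `ModelService`, its method names, the customer id, and the toy `ThresholdModel` standing in for a scikit-learn estimator are all assumptions made for the example.

```python
from statistics import mean


class ThresholdModel:
    """Toy stand-in for a scikit-learn classifier (hypothetical, for illustration)."""

    def fit(self, xs, ys):
        # Put the decision threshold midway between the two class means.
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.threshold = (mean(pos) + mean(neg)) / 2
        return self

    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]


class ModelService:
    """Hypothetical registry of customer-specific models with train/predict/info calls."""

    def __init__(self):
        self._models = {}

    def train(self, customer_id, xs, ys):
        self._models[customer_id] = ThresholdModel().fit(xs, ys)

    def predict(self, customer_id, xs):
        return self._models[customer_id].predict(xs)

    def model_info(self, customer_id):
        model = self._models[customer_id]
        return {"type": type(model).__name__, "threshold": model.threshold}


service = ModelService()
service.train("acme", xs=[1.0, 2.0, 8.0, 9.0], ys=[0, 0, 1, 1])
print(service.predict("acme", [3.0, 7.0]))
print(service.model_info("acme"))
```

Keeping one model per customer mirrors the "customer-specific models" point made earlier; a real service would swap the toy model for a trained scikit-learn estimator.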
Tools for Scala-Python Integration
• Reimplementation of Python
– Jython (JPython)
• Communication through JNI
– Jepp
• Communication through IPC
– Apache Thrift
• Communication through REST API calls
– Bottle
Jython
• Re-Implementation of Python in Java
• Can import and use any Java class.
• Includes almost all of the modules in the standard Python
distribution
– Except some of the modules implemented originally in C.
• Compiles to Java bytecode
– either on demand or statically.
Jython
[Diagram: Scala code and Python code both run inside the JVM; the Python code is executed by Jython.]
Jython
• Lacks support for lots of extensions for scientific computing
– Numpy, Scipy, etc.
• JyNI (Jython Native Interface) to the rescue?
– Specifically designed to support CPython extensions like
Numpy, Scipy
– Still in alpha
Communication through JNI
• Jepp (Java Embedded Python)
– Embeds CPython in Java
– Runs Python code in CPython
– Leverages both JNI and Python/C for integration
Jepp
[Diagram: Scala code in the JVM calls through JNI and Jepp into a Python interpreter, which runs the Python code.]
Jepp
python_util.py
    def python_add(a, b):
        return a + b
TestJepp.scala
    import jep.Jep

    object TestJepp extends App {
      val jep = new Jep()
      // Load the Python definitions into the embedded interpreter.
      jep.runScript("python_util.py")
      // invoke takes Java objects, so box the Scala Ints.
      val a = (2).asInstanceOf[AnyRef]
      val b = (3).asInstanceOf[AnyRef]
      val sumByPython = jep.invoke("python_add", a, b)
      println(sumByPython.asInstanceOf[Int])
      jep.close()
    }
Communication through IPC
• Apache Thrift
– Developed & open-sourced by Facebook
– More community support than Protobuf, Avro
– IDL-based (Interface Definition Language)
– Generates server/client code in specified languages
– Take care of protocol and transport layer details
– Comes with generators for Java, Python, C++, etc.
• No Scala generator
• Scrooge (Twitter) to the rescue!
Thrift – IDL
TestThrift.thrift
    namespace java python_service_test
    namespace py python_service_test

    service PythonAddService {
      i32 pythonAdd(1: i32 a, 2: i32 b),
    }
$ thrift --gen java --gen py TestThrift.thrift
Thrift – Python Server
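PythonAddService.py (generated by Thrift)
    class Iface:
        def pythonAdd(self, a, b):
            pass
PythonAddServer.py
    # The generated gen-py directory must be on the Python path.
    from thrift.protocol import TBinaryProtocol
    from thrift.server import TServer
    from thrift.transport import TSocket, TTransport

    from python_service_test import PythonAddService

    class ExampleHandler(PythonAddService.Iface):
        def pythonAdd(self, a, b):
            return a + b

    handler = ExampleHandler()
    # The Processor class lives in the generated PythonAddService module.
    processor = PythonAddService.Processor(handler)
    # Pass the port by keyword; the first positional argument is the host.
    transport = TSocket.TServerSocket(port=9090)
    tfactory = TTransport.TBufferedTransportFactory()
    pfactory = TBinaryProtocol.TBinaryProtocolFactory()
    server = TServer.TThreadedServer(processor, transport, tfactory, pfactory)
    server.serve()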
Thrift – Scala Client
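PythonAddClient.scala
    import org.apache.thrift.protocol.{TBinaryProtocol, TProtocol}
    import org.apache.thrift.transport.{TSocket, TTransport}
    import python_service_test.PythonAddService

    object PythonAddClient extends App {
      val transport: TTransport = new TSocket("localhost", 9090)
      val protocol: TProtocol = new TBinaryProtocol(transport)
      val client = new PythonAddService.Client(protocol)
      transport.open()
      // The method name matches pythonAdd in TestThrift.thrift.
      val sumByPython = client.pythonAdd(3, 5)
      println("3 + 5 = " + sumByPython)
      transport.close()
    }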
Thrift
[Diagram: Scala code in the JVM talks over Thrift to several Python interpreters, each running Python code behind Thrift. Annotations: Auto Balancing, Built-in Encryption.]
REST API Architecture
[Diagram: Scala code in the JVM calls several Python processes, each serving Python code through Bottle. Open questions: Auto Balancer? Encoding?]
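To make the REST option concrete, here is a minimal sketch of a Python node exposing the `python_add` function over HTTP, with JSON answering the "Encoding?" question by hand. It uses only the Python standard library as a stand-in for Bottle so that it runs without extra dependencies. The route `/python_add` and the JSON field names are assumptions made for the example, not taken from the talk.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class AddHandler(BaseHTTPRequestHandler):
    """Handles POST /python_add with a JSON body such as {"a": 3, "b": 5}."""

    def do_POST(self):
        if self.path != "/python_add":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        args = json.loads(self.rfile.read(length))
        body = json.dumps({"sum": args["a"] + args["b"]}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call_python_add(port, a, b):
    """Client side: the same request the Scala code could send with any HTTP client."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            "/python_add",
            body=json.dumps({"a": a, "b": b}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())["sum"]
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), AddHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
result = call_python_add(server.server_address[1], 3, 5)
server.shutdown()
server.server_close()
print("3 + 5 =", result)
```

Because each node is a plain HTTP endpoint, spreading requests across nodes can be left to an external load balancer.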
Thrift vs. REST
                      Thrift   REST   Does it matter?
Load Balancer         ✔               No (AWS & Azure)
Encode/Decode         ✔               No (We're already doing it)
Low Learning Curve             ✔      Yes
No Dependency                  ✔      Yes
Fliptop’s Architecture
[Diagram: Scala code in the JVM sends requests through a load balancer to several Python servers, each running Python code behind Bottle.]
5 Python servers
~5,000 requests/sec
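As a rough illustration of what the load balancer does for the five Python servers, the sketch below rotates requests across backends and works out the per-node load. The talk does not say which balancing policy is used; round robin, the server addresses, and the class name `RoundRobinBalancer` are assumptions made for the example.

```python
from itertools import cycle

# Hypothetical addresses for the five Python (Bottle) servers.
SERVERS = [f"http://python-node-{i}:8080" for i in range(1, 6)]


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation, one per request."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def next_backend(self):
        return next(self._rotation)


balancer = RoundRobinBalancer(SERVERS)
picks = [balancer.next_backend() for _ in range(10)]

# With an even spread, each node sees total / node_count requests per second.
total_rps = 5000
per_node_rps = total_rps / len(SERVERS)
print(picks[:6])
print(per_node_rps)
```

With an even spread, roughly 5,000 requests/sec over 5 servers comes to about 1,000 requests/sec per Python node.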
Summary
• Jython
– (✓) Tight integration with Scala/Java
– (✗) Lacks support for C extensions (JyNI might help in the future)
• Jepp
– (✓) Access to high-quality Python extensions at CPython speed
– (✗) Two runtime environments
• Thrift, REST
– (✓) Language-independent development
– (✗) Bigger communication overhead
Questions?
Ask this guy
Thank You

Más contenido relacionado

La actualidad más candente

Python Programming - XIII. GUI Programming
Python Programming - XIII. GUI ProgrammingPython Programming - XIII. GUI Programming
Python Programming - XIII. GUI ProgrammingRanel Padon
 
Python Programming
Python ProgrammingPython Programming
Python Programmingsameer patil
 
Why Python?
Why Python?Why Python?
Why Python?Adam Pah
 
Python and Machine Learning
Python and Machine LearningPython and Machine Learning
Python and Machine Learningtrygub
 
Rifartek Robot Training Course - How to use ClientRobot
Rifartek Robot Training Course - How to use ClientRobotRifartek Robot Training Course - How to use ClientRobot
Rifartek Robot Training Course - How to use ClientRobotTsai Tsung-Yi
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyTravis Oliphant
 
Programming with Python - Basic
Programming with Python - BasicProgramming with Python - Basic
Programming with Python - BasicMosky Liu
 
Ekon 25 Python4Delphi_MX475
Ekon 25 Python4Delphi_MX475Ekon 25 Python4Delphi_MX475
Ekon 25 Python4Delphi_MX475Max Kleiner
 
Introduction about Python by JanBask Training
Introduction about Python by JanBask TrainingIntroduction about Python by JanBask Training
Introduction about Python by JanBask TrainingJanBask Training
 
Getting started with Linux and Python by Caffe
Getting started with Linux and Python by CaffeGetting started with Linux and Python by Caffe
Getting started with Linux and Python by CaffeLihang Li
 
Extending Python with ctypes
Extending Python with ctypesExtending Python with ctypes
Extending Python with ctypesAnant Narayanan
 
C Types - Extending Python
C Types - Extending PythonC Types - Extending Python
C Types - Extending PythonPriyank Kapadia
 
Python 3.5: An agile, general-purpose development language.
Python 3.5: An agile, general-purpose development language.Python 3.5: An agile, general-purpose development language.
Python 3.5: An agile, general-purpose development language.Carlos Miguel Ferreira
 
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...Edureka!
 
Sour Pickles
Sour PicklesSour Pickles
Sour PicklesSensePost
 
Python Seminar PPT
Python Seminar PPTPython Seminar PPT
Python Seminar PPTShivam Gupta
 

La actualidad más candente (20)

Python Programming - XIII. GUI Programming
Python Programming - XIII. GUI ProgrammingPython Programming - XIII. GUI Programming
Python Programming - XIII. GUI Programming
 
Python Programming
Python ProgrammingPython Programming
Python Programming
 
Why Python?
Why Python?Why Python?
Why Python?
 
Python and Machine Learning
Python and Machine LearningPython and Machine Learning
Python and Machine Learning
 
Python final ppt
Python final pptPython final ppt
Python final ppt
 
Introduction to Python Basics Programming
Introduction to Python Basics ProgrammingIntroduction to Python Basics Programming
Introduction to Python Basics Programming
 
Rifartek Robot Training Course - How to use ClientRobot
Rifartek Robot Training Course - How to use ClientRobotRifartek Robot Training Course - How to use ClientRobot
Rifartek Robot Training Course - How to use ClientRobot
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
 
Programming with Python - Basic
Programming with Python - BasicProgramming with Python - Basic
Programming with Python - Basic
 
Ekon 25 Python4Delphi_MX475
Ekon 25 Python4Delphi_MX475Ekon 25 Python4Delphi_MX475
Ekon 25 Python4Delphi_MX475
 
Introduction about Python by JanBask Training
Introduction about Python by JanBask TrainingIntroduction about Python by JanBask Training
Introduction about Python by JanBask Training
 
Getting started with Linux and Python by Caffe
Getting started with Linux and Python by CaffeGetting started with Linux and Python by Caffe
Getting started with Linux and Python by Caffe
 
Extending Python with ctypes
Extending Python with ctypesExtending Python with ctypes
Extending Python with ctypes
 
C Types - Extending Python
C Types - Extending PythonC Types - Extending Python
C Types - Extending Python
 
Python 3.5: An agile, general-purpose development language.
Python 3.5: An agile, general-purpose development language.Python 3.5: An agile, general-purpose development language.
Python 3.5: An agile, general-purpose development language.
 
Ctypes
CtypesCtypes
Ctypes
 
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
 
Sour Pickles
Sour PicklesSour Pickles
Sour Pickles
 
Python Seminar PPT
Python Seminar PPTPython Seminar PPT
Python Seminar PPT
 
Python
PythonPython
Python
 

Destacado

Neural networks with python
Neural networks with pythonNeural networks with python
Neural networks with pythonSimone Piunno
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...Jose Quesada (hiring)
 
Data Engineering with Solr and Spark
Data Engineering with Solr and SparkData Engineering with Solr and Spark
Data Engineering with Solr and SparkLucidworks
 
Scala for Machine Learning
Scala for Machine LearningScala for Machine Learning
Scala for Machine LearningPatrick Nicolas
 
Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Ac...
Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Ac...Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Ac...
Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Ac...Jonas Bonér
 
Machine Learning with Spark MLlib
Machine Learning with Spark MLlibMachine Learning with Spark MLlib
Machine Learning with Spark MLlibTodd McGrath
 
PredictionIO – A Machine Learning Server in Scala – SF Scala
PredictionIO – A Machine Learning Server in Scala – SF ScalaPredictionIO – A Machine Learning Server in Scala – SF Scala
PredictionIO – A Machine Learning Server in Scala – SF Scalapredictionio
 
Overview of Apache Flink: Next-Gen Big Data Analytics Framework
Overview of Apache Flink: Next-Gen Big Data Analytics FrameworkOverview of Apache Flink: Next-Gen Big Data Analytics Framework
Overview of Apache Flink: Next-Gen Big Data Analytics FrameworkSlim Baltagi
 
Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink Slim Baltagi
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsSlim Baltagi
 
Machine Learning using Apache Spark MLlib
Machine Learning using Apache Spark MLlibMachine Learning using Apache Spark MLlib
Machine Learning using Apache Spark MLlibIMC Institute
 
MLlib and Machine Learning on Spark
MLlib and Machine Learning on SparkMLlib and Machine Learning on Spark
MLlib and Machine Learning on SparkPetr Zapletal
 
Jython 2.7 and techniques for integrating with Java - Frank Wierzbicki
Jython 2.7 and techniques for integrating with Java - Frank WierzbickiJython 2.7 and techniques for integrating with Java - Frank Wierzbicki
Jython 2.7 and techniques for integrating with Java - Frank Wierzbickifwierzbicki
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications Ahmed_hashmi
 
Hidden Markov Model & Stock Prediction
Hidden Markov Model & Stock PredictionHidden Markov Model & Stock Prediction
Hidden Markov Model & Stock PredictionDavid Chiu
 

Destacado (20)

Python to scala
Python to scalaPython to scala
Python to scala
 
Python y Flink
Python y FlinkPython y Flink
Python y Flink
 
Piazza 2 lecture
Piazza 2 lecturePiazza 2 lecture
Piazza 2 lecture
 
Neural networks with python
Neural networks with pythonNeural networks with python
Neural networks with python
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
 
Data Engineering with Solr and Spark
Data Engineering with Solr and SparkData Engineering with Solr and Spark
Data Engineering with Solr and Spark
 
Scala for Machine Learning
Scala for Machine LearningScala for Machine Learning
Scala for Machine Learning
 
Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Ac...
Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Ac...Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Ac...
Akka: Simpler Scalability, Fault-Tolerance, Concurrency & Remoting through Ac...
 
Machine Learning with Spark MLlib
Machine Learning with Spark MLlibMachine Learning with Spark MLlib
Machine Learning with Spark MLlib
 
PredictionIO – A Machine Learning Server in Scala – SF Scala
PredictionIO – A Machine Learning Server in Scala – SF ScalaPredictionIO – A Machine Learning Server in Scala – SF Scala
PredictionIO – A Machine Learning Server in Scala – SF Scala
 
Overview of Apache Flink: Next-Gen Big Data Analytics Framework
Overview of Apache Flink: Next-Gen Big Data Analytics FrameworkOverview of Apache Flink: Next-Gen Big Data Analytics Framework
Overview of Apache Flink: Next-Gen Big Data Analytics Framework
 
Hidden markov model
Hidden markov modelHidden markov model
Hidden markov model
 
Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming Analytics
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
Machine Learning using Apache Spark MLlib
Machine Learning using Apache Spark MLlibMachine Learning using Apache Spark MLlib
Machine Learning using Apache Spark MLlib
 
MLlib and Machine Learning on Spark
MLlib and Machine Learning on SparkMLlib and Machine Learning on Spark
MLlib and Machine Learning on Spark
 
Jython 2.7 and techniques for integrating with Java - Frank Wierzbicki
Jython 2.7 and techniques for integrating with Java - Frank WierzbickiJython 2.7 and techniques for integrating with Java - Frank Wierzbicki
Jython 2.7 and techniques for integrating with Java - Frank Wierzbicki
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
 
Hidden Markov Model & Stock Prediction
Hidden Markov Model & Stock PredictionHidden Markov Model & Stock Prediction
Hidden Markov Model & Stock Prediction
 

Similar a How to integrate python into a scala stack

TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsStijn Decubber
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Michael Rys
 
ApacheCon 2021 Apache Deep Learning 302
ApacheCon 2021   Apache Deep Learning 302ApacheCon 2021   Apache Deep Learning 302
ApacheCon 2021 Apache Deep Learning 302Timothy Spann
 
Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Wes McKinney
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkDatabricks
 
AI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayAI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayNick Pentreath
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Timothy Spann
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019Travis Oliphant
 
Python 101 for the .NET Developer
Python 101 for the .NET DeveloperPython 101 for the .NET Developer
Python 101 for the .NET DeveloperSarah Dutkiewicz
 
Integrating Existing C++ Libraries into PySpark with Esther Kundin
Integrating Existing C++ Libraries into PySpark with Esther KundinIntegrating Existing C++ Libraries into PySpark with Esther Kundin
Integrating Existing C++ Libraries into PySpark with Esther KundinDatabricks
 
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)Jason Dai
 
Koalas: Unifying Spark and pandas APIs
Koalas: Unifying Spark and pandas APIsKoalas: Unifying Spark and pandas APIs
Koalas: Unifying Spark and pandas APIsXiao Li
 
aip_developer_overview_icar_2014
aip_developer_overview_icar_2014aip_developer_overview_icar_2014
aip_developer_overview_icar_2014Matthew Vaughn
 
Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Fwdays
 
Apache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code WorkshopApache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code WorkshopAmanda Casari
 
Data science-toolchain
Data science-toolchainData science-toolchain
Data science-toolchainJie-Han Chen
 
Enabling Python to be a Better Big Data Citizen
Enabling Python to be a Better Big Data CitizenEnabling Python to be a Better Big Data Citizen
Enabling Python to be a Better Big Data CitizenWes McKinney
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with PythonRalf Gommers
 

Similar a How to integrate python into a scala stack (20)

TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
 
ApacheCon 2021 Apache Deep Learning 302
ApacheCon 2021   Apache Deep Learning 302ApacheCon 2021   Apache Deep Learning 302
ApacheCon 2021 Apache Deep Learning 302
 
Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
 
AI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI DayAI and Spark - IBM Community AI Day
AI and Spark - IBM Community AI Day
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019
 
Python 101 for the .NET Developer
Python 101 for the .NET DeveloperPython 101 for the .NET Developer
Python 101 for the .NET Developer
 
Integrating Existing C++ Libraries into PySpark with Esther Kundin
Integrating Existing C++ Libraries into PySpark with Esther KundinIntegrating Existing C++ Libraries into PySpark with Esther Kundin
Integrating Existing C++ Libraries into PySpark with Esther Kundin
 
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
 
Koalas: Unifying Spark and pandas APIs
Koalas: Unifying Spark and pandas APIsKoalas: Unifying Spark and pandas APIs
Koalas: Unifying Spark and pandas APIs
 
aip_developer_overview_icar_2014
aip_developer_overview_icar_2014aip_developer_overview_icar_2014
aip_developer_overview_icar_2014
 
ING - Mind the Gap
ING - Mind the GapING - Mind the Gap
ING - Mind the Gap
 
Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"
 
Apache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code WorkshopApache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code Workshop
 
Data science-toolchain
Data science-toolchainData science-toolchain
Data science-toolchain
 
Enabling Python to be a Better Big Data Citizen
Enabling Python to be a Better Big Data CitizenEnabling Python to be a Better Big Data Citizen
Enabling Python to be a Better Big Data Citizen
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with Python
 

Más de Fliptop

Webinar: Predictive Digital Advertising
Webinar: Predictive Digital Advertising Webinar: Predictive Digital Advertising
Webinar: Predictive Digital Advertising Fliptop
 
Fliptop - NewVoiceMedia Looks to Predictive Marketing to Meet Aggressive Grow...
Fliptop - NewVoiceMedia Looks to Predictive Marketing to Meet Aggressive Grow...Fliptop - NewVoiceMedia Looks to Predictive Marketing to Meet Aggressive Grow...
Fliptop - NewVoiceMedia Looks to Predictive Marketing to Meet Aggressive Grow...Fliptop
 
Fliptop Customer Showcase Webinar - How InsideView Doubled Their Lead to MQL ...
Fliptop Customer Showcase Webinar - How InsideView Doubled Their Lead to MQL ...Fliptop Customer Showcase Webinar - How InsideView Doubled Their Lead to MQL ...
Fliptop Customer Showcase Webinar - How InsideView Doubled Their Lead to MQL ...Fliptop
 
Webinar: Integrating Predictive Lead Scoring in your Marketo
Webinar: Integrating Predictive Lead Scoring in your MarketoWebinar: Integrating Predictive Lead Scoring in your Marketo
Webinar: Integrating Predictive Lead Scoring in your MarketoFliptop
 
Dreamforce Presentation - Fliptop + InsideView
Dreamforce Presentation - Fliptop + InsideViewDreamforce Presentation - Fliptop + InsideView
Dreamforce Presentation - Fliptop + InsideViewFliptop
 
The Quest for the Holy Grail: Driving Predictable Revenue
The Quest for the Holy Grail: Driving Predictable RevenueThe Quest for the Holy Grail: Driving Predictable Revenue
The Quest for the Holy Grail: Driving Predictable RevenueFliptop
 
Predict 2014, Doug Camplejohn Welcome to Predict
Predict 2014, Doug Camplejohn Welcome to PredictPredict 2014, Doug Camplejohn Welcome to Predict
Predict 2014, Doug Camplejohn Welcome to PredictFliptop
 
Predict 2014, Norman Happ Precision Marketing in a Sea of Opportunity
Predict 2014, Norman Happ Precision Marketing in a Sea of OpportunityPredict 2014, Norman Happ Precision Marketing in a Sea of Opportunity
Predict 2014, Norman Happ Precision Marketing in a Sea of OpportunityFliptop
 
Predict 2014, Sean Ellis Growth Hacking for B2B Marketers
Predict 2014, Sean Ellis Growth Hacking for B2B MarketersPredict 2014, Sean Ellis Growth Hacking for B2B Marketers
Predict 2014, Sean Ellis Growth Hacking for B2B MarketersFliptop
 
Predict 2014, SiriusDecisions Kerry Cunningham
Predict 2014,  SiriusDecisions Kerry CunninghamPredict 2014,  SiriusDecisions Kerry Cunningham
Predict 2014, SiriusDecisions Kerry CunninghamFliptop
 
Predict 2014, Brian Kelly of InsideView, Marketing to Marketers - How We Do It
Predict 2014, Brian Kelly of InsideView, Marketing to Marketers - How We Do ItPredict 2014, Brian Kelly of InsideView, Marketing to Marketers - How We Do It
Predict 2014, Brian Kelly of InsideView, Marketing to Marketers - How We Do ItFliptop
 
Predict 2014 - Account Based Marketing with Peter Isaacson of Demandbase
Predict 2014 - Account Based Marketing with Peter Isaacson of DemandbasePredict 2014 - Account Based Marketing with Peter Isaacson of Demandbase
Predict 2014 - Account Based Marketing with Peter Isaacson of DemandbaseFliptop
 
Webinar: Predictive Lead Scoring - What Makes It So Predictive?
Webinar: Predictive Lead Scoring - What Makes It So Predictive?Webinar: Predictive Lead Scoring - What Makes It So Predictive?
Webinar: Predictive Lead Scoring - What Makes It So Predictive?Fliptop
 
Webinar: True Cost of Calling Every Lead
Webinar: True Cost of Calling Every LeadWebinar: True Cost of Calling Every Lead
Webinar: True Cost of Calling Every LeadFliptop
 
Webinar: The Science of Predictive Lead Scoring
Webinar: The Science of Predictive Lead Scoring Webinar: The Science of Predictive Lead Scoring
Webinar: The Science of Predictive Lead Scoring Fliptop
 
Big Data Will Change Our World
Big Data Will Change Our WorldBig Data Will Change Our World
Big Data Will Change Our WorldFliptop
 
Marketer's Time Saving Survey
Marketer's Time Saving SurveyMarketer's Time Saving Survey
Marketer's Time Saving SurveyFliptop
 

Más de Fliptop (17)

Webinar: Predictive Digital Advertising
Webinar: Predictive Digital Advertising Webinar: Predictive Digital Advertising
Webinar: Predictive Digital Advertising
 
Fliptop - NewVoiceMedia Looks to Predictive Marketing to Meet Aggressive Grow...
Fliptop - NewVoiceMedia Looks to Predictive Marketing to Meet Aggressive Grow...Fliptop - NewVoiceMedia Looks to Predictive Marketing to Meet Aggressive Grow...
Fliptop - NewVoiceMedia Looks to Predictive Marketing to Meet Aggressive Grow...
 
Fliptop Customer Showcase Webinar - How InsideView Doubled Their Lead to MQL ...
How to integrate Python into a Scala stack

  • 1. Scala and Python: Integrating scikit-learn into a Scala Stack to build realtime predictive models. Dan Chiao, VP Engineering
  • 2. Why it was necessary: We pivoted
  • 3. The original product • Social data append – PeopleGraph: match email addresses to public demographics and social profiles – BrandGraph: match company URLs to public firmographics and social profiles • Requirements – Integrate a large (and expanding) number of web data sources (REST, SOAP, flat files) – Realtime processing of large volumes of contacts (60 queries/s)
  • 4. The original technology stack • Scala – Best of both worlds • Concise functional syntax • Java libraries and deployment architecture • Scala-specific libraries (Dispatch, Lift Web Framework) • Twitter (soon to be Apache) Storm – Streaming intake and normalization of large amounts of data • MongoDB – Expanding data sources = constantly updating schema – Most sophisticated query syntax of NoSQL options • AWS and Azure – Well, duh
  • 5. The new product • Moving up the application stack – Focus on the most compelling single-use case for our data – Fliptop SpendScore • Predictive analytics for sales and marketing teams • “Machine learning for Salesforce”
  • 6. The updated technology stack • Still need to wrangle large amounts of data, so no changes there • New requirement: fast, scalable machine learning
  • 7. Why not Scala (Java) native? • The options – Apache Mahout • Only skeleton implementations for most sophisticated machine learning techniques (e.g. Random Forest, Adaboost) • Customer-specific models – don’t need Big Data – Weka – GPL – Scala-native libraries – Too early to use in production
  • 8. Why Python? • scikit-learn – Mature – around since 2006 – Actively-developed – Last stable release Aug 2013 – Sophisticated – Random Forest and Adaboost classifier show comparable performance to R • Why not R? Not really production grade.
  • 9. Requirements • APIs to exploit Python’s modeling power – Train, predict, model info query, etc. • Scalability – On demand Python serving nodes
  • 10. Tools for Scala-Python Integration • Reimplementation of Python – Jython (JPython) • Communication through JNI – Jepp • Communication through IPC – Apache Thrift • Communication through REST API calls – Bottle
  • 11. Jython • Re-Implementation of Python in Java • Can import and use any Java class. • Includes almost all of the modules in the standard Python distribution – Except some of the modules implemented originally in C. • Compiles to Java bytecode – either on demand or statically.
  • 13. Jython • Lacks support for lots of extensions for scientific computing – Numpy, Scipy, etc. • JyNI (Jython Native Interface) to the rescue? – Specifically designed to support CPython extensions like Numpy, Scipy – Still in alpha
  • 14. Communication through JNI • Jepp (Java Embedded Python) – Embeds CPython in Java – Runs Python code in CPython – Leverages both JNI and Python/C for integration
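  • 16. Jepp
      python_util.py:
        # Function called from Scala through Jepp.
        def python_add(a, b):
            return a + b
      TestJepp.scala:
        import jep.Jep
        object TestJepp extends App {
          val jep = new Jep()
          // Load the Python definitions into the embedded interpreter.
          jep.runScript("python_util.py")
          // Box the Scala Ints so they can be passed as Java objects.
          val a = (2).asInstanceOf[AnyRef]
          val b = (3).asInstanceOf[AnyRef]
          val sumByPython = jep.invoke("python_add", a, b)
          println(sumByPython.asInstanceOf[Int])
        }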
  • 17. Communication through IPC • Apache Thrift – Developed & open-sourced by Facebook – More community support than Protobuf, Avro – IDL-based (Interface Definition Language) – Generates server/client code in specified languages – Takes care of protocol and transport layer details – Comes with generators for Java, Python, C++, etc. • No Scala generator • Scrooge (Twitter) to the rescue!
  • 18. Thrift – IDL
      TestThrift.thrift:
        namespace java python_service_test
        namespace py python_service_test
        service PythonAddService {
          i32 pythonAdd (1:i32 a, 2:i32 b),
        }
      Generating the bindings:
        $ thrift --gen java --gen py TestThrift.thrift
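  • 19. Thrift – Python Server
      PythonAddService.py (generated by Thrift):
        class Iface:
            def pythonAdd(self, a, b):
                pass
      PythonAddServer.py:
        # Assumes the generated gen-py directory is on sys.path.
        from thrift.transport import TSocket, TTransport
        from thrift.protocol import TBinaryProtocol
        from thrift.server import TServer
        from python_service_test import PythonAddService
        class ExampleHandler(PythonAddService.Iface):
            def pythonAdd(self, a, b):
                return a + b
        handler = ExampleHandler()
        # The processor comes from the generated service module.
        processor = PythonAddService.Processor(handler)
        # Pass the port by keyword so it is not taken as the host argument.
        transport = TSocket.TServerSocket(port=9090)
        tfactory = TTransport.TBufferedTransportFactory()
        pfactory = TBinaryProtocol.TBinaryProtocolFactory()
        server = TServer.TThreadedServer(processor, transport, tfactory, pfactory)
        server.serve()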
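  • 20. Thrift – Scala Client
      PythonAddClient.scala:
        import org.apache.thrift.protocol.{TBinaryProtocol, TProtocol}
        import org.apache.thrift.transport.{TSocket, TTransport}
        import python_service_test.PythonAddService
        object PythonAddClient extends App {
          val transport: TTransport = new TSocket("localhost", 9090)
          val protocol: TProtocol = new TBinaryProtocol(transport)
          val client = new PythonAddService.Client(protocol)
          transport.open()
          // The method name matches pythonAdd in the IDL.
          val sumByPython = client.pythonAdd(3, 5)
          println("3 + 5 = " + sumByPython)
          transport.close()
        }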
  • 21. Thrift [Diagram: Scala code in the JVM connects over Thrift to several Python interpreters, each running Python code behind Thrift. Labels: Auto Balancing, Built-in Encryption]
  • 22. REST API Architecture [Diagram: Scala code in the JVM calls several Bottle servers, each wrapping Python code. Open questions on the diagram: Auto Balancer? Encoding?]
  • 23. Thrift vs. REST
                              Thrift   REST   Does it matter?
      Load Balancer           ✔               No (AWS & Azure)
      Encode/Decode           ✔               No (We’re already doing it)
      Low Learning Curve               ✔      Yes
      No Dependency                    ✔      Yes
  • 24. Fliptop’s Architecture [Diagram: Scala code in the JVM sends requests through a load balancer to a pool of Bottle servers, each wrapping Python code.] 5 Python servers, ~5,000 requests/sec
  • 25. Summary • Jython • (✓) Tight integration with Scala/Java • (✗) Lacks support for C extensions (JyNI might help in the future) • Jepp • (✓) Access to high-quality Python extensions at CPython speed • (✗) Two runtime environments • Thrift, REST • (✓) Language-independent development • (✗) Bigger communication overhead
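
The slides name Bottle as the REST layer in front of the Python code but show no REST code. As a rough illustration of that request/response shape, here is a minimal sketch that uses only the Python standard library: `http.server` stands in for Bottle, and `python_add` (borrowed from the Jepp and Thrift examples) stands in for a scikit-learn model call. The `/add` path and the JSON field names `a`, `b` and `sum` are hypothetical, not taken from the talk.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def python_add(a, b):
    """Stand-in for the real work, e.g. a scikit-learn predict() call."""
    return a + b


class AddHandler(BaseHTTPRequestHandler):
    """Answers POST /add with a JSON body such as {"sum": 8}."""

    def do_POST(self):
        if self.path != "/add":  # hypothetical route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = {"sum": python_add(payload["a"], payload["b"])}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging


def start_server(port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), AddHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_add(port, a, b):
    """Client side: the same POST a Scala HTTP client would send."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            "/add",
            body=json.dumps({"a": a, "b": b}),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())["sum"]
    finally:
        conn.close()
```

Because each request carries everything the server needs, identical copies of such a process can sit behind an ordinary HTTP load balancer, which is the arrangement slide 24 describes.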