2. ABOUT ME
● Lead Software Engineer at Wavy Global/Movile
● RedisConf Speaker (2018) / Redis Day London (2019)
● Interested in: Distributed Systems, Computer Theory
and Machine Learning.
● BS degree with honors in Computer Science from the
University of Campinas
● @lucianosabenca
Luciano Sabença
3. AGENDA
● About Wavy
● Why did we need another lang?
● Compilers 101
● Defining the Grammar
● Implementing the Grammar
● Possible Follow ups
4. Disclaimers
● I'm not a compiler engineer
● I'm not an expert in:
○ Formal Languages
○ Lexers/Parsers
● This talk will be a little bit dense:
○ I promise I will do my best
○ Code is available at:
https://github.com/luciano-sabenca/TDC-2020-Implementing-a-mini-language
7. Why did we need another lang?
● Creating good conversations is hard!
○ You really need to take dynamic decisions
based on the conversation messages.
Let's take an example:
9. Why did we need another lang?
100852
From 0 to 10, what's your opinion about our
last contact?
8
We're glad that you enjoyed our contact!
Please feel free to contact us anytime.
12. Why did we need another lang?
100852
From 0 to 10, what's your opinion about our
last contact?
5
We're sorry about your experience. Could you
please share your thoughts with us about:
...
16. Why did we need another lang?
● It's impossible to predict ALL the possible
conversations.
● We need to be prepared to take complex decisions
during the conversation! We need flexibility!
17. Why did we need another lang?
● Embedding a full language?
○ It's possible, but it's too much power! It may
cause:
■ Security bugs
■ Performance problems (e.g. infinite loops,
expensive operations, recursion…)
■ Annoying type conversions
18. Why did we need another lang?
● Create a mini-language?
○ Pros:
■ We can easily limit what the user can do
■ We can design the language to solve a
specific problem.
○ Cons:
■ It's a little bit challenging
Let's do it!
27. Compilers 101 - Lexical Analysis
Objective:
○ Transform the input into a token sequence.
○ Categorize the tokens.
Example:
myData == "ABC"
Type: Identifier, Value: myData
Type: Operator, Value: ==
Type: String, Value: "ABC"
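As a rough illustration of the lexical analysis step, here is a minimal hand-written tokenizer in Python. It is only a sketch: the token categories, the regular expressions, and the names `Token`, `TOKEN_SPEC` and `tokenize` are assumptions made for this example, and it is not the lexer that ANTLR generates.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    type: str
    value: str


# Hypothetical token categories for this sketch. Order matters: the first
# alternative that matches at the current position wins.
TOKEN_SPEC = [
    ("String", r'"[^"]*"'),
    ("Number", r"\d+"),
    ("Operator", r"==|>"),
    ("Identifier", r"[A-Za-z_]\w*"),
    ("Skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text: str) -> Iterator[Token]:
    """Split the input into categorized tokens, dropping whitespace."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        pos = match.end()
        if match.lastgroup != "Skip":
            yield Token(match.lastgroup, match.group())


tokens = list(tokenize('myData == "ABC"'))
for token in tokens:
    print(f"Type: {token.type}, Value: {token.value}")
```

Running it on the slide's input yields the same three tokens as the example: an Identifier, an Operator and a String.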
28. Compilers 101 - Syntax Analysis (parsing)
Objective:
○ Ensure that the input is valid according to the grammar
○ Generate a parse tree
Example (simplified):
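[Figure: parse tree with == at the root and myData and "ABC" as its child nodes; the figure also carries the label [value1]]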
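The parsing step can be sketched in a few lines of Python. This is a deliberately tiny, hand-written parser that accepts only a single comparison of the form `operand operator operand`; the node classes `Leaf` and `Comparison` and the function `parse_comparison` are hypothetical names for this example, not ANTLR output.

```python
import re
from dataclasses import dataclass


@dataclass
class Leaf:
    text: str


@dataclass
class Comparison:
    operator: str
    left: Leaf
    right: Leaf


# Simplified token pattern for this sketch; characters it does not
# recognize are silently skipped by findall.
TOKEN = re.compile(r'"[^"]*"|==|>|\d+|[A-Za-z_]\w*')
OPERATORS = {"==", ">"}


def parse_comparison(text: str) -> Comparison:
    """Validate `operand operator operand` and build the parse tree."""
    tokens = TOKEN.findall(text)
    if len(tokens) != 3:
        raise SyntaxError(f"expected 3 tokens, got {len(tokens)}: {tokens}")
    left, operator, right = tokens
    if operator not in OPERATORS:
        raise SyntaxError(f"expected an operator, got {operator!r}")
    if left in OPERATORS or right in OPERATORS:
        raise SyntaxError("operands must not be operators")
    # The operator becomes the root; the operands become its children.
    return Comparison(operator, Leaf(left), Leaf(right))


tree = parse_comparison('myData == "ABC"')
print(tree)
```

An input such as `myData ==` is rejected with a `SyntaxError`, which is the "ensure the input is valid" half of the objective.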
30. Compilers 101 - Semantic Implementation
Objective:
○ Implement the behaviour of this grammar.
○ We won't implement a full compiler here, but we will
implement the grammar.
32. Defining the Grammar - ANTLR
● ANTLR (ANother Tool for Language Recognition)
is a lexer and parser generator written in Java.
○ Used by a lot of projects:
■ Apache Cassandra
■ Twitter's Search Engine
■ Presto
■ Apex (Salesforce Programming Language)
Workflow:
Define the grammar -> ANTLR -> Implement the grammar
40. Defining the Grammar - Rules
● Defining the rules is the single most important
step in creating a new language
● You MUST avoid ambiguity!
● You MUST avoid infinite recursion!
● It MUST be general!
43. Defining the Grammar - Rules (Good News)
● Good News:
○ We don't need to create crazy things here
○ There are A LOT, really A LOT, of ANTLR
grammars available:
■ https://github.com/antlr/grammars-v4
44. Defining the Grammar - WHEN
Remember:
WHEN { myData == "ABC" -> "A"; myData == "CB" OR myData2 > 2 -> "B"; else -> "C" }
Examples:
myData                 myData2   Output
"ABC"                  -         "A"
"CB"                   -         "B"
(except "CB"|"ABC")    3         "B"
(except "CB"|"ABC")    2         "C"
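The table can be checked against a plain Python model of the WHEN expression. This sketch assumes that branches are tried in order and that the first matching branch wins, which is consistent with the examples; the names `BRANCHES`, `ELSE_RESULT` and `evaluate_when` are made up for illustration, and `None` stands in for the "-" cells.

```python
from typing import Any, Callable, Dict, List, Tuple

Env = Dict[str, Any]
Branch = Tuple[Callable[[Env], bool], str]

# Each branch mirrors one `condition -> result` pair of the WHEN expression.
BRANCHES: List[Branch] = [
    (lambda env: env["myData"] == "ABC", "A"),
    # Python's `or` short-circuits, so myData2 is only read when needed.
    (lambda env: env["myData"] == "CB" or env["myData2"] > 2, "B"),
]
ELSE_RESULT = "C"


def evaluate_when(env: Env) -> str:
    """Return the result of the first branch whose condition holds."""
    for condition, result in BRANCHES:
        if condition(env):
            return result
    return ELSE_RESULT


# The four rows of the table; "XYZ" is an arbitrary value that is
# neither "ABC" nor "CB".
rows = [
    {"myData": "ABC", "myData2": None},
    {"myData": "CB", "myData2": None},
    {"myData": "XYZ", "myData2": 3},
    {"myData": "XYZ", "myData2": 2},
]
outputs = [evaluate_when(row) for row in rows]
print(outputs)
```

The printed list matches the Output column of the table row by row.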
51. Implementing the Grammar - ANTLR
● ANTLR is able to generate two kinds of APIs to implement
the grammar:
○ Listener: the default one, driven by a tree walker that fires enter/exit callbacks (similar to SAX-style event handlers)
○ Visitor: based on the Visitor pattern
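To give a feel for the listener style, here is a small Python sketch. It does not use ANTLR's generated classes: the node types `Var`, `Const` and `Equals`, the `walk` function and the `EvalListener` class are all hypothetical stand-ins. The point it illustrates is that listener callbacks return nothing, so intermediate results have to be kept in explicit state such as a stack, and the walker (not your code) decides which nodes get visited.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Var:
    name: str


@dataclass
class Const:
    value: Any


@dataclass
class Equals:
    left: Any
    right: Any


class EvalListener:
    """Callbacks return nothing, so results are kept on an explicit stack."""

    def __init__(self, env: Dict[str, Any]) -> None:
        self.env = env
        self.stack: List[Any] = []

    def exit_Var(self, node: Var) -> None:
        self.stack.append(self.env[node.name])

    def exit_Const(self, node: Const) -> None:
        self.stack.append(node.value)

    def exit_Equals(self, node: Equals) -> None:
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left == right)


def walk(listener: EvalListener, node: Any) -> None:
    """The walker owns the traversal: every child is always visited."""
    if isinstance(node, Equals):
        walk(listener, node.left)
        walk(listener, node.right)
    getattr(listener, "exit_" + type(node).__name__)(node)


listener = EvalListener({"myData": "ABC"})
walk(listener, Equals(Var("myData"), Const("ABC")))
result = listener.stack.pop()
print(result)
```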
52. Implementing the Grammar - Visitor
We will use the Visitor API.
Motivation:
● It is less stateful: each visit method returns its value.
● You can control the visit of the nodes:
○ Short-circuit the OR/AND
○ "Lazy" WHEN
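The two motivations can be sketched with a hand-written visitor in Python. This is an illustration under assumptions, not the talk's implementation: the tree is built by hand instead of by ANTLR, and the node classes, `EvalVisitor` and its `lookups` list are hypothetical names. Each `visit_*` method returns its value, and because the visitor decides which children to visit, the right-hand side of OR and the branches after the first match are never evaluated.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class Var:
    name: str


@dataclass
class Const:
    value: Any


@dataclass
class Compare:
    op: str
    left: Any
    right: Any


@dataclass
class Or:
    left: Any
    right: Any


@dataclass
class When:
    branches: List[Tuple[Any, Any]]
    otherwise: Any


class EvalVisitor:
    def __init__(self, env: Dict[str, Any]) -> None:
        self.env = env
        self.lookups: List[str] = []  # records which variables were read

    def visit(self, node: Any) -> Any:
        return getattr(self, "visit_" + type(node).__name__)(node)

    def visit_Var(self, node: Var) -> Any:
        self.lookups.append(node.name)
        return self.env[node.name]

    def visit_Const(self, node: Const) -> Any:
        return node.value

    def visit_Compare(self, node: Compare) -> bool:
        left, right = self.visit(node.left), self.visit(node.right)
        if node.op == "==":
            return left == right
        if node.op == ">":
            return left > right
        raise ValueError(f"unknown operator {node.op!r}")

    def visit_Or(self, node: Or) -> bool:
        # Short-circuit: the right side is visited only if the left is false.
        return self.visit(node.left) or self.visit(node.right)

    def visit_When(self, node: When) -> Any:
        # Lazy: stop at the first branch whose condition holds.
        for condition, result in node.branches:
            if self.visit(condition):
                return self.visit(result)
        return self.visit(node.otherwise)


EXPRESSION = When(
    branches=[
        (Compare("==", Var("myData"), Const("ABC")), Const("A")),
        (Or(Compare("==", Var("myData"), Const("CB")),
            Compare(">", Var("myData2"), Const(2))), Const("B")),
    ],
    otherwise=Const("C"),
)

visitor = EvalVisitor({"myData": "CB"})  # myData2 is deliberately absent
result = visitor.visit(EXPRESSION)
print(result, visitor.lookups)
```

Even though `myData2` is missing from the environment, the evaluation succeeds, because the OR short-circuits before that variable is ever read.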
59. Possible Follow Ups
This grammar solved our problem!
We are now able to take complex decisions dynamically!
However, there is a lot of room for improvement:
● Implement all the basic arithmetic operations (+, -, /, *)
● Error Handling:
○ What should we do when the variable is NaN?
○ What should we do when the variable does not exist?
63. Nice to Wavy you!
Questions?
luciano.sabença@wavy.global
We Are Hiring!
Use the QR Code for more information about it ;)