Crystal internals
Part 1
Is a compiler a hard thing?
At Manas we usually do webapps
Let’s talk about webapps...
● HTML/CSS/JS
● React/Angular/Knockout
● Ruby/Erlang/Elixir
● Database (mysql/postgres)
● Elasticsearch
● Redis/Sidekiq/Background-jobs
● Docker, capistrano, deploy, servers
Easy…?
Let’s talk about compilers...
● HTML/CSS/JS
● React/Angular/Knockout
● Ruby/Erlang/Elixir
● Database (mysql/postgres)
● Elasticsearch
● Redis/Sidekiq/Background-jobs
● Docker, capistrano, deploy, servers
Easy!
No, let’s talk about usual programs
INPUT -> [PROCESSING…] -> OUTPUT
No, let’s talk about compilers
SOURCE CODE -> [PROCESSING…] -> EXECUTABLE
How do we go from source code to an executable?
Traditional stages of a compiler
class Foo
  def bar
    1 + 2
  end
end
● Lexer: [“class”, “Foo”, “;”, “def”, “bar”, “;”, “1”, “+”, “2”, “;”, “end”, “;”, “end”]
● Parser: ClassDef(“Foo”, body: [Def.new(“bar”)])
● Semantic (a.k.a “type check”): make sure there are no type errors
● Codegen: generate machine code
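To make the first stage concrete, here is a minimal lexer sketch. It is written in plain Ruby rather than Crystal so that it runs without the compiler sources, and it is an illustration only: the `tokenize` helper and its token rules are assumptions of this sketch, not Crystal's actual lexer.

```ruby
# Toy lexer: walks the source character by character and groups characters
# into tokens. `tokenize` is a hypothetical helper, not part of Crystal.
def tokenize(source)
  tokens = []
  i = 0
  while i < source.size
    char = source[i]
    case char
    when "\n"
      # Newlines act as statement separators, shown as ";" in the token list.
      tokens << ";"
      i += 1
    when /\s/
      i += 1 # skip other whitespace
    when /[A-Za-z_]/
      start = i
      i += 1 while i < source.size && source[i] =~ /\w/
      tokens << source[start...i] # keyword or identifier
    when /\d/
      start = i
      i += 1 while i < source.size && source[i] =~ /\d/
      tokens << source[start...i] # integer literal
    else
      tokens << char # single-character operator such as "+"
      i += 1
    end
  end
  tokens
end

source = "class Foo\ndef bar\n1 + 2\nend\nend"
p tokenize(source)
```

For the `class Foo` snippet this yields the same token sequence as the Lexer bullet above. A real lexer would also record token kinds and source locations.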
Let’s start with the codegen phase
Goal: generate efficient assembly code for many architectures (32-bit, 64-bit, Intel, ARM, etc.)
● Generating assembly code is hard
● Generating efficient assembly code is harder
● Generating assembly code for many architectures is hard/tedious/boring
Thus: writing a compiler is HARD! :-(
Well, not anymore...
Codegen
With LLVM, we generate LLVM IR (intermediate representation) instead of assembly, and LLVM takes care of generating efficient assembly code for us!
The hardest part is solved :-)
define i32 @add(i32 %x, i32 %y) {
  %sum = add i32 %x, %y
  ret i32 %sum
}
Codegen: LLVM (example)
LLVM provides a nice API to generate IR
require "llvm"
mod = LLVM::Module.new("main")
mod.functions.add("add", [LLVM::Int32, LLVM::Int32], LLVM::Int32) do |func|
  func.basic_blocks.append do |builder|
    res = builder.add(func.params[0], func.params[1])
    builder.ret(res)
  end
end
puts mod
Remaining phases
● Lexer
● Parser
● Semantic
Lexer
● Kind of easy: go char by char until we get a keyword, identifier, number, etc.
● We won’t go into implementation details...
Parser
● Kind of easy: go token by token and create a tree of expressions
● This tree is called AST: Abstract Syntax Tree
● An AST is like a directed, acyclic graph
● We won’t go into implementation details...
Semantic
● This is the fundamental piece of the compiler
● It takes an AST as input and analyzes it
● Analysis can result in:
○ Declaring types: for example “class Foo; end” will declare a type Foo
○ Checking methods: for example “Foo.bar” will check that “Foo” is a declared type and that the method “bar” exists in it, and has the correct arity and types
○ Giving each non-dead expression in the program a type
○ Gathering some info for the codegen phase: for example know the local variables of a method, and their type
Semantic
● The interesting part of the compiler is the semantic phase
● It’s just about processing an AST
● In Crystal’s compiler you just need to know one language: Crystal!
● No HTML/CSS/JS/JSX/etc.
● No untyped, dynamic languages: no Ruby/Erlang/Elixir. Type safe!
● Stuff is processed in memory
● No databases, no Elasticsearch, no Redis
Writing a compiler is easier than writing a web app! ^_^
(Or at least it’s more fun :-P)
Directory layout
● src/compiler/crystal
○ command/ : the command line interface
○ syntax/ : lexer, parser, ast, visitor, transformer
○ semantic/ : type declaration, method lookup, etc.
○ macros/ : macro expansion logic
○ codegen/ : codegen
○ tools/ : doc generator, formatter, init
○ compiler.cr : combines syntax + semantic + codegen
○ types.cr : all possible types in Crystal (Int32, String, unions, custom types, etc.)
○ program.cr : holds definitions of a program (holds Int32, String, etc.)
Directory layout
● src/compiler/crystal : ~43K LOC
○ command/ : ~300 LOC
○ syntax/ : ~10K LOC
○ semantic/ : ~12K LOC
○ macros/ : ~2K LOC
○ codegen/ : ~6K LOC
○ tools/ : ~7K LOC
○ compiler.cr : ~300 LOC
○ types.cr : ~2K LOC
○ program.cr : ~300 LOC
About 14K LOC to analyze source code.
One big Rails app at Manas has 14K LOC in “./app”
A compiler can’t be that hard! ;-)
Show me the code
# src/compiler/crystal/compiler.cr
def compile(source : Source | Array(Source), output_filename : String) : Result
  source = [source] unless source.is_a?(Array)
  program = new_program(source)
  node = parse program, source
  node = program.semantic node, @stats
  codegen program, node, source, output_filename unless @no_codegen
  Result.new program, node
end
What is a program?
Program
● Holds all types and top-level methods for a given compilation
● For example, if I compile “class Foo; end” and you compile “class Bar; end”, the first program will have a type named “Foo” and the second one won’t (but it will have a type named “Bar”)
● It lets us test the compiler more easily, because we can use a different Program instance for each snippet of code that we want to test
● The alternative would be global variables holding all of a program’s data
● A Program is passed around in all phases of a compilation (except lexing and parsing, which don’t need semantic info)
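The isolation described above can be sketched in a few lines. This is a toy model in plain Ruby, not the compiler's `Program` class: the `types` hash, the `declare_classes` helper, and the regex standing in for the top-level pass are all assumptions made for illustration.

```ruby
# Hypothetical miniature of a Program: it owns the types declared during one
# compilation instead of storing them in global variables.
class Program
  attr_reader :types

  def initialize
    # Every program starts with the same built-in types.
    @types = { "Int32" => :builtin, "String" => :builtin }
  end

  # Stand-in for the top-level pass: records each `class Name` it finds.
  def declare_classes(source)
    source.scan(/class\s+([A-Z]\w*)/).flatten.each do |name|
      @types[name] = :class
    end
    self
  end
end

mine = Program.new.declare_classes("class Foo; end")
yours = Program.new.declare_classes("class Bar; end")

puts mine.types.key?("Foo")  # true
puts mine.types.key?("Bar")  # false
puts yours.types.key?("Bar") # true
```

Because each instance carries its own state, a test suite can create one per snippet without any cleanup between tests.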
Show me the code
# src/compiler/crystal/compiler.cr
def compile(source : Source | Array(Source), output_filename : String) : Result
  source = [source] unless source.is_a?(Array)
  program = new_program(source)
  node = parse program, source          # from source to Crystal::ASTNode
  node = program.semantic node, @stats  # Semantic! :-)
  codegen program, node, source, output_filename unless @no_codegen
  Result.new program, node
end
Semantic
● The entry point for semantic analysis is in
src/compiler/crystal/semantic.cr
● Other files are in src/compiler/crystal/semantic/
● The file semantic.cr has comments that explain the overall algorithm :-)
Semantic: overall algorithm
● top level: declare classes, modules, macros, defs and other top-level stuff
● new methods: create `new` methods for every `initialize` method
● type declarations: process type declarations like `@x : Int32`
● check abstract defs: check that abstract defs are implemented
● class_vars_initializers: process initializers like `@@x = 1`
● instance_vars_initializers: process initializers like `@x = 1`
● main: process "main" code, calls and method bodies (the whole program).
● cleanup: remove dead code and other simplifications
● check recursive structs: check that structs are not recursive (impossible to
codegen)
Note!
● This algorithm didn’t come from the skies (nor from a textbook, nor from a paper)
● It’s not written in stone!
● It can definitely be improved: readability, performance, etc.
● It’s actually more like this… [slide image not preserved in this transcript]
Semantic
But before looking at each phase, we need to learn about the most useful pattern
for analyzing an AST...
The Visitor pattern
require "compiler/crystal/syntax"

class SumVisitor < Crystal::Visitor
  getter sum = 0

  def visit(node : Crystal::NumberLiteral)
    @sum += node.value.to_i
  end

  def visit(node : Crystal::ASTNode)
    true # true: continue visiting children nodes
  end
end

ast = Crystal::Parser.parse("foo(1 + 2, 3, [4])")
visitor = SumVisitor.new
ast.accept(visitor)
puts visitor.sum
The Visitor pattern
● We define a visit method for each node of interest
● We process the nodes
● We return true if we want to process children, false otherwise
● Example: if we only want to process class declarations, we could just define
visit(node : Crystal::ClassDef) and define some logic there (and return true,
because of nested class definitions)
● A visitor abstracts over the way nodes are composed
● ...though in many cases, for semantic purposes, we need and use the way a
node is composed (for example, to analyze a call we need to know the
argument types, so we check the arguments, not all children in a generic way)
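The `SumVisitor` example needs the compiler sources to run. As a standalone illustration of the same mechanics, here is a minimal sketch in plain Ruby. All class names are hypothetical, the AST is built by hand instead of parsed, and the dispatch uses a method name derived from the node class, whereas Crystal picks the `visit` overload by the node's type.

```ruby
# Minimal AST and visitor. Node and visitor names are hypothetical.
class Node
  def children
    []
  end

  # Visit this node, then descend into children only if the visitor says so.
  def accept(visitor)
    children.each { |child| child.accept(visitor) } if visitor.visit(self)
  end
end

class NumberLiteral < Node
  attr_reader :value

  def initialize(value)
    @value = value
  end
end

class Call < Node
  attr_reader :name, :args

  def initialize(name, args)
    @name = name
    @args = args
  end

  def children
    args
  end
end

class ArrayLiteral < Node
  attr_reader :elements

  def initialize(elements)
    @elements = elements
  end

  def children
    elements
  end
end

class Visitor
  # Dispatch to visit_<node class> when defined, else to the generic handler.
  def visit(node)
    handler = "visit_#{node.class.name.split('::').last.downcase}"
    respond_to?(handler) ? public_send(handler, node) : visit_any(node)
  end

  def visit_any(_node)
    true # true: continue visiting children nodes
  end
end

class SumVisitor < Visitor
  attr_reader :sum

  def initialize
    @sum = 0
  end

  def visit_numberliteral(node)
    @sum += node.value
    false # a number literal has no children to visit
  end
end

# Hand-built AST for: foo(1 + 2, 3, [4])
ast = Call.new("foo", [
  Call.new("+", [NumberLiteral.new(1), NumberLiteral.new(2)]),
  NumberLiteral.new(3),
  ArrayLiteral.new([NumberLiteral.new(4)]),
])

visitor = SumVisitor.new
ast.accept(visitor)
puts visitor.sum # 10
```

The visitor never needs to know how `Call` or `ArrayLiteral` store their children; each node exposes them through `children`.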
Top level: declare classes, modules, macros, defs...
# src/compiler/crystal/semantic/top_level_visitor.cr
class Crystal::TopLevelVisitor < Crystal::SemanticVisitor
  # ...
end
Crystal::SemanticVisitor
● Located at semantic_visitor.cr
● This is a base visitor used in most of the phases of the semantic analysis
● It keeps track of the “current type”
● For example in “class Foo; class Bar; baz; end; end”, “current type” starts at the top-level (the Program). When “class Foo” is found, the current type becomes “Foo” (we search “Foo” in the current type). When “class Bar” is found, the current type becomes “Foo::Bar” (we search “Bar” in the current type). When “baz” is found, it will be looked up inside the current type.
● But initially there’s no “Foo” inside the current type (the Program). Who defines it? … The top-level visitor!
Crystal::TopLevelVisitor
● Located at top_level_visitor.cr
● Defines classes, methods, etc.
● Given “class Foo; class Bar; baz; end; end”...
● current_type starts at Program
● When “class Foo” is found (ClassDef), we check if “Foo” exists in the current type. If not, we create it. If it exists as a different kind of type (for example, a module), we give an error.
● We attach this type “Foo” to the AST node ClassDef. SemanticVisitor will use this in every subsequent phase.
● … the “baz” call is not analyzed here (unless it’s a macro)
● Many other things are done in this visitor: methods and macros are added to types, aliases and enums are defined, etc.
● Question: why are methods and macros defined at this phase?
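The declare-or-reopen logic and the "current type" bookkeeping can be sketched as a small recursive pass. This is a toy model in plain Ruby under stated assumptions: `TypeEntry`, `ClassDef`, `ModuleDef` and `declare` are hypothetical names, and the real visitor handles far more node kinds.

```ruby
# Toy model of the top-level pass. All names here are hypothetical.
TypeEntry = Struct.new(:name, :kind, :types)
ClassDef  = Struct.new(:name, :body, :resolved_type)
ModuleDef = Struct.new(:name, :body, :resolved_type)

def declare(nodes, current_type)
  nodes.each do |node|
    kind = node.is_a?(ClassDef) ? :class : :module
    existing = current_type.types[node.name]
    # Reopening is fine, but only with the same kind of type.
    if existing && existing.kind != kind
      raise ArgumentError, "#{node.name} is not a #{kind}"
    end
    full_name = current_type.name.empty? ? node.name : "#{current_type.name}::#{node.name}"
    type = existing || (current_type.types[node.name] = TypeEntry.new(full_name, kind, {}))
    node.resolved_type = type # later phases reuse this instead of looking it up again
    declare(node.body, type)  # the nested body is processed with `type` as current type
  end
end

program = TypeEntry.new("", :program, {})
# class Foo; class Bar; end; end
ast = [ClassDef.new("Foo", [ClassDef.new("Bar", [])])]
declare(ast, program)

puts program.types["Foo"].types["Bar"].name # Foo::Bar
```

Passing the current type as an argument plays the role of the visitor's "current type" field: it changes on the way into a nested definition and is restored automatically when the recursive call returns.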
When should macros be defined and expanded
class Foo
  macro inherited
    do_something
  end

  macro do_something
    def foo; 1; end
  end
end

class Bar < Foo; end

class Baz < Foo
  def foo; 3; end
end

puts Bar.new.foo # => 1
puts Baz.new.foo # => 3
● The “inherited” macro hook must be processed as soon as “Bar < Foo” and “Baz < Foo” are found
● The macro expands to “do_something”, which must expand to “def foo; 1; end”
● This must happen before we continue processing Baz’s body: “def foo; 3; end” must win and be the method found when doing “Baz.new.foo”
● Conclusion: methods, macros and hooks must be defined in the first pass, when defining types. Additionally, macros might be looked up in types in this same pass (like “do_something”)
● SemanticVisitor takes care to look up and expand calls that resolve to macro calls
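The ordering requirement has a close runtime analogue in Ruby, shown here only as an illustration. Ruby's `inherited` is a callback executed at run time, not a compile-time macro, so this sketch says nothing about how Crystal implements the hook; it only shows why the hook has to run before the subclass body.

```ruby
# Ruby analogue of the Crystal example: the `inherited` callback fires as soon
# as the subclass is created, before the subclass body is evaluated.
class Foo
  def self.inherited(subclass)
    super
    subclass.class_eval do
      def foo; 1; end
    end
  end
end

class Bar < Foo; end

class Baz < Foo
  def foo; 3; end # defined after the hook ran, so it replaces the hook's foo
end

puts Bar.new.foo # 1
puts Baz.new.foo # 3
```

If the hook ran after the body of `Baz`, the generated `foo` would overwrite the hand-written one and `Baz.new.foo` would return 1.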
Method overloads
● Crystal methods are very powerful! For example: optional type restrictions, different number of arguments, default arguments, splat, etc.
● When methods are added to types we need to:
○ Know if a method replaces (redefines) an old method
○ Track whether a method is “stricter” than another method, to quickly know, given a call’s argument types, in which order the overloads are going to be tested
Method restrictions
def foo(x : Int32)
  puts 1
end

def foo(x)
  puts 2
end

foo(1)
foo('a')
● Given foo(1), both methods match it. However, the first overload
should be invoked because it has a stronger restriction than the
second overload.
● If we define the methods in a different order, it still works the
same
● This is because an argument with a type restriction is stronger than
one without one. We say that the first one is a restriction of the
second one (we should probably rename this to use stronger)
● This applies to types too: Int32 is stronger than Int32 |
String. And Bar is stronger than Foo, if Bar < Foo.
● Given two methods with the same name, if all arguments of a
method are stronger than the others’, the whole method is stronger
and should come first. Each type stores an ordered list of methods
indexed by method name, with this notion.
● If the methods are both stronger than each other, they have the
same restriction.
● This logic is located at restrictions.cr
● A lot of cases to consider: generics, tuples, splats, etc.
● The code and algorithms could probably use a simpler, unified logic
and a cleanup, but first all of these concepts and definitions must be
defined much more formally
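The ordering idea for single-argument methods can be sketched as follows. This is a toy model in plain Ruby, not the logic in restrictions.cr: `Overload`, `stronger?`, `add_overload` and `dispatch` are hypothetical names, a restriction is modeled as a Ruby class (or `nil` for an unrestricted argument), and generics, tuples, splats and unions are ignored.

```ruby
# Toy overload table. A restriction is a Ruby class, or nil when the argument
# has no restriction. All names here are hypothetical.
Overload = Struct.new(:restriction, :body)

# a is at least as strong as b if b is unrestricted, or if a's type is b's
# type or one of its subclasses.
def stronger?(a, b)
  return true if b.restriction.nil?
  return false if a.restriction.nil?
  a.restriction <= b.restriction ? true : false
end

# Keep the list ordered so stronger overloads come first. An overload with the
# same restriction replaces (redefines) the old one.
def add_overload(list, overload)
  list.each_with_index do |existing, index|
    next unless stronger?(overload, existing)
    if stronger?(existing, overload)
      list[index] = overload       # same restriction: redefinition
    else
      list.insert(index, overload) # strictly stronger: goes before
    end
    return list
  end
  list << overload
end

# Try overloads in order and call the first whose restriction matches.
def dispatch(list, arg)
  match = list.find { |o| o.restriction.nil? || arg.is_a?(o.restriction) }
  match.body.call(arg)
end

foo = []
add_overload(foo, Overload.new(nil, ->(_x) { 2 }))     # def foo(x)
add_overload(foo, Overload.new(Integer, ->(_x) { 1 })) # def foo(x : Int32)

puts dispatch(foo, 1)   # 1
puts dispatch(foo, "a") # 2
```

The unrestricted overload is added first here on purpose: the restricted one still ends up at the front of the list, which is why definition order does not matter.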
Semantic: new methods
● Located at new.cr
● TopLevelVisitor creates a `new` class method for every `initialize` method it finds (the logic for this is also in new.cr)
● Classes that end up without an `initialize` need a default, argless `self.new` method
● This phase is a bit messy right now because of some missing things related to generics…
class Foo
  def initialize(x : Int32)
    @x = x
  end

  # Generated from the above
  def self.new(x : Int32)
    instance = allocate
    instance.initialize(x)
    if instance.responds_to?(:finalize)
      ::GC.add_finalizer(instance)
    end
    instance # return the newly created object
  end
end
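The rule "one `new` per `initialize`, plus a default `new` when there is none" can be sketched over a simplified list of definitions. This is an illustration in plain Ruby with a made-up data shape: each definition is a `[name, params]` pair, and `generate_new_methods` is a hypothetical helper, not the code in new.cr.

```ruby
# Toy model: a type's methods are [name, params] pairs.
# generate_new_methods is a hypothetical helper.
def generate_new_methods(defs)
  initializers = defs.select { |name, _params| name == "initialize" }
  # A class without any `initialize` still needs a default, argless `new`.
  return [["self.new", []]] if initializers.empty?
  # Otherwise, one `new` per `initialize`, with the same parameters.
  initializers.map { |_name, params| ["self.new", params] }
end

foo_defs = [
  ["initialize", ["x : Int32"]],
  ["initialize", ["x : String", "y"]],
  ["bar", []],
]

p generate_new_methods(foo_defs)
p generate_new_methods([["bar", []]])
```

The first call returns two `self.new` entries mirroring the two `initialize` overloads; the second returns the single argless default.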
Semantic: type declarations
● Located at type_declaration_processor.cr (and type_declaration_visitor.cr and type_guess_visitor.cr)
● Combines info gathered by these two visitors to declare the type of instance and class variables.
● TypeDeclarationVisitor deals with explicit type declarations
● TypeGuessVisitor tries to “guess” the type of instance and class variables without explicit type annotations (for example @x = 1 and @x = Foo.new)
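How the two sources of information might be combined can be sketched with a few regular expressions. This is a toy in plain Ruby that works on source lines rather than on an AST. `declare_instance_vars` is a hypothetical helper, the patterns cover only the cases named above, and two choices are assumptions of this sketch rather than documented compiler behavior: several guesses for one variable are joined into a union, and an explicit declaration always takes precedence over guesses.

```ruby
# Toy type guesser working on source lines. Hypothetical helper; the regexes
# and the union/precedence rules are assumptions of this sketch.
def declare_instance_vars(source)
  explicit = {}
  guessed = {}
  source.each_line do |line|
    case line
    when /(@\w+)\s*:\s*([A-Z]\w*)/        # @x : Int32
      explicit[$1] = $2
    when /(@\w+)\s*=\s*\d+\s*$/           # @x = 1
      (guessed[$1] ||= []) << "Int32"
    when /(@\w+)\s*=\s*([A-Z]\w*)\.new\b/ # @x = Foo.new
      (guessed[$1] ||= []) << $2
    end
  end
  # Combine both sources: guesses form a union, explicit declarations win.
  guessed.transform_values { |types| types.uniq.join(" | ") }.merge(explicit)
end

source = <<~CRYSTAL
  @a : String
  @b = 1
  @c = Foo.new
  @b = Foo.new
CRYSTAL

p declare_instance_vars(source)
```

For this input the result maps `@a` to `String`, `@b` to `Int32 | Foo` and `@c` to `Foo`.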
Semantic: check abstract defs
● Located at abstract_def_checker.cr
● Not a visitor, but traverses all types, and for those that have abstract defs checks that subclasses or including types define those methods
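The traversal can be sketched as a plain loop over a table of types. This is a toy model in Ruby under stated assumptions: `TypeInfo` and `missing_abstract_defs` are hypothetical names, methods are compared by name only (the real check also has to consider arity and restrictions), and abstract subclasses, which would be exempt, are not modeled.

```ruby
# Toy abstract-def check: not a visitor, just a walk over all known types.
# TypeInfo and missing_abstract_defs are hypothetical names. `parents` holds
# the superclass and included modules of a type, by name.
TypeInfo = Struct.new(:name, :parents, :abstract_defs, :defs)

def missing_abstract_defs(types)
  errors = []
  types.each do |type|
    type.parents.each do |parent_name|
      parent = types.find { |t| t.name == parent_name }
      next unless parent
      # Every abstract def of the parent must be defined by this type.
      (parent.abstract_defs - type.defs).each do |method_name|
        errors << "#{type.name} must implement #{parent.name}##{method_name}"
      end
    end
  end
  errors
end

types = [
  TypeInfo.new("Shape", [], ["area"], []),
  TypeInfo.new("Square", ["Shape"], [], ["area"]),
  TypeInfo.new("Circle", ["Shape"], [], []),
]

puts missing_abstract_defs(types)
# Circle must implement Shape#area
```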

  • 6. Let’s talk about webapps... ● HTML/CSS/JS ● React/Angular/Knockout ● Ruby/Erlang/Elixir ● Database (mysql/postgres) ● Elasticsearch ● Redis/Sidekiq/Background-jobs ● Docker, capistrano, deploy, servers Easy…?
  • 7. Let’s talk about compilers... ● HTML/CSS/JS ● React/Angular/Knockout ● Ruby/Erlang/Elixir ● Database (mysql/postgres) ● Elasticsearch ● Redis/Sidekiq/Background-jobs ● Docker, capistrano, deploy, servers Easy!
  • 8. Let’s talk about compilers...
  • 10. No, let’s talk about usual programs
  • 11. No, let’s talk about usual programs INPUT -> [PROCESSING…] -> OUTPUT
  • 12. No, let’s talk about compilers SOURCE CODE -> [PROCESSING…] -> EXECUTABLE
  • 13. No, let’s talk about compilers SOURCE CODE -> [PROCESSING…] -> EXECUTABLE How do we go from source code to an executable?
  • 14. Traditional stages of a compiler
        class Foo
          def bar
            1 + 2
          end
        end
        ● Lexer: [“class”, “Foo”, “;”, “def”, “bar”, “;”, “1”, “+”, “2”, “;”, “end”, “;”, “end”]
        ● Parser: ClassDef(“Foo”, body: [Def.new(“bar”)])
        ● Semantic (a.k.a. “type check”): make sure there are no type errors
        ● Codegen: generate machine code
  • 15. Let’s start with the codegen phase Goal: generate efficient assembly code for many architectures (32 bits, 64 bits, intel, arm, etc.) ● Generating assembly code is hard ● Generating efficient assembly code is harder ● Generating assembly code for many architectures is hard/tedious/boring
  • 16. Let’s start with the codegen phase Goal: generate efficient assembly code for many architectures (32 bits, 64 bits, intel, arm, etc.) ● Generating assembly code is hard ● Generating efficient assembly code is harder ● Generating assembly code for many architectures is hard/tedious/boring Thus: writing a compiler is HARD! :-(
  • 17. Let’s start with the codegen phase Goal: generate efficient assembly code for many architectures (32 bits, 64 bits, intel, arm, etc.) ● Generating assembly code is hard ● Generating efficient assembly code is harder ● Generating assembly code for many architectures is hard/tedious/boring Thus: writing a compiler is HARD! :-( Well, not anymore...
  • 20. Codegen ● With LLVM, we generate LLVM IR (intermediate representation) instead of assembly, and LLVM takes care of generating efficient assembly code for us! The hardest part is solved :-)
  • 21. Codegen: LLVM (example)
        define i32 @add(i32 %x, i32 %y) {
          %0 = add i32 %x, %y
          ret i32 %0
        }
  • 22. LLVM provides a nice API to generate IR
        require "llvm"
        mod = LLVM::Module.new("main")
        mod.functions.add("add", [LLVM::Int32, LLVM::Int32], LLVM::Int32) do |func|
          func.basic_blocks.append do |builder|
            res = builder.add(func.params[0], func.params[1])
            builder.ret(res)
          end
        end
        puts mod
  • 23. Remaining phases ● Lexer ● Parser ● Semantic
  • 24. Lexer ● Kind of easy: go char by char until we get a keyword, identifier, number, etc. ● We won’t go into implementation details...
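The char-by-char idea can be sketched with a toy lexer. This is only an illustration under stated assumptions, not the code that lives in Crystal's syntax/ directory: the `lex` method and its token rules (a newline becomes ";", letters and digits are grouped into words and numbers, anything else is a one-character token) are made up for this example.

```crystal
# Toy lexer (hypothetical, not Crystal's real lexer): walks the source
# char by char and groups characters into string tokens.
def lex(source : String) : Array(String)
  tokens = [] of String
  chars = source.chars
  i = 0
  while i < chars.size
    c = chars[i]
    if c == '\n'
      # A newline ends an expression, like ";" in the slide's token list.
      tokens << ";"
      i += 1
    elsif c.ascii_whitespace?
      # Other whitespace only separates tokens.
      i += 1
    elsif c.ascii_letter? || c == '_'
      # Keyword or identifier: consume letters, digits and underscores.
      start = i
      while i < chars.size && (chars[i].ascii_alphanumeric? || chars[i] == '_')
        i += 1
      end
      tokens << chars[start...i].join
    elsif c.ascii_number?
      # Integer literal: consume consecutive digits.
      start = i
      while i < chars.size && chars[i].ascii_number?
        i += 1
      end
      tokens << chars[start...i].join
    else
      # Anything else (for example "+") is a single-character token.
      tokens << c.to_s
      i += 1
    end
  end
  tokens
end

p lex("class Foo\n  def bar\n    1 + 2\n  end\nend")
```

For the `class Foo` snippet from slide 14, this produces the same token sequence listed there. A real lexer would also record token kinds and source locations, which this sketch leaves out.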
  • 25. Parser ● Kind of easy: go token by token and create a tree of expressions ● This tree is called AST: Abstract Syntax Tree ● An AST is like a directed, acyclic graph ● We won’t go into implementation details...
  • 26. Semantic ● This is the fundamental piece of the compiler ● It takes an AST as input and analyzes it ● Analysis can result in: ○ Declaring types: for example “class Foo; end” will declare a type Foo ○ Checking methods: for example “Foo.bar” will check that “Foo” is a declared type and that the method “bar” exists in it, and has the correct arity and types ○ Giving each non-dead expression in the program a type ○ Gathering some info for the codegen phase: for example know the local variables of a method, and their type
  • 27. Semantic ● The interesting part of the compiler is the semantic phase ● It’s just about processing an AST ● In Crystal’s compiler you just need to know one language: Crystal! ● No HTML/CSS/JS/JSX/etc. ● No untyped, dynamic languages: no Ruby/Erlang/Elixir. Type safe! ● Stuff is processed in memory ● No databases, no Elasticsearch, no Redis
  • 28. Semantic ● The interesting part of the compiler is the semantic phase ● It’s just about processing an AST ● In Crystal’s compiler you just need to know one language: Crystal! ● No HTML/CSS/JS/JSX/etc. ● No untyped, dynamic languages: no Ruby/Erlang/Elixir. Type safe! ● Stuff is processed in memory ● No databases, no Elasticsearch, no Redis Writing a compiler is easier than writing a web app! ^_^
  • 29. Semantic ● The interesting part of the compiler is the semantic phase ● It’s just about processing an AST ● In Crystal’s compiler you just need to know one language: Crystal! ● No HTML/CSS/JS/JSX/etc. ● No untyped, dynamic languages: no Ruby/Erlang/Elixir. Type safe! ● Stuff is processed in memory ● No databases, no Elasticsearch, no Redis Writing a compiler is easier than writing a web app! ^_^ (Or at least it’s more fun :-P)
  • 31. Directory layout ● src/compiler/crystal ○ command/ ○ syntax/ ○ semantic/ ○ macros/ ○ codegen/ ○ tools/ ○ compiler.cr ○ types.cr ○ program.cr
  • 32. Directory layout ● src/compiler/crystal ○ command/ : the command line interface ○ syntax/ : lexer, parser, ast, visitor, transformer ○ semantic/ : type declaration, method lookup, etc. ○ macros/ : macro expansion logic ○ codegen/ : codegen ○ tools/ : doc generator, formatter, init ○ compiler.cr : combines syntax + semantic + codegen ○ types.cr : all possible types in Crystal (Int32, String, unions, custom types, etc.) ○ program.cr : holds definitions of a program (holds Int32, String, etc.)
  • 33. Directory layout ● src/compiler/crystal : ~43K LOC ○ command/ : ~300LOC ○ syntax/ : ~10K LOC ○ semantic/ : ~12K LOC ○ macros/ : ~2K LOC ○ codegen/ : ~6K LOC ○ tools/ : ~7K LOC ○ compiler.cr : ~300LOC ○ types.cr :~2K LOC ○ program.cr : ~300 LOC
  • 34. Directory layout ● src/compiler/crystal : ~43K LOC ○ command/ : ~300LOC ○ syntax/ : ~10K LOC ○ semantic/ : ~12K LOC ○ macros/ : ~2K LOC ○ codegen/ : ~6K LOC ○ tools/ : ~7K LOC ○ compiler.cr : ~300LOC ○ types.cr :~2K LOC ○ program.cr : ~300 LOC About 14K LOC to analyze source code.
  • 35. Directory layout ● src/compiler/crystal : ~43K LOC ○ command/ : ~300LOC ○ syntax/ : ~10K LOC ○ semantic/ : ~12K LOC ○ macros/ : ~2K LOC ○ codegen/ : ~6K LOC ○ tools/ : ~7K LOC ○ compiler.cr : ~300LOC ○ types.cr :~2K LOC ○ program.cr : ~300 LOC About 14K LOC to analyze source code. One big Rails app at Manas has 14K LOC in “./app”
  • 36. Directory layout ● src/compiler/crystal : ~43K LOC ○ command/ : ~300LOC ○ syntax/ : ~10K LOC ○ semantic/ : ~12K LOC ○ macros/ : ~2K LOC ○ codegen/ : ~6K LOC ○ tools/ : ~7K LOC ○ compiler.cr : ~300LOC ○ types.cr :~2K LOC ○ program.cr : ~300 LOC About 14K LOC to analyze source code. One big Rails app at Manas has 14K LOC in “./app” A compiler can’t be that hard! ;-)
  • 37. Show me the code
  • 38. Show me the code
        # src/compiler/crystal/compiler.cr
        def compile(source : Source | Array(Source), output_filename : String) : Result
          source = [source] unless source.is_a?(Array)
          program = new_program(source)
          node = parse program, source
          node = program.semantic node, @stats
          codegen program, node, source, output_filename unless @no_codegen
          Result.new program, node
        end
  • 40. Show me the code
        # src/compiler/crystal/compiler.cr
        def compile(source : Source | Array(Source), output_filename : String) : Result
          source = [source] unless source.is_a?(Array)
          program = new_program(source)
          node = parse program, source
          node = program.semantic node, @stats
          codegen program, node, source, output_filename unless @no_codegen
          Result.new program, node
        end
        What is a program?
  • 41. Program ● Holds all types and top-level methods for a given compilation ● For example, if I compile “class Foo; end” and you compile “class Bar; end”, the first program will have a type named “Foo”, and the second one won’t (but it will have a type named “Bar”) ● It lets us test the compiler more easily, because we can use a different Program instance for each snippet of code that we want to test ● This contrasts with having global variables hold all of a program’s data ● A Program is passed around in all phases of a compilation (except lexing and parsing, which don’t need semantic info)
  • 42. Show me the code
        # src/compiler/crystal/compiler.cr
        def compile(source : Source | Array(Source), output_filename : String) : Result
          source = [source] unless source.is_a?(Array)
          program = new_program(source)
          node = parse program, source # from source to Crystal::ASTNode
          node = program.semantic node, @stats
          codegen program, node, source, output_filename unless @no_codegen
          Result.new program, node
        end
        What is a program?
  • 43. Show me the code
        # src/compiler/crystal/compiler.cr
        def compile(source : Source | Array(Source), output_filename : String) : Result
          source = [source] unless source.is_a?(Array)
          program = new_program(source)
          node = parse program, source
          node = program.semantic node, @stats # Semantic! :-)
          codegen program, node, source, output_filename unless @no_codegen
          Result.new program, node
        end
        What is a program?
  • 44. Semantic ● The entry point for semantic analysis is in src/compiler/crystal/semantic.cr ● Other files are in src/compiler/crystal/semantic/ ● The file semantic.cr has comments that explain the overall algorithm :-)
  • 45. Semantic: overall algorithm ● top level: declare classes, modules, macros, defs and other top-level stuff ● new methods: create `new` methods for every `initialize` method ● type declarations: process type declarations like `@x : Int32` ● check abstract defs: check that abstract defs are implemented ● class_vars_initializers: process initializers like `@@x = 1` ● instance_vars_initializers: process initializers like `@x = 1` ● main: process "main" code, calls and method bodies (the whole program). ● cleanup: remove dead code and other simplifications ● check recursive structs: check that structs are not recursive (impossible to codegen)
  • 46. Semantic: overall algorithm Note! ● This algorithm didn’t come from the Skies (nor from a textbook, nor from a paper) ● It’s not written in stone! ● It can definitely be improved: readability, performance, etc.
  • 47. Semantic: overall algorithm Note! ● It’s actually more like this…
  • 48. Semantic But before looking at each phase, we need to learn about the most useful pattern for analyzing an AST...
  • 50. require "compiler/crystal/syntax"
        class SumVisitor < Crystal::Visitor
          getter sum = 0
          def visit(node : Crystal::NumberLiteral)
            @sum += node.value.to_i
          end
          def visit(node : Crystal::ASTNode)
            true # true: continue visiting children nodes
          end
        end
        ast = Crystal::Parser.parse("foo(1 + 2, 3, [4])")
        visitor = SumVisitor.new
        ast.accept(visitor)
        puts visitor.sum
  • 51. The Visitor pattern ● We define a visit method for each node of interest ● We process the nodes ● We return true if we want to process children, false otherwise ● Example: if we only want to process class declarations, we could just define visit(node : Crystal::ClassDef) and define some logic there (and return true, because of nested class definitions) ● A visitor abstracts over the way nodes are composed ● ...though in many cases, for semantic purposes, we need and use the way a node is composed (for example, to analyze a call we need to know the argument types, so we check the arguments, not all children in a generic way)
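The class-declaration example mentioned above can be sketched with the same API that the SumVisitor slide uses (`Crystal::Visitor`, `Crystal::Parser.parse`, `accept`). The `ClassNameVisitor` name is hypothetical, and the sketch assumes that calling `to_s` on a `ClassDef` node's `name` gives the class name as written in the source.

```crystal
require "compiler/crystal/syntax"

# Hypothetical visitor that only cares about class declarations.
class ClassNameVisitor < Crystal::Visitor
  getter names = [] of String

  def visit(node : Crystal::ClassDef)
    # Assumption: `node.name.to_s` is the name as written in the source.
    @names << node.name.to_s
    true # keep visiting, so nested class definitions are found too
  end

  def visit(node : Crystal::ASTNode)
    true # continue into the children of every other node
  end
end

ast = Crystal::Parser.parse("class Foo; class Bar; end; end; class Baz; end")
visitor = ClassNameVisitor.new
ast.accept(visitor)
p visitor.names
```

Returning `true` from the `ClassDef` overload is what lets the nested `Bar` be reached. Returning `false` there would stop at the outermost classes.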
  • 52. Semantic: overall algorithm ● top level: declare classes, modules, macros, defs and other top-level stuff ● new methods ● type declarations ● check abstract defs ● class_vars_initializers ● instance_vars_initializers ● main ● cleanup ● check recursive structs
  • 53. Top level: declare classes, modules, macros, defs...
        # src/compiler/crystal/semantic/top_level_visitor.cr
        class Crystal::TopLevelVisitor < Crystal::SemanticVisitor
          # ...
        end
  • 54. Crystal::SemanticVisitor ● Located at semantic_visitor.cr ● This is a base visitor used in most of the phases of the semantic analysis ● It keeps track of the “current type” ● For example in “class Foo; class Bar; baz; end; end”, “current type” starts at the top-level (the Program). When “class Foo” is found, the current type becomes “Foo” (we search “Foo” in the current type). When “class Bar” is found, the current type becomes “Foo::Bar” (we search “Bar” in the current type). When “baz” is found, it will be looked up inside the current type. ● But initially there’s no “Foo” inside the current type (the Program). Who defines it? … The top-level visitor!
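The “current type” bookkeeping can be illustrated with a self-contained toy. Nothing here is the compiler's real code: `Node`, `ClassNode`, `CallNode` and `CurrentTypeTracker` are hypothetical stand-ins, and the real SemanticVisitor works on Crystal::ASTNode and actual type objects rather than on strings.

```crystal
# Toy AST (hypothetical): a class definition with a body, and a bare call.
abstract class Node
end

class ClassNode < Node
  getter name : String
  getter body : Array(Node)

  def initialize(@name : String, @body : Array(Node))
  end
end

class CallNode < Node
  getter name : String

  def initialize(@name : String)
  end
end

# Tracks the current type as a stack of names while walking the tree.
class CurrentTypeTracker
  getter log = [] of String

  def initialize
    @scope = [] of String
  end

  def current_type : String
    @scope.empty? ? "Program" : @scope.join("::")
  end

  def visit(node : ClassNode) : Nil
    @scope.push(node.name) # entering "class X": X becomes the current type
    node.body.each { |child| visit(child) }
    @scope.pop # leaving the class: restore the enclosing type
  end

  def visit(node : CallNode) : Nil
    @log << "#{node.name} is looked up in #{current_type}"
  end

  def visit(node : Node) : Nil
    # Other node kinds are ignored in this sketch.
  end
end

# Equivalent of "class Foo; class Bar; baz; end; end"
inner = ClassNode.new("Bar", [CallNode.new("baz")] of Node)
tree = ClassNode.new("Foo", [inner] of Node)

tracker = CurrentTypeTracker.new
tracker.visit(tree)
puts tracker.log.join("\n")
```

The stack is pushed on entering a class and popped on leaving it, which is why `baz` ends up being looked up in `Foo::Bar` and the current type is back at the Program once the walk finishes.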
  • 55. Crystal::TopLevelVisitor ● Located at top_level_visitor.cr ● Defines classes, methods, etc. ● Given “class Foo; class Bar; baz; end; end”... ● current_type starts at Program ● When “class Foo” is found (ClassDef), we check if “Foo” exists in the current type. If not, we create it. If it exists with a different type (if it’s a module), we give an error. ● We attach this type “Foo” to the AST node ClassDef. SemanticVisitor will use this in every subsequent phase. ● … the “baz” call is not analyzed here (unless it’s a macro)
  • 56. Crystal::TopLevelVisitor ● Many other things done in this visitor: methods and macros are added to types, aliases and enums are defined, etc. ● Question: why are methods and macros defined at this phase?
  • 57. When should macros be defined and expanded
        class Foo
          macro inherited
            do_something
          end
          macro do_something
            def foo; 1; end
          end
        end
        class Bar < Foo; end
        class Baz < Foo
          def foo; 3; end
        end
        puts Bar.new.foo # => 1
        puts Baz.new.foo # => 3
        ● The “inherited” macro hook must be processed as soon as “Bar < Foo” and “Baz < Foo” are found
        ● The macro expands to “do_something”, which must expand to “def foo; 1; end”
        ● This must happen before we continue processing Baz’s body: “def foo; 3; end” must win and be the method found when doing “Baz.new.foo”
        ● Conclusion: methods, macros and hooks must be defined in the first pass, when defining types. Additionally, macros might be looked up in types in this same pass (like “do_something”)
        ● SemanticVisitor takes care to look up and expand calls that resolve to macro calls
  • 58. Method overloads ● Crystal methods are very powerful! For example: optional type restrictions, different number of arguments, default arguments, splat, etc. ● When methods are added to types we need to: ○ Know if a method replaces (redefines) an old method ○ Track whether a method is “stricter” than another method, to quickly know, given a call’s argument types, in which order the overloads are going to be tested
  • 59. Method restrictions
        def foo(x : Int32)
          puts 1
        end
        def foo(x)
          puts 2
        end
        foo(1)
        foo('a')
        ● Given foo(1), both methods match it. However, the first overload should be invoked because it has a stronger restriction than the second overload.
        ● If we define the methods in a different order, it still works the same
        ● This is because an argument with a type restriction is stronger than one without one. We say that the first one is a restriction of the second one (we should probably rename this to use stronger)
        ● This applies to types too: Int32 is stronger than Int32 | String. And Bar is stronger than Foo, if Bar < Foo.
        ● Given two methods with the same name, if all arguments of a method are stronger than the others’, the whole method is stronger and should come first. Each type stores an ordered list of methods indexed by method name, with this notion.
        ● If the methods are both stronger than each other, they have the same restriction.
  • 60. Method restrictions
        def foo(x : Int32)
          puts 1
        end
        def foo(x)
          puts 2
        end
        foo(1)
        foo('a')
        ● This logic is located at restrictions.cr
        ● A lot of cases to consider: generics, tuples, splats, etc.
        ● The code and algorithms could probably use a simpler, unified logic and a cleanup, but first all of these concepts and definitions must be defined much more formally
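The ordering rule for overloads can be checked from user code. In this small sketch the weaker overload is deliberately defined first, and a union restriction is compared against a single type. The method names `foo` and `bar` and their return values are chosen only for illustration.

```crystal
# The unrestricted overload is defined first on purpose.
def foo(x)
  2
end

# Stronger overload: has a type restriction, so it is tried first for Int32.
def foo(x : Int32)
  1
end

# A union restriction is weaker than one of its member types.
def bar(x : Int32 | String)
  "union"
end

def bar(x : Int32)
  "int32"
end

p foo(1)    # matches both overloads; the restricted one is picked
p foo('a')  # only the unrestricted overload matches a Char
p bar(1)    # Int32 is stronger than Int32 | String
p bar("hi") # only the union overload matches a String
```

Swapping the definition order of either pair should not change which overload is chosen, which is the point of keeping the per-name method list ordered by restriction strength.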
  • 61. Semantic: overall algorithm ● top level ● new methods: create `new` methods for every `initialize` method ● type declarations ● check abstract defs ● class_vars_initializers ● instance_vars_initializers ● main ● cleanup ● check recursive structs
  • 62. Semantic: new methods ● Located at new.cr ● TopLevelVisitor creates a `new` class method for every `initialize` method it finds (the logic for this is also in new.cr) ● Classes that end up without an `initialize` need a default, argless `self.new` method ● This phase is a bit messy right now because of some missing things related to generics…
  • 63. Semantic: new methods
        class Foo
          def initialize(x : Int32)
            @x = x
          end
          # Generated from the above
          def self.new(x : Int32)
            instance = allocate
            instance.initialize(x)
            if instance.responds_to?(:finalize)
              ::GC.add_finalizer(instance)
            end
            instance # return the newly created object
          end
        end
  • 64. Semantic: overall algorithm ● top level ● new methods ● type declarations: process type declarations like `@x : Int32` ● check abstract defs ● class_vars_initializers ● instance_vars_initializers ● main ● cleanup ● check recursive structs
  • 65. Semantic: type declarations ● Located at type_declaration_processor.cr (and type_declaration_visitor.cr and type_guess_visitor.cr) ● Combines info gathered by these two visitors to declare the type of instance and class variables. ● TypeDeclarationVisitor deals with explicit type declarations ● TypeGuessVisitor tries to “guess” the type of instance and class variables without explicit type annotations (for example @x = 1 and @x = Foo.new)
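From the user's side, the two sources of information look like this. The classes `Explicit` and `Guessed` are hypothetical. The comments name the visitor that, according to the slide above, handles each case; this is a sketch of the language behavior, not of the visitors' internals.

```crystal
class Explicit
  @x : Int32 # explicit declaration (the TypeDeclarationVisitor case)

  def initialize(x : Int32)
    @x = x
  end

  def x
    @x
  end
end

class Guessed
  def initialize
    @count = 0               # no annotation: guessed from the literal
    @inner = Explicit.new(1) # no annotation: guessed from the Type.new call
  end

  def count
    @count
  end

  def inner
    @inner
  end
end

p typeof(Guessed.new.count)
p typeof(Guessed.new.inner)
p typeof(Explicit.new(2).x)
```

Both guesses follow the patterns named on the slide (`@x = 1` and `@x = Foo.new`), so no annotation is needed on `@count` or `@inner`.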
  • 66. Semantic: overall algorithm ● top level ● new methods ● type declarations ● check abstract defs: check that abstract defs are implemented ● class_vars_initializers ● instance_vars_initializers ● main ● cleanup ● check recursive structs
  • 67. Semantic: check abstract defs ● Located at abstract_def_checker.cr ● Not a visitor, but traverses all types, and for those that have abstract defs checks that subclasses or including types define those methods
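A minimal program that passes this check might look like the sketch below. The `Shape` and `Square` names are hypothetical. The comment about the failure case describes the expected compile-time behavior and is not something the snippet itself demonstrates.

```crystal
abstract class Shape
  # Every concrete subclass is expected to implement this method.
  abstract def area
end

class Square < Shape
  def initialize(@side : Int32)
  end

  # Deleting this method should make compilation fail, because the
  # abstract def from Shape would be left unimplemented.
  def area
    @side * @side
  end
end

p Square.new(3).area
```

Because the check walks types rather than call sites, an unimplemented abstract def is reported even when the method is never called.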