Writing parsers in C#
(“Projecting arbitrary character streams into C# objects using monadic
parser combinators”)
Speaker: Alexey Golub @Tyrrrz
What is a parser?
• To parse — to resolve text into logical syntactic components
• i.e. IEnumerable<T> Parse(IEnumerable<char> text)
• e.g. double.Parse, XDocument.Parse
Where are parsers used?
• Data deserialization (JSON, XML, YAML)
• Static code analysis (ReSharper, TSLint)
• Syntax highlighting (VS Code, Highlight.js)
• Compilers, transpilers, interpreters (Roslyn, Markdig, Babel, SQL)
• Template engines (Razor, Liquid, Scriban)
• Natural language processing (Spellchecking, Translation)
What do parsers do?
• Disambiguate text into domain objects
• Assert that the text is well-formed
[Figure: the text "123 456,93" read either as several numeric literals, or as a single numeric literal with a thousands separator (space) and a decimal separator (comma)]
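The ambiguity of "123 456,93" can be sketched in C# as follows. This is a minimal illustration, not code from the talk: the class and method names are hypothetical, and the format with a space as the group separator and a comma as the decimal separator is an assumption chosen to match the slide.

```csharp
using System.Globalization;

public static class NumberFormatDemo
{
    // Assumption: space separates thousands, comma separates decimals.
    private static readonly NumberFormatInfo SpaceGrouped = new NumberFormatInfo
    {
        NumberGroupSeparator = " ",
        NumberDecimalSeparator = ","
    };

    // Reads the whole text as ONE numeric literal under the assumed format.
    public static double ParseWithSpaceGroups(string text) =>
        double.Parse(text, NumberStyles.Number, SpaceGrouped);

    // Under the invariant culture ("," groups, "." decimals) the same text
    // is not a well-formed single number, so this returns false for it.
    public static bool IsWellFormedInvariant(string text) =>
        double.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out _);
}
```

The same characters are accepted or rejected depending on the rules the parser was given, which is the "disambiguate" and "assert well-formedness" pair from the bullets above.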
Formal language theory
• Alphabet – set of allowed characters
• Language – set of words made from characters in alphabet
• Grammar – set of rules that define how words are generated
Grammar types
• Regular grammar – RHS of a production rule is a terminal or a
terminal plus non-terminal
• Context-free grammar – RHS of a production rule is a finite sequence
of terminals and/or non-terminals
Rules of thumb
• If a grammar needs rules that recurse in the middle of a production (e.g. nested brackets) – the language is not regular
• Regular grammar can be represented with regular expressions
• Context-free grammar cannot be directly represented with regular
expressions (in .NET)
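A minimal C# sketch of the difference, using two toy languages that are not from the slides: integer literals (regular, so a regular expression is enough) and balanced parentheses (nested, so a recursive method is used instead). The class and method names are illustrative.

```csharp
using System.Text.RegularExpressions;

public static class GrammarDemo
{
    // Regular language: optional sign followed by one or more digits.
    private static readonly Regex IntegerLiteral = new Regex(@"\A[+-]?[0-9]+\z");

    public static bool IsIntegerLiteral(string text) => IntegerLiteral.IsMatch(text);

    // Context-free language: S -> "(" S ")" S | empty
    public static bool IsBalanced(string text)
    {
        var position = 0;
        return ParseS(text, ref position) && position == text.Length;
    }

    // Each loop iteration handles one "(" S ")" group; the loop itself
    // plays the role of the trailing S in the rule.
    private static bool ParseS(string text, ref int position)
    {
        while (position < text.Length && text[position] == '(')
        {
            position++; // consume "("
            if (!ParseS(text, ref position)) return false; // nested S
            if (position >= text.Length || text[position] != ')') return false;
            position++; // consume ")"
        }
        return true;
    }
}
```

The recursive call inside `ParseS` is what a plain regular expression has no counterpart for: the nesting depth is unbounded, so it cannot be tracked with a fixed number of states.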
Syntax trees
• Primary goal of a parser is to break down text into syntactic
components
• Syntactic structure of context-free languages is represented by a
syntax tree
• Program can then further evaluate the syntax tree as required
[Figure: syntax tree with a root, terminal nodes, and a non-terminal node]
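One possible way to model such a tree in C# and evaluate it afterwards is sketched below. The node types (`Node`, `NumberNode`, `AddNode`) and the `Evaluator` class are hypothetical names for illustration, and records assume C# 9 or later.

```csharp
using System;

public abstract record Node;

// Terminal node: a leaf that carries a value taken directly from the text.
public sealed record NumberNode(double Value) : Node;

// Non-terminal node: composed of other nodes.
public sealed record AddNode(Node Left, Node Right) : Node;

public static class Evaluator
{
    // Walks the tree recursively and computes its numeric value.
    public static double Evaluate(Node node) => node switch
    {
        NumberNode number => number.Value,
        AddNode add => Evaluate(add.Left) + Evaluate(add.Right),
        _ => throw new ArgumentException("Unknown node type.", nameof(node))
    };
}
```

For example, `Evaluator.Evaluate(new AddNode(new NumberNode(10), new NumberNode(5)))` walks the two leaves and adds them. Evaluation is only one use of the tree; the same structure could be printed, analysed, or transformed.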
Example AST produced by C-like code
Approaches
• Loop/stack-based manual parsers
• Loop through all characters in the input
• Maintain context on a stack
• Parser generators
• Custom language that defines grammar
• Compiles into code that you can execute
• Parser combinators
• Each parser is a delegate
• Parsers can be combined into higher-order parsers
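The first approach (loop over the characters, keep context on a stack) can be sketched in a few lines. The example below is a hypothetical bracket checker, not code from the talk or from any library mentioned here; it only validates the input and does not build a tree.

```csharp
using System.Collections.Generic;

public static class ManualBracketParser
{
    public static bool IsWellFormed(string text)
    {
        // The stack is the "context": every bracket opened but not yet closed.
        var open = new Stack<char>();

        foreach (var c in text)
        {
            switch (c)
            {
                case '(':
                case '[':
                case '{':
                    open.Push(c);
                    break;

                case ')':
                case ']':
                case '}':
                    if (open.Count == 0) return false; // nothing to close
                    var expected = open.Pop() switch { '(' => ')', '[' => ']', _ => '}' };
                    if (c != expected) return false;   // wrong kind of bracket
                    break;

                // Any other character is ignored in this sketch.
            }
        }

        return open.Count == 0; // everything opened was closed
    }
}
```

Manual parsers like this tend to be fast and flexible, but the grammar ends up implicit in the control flow, which is the trade-off that generators and combinators try to address.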
Example from JSON.net (manual parser)
ANTLR (parser generator)
Sprache (parser combinator)
Parser combinators
• Start by building simple parsers
• Combine them into more complex parsers
• Repeat until you reach the root
• Hierarchy of parsers should resemble target syntax tree
Parser combinators (illustrated)
[Figure: the input "10 + 5" is handled by NumberParser, WhiteSpaceParser and SignParser, chained as NumberParser THEN WhiteSpaceParser THEN SignParser THEN WhiteSpaceParser THEN NumberParser to form OperatorParser, which yields PlusOperator with the operands Number (10) and Number (5)]
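The "10 + 5" illustration can be approximated with a small hand-rolled combinator sketch. The parser names follow the slide; everything else (the `Parser<T>` delegate shape, `Many1`, `Char`, `Select`, `Then`, and the record types) is an assumption made for this example and is not the API of Sprache or of any other library. Records assume C# 9 or later.

```csharp
using System;

// A parser is a delegate: it either fails (null) or returns a result plus
// the position of the first character it did not consume.
public delegate (T Result, int Rest)? Parser<T>(string input, int position);

public static class Parsers
{
    // Consumes exactly one expected character.
    public static Parser<char> Char(char expected) => (input, position) =>
    {
        if (position >= input.Length || input[position] != expected) return null;
        return (expected, position + 1);
    };

    // Consumes one or more characters that satisfy the predicate.
    public static Parser<string> Many1(Func<char, bool> predicate) => (input, position) =>
    {
        var end = position;
        while (end < input.Length && predicate(input[end])) end++;
        if (end == position) return null;
        return (input.Substring(position, end - position), end);
    };

    // Projects the result of a parser into another type.
    public static Parser<U> Select<T, U>(this Parser<T> parser, Func<T, U> map) => (input, position) =>
    {
        var result = parser(input, position);
        if (result is null) return null;
        return (map(result.Value.Result), result.Value.Rest);
    };

    // Runs the first parser, THEN the second from where the first stopped.
    public static Parser<(A, B)> Then<A, B>(this Parser<A> first, Parser<B> second) => (input, position) =>
    {
        var a = first(input, position);
        if (a is null) return null;
        var b = second(input, a.Value.Rest);
        if (b is null) return null;
        return ((a.Value.Result, b.Value.Result), b.Value.Rest);
    };
}

public abstract record Expr;
public sealed record Number(int Value) : Expr;
public sealed record PlusOperator(Expr Left, Expr Right) : Expr;

public static class ExpressionGrammar
{
    public static readonly Parser<Number> NumberParser =
        Parsers.Many1(c => c >= '0' && c <= '9').Select(digits => new Number(int.Parse(digits)));

    public static readonly Parser<string> WhiteSpaceParser = Parsers.Many1(char.IsWhiteSpace);

    public static readonly Parser<char> SignParser = Parsers.Char('+');

    // Higher-order parser built purely by combining the simple ones above.
    public static readonly Parser<PlusOperator> OperatorParser =
        NumberParser
            .Then(WhiteSpaceParser)
            .Then(SignParser)
            .Then(WhiteSpaceParser)
            .Then(NumberParser)
            .Select(r =>
            {
                var ((((left, _), _), _), right) = r; // keep only the operands
                return new PlusOperator(left, right);
            });

    public static PlusOperator Parse(string text)
    {
        var result = OperatorParser(text, 0);
        if (result is null || result.Value.Rest != text.Length)
            throw new FormatException("Expected input of the form '<number> + <number>'.");
        return result.Value.Result;
    }
}
```

Calling `ExpressionGrammar.Parse("10 + 5")` produces a `PlusOperator` whose operands are `Number(10)` and `Number(5)`, so the hierarchy of parsers mirrors the shape of the resulting tree. A fuller library would add alternatives, repetition, and error messages on top of the same delegate shape.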
Coding challenge
Let’s develop a basic JSON parser
Further reading
• Formal grammar on Wikipedia –
https://en.wikipedia.org/wiki/Formal_grammar
• Parsing in C# by Federico Tomassetti –
https://tomassetti.me/parsing-in-csharp
Thank you!
@Tyrrrz

Alexey Golub - Writing parsers in c# | 3Shape Meetup
