1. Sumit Gulwani presented on programming by examples, an approach where users provide input-output examples to automatically generate programs.
2. Programming by examples has been used to automate tasks like data transformation, table extraction and reshaping, and code refactoring by learning patterns from examples.
3. Current research aims to improve predictive program synthesis by generating programs from just input examples, synthesizing code in different target languages, and integrating examples into existing workflows like IDEs and notebooks. The future of programming by examples includes handling multiple modalities like examples and natural language.
51. Table Reshaping
50% of spreadsheets are semi-structured.
KPMG, Deloitte budget millions of dollars for normalization.
“FlashRelate: Extracting Relational Data from Semi-Structured Spreadsheets Using Examples”
[PLDI 2015] Dan Barowy, Sumit Gulwani, Ted Hart, Ben Zorn
Input (semi-structured spreadsheet):
Bureau of I.A.
Regional Dir.   Numbers
Niles C.        Tel: (800)645-8397
                Fax: (907)586-7252
Jean H.         Tel: (918)781-4600
                Fax: (918)781-4604
Frank K.        Tel: (615)564-6500
                Fax: (615)564-6701

Output (relational table):
            Tel             Fax
Niles C.    (800)645-8397   (907)586-7252
Jean H.     (918)781-4600   (918)781-4604
Frank K.    (615)564-6500   (615)564-6701

FlashRelate: from few examples of rows in output table.
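To make the reshaping concrete, here is a minimal hand-written sketch of the kind of program a tool like FlashRelate would be expected to produce for the spreadsheet above. It is not FlashRelate's algorithm or output format: the function name `reshape`, the two-column input layout, and the assumption that the name cell is blank on "Fax" rows are all illustrative assumptions.

```python
import re

# Assumed layout: the name appears only on the "Tel" row; the "Fax" row
# that follows leaves the name cell blank.
rows = [
    ("Niles C.", "Tel: (800)645-8397"),
    ("", "Fax: (907)586-7252"),
    ("Jean H.", "Tel: (918)781-4600"),
    ("", "Fax: (918)781-4604"),
    ("Frank K.", "Tel: (615)564-6500"),
    ("", "Fax: (615)564-6701"),
]

def reshape(rows):
    """Hypothetical stand-in for a synthesized program: one record per person."""
    table = []
    for name, cell in rows:
        # Split a cell such as "Tel: (800)645-8397" into its label and number.
        kind, number = re.fullmatch(r"(Tel|Fax):\s*(\S+)", cell).groups()
        if name:  # a non-empty name cell starts a new record
            table.append({"Name": name, "Tel": None, "Fax": None})
        table[-1][kind] = number  # the label selects the output column
    return table

result = reshape(rows)
for record in result:
    print(record["Name"], record["Tel"], record["Fax"])
```

In the PBE setting the user would not write `reshape`; they would type one or two rows of the output table and the synthesizer would search for an equivalent program.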
65. PBE Architecture
[Figure: PBE architecture. Components: Search Engine, Ranker, Disambiguator. Labels: Examples, DSL D, Program set, Ranked Program set, Test inputs, More Examples, Intended Program in D.]
“Programming by Examples: PL meets ML”
[APLAS 2017] Sumit Gulwani, Prateek Jain
Search
• Logical Deduction: [OOPSLA '15] FlashMeta: A framework for inductive program synthesis
• Machine Learning: [ICLR '18] Neural-guided deductive search for real-time program synthesis from examples
Ranking
• Program Features: [CAV '15] Predicting a correct program in programming by example
• Output Features: [IJCAI '17] Learning to learn programs from examples: going beyond program structure
Disambiguation
• Distinguishing Inputs: [UIST '15] User Interaction Models for Disambiguation in Programming by Example
• Clustering: [OOPSLA '18] FlashProfile: A Framework for Synthesizing Data Profiles
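The three stages listed above (search, ranking, disambiguation) can be sketched on a toy scale. The example below is only an illustration under stated assumptions: the two-step string DSL (`PICKS`, `CASES`), the brute-force enumeration, and the "fewer operators is better" ranking are all hypothetical simplifications, and none of it reflects the PROSE API or the techniques in the cited papers.

```python
from itertools import product

# Hypothetical toy DSL D: a program picks part of the input, then changes case.
PICKS = {
    "first_word": lambda s: s.split()[0],
    "last_word": lambda s: s.split()[-1],
    "whole": lambda s: s,
}
CASES = {
    "same": lambda s: s,
    "upper": str.upper,
    "lower": str.lower,
}

def run(program, text):
    pick, case = program
    return CASES[case](PICKS[pick](text))

def search(examples):
    """Search: enumerate every program in D consistent with all examples."""
    return [p for p in product(PICKS, CASES)
            if all(run(p, inp) == out for inp, out in examples)]

def rank(programs):
    """Ranking: prefer programs that use fewer non-trivial operators."""
    return sorted(programs, key=lambda p: (p[0] != "whole") + (p[1] != "same"))

def distinguishing_input(programs, test_inputs):
    """Disambiguation: find a test input on which the candidates disagree."""
    for text in test_inputs:
        if len({run(p, text) for p in programs}) > 1:
            return text
    return None

examples = [("ada lovelace", "ada")]
candidates = rank(search(examples))
question = distinguishing_input(candidates, ["Alan Turing"])

# The user answers the question with one more example, which narrows the set.
refined = rank(search(examples + [("Alan Turing", "alan")]))
```

With a single example, two programs remain (`first_word` with `same` or with `lower`); the input "Alan Turing" separates them, and the extra example leaves only the lower-casing program. Real systems replace the brute-force enumeration with deductive or learned search over far larger DSLs.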
66. New Frontiers
Predictive Synthesis
Synthesis of intended programs from just the input.
• Tabular data extraction, Sort, Join
Synthesis of readable/modifiable code
Synthesis in target language of choice.
• Scala, R, PySpark
Code-first experience in existing workflows.
• IDE, Notebook
“Automated Data Extraction using Predictive Program Synthesis”
[AAAI 2017] Mohammad Raza, Sumit Gulwani
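As a rough illustration of synthesis from just the input, the sketch below guesses how to split text into a table without any output examples. It is a toy heuristic, not the method of the cited paper: the function names, the candidate delimiter list, and the sample lines are all assumptions made for the example.

```python
def infer_delimiter(lines, candidates=(",", "\t", ";", "|")):
    """Pick the delimiter that gives every line the same field count (> 1)."""
    best = None
    for delim in candidates:
        counts = {len(line.split(delim)) for line in lines}
        if len(counts) == 1:
            width = counts.pop()
            # Among consistent delimiters, prefer the one yielding more columns.
            if width > 1 and (best is None or width > best[1]):
                best = (delim, width)
    return best[0] if best else None

def extract_table(lines):
    """Split each line on the inferred delimiter; fall back to one column."""
    delim = infer_delimiter(lines)
    if delim is None:
        return [[line] for line in lines]
    return [[field.strip() for field in line.split(delim)] for line in lines]

# Hypothetical input: only the raw text is given, no example of the output.
sample = ["id | name | city", "1 | Ada | London", "2 | Alan | Wilmslow"]
table = extract_table(sample)
```

The point of the sketch is the interaction model: the only signal is regularity in the input itself, so the "intended program" is predicted rather than specified by examples.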
73. Code Transformations by Examples
• Code refactoring consumes 40% of the time in migration.
– Old version to new version
– On-prem to cloud
– One framework to another
• Custom formatting
• Performance enhancements
• Repetitive bug fixes
– Feedback generation for programming education
“Learning syntactic program transformations from examples”
[ICSE 2017] Reudismam Rolim, Gustavo Soares, et al.
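A very small sketch of learning a code transformation from one before/after pair follows. It works on a flat token list rather than on syntax trees, so it is far weaker than the approach in the cited paper; the function names and the `fetchAll` to `fetch_all` rename are hypothetical.

```python
import re

def tokens(code):
    # Identifiers and numbers become single tokens; every other character
    # (punctuation, spaces, newlines) is its own token, so joining the
    # tokens reproduces the original text exactly.
    return re.findall(r"\w+|\W", code)

def learn_rewrite(before, after):
    """Learn (old_tokens, new_tokens) by trimming the shared prefix and suffix."""
    b, a = tokens(before), tokens(after)
    start = 0
    while start < min(len(b), len(a)) and b[start] == a[start]:
        start += 1
    end = 0
    while (end < min(len(b), len(a)) - start
           and b[len(b) - 1 - end] == a[len(a) - 1 - end]):
        end += 1
    return b[start:len(b) - end], a[start:len(a) - end]

def apply_rewrite(rule, code):
    """Replace every occurrence of the old token sequence with the new one."""
    old, new = rule
    src, out, i = tokens(code), [], 0
    while i < len(src):
        if old and src[i:i + len(old)] == old:  # pure insertions are skipped
            out.extend(new)
            i += len(old)
        else:
            out.append(src[i])
            i += 1
    return "".join(out)

# One example edit made by the developer (hypothetical API rename).
rule = learn_rewrite("client.fetchAll(q)", "client.fetch_all(q)")
migrated = apply_rewrite(rule, "rows = db.fetchAll(sql)  # fetchAllRows untouched")
```

Because matching is done on whole tokens, the identifier `fetchAllRows` is left alone while the call `fetchAll` is rewritten. Edits that touch several separate places in one example would need a richer rule representation than this single contiguous span.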
74. Conclusion
Programming by examples is a new frontier in AI.
• 10-100x productivity increase in some domains.
– Data Wrangling: Data scientists spend 80% of their time.
– Code Refactoring: Developers spend 40% of their time in migration.
• 99% of end users are non-programmers.
Next-generation AI techniques under the hood
• Logical Reasoning + Machine Learning
The Future: Multi-modal programming with Examples and NL
Questions/Feedback: Contact me at sumitg@microsoft.com
Microsoft PROSE (PROgram Synthesis by Examples) Framework
Available for non-commercial use: https://microsoft.github.io/prose/