8. Them turtles
c 2014– Andrei Alexandrescu. 8 / 47
#!/usr/bin/rdmd
void main() {
import std.stdio;
writeln("Hello, world!");
}
• Most everything can be scoped everywhere
• Better scoping, reasoning, dependency mgmt
• Functions
• Types (Voldermort types)
• Even generics
9. Segue into generics
c 2014– Andrei Alexandrescu. 9 / 47
void log(T)(T stuff) {
import std.datetime, std.stdio;
writeln(Clock.currTime(), ’ ’, stuff);
}
void main() {
log("hello");
}
• If not instantiated, no import
• imports cached once realized
• Generics faster to build, import
• Less pressure on linker
11. Memory Safety
c 2014– Andrei Alexandrescu. 11 / 47
• All data is read with the type it was written
• Eliminates unpleasant class of errors
• Makes bugs reproducible
• Preserves the integrity of the type system
12. Challenge
c 2014– Andrei Alexandrescu. 12 / 47
• System-level languages must:
◦ Forge pointers (linkers/loaders)
◦ Magically change type of memory (allocators/GCs)
◦ Recycle memory “early”
◦ Circumvent type system
13. Attack
c 2014– Andrei Alexandrescu. 13 / 47
• All of D can’t be safe
• Define a safe subset
• Safe subset usable for large portions or entire programs
• Tagged with @safe attribute
@safe double cube(double x) {
return x * x * x;
}
• Unsafe functions tagged with @system
@system void free(void*);
14. But wait
c 2014– Andrei Alexandrescu. 14 / 47
• Important case: safe functions with
unknown/unprovable implementation
◦ fopen, malloc, isAlpha
◦ Small buffer optimization
◦ Tagged union implementation
• Define attribute @trusted
15. Rules
c 2014– Andrei Alexandrescu. 15 / 47
• @safe functions cannot:
◦ call @system functions
◦ escape addresses of locals
◦ forge pointers
◦ cast pointers around
◦ do pointer arithmetic
• N.B. Not all rules completely implemented
◦ Some on the conservative side
◦ A few details in flux
16. Resulting setup
c 2014– Andrei Alexandrescu. 16 / 47
• Low notational overhead: entire modules or portions
thereof can be annotated
• Deduction for generics
• Application divided in:
◦ Safe part, uses GC, easy to write and debug
◦ Trusted part, encapsulated unsafe manipulation,
proven by means of code review
◦ System part, non-encapsulated unsafe
manipulation, only when solidly justified
18. Thesis
c 2014– Andrei Alexandrescu. 18 / 47
• Writing entire programs in pure style difficult
• Writing fragments of programs in pure style easy, useful
• + Easier to verify useful properties, debug
• + Better code generation
• − Challenge: interfacing pure and impure code
19. Functional Factorial (yawn)
c 2014– Andrei Alexandrescu. 19 / 47
ulong factorial(uint n) {
return n <= 1 ? 1 : n * factorial(n - 1);
}
• It’s PSPACE!
• Somebody should do hard time for this
20. However, it’s pure
c 2014– Andrei Alexandrescu. 20 / 47
pure ulong factorial(uint n) {
return n <= 1 ? 1 : n * factorial(n - 1);
}
• Pure is good
21. Functional Factorial, Fixed
c 2014– Andrei Alexandrescu. 21 / 47
pure ulong factorial(uint n) {
ulong crutch(uint n, ulong result) {
return n <= 1
? result
: crutch(n - 1, n * result);
}
return crutch(n, 1);
}
• Threads state through as parameters
• You know what? I don’t care for it
22. Honest Factorial
c 2014– Andrei Alexandrescu. 22 / 47
ulong factorial(uint n) {
ulong result = 1;
foreach (uint i = 2; i <= n; ++i) {
result *= i;
}
return result;
}
• But no longer pure!
• Well allow me to retort
23.
24. Pure is as pure does
c 2014– Andrei Alexandrescu. 24 / 47
• “Pure functions always return the same result for the
same arguments”
• No reading and writing of global variables
◦ (Global constants okay)
• No calling of impure functions
• Who said anything about local, transient state inside the
function?
25. Transitive State
c 2014– Andrei Alexandrescu. 25 / 47
pure void reverse(T)(T[] a) {
foreach (i; 0 .. data.length / 2) {
swap(data[i], data[$ - i - 1]);
}
}
• Possibility: disallow
• More useful: relaxed rule
• Operate with transitive closure of state reachable
through parameter
• Not functional pure, but an interesting superset
• No need for another annotation, it’s all in the signature!
26. User-defined types
c 2014– Andrei Alexandrescu. 26 / 47
pure BigInt factorial(uint n) {
BigInt result = 1;
for (; n > 1; --n) {
result *= n;
}
return result;
}
• Works, but not in released version
• Most of stdlib “purified”
27. Aftermath
c 2014– Andrei Alexandrescu. 27 / 47
• If parameters reach mutable state:
◦ Relaxed pure—no globals, no I/O, no impure calls
• If parameters can’t reach mutable state:
◦ “Haskell-grade” observed purity
◦ Yet imperative implementation possible
◦ As long as it’s local only
29. Generative programming
c 2014– Andrei Alexandrescu. 29 / 47
• In brief: code that generates code
• Generic programming often requires algorithm
specialization
• Specification often present in a DSL
30. Embedded DSLs
c 2014– Andrei Alexandrescu. 30 / 47
Force into host language’s syntax?
36. Embedded DSLs
c 2014– Andrei Alexandrescu. 31 / 47
• Formatted printing?
• Regular expressions?
• EBNF?
• PEG?
• SQL?
• . . . Pasta for everyone!
37. Embedded DSLs
c 2014– Andrei Alexandrescu. 32 / 47
Here: use with native grammar
Process during compilation
Generate D code accordingly
38. Compile-Time Evaluation
c 2014– Andrei Alexandrescu. 33 / 47
• A large subset of D available for compile-time evaluation
ulong factorial(uint n) {
ulong result = 1;
for (; n > 1; --n) result *= n;
return result;
}
...
auto f1 = factorial(10); // run-time
static f2 = factorial(10); // compile-time
39. Code injection with mixin
c 2014– Andrei Alexandrescu. 34 / 47
mixin("writeln("hello, world");");
mixin(generateSomeCode());
• Not as glamorous as AST manipulation but darn
effective
• Easy to understand and debug
• Now we have compile-time evaluation AND mixin. . .
41. Example: bitfields in library
c 2014– Andrei Alexandrescu. 36 / 47
struct A {
int a;
mixin(bitfields!(
uint, "x", 2,
int, "y", 3,
uint, "z", 2,
bool, "flag", 1));
}
A obj;
obj.x = 2;
obj.z = obj.x;
42. Example: boilerplate
c 2014– Andrei Alexandrescu. 37 / 47
string forward(string p, string[] names...)
{
string r;
foreach (n; names)
{
r ~= "static if (hasMember!(typeof("~p~
"), ‘"~n~"‘)"
~ ") auto ref "~n~
"(T_...)(T_ t_) { return "~
p~"."~n~"(t_); }n";
}
return r;
}
43. Example: boilerplate
c 2014– Andrei Alexandrescu. 38 / 47
struct A {
int fun(int, string) { ... }
int gun(int, string, double) { ... }
}
struct B {
A a;
//pragma(msg, forward("a", "fun", "gun"));
mixin(forward("a", "fun", "gun"));
}
44. Parser
c 2014– Andrei Alexandrescu. 39 / 47
import pegged.grammar; // by Philippe Sigaud
mixin(grammar("
Expr < Factor AddExpr*
AddExpr < (’+’/’-’) Factor
Factor < Primary MulExpr*
MulExpr < (’*’/’/’) Primary
Primary < Parens / Number / Variable
/ ’-’ Primary
Parens < ’(’ Expr ’)’
Number <~ [0-9]+
Variable <- Identifier
"));
45. Usage
c 2014– Andrei Alexandrescu. 40 / 47
// Parsing at compile-time:
static parseTree1 = Expr.parse(
"1 + 2 - (3*x-5)*6");
pragma(msg, parseTree1.capture);
// Parsing at run-time:
auto parseTree2 = Expr.parse(readln());
writeln(parseTree2.capture);
46. Scaling up
c 2014– Andrei Alexandrescu. 41 / 47
1000 lines of D grammar →
3000 lines D parser
47. c 2014– Andrei Alexandrescu. 42 / 47
Highly integrated lex+yacc
48. c 2014– Andrei Alexandrescu. 43 / 47
What about regexen?
49. Compile-time regular expressions
c 2014– Andrei Alexandrescu. 44 / 47
• GSoC project by Dmitry Olshansky, merged into std
• Fully UTF capable, no special casing for ASCII
• Two modes sharing the same backend:
auto r1 = regex("^.*/([^/]+)/?$");
static r2 = ctRegex!("^.*/([^/]+)/?$");
• Run-time version uses intrinsics in a few places
• Static version generates specialized automaton, then
compiles it