4. what’s pack?
• like sprintf()
• only for bytes, not for presentation
• template rules very complex
• DWIM: packs empty strings for
missing arguments
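A minimal sketch of the sprintf() analogy and of the missing-argument behaviour described above. The template and values are made up for illustration.

```perl
use strict;
use warnings;

# 'A3' = 3 space-padded characters, 'n' = 16-bit big-endian, 'C' = unsigned byte
my $full = pack 'A3 n C', 'hi', 258, 65;

# same template, two arguments missing: pack fills in empty strings,
# which the numeric formats turn into zero bytes
my $short = pack 'A3 n C', 'hi';

printf "%d bytes: %s\n", length $full,  unpack 'H*', $full;
printf "%d bytes: %s\n", length $short, unpack 'H*', $short;
```

Both strings are six bytes long; only the contents of the last three bytes differ.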
5. what’s unpack?
• like sscanf() for bytes
• (not really: we’ll come back to it)
• (mostly) identical template rules to
pack
• returns a short list if it runs out of
input bytes (dies only for skips such as 'x')
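A small sketch of the sscanf() analogy, with an illustrative template and input. It also shows one place where the template rules differ from pack: 'A' pads with spaces when packing and strips them when unpacking.

```perl
use strict;
use warnings;

# six bytes: "hi " (space padded), 0x0102 big-endian, then the byte 65
my $bytes = "hi \x01\x02A";

my ($tag, $count, $flag) = unpack 'A3 n C', $bytes;

print "tag=[$tag] count=$count flag=$flag\n";
```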
20. • vec(): treat a scalar as an arbitrary
length bit vector
• (you’re not using numbers, are you?)
• pack and unpack ‘b’ template is
perfect for working with the vector as
a whole
• convert vectors to and from
strings “011100” or lists (0,1,1,1,0,0)
• count bits with unpack checksum
• perldoc -f vec
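A short sketch of the conversions listed above, going in the direction of building a vector from a string or a list. Variable names are illustrative.

```perl
use strict;
use warnings;

# string -> vector: 'b' uses the same ascending bit order as vec()
my $from_string = pack 'b*', '011100';

# list -> vector: join the bits into a string first
my @bits      = (0, 1, 1, 1, 0, 0);
my $from_list = pack 'b*', join '', @bits;

# vector -> string: padded out to a whole byte
my $as_string = unpack 'b*', $from_string;

# count the bits that are on with an unpack checksum
my $on_bits = unpack '%32b*', $from_string;

# individual bits are still reachable through vec()
my $bit_one = vec $from_string, 1, 1;

print "$as_string has $on_bits bits on\n";
```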
21. example: one million
bits!
## create a 125,001 byte vector
my $bit_vector = '';
(vec $bit_vector, 1_000_000, 1) = 1;
## stringify: “00000...1”
my $bits = unpack 'b*', $bit_vector;
## listify: (0,0,0,...,1)
my @bits = split //, unpack 'b*', $bit_vector;
## how many bits are on?
my $on_bits = unpack '%32b*', $bit_vector;
• the 1000001st through 1000008th
bits are free!
26. use bytes
• binary data + DWIM + unicode
• ouch!
• pragma to the rescue: “No matter
what you think might be in this PV, do
not cleverly switch to character
semantics when I’m not looking.”
• pack/unpack themselves don’t care,
it’s things like length and substr
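A minimal sketch of the difference the pragma makes to length, assuming a string that holds one character above U+00FF. The sample string is illustrative only.

```perl
use strict;
use warnings;

# one character (U+263A), stored internally as three UTF-8 bytes
my $string = "\x{263A}";

# character semantics: the default for this string
my $chars = length $string;

# byte semantics: the pragma is lexically scoped to the block
my $bytes = do { use bytes; length $string };

print "$chars character, $bytes bytes\n";
```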
30. typedef struct TWO_THINGS {
char a;
char b;
} two_things;
two_things things;
two_things lots_of_things[1000];
• where is things.a? *(char *)&things.
• where is things.b? *((char *)&things + 1).
• where is lots_of_things[2].b?
*((char *)lots_of_things + (2 *
sizeof(two_things)) + 1).
• where is the point? next slide.
31. Readonly my $FORMAT => 'cc';
my $things = pack $FORMAT;
my $lots_of_things = pack "($FORMAT)1000";
• where is $things.a? unpack 'cx',
$things;
• where is $things.b? unpack 'xc',
$things;
• where is $lots_of_things[2].b? unpack
'(xx)2xc', $lots_of_things
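A runnable sketch of the same offsets with recognisable values packed in, so each answer can be checked. It uses a plain `my` instead of Readonly to stay core-only, and the sample values are made up.

```perl
use strict;
use warnings;

my $FORMAT = 'cc';

# one "struct": a = 1, b = 2
my $things = pack $FORMAT, 1, 2;

# ten "structs": a = index, b = index + 100
my $lots_of_things = pack "($FORMAT)*", map { ($_, $_ + 100) } 0 .. 9;

my $first  = unpack 'cx', $things;               # read a, skip b
my $second = unpack 'xc', $things;               # skip a, read b
my $third  = unpack '(xx)2xc', $lots_of_things;  # skip 2 structs, skip a, read b

print "a=$first b=$second lots_of_things[2].b=$third\n";
```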
32. • bytes, bytes, bytes on the brain
• byte offsets a natural way of thinking
about working with data
• “language neutral” is just a cute way
of saying “C”
33. • “strong typing” the roundabout way
• unpack() == C cast: “I, programmer,
assure you, language, that these bytes
contain precisely data of this type,
and I will live with the consequences if
I’m wrong.”
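A small sketch of the cast analogy: the same four bytes give different answers depending on which type the template asserts. The byte string is illustrative.

```perl
use strict;
use warnings;

my $bytes = "\x00\x00\x01\x00";

my $big_endian    = unpack 'N', $bytes;   # "these are a 32-bit big-endian integer"
my $little_endian = unpack 'V', $bytes;   # "no, little-endian"
my @octets        = unpack 'C4', $bytes;  # "actually four unsigned bytes"

print "N=$big_endian V=$little_endian C4=@octets\n";
```

Perl believes whichever template it is given; nothing in the bytes says which reading is right.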
34. example: SEGV!
my $bar = unpack 'P', 'asdf';
• god, I miss pointers sometimes
• (but not right now)
39. space efficiency
• Storable: general-purpose
• what does that mean?
• if you’re thinking like a C
programmer, maybe you can do
better...
40. example: array of shorts
@shorts = map {int((rand 256)-128)} (1..10000);
## 20,000 bytes: 2 bytes per element
$packed = pack 's*', @shorts;
## 20,016 bytes: 2 bytes per element
$stored = Storable::freeze(\@shorts);
## harmlessly examine contents of @shorts...
print "$_\n" for @shorts;
## roughly 46,000 bytes: ???
$stored = Storable::freeze(\@shorts);
• Extra credit: deserialize just
$shorts[2113]...
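One possible answer to the extra credit, as a sketch: because every packed element has the same width, a single element can be read at a computed byte offset without touching the rest. The data here is a deterministic stand-in for the random values above.

```perl
use strict;
use warnings;

# deterministic stand-in for the random shorts
my @shorts = map { ($_ * 7) % 256 - 128 } 0 .. 9999;
my $packed = pack 's*', @shorts;

# width of one packed element (2 bytes for 's')
my $SHORT_SIZE = length(pack 's', 0);

# deserialize just element 2113: slice out its bytes, unpack only those
my $index = 2113;
my $one   = unpack 's', substr($packed, $index * $SHORT_SIZE, $SHORT_SIZE);

print "element $index is $one\n";
```

The frozen Storable image offers no comparable shortcut through its public interface; thaw rebuilds the whole array.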
41. fixed width
• depending on what you’re serializing
• interesting properties
• more in a bit
42. keyless hashes
• when a hash is really a struct/record
• thinking like a C programmer again!
• serialize bags of them without bags of
redundant copies of their keys
43. idiom
## shape of the “structure” and format are
## passed or encoded separately
Readonly my $TEMPLATE => 'VVC';
Readonly my @FIELDS => qw(thing1 thing2 kite);
## get the bytes
my $bytes = get_from_somewhere();
## unpack via hash slice FTW!
my %thing;
@thing{@FIELDS} = unpack $TEMPLATE, $bytes;
44. example: keyless hash
my @records = map {
{ thing1 => int rand 4294967296,
thing2 => int rand 4294967296,
kite => int rand 255, } } (1 .. 10000);
## 90,000 bytes: 9 bytes per record
my $packed = pack "($TEMPLATE)*",
map { @{$_}{@FIELDS} } @records;
## roughly 544,000 bytes: 54 bytes per record
my $stored = Storable::freeze(\@records);
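A sketch of the full round trip for the keyless hash: pack the values only, then rebuild the hashes from the separately held field list. It uses deterministic values and a plain `my` instead of Readonly, and the restore loop is an illustration rather than the only way to do it.

```perl
use strict;
use warnings;

my $TEMPLATE = 'VVC';
my @FIELDS   = qw(thing1 thing2 kite);

my @records = map {
    +{ thing1 => $_, thing2 => $_ * 2, kite => $_ % 256 }
} 1 .. 1000;

# serialize: values only, in field order, 9 bytes per record
my $packed = pack "($TEMPLATE)*", map { @{$_}{@FIELDS} } @records;

# deserialize: one flat list of values, regrouped into hashes
my @values = unpack "($TEMPLATE)*", $packed;
my @restored;
while (my @chunk = splice @values, 0, scalar @FIELDS) {
    my %record;
    @record{@FIELDS} = @chunk;
    push @restored, \%record;
}

printf "%d records in %d bytes\n", scalar @restored, length $packed;
```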
46. • for transient bytes e.g. from key-value
storage
• for sparse algorithms e.g. binary
search
• otherwise, don’t do this!
• or at least, don’t blame me
47. example: filtering
• problem scale: 100k x 20k x 100
• idea 1: regular expressions!
• idea 2: binary search, of course!
• idea 3: binary search + lazy
perlification
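A minimal sketch of what idea 3 could look like, assuming fixed-width 'VVC' records sorted by their first field. The data, the field names and the `find_record` helper are hypothetical; the slides do not show the actual filtering code. Only the 4-byte key is unpacked at each probe, and a full hash is built only for a hit.

```perl
use strict;
use warnings;

my $TEMPLATE    = 'VVC';
my @FIELDS      = qw(thing1 thing2 kite);
my $RECORD_SIZE = length(pack $TEMPLATE, 0, 0, 0);   # 9 bytes

# hypothetical data: 1000 records, sorted by thing1
my $packed = pack "($TEMPLATE)*",
    map { ($_ * 3, $_ * 5, $_ % 256) } 0 .. 999;

# binary search on the packed bytes; returns a hash ref or nothing
sub find_record {
    my ($bytes, $key) = @_;
    my ($lo, $hi) = (0, length($bytes) / $RECORD_SIZE - 1);
    while ($lo <= $hi) {
        my $mid    = int(($lo + $hi) / 2);
        my $offset = $mid * $RECORD_SIZE;
        my $probe  = unpack 'V', substr($bytes, $offset, 4);   # key only
        if    ($probe < $key) { $lo = $mid + 1 }
        elsif ($probe > $key) { $hi = $mid - 1 }
        else {
            # lazy perlification: build the hash only on a match
            my %record;
            @record{@FIELDS} = unpack $TEMPLATE,
                substr($bytes, $offset, $RECORD_SIZE);
            return \%record;
        }
    }
    return;
}

my $hit  = find_record($packed, 300);
my $miss = find_record($packed, 301);

print "thing2 for key 300: $hit->{thing2}\n";
```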