Specifying and Implementing SNOW3G with Cryptol


Pedro Pereira and Ulisses Costa

Universidade do Minho
{pedro.mdp,ulissesaraujocosta}@gmail.com

Abstract. This paper presents a non-traditional approach to the design
of symmetric-key cryptographic algorithms. SNOW 3G is the chosen al-
gorithm and a single tool, Cryptol, is used during the process. Cryptol also
provides a push-button verification framework for equivalence and safety
checking of both specification and implementation.
Keywords: Stream cipher, SNOW 3G, Cryptol, formal methods

1 Introduction

We claim that non-hardware people can get good results by working
in Cryptol and would like to confirm or deny that claim.

Galois, Inc.

The ever-growing complexity of cryptographic algorithms is requiring fun-
damental changes to the traditional way they are tested and implemented in
hardware. The whole specify-implement process has become increasingly time-
consuming and different tools/languages are required for different stages of the
process. And since functional specifications are written in software for clarity
and are generally not optimized or intended for synthesis, the hardware imple-
mentation must be proven logically equivalent. This functional validation is done
in software and also adds to the effort and time.
Cryptography can be seen as the mathematical manipulation of sequences of
data. This is reflected in the design of Cryptol, a domain-specific language (DSL)
which consists of arithmetic operations and manipulation of sequences. As a
DSL, it allows cryptographers to design and implement cryptographic algorithms
using familiar concepts and constructs. With the toolset, it’s possible to provide
a high degree of assurance in the correctness of their design and at the same
time, produce high performance implementations in C and VHDL. Cryptol was
developed in Haskell during the past decade by Galois, Inc [15].
The Cryptol interpreter is the toolset’s frontend and interacts with an
Intermediate Representation (IR), generated by the interpreter after parsing
and type-checking. The IR is explicitly annotated with types, which allows for
type-directed evaluation/translation in backends. In this project all
interpreter modes were used:


– bit mode performs interpretation on the IR and supports the entire
Cryptol language. This is where Cryptol programs are run;
– symbolic mode performs symbolic interpretation on the IR and supports
equivalence checking;
– C mode translates programs to the C language;
– sbv mode compiles programs into an IR called symbolic bit-vector (SBV) and
can also translate them to C. This mode also supports safety checking;
– spir mode compiles the IR to Signal Processing Intermediate Representation
(SPIR). It also provides rough profiling information about the final circuit
and supports equivalence checking;
– fpga mode compiles to SPIR, translates to VHDL and uses external tools
to synthesize the VHDL to an architecture-dependent netlist.
All necessary details about these modes will be presented in the following
sections.
The paper is structured as follows. Section 2 presents a brief overview of the
SNOW 3G cipher. Section 3 explains a few design choices in the specification of
the cipher in Cryptol. Section 4 details the refinement of the specification.
Section 5 addresses usage of the verification framework. Section 6 discusses
related work. Section 7 concludes the paper.

2 SNOW 3G cipher
SNOW 3G is a word-based synchronous stream cipher developed by Thomas
Johansson and Patrik Ekdahl at Lund University in 2001. It was chosen as the
stream cipher for the 3GPP encryption algorithms UEA2 and UIA2 [12].
SNOW [7], the cipher’s first version, was originally submitted to NESSIE
[22]. The NESSIE research project was funded from 2000-2003 to identify se-
cure cryptographic primitives. During SNOW’s evaluation some weaknesses were
found [4, 16] and, as a result, it was not included in the NESSIE suite of algo-
rithms. The authors then developed version 2.0 [8] of the cipher which solves the
weaknesses and has improved performance. When submitted to the ETSI [11]
Security Algorithms Group of Experts (SAGE) evaluation, the design was fur-
ther modified to increase its resistance against algebraic attacks and the result
became SNOW 3G [14].
SNOW 3G generates a keystream of 32-bit words which mask the plaintext.
The cipher requires a 128-bit key and initialization vector. Its structure is essen-
tially a combination of a sixteen stage Linear Feedback Shift Register (LFSR)
and a Finite State Machine (FSM) composed of three registers R1, R2 and R3
as represented by Figure 1.
First, a key initialization phase consisting of thirty-two clock cycles is per-
formed, altering the LFSR’s and FSM’s state. The cipher then enters the keystream
generation phase in which the first clocked output is discarded. With every sub-
sequent clock tick, a 32-bit word from the keystream is produced.
The bitwise xor operation is denoted by ⊕ and addition modulo $2^{32}$ is
denoted by ⊞. The MULα operation is represented by α and its inverse, DIVα, is
represented by $\alpha^{-1}$. These are bit-mapping functions.


Fig. 1. SNOW 3G

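The LFSR feeds input into the FSM and its state at time t is denoted by
$(s_t, \ldots, s_{t+15})$. Each of these stages can be divided into four 8-bit
words:

\[ s_t = (s_{t,0} \parallel s_{t,1} \parallel s_{t,2} \parallel s_{t,3}) \]

Therefore, the FSM’s input V is defined as:

\[ V = \big((s_{0,1} \parallel s_{0,2} \parallel s_{0,3} \parallel \mathtt{0x00}) \oplus \alpha(s_{0,0}) \oplus s_2\big) \oplus (\mathtt{0x00} \parallel s_{11,0} \parallel s_{11,1} \parallel s_{11,2}) \oplus \alpha^{-1}(s_{11,3}) \]

While its output F is defined as:

\[ F = (s_{15} \boxplus R1) \oplus R2 \]

The output $z_t$, i.e. the keystream, is:

\[ z_t = F \oplus s_t \]

The register R1 is updated as:

\[ R1 = (s_5 \oplus R3) \boxplus R2 \]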

R2 is updated by the S-box transformation S1(R1), and R3 by S2(R2). S1 is
based on Rijndael’s round function and its S-box $S_R$ [5], while S2 is based
on the $S_Q$ S-box, which is constructed using a Dickson polynomial [6]. For
further details, SNOW 3G’s complete specification can be found in [13].
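As an illustration of the update rules above, the following Python sketch
clocks the FSM once. It is only a minimal sketch: the S-box transformations S1
and S2 are passed in as callables (the identity function is used here as a
hypothetical placeholder, not the real S-boxes from [13]), and the names
clock_fsm and MASK32 are ours.

```python
MASK32 = 0xFFFFFFFF  # additions are performed modulo 2^32


def clock_fsm(lfsr, regs, s1, s2):
    """Clock the FSM once and return (F, (R1', R2', R3')).

    lfsr   -- sequence of sixteen 32-bit stages s0..s15
    regs   -- tuple (R1, R2, R3)
    s1, s2 -- callables standing in for the S1 and S2 transformations
    """
    r1, r2, r3 = regs
    f = ((lfsr[15] + r1) & MASK32) ^ r2        # F = (s15 [+] R1) xor R2
    new_r1 = ((lfsr[5] ^ r3) + r2) & MASK32    # R1 = (s5 xor R3) [+] R2
    return f, (new_r1, s1(r1), s2(r2))         # R2 = S1(R1), R3 = S2(R2)


# Placeholder S-boxes (assumption): identity instead of the real S1/S2.
identity = lambda word: word
f_out, new_regs = clock_fsm(list(range(16)), (1, 2, 3), identity, identity)
```

With the placeholder S-boxes the registers are simply shifted along, which
makes the data flow between R1, R2 and R3 easy to follow by hand.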


3 Specifying SNOW 3G in Cryptol

Cryptol has a Hindley-Milner style polymorphic type system extended with size
polymorphism and arithmetic predicates. This design precisely captures con-
straints that naturally arise in cryptographic specifications. For instance, con-
sider the following description from [13]:

SNOW 3G (...) generates a sequence of 32-bit words under the control
of a 128-bit key and a 128-bit initialization variable.

Hence, our keystream generation function has the following type:
GenKS : ([4][32], [4][32]) -> [inf][32]

Note how it rigorously corresponds to the textual description: it is statically
ensured that both key and initialization vector are 128 bits long (each of them
is represented by four 32-bit words) and that the keystream is a sequence of
32-bit words of unbounded size (inf). We’re allowed to declare finite or
infinite sequences of data in Cryptol thanks to lazy evaluation.
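Python has no sized types, but generators give a rough analogue of a lazily
evaluated [inf][32] sequence. In the sketch below, gen_ks is a hypothetical
stand-in that only mimics the shape of GenKS: the size constraints that Cryptol
checks statically become run-time checks, and the words produced are a
placeholder counter, not the SNOW 3G keystream.

```python
from itertools import count, islice

MASK32 = 0xFFFFFFFF


def gen_ks(key, iv):
    """Return an unbounded, lazily evaluated sequence of 32-bit words.

    key and iv must each be four 32-bit words (128 bits), mirroring the
    ([4][32], [4][32]) -> [inf][32] type. The output is a placeholder.
    """
    for name, words in (("key", key), ("iv", iv)):
        if len(words) != 4 or any(not 0 <= w <= MASK32 for w in words):
            raise ValueError(f"{name} must be four 32-bit words")
    seed = 0
    for k, v in zip(key, iv):
        seed ^= k ^ v
    # Nothing is computed until a consumer asks for the next word.
    return ((seed + t) & MASK32 for t in count())


# Only the three requested words are ever computed.
first_three = list(islice(gen_ks([1, 2, 3, 4], [0, 0, 0, 0]), 3))
```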
Due to the simple and mathematical nature of SNOW 3G’s components they
are trivially written in Cryptol. To illustrate, let’s consider the specification for
MULx which maps 16 to 8 bits:

\[ \mathrm{MULx}(v, c) = \begin{cases} (v \ll 1) \oplus c & \text{if the most significant bit of } v \text{ is } 1 \\ v \ll 1 & \text{otherwise} \end{cases} \]

And its equivalent in Cryptol:
MULx : ([8], [8]) -> [8];
MULx (v, c) = if (v ! 0) then (v << 1) ^ c
              else v << 1;

Cryptol indexes words in little-endian by default, thus the ! operator to retrieve
v’s most significant bit. Since the other components’ definitions are practically
identical to their specifications we’re omitting them from this section but they
can be viewed in appendix A.
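For readers less familiar with Cryptol, the same MULx definition can be
sketched in Python. Python integers are unbounded, so the 8-bit width that the
Cryptol type [8] provides for free has to be enforced with an explicit mask;
the name mulx is ours.

```python
def mulx(v, c):
    """8-bit MULx: shift v left by one and xor with c if v's top bit was set."""
    shifted = (v << 1) & 0xFF      # keep the result within 8 bits
    if v & 0x80:                   # most significant bit of v is 1
        return shifted ^ c
    return shifted
```

For example, mulx(0x01, 0xA9) is 0x02, while mulx(0x80, 0xA9) is 0xA9 because
the shifted-out bit triggers the xor with c.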
During initialization mode, the cipher executes two clocking operations: one
for the LFSR and the other for the FSM. These were written as recursive streams
which successfully capture the cyclic essence of the operations:
Init : ([4][32], [4][32]) -> ([16][32], [3][32]);
Init (k, iv) = (ClockLFSR_IM@32, ClockFSM@32) where {

    ClockLFSR_IM : [inf][16][32];
    ClockLFSR_IM =
        [ (Init_LFSR (k, iv)) ] #
        [| ( drop (1, LFSR) #
             [ (V (LFSR@0, LFSR@2, LFSR@11) ^ F (R@0, R@1, LFSR@15)) ]
           )
        || LFSR <- ClockLFSR_IM
        || R <- ClockFSM |];

    ClockFSM : [inf][3][32];
    ClockFSM =
        [ [ 0 0 0 ] ] #
        [| [ ( (R@1 + (R@2 ^ LFSR@5)) & 0xFFFFFFFF )
             ( S1 (R@0) )
             ( S2 (R@1) ) ]
        || R <- ClockFSM
        || LFSR <- ClockLFSR_IM |];
};
Each element of both streams represents the state after a particular iteration
and we’re only interested in the elements at index 32, i.e. the state after
thirty-two clock cycles. Although the streams’ size isn’t finitely restricted,
lazy evaluation guarantees that only the elements up to that index are ever
computed, so recursion ends after this iteration.
The keystream generation mode was written in a similar way and its defini-
tion can also be viewed in appendix A.
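The recursive-stream pattern used by Init can be approximated in Python with a
generator. This is a sketch under our own naming (state_stream, element_at,
toy_step): the step function below is a hypothetical placeholder that only
mimics the shape of the LFSR shift, not the real V and F computations.

```python
from itertools import islice


def state_stream(initial, step):
    """Lazily yield initial, step(initial), step(step(initial)), ..."""
    state = initial
    while True:
        yield state
        state = step(state)


def element_at(stream, index):
    """Return stream@index; earlier elements are computed, later ones are not."""
    return next(islice(stream, index, None))


def toy_step(state):
    """Placeholder clocking: drop the first stage and append a feedback word."""
    lfsr, r = state
    feedback = (lfsr[0] ^ lfsr[2] ^ r) & 0xFFFFFFFF   # NOT the real V ^ F
    return lfsr[1:] + [feedback], (r + 1) & 0xFFFFFFFF


# Analogue of ClockLFSR_IM@32: the state after thirty-two clockings.
final_lfsr, final_r = element_at(state_stream((list(range(16)), 0), toy_step), 32)
```

The stream is conceptually infinite, but asking for index 32 runs the step
function exactly thirty-two times.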

4 Refining into an implementation
Unfortunately, there are some restrictions that must be applied to the code in
order for the compiler to successfully translate from Cryptol to VHDL:
– No division by anything other than powers of 2;
– No polymorphic definitions;
– No recursive functions;
– No higher-order functions (partially);
The first restriction is a hardware limitation and is not imposed by the
Cryptol compiler itself. Regarding the second one, the compiler can’t generate
an infinite number of definitions and therefore a specific size must be given
to every function. The third restriction forbids definitions of recursive
functions; we can, however, define recursive sequences, and every recursive
function can be rewritten with a recursive sequence. Finally, although
functions can’t return other functions, they can be passed as parameters to
others. For this particular algorithm, though, first-order functions are
sufficient. Therefore, only the third restriction requires the presented
specification to be rewritten.
The only recursive function defined is MULxPOW which can be trivially rewrit-
ten as:
MULxPOW : ([8], [32], [8]) -> [8];
MULxPOW (v, i, c) = res @ i
    where res = [v] # [| MULx (e, c) || e <- res |];
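The same rewrite can be sketched in Python, where the recursive sequence res
becomes a loop that walks the sequence up to index i. Both versions below are
illustrative only; mulx repeats the 8-bit MULx definition so that the block is
self-contained, and the function names are ours.

```python
def mulx(v, c):
    """8-bit MULx, as defined in Section 3."""
    shifted = (v << 1) & 0xFF
    return shifted ^ c if v & 0x80 else shifted


def mulxpow_recursive(v, i, c):
    """Recursive form: MULx applied i times, defined by recursion on i."""
    if i == 0:
        return v
    return mulx(mulxpow_recursive(v, i - 1, c), c)


def mulxpow_sequence(v, i, c):
    """Sequence form: res = [v, MULx(v), MULx(MULx(v)), ...]; return res @ i."""
    res = v
    for _ in range(i):
        res = mulx(res, c)
    return res
```

Both functions compute the same value; the second one has the fixed,
loop-shaped structure that maps naturally onto hardware.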
On a different subject, Init will be laid out over time because its more
liberal definition (as expected in a specification) deals with infinite streams
even though only the first 32 iterations are actually required. (An infinite
stream of output would also require infinite hardware; instead, circuitry is
reused, forcing data to be processed over time.) One could argue for
restricting the streams’ size to 33 elements since it seems useless to keep
them infinite. However, this is ignored as explained later in this section.
There are two ways of representing a sequential circuit in Cryptol: the un-
clocked step model and the clocked stream model. An accurate performance anal-
ysis requires data to be processed over time because of the useful clocking con-
straints. The only way to explicitly force processing over time is by converting
the top-level function into the stream model which essentially implies receiving
and/or producing data. Our GenKS already outputs infinite data so no changes
are required.
The interpreter provides rough timing analysis and size estimates when trans-
lating to VHDL in spir mode if spir_profile=detailed is set. It’s best to
keep refining an implementation while using this mode because translating to
SPIR takes significantly less time than synthesis, yet still provides enough infor-
mation to help produce an efficient implementation.
In order to generate an efficient circuit, some optimizations are required.
Optimizations rely on space-time tradeoffs. A possible optimization is to
reduce some of the computational effort by converting mapping functions to
static lookup tables, trading more space for less time. Static lookup tables
are also automatically translated into BlockRAMs (a fast component of FPGA
circuits) by the Cryptol compiler. For instance, MULxPOW is a mapping function
that receives three 8-bit parameters, which means that $256^2$ tables with 256
elements each would be required to maintain equivalence. Realistically, this
wouldn’t be an optimization; the resulting circuit would be fast but the speed
gained does not compensate for the large amount of space traded.
However, in all MULxPOW calls, the third parameter is always the same and the
second one only assumes eight different values. We can then shorten the
previous range to just eight tables of 256 8-bit elements each, which only
requires 16 Kbit (2 KB) of memory, making it a much more desirable
optimization. But we can do better if MULa and DIVa are converted instead. The
space required for this optimization is also 16 Kbit (two tables of 256 32-bit
elements) and would imply even less computational logic.
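The storage figures for the three lookup-table options can be tallied with a
few lines of Python. This is only a sketch of the arithmetic: table_bits and
build_table are names of our own, and double_mod_256 is a hypothetical mapping
function used to show what tabulation looks like.

```python
def table_bits(tables, entries, width_bits):
    """Total storage, in bits, for a set of equally sized lookup tables."""
    return tables * entries * width_bits


def build_table(func, entries=256):
    """Tabulate an 8-bit mapping function so later calls become lookups."""
    return [func(v) for v in range(entries)]


# One table per (second, third) parameter pair of MULxPOW.
full_bits = table_bits(256 ** 2, 256, 8)
# Eight tables of 256 8-bit elements.
per_call_bits = table_bits(8, 256, 8)
# Two tables (MULa and DIVa) of 256 32-bit elements.
word_bits = table_bits(2, 256, 32)

# Hypothetical mapping function, only to show a table in action.
double_mod_256 = lambda v: (2 * v) & 0xFF
table = build_table(double_mod_256)
```

The first option needs 16 MiB, whereas the other two need 16384 bits (2 KiB)
each, which is why only the last two are realistic candidates.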
Here is a detailed report of the resulting implementation in SPIR:
snow3g-impl> :set spir spir_profile=detailed
snow3g-impl> GenKS
...
=== Summary of Path Timing Estimates ===
Overall clock period: 8.38 ns (119.3 MHz)
Input pin to flip-flop: 1.94 ns (514.7 MHz)
Flip-flop to flip-flop: 7.72 ns (129.6 MHz)
Flip-flop to output pin: 8.38 ns (119.3 MHz)
Input pin to output pin: No paths

=== Summary of Size Estimates ===
Estimated total size: about 6848 LUTs, 2776 Flipflops

=== Circuit Timing ===
circuit latency: 37 cycles (36 cycles plus propagation delay)
circuit rate: one element per cycle
output length: unbounded
total time: unbounded

Although the report still doesn’t look promising, these numbers are rough esti-
mates and a few options used in a later phase will influence it.
There are essentially three ways of controlling data flow: by parallelizing,
sequencing or pipelining. Each approach implies a different space-time tradeoff
and translates into different VHDL code. Cryptol provides three pragmas to
automate these tradeoffs: par, seq and reg respectively. The par pragma causes
circuitry to be replicated, whereas the seq pragma causes circuitry to be reused
over multiple clock cycles. By default, the compiler replicates circuitry as
much as possible in exchange for performance. The user overrides this behavior
using seq; par is only useful for switching back to the default behavior within
an instance of seq. The reg pragma imposes pipelining. In a pipelined design,
one separates a function into several smaller computational units, each of which
is a stage in the pipeline that consumes output from the previous stage and
produces output for the next one. Each stage is a relatively small circuit with
some propagation delay. The clock rate is limited by the stage in the pipeline
with the highest propagation delay, whereas the un-pipelined implementation
would be limited by the sum of the propagation delays of all stages. So, rather
than perform one large computation on one input during a very long clock cycle,
an n-stage pipeline performs n parallel computations on n partial results, each
corresponding to a different input to the pipeline.
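The effect of pipelining on the clock period can be illustrated numerically.
The stage delays below are hypothetical values chosen for the example, not
measurements of our circuit, and the helper names are ours.

```python
def clock_period_ns(stage_delays_ns, pipelined):
    """Minimum clock period for a chain of combinational stages.

    Pipelined: registers separate the stages, so the slowest stage sets
    the period. Un-pipelined: a signal crosses every stage in one cycle,
    so the delays add up.
    """
    if pipelined:
        return max(stage_delays_ns)
    return sum(stage_delays_ns)


def clock_rate_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


delays = [3.0, 2.5, 2.0]   # hypothetical per-stage propagation delays (ns)
unpipelined_period = clock_period_ns(delays, pipelined=False)
pipelined_period = clock_period_ns(delays, pipelined=True)
```

With these assumed delays the period drops from 7.5 ns to 3.0 ns, at the cost
of a latency of three cycles before the first result leaves the pipeline.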
Our circuit’s remaining computational logic resides in Init and GenKS. These
functions deal with infinite streams so they’re going to be translated as sequential
circuits. Their throughput however, could probably be improved by pipelining
some of their components. In fact, using reg on a section did result in a greater
clock rate which influenced throughput.
A detailed report concerning this resulting implementation can be found in
section 4.2.

4.1 C code generation

Both C and sbv backends generate C code, although for different subsets and
with different goals. C mode can deal with almost the entire Cryptol language,
while only monomorphic, first-order, symbolically terminating and finite func-
tions can be translated in sbv mode. This is because sbv was designed for
formal-verification using SMT solvers and the C backend was mainly designed
for integration with external C projects. The other difference between C and sbv
modes is that the code generated by sbv does not do memory allocation/deal-
location at run-time, as opposed to the C one.
The simplicity of the SBV representation is what allows Cryptol to generate
really fast C code. But, translation in sbv mode fails:


Loading "snow3g_impl.cry".. Checking types.. Processing.. Done!
snow3g_impl> :set sbv
snow3g_impl> :translate GenKS
PANIC: SBV2C: Not yet implemented: BVRor over unsupported
sizes s198:[8] -0x1:[3]

This is because the SBV to C compiler was built as a proof of concept and
currently only processes specific constructs.
C code generation in the C backend depends on the following libraries:

– Cryptol.h Contains all the necessary prototypes, macros and a few stan-
dard C includes;
– CryAlloc.o Implements a custom memory allocator/deallocator for Cryp-
tol run-time;
– CryPrim.o Implements C-equivalents of Cryptol’s built-in functions;
– CryStream.o C library for representing/manipulating infinite streams;

Compiling (:compile) in this mode produces fast code, although it’s not as fast
as the hand-written C implementation found in [13]. Also, the generated code is
a bit more cryptic as demonstrated by the C definition of MULxPOW:
uint8
MULxPOW (uint8 v_MULxPOW, uint32 i_MULxPOW, uint8 c_MULxPOW)
{
    uint32 local7 = 0x0;
    uint8 local8 = 0x0;
    uint8 MULxPOW_res = 0x0;
    uint32 *mrk449 = getAllocMark ();

    MULxPOW_res = v_MULxPOW;
    for (local7 = 0x0; local7 < i_MULxPOW; local7 += 0x1) {
        local8 = MULx0 (MULxPOW_res, c_MULxPOW);
        MULxPOW_res = local8;
    }
    freeUntil (mrk449);
    return MULxPOW_res;
}

4.2 VHDL code generation

Modes spir and fpga provide VHDL generation via :translate. This process
depends on the following libraries:

– RTLib.vhdl Run-time library which is linked with the generated VHDL code;
– RTLib Xilinx.vhdl Defines the Xilinx specific parts of the VHDL run-time
library;

But ultimately, synthesis, simulation and exact performance reports require
external tools. For synthesis, Cryptol currently supports xst from Xilinx and
Synplicity’s synplify-pro. Regarding simulation, GHDL, ModelSim and Xilinx’s
own simulator are among those supported. After installing any of these,
Cryptol should be ready to interact with them out-of-the-box. We used the
following: fpga_synthesis=xst and vhdl_simulation=ise.
In this mode, a more exact profiling report of the proposed implementation
may be generated:

snow3g-impl-pipe> :set fpga fpga_board=spartan3e
    fpga_part=xc3s1600e-5fg484 fpga_netlist=vhdl
    fpga_blockram=behavioural fpga_optlevel=6 +v
snow3g-impl-pipe> GenKS
...
Timing Summary :
----------------
Speed Grade : -5

Minimum period : 6.214 ns ( Maximum Frequency : 160.930 MHz )
Minimum input arrival time before clock : 2.892 ns
Maximum output required time after clock : 11.497 ns
Maximum combinational path delay : No path found

Device Utilization ( size summary ) :
-----------------------------------
Selected Device : 3 s1600efg484 -5

Number of Slices : 1212 out of 14752 8%
Number of Slice Flip Flops : 1810 out of 29504 6%
Number of 4 input LUTs : 2192 out of 29504 7%

The following interpreter options were used: fpga_board=spartan3e to specify
which FPGA board the circuit should be placed on (Cryptol currently supports
two: spartan3e or avnet_v4mb), fpga_part=xc3s1600e-5fg484 for the specific
FPGA part, fpga_netlist=vhdl for VHDL netlist generation,
fpga_blockram=behavioural to take advantage of one-cycle-latency BlockRAMs,
fpga_optlevel=6 for maximum code optimization and +v for displaying the
reports of the various stages during implementation.
Regarding space, the proposed implementation is quite compact. But in order
to assess if it’s fast, comparisons need to be made with other implementations
and this requires the throughput value which can be calculated as:

\[ \text{throughput} = \frac{\text{clock rate} \times \text{output width}}{\text{output rate}} \]

The circuit’s clock rate is 160 MHz and it produces a 32-bit word per cycle as seen
previously in spir’s timing report. Therefore, the proposed implementation’s
throughput is equal to 5120 Mbps. Comparing it with other implementations:


Implementation       Frequency (MHz)   Throughput (Mbps)
Proposed SNOW 3G     160               5120
SNOW 3G [19]         249               7968
SNOW 3G [9]          100               2500
SNOW 2.0 [18]        141               4512
SNOW 1.0 [2]         66.5              2128
Table 1. Experimental results
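The throughput of the proposed implementation follows directly from the
formula given earlier. A minimal Python check, where throughput_mbps is a
helper name of our own and the output rate is expressed in cycles per output
word:

```python
def throughput_mbps(clock_mhz, output_width_bits, cycles_per_output):
    """Throughput in Mbps: clock rate times output width over output rate."""
    return clock_mhz * output_width_bits / cycles_per_output


# Proposed implementation: 160 MHz, one 32-bit word per cycle.
proposed = throughput_mbps(160, 32, 1)
```

Here proposed evaluates to 5120 Mbps, matching the first row of Table 1.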

5 Verification framework
Since specifications are geared towards a clear and rigorous understanding of
behavior while implementations must be optimized and designed for synthesis,
the two are bound to become very different even when written in the same
language. Therefore it’s imperative to check whether an implementation is
functionally equivalent to its specification. And since we’re talking about
assurance, it is also desirable to assess whether an implementation can be
safely executed, i.e. won’t produce run-time errors.
Cryptol’s verification framework has been designed to check these equivalence
and safety problems. The :eq command checks whether two functions f and g
agree on all inputs. If f and g are not equivalent, Cryptol identifies a value
x such that f x ≠ g x. This is accomplished by generating their formal models
and feeding them for comparison to a Boolean Satisfiability (SAT; see footnote
3) solver. Cryptol currently supports two SAT solvers: JAIG [24] and ABC [1],
the latter being the fastest with our code and therefore being used instead of
the default one.
A formal model is either a symbolic bit-vector (SBV) or an and-inverter graph
(AIG). The former is generated in sbv mode and the latter can be generated
in symbolic, spir or fpga modes. This means that currently, with SBV it’s
only possible to do Cryptol ⇔ Cryptol equivalence checking, while AIG-based
equivalence checking may be done across different backends.
SBV is a much simpler language designed for formal verification with
Satisfiability Modulo Theories (SMT; see footnote 4) solvers. It’s completely
monomorphised, there are no jumps as all function calls are unrolled, and it
only consists of bit-vector data and arithmetic/logical operations. Further
details about SBV in Cryptol can be consulted in [10].
Footnote 3: Decision problem for determining if the variables of a given
boolean formula can be assigned in such a way as to make the formula evaluate
to True.
Footnote 4: Decision problem for determining whether logical formulas are
satisfiable with respect to combinations of background theories expressed in
classical first-order logic with equality.

An And-Inverter Graph (AIG) is a directed, acyclic graph representing a
boolean logic circuit composed only of inverters and two-input AND gates.
Optional inverters are modeled as labels on the edges and AND gates correspond
to graph nodes. AIGs can represent arbitrary boolean functions and allow for
efficient manipulation of such functions [20]. Also, the recent emergence of
much more efficient SAT solvers, coupled with AIGs as the circuit
representation, has led to remarkable speedups in solving a wide variety of
boolean problems [21].
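To make the AIG idea concrete, the following Python sketch expresses xor using
only two-input AND gates and inverters. It illustrates the representation
itself and says nothing about how Cryptol or ABC store AIGs internally; the
function names are ours.

```python
def and_gate(a, b):
    """Two-input AND node on single bits (0 or 1)."""
    return a & b


def inv(a):
    """Inverter, i.e. a complemented edge in the graph."""
    return a ^ 1


def xor_aig(a, b):
    """a xor b built from three AND nodes plus inverters."""
    left = and_gate(a, inv(b))       # a AND (NOT b)
    right = and_gate(inv(a), b)      # (NOT a) AND b
    # OR via De Morgan: x OR y = NOT (NOT x AND NOT y)
    return inv(and_gate(inv(left), inv(right)))
```

Since xor and AND suffice to build any boolean function, the same construction
extends to whole bit-vector operations such as the ones in our cipher.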
On the other hand, :safe checks for possible run-time exceptions such as
divisions by zero or out-of-bounds indexes and, if any are found, outputs the
input values which trigger them. Guaranteeing the safe execution of a Cryptol
program implies that its subsequent translations to C will be safe as well.
However, for the full Cryptol language, both the equivalence and safety check-
ing problems are undecidable. They do become solvable if a restricted subset of
the language is adopted. Therefore, Cryptol’s verification framework only sup-
ports functions that are:

– Monomorphic;
– Finite;
– Symbolically terminating;
– First-order;

The first restriction comes from the fact that the framework’s underlying logic
is fixed-size bit vectors. Functions must also be finite because the system
lacks induction capabilities. The third restriction is required because the
termination problem is undecidable in general; therefore stream recursions
must be used. And because the only available data types from the underlying
logic are fixed-size bit vectors, everything is expanded away and thus it’s
impossible to represent a higher-order function.
But even with this restricted subset, the equivalence checking problem re-
mains NP-complete. While most practical instances should be solved in a feasi-
ble amount of time, one cannot expect a fast analysis for every instance. Some
instances can be solved much faster though, if human guidance is introduced.
Cryptol’s equivalence checker can translate problems into Isabelle/HOL notation
via the :isabelle command, reducing the equivalence question to a theorem to
be proved in higher-order logic [23].
The proposed implementation is already monomorphic, symbolically termi-
nating and first-order but the finite restriction applies. GenKS is the only infinite
function and so, the size of its output is fixed at, for instance, ten 32-bit words
with the inclusion of take functions.
The following shows how to check the equivalence and safety of GenKS for
the first 10 words of output:
Loading "snow3g_spec.cry".. Checking types.. Processing.. Done!
snow3g_spec> :set symbolic abc
snow3g_spec> :fm (\(x, y) -> take (10, GenKS (x, y)))
    "genks_spec.aig"
snow3g_spec> :load ./snow3g_impl.cry
Loading "snow3g_impl.cry".. Checking types.. Processing.. Done!
snow3g_impl> :set symbolic abc
snow3g_impl> :eq (\(x, y) -> take (10, GenKS (x, y)))
True
snow3g_impl> :set sbv
snow3g_impl> :safe (\(x, y) -> take (10, GenKS (x, y)))
"(\(x, y) -> take (10, GenKS (x, y)))" is safe; no safety
violations exist.

Equivalence checking may be used with yet another purpose. The Cryptol com-
piler is a verifying one [17] so when translating from Cryptol to VHDL for instance,
it’s necessary to prove the functional equivalence between the two:
snow3g_impl> :set spir
snow3g_impl> :eq (\(x, y) -> take (10, GenKS (x, y)))
True

There’s also a :sat command which can be used to find satisfying assignments
for bit-valued functions. :sat can be used to check interesting properties, for
instance, given the following finite definitions of cipher and decipher operations:
encrypt , decrypt : ([10][32] , [4][32] , [4][32]) -> [10][32];
encrypt ( pt , key , iv ) = [| p ^ k
|| p <- pt
|| k <- GenKS ( key , iv ) |];
decrypt ( ct , key , iv ) = [| c ^ k
|| c <- ct
|| k <- GenKS ( key , iv ) |];

Can encrypt produce the result 0?
:sat (\(pt, key, iv) -> encrypt (pt, key, iv) == zero)

Are there any different plaintext values p1 and p2, such that they will map to
the same ciphertext for the same key?
:sat (\(p1, p2, key, iv) -> (p1 != p2) &
    (encrypt (p1, key, iv) == encrypt (p2, key, iv)))

In each of the two above situations, two formal models are generated: one for
the bit-valued function (property) being checked and another for a function f
defined as:
f : ([10][32], [4][32], [4][32]) -> Bit;
f x = False;

The SAT solver then checks these two models for equivalence and the first
counter-example found is returned as the satisfying solution.
It’s also worth mentioning that the equivalence checking problem can be
posed as a satisfiability problem and vice versa. In general, the following two
queries semantically encode the same problem:
:eq f g
:sat (\x -> f x != g x)
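The relationship between equivalence, satisfiability and proof can be sketched
in Python. The functions below are our own stand-ins that search the input
space exhaustively, which is only feasible for tiny domains; they are not how
Cryptol’s SAT-based commands work internally, but they answer the same
questions.

```python
from itertools import product


def sat(pred, domains):
    """Return the first input tuple satisfying pred, or None if there is none."""
    for args in product(*domains):
        if pred(*args):
            return args
    return None


def eq(f, g, domains):
    """Equivalence as unsatisfiability of f x != g x; returns a counter-example."""
    return sat(lambda *args: f(*args) != g(*args), domains)


def prove(pred, domains):
    """A theorem holds when its negation has no satisfying assignment."""
    return sat(lambda *args: not pred(*args), domains) is None


BYTES = [range(256)]                      # one 8-bit argument
spec = lambda x: (x << 1) & 0xFF          # doubling modulo 256
impl = lambda x: (x + x) % 256            # a different but equivalent form
buggy = lambda x: x << 1                  # forgets the 8-bit truncation

no_counter_example = eq(spec, impl, BYTES)   # None: the two are equivalent
counter_example = eq(spec, buggy, BYTES)     # first input where they differ
```

In this example counter_example is (128,), the smallest input for which the
missing truncation becomes visible.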

Cryptol also supports a flexible way of checking certain properties of an
algorithm with theorem proving. In Cryptol, theorems are simple bit-valued
functions returning either True or False. This theorem-function correspondence
provides consistency and avoids an extra language to express properties. The
:prove command generates two formal models, one for the theorem and the other
for a function f defined as:
f : ([10][32] ,[4][32] ,[4][32]) -> Bit ;
f x = True ;
The two models are then checked for equivalence. The following theorem
represents the cipher’s correctness (i.e. decryption undoes encryption):
correct : ([10][32] , [4][32] , [4][32]) -> Bit ;
theorem correct : { pt key iv }.
decrypt ( encrypt ( pt , key , iv ) , key , iv ) == pt ;
Which can be proved as demonstrated by the following:
snow3g-impl> :set symbolic abc
snow3g-impl> :prove correct
Q.E.D.
Evidently, this holds only for the first 10 words. Although an algorithm’s
total correctness can’t be proven with this restricted subset of the Cryptol
language, the verification system helps to gain confidence in the algorithm’s
behavior. For further details regarding the framework, [10] should be
consulted.
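The correctness property can also be exercised outside Cryptol with randomized
testing, which gives weaker assurance than a proof but is cheap to run. The
Python sketch below uses toy_keystream, a hypothetical placeholder generator
that is not SNOW 3G; the property depends only on xor being its own inverse.

```python
import random

MASK32 = 0xFFFFFFFF


def toy_keystream(key, iv, n):
    """Placeholder keystream of n 32-bit words (NOT the SNOW 3G keystream)."""
    state = 0
    for k, v in zip(key, iv):
        state = (state * 31 + (k ^ v)) & MASK32
    words = []
    for _ in range(n):
        state = (state * 1664525 + 1013904223) & MASK32
        words.append(state)
    return words


def encrypt(pt, key, iv):
    """Xor each plaintext word with the corresponding keystream word."""
    return [p ^ k for p, k in zip(pt, toy_keystream(key, iv, len(pt)))]


decrypt = encrypt   # xor masking is an involution


def random_words(rng, n):
    return [rng.getrandbits(32) for _ in range(n)]


rng = random.Random(2009)   # fixed seed so the run is reproducible
round_trips = all(
    decrypt(encrypt(pt, key, iv), key, iv) == pt
    for pt, key, iv in (
        (random_words(rng, 10), random_words(rng, 4), random_words(rng, 4))
        for _ in range(100)
    )
)
```

The same setup also answers the two :sat questions informally: encrypting the
keystream itself yields all zeros, and two different plaintexts can never
collide under the same key and initialization vector.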

6 Related work
The use of tools such as Frama-C, which implement automatic proving of
algorithms written in C, is possible and has already been done successfully [3]
for cryptographic algorithms such as RC4. However, the provers must be guided
with special annotations which represent properties such as Hoare-style
pre/post-conditions and invariants. Some properties may be impossible to prove
automatically, in which case additional proofs have to be performed with the
help of an interactive proof assistant such as Coq. Another problem with this
approach is having to deal with unrelated details inherent to a low-level and
architecture-dependent language like C, such as (de)allocations, pointer
manipulation and valid array accesses.
Other tools like CryptoVerif provide a generic mechanism for specifying the
security assumptions on cryptographic primitives. CryptoVerif is based on
observational equivalence, which induces rewriting rules applicable in contexts
that satisfy some properties. The generated security proofs are sequences of
games [25] and the desired properties are proven if each individual proof
remains valid for a polynomial number of sessions (security parameter) in the
presence of an active adversary. This method requires the correct transcription
of C code or the exact security properties described in the CryptoVerif
language.

7 Conclusions
We wrote a specification for SNOW 3G. We then optimized it and also generated
a hardware implementation. This was done with a single tool, Cryptol. The
generated VHDL implementation is both compact and fast. We have successfully
confirmed Galois’ claim: non-hardware people such as us can get good results
by working in Cryptol. A user’s perspective on the Cryptol language and toolset
was also presented.
During the writing of this paper, Cryptol version 1.8.4 was used and since it’s
constantly being developed, newer versions might differ regarding some of the
aspects discussed.

8 Acknowledgments

We want to thank and express our profound respect to our tutor Prof. Manuel
Alcino Cunha for his reliable guidance. We also want to thank Mr. Levent Erkök
from Galois for his invaluable help.

References
1. ABC: A System for Sequential Synthesis and Verification. http://www.eecs.
berkeley.edu/~alanmi/abc/. Berkeley Logic Synthesis and Verification Group.
2. K. Alexander, R. Karri, I. Minkin, K. Wu, P. Mishra, and X. Li. Towards 10-100
Gbps Cryptographic Architectures. In International Symposium On Computer and
Information Sciences, Orlando, pages 25–30, 2002.
3. J. B. Almeida, M. Barbosa, J. S. Pinto, and B. Vieira. Deductive Verification
of Cryptographic Software. Technical Report DI-CCTC-09-03, Universidade do
Minho, 2009.
4. D. Coppersmith, S. Halevi, and C. Jutla. Cryptanalysis of Stream Ciphers With
Linear Masking. In Proc. of CRYPTO’02, pages 515–532. Springer-Verlag, 2002.
5. J. Daemen and V. Rijmen. The Design of Rijndael. Springer-Verlag, 2002.
6. L. E. Dickson. The analytic representation of substitutions on a power of a prime
number of letters with a discussion of the linear group. Annals of Mathematics,
11:65–120, 161–183, 1897.
7. P. Ekdahl. LFSR Based Stream Ciphers Analysis and Design. PhD thesis, Depart-
ment of Information Technology, Lund University, Sweden, 2003.
8. P. Ekdahl and T. Johansson. A New Version of the Stream Cipher SNOW. In
SAC ’02: Revised Papers from the 9th Annual International Workshop on Selected
Areas in Cryptography, pages 47–61. Springer-Verlag, 2003.
9. Elliptic Semicondutor Inc. CLP-41 SNOW 3G Cipher Core, available at
http://www.ellipticsemi.com/products-clp-41.php.
10. L. Erk¨k and J. Matthews. Pragmatic Equivalence and Safety Checking in Cryptol.
o
In Programming Languages meets Program Verification, PLPV’09, Georgia, USA,
pages 73–81. ACM Press, 2009.
11. European Telecommunications Standards Institute. http://www.etsi.org.
12. ETSI/SAGE. Specification of the 3GPP Confidentiality and Integrity Algorithms
UEA2 and UIA2. Document 1: UEA2 and UIA2 Specification, version: 1.1. http:
//www.gsmworld.com/documents/etsi_sage_06_09_06.pdf, 2006.
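13. ETSI/SAGE. Specification of the 3GPP Confidentiality and Integrity Algorithms
UEA2 and UIA2. Document 2: SNOW 3G Specification, version: 1.1. http://www.
gsmworld.com/documents/snow_3g_spec.pdf, 2006.
14. ETSI/SAGE. Specification of the 3GPP Confidentiality and Integrity Algorithms
UEA2 and UIA2. Document 5: Design and Evaluation report, version: 1.0. http:
//www.gsmworld.com/documents/uea2_design_evaluation.pdf, 2006.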
15. Galois, Inc. http://www.galois.com.
16. P. Hawkes and G. G. Rose. Guess-and-Determine Attacks on SNOW. In SAC ’02:
Revised Papers from the 9th Annual International Workshop on Selected Areas in
Cryptography, pages 37–46. Springer-Verlag, 2003.
17. T. Hoare. Towards the Verifying Compiler. In Formal Methods at the Crossroads:
From Panacea to Foundational Support, volume 2757 of LNCS, pages 151–160.
Springer, 2003.
18. P. Kitsos. Hardware Implementations for the ISO/IEC 18033-4:2005 Standard for
Stream Ciphers. International Journal of Signal Processing (IJSP), Number 1,
3:66–73, 2006.
19. P. Kitsos, G. Selimis, and O. Koufopavlou. High Performance ASIC Implemen-
tation of the SNOW 3G Stream Cipher. In IFIP/IEEE VLSI-SoC 2008 - Inter-
national Conference on Very Large Scale Integration (VLSI SOC), Rhodes Island,
Greece, October, pages 13–15, 2008.
20. A. Kuehlmann, V. Paruthi, F. Krohm, and M. K. Ganai. Robust Boolean Reasoning
for Equivalence Checking and Functional Property Verification. IEEE Trans.
CAD, 21:1377–1394, 2002.
21. A. Mishchenko, S. Chatterjee, and R. Brayton. Improvements to technology map-
ping for LUT-based FPGAs. In FPGA ’06: Proceedings of the 2006 ACM/SIGDA
14th international symposium on Field programmable gate arrays, California, USA,
pages 41–49. ACM, 2006.
22. New European Schemes for Signature, Integrity, and Encryption. https://www.
cosic.esat.kuleuven.be/nessie/.
23. T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL — A Proof Assistant for
Higher-Order Logic, volume 2283 of LNCS. Springer, 2002.
24. T. Nordin. The JAIG equivalence checker, 2005.
25. D. Nowak. A Framework for Game-Based Security Proofs. In Information and
Communications Security, 9th International Conference, ICICS 2007, Zhengzhou,
China, Proceedings, volume 4861 of LNCS, pages 319–333. Springer, 2007.

A SNOW 3G Reference Specification

// SNOW 3G Specification
// ---------------------
//
// Pedro Pereira & Ulisses Costa
/////////////////////////////////

// Components
/////////////

MULx : ([8], [8]) -> [8];
MULx (v, c) = if (v ! 0) then (v << 1) ^ c
              else (v << 1);

MULxPOW : ([8], [32], [8]) -> [8];
MULxPOW (v, i, c) = if (i == 0)
                    then v
                    else MULx (MULxPOW (v, (i - 1), c), c);

MULa : [8] -> [32];
MULa (c) = join [ (MULxPOW (c, 239, 0xA9))
                  (MULxPOW (c, 48, 0xA9))
                  (MULxPOW (c, 245, 0xA9))
                  (MULxPOW (c, 23, 0xA9)) ];

DIVa : [8] -> [32];
DIVa (c) = join [ (MULxPOW (c, 64, 0xA9))
                  (MULxPOW (c, 6, 0xA9))
                  (MULxPOW (c, 39, 0xA9))
                  (MULxPOW (c, 16, 0xA9)) ];

// Rijndael S-box
Sr : [8] -> [8];
Sr (x) = sb@x where
    sb = [ 0X63 0X7C 0X77 0X7B 0XF2 0X6B 0X6F 0XC5 0X30 0X01
           0X67 0X2B 0XFE 0XD7 0XAB 0X76 0XCA 0X82 0XC9 0X7D
           0XFA 0X59 0X47 0XF0 0XAD 0XD4 0XA2 0XAF 0X9C 0XA4
           0X72 0XC0 0XB7 0XFD 0X93 0X26 0X36 0X3F 0XF7 0XCC
           0X34 0XA5 0XE5 0XF1 0X71 0XD8 0X31 0X15 0X04 0XC7
           0X23 0XC3 0X18 0X96 0X05 0X9A 0X07 0X12 0X80 0XE2
           0XEB 0X27 0XB2 0X75 0X09 0X83 0X2C 0X1A 0X1B 0X6E
           0X5A 0XA0 0X52 0X3B 0XD6 0XB3 0X29 0XE3 0X2F 0X84
           0X53 0XD1 0X00 0XED 0X20 0XFC 0XB1 0X5B 0X6A 0XCB
           0XBE 0X39 0X4A 0X4C 0X58 0XCF 0XD0 0XEF 0XAA 0XFB
           0X43 0X4D 0X33 0X85 0X45 0XF9 0X02 0X7F 0X50 0X3C
           0X9F 0XA8 0X51 0XA3 0X40 0X8F 0X92 0X9D 0X38 0XF5
           0XBC 0XB6 0XDA 0X21 0X10 0XFF 0XF3 0XD2 0XCD 0X0C
           0X13 0XEC 0X5F 0X97 0X44 0X17 0XC4 0XA7 0X7E 0X3D
           0X64 0X5D 0X19 0X73 0X60 0X81 0X4F 0XDC 0X22 0X2A
           0X90 0X88 0X46 0XEE 0XB8 0X14 0XDE 0X5E 0X0B 0XDB
           0XE0 0X32 0X3A 0X0A 0X49 0X06 0X24 0X5C 0XC2 0XD3
           0XAC 0X62 0X91 0X95 0XE4 0X79 0XE7 0XC8 0X37 0X6D
           0X8D 0XD5 0X4E 0XA9 0X6C 0X56 0XF4 0XEA 0X65 0X7A
           0XAE 0X08 0XBA 0X78 0X25 0X2E 0X1C 0XA6 0XB4 0XC6
           0XE8 0XDD 0X74 0X1F 0X4B 0XBD 0X8B 0X8A 0X70 0X3E
           0XB5 0X66 0X48 0X03 0XF6 0X0E 0X61 0X35 0X57 0XB9
           0X86 0XC1 0X1D 0X9E 0XE1 0XF8 0X98 0X11 0X69 0XD9
           0X8E 0X94 0X9B 0X1E 0X87 0XE9 0XCE 0X55 0X28 0XDF
           0X8C 0XA1 0X89 0X0D 0XBF 0XE6 0X42 0X68 0X41 0X99
           0X2D 0X0F 0XB0 0X54 0XBB 0X16 ];


Sq : [8] -> [8];
Sq (x) = sb@x where
    sb = [ 0X25 0X24 0X73 0X67 0XD7 0XAE 0X5C 0X30 0XA4 0XEE
           0X6E 0XCB 0X7D 0XB5 0X82 0XDB 0XE4 0X8E 0X48 0X49
           0X4F 0X5D 0X6A 0X78 0X70 0X88 0XE8 0X5F 0X5E 0X84
           0X65 0XE2 0XD8 0XE9 0XCC 0XED 0X40 0X2F 0X11 0X28
           0X57 0XD2 0XAC 0XE3 0X4A 0X15 0X1B 0XB9 0XB2 0X80
           0X85 0XA6 0X2E 0X02 0X47 0X29 0X07 0X4B 0X0E 0XC1
           0X51 0XAA 0X89 0XD4 0XCA 0X01 0X46 0XB3 0XEF 0XDD
           0X44 0X7B 0XC2 0X7F 0XBE 0XC3 0X9F 0X20 0X4C 0X64
           0X83 0XA2 0X68 0X42 0X13 0XB4 0X41 0XCD 0XBA 0XC6
           0XBB 0X6D 0X4D 0X71 0X21 0XF4 0X8D 0XB0 0XE5 0X93
           0XFE 0X8F 0XE6 0XCF 0X43 0X45 0X31 0X22 0X37 0X36
           0X96 0XFA 0XBC 0X0F 0X08 0X52 0X1D 0X55 0X1A 0XC5
           0X4E 0X23 0X69 0X7A 0X92 0XFF 0X5B 0X5A 0XEB 0X9A
           0X1C 0XA9 0XD1 0X7E 0X0D 0XFC 0X50 0X8A 0XB6 0X62
           0XF5 0X0A 0XF8 0XDC 0X03 0X3C 0X0C 0X39 0XF1 0XB8
           0XF3 0X3D 0XF2 0XD5 0X97 0X66 0X81 0X32 0XA0 0X00
           0X06 0XCE 0XF6 0XEA 0XB7 0X17 0XF7 0X8C 0X79 0XD6
           0XA7 0XBF 0X8B 0X3F 0X1F 0X53 0X63 0X75 0X35 0X2C
           0X60 0XFD 0X27 0XD3 0X94 0XA5 0X7C 0XA1 0X05 0X58
           0X2D 0XBD 0XD9 0XC7 0XAF 0X6B 0X54 0X0B 0XE0 0X38
           0X04 0XC8 0X9D 0XE7 0X14 0XB1 0X87 0X9C 0XDF 0X6F
           0XF9 0XDA 0X2A 0XC4 0X59 0X16 0X74 0X91 0XAB 0X26
           0X61 0X76 0X34 0X2B 0XAD 0X99 0XFB 0X72 0XEC 0X33
           0X12 0XDE 0X98 0X3B 0XC0 0X9B 0X3E 0X18 0X10 0X3A
           0X56 0XE1 0X77 0XC9 0X1E 0X9E 0X95 0XA3 0X90 0X19
           0XA8 0X6C 0X09 0XD0 0XF0 0X86 ];

S1 : [32] -> [32];
S1 (w) = join [ (Sr (w0) ^ Sr (w1) ^ MULx (Sr (w2), 0x1B) ^
                 Sr (w2) ^ MULx (Sr (w3), 0x1B))
                (Sr (w0) ^ MULx (Sr (w1), 0x1B) ^ Sr (w1) ^
                 MULx (Sr (w2), 0x1B) ^ Sr (w3))
                (MULx (Sr (w0), 0x1B) ^ Sr (w0) ^
                 MULx (Sr (w1), 0x1B) ^ Sr (w2) ^ Sr (w3))
                (MULx (Sr (w0), 0x1B) ^ Sr (w1) ^ Sr (w2) ^
                 MULx (Sr (w3), 0x1B) ^ Sr (w3)) ]
    where [ w3 w2 w1 w0 ] = split w;

S2 : [32] -> [32];
S2 (w) = join [ (Sq (w0) ^ Sq (w1) ^ MULx (Sq (w2), 0x69) ^
                 Sq (w2) ^ MULx (Sq (w3), 0x69))
                (Sq (w0) ^ MULx (Sq (w1), 0x69) ^ Sq (w1) ^
                 MULx (Sq (w2), 0x69) ^ Sq (w3))
                (MULx (Sq (w0), 0x69) ^ Sq (w0) ^
                 MULx (Sq (w1), 0x69) ^ Sq (w2) ^ Sq (w3))
                (MULx (Sq (w0), 0x69) ^ Sq (w1) ^ Sq (w2) ^
                 MULx (Sq (w3), 0x69) ^ Sq (w3)) ]
    where [ w3 w2 w1 w0 ] = split w;

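// Clocking Operations
//////////////////////

Init : ([4][32], [4][32]) -> ([16][32], [3][32]);
Init (k, iv) = (ClockLFSR_IM@32, ClockFSM@32) where {

    // LFSR in initialisation mode: the FSM output F is fed back into the LFSR
    ClockLFSR_IM : [inf][16][32];
    ClockLFSR_IM =
        [ (Init_LFSR (k, iv)) ] #
        [| (drop (1, LFSR) #
            [(V (LFSR@0, LFSR@2, LFSR@11) ^ F (R@0, R@1, LFSR@15))])
        || LFSR <- ClockLFSR_IM
        || R <- ClockFSM |];

    // FSM update: R1' = R2 + (R3 ^ s5), R2' = S1 (R1), R3' = S2 (R2)
    ClockFSM : [inf][3][32];
    ClockFSM =
        [ [ 0 0 0 ] ] #
        [| [ (R@1 + (R@2 ^ LFSR@5))
             (S1 (R@0))
             (S2 (R@1)) ]
        || LFSR <- ClockLFSR_IM
        || R <- ClockFSM |];

};

GenKS : ([4][32], [4][32]) -> [inf][32];
GenKS (k, iv) = tail zt where {

    (lfsr, fsm) = Init (k, iv);

    // LFSR in keystream mode: no FSM feedback
    ClockLFSR_KSM : [inf][16][32];
    ClockLFSR_KSM =
        [ lfsr ] #
        [| (drop (1, LFSR) # [ (V (LFSR@0, LFSR@2, LFSR@11)) ])
        || LFSR <- ClockLFSR_KSM |];

    // Keystream words; the first one is discarded by `tail` above
    zt : [inf][32];
    zt =
        [| F (R@0, R@1, LFSR@15) ^ LFSR@0
        || LFSR <- ClockLFSR_KSM
        || R <- ClockFSM |];

    // FSM update: R1' = R2 + (R3 ^ s5), R2' = S1 (R1), R3' = S2 (R2)
    ClockFSM : [inf][3][32];
    ClockFSM =
        [ fsm ] #
        [| [ (R@1 + (R@2 ^ LFSR@5))
             (S1 (R@0))
             (S2 (R@1)) ]
        || LFSR <- ClockLFSR_KSM
        || R <- ClockFSM |];

};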

// Auxiliary
////////////

Init_LFSR : ([4][32], [4][32]) -> [16][32];
Init_LFSR (k, iv) =
    [ (k@0 ^ 0xFFFFFFFF)        (k@1 ^ 0xFFFFFFFF)
      (k@2 ^ 0xFFFFFFFF)        (k@3 ^ 0xFFFFFFFF)
      (k@0)                     (k@1)
      (k@2)                     (k@3)
      (k@0 ^ 0xFFFFFFFF)        (k@1 ^ 0xFFFFFFFF ^ iv@3)
      (k@2 ^ 0xFFFFFFFF ^ iv@2) (k@3 ^ 0xFFFFFFFF)
      (k@0 ^ iv@1)              (k@1)
      (k@2)                     (k@3 ^ iv@0) ];

F : ([32], [32], [32]) -> [32];
F (R0, R1, LFSR_15) = ((LFSR_15 + R0) & 0xFFFFFFFF) ^ R1;

V : ([32], [32], [32]) -> [32];
V (LFSR_0, LFSR_2, LFSR_11) =
    join (reverse (drop (1, s0) # [0x00])) ^ MULa (s0 @ 0) ^
    LFSR_2 ^
    join (reverse ([0x00] # take (3, s11))) ^ DIVa (s11 @ 3)
    where {
        s0 = reverse (split LFSR_0) : [4][8];
        s11 = reverse (split LFSR_11) : [4][8];
    };
