Basics of Cryptography - Stream ciphers and PRNG

Basics of cryptography
Shift registers and stream ciphers

Menu
 We can't cover the whole of cryptography here
 Let's try to explain how it works
 Simply
 Then let's look at some concrete examples
 Among many other fields of application...

Menu
 Cryptography main rule
 Vernam One Time Pad (OTP)
 Computer applications
 Problems
 Solutions
 Symmetric computer-based cryptography
 PRNG & LFSR
 Designing a stream cipher using PHP and C
 Applications:
 DVD / Blu-ray encryption (CSS / AACS)
 Wi-Fi (WEP: RC4)
 GSM (A5/1)

About me
 Julien PAULI - @julienpauli - github.com/jpauli
 Working for Sensiolabs in Paris
 Release manager of PHP 5.5 / 5.6
 PHP internals contributor from time to time (bug fixes,
internal API, performance)
 Knowledge of CPU architectures, the C language, Linux and
networking

Vernam OTP (One-Time Pad)
 The only encryption method that is mathematically proven
to be unbreakable

Vernam OTP
clear     : Hello foo
key       : secretkey
encrypted : ????????? (clear + key)

Vernam OTP
 You modulo-add the clear text and a randomly chosen key
that is at least as long as the clear text
 The operation is bijective and can be undone, just like
a classical addition
3 + 8 = 11
11 - 8 = 3
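
The addition above can be sketched in PHP on bytes instead of small numbers. This is only an illustration: the helper names otpAdd() and otpSub() are made up for this example, and the modulus 256 is an assumption that simply matches the size of one byte.

```php
<?php
declare(strict_types=1);

/** Adds each key byte to the matching clear byte, modulo 256 (hypothetical helper). */
function otpAdd(string $clear, string $key): string
{
    if (strlen($key) < strlen($clear)) {
        throw new InvalidArgumentException('The key must be at least as long as the clear text');
    }
    $out = '';
    for ($i = 0, $n = strlen($clear); $i < $n; $i++) {
        $out .= chr((ord($clear[$i]) + ord($key[$i])) % 256);
    }
    return $out;
}

/** Undoes otpAdd() by subtracting the same key bytes, modulo 256 (hypothetical helper). */
function otpSub(string $encrypted, string $key): string
{
    $out = '';
    for ($i = 0, $n = strlen($encrypted); $i < $n; $i++) {
        $out .= chr((ord($encrypted[$i]) - ord($key[$i]) + 256) % 256);
    }
    return $out;
}

$clear = 'Hello foo';
$key = random_bytes(strlen($clear)); // fresh random key, same size as the clear text
$encrypted = otpAdd($clear, $key);
echo otpSub($encrypted, $key), PHP_EOL; // prints "Hello foo"
```

Subtracting the key is the exact inverse of adding it, which is why the same key serves on both sides.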

Vernam OTP
3 + 8 = 11
11 - ??? = ???
clear + key = encrypted
 Used correctly, this is the only encryption method that is
100% safe.
 If you only get the encrypted content, you will never be able to
recover the clear text without the key

Vernam OTP
 Applied to an image this time:
[Figure: clear image + key image = encrypted image]

Vernam conditions
 Vernam OTP is 100% secure if and only if:
 The key is random and can't be guessed
 The key is kept secret
 The key size is >= the clear content size
 The key is never reused (One-Time Pad: OTP)
 The same key is used to encrypt and decrypt
 This is called symmetric encryption

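Vernam conditions
 If the key is reused:
[Figure: image 1 + key = encrypted 1 ; image 2 + key = encrypted 2 ; encrypted 1 + encrypted 2 = image 1 + image 2, the key cancels out]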
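
Why reusing the key is fatal can be shown in a few lines. This is a minimal sketch using XOR as the modulo-2 addition; the two messages are made up for the example and deliberately have the same length (16 bytes).

```php
<?php
declare(strict_types=1);

$key = random_bytes(16);        // one key, wrongly used twice
$clear1 = 'attack at dawn!!';   // 16 bytes
$clear2 = 'retreat at noon.';   // 16 bytes

// PHP XORs two strings byte by byte
$crypted1 = $clear1 ^ $key;
$crypted2 = $clear2 ^ $key;

// XORing both encrypted contents cancels the key out
$leak = $crypted1 ^ $crypted2;
var_dump($leak === ($clear1 ^ $clear2)); // bool(true)

// Anybody who knows (or guesses) one clear text gets the other one
echo $leak ^ $clear1, PHP_EOL; // prints "retreat at noon."
```

The attacker never needs the key: the two encrypted contents alone give away the XOR of the two clear texts.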

Vernam
 Used during WW II
 (Enigma, by contrast, was a rotor machine, not a one-time pad)
 Used for the "red phone" hotline
 To link Moscow to Washington
 Keys (physical) were carried using extra safe planes

In computer science
 Machines work in base 2 (binary)
 "Modulo-2 addition" is called XOR (exclusive OR)
 Noted ⊕ or ^
A  B  A ^ B
0  0    0
0  1    1
1  0    1
1  1    0

XOR for cryptography
 XOR fits the Vernam OTP: applying it twice with the same key gives back the input
 Let A be a clear text
 Let B be a secret key
 Encrypted C = A ^ B
 Clear A = C ^ B
Symmetric cryptography using
key B
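
The two formulas above translate almost directly to PHP, where ^ also works byte by byte on strings. A minimal sketch, assuming a hypothetical helper named xorWithKey():

```php
<?php
declare(strict_types=1);

/** XORs $data with $key; the very same call encrypts and decrypts (hypothetical helper). */
function xorWithKey(string $data, string $key): string
{
    if (strlen($key) < strlen($data)) {
        throw new InvalidArgumentException('The key must be at least as long as the data');
    }
    // The key is cut to the data length, then both strings are XORed byte by byte
    return $data ^ substr($key, 0, strlen($data));
}

$a = 'Hello foo';                  // clear text A
$b = random_bytes(strlen($a));     // secret key B
$c = xorWithKey($a, $b);           // C = A ^ B
echo xorWithKey($c, $b), PHP_EOL;  // A = C ^ B, prints "Hello foo"
```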

Vernam in computer science
 Vernam-based cryptography is unbreakable if:
 The key is secret
 The key size is >= the clear size
 The key is random
 The key is never reused
 Those 4 rules are hard to achieve in the modern computing world

The key is secret
 It is possible, though not ideal, to exchange the key securely
 Hand to hand
 "What's the Wi-Fi password please?"
 Usually, asymmetric cryptography is used to create a secure
channel over which the symmetric key is exchanged

And then ?
 The key is not used more than once

2 - The key size is >= the clear size

Key size
 To encrypt 25 MB of data, one needs a 25 MB key
 that's 26214400 characters
 How can we get away with a key of "reasonable" finite size?
 Think about Wi-Fi keys: often long, but nowhere near
millions of characters

LFSR
 Linear Feedback Shift Register
 Solution chosen to solve the problem "The key size must be
>= the clear size to crypt"
 How does that work ?

LFSR
 Linear Feedback Shift Register
 A structure used both in software and in electronics
 Easy to code in a programming language
 Easy to build into electronic chips
 Very powerful, very fast

One byte: 8 digits (bits)
 2^7 + 2^5 + 2^2 + 2^1 + 2^0 = 167 (decimal)
 In computer science, 1 byte = 1 character (like 'f')
 or one integer between 0 and 255 if you prefer
 Let's take one byte from the secret key
digit value : 1 0 1 0 0 1 1 1
digit index : 7 6 5 4 3 2 1 0

LFSR
10100111
 Shift register. At each clock tick ...
 Shift the digits one slot to the right
 Reinject the digit that falls off on the right back on the left
 We get an endless source of digits
 This is a circular shift
 The extracted digit is used to encrypt one digit of the payload (using XOR)

LFSR
 Shift register
 Shift to the right
 Reinject on the left
 We get an endless stream of digits but ...
 The digit sequence is finite (it repeats itself)
1- 10100111
2- 11010011
3- 11101001
4- 11110100
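
The circular shift can be sketched as follows, assuming an 8-digit register starting at 167 (the byte used on the slides). The function name rotateRight8() is made up for this example.

```php
<?php
declare(strict_types=1);

/** Rotates an 8-digit register one slot to the right (hypothetical helper). */
function rotateRight8(int $state): int
{
    $out = $state & 1;                   // digit leaving on the right
    return ($state >> 1) | ($out << 7);  // reinjected on the left
}

$start = 0b10100111; // 167
$state = $start;
$period = 0;
do {
    printf("%d- %08b (%d)\n", $period + 1, $state, $state);
    $state = rotateRight8($state);
    $period++;
} while ($state !== $start);

echo "period: $period", PHP_EOL; // prints "period: 8"
```

With a plain rotation the register is back to its starting value after only 8 ticks, which is why a more elaborate feedback is needed.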

LFSR
10100111
 Shift register
 We get an endless digit sequence
 But not a random one
 The feedback function is 1 (the output digit is fed back unchanged)
1- 10100111
2- 11010011
3- 11101001
4- 11110100
. . .

Where are we ?
 The key is secret
 The key is never reused

LFSR
10100111
 The sequence is going to repeat itself
 How can we add some randomness to it?
1- 10100111
2- 11010011
3- 11101001
4- 11110100
. . .

LFSR
10100111
 It's all about the feedback function
 Let's make the feedback function more complex

LFSR
10100111
1- 10100111
2- 01010011
3- 10010001
4- 11110000

LFSR
 That starts looking random, right?
bits        Integer
10100111    167
01010011    83
10010001    145
11110000    240

LFSR vs Maths
10100111
 This can be modelled mathematically:
 S = X^8 + X^6 + X^5 + X^4 + 1
 This is a classic polynomial (with binary coefficients), which can be studied and solved with algebra

m-sequence LFSR
10100111
 As the output is injected back into the input, this LFSR can only
go through a finite number of states
 The maximum period is 2^n - 1
 "n" is the LFSR degree (number of digits)
 The all-zero state is excluded: it would only ever produce zeros
 A sequence reaching this maximum period is called an "m-sequence"
 In the above example, n is 8, so the LFSR has a maximum
period of 255 states

m-sequence LFSR
10100111
 To get an m-sequence, the feedback polynomial must be primitive. Necessary conditions:
 The number of feedback digits (taps) must be even
 The tap positions must be coprime as a set
 S = X^8 + X^6 + X^5 + X^4 + 1
 Works, this LFSR will have an m-sequence (255 states)
 S = X^8 + 1
 Doesn't work, this LFSR will repeat before 255 states
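
The difference between a good and a bad feedback polynomial can be measured by counting states. The sketch below assumes a Galois-style LFSR (shift right, XOR a mask when a 1 falls out), and uses X^8 + X^6 + X^5 + X^4 + 1, a commonly tabulated maximal-length polynomial for 8 digits, picked here for illustration. The function name lfsrPeriod() and the mask encoding (X^k sets bit k-1) are choices of this example.

```php
<?php
declare(strict_types=1);

/**
 * Counts the states an 8-digit Galois LFSR goes through before coming back to $start.
 * $mask encodes the feedback polynomial without its constant term: X^k sets bit k-1.
 */
function lfsrPeriod(int $mask, int $start): int
{
    $state = $start;
    $period = 0;
    do {
        $out = $state & 1;   // digit shifted out on the right
        $state >>= 1;
        if ($out) {
            $state ^= $mask; // feedback
        }
        $period++;
    } while ($state !== $start);
    return $period;
}

echo lfsrPeriod(0b10111000, 1), PHP_EOL; // X^8 + X^6 + X^5 + X^4 + 1 : prints 255
echo lfsrPeriod(0b10000000, 1), PHP_EOL; // X^8 + 1 (plain rotation) : prints 8
```

With X^8 + 1 the register is just a circular shift, so it loops after 8 states instead of 255.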

m-sequence LFSR
10100111
 If we extend the LFSR to 32 digits, the max period becomes 2^32 - 1
 That's 4294967295 different states
 The output looks more and more random as the register grows
 With 32 digits (4 bytes or 4 secret key chars) we can
encrypt 4294967295 digits, thus about 512 MB.
 Above that: the key repeats itself (and breaks the Vernam
conditions)
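
The 512 MB figure follows from a quick calculation, sketched here under the assumption of a 64-bit PHP build (so that 2^32 - 1 fits in an integer) and with 1 MB taken as 1024 x 1024 bytes:

```php
<?php
declare(strict_types=1);

$degree = 32;
$maxPeriodBits = 2 ** $degree - 1;               // digits produced before the keystream repeats
$megabytes = $maxPeriodBits / 8 / 1024 / 1024;   // 8 digits per byte, 1024 * 1024 bytes per MB

printf("%d digits = %.2f MB\n", $maxPeriodBits, $megabytes); // prints "4294967295 digits = 512.00 MB"
```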

LFSR example coded in PHP
 https://github.com/jpauli/PHP-Crypto
**Simple Galois LFSR, degree 7 (127 states m-sequence)**
Used register bits for feedback : 7 6
Deducted Feedback function : 1100000 (0X60)
Your initial state is : 00000000000000000000001110001100 (908)
Let's now start the Linear Feedback Shift Register
[Iteration] [-------Internal Register -------] [PRandom bit]
| | |
v v v
0 - 00000000000000000000001110001100 [ 0 ]
1 - 00000000000000000000000111000110 [ 0 ]
2 - 00000000000000000000000011100011 [ 1 ]
3 - 00000000000000000000000000010001 [ 1 ]
4 - 00000000000000000000000001101000 [ 0 ]

LFSR example coded in PHP
/* Excerpt 1: build the feedback function (ff) from the tap positions */
for ($i = 0; $i < count(self::POLYNOMIAL_PRIME_COEFF[$this->degree]); $i++) {
    $this->taps[] = self::POLYNOMIAL_PRIME_COEFF[$this->degree][$i];
    $this->ff |= (1 << self::POLYNOMIAL_PRIME_COEFF[$this->degree][$i]);
}
/* LFSR always has first and last bit set */
$this->ff |= 1 << ($this->degree);
$this->ff |= 1;
/* Excerpt 2: run the register until it comes back to its starting state */
do {
    $this->iterations++;
    $this->currentState >>= 1; /* Shift register */
    yield $this->iterations => $this->currentState;
    if ($this->currentState & 1) {
        $this->currentState ^= $this->ff; /* re-enter */
    }
} while ($this->currentState != $this->start);

Encryption with an LFSR?
 Pretty easy
 Initialize the LFSR with the secret key
 Encrypt each clear digit with one digit generated by the LFSR,
using the XOR operation
 This is called a stream cipher
 (block ciphers also exist)

Stream Cipher demo

Generating a pseudo-random byte using an LFSR
function getRandomByte(LFSR $lfsr) : int
{
    $random = 0;
    $run = $lfsr->run();
    /* collect 8 output digits, least significant digit first */
    for ($j = 0; $j < 8; $j++) {
        $random |= $lfsr->getCurrentBit() << $j;
        $run->next();
    }
    return $random;
}

Ciphering clear data with the pseudo-random bytes
function cipher(string $input) : string
{
    $output = '';
    $lfsr = new LFSR($this->degree, $this->seed);
    /* a for loop (instead of do/while) also copes with an empty input */
    for ($i = 0, $dataSize = strlen($input); $i < $dataSize; $i++) {
        $random = $this->getRandomByte($lfsr);
        /* XOR one clear byte with one keystream byte */
        $output .= chr(ord($input[$i]) ^ $random);
    }
    return $output;
}
Yeah !

Where are we ?
 The key is not reused

Stream ciphers can be secure if
 The feedback digits are kept secret
 The period is big enough (m-sequence) to never loop
 The attacker cannot access the input stream
 If the attacker can inject some data into the clear input, a linear
equation system can be used to crack the LFSR and deduce the
key
 This needs only 2n consecutive output digits (n being the LFSR degree)
 "Berlekamp-Massey attack"
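
To make the attack concrete, here is a minimal sketch of the Berlekamp-Massey algorithm over binary digits. It assumes a tiny 4-digit LFSR (taps 1 and 4, seed 1000) invented for the example, and the function names lfsrStream() and berlekampMassey() are illustrative. From 2n = 8 observed keystream digits, it recovers the taps and can then predict the rest of the stream.

```php
<?php
declare(strict_types=1);

/** Produces $count digits where each new digit is the XOR of the digits $taps positions back. */
function lfsrStream(array $taps, array $seed, int $count): array
{
    $s = $seed;
    for ($i = count($seed); $i < $count; $i++) {
        $bit = 0;
        foreach ($taps as $j) {
            $bit ^= $s[$i - $j];
        }
        $s[] = $bit;
    }
    return array_slice($s, 0, $count);
}

/** Returns the coefficients c[0..L] of the shortest LFSR generating $bits. */
function berlekampMassey(array $bits): array
{
    $n = count($bits);
    $c = array_fill(0, $n + 1, 0);
    $c[0] = 1;
    $b = $c;     // copy of $c before the last length change
    $l = 0;      // current LFSR length
    $m = -1;     // position of the last length change
    for ($i = 0; $i < $n; $i++) {
        // discrepancy between the real digit and the one predicted by $c
        $d = $bits[$i];
        for ($j = 1; $j <= $l; $j++) {
            $d ^= $c[$j] & $bits[$i - $j];
        }
        if ($d === 0) {
            continue;
        }
        $t = $c;
        $shift = $i - $m;
        for ($j = 0; $j + $shift <= $n; $j++) {
            $c[$j + $shift] ^= $b[$j];
        }
        if (2 * $l <= $i) {
            $l = $i + 1 - $l;
            $m = $i;
            $b = $t;
        }
    }
    return array_slice($c, 0, $l + 1);
}

$secretTaps = [1, 4];                                   // unknown to the attacker
$secretSeed = [1, 0, 0, 0];                             // unknown to the attacker
$observed = lfsrStream($secretTaps, $secretSeed, 8);    // 2n = 8 leaked digits

$c = berlekampMassey($observed);
echo implode('', $c), PHP_EOL; // prints "11001"

// Positions 1..L where the coefficient is 1 are the recovered taps
$recovered = array_keys(array_filter(array_slice($c, 1, null, true)));
$predicted = lfsrStream($recovered, array_slice($observed, 0, count($c) - 1), 30);
var_dump($predicted === lfsrStream($secretTaps, $secretSeed, 30)); // bool(true)
```

Once the taps and a few digits are known, every future keystream digit is known too, so all later traffic can be decrypted.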

Having a good initialisation
 Randomness depends on how the key is used to initialize
the LFSR in the stream cipher
 The key is used to define the starting state of the LFSR
 It can also be used to choose the feedback digits
 The key is usually mixed with an initialization vector (IV),
which is a handful of random bytes.
 Thus, with the same key, the same LFSR will produce
different output
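
The effect of an IV can be sketched with a toy 16-digit Galois LFSR. Everything here is an assumption made for illustration: the mask 0xB400 is a commonly tabulated maximal-length choice, the mixing step (key XOR IV) is deliberately naive, and the names keystream() and startingState() are invented. Real ciphers mix the key and the IV far more thoroughly.

```php
<?php
declare(strict_types=1);

/** Keystream bytes from a toy 16-digit Galois LFSR (mask 0xB400, assumption of this sketch). */
function keystream(int $seed, int $length): string
{
    $state = $seed & 0xFFFF;
    if ($state === 0) {
        $state = 1; // the all-zero state would only ever produce zeros
    }
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $byte = 0;
        for ($bit = 0; $bit < 8; $bit++) {
            $lsb = $state & 1;
            $byte |= $lsb << $bit;
            $state >>= 1;
            if ($lsb) {
                $state ^= 0xB400;
            }
        }
        $out .= chr($byte);
    }
    return $out;
}

/** Hypothetical (and naive) mixing step: the starting state is key XOR IV. */
function startingState(int $key, int $iv): int
{
    return ($key ^ $iv) & 0xFFFF;
}

$key = 0xACE1;                 // same secret key for both messages
$iv1 = 0x1234;                 // IV sent along with message 1
$iv2 = 0x5678;                 // IV sent along with message 2
$s1 = keystream(startingState($key, $iv1), 8);
$s2 = keystream(startingState($key, $iv2), 8);

echo bin2hex($s1), PHP_EOL;
echo bin2hex($s2), PHP_EOL;
var_dump($s1 !== $s2); // bool(true): same key, different keystreams
```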

Hacking the encryption process
 If the LFSR starts looping, it's going to produce the same
output (repeat itself) and thus doesn't satisfy the Vernam
conditions anymore
 If the attacker can inject some input, he can use the
Berlekamp-Massey attack to crack the LFSR key and states

Strengthen the encryption
 Wire several LFSRs together:
[Figure: two LFSRs combined into a single output]

 With several LFSRs working together:
 The whole system is still linear
 Thus it can be cracked in polynomial time by injecting some traffic into the
input
 N-degree linear equation system
 We only push back the time needed

Application examples
 Well-known LFSR XOR based encryption systems
 (And how they've been hacked)

Examples
 Content Scrambling System (CSS)
 DVD protection mechanism (from 1995)
 Cracked in 1999 by hacking the LFSRs
 Keys are cracked by injecting some input, watching the output and
cracking the polynomials
 DeCSS is born, and movie piracy with it
 Back then, a Pentium III @ 450 MHz needed less than 18 seconds
to hack the LFSRs

DeCSS
 CSS keys are secret and distributed by the DVD CCA to
DVD player manufacturers
 Keys are stored in the hardware (or in the software for PC players)
 Each device needs a key, this is costly
 http://www.dvdcca.org/css.aspx
 Hence, the free software world and Linux were left out by the DVD CCA
 The open/free world answered by cracking CSS
 Lawsuits happened
 Technical analysis of CSS:
 http://www.lemuria.org/DeCSS/crypto.gq.nu/

CSS and VLC
 Since then, DeCSS code has been embedded into VLC
 In libdvdcss
 http://git.videolan.org/?p=libdvdcss.git;a=blob;f=src/css.c;
 This code is the algorithm that hacks CSS-protected DVDs, to
read them under Linux
 Hacking the LFSRs and the keys
 Otherwise the stream is encrypted and unreadable

 An LFSR alone can't be cryptographically secure, but we can still push
back the time needed to crack it
 That time should be > the time of a brute force attack
 If the output is a linear function of the input, then it can be
cracked
 https://en.wikipedia.org/wiki/Correlation_attack
 We need the output not to be a linear function of
the input.
 Use a non-linear feedback function
 NLFSR
 Use a non-linear shift

Notes about Trivium
 3 shift registers
 A : 93 digits
 B : 84 digits
 C : 111 digits
 Each register's input depends on another register's output and on one of its
own digits
 Period 2^64
 Part of the feedback makes use of an AND
 AND is a modulo-2 multiplication
 Thus the system is no longer linear, and cryptanalysis of the output can't crack
the registers by solving a linear system anymore

Using Trivium
 Secret key of 80 digits
 loaded in the A register's left digits
 IV of 80 digits as well
 loaded in the B register's left digits
 All other digits are zeroed, except the last 3 digits of C, which are set to 1.
 We shuffle for 1152 rounds (4 x 288 state digits).
 From round 1153 onwards: we get our stream

Cracking Trivium
 To date, no efficient attack has been discovered
 Algorithms in 2^68 have been found
 That's above brute force (2^64), thus useless
 As of 2018, Trivium is recommended by security
experts

A5/1
 A5/1 makes use of 3 LFSRs
 19 / 22 / 23 digits
 Introduces a non-linear shift:
 Each LFSR has one clocking digit; an LFSR is shifted only if its clocking digit agrees with the majority of the three: MAJ(1,2,3)
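
The majority rule can be sketched on its own, leaving out the registers themselves. The function names majority() and registersToShift() are made up for this illustration, and the position of the clocking digit inside each register is not modelled.

```php
<?php
declare(strict_types=1);

/** Majority of three digits: 1 if at least two of them are 1. */
function majority(int $a, int $b, int $c): int
{
    return ($a & $b) | ($a & $c) | ($b & $c);
}

/** Tells, for each of the three registers, whether it is shifted at this tick. */
function registersToShift(int $c1, int $c2, int $c3): array
{
    $maj = majority($c1, $c2, $c3);
    return [$c1 === $maj, $c2 === $maj, $c3 === $maj];
}

foreach ([[0, 0, 0], [0, 0, 1], [1, 0, 1], [1, 1, 1]] as [$a, $b, $c]) {
    $labels = array_map(fn (bool $s): string => $s ? 'shift' : 'hold', registersToShift($a, $b, $c));
    printf("%d%d%d -> %s\n", $a, $b, $c, implode(' ', $labels));
}
```

At every tick at least two of the three registers move, and which ones move depends on the register contents, which is what breaks the linearity.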

A5/1
 A5/1 is used to encrypt GSM communications
 It took about 10 years, but today A5/1 is broken
 In an acceptable time
 With acceptable computing hardware (CPU/memory)
 Often still needs some specific hardware
 Some flaws found in the GSM protocols weaken A5/1 further
and make attacks easier

RC4
 Rivest Cipher 4 doesn't use an LFSR, but can still be used as a
pseudo-random generator
 The big picture of RC4:
 Byte based (unit is byte, not digit)
 Works on a 256-byte internal array
 Uses many permutations and one XOR only
 Huge period, about 10^100
 Depending on the key used
 Max theoretical period: about 2^1700 (the number of possible internal states)

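RC4
 We put the 256 possible byte values (0 to 255) into an array
 We shuffle the array with the key, by adding bytes and swapping slots
 For each output byte, we move two indexes i and j and swap the 2 array slots they point to
 The output byte is read from the array, at the index given by the sum of those 2 slots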
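
The steps above fit in a short function. This is a sketch of the textbook RC4 algorithm for study purposes only (RC4 should not be used to protect anything today); the function name rc4() is this example's choice, and the key/clear text pair is the widely published "Key" / "Plaintext" test vector.

```php
<?php
declare(strict_types=1);

/** Textbook RC4: the same call encrypts and decrypts. For study only. */
function rc4(string $key, string $data): string
{
    $keyLength = strlen($key);
    if ($keyLength === 0) {
        throw new InvalidArgumentException('The key must not be empty');
    }

    // Key scheduling (KSA): shuffle the 256-byte array with the key
    $s = range(0, 255);
    $j = 0;
    for ($i = 0; $i < 256; $i++) {
        $j = ($j + $s[$i] + ord($key[$i % $keyLength])) % 256;
        [$s[$i], $s[$j]] = [$s[$j], $s[$i]];
    }

    // Generation (PRGA): one pseudo-random byte per data byte, then XOR
    $i = $j = 0;
    $out = '';
    for ($n = 0, $length = strlen($data); $n < $length; $n++) {
        $i = ($i + 1) % 256;
        $j = ($j + $s[$i]) % 256;
        [$s[$i], $s[$j]] = [$s[$j], $s[$i]];
        $random = $s[($s[$i] + $s[$j]) % 256];
        $out .= chr(ord($data[$n]) ^ $random);
    }
    return $out;
}

echo bin2hex(rc4('Key', 'Plaintext')), PHP_EOL; // prints "bbf316e8d940af0ad3"
echo rc4('Key', rc4('Key', 'Plaintext')), PHP_EOL; // prints "Plaintext"
```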

RC4 , demo in PHP and C

RC4 is cracked
 Massively used since its creation (1987), RC4 has been
studied a lot
 Today, it is cracked. Flaws have been discovered
 The first output bytes leak some information about the key
 KSA (Key Scheduling Algorithm) is too weak
 RC4 doesn't define how to use the IV
 So weak usages started to appear (concatenation of the IV with the key)
 The algorithm itself has some weaknesses
 RC4 output can be told apart from a truly random stream

RC4 in practice
 RC4 was used in 802.11 WEP (Wired Equivalent Privacy).
 WEP is very weak:
 Ability to inject some traffic into the input and watch the output, thus
recovering the internal state of RC4
 The integrity checksum is weak (CRC32, which is linear)
 Reuse of the key (overflow of the stream cipher period)
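
The linearity of CRC32 can be checked with PHP's built-in crc32(). The three strings are arbitrary and only need to have the same length (10 bytes here): the checksum of their XOR equals the XOR of their checksums.

```php
<?php
declare(strict_types=1);

$a = 'packet-one'; // 10 bytes
$b = 'packet-two'; // 10 bytes
$c = 'other-data'; // 10 bytes

$left  = crc32($a ^ $b ^ $c);               // checksum of the XORed strings
$right = crc32($a) ^ crc32($b) ^ crc32($c); // XOR of the three checksums

var_dump($left === $right); // bool(true)
```

Since the stream cipher is a XOR too, an attacker can flip chosen digits in an encrypted packet and patch the encrypted checksum accordingly, without knowing the key.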

Memorize
 We talked about stream ciphers
 Block ciphers also exist
 DES/AES/Blowfish/RC5
 Every cipher tries to get close to the only 100% cryptographically
secure scheme, the Vernam one-time pad
 A secret key
 A key length >= the clear length
 A modulo-2 addition (XOR in radix 2)

Memorize
 100% cryptographically secure Vernam one-time pad
 A secret key
 A key length >= the clear length
 A modulo-2 addition (XOR in radix 2)
 ... is difficult to achieve in the computer world
 We then use compromises: LFSRs, for example
 Starting from XOR operations, we try to push the cracking time
beyond brute force time
 But cryptanalysts often use high-level math tools to try to hack such
systems
 Daniel J. Bernstein is probably the best-known expert in this field

Crypto using PHP?
 Don't use ext/mcrypt
 Old, unmaintained, buggy and insecure
 Don't use mt_*() or rand() for crypto purposes
 Use ext/hash if you need to hash
 Use ext/sodium if you need to encrypt
 Stream ciphers considered secure as of 2018:
 trivium / salsa20 ...
 Have a look at the "eSTREAM" project
 http://www.ecrypt.eu.org/stream/
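
As an illustration of the ext/sodium advice, here is a minimal sketch using the secretbox functions (available when the sodium extension is loaded, bundled since PHP 7.2). They combine a Salsa20-family stream cipher with an authentication tag; the message is just an example.

```php
<?php
declare(strict_types=1);

$key = sodium_crypto_secretbox_keygen();                    // random secret key
$nonce = random_bytes(SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);  // plays the role of the IV, never reuse it with the same key

$encrypted = sodium_crypto_secretbox('Hello foo', $nonce, $key);
$decrypted = sodium_crypto_secretbox_open($encrypted, $nonce, $key);

var_dump($decrypted); // string(9) "Hello foo"
```

sodium_crypto_secretbox_open() returns false if the encrypted content has been tampered with, which a bare XOR stream cipher cannot detect.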
