# Notes - Computer Security MT24, Symmetric key ciphers

### Flashcards

The typical example used for symmetric key ciphers is communication between two parties, Alice and Bob, where there might be an eavesdropper. How can you modify this scenario to show that symmetric key ciphers are still useful for secure storage?

Bob is the future version of Alice, and the transmission channel is the storage medium.

@Define formally a symmetric key cryptosystem.

- Definitions:
- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- Requirements:
- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. any encrypted message can be decrypted)
- (note also that it doesn’t have to be true that for every $c \in \mathcal C$, $E _ k(D _ k(c)) = c$)

Formally, a symmetric key cryptosystem is given by:

- Definitions:
- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- Requirements:
- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. messages can be decrypted)

What is the Caesar cipher in this context?

- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. messages can be decrypted)

- $\mathcal M$ are fixed length blocks of Roman alphabet characters, encoded as a sequence of integers
- $\mathcal C = \mathcal M$
- $\mathcal K = \{0, 1, \ldots, 25\}$, so $ \vert \mathcal K \vert = 26$.
- $E _ k\langle m _ n, \ldots, m _ k\rangle = \langle m _ 1 + k, \ldots, m _ n + k \rangle \pmod{26}$
- $D _ k\langle c _ 1, \ldots, c _ k\rangle = \langle c _ 1 - k, \ldots, c _ n - k \rangle \pmod{26}$

Formally, a symmetric key cryptosystem is given by:

- Definitions:
- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- Requirements:
- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. messages can be decrypted)

What is the monoalphabetic substitution cipher in this context, and why is it weak?

- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. messages can be decrypted)

- $\mathcal M$ are fixed length blocks of Roman alphabet characters, encoded as a sequence of integers
- $\mathcal C = \mathcal M$
- $\mathcal K = S _ {26}$ (i.e. symmetric group of $26$ elements), $ \vert \mathcal K \vert = 26!$.
- $E _ k\langle m _ 1, \ldots, m _ n\rangle = \langle \pi(m _ 1), \ldots, \pi(m _ n) \rangle$
- $D _ k\langle c _ 1, \ldots, c _ k\rangle = \langle \pi^{-1}(c _ 1), \ldots, \pi^{-1}(c _ n) \rangle$

Defeated straight away in a KPA, or using frequency analysis.

Formally, a symmetric key cryptosystem is given by:

- Definitions:
- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- Requirements:
- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. messages can be decrypted)

@Define the Vernam cipher in this context, and explain why it is weak.

- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. messages can be decrypted)

- $\mathcal M = \{0, 1\}^n$
- $\mathcal K = \mathcal C = \mathcal M$
- $E _ k(m) = m \oplus k$
- $D _ k(m) = m \oplus k$

It is defeated immediately by a KPA, since if $E _ k(m) = c$, then $k = c \oplus m$. (Here the cipher is linear in the key).

Formally, a symmetric key cryptosystem is given by:

- Definitions:
- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- Requirements:
- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. messages can be decrypted)

Consider the following modification of a Vernam cipher:

- $\mathcal M = \{0, 1\}^n$
- $\mathcal K = \mathcal C = \mathcal M$
- $E _ k(m) = m \oplus h(k)$
- $D _ k(m) = m \oplus h(k)$

where $h$ is some one-way function. The immediate KPA for Vernam ciphers no longer works as the cipher is no longer linear in the key. But why is this still weak?

- $\mathcal M$ is the set of messages (typically binary strings of length $n$, the block size)
- $\mathcal C$ is the set of ciphertexts (typically binary strings of length $n$, the block size)
- $\mathcal K$ is the keyspace (typically binary strings of length $k$, the key size)
- $E : \mathcal K \times \mathcal M \to \mathcal C$ is the encryption function
- $D : \mathcal K \times \mathcal C \to \mathcal M$

- For every $m \in \mathcal M$, $D _ k(E _ k(m)) = m$
- (i.e. messages can be decrypted)

Suppose we know $E _ k(m) = c$, and have some $c’$ for which we wish to find the corresponding $m’$.

Since $c’ = m’ \oplus h(k)$, it follows $c \oplus c’ = m \oplus m’$. Then $m \oplus c \oplus c’ = m’$, and we are done (although we can’t deduce the key).

@Define Shannon perfect security.

Suppose:

- $m \in \mathcal M$ are drawn from some distribution

Then, a cryptosystem is perfectly secure if

- For all $m \in \mathcal M$ and $c \in \mathcal C$, $\mathbb P[M = m \mid C = c] = P[M = m]$. (the attacker gains no extra information about the message $m$ if they are given the ciphertext $c$)
- Keys are drawn uniformly from $\mathcal K$, and one key is used for one encryption only (so KPAs, CPAs or CCAs are impossible).

Shannon perfect security is defined where

- $m \in \mathcal M$ are drawn from some distribution

Then, a cryptosystem is perfectly secure if

- For all $m \in \mathcal M$ and $c \in \mathcal C$, $\mathbb P[M = m \mid C = c] = P[M = m]$. (the attacker gains no extra information about the message $m$ if they are given the ciphertext $c$)
- Keys are drawn uniformly from $\mathcal K$, and one key is used for one encryption only (so KPAs, CPAs or CCAs are impossible).

@State the **impracticability theorem**, which shows that it’s impractical to meet condition (1).

**impracticability theorem**, which shows that it’s impractical to meet condition (1).

Let $\mathcal M’ \subseteq \mathcal M$ denote the set of all messages with strictly positive probability.

Then Shannon’s first condition $\mathbb P[M = m \mid C = c] = P[M = m]$ requires $ \vert \mathcal K \vert \ge \vert \mathcal M’ \vert $.

This means it’s impractical because if there at least as many possible keys as there possible messages, it will take at least as many bits to store/transmit the key as the message itself.

Shannon perfect security is defined where

- $m \in \mathcal M$ are drawn from some distribution

Then, a cryptosystem is perfectly secure if

- For all $m \in \mathcal M$ and $c \in \mathcal C$, $\mathbb P[M = m \mid C = c] = P[M = m]$. (the attacker gains no extra information about the message $m$ if they are given the ciphertext $c$)
- Keys are drawn uniformly from $\mathcal K$, and one key is used for one encryption only (so KPAs, CPAs or CCAs are impossible).

A result called the **impracticability theorem** shows that it’s impractical to meet condition (1): If $\mathcal M’ \subseteq \mathcal M$ denotes the set of all messages with strictly positive probability and Shannon’s first condition $\mathbb P[M = m \mid C = c] = P[M = m]$ is met, then $ \vert \mathcal K \vert \ge \vert \mathcal M’ \vert $.

@Prove this.

**impracticability theorem**shows that it’s impractical to meet condition (1): If $\mathcal M’ \subseteq \mathcal M$ denotes the set of all messages with strictly positive probability and Shannon’s first condition $\mathbb P[M = m \mid C = c] = P[M = m]$ is met, then $ \vert \mathcal K \vert \ge \vert \mathcal M’ \vert $.

First note that if $\mathbb P[M = m] \ne 0$, then

\[\begin{aligned} &\mathbb P[M = m \mid C = c] = \mathbb P[M = m] \\\\ \implies& \mathbb P[M = m \land C = c] = \mathbb P[M = m]\mathbb P[C = c] \\\\ \implies& \mathbb P[C = c \mid M = m] = \mathbb P[C = c] \end{aligned}\]Now suppose for a contradiction that $ \vert \mathcal K \vert < \vert \mathcal M’ \vert $. Pick some ciphertext $c$. Because $c$ is possible, $\mathbb P[C = c] > 0$. Define

\[\mathcal M_c := \\{m \in \mathcal M \mid D_k(c) = m \text{ for some }k\\}\]Then

\[|\mathcal M_c| \le |\mathcal K| < |\mathcal M'|\](Why is the first inequality justified? Let $f(k) := D _ k(c)$. Then $f : \mathcal K \to \mathcal M$, and $\mathcal M _ c \subseteq \mathcal M$. But a function cannot have more outputs than it does inputs).

This means there exists some $m _ 0 \in \mathcal M$ with $\mathbb P[M = m _ 0] > 0$ but $D _ k(c) \ne m _ 0$ for any $k$. This implies $E _ k(m _ 0) \ne k$ for any $k$ and so $\mathbb P[C = c \vert M = m _ 0] = 0$. But by assumption $\mathbb P[C = c] \ne 0$, so $\mathbb P[C = c] \ne \mathbb P[C = c\mid M = m _ 0]$, a contradiction.

What is the difference between a Vernam cipher and a one-time pad?

A one-time pad is a Vernam cipher used with uniformly random keys and the key is changed for every encryption.

@Prove that the Vernam cipher with a truly random key meets Shannon’s first condition of perfect security.

Vernam cipher:

- $\mathcal M = \{0, 1\}^n$
- $\mathcal K = \mathcal C = \mathcal M$
- $E _ k(m) = m \oplus k$
- $D _ k(m) = m \oplus k$

Shannon’s first condition:

- For all $m \in \mathcal M$ and $c \in \mathcal C$, $\mathbb P[M = m \mid C = c] = P[M = m]$. (the attacker gains no extra information about the message $m$ if they are given the ciphertext $c$)

Pick arbitrary $m \in \mathcal M$ and $c \in \mathcal C$. Then:

\[\begin{aligned} \mathbb P[M = m \land C = c] &= \mathbb P[M = m \land K = c \oplus m] \\\\ &= \mathbb P[M = m]\mathbb P[K = c \oplus m] (\star)\\\\ &= \mathbb P[M = m] 2^{-n} \end{aligned}\]where $(\star)$ is justified as the choice of key is independent from the message. Also,

\[\begin{aligned} \mathbb P[C = c] &= \sum_m \mathbb P[M = m \land C = c] \\\\ &= \sum_m \mathbb P[M = m] 2^{-n} \\\\ &= 2^{-n} \end{aligned}\]Equating and dividing,

\[\mathbb P[M = m \mid C = c] = \mathbb P[M = m]\]What’s the difference between perfect security and computational security?

Perfect security is immune to attackers with unbounded computational power, but computational security is immune to attackers with only a limited amount of computational power (e.g. polynomial).

@Define Shannon’s property of **confusion** in a cipher, state what attack models this aims to prevent, and give an example of a weak cipher which doesn’t have this property

**confusion**in a cipher, state what attack models this aims to prevent, and give an example of a weak cipher which doesn’t have this property

- Given $m$ and $c$, the equation “find $k$ such that $E _ k(m) = c$” should be very hard to solve
- Aims to make KPA and CPA difficult.
- A Vernam cipher does not have the confusion property, since there is a simple linear relationship

@Define Shannon’s property of **diffusion** in a cipher, state what attack models this aims to prevent, and give an example of a weak cipher which doesn’t have this property.

**diffusion**in a cipher, state what attack models this aims to prevent, and give an example of a weak cipher which doesn’t have this property.

- Nonuniformity in parts of $m$ should be spread out so that $E _ k(m)$ is close to uniform for any $k$.
- Aims to make COA difficult by obscuring properties of the message in the ciphertext.
- A monoalphabetic permutation cipher doesn’t have this property since you can perform straightforward frequency analysis.

Suppose you have a cipher with encryption and decryption functions

\[\begin{aligned}
&E : \mathcal K \times \mathcal M \to \mathcal C \\\\
&D : \mathcal K \times \mathcal C \to \mathcal M
\end{aligned}\]
At the moment, the size of the keyspace is $ \vert \mathcal K \vert $. You might think that encrpting twice through two different keys via

\[\begin{aligned}
&E'' : (\mathcal K \times \mathcal K) \times \mathcal M \to \mathcal C \\\\
&E''_{\langle k, k' \rangle} (m) = E_k(E_{k'}(m))
\end{aligned}\]
would square the number of steps in the exhaustion attack on $E$. Can you describe the technique that shows this is not the case?

The technique is a **meet-in-the-middle** KPA attack. It works like this:

- Obtain plaintext $m$ with known encryption $c$
- For each $k \in \mathcal K$, compute $E _ k(m)$ and store $\langle k, E _ k(m)\rangle$ in list $L$
- For each $k’ \in \mathcal K$, compute $D _ {k’}(c)$ and search for $\langle k, D _ {k’}(c)\rangle$ in $L$
- If it is found, store $\langle k, k’\rangle$ in a list of candidate pairs
- Otherwise, continue

- Test the candidate pairs $\langle k, k’\rangle$ against a new plaintext $m’$ with known encryption $c’$. Discard the pair if $E _ {k’}(E _ k(m’)) \ne c’$
- Repeat the previous step until only one candidate pair remains, this is the double encryption key

Suppose you have a cipher with encryption and decryption functions

\[\begin{aligned}
&E : \mathcal K \times \mathcal M \to \mathcal C \\\\
&D : \mathcal K \times \mathcal C \to \mathcal M
\end{aligned}\]
At the moment, the size of the keyspace is $ \vert \mathcal K \vert $. You might think that encrpting twice through two different keys via

\[\begin{aligned}
&E'' : (\mathcal K \times \mathcal K) \times \mathcal M \to \mathcal C \\\\
&E''_{\langle k, k' \rangle} (m) = E_k(E_{k'}(m))
\end{aligned}\]
would square the number of steps in the exhaustion attack on $E$. But the technique known as a **meet-in-the-middle** KPA attack shows this is not the case:

- Obtain plaintext $m$ with known encryption $c$
- Work through each $k \in \mathcal K$, compute $E _ k(m)$ and store $\langle k, E _ k(m)\rangle$ in list $L$.
- Then go through again, and for each $k’ \in \mathcal K$, compute $D _ {k’}(c)$ and search for $\langle k, D _ {k’}(c)\rangle$ in $L$
- If it is found, store $\langle k, k’\rangle$ in a list of candidate pairs
- Otherwise, continue

- Test the candidate pairs $\langle k, k’\rangle$ against a new plaintext $m’$ with known encryption $c’$. Discard the pair if $E _ {k’}(E _ k(m’)) \ne c’$
- Repeat the previous step until only one candidate pair remains, this is the double encryption key

Justify why this attack works.

**meet-in-the-middle**KPA attack shows this is not the case:

- If it is found, store $\langle k, k’\rangle$ in a list of candidate pairs
- Otherwise, continue

- Firstly, it requires only $2^{n+1}$ work rather than $2^{2n}$ work, since you just need to iterate through the keyspace twice.
- If $E _ {k’}(E _ k(m)) = c$, then $E _ k(m) = D _ {k’}(c)$. So you can identify the two keys by “searching from both directions”, meeting in the middle.

Suppose you have a cipher with encryption and decryption functions

\[\begin{aligned}
&E : \mathcal K \times \mathcal M \to \mathcal C \\\\
&D : \mathcal K \times \mathcal C \to \mathcal M
\end{aligned}\]
At the moment, the size of the keyspace is $ \vert \mathcal K \vert $. What is triple ecryption?

Defining a new encryption function

\[E'''_{\langle k, k', k'' \rangle} (m) = E_{k''}(D_{k'}(E_k(m)))\]Suppose you have a cipher with encryption and decryption functions

\[\begin{aligned}
&E : \mathcal K \times \mathcal M \to \mathcal C \\\\
&D : \mathcal K \times \mathcal C \to \mathcal M
\end{aligned}\]
but want to encrypt a message longer than a single plaintext by splitting the message into several blocks.

@Define the electronic codebook (ECB) technique in this case, and describe some of its disadvantages.

Each block $m _ i$ is encrypted separately with the same key:

\[c_i = E_k(m_i)\]Leads to weaknesses if plaintexts might include repeated or similar blocks.

Suppose you have a cipher with encryption and decryption functions

\[\begin{aligned}
&E : \mathcal K \times \mathcal M \to \mathcal C \\\\
&D : \mathcal K \times \mathcal C \to \mathcal M
\end{aligned}\]
but want to encrypt a message longer than a single plaintext by splitting the message into several blocks.

@Define the cipherblock chaining (CBC) technique in this case.

Encryption of block $i$ is XOR-ed with the plaintext of block $i+1$ before encryption, starting with an initialisation vector $IV$.

\[\begin{aligned}
c_1 &= IV \\\\
c_{i+1} &= E_k(m_i \oplus c_i)
\end{aligned}\]
($IV$ must also be transmitted).

Suppose you have a cipher with encryption and decryption functions

\[\begin{aligned}
&E : \mathcal K \times \mathcal M \to \mathcal C \\\\
&D : \mathcal K \times \mathcal C \to \mathcal M
\end{aligned}\]
but want to encrypt a message longer than a single plaintext by splitting the message into several blocks.

@Define the output feedback mode (OFB) technique in this case, and highlight a way in which you need to be careful.

Use iterated encryption of a random initialisation vector $IV$ to generate a pseudorandom stream $\langle r _ 2, r _ 3, \ldots \rangle$. Use this pseudorandom stream as a Vernam cipher on each block:

\[\begin{aligned} c_1 = r_1 &= IV \\ r_{i+1} &= E_k(r_i) \\ c_{i+1} &= m_i \oplus r_{i+1} \end{aligned}\]Need to be careful that the pseudorandom stream doesn’t get into a cycle.

Suppose you have a cipher with encryption and decryption functions

\[\begin{aligned}
&E : \mathcal K \times \mathcal M \to \mathcal C \\\\
&D : \mathcal K \times \mathcal C \to \mathcal M
\end{aligned}\]
but want to encrypt a message longer than a single plaintext by splitting the message into several blocks.

@Define the counter mode (CTR) technique in this case.

Use encryption of “shifts” of an initialisation vector $IV \oplus i$ to generate a pseudorandom stream $\langle r _ 2, r _ 3, \ldots \rangle$. Use this pseudorandom stream as a Vernam cipher on each block:

\[\begin{aligned} r_i &= E_k(IV \oplus i) \\ c_1 &= IV \\ c_{i+1} &= m_i \oplus r_{i+1} \end{aligned}\]Need to be careful that the pseudorandom stream doesn’t get into a cycle.

What is the idea behind stream ciphers?

Attempt to simulate a perfectly secure one-time pad by using a cryptographic key to a seed pseudorandom stream of bits, which is then XORed with the plaintext for encryption.

Stream ciphers are generally less secure than block ciphers but are still used. Why?

They are useful in low latency applications, since you don’t need to wait for a complete block for encryption and transmission.

Formally @define a stream cipher?

- Definitions
- A hidden state from a set of possible states $s \in \mathcal S$
- An initialisation function $I: \mathcal K \to S$ which constructs an initial state from a key
- An update function $U : \mathcal S \to \mathcal S$ which updates the state
- An output function $f : \mathcal S \to \Sigma$ where $\Sigma$ is the alphabet for the stream cipher

- Operation
- Every time the output function is called to request a new symbol from the stream, the state is updated