Notes - ADS HT24, Amortised analysis


Flashcards

How does amortized analysis differ from ordinary worst-case analysis?


It considers the “average” worst-case runtime of an operation over a sequence of operations, rather than the worst case of each operation in isolation.

What are the three main types of amortized analysis?


  • Aggregate analysis
  • Accounting method
  • Potential method

Suppose you are performing aggregate analysis on some algorithm. What does this involve?


Showing that any sequence of $n$ operations takes $T(n)$ worst-case time in total.

Suppose you are performing aggregate analysis on some algorithm, and determine that for a sequence of $n$ operations, $T(n) = n^3$ in total. What would the amortized cost of each operation then be?


\[T(n) / n = n^2\]

Suppose you are considering a stack with three operations:

  • Push($x$), adds $x$ to the stack
  • Pop(), removes top item from stack
  • MultiPop($k$), removes top $k$ items from stack

Show that bounding each operation by its individual worst case gives an amortized cost of $O(n)$, but using aggregate analysis gives an amortized cost of $O(1)$ for each operation.


  • Push and Pop take $O(1)$ in the worst case, and MultiPop takes $O(n)$ in the worst case. Bounding each of the $n$ operations by its individual worst case gives $T(n) = O(n^2)$, so the amortized cost is $O(n)$.
  • From an aggregate perspective, note that nothing can be popped that wasn’t first pushed. Since we can push at most $n$ items, we can pop (including inside MultiPop) at most $n$ times. Hence $T(n) = O(n)$, for an amortized cost of $O(1)$ (see the sketch below).
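
To see the counting argument concretely, here is a minimal Python sketch (the `CountingStack` name and its `cost` counter are illustrative assumptions, not from the course): each elementary push or pop costs one unit, and any sequence of operations with $n$ pushes costs at most $2n$ units in total.

```python
class CountingStack:
    """Stack with Push/Pop/MultiPop that tracks total work done."""

    def __init__(self):
        self.items = []
        self.cost = 0  # one unit per elementary push or pop

    def push(self, x):
        self.items.append(x)
        self.cost += 1

    def pop(self):
        self.cost += 1
        return self.items.pop()

    def multipop(self, k):
        # Removes the top min(k, len) items, one unit each.
        while k > 0 and self.items:
            self.pop()
            k -= 1


s = CountingStack()
for i in range(100):       # 100 pushes
    s.push(i)
s.multipop(30)             # 30 units of work
s.multipop(1000)           # only 70 items remain, so 70 units
assert s.cost <= 2 * 100   # nothing is popped that wasn't pushed
```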

Suppose we are using the accounting method to do amortized analysis and that in a sequence of $n$ operations:

  • $c _ i$ is the actual cost of the $i$-th operation
  • $\hat c _ i$ is the “amortized” cost of the $i$-th operation

What is the “credit” in the data structure after such a sequence?


\[\sum^n_{i = 1} \hat c_i - \sum^n_{i = 1} c_i\]

Suppose that we are using the accounting method to do amortized analysis and that in a sequence of $n$ operations,

  • $c _ i$ is the actual cost of the $i$-th operation
  • $\hat c _ i$ is the “amortized” cost of the $i$-th operation

In order for the analysis to be worthwhile, what must be true about such a sequence, and why is this useful for providing an upper bound on the runtime of such a sequence?


\[T(n) = \sum^n_{i = 1} c_i \le \sum^n_{i = 1} \hat c_i\]

Equivalently, the credit must never go negative. The total amortized cost then upper-bounds the true total cost $T(n)$ of the sequence.

Suppose you are considering a stack with three operations:

  • Push($x$), adds $x$ to the stack
  • Pop(), removes top item from stack
  • MultiPop($k$), removes top $k$ items from stack

Quickly show that using the accounting method, we can deduce that the amortized cost of each operation is $O(1)$.


  • Charge £2 for Push, £1 to pay for pushing on the stack, and £1 to credit for later pops
  • Charge £0 for Pop and MultiPop

Note that at any moment, every item on the stack is associated with £1 of credit. Then any pops are essentially “free” because we have already paid for them. The total charge for $n$ operations is at most $2n$, so the amortized cost is $O(1)$.
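
A quick simulation of this charging scheme (the random workload and variable names are illustrative): charge £2 per Push and £0 otherwise, and check that the credit always equals £1 per stacked item, so it never goes negative.

```python
import random

random.seed(0)
charged, actual, stack = 0, 0, []
for _ in range(10_000):
    op = random.choice(["push", "pop", "multipop"])
    if op == "push":
        stack.append(0)
        charged += 2          # £1 pays for the push, £1 is stored as credit
        actual += 1
    elif op == "pop" and stack:
        stack.pop()
        actual += 1           # paid for by the credit stored at push time
    elif op == "multipop":
        for _ in range(random.randint(1, 5)):
            if not stack:
                break
            stack.pop()
            actual += 1
    # Credit invariant: £1 of credit per item currently on the stack.
    assert charged - actual == len(stack)
```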

Suppose we have a data structure $D$ and a potential function $\Phi$. What does $\Phi(D)$ represent when using the potential method for amortized analysis?


The amount of work “paid for” but not yet performed.

Suppose we have a data structure $D$ and a potential function $\Phi$. Suppose further that the true cost of an operation is $c _ i$. What is the definition of the amortized cost $\hat c _ i$, and can you justify what conditions you require to bound $T(n)$ using this method?


\[\hat c_i := c_i + \Phi(D_i) - \Phi(D_{i-1})\]

We hope that

\[T(n) \le \sum^n_{i = 1}\hat c_i\]

Note that

\[\begin{aligned} \sum^n_{i = 1} \hat c_i &= \sum^n_{i = 1} (c_i + \Phi(D_i) - \Phi(D_{i-1})) \\\\ &= \left(\sum^n_{i = 1} c_i\right) + \Phi(D_n) - \Phi(D_0) \end{aligned}\]

The condition we require is that $\Phi(D _ 0) \le \Phi(D _ n)$.

Suppose you are considering a stack with three operations:

  • Push($x$), adds $x$ to the stack
  • Pop(), removes top item from stack
  • MultiPop($k$), removes top $k$ items from stack

Quickly show that using the potential method, we can deduce that the amortized cost of each operation is $O(1)$.


Let $\Phi(D _ i)$ be the number of items in $D _ i$ after operation $i$. Then $\Phi(D _ 0) = 0$. For a push operation

\[\Phi(D_i) - \Phi(D_{i-1}) = (s+1) - s = 1 \implies \hat c_i = 1 + 1 = 2\]

For a pop operation:

\[\Phi(D_i) - \Phi(D_{i-1}) = (s-1) - s = -1 \implies \hat c_i = 1 - 1 = 0\]

And for a MultiPop($k$) that removes $k$ items:

\[\Phi(D_i) - \Phi(D_{i-1}) = (s-k) - s = -k \implies \hat c_i = k - k = 0\]

Since $\Phi(D _ n) \ge 0 = \Phi(D _ 0)$, we get $T(n) \le \sum^n _ {i=1} \hat c _ i \le 2n$, so each operation has amortized cost $O(1)$.

Suppose we have a $k$-bit binary counter

\[A[k-1], \cdots, A[1], A[0]\]

where $A[0]$ represents the lowest-order bit and $A[k-1]$ represents the highest-order bit. Assume that we have just one operation, $\text{Increment}$, which increments the value of the counter by $1$ ($\text{mod } 2^k$).

Assuming that the cost of an increment is equal to the number of bits flipped, show in three different ways that the amortised cost of each $\text{Increment}$ is $O(1)$.


Aggregate method:

\[\begin{aligned} \text{total flips} &= \sum^{k-1}_ {i = 0} \left\lfloor \frac{n}{2^i} \right\rfloor \\\\ &\le \sum^{k-1}_ {i = 0} \frac{n}{2^i} \\\\ &\le 2n \end{aligned}\]

Thus total cost is $O(n)$, so amortised cost is $O(1)$.

Accounting method:

  • For every bit that changes from $0$ to $1$, charge £2.
    • £1 to cover the cost of the flip
    • £1 to prepay the cost of flipping the bit back to $0$
  • For every bit that changes from $1$ to $0$, charge £0.
    • We have already pre-paid the cost of flipping the bit.

Credit invariant: every bit set to $1$ holds £1 of credit. Hence the total number of flips is at most the total charge. But the charge for one increment is $\le 2$, so the total charge for $n$ increments is $\le 2n$. Therefore the amortised cost is at most the maximum charge, which is $2$. So each increment costs $O(1)$.

Potential method:

Define our potential function $\Phi : \mathcal C \to \mathbb R _ {\ge 0}$ by

\[\Phi(A) = \text{number of 1s in counter}\]

Consider the charge

\[\hat c_i = c_i + \Phi_i - \Phi_{i -1}\]

For some increment $i$, let $t$ be the number of $1$s that flipped to $0$. Then $c _ i \le t + 1$, and $\Phi _ i - \Phi _ {i-1} \le -t + 1$. Hence

\[\hat c_i = c_i + \Phi_i - \Phi_{i-1} \le (t + 1) + (-t + 1) = 2\]

so $\hat c _ i \le 2$ and hence the amortised cost is $\le 2$.
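
All three bounds say the same thing operationally. A small Python check (the function name and counters are mine, for illustration) confirms that $n$ increments flip at most $2n$ bits:

```python
def increment(A):
    """Increment the counter A (A[0] = lowest-order bit) mod 2^k,
    returning the number of bit flips performed."""
    flips, i = 0, 0
    while i < len(A) and A[i] == 1:
        A[i] = 0          # a 1 -> 0 flip, prepaid when the bit was set
        flips += 1
        i += 1
    if i < len(A):
        A[i] = 1          # the single 0 -> 1 flip of this increment
        flips += 1
    return flips


k, n = 16, 10_000
A = [0] * k
total = sum(increment(A) for _ in range(n))
assert total <= 2 * n     # aggregate bound: at most 2n flips in total
```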

Suppose we are implementing a dynamic table with the following code:

INSERT(T, x):
	if size[T] = 0, then:
		allocate table[T] w/ 1 slot
		size[T] = 1

	if num[T] = size[T], then:
		allocate newtable with 2*size[T] slots
		copy all items in table[T] to newtable
		free table[T]
		table[T] = newtable
		size[T] = 2*size[T]

	insert x into T
	num[T] += 1

Assuming that the cost of each operation is measured by the number of insertions made, quickly prove the amortised cost of each operation is $O(1)$, first using the aggregate method, and then using the potential function method.


Aggregate method: The cost of the $i$-th insertion is

\[c_i = \begin{cases} i &\text{if }i-1\text{ is a power of }2 \\\\ 1 &\text{otherwise} \end{cases}\]

Then

\[\sum^n_{i = 1}c_i = n + \sum^{\lfloor \log n \rfloor}_{i = 0} 2^i \le 3n\]

Potential method: Define

\[\Phi(T) = 2\text{num}[T] - \text{size}[T]\]

Consider the charge $\hat c _ i = c _ i + \Phi _ i - \Phi _ {i-1}$ for the $i$-th insertion. Without expansion, $c _ i = 1$ and $\Phi _ i - \Phi _ {i-1} = 2$, so $\hat c _ i = 3$. With expansion, if $\text{size}[T] = m$ before expansion, then $c _ i = m + 1$, $\Phi _ {i-1} = 2m - m = m$ and $\Phi _ i = 2(m+1) - 2m = 2$, so $\hat c _ i = (m + 1) + 2 - m = 3$. Either way $\hat c _ i = 3 = O(1)$.
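
Here is a runnable Python version of the pseudocode above (the class wrapper and `cost` counter are illustrative), charging one unit per element written, so the $\sum c _ i \le 3n$ bound can be checked directly:

```python
class DynamicTable:
    """Doubling table mirroring INSERT; cost counts element writes."""

    def __init__(self):
        self.table, self.size, self.num, self.cost = None, 0, 0, 0

    def insert(self, x):
        if self.size == 0:
            self.table, self.size = [None], 1
        if self.num == self.size:
            newtable = [None] * (2 * self.size)
            for i in range(self.num):   # copying costs one write per item
                newtable[i] = self.table[i]
                self.cost += 1
            self.table, self.size = newtable, 2 * self.size
        self.table[self.num] = x        # the insertion itself
        self.num += 1
        self.cost += 1


t = DynamicTable()
n = 10_000
for i in range(n):
    t.insert(i)
assert t.cost <= 3 * n   # matches the aggregate bound above
```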

Suppose we are implementing a dynamic table with the following code:

INSERT(T, x):
	if size[T] = 0, then:
		allocate table[T] w/ 1 slot
		size[T] = 1

	if num[T] = size[T], then:
		allocate newtable with 2*size[T] slots
		copy all items in table[T] to newtable
		free table[T]
		table[T] = newtable
		size[T] = 2*size[T]

	insert x into T
	num[T] += 1

We also want to implement deletion in $O(1)$ amortised time. To do this, we define a load factor $\alpha = \text{num}[T] / \text{size}[T]$. So that we only use a reasonable amount of space, we want to ensure $\alpha = \Theta(1)$.

When should we halve or double the size of the table, and why does the obvious choice not work, and what potential function can you use to show that this works?


We should halve the table when $\alpha = \frac 1 4$, and double when $\alpha = 1$.

The obvious choice would be halving the table when $\alpha = \frac 1 2$ and doubling when $\alpha = 1$. This doesn’t work: when $\text{num}[T]$ is around a power of two, alternating insertions and deletions at the boundary trigger an expansion or contraction, each costing $\Theta(n)$, every $O(1)$ operations, so the amortised cost is $\Theta(n)$.

The potential function is given by:

\[\Phi(T) = \begin{cases} 2\text{num}[T] - \text{size}[T] &\text{if }\alpha[T] \ge \frac 1 2 \\\\ \frac 1 2\text{size}[T] - \text{num}[T] &\text{if } \alpha[T] < \frac 1 2 \end{cases}\]
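
A sketch of the $\frac 1 4$/$1$ rule in Python (the class and counter names are mine; deletion just removes the last item), checking that a long insert/delete workload stays $O(1)$ amortised:

```python
class ShrinkingTable:
    """Double at load factor 1, halve at 1/4; cost counts element moves."""

    def __init__(self):
        self.size, self.num, self.cost = 0, 0, 0

    def _resize(self, new_size):
        self.cost += self.num          # moving every stored item
        self.size = new_size

    def insert(self):
        if self.size == 0:
            self.size = 1
        elif self.num == self.size:    # alpha = 1: double
            self._resize(2 * self.size)
        self.num += 1
        self.cost += 1

    def delete(self):
        assert self.num > 0
        self.num -= 1
        self.cost += 1
        if self.size > 1 and self.num == self.size // 4:
            self._resize(self.size // 2)   # alpha = 1/4: halve


t = ShrinkingTable()
for _ in range(10_000):
    t.insert()
for _ in range(9_000):
    t.delete()
ops = 10_000 + 9_000
assert t.cost <= 3 * ops   # O(1) amortised per operation
```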

Consider an implementation of the Union-Find data structure via disjoint-set forests with path compression.

How can you modify the standard potential method to instead get a bound on the total cost of $f$ $\mathbf{FindSet}$ operations in a sequence of:

  • $n$ $\mathbf{MakeSet}$ operations
  • $k$ $\mathbf{Union}$ operations
  • $f$ $\mathbf{FindSet}$ operations

(where we consider the “unit of cost” to be one pointer update)?


The standard potential method uses the fact that

\[\sum^n_{i = 1} \hat c_i = \left(\sum^n_{i = 1} c_i\right) + \left(\Phi_n - \Phi_0\right)\]

and then if $\Phi _ n - \Phi _ 0 \ge 0$, the total charge is an upper bound on the total cost. Instead, we can rearrange this to get a fully general statement that

\[\sum^n_{i = 1} c_i = \left( \sum^n_{i = 1} \hat{c}_i \right) - \left(\Phi_n - \Phi_0\right)\]

In this case, define the potential function to be

\[\Phi(D) = \text{number of unvisited nodes (say }U)\]

This starts off at most $n$ (each $\mathbf{MakeSet}$ creates one unvisited node). Then for each $\mathbf{FindSet}(x)$:

  • If we have already visited $x$: its path has already been compressed, so $c _ i = 1$ and the amortised cost is $c _ i + U - U = 1$.
  • If we haven’t visited $x$: suppose we visit $t$ nodes on the path to the root that we compress, then before the operation we have $U$ unvisited nodes, and after it we have $U - (t - 1)$. Then the amortised cost is $t + (U - t + 1) - U = 1$ again.

Hence

\[\begin{aligned} \text{total cost} &= \text{total charge} - \text{change in potential} \\\\ &\le f - (-n) \\\\ &\le n + f \end{aligned}\]

So the total cost is $O(n + f)$.
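
The argument can be checked empirically. In this Python sketch (the function names and workload are illustrative; unions use naive linking, which the bound does not depend on), the cost of the $\mathbf{FindSet}$ phase, measured in parent-pointer steps, stays within $n + f$:

```python
import random

random.seed(0)
n = 500
parent = list(range(n))   # n MakeSets
cost = 0                  # one unit per parent-pointer step


def find(x):
    global cost
    path = []
    while parent[x] != x:          # walk to the root
        path.append(x)
        x = parent[x]
        cost += 1
    for y in path:                 # path compression
        parent[y] = x
    return x


def union(x, y):                   # naive linking of the two roots
    rx, ry = find(x), find(y)
    if rx != ry:
        parent[rx] = ry


for _ in range(n - 1):             # some Unions
    union(random.randrange(n), random.randrange(n))

cost = 0                           # measure only the FindSet phase
f = 5_000
for _ in range(f):
    find(random.randrange(n))
assert cost <= n + f               # total cost O(n + f)
```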

Consider a variant of a link-by-size implementation of the disjoint-set forest data structure where instead of storing the actual size $ \vert T _ i \vert $ of each tree $T _ i$, we store $\ell _ i = \lfloor \log _ 2 \vert T _ i \vert \rfloor$, and when merging $T _ i$ and $T _ j$, we use the following rules:

  • If $\ell _ i < \ell _ j$, we make the root of $T _ i$ point to the root of $T _ j$
  • If $\ell _ i > \ell _ j$, we make the root of $T _ j$ point to the root of $T _ i$
  • If $\ell _ i = \ell _ j$, we choose one of the two options arbitrarily.

Quickly prove that the height of a tree that contains $s$ nodes does not exceed $2\log _ 2 s$.


As is typical with lots of these questions, we instead prove a statement like

\[s \ge 2^{\Omega(h)}\]

by induction on the number of $\mathbf{Union}$s. In this case, we prove that

\[s \ge (1.5)^h\]

Note that this does prove the statement, because then

\[h \le \log_{1.5}(s) = \frac{\log_2(s)}{\log_2(3) - 1} \le 2\log_2(s)\]

The base case is immediate. Now consider the $k$-th $\mathbf{Union}$; by the inductive hypothesis, all trees currently in the data structure have the desired property.

Where does the seemingly arbitrary constant $1.5$ come from? Suppose that we want to combine two trees $T _ 1$ and $T _ 2$ by making the root of $T _ 1$ point to the root of $T _ 2$ (so $\ell _ 1 \le \ell _ 2$), and let $s _ i$ and $h _ i$ denote the size and height of each tree respectively.

We have

\[\begin{aligned} \lfloor \log_2 s_1 \rfloor \le \lfloor \log_2 s_2 \rfloor &\implies \log_2 s_1 < (\log_2 s_2) + 1 \\\\ &\implies (\log_2 s_1) - 1 < \log_2 s_2 \\\\ &\implies \log_2(s_1 / 2) < \log_2 s_2 \\\\ &\implies s_1/2 < s_2 \end{aligned}\]

Then if $h _ 1 < h _ 2$, $h = h _ 2$ and

\[s > s_2 \ge (1.5)^{h_2} = 1.5^h\]

Otherwise $h _ 1 \ge h _ 2$, so $h = h _ 1 + 1$. Then

\[\begin{aligned} s &= s_1 + s_2 \\\\ &\ge 1.5s_1 \\\\ &\ge 1.5 \cdot (1.5^{h_1}) \\\\ &\ge (1.5)^h \end{aligned}\]

as required. So the key is noticing that $s _ 1 + s _ 2 \ge 1.5 s _ 1$.
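
A quick empirical check of this linking rule in Python (names are mine; `bit_length() - 1` computes $\lfloor \log _ 2 s \rfloor$, and the brute-force height computation exists only for the assertion):

```python
import math
import random

random.seed(1)
n = 1024
parent = list(range(n))
ell = [0] * n      # l_i = floor(log2 |T_i|), stored at each root
size = [1] * n     # true sizes, kept only to check the bound


def root(x):
    while parent[x] != x:
        x = parent[x]
    return x


def union(x, y):
    rx, ry = root(x), root(y)
    if rx == ry:
        return
    if ell[rx] > ell[ry]:          # point the smaller-exponent root at the other
        rx, ry = ry, rx
    parent[rx] = ry                # ties broken arbitrarily (here: rx -> ry)
    size[ry] += size[rx]
    ell[ry] = size[ry].bit_length() - 1   # floor(log2 of the new size)


def height(r):
    # depth of the deepest node in the tree rooted at r (brute force)
    h = 0
    for v in range(n):
        d, u = 0, v
        while parent[u] != u:
            u, d = parent[u], d + 1
        if u == r:
            h = max(h, d)
    return h


for _ in range(5 * n):
    union(random.randrange(n), random.randrange(n))
for r in {root(v) for v in range(n)}:
    if size[r] > 1:
        assert height(r) <= 2 * math.log2(size[r])
```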

When using the accounting method to do amortised analysis, what is a “credit invariant”?


Some useful fact about the total credit in the data structure that lets you deduce that every cost can be covered, e.g. “the total credit equals the number of items in the list”, or “every bit set to $1$ holds £1 of credit”. You can then use the total charge as an upper bound on the total cost.



