Notes - Artificial Intelligence MT24, Informed search


Flashcards

What is the expansion strategy for best-first search?


Pick the node $n$ with the smallest $f(n)$ from the frontier, where $f(n)$ is an estimate of the total cost of the optimal solution passing through that node.

Best-first search selects nodes from the frontier with the smallest $f(n)$ where $f(n)$ estimates the total cost to the optimal solution passing through that node. What’s a useful way to “split up” $f$?


\[f(n) = g(n) + h(n)\]

where:

  • $g(n)$ is the past cost: the cost of the path from the root to $n$
  • $h(n)$ is an estimate of the future cost: the cost of the cheapest path from $n$ to a goal. When $h$ is such an estimate (a heuristic), this is $A^\star$ search

Best-first search selects nodes from the frontier with the smallest $f(n)$ where $f(n)$ estimates the total cost to the optimal solution passing through that node. How are BFS and UCS special cases of this?


  • BFS (breadth-first search): $f(n) = \text{depth of }n$
  • UCS (uniform-cost search): $f(n) = g(n)$, the path cost from the root to $n$ (see the sketch below)
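
Both special cases fall out of one implementation. A minimal Python sketch (the example graph, costs, and heuristic values are invented for illustration): best-first tree search with a pluggable $f$, where $f = g$ gives UCS and $f = g + h$ gives $A^\star$.

import heapq

def best_first_search(start, goal_test, successors, f):
	"""Generic best-first tree search.

	successors(state) yields (step cost, next state) pairs.
	f(g, state) ranks frontier entries; smallest is expanded first.
	"""
	frontier = [(f(0, start), 0, start, [start])]  # (f, g, state, path)
	while frontier:
		_, g, state, path = heapq.heappop(frontier)
		if goal_test(state):
			return path, g
		for cost, nxt in successors(state):
			g2 = g + cost
			heapq.heappush(frontier, (f(g2, nxt), g2, nxt, path + [nxt]))
	return None

# Illustrative weighted graph and heuristic (made up for this sketch).
graph = {"S": [(1, "A"), (4, "B")], "A": [(2, "B"), (5, "G")], "B": [(1, "G")]}
h = {"S": 3, "A": 2, "B": 1, "G": 0}
succ = lambda s: graph.get(s, [])

ucs = best_first_search("S", lambda s: s == "G", succ, f=lambda g, s: g)
astar = best_first_search("S", lambda s: s == "G", succ, f=lambda g, s: g + h[s])
print(ucs, astar)  # both find S -> A -> B -> G with cost 4

With unit step costs, $f = g$ is just the depth of $n$, which recovers BFS.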

When searching for shortest paths in a graph of distances between cities, what’s an admissible heuristic?


Straight-line distance.

Why is $A^\star$ “informed” search?


Because the heuristic function $h$ encodes domain knowledge about the problem.

@Define what it means for a heuristic $h$ to be admissible.


$h$ is admissible iff for all nodes $n$, $0 \le h(n) \le h^\star(n)$.
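
A small Python sanity check of this definition (the cities, coordinates, and road costs are invented for illustration): compute $h^\star$ exactly by running Dijkstra backwards from the goal, then check $0 \le h(n) \le h^\star(n)$ with straight-line distance as $h$.

import heapq, math

# Invented map: city coordinates, and roads no shorter than the straight line.
coords = {"A": (0, 0), "B": (2, 0), "C": (2, 2), "G": (4, 2)}
roads = {"A": {"B": 2.5, "C": 3.5}, "B": {"A": 2.5, "C": 2.2},
         "C": {"A": 3.5, "B": 2.2, "G": 2.1}, "G": {"C": 2.1}}

def h_star(goal):
	"""Exact cost-to-go for every city: Dijkstra outward from the goal."""
	dist, frontier = {goal: 0.0}, [(0.0, goal)]
	while frontier:
		d, u = heapq.heappop(frontier)
		if d > dist[u]:
			continue  # stale heap entry
		for v, w in roads[u].items():
			if d + w < dist.get(v, math.inf):
				dist[v] = d + w
				heapq.heappush(frontier, (dist[v], v))
	return dist

def h(n, goal="G"):
	"""Straight-line distance: admissible because no road beats the straight line."""
	(x1, y1), (x2, y2) = coords[n], coords[goal]
	return math.hypot(x1 - x2, y1 - y2)

exact = h_star("G")
print(all(0 <= h(n) <= exact[n] for n in coords))  # True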

Can you give an example where a more accurate heuristic (in terms of being closer in absolute value to the perfect heuristic) actually causes $A^\star$ to fail to be optimal?


Consider this search problem:

Clearly the optimal solution is $\mathbf{InitialState} \to \mathbf R \to \mathbf{GoalState}$. Here are two heuristic functions, $h _ >$ and $h _ {\ll}$, which could be used with $A^\star$:

$h _ >$ leads to the solution $\mathbf{InitialState} \to \mathbf L \to \mathbf{GoalState}$ whereas $h _ \ll$ leads to $\mathbf{InitialState} \to \mathbf R \to \mathbf{GoalState}$. The problem is that the highlighted entry makes $h _ >$ not admissible.
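
One concrete instance of this phenomenon (the step costs and heuristic values here are illustrative, not those of the figure): let $c(\mathbf{InitialState}, \mathbf L) = c(\mathbf{InitialState}, \mathbf R) = 1$, $c(\mathbf L, \mathbf{GoalState}) = 3$ and $c(\mathbf R, \mathbf{GoalState}) = 2$, so the optimal cost is $3$ via $\mathbf R$. Take $h _ >$ equal to $h^\star$ everywhere except $h _ >(\mathbf R) = 5 > 2 = h^\star(\mathbf R)$, and $h _ \ll \equiv 0$. Summed over all states, $h _ >$ has absolute error $3$ against $8$ for $h _ \ll$, yet under $h _ >$ we get $f(\mathbf L) = 4 < 6 = f(\mathbf R)$: $A^\star$ expands $\mathbf L$, pops the goal with $f = 4$, and returns the cost-$4$ path, while the far less accurate $h _ \ll$ (plain UCS) returns the optimal cost-$3$ path.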

@State an important theorem about $A^\ast$ search and admissible heuristics.


$A^\star$ tree search with an admissible heuristic is optimal.

@Prove that $A^\star$ tree search with an admissible heuristic is optimal.


Suppose, for a contradiction, that in the last step of the algorithm the suboptimal goal node $G'$ has been selected.

Before this choice, there was also a node $n$ in the frontier which was on the shortest path to an optimal goal node $G$.

But then

\[\begin{aligned} f(G') &= g(G') + h(G') \\\\ &= g(G') \\\\ &> g(G) \\\\ &= g(n) + h^\star(n) \\\\ &\ge g(n) + h(n) \\\\ &= f(n) \end{aligned}\]

using admissibility twice: $h(G') \le h^\star(G') = 0$ and $h(n) \le h^\star(n)$. Hence $f(G') > f(n)$, and so $n$ would have been selected first, a contradiction.

Analyse $A^\star$ tree search with an admissible heuristic as a search strategy along the dimensions of:

  • Complete?
  • Time?
  • Space?
  • Optimal?

  • Complete? Yes, unless there are infinitely many nodes with $f(n) \le f(G)$.
  • Time? Exponential in $\delta d$, where $d$ is the depth of the solution and $\delta$ is the maximum relative error $(h^\star - h)/h^\star$.
  • Space? Also exponential in $\delta d$.
  • Optimal? Yes.

What is the main difficulty with $A^\star$ search?


Its exponential memory requirements.

Can you give an example where $A^\star$ graph search with an admissible heuristic is not optimal?


This happens when the heuristic is admissible but not consistent: graph search can expand a state reached via a suboptimal path first, and when the cheaper path to that state is generated later it is discarded because the state is already in the explored set. For instance (illustrative costs): $S \to A$ costs $4$, $S \to B$ costs $1$, $B \to A$ costs $1$ and $A \to G$ costs $3$, with $h(A) = 0$ and $h(B) = 3.5$. This is admissible ($h^\star(A) = 3$, $h^\star(B) = 4$) but inconsistent ($h(B) > c(B, a, A) + h(A) = 1$). Then $A$ is expanded via the direct edge with $f = 4$ before $B$ with $f = 4.5$, the cheaper path $S \to B \to A$ is later discarded, and the cost-$7$ solution $S \to A \to G$ is returned instead of the optimal cost-$5$ one.
@Define a consistent heuristic.


A heuristic is consistent iff:

  • For every node $n$, action $a$, and successor $n'$ of $n$ reached via $a$: $h(n) \le c(n, a, n') + h(n')$ (a form of the triangle inequality)
  • For all nodes $n$, $h(n) \ge 0$, and $h(G) = 0$ for every goal node $G$
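
A tiny Python check of this definition over an explicit graph (the graph, goal set, and $h$ values below are invented for illustration; they are the admissible-but-inconsistent heuristic from the example above):

# Invented weighted graph: state -> {successor: step cost}, plus goals and h.
edges = {"S": {"A": 4, "B": 1}, "A": {"G": 3}, "B": {"A": 1}, "G": {}}
goals = {"G"}
h = {"S": 0, "A": 0, "B": 3.5, "G": 0}

def is_consistent(edges, goals, h):
	# h must be non-negative everywhere and zero at every goal...
	if any(h[n] < 0 for n in edges) or any(h[g] != 0 for g in goals):
		return False
	# ...and satisfy the triangle inequality along every edge.
	return all(h[n] <= c + h[n2] for n in edges for n2, c in edges[n].items())

print(is_consistent(edges, goals, h))  # False: h(B) = 3.5 > c(B, a, A) + h(A) = 1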

What implication is there between admissible heuristics and consistent heuristics?


\[\text{consistent} \implies \text{admissible}\]
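
A short proof of this implication: let $(n = n_0, a_1, n_1, \ldots, a_k, n_k = G)$ be an optimal path from $n$ to a goal $G$, and unroll the triangle inequality along it:

\[\begin{aligned} h(n) &\le c(n_0, a_1, n_1) + h(n_1) \\\\ &\le c(n_0, a_1, n_1) + c(n_1, a_2, n_2) + h(n_2) \\\\ &\le \cdots \\\\ &\le \sum_{i=1}^{k} c(n_{i-1}, a_i, n_i) + h(G) = h^\star(n) \end{aligned}\]

Together with $h(n) \ge 0$, this is exactly admissibility. (The converse does not hold: an admissible heuristic need not be consistent.)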

Can you @state a theorem about $A^\star$ graph search and consistent heuristics?


$A^\star$ graph search with a consistent heuristic is optimal.

@Prove that $A^\star$ graph search with a consistent heuristic is optimal.


We first show that $A^\star$ graph search with a consistent heuristic expands nodes in a non-decreasing order of $f$ value.

Along any path $(\ldots, n, a, n', \ldots)$ of the search tree, $f$ values are non-decreasing, since

\[\begin{aligned} f(n') &= h(n') + g(n') \\\\ &= h(n') + c(n, a, n') + g(n) \\\\ &\ge h(n) + g(n) \\\\ &= f(n) \end{aligned}\]

But since $A^\star$ always removes a node with the smallest $f$ value from its frontier, it must expand nodes in a non-decreasing order of $f$ value. In particular, the first time a state is expanded it has been reached by an optimal path, so discarding later paths to explored states cannot lose the optimal solution.

Now suppose, for a contradiction, that in the last step of the algorithm the suboptimal goal node $G'$ has been selected.

Before this choice, there was also a node $n$ in the frontier which was on the shortest path to an optimal goal node $G$.

But then

\[\begin{aligned} f(G') &= g(G') + h(G') \\\\ &= g(G') \\\\ &> g(G) \\\\ &= g(n) + h^\star(n) \\\\ &\ge g(n) + h(n) \\\\ &= f(n) \end{aligned}\]

Hence $f(G') > f(n)$, and so $n$ would have been selected first, a contradiction.

Describe the $A^\star$ graph search @algorithm.


function A*GraphSearch(problem, frontier) returns (solution or failure):
	explored = {}
	Insert(RootNode(problem.InitialState), frontier)
	while the frontier is not empty:
		node = Remove(frontier) # the node with the least f value is removed
		add node.State to explored
		if problem.GoalTest(node.State):
			return node
		for each action in problem.Actions(node.State):
			child = ChildNode(problem, node, action)
			if child.State is not in explored: (*)
				Insert(child, frontier)
	return failure
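
A runnable Python version of this pseudocode (the problem representation, a dict of weighted edges plus a heuristic table, is an assumption for this sketch; the frontier is a heap keyed on $f = g + h$):

import heapq

def astar_graph_search(start, goals, edges, h):
	"""A* graph search: expand the frontier node with least f = g + h,
	skipping states already expanded (the explored set)."""
	explored = set()
	frontier = [(h[start], 0, start, [start])]  # (f, g, state, path)
	while frontier:
		_, g, state, path = heapq.heappop(frontier)
		if state in explored:
			continue  # stale duplicate entry left in the heap
		explored.add(state)
		if state in goals:
			return path, g
		for child, cost in edges[state].items():
			if child not in explored:
				g2 = g + cost
				heapq.heappush(frontier, (g2 + h[child], g2, child, path + [child]))
	return None

edges = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 1}, "G": {}}
h = {"S": 3, "A": 2, "B": 1, "G": 0}  # consistent for these costs
print(astar_graph_search("S", {"G"}, edges, h))  # (['S', 'A', 'B', 'G'], 4)

One difference from the pseudocode: duplicate frontier entries for an already-explored state are skipped lazily when popped, rather than filtered at insertion time.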

Consider the following code:

function A*GraphSearch(problem, frontier) returns (solution or failure):
	explored = {}
	Insert(RootNode(problem.InitialState), frontier)
	while the frontier is not empty:
		node = Remove(frontier) # the node with the least f value is removed
		add node.State to explored
		if problem.GoalTest(node.State):
			return node
		for each action in problem.Actions(node.State):
			child = ChildNode(problem, node, action)
			if problem.GoalTest(child.State):
				return child

			if child.State is not in explored: (*)
				Insert(child, frontier)
	return failure

What’s wrong with this?


Testing for the goal when a node is generated, rather than when it is selected for expansion, can break optimality: a cheaper path to the goal may still be waiting in the frontier. For instance (illustrative costs), if $S \to G$ costs $10$ while $S \to A$ and $A \to G$ each cost $1$ (with $h = 0$), expanding $S$ generates $G$ and immediately returns the cost-$10$ solution, whereas the correct algorithm would go on to expand $A$ and return the cost-$2$ path $S \to A \to G$.

Describe the $IDA^\star$ @algorithm, and state its advantages compared to plain $A^\star$.


  • $h$ is an admissible heuristic
function IDA*(problem) returns (goal or failure):
	root = RootNode(problem.InitialState)
	f_limit = h(root)
	
	while True:
		goal, f_limit = IDA-DFS(problem, root, f_limit)
		if goal != failure:
			return goal
		if f_limit == infinity:
			return failure

function IDA-DFS(problem, node, f_limit) returns (goal or failure, next f limit):
	f_node = g(node) + h(node)
	if f_node > f_limit:
		return (failure, f_node) # cutoff: f_node is a candidate for the next limit

	if problem.GoalTest(node.State):
		return (node, f_limit)

	next_f_limit = infinity

	for each action in problem.Actions(node.State):
		succ = ChildNode(problem, node, action)
		goal, new_f_limit = IDA-DFS(problem, succ, f_limit)

		if goal != failure:
			return (goal, new_f_limit)

		# track the smallest f value seen beyond the current limit
		next_f_limit = min(next_f_limit, new_f_limit)

	return (failure, next_f_limit)

Compared to plain $A^\star$: like iterative deepening, its memory use is only linear in the depth of the solution rather than exponential, and it avoids the overhead of keeping a sorted queue of nodes.

What does it mean for heuristic $h _ 1$ to dominate heuristic $h _ 2$?


\[h_2(n) \le h_1(n) \le h^\star(n)\]

for all $n$: $h_1$ is everywhere at least as large as $h_2$, while both remain admissible.

How can you combine two admissible heuristics $h _ 1$ and $h _ 2$?


\[h(n) := \max(h_1(n), h_2(n))\]

This combined heuristic is admissible (each of $h _ 1$, $h _ 2$ is at most $h^\star$, so their maximum is too) and dominates both $h _ 1$ and $h _ 2$.

How can you derive an admissible heuristic from a relaxed problem?


Take the heuristic to be the exact optimal solution cost of a relaxed version of the problem, i.e. one with fewer restrictions on the actions. Every solution to the original problem is also a solution to the relaxed problem, so the relaxed optimal cost is at most the true optimal cost, making the heuristic admissible. For example, in the 8-puzzle, allowing tiles to move to any adjacent square (occupied or not) yields the Manhattan-distance heuristic.

How can you derive an admissible heuristic from subproblems?


  • Subproblems are parts of the overall problem that must be solved on the way to the goal, e.g. getting a specific subset of tiles into place in the 8-puzzle
  • The optimal cost of solving a subproblem is a lower bound on the optimal cost of the full problem, so it gives an admissible heuristic; this is a special kind of relaxation

What is a disjoint pattern database?


Storing the exact solution cost of every configuration of several disjoint subproblems (patterns) of a problem, then computing an admissible heuristic for a given state by summing the stored optimal costs of its pattern configurations. Because the patterns are disjoint, each move is counted in at most one table, so the sum remains admissible.
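
A compact, runnable Python sketch of the idea on a toy $2 \times 2$ sliding puzzle (the puzzle, the patterns, and the table layout are illustrative assumptions, not from the notes): each table stores, per configuration of its pattern's tiles, the minimum number of moves of those tiles needed to reach the goal, and disjointness lets us sum the lookups.

from heapq import heappush, heappop

# Toy 2x2 sliding puzzle: board positions 0..3, tiles 1..3, 0 = blank.
GOAL = (1, 2, 3, 0)
ADJ = {0: (1, 2), 1: (0, 3), 2: (0, 3), 3: (1, 2)}

def neighbours(state):
	"""Yield (moved tile, successor state) for each legal slide into the blank."""
	b = state.index(0)
	for p in ADJ[b]:
		s = list(state)
		s[b], s[p] = s[p], s[b]
		yield state[p], tuple(s)

def build_pdb(pattern):
	"""Backward Dijkstra from GOAL where only moves of pattern tiles cost 1.
	The table is keyed on the projected state: the positions of the pattern's tiles."""
	project = lambda s: tuple(s.index(t) for t in pattern)
	table, seen, frontier = {}, {}, [(0, GOAL)]
	while frontier:
		d, s = heappop(frontier)
		if seen.get(s, 99) <= d:
			continue  # already settled at an equal or lower cost
		seen[s] = d
		table[project(s)] = min(table.get(project(s), 99), d)
		for tile, s2 in neighbours(s):
			heappush(frontier, (d + (1 if tile in pattern else 0), s2))
	return project, table

# Disjoint patterns {1} and {2, 3}: each table counts only moves of its own
# tiles, so summing the two lookups never double-counts a move.
pdbs = [build_pdb((1,)), build_pdb((2, 3))]
h = lambda s: sum(table[project(s)] for project, table in pdbs)
print(h((3, 0, 2, 1)))  # -> 5, which here equals the true optimal solution length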



