# Heuristic Search & Optimization

## Dynamic Programming

**Dynamic programming** is a bottom-up procedure that stores sub-results (memoization) so that they can be reused later to compute the optimal solution of a given problem. The general procedure is:

- Break the global problem into **similar sub-problems** and characterize their structure in a simple way.
- **Recursively** define the value of an optimal sub-problem solution.
- Compute the values of optimal sub-problems in a **bottom-up** fashion.
- **Construct** an optimal solution from the computed intermediate results.
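The four steps above can be sketched with a classic illustrative problem of our choosing (rod cutting, not taken from the notes): given prices for rod pieces of length 1..n, find the maximum revenue obtainable by cutting a rod of length n.

```python
def rod_cutting(prices, n):
    """Bottom-up dynamic programming with stored sub-results."""
    # best[k] stores the optimal revenue for a rod of length k
    best = [0] * (n + 1)
    first_cut = [0] * (n + 1)  # remembers choices to rebuild the solution
    for length in range(1, n + 1):
        for cut in range(1, length + 1):
            candidate = prices[cut] + best[length - cut]
            if candidate > best[length]:
                best[length] = candidate
                first_cut[length] = cut
    # step 4: construct an optimal solution from the stored choices
    cuts, k = [], n
    while k > 0:
        cuts.append(first_cut[k])
        k -= first_cut[k]
    return best[n], cuts

# prices[i] = price of a piece of length i (index 0 unused)
prices = [0, 1, 5, 8, 9]
revenue, cuts = rod_cutting(prices, 4)   # revenue 10 via two cuts of length 2
```

Note how each `best[length]` is computed only once and reused by every longer sub-problem, which is exactly the memoization the notes describe.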

## Linear Programming

A linear programming problem is in **canonical form** if:

- The objective function is in form of **maximization**.
- All the constraints are **inequalities of the form \( \le \)**.
- All the decision variables are **non-negative**.

The algebraic representation of the canonical form would be:

\[ \max Z = c^\mathsf{T} x \]

\[ Ax \le b, \quad x \ge 0 \]

where \( A_{m \times n} \) is the **matrix of technological coefficients** of \( m \) inequalities and \( n \) decision variables; \( x_{n \times 1} \) is the column vector of **decision variables**; \( b_{m \times 1} \) is the vector of **resources**; and \( c_{n \times 1} \) is the vector of **costs or benefits**.
A linear programming problem is in **standard form** if:

- The objective function is in form of **maximization or minimization**.
- All the constraints are **equations**.
- All the decision variables are **non-negative**.
- The vector of **constant resources** does **not contain negative** components.

The algebraic representation of the standard form would be:

\[ \max / \min Z = c^\mathsf{T} x \]

\[ Ax = b, \quad x \ge 0 \]
We must take into account some **transformations** that can be useful when solving linear programming problems:

- A minimization problem can be converted to a maximization one by changing the sign of the objective function:

\[ \min Z = \sum\limits_{j=1}^n c_j x_j \; \longrightarrow \; \max Z' = -\sum\limits_{j=1}^n c_j x_j \]

- We can change the direction of an inequality by multiplying both sides by \( -1 \):

\[ \sum\limits_{j=1}^n a_{ij} x_j \ge b_i \longrightarrow - \sum\limits_{j=1}^n a_{ij} x_j \le -b_i \]

- Any inequality can be converted into an equality by adding or subtracting a non-negative variable (**slack variable**) with a null coefficient (zero) in the objective function:

\[ \sum\limits_{j=1}^n a_{ij} x_j \le b_i \triangleq \sum\limits_{j=1}^n a_{ij} x_j + s_i = b_i \]

\[ \sum\limits_{j=1}^n a_{ij} x_j \ge b_i \triangleq \sum\limits_{j=1}^n a_{ij} x_j - s_i = b_i \]
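The slack-variable transformation can be written as a small helper (a sketch of our own, not from the notes): one slack column per constraint row, with sign +1 for \( \le \) rows and -1 for \( \ge \) rows.

```python
def add_slack_variables(A, senses):
    """Augment A so that constraints Ax (<=, >=) b become equalities.

    senses[i] is '<=' or '>='; a slack variable is added (+1) for
    '<=' rows and subtracted (-1, a surplus variable) for '>=' rows.
    """
    m = len(A)
    augmented = []
    for i, row in enumerate(A):
        slack = [0] * m       # one new column per constraint
        slack[i] = 1 if senses[i] == '<=' else -1
        augmented.append(list(row) + slack)
    return augmented

A = [[2, 1],
     [1, 3]]
aug = add_slack_variables(A, ['<=', '>='])
# aug == [[2, 1, 1, 0], [1, 3, 0, -1]]
```

The slack columns get a zero coefficient in the objective vector \( c \), matching the transformation above.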

### Graphical Resolution

- Procedure:
  - Represent the problem in **canonical form** (though it is not strictly necessary).
  - Draw a **Cartesian coordinate system** where each axis represents a decision variable.
  - Represent each **constraint as a region** (that may be unbounded).
  - The intersection of all regions is the **feasible region**, the solution space \( F \).
  - **Evaluate the objective function** at the extreme points and pick the one that maximizes the objective function.
- If the problem does not have a unique optimal solution, then it is said to have **alternative optimal solutions**. This is easily detected with the graphical method: the **isoprofit**/**isocost** line is parallel or identical to one of the constraints whose extreme points are optimal solutions.
- A problem is unfeasible if and only if the **region of feasible solutions is empty**: \( F = \emptyset \).
- This method is only useful with a maximum of 3 decision variables.
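The graphical procedure can be mimicked numerically (our sketch, with example data of our own): extreme points of \( F \) lie at intersections of two active constraint boundaries, so we enumerate boundary-line pairs, keep the feasible intersections, and evaluate \( Z \) at each.

```python
from itertools import combinations

# maximize Z = 3x + 5y  s.t.  x <= 4, 2y <= 12, 3x + 2y <= 18, x, y >= 0
lines = [  # each (a1, a2, b) is a boundary line a1*x + a2*y = b
    (1, 0, 4), (0, 2, 12), (3, 2, 18), (1, 0, 0), (0, 1, 0)
]
constraints = [(1, 0, 4), (0, 2, 12), (3, 2, 18)]

def intersect(l1, l2):
    """Intersection point of two lines, or None if parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def feasible(p):
    x, y = p
    return x >= -1e-9 and y >= -1e-9 and all(
        a * x + b * y <= c + 1e-9 for a, b, c in constraints)

vertices = [p for l1, l2 in combinations(lines, 2)
            if (p := intersect(l1, l2)) and feasible(p)]
best = max(vertices, key=lambda p: 3 * p[0] + 5 * p[1])
# best == (2.0, 6.0), where Z = 36
```

The candidate vertices here are \( (0,0) \), \( (4,0) \), \( (0,6) \), \( (4,3) \) and \( (2,6) \); evaluating \( Z \) at each picks \( (2,6) \).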

### Simplex Method

- Procedure:
  - Convert the problem to **standard form** and add the necessary **artificial variables** (with an arbitrarily large penalty coefficient in \( Z \)), in order to ensure an initial solution.
  - Compute the **basic variables**:
    - \( B_i = \{x_a, x_b, x_c, \dots\} \) with as many vectors as constraints (so that the basis is a square matrix)
    - \( x_{B_i} = B_i^{-1} b \)
  - Select the entering variable by using the **entering rule**:
    - \( y_n = B_i^{-1} a_n \)
    - \( z_n - c_n = c_{B_i}^\mathsf{T} y_n - c_n \)
    - The entering variable is the \( n \) that achieves \( \min\{z_n - c_n\} \) over all non-basic \( n \)
  - Select the leaving variable by using the **leaving rule**:
    - \( \theta = \min \{\frac{x_{B_i}}{y_n}\} \), where \( n \) is the entering variable column of the basis, taking the ratio only over rows with positive components of \( y_n \), **as long as** \( \theta > 0 \).
  - Once the iterations are over, the solution is given by \( x^{*} = x_{B_f} \) and the value of the objective function by \( z^{*} = z_f \).
- **Interpretation of the results**:
  - If a **slack variable is part of \( x_f \)**, there is an excess amount (if the variable's constraint was a lower bound) or a deficit (if it was an upper bound) in the resources.
  - An **artificial variable in \( x_f \)** points out that the problem is unfeasible.
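The simplex method pivots between basic feasible solutions; as a brute-force sanity check (our own sketch, not the pivoting algorithm itself) we can enumerate every candidate basis \( B \), solve \( B x_B = b \) exactly, and keep the feasible one with the best objective value. The example problem is ours, already in standard form with slack columns.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(B, b):
    """Gaussian elimination over exact Fractions; None if singular."""
    n = len(B)
    M = [[Fraction(v) for v in row] + [Fraction(b[i])]
         for i, row in enumerate(B)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None           # singular basis matrix
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [u - M[r][col] * v for u, v in zip(M[r], M[col])]
    return [row[-1] for row in M]

def best_basic_solution(A, b, c):
    """Enumerate bases; return (z*, {column: value}) of the best BFS."""
    m, n = len(A), len(A[0])
    best = None
    for cols in combinations(range(n), m):
        B = [[A[i][j] for j in cols] for i in range(m)]
        xB = solve_square(B, b)
        if xB is None or any(v < 0 for v in xB):
            continue              # singular or infeasible basic solution
        z = sum(c[j] * v for j, v in zip(cols, xB))
        if best is None or z > best[0]:
            best = (z, dict(zip(cols, xB)))
    return best

# max Z = 3x1 + 5x2; columns 2..4 are the slack variables
A = [[1, 0, 1, 0, 0],
     [0, 2, 0, 1, 0],
     [3, 2, 0, 0, 1]]
b = [4, 12, 18]
c = [3, 5, 0, 0, 0]
z_star, x_star = best_basic_solution(A, b, c)   # z* = 36 at x1 = 2, x2 = 6
```

This is exponential in the number of columns, which is exactly why the entering/leaving rules above matter: they walk only the promising bases.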

### Duality

- A linear programming problem is in **symmetric form (of maximization)** if:
  - The objective function is in form of **maximization**.
  - All the constraints are **inequalities in the form of** \( \le \).
  - All the decision variables are **non-negative**.
  - In case it is a minimization problem, all constraints must be in \( \ge \) form.
If a **primal problem** is

\[ \max Z = c^\mathsf{T} x \]

\[ Ax \le b, \quad x \ge 0 \]

then its **dual problem** is

\[ \min W = b^\mathsf{T} x' \]

\[ A^\mathsf{T} x' \ge c, \quad x' \ge 0 \]

If the problem is in symmetric form and has an optimal solution associated with a basis \( B \), then \( x'^\mathsf{T} = c^\mathsf{T}_B B^{-1} \) is an optimal solution of the dual problem.

- The **economic interpretation** states: the dual variable \( x^{*}_{i} \) indicates the contribution per unit of the \( i^{th} \) resource \( b_i \) to the variation in the current optimal value \( z^{*} \) of the objective.
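Forming the dual is mechanical, which a small helper of our own can make explicit: transpose \( A \), swap the roles of \( b \) and \( c \), and flip \( \max/\le \) into \( \min/\ge \).

```python
def dual_of(A, b, c):
    """Dual of the symmetric primal: max c^T x  s.t.  A x <= b, x >= 0.

    Returns (A^T, c, b) for: min b^T x'  s.t.  A^T x' >= c, x' >= 0.
    """
    A_T = [list(col) for col in zip(*A)]   # transpose of A
    return A_T, c, b

# primal: 3 constraints, 2 variables
A = [[2, 1], [1, 3], [1, 0]]
b = [10, 15, 4]
c = [4, 5]
A_dual, rhs, obj = dual_of(A, b, c)
# dual: min 10x'1 + 15x'2 + 4x'3  s.t.  A^T x' >= [4, 5]
```

Note that the dual has one variable per primal constraint and one constraint per primal variable.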

### Modelling

- There are some well-known problems for which patterns already exist.
- The **demand problem**, where \( a_i \) is the capacity of origin \( i \), \( b_j \) is the demand of destination \( j \), and \( c_{ij} \) is the unitary cost of delivering from \( i \) to \( j \), can be modelled as:
  - Objective function: \( \min Z = \sum \limits_{i=1}^n \sum\limits_{j=1}^m c_{ij} x_{ij} \)
  - The capacity is not exceeded: \( \sum \limits_{j=1}^m x_{ij} \le a_i, \; \forall i \)
  - The demand is satisfied: \( \sum \limits_{i=1}^n x_{ij} \ge b_j, \; \forall j \)
  - The variables are non-negative integers: \( x_{ij} \in \mathbb{Z}^+ \)
- The **assignment problem**, where \( n \) people must be assigned to \( m \) tasks in an optimal way, given the unitary cost \( c_{ij} \) of assigning person \( i \) to task \( j \). This problem uses boolean variables, where \( x_{ij} \in \{0, 1\} \) depending on whether person \( i \) has been assigned to task \( j \) or not.
  - Objective function: \( \min Z = \sum \limits_{i=1}^n \sum\limits_{j=1}^m c_{ij} x_{ij} \)
  - One task is assigned to each person: \( \sum \limits_{j=1}^m x_{ij} = 1, \; \forall i \)
  - One person is assigned to each task: \( \sum \limits_{i=1}^n x_{ij} = 1, \; \forall j \)
- The **network flow problem**, where we have a graph \( G=(V, E) \) and a cost function, and we must find the cheapest path from one vertex \( x_1 \) to another \( x_n \). Here we also use boolean variables, where \( x_{ij} \in \{0, 1\} \) depending on whether we take the edge that connects \( i \) and \( j \) or not.
  - Objective function: \( \min Z = \sum \limits_{i=1}^n \sum\limits_{j=1}^n c_{ij} x_{ij} \)
  - Only one edge leaves the origin: \( \sum \limits_{j=1}^n x_{1j} = 1 \)
  - Only one edge arrives at the goal: \( \sum \limits_{i=1}^n x_{in} = 1 \)
  - There are as many entering edges as leaving ones in every intermediate node: \( \sum \limits_{i=1}^n x_{il} - \sum \limits_{j=1}^n x_{lj} = 0, \; \forall l \in V \setminus \{x_1, x_n\} \)
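The assignment model can be checked by brute force on small instances (our illustration, with a made-up cost matrix): each permutation \( p \) assigns person \( i \) to task \( p_i \), which is exactly an \( x_{ij} \) matrix with a single 1 per row and per column.

```python
from itertools import permutations

def solve_assignment(cost):
    """Exhaustive search over all n! one-to-one assignments."""
    n = len(cost)
    best_perm = min(permutations(range(n)),
                    key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return best_perm, sum(cost[i][best_perm[i]] for i in range(n))

cost = [[4, 2, 8],    # cost[i][j] = cost of person i doing task j
        [4, 3, 7],
        [3, 1, 6]]
perm, z = solve_assignment(cost)   # optimal total cost is 12
```

This is \( O(n!) \), so it only serves to validate the LP/IP formulation on toy data; the formulation itself scales far better with a proper solver.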

## Logical Satisfiability

A propositional formula is in **Conjunctive Normal Form (CNF)** if it has the form:

\[ \bigwedge\limits_{i=1}^{m}\bigvee\limits_{j=1}^{n_i}\ell_{ij} \]

- A literal \( \ell \) consists of an atom (one variable) in its asserted (\( x \)) or negated (\( \bar{x} \)) form.
- A literal \( \ell \) is **pure** in \( F \) if \( \bar{\ell} \) does not appear in any clause of \( F \).
- A **tautology** is a propositional formula that is true for any instantiation of its atomic expressions.
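The purity test is a one-liner under the usual DIMACS-style encoding (our sketch): a clause is a set of literals, with variable \( x_i \) encoded as \( +i \) and \( \bar{x}_i \) as \( -i \).

```python
def pure_literals(formula):
    """Return the literals of F whose complement appears in no clause."""
    literals = set().union(*formula)
    return {l for l in literals if -l not in literals}

# F = (x1 v ~x2) ^ (x1 v x3) ^ (~x3 v ~x2)
F = [frozenset({1, -2}), frozenset({1, 3}), frozenset({-3, -2})]
# x1 is pure (no ~x1 anywhere); x2 only ever appears negated, so ~x2 is pure
```

Pure literals can safely be set true, satisfying and removing every clause that contains them.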

### Resolution Method

**Davis-Putnam Algorithm**:

- Select a literal \( \ell \in F \).
- Apply \( Res\{F, \ell\} \) and write down the variable used and the clauses involved, e.g.:

\[ Res\{(p\vee r), (\bar{p} \vee s) \} = (r\vee s) \]

\[ Res\{p, \bar{p}\} = \{\emptyset\} \]

- If it results in an empty clause \( \{\emptyset\} \), then \( F \) **is not satisfiable**.
- If it results in \( F = \emptyset \), end the procedure; otherwise go back to step 1.
- Consider the list of variables and clauses in inverse order, deciding whether \( \top \) or \( \bot \) for each of them.
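One elimination step of the procedure above can be implemented directly (our implementation of the rule, using the signed-integer clause encoding): resolve every clause containing \( \ell \) with every clause containing \( \bar{\ell} \), dropping the complementary pair.

```python
def resolve_on(formula, lit):
    """Eliminate variable |lit| from F by resolution on lit."""
    with_l   = [c for c in formula if lit in c]
    with_not = [c for c in formula if -lit in c]
    rest     = [c for c in formula if lit not in c and -lit not in c]
    resolvents = {frozenset((c1 - {lit}) | (c2 - {-lit}))
                  for c1 in with_l for c2 in with_not}
    return rest + list(resolvents)

# Res{(p v r), (~p v s)} = (r v s), with p = 1, r = 2, s = 3:
out = resolve_on([frozenset({1, 2}), frozenset({-1, 3})], 1)
# out == [frozenset({2, 3})]
```

Resolving `[frozenset({1}), frozenset({-1})]` on `1` yields `[frozenset()]`, the empty clause that signals unsatisfiability.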

**Davis-Putnam-Logemann-Loveland (DPLL) Algorithm**: consists of applying reduction on a formula \( F \) in the form of a tree. The reduction function takes a variable and explores the possibilities of having it asserted or negated while the whole formula remains true.
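The branching tree can be sketched compactly (our own minimal version, without the unit-propagation and pure-literal refinements of full DPLL): pick a literal, reduce \( F \) under each of its two truth values, and recurse.

```python
def dpll(clauses, assignment=()):
    """Return a satisfying tuple of literals, or None if unsatisfiable."""
    if any(len(c) == 0 for c in clauses):
        return None                 # empty clause: this branch fails
    if not clauses:
        return assignment           # every clause satisfied
    lit = next(iter(clauses[0]))    # branch on a literal of F
    for choice in (lit, -lit):      # try it asserted, then negated
        # drop satisfied clauses, strip the falsified literal elsewhere
        reduced = [c - {-choice} for c in clauses if choice not in c]
        result = dpll(reduced, assignment + (choice,))
        if result is not None:
            return result
    return None

sat = dpll([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2, 3})])
unsat = dpll([frozenset({1}), frozenset({-1})])   # unsat is None
```

Each recursive call is one node of the reduction tree the notes describe; a branch closes when it produces an empty clause, and succeeds when no clauses remain.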

## Constraint Processing

- A **constraint network** \( R = (X, D, C) \) consists of a finite set of variables (\( X \)) defined over domains (\( D \)) that contain the possible values for each variable, and a set of constraints (\( C \)).
- We say that \( x_i \) is **arc consistent** with \( x_j \) if and only if for each value \( a_i \) there exists a value \( a_j \) such that \( (a_i, a_j) \in R_{ij} \). If a value does not participate in the relation, it can be eliminated. Arc consistency is not a bidirectional relationship.
- A constraint \( R_{ij} \) is **path consistent** relative to variable \( x_k \) if and only if for every pair \( (a_i, a_j) \in R_{ij} \) there is a value \( a_k \in D_k \) such that \( (a_i, a_k) \in R_{ik} \) and \( (a_k, a_j) \in R_{kj} \). Path consistency is indeed a bidirectional relationship.
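The elimination step behind arc consistency is the classic "revise" operation, sketched here with example data of our own: delete from \( D_i \) every value \( a_i \) that has no supporting \( a_j \) in \( D_j \).

```python
def revise(Di, Dj, relation):
    """Restrict D_i to the values supported by some a_j in D_j."""
    return {a for a in Di if any((a, b) in relation for b in Dj)}

# x_i, x_j over {1, 2, 3} under the constraint x_i < x_j
R = {(a, b) for a in (1, 2, 3) for b in (1, 2, 3) if a < b}
Di = revise({1, 2, 3}, {1, 2, 3}, R)   # 3 is eliminated: no x_j > 3 exists
```

Note the asymmetry the notes point out: revising \( D_j \) against \( D_i \) under the same constraint would instead eliminate the value 1 from \( D_j \).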

## Search

- A **problem space** is composed of:
  - The **states set**, which contains every possible state of the search problem.
  - The **operators set**, which contains every single action that can be performed on the problem.
  - The **initial state**.
  - The **goal**.
- There are two important factors:
  - The branching factor, \( b \), which is a property of the state graph.
  - The depth, \( d \), which is a property of the problem.
- We can perform several algorithms:
  - **Breadth First Search (BFS)**: searches through all nodes of a certain depth before exploring deeper ones. Its behavior is best understood if we think of adding nodes to a queue to be expanded. It is complete, optimal (with equal costs among operators), and efficient if goals are close to the root, but it consumes exponential memory.
  - **Depth First Search (DFS)**: expands nodes depth-wise, and thus it needs a backtracking technique if the solution is not found, and even a depth limitation if the branch being expanded is infinite. By itself, it is not complete and not optimal, but it is efficient, as long as there are no cycles, when the goals are far from the root.
  - **Iterative DFS**: performs a DFS with a depth limitation that is incremented each time the solution is not found within the current nodes. If we set the depth increment to 1, it basically becomes a BFS. It is complete, and optimal if the increment is 1, but it can generate a lot of duplicated nodes.
  - **Depth First Branch-and-Bound**: performs a DFS until it finds a solution or the cost of all paths exceeds the cost of the best solution found so far. Every time a solution is found, a new bound is set at its cost.
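The queue-based view of BFS can be sketched directly (our example graph): paths enter a FIFO queue, so an entire depth level is expanded before the next one, which is what makes the first goal found a shallowest one.

```python
from collections import deque

def bfs(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    queue = deque([[start]])        # FIFO queue of partial paths
    visited = {start}
    while queue:
        path = queue.popleft()      # shallowest unexpanded path first
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E']}
path = bfs(graph, 'A', 'E')   # ['A', 'B', 'D', 'E']
```

Swapping the `popleft()` for a `pop()` (LIFO) turns this into the DFS variant described next, with all the completeness caveats the notes list.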

- If we have no information, we perform a blind search, but we may also have perfect information or a **heuristic**, which is partial knowledge about the problem/domain that permits solving the problem efficiently in that domain.
- We can find heuristics by solving simplified models of the problem (**constraint relaxation**), adding constraints, using a probabilistic estimation, or reasoning by analogy or metaphor.
- A heuristic is **admissible** if it never overestimates the real cost of reaching a goal.
- Some **informed search algorithms** are:
  - **Hill Climbing**: chooses the next node to expand based on which one has the best heuristic value. It is a greedy method, so it can run into problems like local maxima or minima, plateaus, ridges… So it is not complete. It is useful and efficient with consistent heuristic functions, and we can mitigate its problems using backtracking or random restarts.
  - **Beam Search**: works like Hill Climbing, although it expands \( k \) nodes each iteration. Widening the window improves the chances of escaping from plateaus and of finding shorter paths. Usually, the higher the value of \( k \), the better the solutions found, although this is not always the case.
  - **Best First Search (BFS)**: always expands the most promising node according to a given rule at a given time, ordering the nodes by that rule in a sorted queue to be expanded.
  - **A\***: a best-first algorithm that uses the function \( f(n) = g(n) + h(n) \) to evaluate nodes, where \( g(n) \) is the cost of the travelled arcs up to node \( n \) and \( h(n) \) is the value of the heuristic for that node. It is complete and admissible (as long as the successors of the nodes are finite and the heuristic is admissible too), and a more informed heuristic will expand fewer nodes than a poorly informed one.
  - **IDA\***: applies the A* evaluation function in an iterative-deepening way, using \( \eta = \min \{f(i)\} \) over the pruned nodes as the next upper bound when searching for a solution. It is complete and admissible; it can be time-costly (exponential), although its memory complexity is linear.
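A* as described above fits in a few lines with a priority queue (our sketch; the weighted graph and heuristic values are made-up example data, chosen so that \( h \) never overestimates and is therefore admissible).

```python
import heapq

def a_star(graph, h, start, goal):
    """Best-first search ordered by f(n) = g(n) + h(n)."""
    # priority queue of (f, g, node, path); g is the travelled cost
    frontier = [(h[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, w in graph.get(node, []):
            g2 = g + w
            if g2 < best_g.get(nxt, float('inf')):   # better route to nxt
                best_g[nxt] = g2
                heapq.heappush(frontier,
                               (g2 + h[nxt], g2, nxt, path + [nxt]))
    return None, float('inf')

graph = {'S': [('A', 1), ('B', 4)], 'A': [('B', 2), ('G', 6)],
         'B': [('G', 2)]}
h = {'S': 4, 'A': 3, 'B': 2, 'G': 0}   # never overestimates: admissible
path, cost = a_star(graph, h, 'S', 'G')   # S-A-B-G with cost 5
```

With \( h \equiv 0 \) this degrades to uniform-cost search; the more informed \( h \) above prunes the direct but costlier routes S-B-G and S-A-G.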