# Fundamentals of Constraint Processing

## Table of Contents

- Lecture 2
- Lecture 3
- Lecture 4
- Lecture 5
- Lecture 6
- Lecture 7
- Lecture 8
- Lecture 9
- Lecture 10
- Lecture 11
- Lecture 12: Phase Transitions
- Lecture 13: Back-tracking: Continued
- Lecture 14: Back-tracking: Continued
- Lecture 14: Forward Steps
- Lecture 15: Forward Steps, Continued
- Lecture 15: Intelligent Backt Tracking – Theoretical Eval
- Lecture 16: Theoretical Eval, Continued
- Lecture 17: Search orders in CSPs
- Lecture 18: Ordering Heuristics
- Guest Speaker: Philip Walton, Workday
- Lecture 19: Variable Orderings, continued
- Lecture 20: Variable Orderings, continuedg
- Lecture 21: Filtering For Constraints of Difference
- Lecture 22: CSPs and Relational DB
- Lecture 23: CSPs and Relational DBs, continued
- Lecture 24: Path Consistency and Global Consistency
- Lecture 25: Path Consistency and Global Consistency – continued
- Lecture 26: Consistency properties
- Lecture 27: Consistency properties, continued
- Lecture 28: Consistency properties, continued
- Lecture 28: Local Search
- Lecture 29: Local Search, Continued
- Lecture 30: Local Search, continued
- Lecture 30: Structure-based methods
- Lecture 31: Structure-based methods, continued
- Lecture 32: Structure-based methods, continued
- Lecture 33: Structure-based methods, continued
- Lecture 34: Structure-based methods, continued
- Lecture 35: Structure-base methods, continued
- Lecture 26: More on Constraint Consistency
- Lecture 27: Still more on Constraint Consistency
- Lecture 28: More Constraint Consistency
- Lecture 29: Extensions to BT Search

## Lecture 2

### Resources

- check website – has many, many links
- conferences
- journals
- include logic programming and CSP
- Linear Programming, Integer Programming, Mixed Integer Programing, Dynamic Programming

- journals are also pretty big – and tend to be easier to get into
- many good ways to get jobs with this

### Constraint Satisfaction 101

- examples
- generating optimal decisions
- have unary, bynary, n-ary constraints
- often have many choices, but restrictions can make it easier to make the right decision

- generating optimal decisions
- modeling constraints remains an art
- sets of variables with sets of choices, and aset of constraints restricting combinations of values
- applications
- radio resource management
- databases
- temporal and spatial reasoning
- allocation
- design/config
- graphics, visualization, interfaces
- verification and engineering
- HCI and decision support
- molecular biology
- analytical chemistry
- robotics, machine vision, computational linguistics
- transportation
- qualitative and diagnostic reasoning

- constraint processing is modeling, plus inference and search
- modeling – variables, domains, constraints, query
- inference – enforcing consistency (filtering) and propogation (AC, PC)
- search – systematic (chronological backtracking, with or without intelligence, etc) or non-systemactic (hill climbing, taboo, etc)

- Real Example
- Netherlands have dense network
- wanted more robust schedule that’s more effective, increasing profit
- used OR and AI

- Singapore
- large container hub
- has custom, special loading built using CP

- CP is:
- solving decision problems
- with arbitrary constraints
- providing concise and high-level feedback about alternatives and conflicts

- Very Practical

### Defining a problem

- has a given and a question
- given is a set of objects, the relations, etc
- question ex: find \(x\) such that condition \(y\) is satisfied
- these can be simple or advanced

- a problem must always, always, always have a question!
- Given \(\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})\)
- \(\mathcal{V} = \{V_1, V_2, \ldots, V_n\}\) Variables
- \(\mathcal{D} = \{D_{V_1}, D_{V_2}, \ldots, D_{V_n}\}\) Domains for Variables
- Constrants \(\mathcal{C} = \{C_1, C_2, \ldots, C_l\}\) where \(C_{V_i,V_j,\ldots,V_k} = \{(x,y,\ldots,z)\} \subseteq D_{V_1} \times D_{V_2} \times \ldots \times D_{V_n}\)

- Query whether we can find a value for each variable such that all constraints are satisfied, queries can include weights (making it a Constrained Optimization Problem)
- Various levels of consistency
- AC
- GAC
- SAC
- SGAC

- Sudoku can be done with consistency, but whether or not search is required for all
- find solution – decisions
- find sumber of, all solutions – counting
- find set of constraints that can be removed so that a solution exists – optimization
- specials
- binary CSPs have only two variables
- restricted to \(\{0, 1\}\), boolean CSP
- finite, discrete domains, enumeration can work
- continuous domains require sophisticated algebraic techniques are needed – consistency techniques are then used on domain bounds

## Lecture 3

- always a given \(\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})\)
- Constraints are relations
- relations are a subset of the cartesian product of \(n \geq 2\) sets
- relations can be represented as a table, which are refered to as allowed tuples, or supports, which is a table constraint, being an enumerated
- extension in a set with all members enumerated
- conflicts are the set of tuples that are no good
- Constraints can also be in intention – they’re defined in terms of a logical function, equivalent to extension, but can be easier to understand
- A check function should always be a simple implementation – to build with them

- Code must always be well structured!

## Lecture 4

- \(P = (V, D, C)\)
- \(V\) variables
- \(D\) domains
- \(C\) constraints

- Constraints are defined as \(C_1 = \langle\mathrm{scope}(C_1), \mathrm{rel}(C_1)\rangle\)
- Scope is the variables it discusses – the set of variables over which the constraint is defined
- relation is an abstract relation – may be in extension (listed as either supports or conflicts) or intension (roughly as a function or boolean expression)
- Also have arity, the cardinality of the scope
- universal constraints allow any member of the relevant cartesian product, but may be induced so as to be something else – equivalent to know constraint, and is binary alone

- VVP – Variable Value Pair
- make sure to implement check function, and to do so in a very independent manner
- keep track of a number of constraint checks, always incrementing for every check
- will be writing abscom to parse
- Graph representation
- Macro representation
- Variables are represented as nodes (vertices)
- domains as node labels
- constraints as arcs (edges) between nodes
- hypergraphs allow for special constraint nodes, or use a bubble to connect them instead – the hyperedges

- relations in intension are defined by set-bulider notation
- used when not practical/possible to list all tuples
- define types/templates of common constraints
- linear constraitns
- all-diff constraints
- at most
- TSP-constraints
- cycle-constraints

- constraints implemented
- in extension
- set of tuples (list/table)
- binary matrix (bit-matrix)
- multi-valued decision diagrams

- in intension
- constrained decision diagrams
- predicate functions

- in extension

## Lecture 5

- examples of modeling
- can have temporal csp
- graph or map coloring – requires 4
- resource allocation – airline trip planning, list coloring
- product configuration – allowed components and what they require/forbid
- logic puzzles

- Constraint types
- algebraic constraints
- constraitns of bounded difference
- coloring
- mutual exclusion
- difference constraints
- arbitrary constraints, must be made explicit

- Databases
- Join in DB is a CSP
- View materialization is a CSP

- Interactive systems
- data-flow
- spreadsheets
- graphical layout systems and animation
- graphical user interfaces

- Molecular biologiy
- threading and similar

### Formal characterization

- macrostructures
- networks/constraint graphs, \(G = (V, E)\), may be hypergraphs

- Micro-structure – includes every possible Variable Value Pair
- VVPs are linked iff are a support and are consistent
- cliques of size \(n\) are possible solutions
- however, finding cliques is \(NP\) complete

- Co-microstructure – includes every possible VVP
- VVPs are linked iff they are a no-good
- cliques of size \(n\) prevent a given solution

- Complexity
- decision problem
- NP-complete by reduction from 3SAT
- can be verified in polynomial time, but solution is not generated in polynomial time

### Proof of CSP into SAT

- CSP has Variables, Domains, Constraints
- Every VVP is a Boolean variable in SAT
- Requires a clause for each variable, a disjunction for each VVP, to be added to the conjunction of clauses, along with a disjunction of not a or not b of each diffenrent VVP as pairs
- Constraints are then mapped to SAT

## Lecture 6

### Solving a CSP

- by search
- constructive, systematic, exhaustive – involves enumerating all possible VVPs
- iterative repair, local search – pick an assignment and change unsatisfied constraints until all constraints are satisfied
- Before start
- consider the model/forumaltion – controls the size of the search space (defining a model is an art)
- pre-process
- i.e. constraint filtering/propagation, consistency checking – reduce the size of search space

- Constraint Checking
- arc consistency
- removes individual possible elements from the domains of either element
- Revise Function (B, C):
- \(\forall x \in D_B\)
- if support(\(x\), \(D_c\))
- break

- else
- \(D_B \gets D_B \setminus \{x\}\)

- if support(\(x\), \(D_c\))

- \(\forall x \in D_B\)
- Support function: for every element, if supports, return true, if none exist, default to false
- will not always find a solution, but will reduce problem size

- k-consistency
- consistency on \(k\leq n\) variables

- arc consistency
- Systematic search
- for every value of V
_{i}, check for consistency with previous assignments (starting from the bottom), cuts off portions of the tree if false (using backchecking to prune a search tree) - back-checking expands a partial solution iff it is thus-far complete
- chronological backtracking – DFS with only the current path, undoes the most recent variable assignment and tries a new one – can be used for only one solution
- is complete – will always find a solution if it exists
- is sound / correct – a solution it gives will always be correct – will not give a false positive

- intelligent backtracking
- jump back based on the reason for failure – generally the deepest

- Ordering heuristics
- deciding what to expand first
- often most constraint and most promising values first
- strategies for variable/value ordering could be static/dynamic
- effects branching factor – best to branch as little as possible
- generally variable ordering is dynamic, but value is static (lexicographic)

- Keep-in mind forward-checking – removing values from domains that aren’t consistent – a/k/a domain annhilation – and is more efficient
- Real Full Lookahead is used to further restrict domains and search spaces more than plain forward-checking
- balance between reasoning and search

- for every value of V

## Lecture 7

### Arc Consistency

- revise should only modify one variable’s domain
- requires support, takes VVP and domain of other
- checks for VVP in X with every element of Y, returning true if found, false if none found
- check always increments if there’s a constraint
- revise should check if there’s a constraint to begin with

- review reading!
- VVPs can be represented as tuples, may use angle brackets
- AC-1:
- assume CSP has \(n\) variables, having domain size \(d\), and \(e\) constraints
- \(Q\) is a set of directed edges to revise, from x to y, from y to x, size \(2e\)
- for all edges in \(Q\)
- set acp to fales
- repeat until revisedp is false
- set revisedp to false
- for everything in \(Q\)
- if it’s revised, set revisedp to true
- But, if a variable’s domain is empty, break and report – this CSP has no solution

- This CSP is then arc-consistent

- Make sure to sort list of edges!

- Unary constraints should be done first!

## Lecture 8

Property | Algorithm | Complexity |
---|---|---|

AC | AC-1 | \(O(na^3e) = O(n^3a^3)\) |

- solutions found by
- detecting components
- NC
- normalize constraints
- NC, AC, other forms of AC
- Search
- Verification
- Solution, Node Vists, Constraint Checks, CPU Time

- when implementing AC1, terminate if domain wipe-out occurs
- worst case complexity is seldom reached
- remember, AC
*may*discover a solution to a CSP, or may find an inconsistent problem (by domain annhilation) - AC3
- for \(i \gets 1\) to \(n\), do \(NC(i)\)
- \(Q \gets \{(i, j) | (i, j) \in arcs, i \neq j\}\)
- while \(Q\) is non-empty:
- select and delete any arc \((k, m) \in Q\)
- if \(\mathsc{Revise}(k, m)\) then \(Q \gets Q \cup \{(i, k) | (i, k) \in arcs(G), i \neq k, i \neq m\}\)

- Check for domain wipe-out!

## Lecture 9

- Remember to deal with Node Consistency – and don’t include it in the check count
- see lecture 8 notes
- AC3 is not only more powerful
- \(2e + a[2e - n]\)
- AC3 is \(O(a^2 (2e + a(2e - n))) = O(a^3e) = O(a^3n^2)\)
- AC4 is even more efficient, but requires special bookkeeping, is \(O(a^2n^2)\)
- Variant of AC-3 called AC-2001, but requires cubic space, has yet another variant, AC-3.1\(^{m}\), requires different bookkeeping
- AC4
- generate m, s, and counter
- m and s have as many rows as there are VVPs in the problem, m is set to true, s to empty list
- m is for alive
- s is for a list of all VVPs that support a given VVP
- given a vvp, provide a number of supports provided by a given variable

- prepare data structures
- for every constraint all tuples:
- when allowed, update s
- increment counter

- when allowed, update s

- for every constraint all tuples:
- Then process all constraints
- if counter is 0, remove from domain, and update \(m\)
- since \(x\) is removed, then for everything such that it was a support, decrement the counter

- generate m, s, and counter
- AC3 tends to be a bit better than AC4

## Lecture 10

- soundness – can trust
- complete – will always find a solution if exists
- more efficient AC variants are useful for use during search
- but when enforcing AC during search, you must be able to keep track of what was removed
- remember properties are not the same as the algorithms! there may be many algorithms to implement each property!
- most consistency methods are local initially
- node/arc remove inconsistent values
- path removes inconsistent tuples
- used when you have positive tables of supports

- they get closer to the solution
- reduce the size of the problem
- are cheap (polynomial time)
- choice of consistency properties helps to reduce time to solution

### Intelligent Backtracking Algorithms

- Read Prosser, CI 93. Hybrid Algorithms for the Constraint Satisfaction Problem
- Consider Reading 5/6 of Dechter
- Issues
- Variable/Value ordering
- Variable Instantiation
- Currentt Path
- Current Variable
- Past Variables
- Future Variables
- Shallow/Deep Levels/Nodes
- Search space/Search Tree
- Back-checking
- Backtracking

- As you go, un-assigned variables are the future variables/future sub-problem
- those that have been instantiated are the past variables
- the current path is current instantiations
- always check a new instantiation against past variables (back-checking)
- algorithms
- Vanilla: BT
- Improved Backward: BJ (back-jumping), CBJ (conflict-directed BJ)
- Improving Forward: BM (back-marking), FC (forward-checking)

- D-Way branching is conceptually easier to understand than 2-way
- algorithms from Prosser are iterative back-tracking and pluggable/modularized
- Kondrak and Van Beck looked at the efficiency of algorithms

## Lecture 11

- Variables \(V_i\), \(i \in [1, n]\)
- Domains \(D_i = \{V_{i1}, v_{i2}, \ldots, v_{iMi}\}\)
- Constraint between \(V_i\) and \(V_j\) : \(C_{i,j}\)
- Constraint graph \(G\)
- Arcs of \(G\): \(\mathrm{Arc}(G)\)
- Instantiation order is static or dynamic
- lang primitives are from lisp
- Data structures
- \(v\) is a 1 by \(n\) array to store assignments, \(v_0\) is the root, backtracking to this indicates insolvability, \(v_i\) is the assignment to the $i$th variable
- \(domain\) is the original domain
- \(current-domain\) is the current domain, must be updated on back-tracking
- check is as before

- Generic form
- takes n, status
- set consistent to true
- status to unknown
- \(i \gets 1\)
- while the status is unknown
- if consistent
- \(i \gets \mathrm{label}(i, consistent)\)
- else \(i \gets \mathrm{unlabel}(i, consistent)\)

- if \(i > n\)
- set status to solution
- else if \(i = 0\), set status to impossible

- if consistent

- label and unlabel functions are provided by the backtrack algorithm
- label is forward move, unlabel is backward move
- For BT
- label
- when \(v_i\) is assigned a value from the \(current-domain_i\), we perform back-checking against past variables
- if back-checking succeeds
- return \(i + 1\)
- else, remove assigned value from current-domain[i], assign next value, etc
- if no value exists, set consistent to false
- function
- takes i, consistent
- set consistent to false
- for v[i] in each element of the current domain while not consistent
- set consistent to true
- for h from 1 to i - 1 and while consistent
- set consistent to check(i, h)

- if not consistent
- then remove i from current-domain

- if consistent, return i+1, otherwise i

- unlabel
- current level is set to \(i - 1\)
- for future variables \(j\), reset the current-domain
- if the domain of \(h\) is not empty, set consistent to true
- Function
- takes i, consistent
- set h to i - 1
- set current-domain[i] to domain[i]
- remove v[h] from current-domain[h]
- set consistent to current-domain[h] not-equal empty-set
- return h

- label does the actual instantiation

- label

## Lecture 12: Phase Transitions

- Cheeseman 1991
- Order parameter
- Probability solution exists for random problems is almost 1
- there’s a critical value whereby, after it, probability is almost 0
- around the critical value, the probability is around 0.5
- cost of solving drops sharply as it gets further to the right of the critical value, but high as it goes to it
- around .5 probability is high cost of solving, no matter the algorithm, this is referred to as a phase transition
- conjecture regarding the characterization of NP complete problems
- applies to detecting/implementing arc-consistency
- random graphs are almost always easy to to color – conjecture from famous paper
- graph coloring is a reduction operator
- friends are vertices with the same neighborhood
- this is the degree

- For CSPs, it’s either density or tightness
- currently effects the way of expirement conduct – try and deal with the hardest problems
- but be careful not to focus exclusively on the redior around the critical value
- Run on random CSPs – given statistical analysis
- Vary params \(\langle n, a, t, p \rangle\)
- \(n\) – number of variables
- \(a\) – domain size$
- \(t\) – tightness \(\frac{|\mathrm{forbidden tuples}|}{|\mathrm{all tuples}|}\)
- \(p\) – proportion of constraints \(p = \frac{e}{e_{max} - e_{min}}\), also \(\frac{2e}{n(n-1)}\)
- \(t\) and \(p\) will be between 0 and 1
- Can have issue with uniformity, difficulty (phase transition), solvability
- empirical studies can be done simply by varying one of the parameters
- do it (Model B):
- generate \(n\) nodes
- generate a list of \(\frac{n(n-1)}{2}\) tuples of all combinations of 2 nodes.
- choose e elements from the above list as constraints to go between the \(n\) nodes
- if the graph is not connected, throw away, go back to step 4, otherwise proceed
- generate a list of \(a^2\) tuples of all combinations of 2 values
- For each constraint, choose randomly a number of tuples from the list to gparantee tightness \(t\) for the constraint.

- Model B instances may be flawed, but it’s rare

## Lecture 13: Back-tracking: Continued

- keep track of path – an array of the instantiations thus far and to be created
- remember what unlevel does – actually performs backtracking
- many different ways to order variables
- when doing value-ordering, use lexicographic ordering!
- eventually, variable ordering heuristics
**must**be broken with lexicographic ordering

### Back-Jumping

- uses \(max-check\), an array, and when check(i, h) succeeds, \(max-check[i] \gets max(max_check[i], h)\)
- if the current domain of \(h\) is empty, chronological back-tracking is performed from h
- you go back to \(max-check[i]\) to fix stuff when you have an empty variable, unless you’ve already back-jumped in the current path
- BJ just modifies BT-label and BT-unlabel
- BJ-label just updates \(max-check\)
- bj-unlabel
- backtrack to h = max-check[i]
- resets max-check[j] = 0 for j in \(h+1\) to \(i\)

### Conflict-directed BJ

- requires a new data structure, the conflict-set array
- tracking at what levels conflicts occur

- knows how to jump-back again, to the deepest level of conflict
- can jump-back more than once!
- useless if good variable ordering
- conflict sets are initialized to \(\{0\}\)
- At any point, conf-set is a subset of past variables that are in conflict with \(i\)
- when check(i, h) fails add h to conf-set[i]
- when domain is empty
- jumps to deepest past variable in conf-set
- update \(conf-set[h] = conf-set[h] \cup conf-set[i] \setminus \{h\}\)

- jumps to deepest past variable in conf-set
- To calc all solutions – don’t use CBJ

## Lecture 14: Back-tracking: Continued

### CBJ Continued

- max-check – deepest level checked against – if instantiated, will be \(i - 1\)
- conflict set is used for holding what’s conflicted
- Finding all solutions with CBJ
- use the conflict sets
- or use CBF from Kondrak
- keep track of number of solutions

- make sure to check generated solutions with the solution checker!!
- modify algo to after solution, back track to every level at least once!
- in actuality, must be back tracked to exactly one

- look at Chis Thiel’s response
- when a solution is found, force the lat var to conflict with everything before it. In turn, this forces some chrono backtracking as the conf-sets are propagated backward

- Kondrak’s responce
- cbf flag, vector
- \(\forall i, cbf[i] \gets 0\)
- when you find a solution, \(\forall i, cbf[i] \gets 1\)
- In unlabel
- if \(cbf[i] = 1\)
- \(h \gets i - 1\), \(cbf[i] = 0\)
- else \(h \gets max(conf-set[i])\)

- if \(cbf[i] = 1\)

- cbf flag, vector

## Lecture 14: Forward Steps

### Backmarking

- tries to reduce the amount of consistency checking
- \(v[i]\) to reassign to \(k\)
- \(v[i] \gets k\) last checked against \(v[h] \gets g\)
- \(v[h]\) has not been modified

- not super useful – as consistency enforcement is more useful
- this helps to reduce back checking
- data structures, max back-check level, minimum backup level
- \(mcl\) max check level, \(n \times m\) array, \(mcl[i, k]\) stores deepest variable that \(v[i] \gets k\) was checked positively against, finer-grained mcl
- \(mbl\) min. backp level \(n\) array, gives the shallowest past variable whose value has changesd since \(v[i]\) was the current variable

- BM does not allow dynamic variable ordering
- aware of the deepest variable that \(v[i] \gets k\) is \(v[j]\)
- values of variables in the past of \(v[j]\) (\(h
- so we need to check \(v[i] \gets k\) against the values of the variables between \(v[j]\) and \(v[i]\)
- we do not need to check against the values of the variables in the past of \(v[j]\)
- when \(mcl[i,k] \leq mbl[i]\), don’t check, fails!
- when \(mcl[i,k] \geq mbl[i]\), don’t check, succeeds! – type B savings

### Forward checking

- Looks ahead from the current variable, considering all future variables and clear from the domains the values that aren’t consistent with the current partial solution
- FC means more work in the future, but causes fewer nodes to expand
- When it moves forward, the values of the current domains of future variables are all compatible with past assignments, saves backchecking
- FC may cause wipe-out, but discovers conflicts early on. will backtrack
- goal is to find failures early – avoids expanding fruitless subtrees
- is not truly forcing arc consistency, that’s RFL
- requires a DS to keep track of filtering, can be done in various ways

## Lecture 15: Forward Steps, Continued

### Forward Checking

- look-ahead – revises future variables
- makes more work at instantiation, but expands fewer nodes
- as you move forward, values in future variables are consistent with past variables
- can cause wipe-out and conflict discovery early on, backtracking chronologically
- requires current domain, stack of variables removed by previous instantiations
- future is a subset that v[i] checks againts
- past-fc is past variables that checked against v[i]
- requires functions
- check-forward – called when instantiating v[i]
- revise(j, i)
- returns false if current-domain[j] is empty, true otherwise
- values are removed from current-domain are pushed as a set into reductions[j]

- update-current-domain
- \(current-domain[i] \gets domain[i] \setminus reductions[i]\)

- fc-label
- attempt to instantiate current-variable
- filters domains of current
- on wipe-out of future variable, uninstantiate and undo domain filtering

- check-forward – called when instantiating v[i]
- Suffers from thrashing, as based on bt

### Variable Ordering

- dom, least domain, smallest domain first
- degree of variable (number of neighbors)
- dom/deg is most common
- Degree is difficult, as it changes very rapidly
- can/should be implemented statically

### Consistency Check Summary

- CBJ
- uses backtracking, no addtional structures

- backmarking uses mcl and mbl
- two types of consistency-checking savings

- forward checking
- works more at each instantiation, but expands fewer subtrees
- uses reductios, future-fc, past-fc

- FC-CBJ is the best. Period
- Other Backtracking algos
- Graph-based back-jumping – Dechter
- Pseudo-trees [Freuder, 85]

- Other lookahead – DAC, MAC – DAC seems to be pretty good as a compromise between MAC and FC
- More emperical evals
- now by generating random problems

- theoretical comparisons

### Implementation notes

- Enforce NC
- Normalize constraints
- check for empty crelations
- Interrupt as soon as you detect domain wipe-out
- Remember dynamic variable ordering includes the domino effect
- a/k/a pick singleton domains first

## Lecture 15: Intelligent Backt Tracking – Theoretical Eval

- Kondrak and Van Beek, IJCAI 95
- Dechter 5, 6
- most backtracking algorithms are evaluated empirically
- performance depends heavily on problem instance
- is therefore average case analysis
- representativesness of the test problems
- CSP in NP complete – always possible to construct examples where BJ/CBJ are worse than BT
- comparison criteria are different
- paper gives theroretical approach
- states by number of nodes visited
- number of constraint checks

- BT, CBJ perform the same amount of consistency checks
- bm reduced consistency checked
- MD does not affect number of nodes visited
- FC-CBJ may visit more nodes than CBJ
- proves correctness of BJ/CBJ
- determines partial order between algos
- proves BJ, CBJ are correct
- proves FC never visits more nodes than BJ
- Improves BMJ and BM-CBJ to perform even fewer consistency checks
- provides framework for characterizing future BT algorithms
- BT extends partial solutions – solutions consistent with a set of uninstantiated values if it can be consistently extend to the variables – that is, create a new partial solution extended by one variable s.t. it’s still consistent
- Dead end – when all values of current variables are rejected
- lower level are closer to the root
- higher levels are closer to the fringe
- 2 BT algos are equiv if on every csp they generate the same tree and perform the same consistency checks
- Lemma 1: If BJ bactkracs to variable x
_{h}from a dead end at x_{i}, then (a_{i}\ldots a_{h}) is not a valid partial solution

## Lecture 16: Theoretical Eval, Continued

- sufficient conditions
- BT visits a node if parent is consistent
- BT visits a node if its parent is consistent with all variables
- CBJ vists a node if its parent is consistent with all sets of variables
- FC vists a node if it is consistent and its parent is consistent with all variables

- Necessary Conditions
- BT visits a node only if parent is consistent
- BT visits a node only if its parent is consistent with all variables
- CBJ vists a node only if its parent is consistent with all sets of variables
- FC vists a node only if it is consistent and its parent is consistent with all variables

- BT visits all nodes FC visits
- BJ vistis all nodes CBJ visits
- CBJ and FC are not comparable.
- FC will visit no more nodes than BJ
- BT and BM are equivalent, as are BJ and BMJ
- proof of correctness – termination, soundness, completeness
- extends for seeking all solutions
- BM hybrids should be fixed, by changing MBL to an \(n \times m\) array, creating BM2, BMJ2 and BM-CBJ2

## Lecture 17: Search orders in CSPs

- assumptions
- finite domains
- binary constraints

- Chapter 6, Tsang is req’d reading
- Dual viewpoint heuristics for binary constraint satisfaction problems is recommended reading
- IN BT, some orders have fewer backtracks
- in lookahead, some ordering detect failure earlier
- variable ordering is often the most constrained variable
- value ordering is often the most promising value first
- using dynamic ordering in lookahead algos often removes the need for intelligent backtracking
- Regin and Bessier show that CBJ and similar isn’t needed – but not so sure
- Select values that don’t cause constraint violation – do the most promising value first, avoiding constraint violation
- Discover constraint violation quickly – select variables that don’t delay discovery of constraint violation, go with the most constrained variable first
- Notation: For a given future variable \(V_i\), the assignment \(V_i \gets x\) partitions the domain of the future variable \(V_j\), a set of values consistent with with the assignment of size \(\mathrm{Left}(V_j | V_i \gets x)\) and a set of values inconsistent of size \(\mathrm{Lost}(V_j | V_i \gets x)\)
- Value selection of Lost and Left
- Value Ordering to quickly find a solution
- min-conflict – orders values according to the conflicts in which they are involved with the future variables – most popular (Minton), \(Cost(V_i \gets x) = \sum_{V_j \neq i} \mathrm{Lost}(V_j | V_i \gets x)\)
- Cruciality – (Keng and Yu, 1989) \(Cruciality(V_i \gets x) = \sum_{V_j \neq i} Lost(V_j | V_i\gets x) / |D_{V_j}\)
- promise – most powerful (Geelen, 1992) \(Promise(V_i \gets x) = \prod_{V_j \neq i} Left(V_j | V_i \gets x)\), the number of assignments that can be done such that no constraint on \(V_i\) is broken.
- Others are defined

- Promise
- recognizes domain wipe-out
- product discriminates better than sum
- semantics of promise gives an upper bound on number of solutions

- FFP name is contested
- goal is to recognize dead-ends asap to save search effort
- choose ordering that reduces branching factor
- Early ones
- LD
- maximal degree
- minimal ratio of domain size over degree
- promise for variables
- Br\’elaz heuristic
- weighted degree (Boussemart et al, ECAI 2004)
- graph-based
- minimal width
- induced with ordering w*
- maximal cardinality ordering
- minimal bandwidth ordering

- remember to exploit domino effect, especially under dinamic var ordering
- always state how you are breaking ties – generally lexicographically
- is often cheap and effective
- in most cases, suitable for both static and dynamic ordering

- Promise for variables – Sum of the promises of the values, minimize the number of assignments that can be done such that no constraint on \(V_i\) is broken. Has upper bound on the number of solutions

## Lecture 18: Ordering Heuristics

- Least Domain
- Max Domain
- Domain/Degree
- Promise is based on what has the most promise given the current state
- Br\’elaz heuristic – originally used for graph coloring
- Remember, variable ordering heuristics are cheap and effective
- generally suitable for static and dynamic ordering
- graph-based heuristics aren’t often used anymore
- Br\’elaz
- originally designed for coloring
- assigns most constrained nodes first
- arrange vars in decreasing order of degrees
- color a vertex with maximal degree with color 1
- choose a vertex with a maximal saturation degree (number of different colors to which is it adjacent)
- if thete’s an equality, choose any vertex of maximal degree in uncolored

- color chosen vertex, with lowest numbered color
- if all vertices are colored, stop, otherwise, choose another vertex and continue

- wdeg
- All counstraints assigned a weight, initially 1
- every time a constraint is broken, during propagation with look-ahead (constraint causing domain wipe-out), weight is increased
- the weight of an un-assigned variable is defined as the sum of the weights of the constraints that apply to the variable
- the variable with the largest weight is chosen for instantiation
- weights are not reinitialied at back-track
- refined with dom/wdeg
- inspired by breakout heuristic in local search
- However, there mart be places where dom/deg is more efficient than dom/wdeg
- A decay function may help improve this

- Graph heuristics
- ordering by elimination or instantiation
- minimal width
- given an ordering, the width of a node is the number of its parents
- then the ordering’s width is the largest width of any of the nodes
- sum of degrees is always equal to the number of edges
- take minimum of the orders, by width – this is the width of the graph
- this can reduce chance of backtracking, can help to find minimum width ordering
- Algo:
- remove all nodes not connected to any others
- \(k \gets 0\)
- while no nodes left in the graph
- \(k \gets 1\)
- while there are nodes not connected to more than k others
- remove such nodes from graph, along with any edges connected to them

- return \(k\)
- minimal width ordering of the nodes is obtained by taking the nodes in the reverse order that they were removed

- Inducted Width Ordering
- What happens when you reconnect neighbors of what you remove to each other

## Guest Speaker: Philip Walton, Workday

- From Workday, SaaS company for HR management
- work has primarily been in Data Analysis and optimization
- go from techniques and technologies to money
- Could be a Linear Program
- Measures, Items (index sets)
- indices i, m
- variables Quantity, 0 to N
- model sets, limit, value, weight
- maximize value and quanitity
- but must be withinn weight

- big thing is assigning resources in a data center
- truck scheduling
- many truck loads that a company has to move
- origin and destination specified
- and time windows at origin and destination

- many drivers, many trucks
- some time horizon
- DOT rules
- operational rules
- returning to base
- equipment preventative maintenance

- Taking a run at the whole thing is probably too much
- run a rout model to pick loads that go togethr
- schedule loads – the big thing for SCPs
- drivers pick the sets of loads they prefer

- many truck loads that a company has to move
- modeling
- discrete time forms an integer domain
- can be clumpy

- use start time of the task as the domain
- constraints like before and after are relatively straightforward
- duration of driving/time windows are prety easy to express
- small disruptions are normally readily repaired by freeing up parts of the model and letting it re-solve
- time windows tend to be pretty large
- isn’t always a notion of an optimal schedule
- side constraints can be pretty easy to represent
- work rules aren’t always easy
- drivers may have odd drive/sleep patterns
- side resource use is tolerable

- discrete time forms an integer domain
- Robustness for many things has to be dealt with, but how that’s done is very differennt
- And remember how important interface is – they’re both important, and very, very different
- And even more important is understanding what’s going on within the problem domain
- good large software is built on top of amazing small software

## Lecture 19: Variable Orderings, continued

### Induced Width Ordering, \(\omega^{*}\)

- requires designing induced constraints – can cause much, much less backtracking
- these are called fill-in edges
- this procedure moralizes the graph
- computing min induced width is NP hard
- MIN FILL heuristic is used
- when removing anode, add an edge between every two nodes connected to the node, but not already connected to each other
- remove the node that, after removal, has the samllest number of fill-in edges
- Remember to remove the node with the smallest first!

- A tree is a graph with induced width 1
- An ordering has width, as does a graph. Graph width is not dependent on ordering, nor is induced width, \(\omega^{*}\) is called induced width or tree width.
- triangulated or chordal graphs are graphs in which each cycle of length 4 or more is such that each pair of non-connected vertices is connected
- Makes it easy to find largest clique, can actually be done in polynomial time
- is a type of perfect graphs
- interval graphs are a type of triangulated graph
- finding minimum number of triangulating edges is also np-hard

### Maximal Cardinality ordering

- Choose an arbitrary node
- among remaining nodes
- choose the one connected to the maximum number of already chosen nodes
- break ties arbitrarily

- Repeat until done
- Obtained ordering is instantiation ordering
- reverse ordering yields a PEO
- this is the perfect elimination ordering when reversed
- If the graph does not need to be moralized, the graph is triangular

## Lecture 20: Variable Orderings, continuedg

### Max Cardinality, Continued

- width of an induced ordering is same as graph
- MWO can help to guaranty back-track free search
- Lots of algos came from RDBMs
- Induced width is NP Hard!!!
- Simplicial graph is such that looking at an elimination ordering, all parents of a node form a clique
- moralized is triangulated

### Minimal Bandwidth Ordering

- bandwidth is
- node
- largest distance between the node and its neighbors
- ordering
- largest bandwidth of the nodes in the ordering
- graph
- smallest bandwidth across all possible orderings

- heuristic localizes and confines backtracking
- smaller bandwidth, sooner ability to backtrack
- find minimum bandwidth ordering is np-hard
- but finding an ordering of a given bandwidth is polynomial
- Tsang 6.2.2, only one paper

### Ordering Heuristics

- When
- static vvar, val ordering
- dynamic var, static val
- dynamic var, dynamic val – dynamic vvp

- at each given level
- VVPs pertaining to the same variable across a given level, static-static
- Vars are different, dyn-static
- vvps are different at each and every level

- Computing
- sort vars at preprocessing for static
- dynamic selects var during search, based on current domain, all unsintantiated neighbors, exploits the domino effect!!

## Lecture 21: Filtering For Constraints of Difference

- GAC – Generalized Arc Consistency, for non-binary CSPs
- Operates on table constraints, by supports or by no-goods
- \(\pi_A(R)\) gives a new relation with only values of A
- May be a Set, or a Bag

- finite csps are linear in space, quadratic time when dealing with all-diff constrants
- Sometimes, all diff may become many constraints that together are equivalent, but not always
- GAC4 handles n-ary constraints
- has good pruning power
- but is very expensive – \(\frac{d!}{(d-p)!}\)

- value is consistent if other values exist for all other values such that the values and the give simulatenously satisfy the problem
- a constraint is consistent if all values for all variables arec oconsistent
- csp is arc-consistent if all constraints are consistent
- csp is diff-arc-consistent iff all all-diff constraints are arc-consistent
- value graphs are bipartite
- vertices are \(X_C\), \(\cup_{x \in X_C}(D_x)\)
- edges \((x_i, a)\) iff \(a \in D_x\)

- Space is \(\mathcal{O}(nd)\)
- Matching a subset of edges in G with no vertex in common
- max matching is the bigges possible
- matching covers a set \(X\) – every vertex in \(X\) is an endpoint for an edge in matching
- \(M\) that covers \(X_C\) is a max matching.
- If every edge in \(GV(C)\) is in a matching that covers \(X_C\), \(C\) is consistent.
- CSP is diff-arc-consistent iff, for every all-diff \(C \in \mathcal{C}\), for every edge \(GV(C)\) belongs to a matching that covers \(X_C\) in \(GV(C)\)
- Build graph, compute max matching, if size of max matching is smaller than \(X_C\), return false, otherwise remove edges from G given the max matching and return true
- use Maximal Flow in Bipartite graph
- Or Hopcroft/Karp algo

- Then we have an all-dif constraint, value graph and a maximum covering
- remove edges that aren’t in a matching covering

- given a Matching M
- matching edge is an edge in M
- free edge is an edge not in M
- matched vertex is incident to a matching edge
- free vertex is otherwise
- alternating path (cycle) – a path (cycle) whose edges are alternatively matching and free
- flip colors on even alternating path – gives another matching of same size
- flip colors on alternating cycle – another matching

- length of path – number of edges in path
- vital edge belongs to every maximum matching

- An edge belongs to some of but not all maximum matching iff for an arbitrary maximum matching \(m\) , it belongs to either an even alternating cycle or an even alternating path tha begins at a free vertex
- Edges to remove shouldn’t be in all matchings, even alternating paths starting at a free vertex and an even alternating cycle.
- every directed cycle in \(G_0\) corresponds to an even alternating cycle of \(G\)
- every directed simple path starting at a free vertex corresponds to an even alternating path of \(G\) starting at a free vertex
- make black point one way, red the other
- allows for all-diff and other global constraints

## Lecture 22: CSPs and Relational DB

- connections
- constraint databases and constraint logic programming
- query processing and solving csps

- shared
- constrant dbs (deductive BD, Datalog) and constraint logic programming share representation
- relational databases and constraint satisfaction share computations

- Constraint databases use LP – Prolog for DB, Datalog
- Prolog was a bit of a failed expiriment
- CLP(R,F) is prolog with CSP techniques for either reals or finite domains
- Relations
- binary relations – given \(D_a\) and \(D_b\), a set of any 2-tuples \(\langle x, y \rangle\) with \(x \in D_a\) and \(y \in D_b\), this defines a relation \(R_{ab}\) \(R_{ab} = {(x, y)} \subseteq D_a \times D_b\)
- Function – a binary relation such that there is at most one tuple \(\langle x, ? \rangle \in R_ab\). \(D_a\) is the domain, \(D_b\) is the co-domain.
- $k$-ary relations – are relations on \(k\) elements

- representation
- 2-dimensional binary array, though k-dimension works
- tables
- linked-lists of pairs
- hash-maps of sets
- many other options

- Terms
- table, relation – constraint
- arity of relation – arity of variable
- attribute – variable
- value of attribute – value of var
- domain of attribute – domain of variable
- tuple in table – tuple in constraint/allowed by constraint/consistent with constraint

- Relational algebra
- intersection
- input two relations of same scope
- output a new, more restrictive relation with the same scope made of tuples in all the input relations simultaneously
- logical and

- union
- input 2 rel of same socope
- less restrictive relation with same scope of tuples that are in any input relation
- logical or

- difference – same as set theory
- selection
- projection
- join
- composition (from CSP)

- intersection

## Lecture 23: CSPs and Relational DBs, continued

- Selection and projection are unary operators
- Selection
- Input – a relation \(R\) and a predicate o attributes of \(R\)
- Output a relation \(R^\prime\) with the same scope as \(R\) but having only a subset of the tuples in \(R\)
- Row selection
- \(\sigma_{p}(R)\)

- Projection
- Input – a relation \(R\) and a subset \(s\) of the scope
- Output – a relation \(R^\prime\) of scope \(s\) with the tuples rewritten such that positions not in \(s\) are removed
- column elimination, in CSP set semantics, in DB bag
- \(\pi_{\{x_1, x_2\}}(R)\)

- Join (natural join)
- Input – \(R\) and \(R^\prime\)
- Output a relation \(R^{\prime\prime}\) whose scope is the union of \(R\) and \(R^\prime\) and tuples satisfy both
- if they have no common attribute, cartesian product
- if they have an attribute in common, compute all possibilities

- Computes all solutions to a csp
- \(R \bowtie R^\prime\)

- Composition of two relations
- Input – two binary relations \(R_{ab}\) and \(R_{bc}\) with 1 var in common
- Output – a new induced relation \(R_{ac}\) to be combined by intersection to a pre-existsing relation between them
- bit matrix op is matrix multiplication
- Generalization as constraint synthesis
- Produces induced relations a/k/a indirect, inferred, implicit

- Given \(V_1\) and \(V_2\) and \(C_1\) and \(C_2\) between them, how do \(C_1 \cap C_2\) and \(C_1 \bowtie C_2\) relate? – They’ll be the same in this situation
- Given \(V_1\), \(V_2\), \(V_3\), \(C_{V_1, V_2}\) and \(C_{V_2, V_3}\), write the induced \(C_{V_1, V_3}\) in relational algebra. – \(C_{V_1, V_3} = \pi_{V1, V3}(C_{V_1, V_2} \bowtie C_{V_2, V_3}) = C_{V_1, V_2} \circ C_{V_2, V_3}\)
- Given \(V_1, V_2, V_3\), \(C_{V_1, V_2}, C_{V_2, V_3}, C_{V_1, V_3}\), \(C^{\prime}_{V_1, V_3} = (C_{V_1, V_2} \circ C_{V_2, V_3}) \cap C_{V_1, V_3}\)
- Natural join – synthesized constraint
- left/right join – synth constraint with some inconsistent tuples
- Differences
- number of relations
- db have few
- csp have many

- size of relations
- dbs have high-arity
- csps are mostly binary

- bounds of relations
- dbs are very tight
- csps are loose

- size of domains
- db large
- csp small

- type of domain
- almost always finite, but csp may be continuous

- number of solutions sought
- dbs looking for almost all
- csp only looks for one in general

- cost-model
- DB is I/O, memory
- CSP is computational

- number of relations

## Lecture 24: Path Consistency and Global Consistency

- consistency helps to reduce possible search-space
- A path of length \(m\) is consistent iff for any value \(x \in D_{V0}\) and for any value \(y \in D_{Vm}\) that are consistent, there exists a sequence of values in the domains of the variables \(V_1, V_2, \ldots, V_{m - 1}\) such that all constraints between them (along the path, not across) are satisfied
- A CSP is path consistent if any path of any length is consistent
- However, by Mackworth AIJ77, only paths of length 2 are necessary, \(\mathcal{O}(n^3)\)
- algo looks at every vertex as a mid-point \(k\), and makes \(i\) and \(j\) together consistent given \(j\),
- and \(R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})\)
- Closure algorithm. \(k\) is outer loop
- Until fixed point
- for k from 1 to n
- for i from 1 to n
- for j from 1 to n
- \(R_{ij} \gets R_{ij}^{O} \cap (R_{ik} \cdot R_{kj})\)

- for j from 1 to n

- for i from 1 to n

- for k from 1 to n
- Algo
- \(Y^n \gets R\)
- Until \(Y^n = Y^0\)
- For k from 1 to n
- For i from 1 to n
- For j from 1 to n
- See slide for equation

- For j from 1 to n

- For i from 1 to n

- For k from 1 to n
- \(Y \gets Y^n\)

- Difference between k consistent and strong k consistency
- PC1 terminates, is sound and complete for discrete CSPs - \(\mathcal{O}(a^5n^5)\)
- In a complete graph, if every path of length 2 is consistent, the network is path consistent
- PC1 has two operations, composition and intersection
- proof is done by induction

- Mohr and Henderson also developed PC-2, 3
- and Han, Lee created PC4
- PC5 from Singh uses support bookkeeping from AC6
- PC8 uses domains, not constraints, 2001 improves on PC8, but isn’t tested
- PC is rarely used except for certain types of constraints
- monotonic constraints – ordering based

## Lecture 25: Path Consistency and Global Consistency – continued

- completion allows you to use paths of length 2.
- Strong PC will ensure both PC and AC
- AC first, then if no inconsistency, PC (likely PC2)
- Go from weaker first to stronger – making later steps likely cheaper
- But PC is often unused – except some situations make it much more likely to find a solution
- PC is also called 3 consistency

### Partial Path consistency

- same as PC, but no universal constraints, defined over cycles
- graph is triangulated
- run closure only over the tringles – \(\mathcal{O}(n^3)\)
- must be careful for articulation points in the graph
- PPC is much easier to implement
- PPC is strictly weaker than PC as a property
- PC completes graph, PPC will triangulate
- PC filters more than PPC, edge-by-edge
- PPC enforces PC on common edges
- which means that you can find if something is not path consistent
- From Bliek and Sam-Haroud 1999
- Can PC detect insatisfiability when PPC does not? – It seems not – PPC and PC are roughly equivalent, but there are some cases – annd it was hand built

### Global Consistency properties

- minimality/decomposability
- A minimal network – a network where all tuples in a given constraint appear in at least one solution
- computing a minimal network is NP complete
- Solving a minimal network is itself NP complete

## Lecture 26: Consistency properties

- Local Consistencies
- AC
- GAC
- Path Consistency – modifies binary constraints, requirecs completed graph
- Partial Path Consistency – triangulates graph, modifies binary relations

- Global consistency properties
- Consistency in general – it has a solution
- minimality – constraints are as tight as possible, on binary constraints, elimination of a tupple eliminates a solution, every tuple is a member of at least one solution, idealized form of path consistency, garutys consistency, important in interactive product configuration – NP complete in general, Gottlob, 2011
- determining whether or not a CSP is minimal is NP complete
- Finding a solution of a minimal CSP is still NP complete
- occasionally called globally consistent

- decomposability
- CSP is decomposable if for any combination of assignments to any number of variables that is consistent given the constraints can be extended to a complete solution
- guaranties back-track free search
- strongly $n$-consistent

- there are some islands of tractability
- tree structure – AC ensures tractability
- convex constraints – path consistency ensures minimality and decomposability

## Lecture 27: Consistency properties, continued

- minimality doesn’t guaranty bt-free search
- decomposability ensures no back tracking
- PC approximates minimality
- when constraints are convex, PPC on triangulated graph guarantees minimality and decomposability
- when \(\circ\) distributes over \(\cap\), then PC1 guarantees that a CSP is minimal and decomposable, and that the outer loop can be removed
- but this doesn’t always hold
- constraints of bounded difference ad temporal reasoning
- vars,
- constraints of the form \(a \leq Y -X \leq B\), \(Y - X = [a, b] = I\)
- composition and intersection from slides

## Lecture 28: Consistency properties, continued

- Remember, once decomposability is enforced, search is guaranteed backtrack free
- When constraints are convex, use of PPC ensures minimality and decomposability, but triangles must be done in PEO
- When composition distributes over intersection
- PC1 generalizes Floyd-Warshall (all-pairs shortest path)
- PC-1 generalizes Warshall (transitive closure)

- Don’t confuse properties with algorithms
- Local consistency
- remove inconsistent values (node, arc)
- remove tuples (path consistency)
- get closer to solution
- reduce size of problem
- are cheap

- Aim for global consistency, but it’s hard
- Sometimes (given certain types of constraints or graphs), local consistency guarantees lobal consistency
- distributivity in PC, row-convex constraints, special networks

## Lecture 28: Local Search

- Systematic/constructive search is all we’ve talked about
- Starts at a solution that’s not correct, try to repair
- General method:
- for each variable, choose a value
- check if solution
- You’re done. Lucky.
- pick a value that broke a constraint and change it (using min-conflict)

- Local search is in worst case, \(\mathcal{O}(d^n)\)
- States laid on a surface
- quality/cost is height
- optimum state maximizes solution quality while minimising cost
- move around from state to state to find a local optimum
- Gradient descent – will only ever find a local optimum
- State is a complete assignment of values to variables
- how do you move – CSP uses min-conflict
- eval function rates the quality of a state, generally in number of broken constraints
- Try
- Start w/ full but arbitrary assignment of aues
- reduce inconsistencies by local repairs
- repeat until
- A solution is found (a global minimum)
- The current solution cannot be repaired (local minimum)
- You’ve gone through max tries (running out of patience)

- Local repair for decision, local optimzation for optimization
- Greedy types are hill climbing, local beam
- Local beam
- start in parallel with a number of different points
- choose among those another number of best moves
- remove those without promise
- and repeat as necessary

- Stochastic methods
- break out, helps to avoid getting stuck in a local minimum
- when stuck in a local minimum, walk randomly, pick a random variable, change to a random value
- tabu search
- keep a list of tabu positions (those that aren’t good), but only a certain number
- when a new state is visited, added to tabu, and farthest back forgotten
- if no better, go to the next state not tabu
- repeat until a global minimum is found

- sometimes you may randomly ignore a heuristic

- Local search problems
- get stuck in a local optimum
- start having can have issues with plateaus (flat local maxima) (be ready to move to a state with the same quality)
- ridges can go from side to side with minimal progress

- when stuck, do random walk
- Simulated annealing
- When stuck in a local minimum, allow steps towards less good neighbors to escape the ocal maximum
- be more random at the start, be less random as you go
- Start at a random state, start countdown and loop until over
- pick a neighbor at random
- \(d\) = quality of neighbor - quality of current state
- if \(d > 0\)
- then move to neighbor and restart
- otherwise, move to a neighbor with a transition probability \(p<1\)

- Transition probability proportional to \(e^{\frac{d}{t}}\)
- \(d\) is negative
- \(t\) is countdown
- As time passes, you’re less and less likely to move towards bad neighbors

- But all forms of local search are:
- non-exhaustive
- non-systematic
- liable to get stuck in a local optimum
- non-deterministic – outcome depends on where you start
- heavy-tailed – longer you go, less likely you are to find a solution

- Is often somewhat probabalistic

## Lecture 29: Local Search, Continued

- starts from a complete assignment, and moves to a new assignment that’s better
- Very greedy approach, but can be parameterized very easily
- Tends to be very greedy, but only allows improvements
- others can be better, and allow forr random walks/steps – stochastic algorithms
- stochastic algorithms are not deterministic!
- Tabu search moves to a state that may be worse, provided it’s not in a list of previous states
- break-out modifies weight of objective function
- simulated annealing involves countdowns and loops, controled by quality of neighbor and the time, but when significantly restricted,, it is guaranteed to find an optimum
- Have issues
- non-systematic and non-exhaustive
- have issues getting stuck in local optima
- non-deterministic!!
- generally the probability of improving a solution gets smaller, but never actually dies out – heavy-tailed

- So, when you hit a point where it seems improvability drops
- keep the most promising solution
- and start someplace else
- Random restart

- Luby restart profile
- uses a crazy geometric law to decide how/when/where to restart

### Genetic algos for local search

- Combine two complete assignments to generate offspring
- Start from an initial position
- encode assignments in a compact manner
- combine partial solutions to generate new solutions
- requires initial generation of many random assignments
- Choose solutions to combine based on quality
- fitness is used to describe quality, assigns probability for solution
- Select pairs depending on fitness
- crossover point is chosen randomly, offspring are generated
- mutation will randomly change a state
- and this gets put back in the original population

### ERA – Environment, Rules, Agents

- Environ is \(n \times a\) board
- each var is an agent
- each position is a value of a domain
- agent moves in a row on the board
- each position records the number of violations caused by the fact that agents are occupying other positions
- agents try to occupy positions where no constraints are broken
- agents move according to reactive/reaction rules
- agents may themselves have preferences
- move rules can be as simple or complex as possible
- rules:
- least-move
- choose position with min violation
- better-move
- choose position with smaller violation
- random-move
- randomly choose a position
- others
- combinations of those above

- No one talks to each other, but share a common context
- everyone kicks eachother out until every one is happy
- very effective in solving very tight, but solvable instances
- unstable in over-constrained cases
- agents keep kcicking each other out (livelock)
- livelocks can be used to identify bottlenecks

- very stable on solvable instances
- oscillates wildly on unsolveable instances
- Agent motion can be categorized as variable, stable, constant

## Lecture 30: Local Search, continued

### ERA, Continued

- Rules help to find a solution somewhat quickly
- Has less of an issue with getting stuck in various local optima
- May be very good for resource-allocation problems
- more dynamic, as there’s no global evaluation function

### Nothing works, what to do

- Random restart
- when no progress made, restart from a new srandom selected state
- save the best results so far

- repeat random restart
- for a fixed number of iterations
- until best results have not ben improved on for a certain number of iterations – e.g., a geometric law

### Evaluation of Local Search

- Test on a given problem instance
- an ensemble of problem instances

- Experiment
- running algo thosands of times
- measure some metric as a random variable – i.e., the time needed to find a solution

- Then provide
- probability distribution function approximated by the number of runs that took a given time to find a solution
- the cumulative distribution function approximated by the number of runs that found a solution within a given time

- Then compare performance with statistical tests
- t-tests assume normal distribution of the metrics
- nonparametric tests don’t
- Consult with a trained statistician

## Lecture 30: Structure-based methods

- CSPs represented by networks
- for bCSPs, graph
- constraint graph
- microstructure
- co-microstructure

- for general CSPs
- hypergraph
- dual – allowed tuples connected to each other when shared variables match
- primal graph – vertices in graph are original variables, are variables in a constraint are a clique

- for bCSPs, graph

## Lecture 31: Structure-based methods, continued

Graph | Vertices | Edge |
---|---|---|

Binary | Var | Constraints |

Non-binary | Var | constraints |

Dual | Constraints | Equality constraints on shared variables |

primal | variables | cliques over scope of constraints |

\(\mu\) structure | VVPs | supports |

co \(\mu\) structure | VVPs | conflicts |

- Incidence graph – Variables are one type of node, constraints another, edges connect variables to the constraints, is bipartite

### Tree-structured CSPs

- Tree-structured CSPs can be solved in polynomial time
- apply \(revise(V_i, V_j)\) for all nodes from leaves to root
- instantiate variables from root to leaves

- Generally
- directional consistency from leaves to root
- solutions built from root to leaves

- This is often the case for configuration problems
- Width of a tree is always one for a tree
- If a CSP graph has \(w = 1\), it is guaranteed to be solved with 2 consistency
- Freuder has therem that if a csp is \(w\) in width, then $w+1$-consistency is guaranteed to solve it.
- May have issues w/ changing width of graph
- There are ways to prevent this, but is not necessarily true

### Cycle-cutset

- identify a cycle cutset \(S\) in the CSP
- cutset – nodes when removed yield a tree

- Decompose CSP into two partitions
- \(S\)
- \(T\) the nodes forming a tree

- Iterate until a solution is found
- Solve the nodes in \(S\)
- Try to extend the solution to \(T\)

- \(|S| < |V|\)
- if \(|S| = c\), then $\mathcal{O}(d
^{c}(n-c)d^{2}) = \(\mathcal{O}(nd^{c+2})\) - But finding smallest cutset is \(NP\) hard

## Lecture 32: Structure-based methods, continued

### Dynamic Dangle Identification

- graham reduction used to detect if schema is acyclic
- join of acyclic schema is polynomial
- remove vertices w/ degree 1

- After each var assignment/lookahead
- check connectivity
- identify dangling trees using graph reduction

- for each dangling root
- do DAC from leaf to root
- wipeout means unsolvability

- Restrict search to nodes outside of the identified dangling trees

### Directional Path Consistency

- Given a binary CSP and a fixed variable ordering
- \(\mathcal{O}(nw*^2d^3)\)
- if \(w* \leq 2\)
- DPC determines consistency
- guarantees BT-free search given the ordering
- But this only works iff induced width is less than or equal to 2, you get strong consistency, minimality and tractability

### Biconnected components

- decomposition into separable graphs
- consider a CSP whose graph has artiulation nodes
- assume that the largest biconnected component is \(b\)
- build a tree whose nodes are biconnected components considering that the articulation node belongs to the parent (or parent and child)
- build an order using a pre-order traversal of the tree
- complexity is then bound by the size of the largest biconnected component

## Lecture 33: Structure-based methods, continued

### Biconnected components

- must have articulation nodes – nodes common to two or more problems
- is \(d^b\) where \(d\) is the domain size and \(b\) is the size of the largest component
- Use a pre-order of the variable tree, which means that BT effort is the size of the largest component
- Originally from 1982, in Journal of the ACM

### Tree Clustering

- bi-connected components, but with more than one shared variable for each cluster
- triangulates the graph
- finds maximal cliques
- create a tree structure from cliques
- Then repeat:
- solve a clique at each node in the tree
- apply DAC from leaves to root
- generates solutions in a BT-free manner

- MaxCliques from Algorithmic Graph Theory and Perfect Graphs, Golumbic
- Triangulation is \(\mathcal{O}(n^2)\)
- Max clique is \(\mathcal{O}(n + e)\) in triangulated graph
- \(\mathcal{O}(k^r)\) for solving clusters, \(k\) is domain size, \(r\) is size of largest clique (\(t = k^r\))
- generating solution \(\mathcal{O}(nt\log t)\), \(t\) is tuples in each cluster (sorted domain of a ’super’ variable (sorting can help to speed things up)), in best case, we have \(n\) cliques
- complexity bounded by size of largest clique:
- \(\mathcal{O}(n^2) + \mathcal{O}(k^r) + \mathcal{O}(nt\log t) = \mathcal{O}(nk^r\log k^r) = \mathcal{O}(nrk^r)\)

## Lecture 34: Structure-based methods, continued

### Bucket Elimination

- Built on non-binary CSPs
- Dechter 1997
- Choose a variable ordering – PEO recommended
- create buckets for each variable, placing each constraint in the bucket of the deepest variable in its scope
- create tree node for each bucket
- From bottom to top
- join relations in bucket (including on the domain)
- project join on deepest bucket with common variable names
- top-down creates solution backtrack free
- Can handle cycles!
- And is a pretty quick form of DAC, but for non-binary CSPs

## Lecture 35: Structure-base methods, continued

### Bucket Elimination/Tree Clustering

- linear in \(n\), exponential in induced width + 1
- induced width is the number of vars in the largest component
- gives fixed parameter complexity

### Bucket Elimination

- very much like Gaussian elimination
- Useful for Bayesian networks – graphical models of all kinds

### Tree Decomposition

- Generalizes all of structure methods
- comes from database theory
- organizes any constraint problem into a tree
- \((T, \Chi, \Psi)\) of \(P = (X, D, C)\)
- \(T\) – \((V, E)\) the tree
- \(\Chi(v) \subseteq X\) – A mapping of all variables in a node
- \(\Psi(v) \subseteq C\) – A mapping of constraints and variables
- Each constraint must appear on at least one node in the tree, and all its variables are in that node
- nodes where any variable appears induce a single connected subtree (connectedness property), the path between any two nodes with the same variable will all have the variable

- join trees and acyclic DBs from database theory
- organizes the constraint network into a tree structure, a join-tree
- labels tree nodes with variables and constraints – labels must obey some rules
- Solving
- bottom-up – joins relations and projects on parent
- top-down generates the solution

- Complexity bound by width
- tree width is the max number of variables in a node - 1
- hyperwidth is the max number of constranits in a node

- Group of variables are the variables in common to two nodes
- However, selecting the tree decomposition is NP-hard, use approximation by min-fill
- Quite a few different tree-decomposition techniques – Gottlob’s HYPERTREE seems to be one of the best
- parameters
- tree width max number of variables in any node minus 1
- hypewwidth – number of constraints in any node

- theory – characterizes tractable csps
- complexity is bound by tree/hyper-width
- fixed parameter complexity – independent of number of variables

- practice – uses structure to solve csps
- requires balancing time and space

## Lecture 26: More on Constraint Consistency

### Global Consistency Properties

- minimality – relations join completely, $n$-consistency
- decomposability – strong $n$-consistency
- solvability

### Local Properties

- AC/GAC – Domains
- Path Consistency – Constraints, add new constraints (constraint synthesis)
- Partial Path Consistency – Constraints, add new constraints
- Directional Path Consistency – Constraints, add new constraints
- Types of constraint algorithms
- Binary
- Non-Binary
- Global Constraints

- For continuous domains, work on box-consistency
- On interval, box consistency can occasionally also be used
- weighted CSPs can also be dealt with, with a variant of AC

### Info

- arc consistency – $(1,1)$-consistency – for every variable, the solution can be extended to another consistent assignment
- path-consistency – $(2,1)$-consistency – for every two variables, the solution can be extended to another consistent assignment
- \((i,j)\) – for every consistent assignment of length \(i\), the solution can be extended by \(j\) consistent assignments (\(i + j = n\))
- strong $k$-consistency is $k$-consistency down to \(1\) consistency

## Lecture 27: Still more on Constraint Consistency

- domain minimality – constraints over arity 1, Gottlob 2011
- however, even if domain minimality and constraint minimality hold, solving is still NP complete
- global inverse consistency – where given a value at 1 var, you can extend it to \(n - 1\) assignments

### $(i, j)$-consistency

- \(j = 1\)
- 2 – AC
- 3 – PC
- k – $k$-consistency

- \((1, n - 1)\) – domain minimality, global inverse consistency
- \((1, k)\) – \(d^{k + 1}\), guaranteed that a solution can be extended over \(k\) variables
- NIC – Neighborhood Inverse Consistency, for every variable, look at its neighbors, and make consistent with those

## Lecture 28: More Constraint Consistency

### Singleton Arc Consistency

- family of algos based on singleton test
- singleton test like search – kinda
- The CSP is AC for every VVP
- Algo
- Until No change
- for each var
- for each value
- assign the val to the var
- if CSP is AC, keep
- else Remove

- for each value

- for each var

- Until No change
- Sometimes, when you get lots of backtracks, trigger a high level consistency

### Inverse consistency

- (1, 2) consistency – Path Inverse consistency
- Neighborhood Inverce – see previous
- GIC – Global Inverse Consistency – ensure domain minimality

### Special Constraints – AC

- functional – A constraint C is functional with regard to D iff for all \(v \in D\) there exists at most one \(w \in D\) such that \(C(v,w)\)
- monotonic constraints – a constraint C is monotonic with regard to \(D\) iff there exists a total ordering on \(D\) such that, for all \(v\) and \(w \in D\), \(C(v,w)\) hold, implies \(C(v^{\prime}, w^{\prime})\) holds for all values \(v^{\prime}, w^{\prime} \in D\) such that \(v^{\prime} \leq v \land w^{\prime} \leq w\)

### Non-Binary Special constraints

- almost all properties and algos discussed were restricted to binaryies
- consistency for non-binary is the topic of current research
- in particular domain filtering that doesn’t change topology or modify constraint definitions
- relational \(m\) consistency – adding constraints/changing constraint definitions

- Domain Filtering
- GAC
- Singleton GAC
- maxRPWC, rPIC, RPWC, etc

- Relational consistency
- Strong consistencies
- Simple Tabular Reduction

## Lecture 29: Extensions to BT Search

- bounded number of backtrack search
- bounded backtrack depth search
- limited discrepency search
- credit-based backtrack search
- randomized backtrack search

### Credit Based Search

- start with a given credit, usually \(n^3\)
- assign 1/2 credit to current assignment, 1/2 to the remaining
- keep going in a depth-first manner, until credit is used up, chronologically backtrack from there
- Used in conjunction with backtrack bounded search

### Limited Discrepancy Search

- may be blind at shallow levels of search
- disobeys a given number of times
- Matthew Ginsburg

### Randomized BT

- in systematic search, ordering of the variables/ values determines which of the solution is explored
- randomization allows us to explore wider portion of search tree
- thrashing causes stagnation of BT search, interrupt it, and restart
- Pick random variables at instantiation
- And Random values when you’re picking values
- Use the restart schedule when thrashing
- Fixed cutoff and universal – Luby
- Randomization and Rapid restarts (Gomes), fixed optimal cutoff, priori knowledge of cost distribution required
- randomization annd geometric restarts Walsh – \(C_{i + 1} = r C_i\), \(C_0 = n\)
- Randomization and dynamic geometric restarts (guddeti) – $C
_{i + 1}= \left\{\begin{matrix} rC_{i}& \text{when solution improved} \\ C_{i}& \text{otherwise}\end{matrix}\right.$ - Bayesion approach (Kautz)