# Fundamentals of Constraint Processing

## Lecture 2

### Resources

• check website – has many, many links
• conferences
• journals
• include logic programming and CSP
• Linear Programming, Integer Programming, Mixed Integer Programming, Dynamic Programming
• journals are also pretty big – and tend to be easier to get into
• many good ways to get jobs with this

### Constraint Satisfaction 101

• examples
• generating optimal decisions
• have unary, binary, n-ary constraints
• often have many choices, but restrictions can make it easier to make the right decision
• modeling constraints remains an art
• sets of variables with sets of choices, and a set of constraints restricting combinations of values
• applications
• databases
• temporal and spatial reasoning
• allocation
• design/config
• graphics, visualization, interfaces
• verification and engineering
• HCI and decision support
• molecular biology
• analytical chemistry
• robotics, machine vision, computational linguistics
• transportation
• qualitative and diagnostic reasoning
• constraint processing is modeling, plus inference and search
• modeling – variables, domains, constraints, query
• inference – enforcing consistency (filtering) and propagation (AC, PC)
• search – systematic (chronological backtracking, with or without intelligence, etc.) or non-systematic (hill climbing, tabu search, etc.)
• Real Example
• the Netherlands has a dense rail network
• wanted a more robust schedule that's also more effective, increasing profit
• used OR and AI
• Singapore
• large container hub
• CP is:
• solving decision problems
• with arbitrary constraints
• providing concise and high-level feedback about alternatives and conflicts
• Very Practical

### Defining a problem

• has a given and a question
• given is a set of objects, the relations, etc
• question ex: find $$x$$ such that condition $$y$$ is satisfied
• these can be simple or advanced
• a problem must always, always, always have a question!
• Given $$\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$$
• $$\mathcal{V} = \{V_1, V_2, \ldots, V_n\}$$ Variables
• $$\mathcal{D} = \{D_{V_1}, D_{V_2}, \ldots, D_{V_n}\}$$ Domains for Variables
• Constraints $$\mathcal{C} = \{C_1, C_2, \ldots, C_l\}$$ where $$C_{V_i,V_j,\ldots,V_k} = \{(x,y,\ldots,z)\} \subseteq D_{V_i} \times D_{V_j} \times \ldots \times D_{V_k}$$
• Query: can we find a value for each variable such that all constraints are satisfied? Queries can include weights (making it a Constrained Optimization Problem)
• Various levels of consistency
• AC
• GAC
• SAC
• SGAC
• Sudoku can often be solved by consistency alone, though search may still be required in general
• find solution – decisions
• find the number of solutions, or all solutions – counting
• find set of constraints that can be removed so that a solution exists – optimization
• specials
• binary CSPs have constraints over at most two variables
• when domains are restricted to $$\{0, 1\}$$, it is a Boolean CSP
• with finite, discrete domains, enumeration can work
• continuous domains require sophisticated algebraic techniques – consistency techniques are then used on domain bounds
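The triple $$\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$$ above can be made concrete with a small sketch; the variable names and the two constraints here are illustrative, not from the lecture:

```python
from itertools import product

# Toy CSP (illustrative): X < Y and Y != Z over the domain {1, 2, 3}.
variables = ["X", "Y", "Z"]
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2, 3}}
# Constraints in extension: scope -> set of allowed tuples (supports).
constraints = {
    ("X", "Y"): {(x, y) for x in domains["X"] for y in domains["Y"] if x < y},
    ("Y", "Z"): {(y, z) for y in domains["Y"] for z in domains["Z"] if y != z},
}

def satisfies(assignment):
    """Check a complete assignment against every constraint."""
    return all(
        tuple(assignment[v] for v in scope) in allowed
        for scope, allowed in constraints.items()
    )

# The basic decision query, answered here by brute-force enumeration.
solutions = [
    dict(zip(variables, combo))
    for combo in product(*(sorted(domains[v]) for v in variables))
    if satisfies(dict(zip(variables, combo)))
]
```

Enumeration like this only works for the finite, discrete domains mentioned above; the point of the rest of the course is to do better.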

## Lecture 3

• always a given $$\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})$$
• Constraints are relations
• relations are a subset of the cartesian product of $$n \geq 2$$ sets
• relations can be represented as a table of allowed tuples, referred to as supports – a table constraint, given in extension
• a definition in extension is a set with all members enumerated
• conflicts are the set of tuples that are no-goods
• constraints can also be given in intension – defined in terms of a logical function, equivalent to a definition in extension, but often easier to understand
• a check function should always be a simple implementation – other routines are built with it
• Code must always be well structured!
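As a sketch of the two styles above, here is one constraint written both in extension and in intension (the constraint itself is made up for illustration):

```python
# One constraint, "X + Y is even", written two ways (illustrative example).
domain = {1, 2, 3, 4}

# In extension: enumerate the allowed tuples (the supports).
supports = {(x, y) for x in domain for y in domain if (x + y) % 2 == 0}

# In intension: a simple check function over a candidate tuple.
def check(x, y):
    return (x + y) % 2 == 0

# The two representations agree on every tuple of the Cartesian product.
agree = all(check(x, y) == ((x, y) in supports)
            for x in domain for y in domain)
```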

## Lecture 4

• $$P = (V, D, C)$$
• $$V$$ variables
• $$D$$ domains
• $$C$$ constraints
• Constraints are defined as $$C_1 = \langle\mathrm{scope}(C_1), \mathrm{rel}(C_1)\rangle$$
• Scope is the variables it discusses – the set of variables over which the constraint is defined
• relation is an abstract relation – may be in extension (listed as either supports or conflicts) or intension (roughly as a function or boolean expression)
• Also have arity, the cardinality of the scope
• a universal constraint allows every member of the relevant Cartesian product – equivalent to no constraint at all – though induced constraints may tighten it into something else
• VVP – Variable Value Pair
• make sure to implement check function, and to do so in a very independent manner
• keep track of the number of constraint checks, incrementing on every check
• will be writing abscom to parse
• Graph representation
• Macro representation
• Variables are represented as nodes (vertices)
• domains as node labels
• constraints as arcs (edges) between nodes
• hypergraphs allow for special constraint nodes, or use a bubble to connect them instead – the hyperedges
• relations in intension are defined by set-builder notation
• used when not practical/possible to list all tuples
• define types/templates of common constraints
• linear constraints
• all-diff constraints
• at most
• TSP-constraints
• cycle-constraints
• constraints implemented
• in extension
• set of tuples (list/table)
• binary matrix (bit-matrix)
• multi-valued decision diagrams
• in intension
• constrained decision diagrams
• predicate functions
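For a binary constraint in extension, the bit-matrix representation listed above can be sketched as follows (the domains and the not-equal relation are chosen for illustration):

```python
# Bit-matrix for a binary constraint in extension: rows index values of
# the first variable, columns index values of the second (illustrative).
dom_x = [1, 2, 3]
dom_y = [1, 2, 3]
# matrix[i][j] is True iff (dom_x[i], dom_y[j]) is an allowed tuple;
# here the relation is simply X != Y.
matrix = [[x != y for y in dom_y] for x in dom_x]

def check(x, y):
    """Consistency check by table lookup in the bit matrix."""
    return matrix[dom_x.index(x)][dom_y.index(y)]
```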

## Lecture 5

• examples of modeling
• can have temporal csp
• graph or map coloring – planar maps require at most 4 colors
• resource allocation – airline trip planning, list coloring
• product configuration – allowed components and what they require/forbid
• logic puzzles
• Constraint types
• algebraic constraints
• constraints of bounded difference
• coloring
• mutual exclusion
• difference constraints
• arbitrary constraints, must be made explicit
• Databases
• Join in DB is a CSP
• View materialization is a CSP
• Interactive systems
• data-flow
• graphical layout systems and animation
• graphical user interfaces
• Molecular biology

### Formal characterization

• macrostructures
• networks/constraint graphs, $$G = (V, E)$$, may be hypergraphs
• Micro-structure – includes every possible Variable Value Pair
• VVPs are linked iff they are consistent with each other (form a support)
• cliques of size $$n$$ are possible solutions
• however, finding such cliques is $$NP$$-complete
• Co-microstructure – includes every possible VVP
• VVPs are linked iff they are a no-good
• cliques of size $$n$$ prevent a given solution
• Complexity
• decision problem
• NP-complete by reduction from 3SAT
• a candidate solution can be verified in polynomial time, but no known algorithm generates one in polynomial time

### Proof of CSP into SAT

• CSP has Variables, Domains, Constraints
• Every VVP is a Boolean variable in SAT
• for each CSP variable, add a clause – a disjunction of its VVPs – to the conjunction, asserting that at least one value is chosen; for each pair of distinct VVPs of the same variable, add a clause $$\neg a \vee \neg b$$, asserting that at most one is chosen
• Constraints are then mapped to SAT
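The clause generation described above can be sketched directly; the tiny CSP and the tuple encoding of Boolean variables are assumptions made for illustration:

```python
# Each SAT variable is a VVP, encoded here as (csp_variable, value);
# a literal is (csp_variable, value, polarity). Tiny illustrative CSP.
domains = {"X": [1, 2], "Y": [1, 2]}
conflicts = {("X", "Y"): [(1, 1)]}  # forbidden tuples of one constraint

def encode(domains, conflicts):
    clauses = []
    for v, dom in domains.items():
        # At least one value per CSP variable: a disjunction of its VVPs.
        clauses.append([(v, a, True) for a in dom])
        # At most one value: (not a) or (not b) for each pair of VVPs.
        for i, a in enumerate(dom):
            for b in dom[i + 1:]:
                clauses.append([(v, a, False), (v, b, False)])
    # Each forbidden tuple of a constraint becomes a clause of negations.
    for (u, w), bad in conflicts.items():
        for (a, b) in bad:
            clauses.append([(u, a, False), (w, b, False)])
    return clauses

clauses = encode(domains, conflicts)
```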

## Lecture 6

### Solving a CSP

• by search
• constructive, systematic, exhaustive – involves enumerating all possible VVPs
• iterative repair, local search – pick an assignment and change unsatisfied constraints until all constraints are satisfied
• Before start
• consider the model/formulation – controls the size of the search space (defining a model is an art)
• pre-process
• i.e. constraint filtering/propagation, consistency checking – reduce the size of search space
• Constraint Checking
• arc consistency
• removes unsupported values from the domains of the variables involved
• Revise(B, C):
• $$\forall x \in D_B$$
• if support($$x$$, $$D_C$$), keep $$x$$ and continue
• else $$D_B \gets D_B \setminus \{x\}$$
• support function: scan the other domain; return true at the first supporting value found; if none exists, return false
• will not always find a solution, but will reduce problem size
• k-consistency
• consistency on $$k\leq n$$ variables
• Systematic search
• for every value of $$V_i$$, check consistency with previous assignments (starting from the bottom); cut off portions of the tree when the check fails (using back-checking to prune the search tree)
• back-checking expands a partial solution iff it is thus-far complete
• chronological backtracking – DFS with only the current path, undoes the most recent variable assignment and tries a new one – can be used for only one solution
• is complete – will always find a solution if it exists
• is sound / correct – a solution it gives will always be correct – will not give a false positive
• intelligent backtracking
• jump back based on the reason for failure – generally the deepest
• Ordering heuristics
• deciding what to expand first
• often most constrained variable and most promising value first
• strategies for variable/value ordering could be static/dynamic
• affects the branching factor – best to branch as little as possible
• generally variable ordering is dynamic, but value is static (lexicographic)
• keep in mind forward-checking – removing values from future domains that aren't consistent with the current assignment; emptying a domain this way is called domain annihilation – and detecting it early is more efficient
• Real Full Lookahead is used to further restrict domains and search spaces more than plain forward-checking
• balance between reasoning and search
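A minimal sketch of chronological backtracking with back-checking, under an assumed binary `check(var1, val1, var2, val2)` predicate (all names are illustrative):

```python
def backtrack(variables, domains, check, assignment=None):
    """Chronological backtracking: extend the partial solution one
    variable at a time, back-checking each value against past variables."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return dict(assignment)                # all variables assigned
    var = variables[len(assignment)]           # static variable ordering
    for value in domains[var]:
        # back-check the candidate value against every past assignment
        if all(check(var, value, past, assignment[past])
               for past in assignment):
            assignment[var] = value
            result = backtrack(variables, domains, check, assignment)
            if result is not None:
                return result
            del assignment[var]                # undo the most recent choice
    return None                                # dead end: backtrack above

# Usage: 2-coloring a path A-B-C with not-equal constraints.
neighbors = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}
def check(v1, a, v2, b):
    return a != b if (v1, v2) in neighbors else True

solution = backtrack(["A", "B", "C"],
                     {"A": [1, 2], "B": [1, 2], "C": [1, 2]}, check)
```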

## Lecture 7

### Arc Consistency

• revise should only modify one variable’s domain
• requires support, takes VVP and domain of other
• checks for VVP in X with every element of Y, returning true if found, false if none found
• check always increments if there’s a constraint
• revise should check if there’s a constraint to begin with
• VVPs can be represented as tuples, may use angle brackets
• AC-1:
• assume CSP has $$n$$ variables, having domain size $$d$$, and $$e$$ constraints
• $$Q$$ is the set of directed edges to revise – from $$x$$ to $$y$$ and from $$y$$ to $$x$$ for each constraint – of size $$2e$$
• set $$revised$$ to true
• repeat until $$revised$$ is false
• set $$revised$$ to false
• for every edge in $$Q$$
• if revising the edge changes a domain, set $$revised$$ to true
• but if a variable's domain becomes empty, break and report – this CSP has no solution
• This CSP is then arc-consistent
• Make sure to sort list of edges!
• Unary constraints should be done first!

## Lecture 8

| Property | Algorithm | Complexity |
| --- | --- | --- |
| AC | AC-1 | $$O(na^3e) = O(n^3a^3)$$ |
• solutions found by
• detecting components
• NC
• normalize constraints
• NC, AC, other forms of AC
• Search
• Verification
• Solution, Node Visits, Constraint Checks, CPU Time
• when implementing AC1, terminate if domain wipe-out occurs
• worst case complexity is seldom reached
• remember, AC may discover a solution to a CSP, or may find the problem inconsistent (by domain annihilation)
• AC3
• for $$i \gets 1$$ to $$n$$, do $$NC(i)$$
• $$Q \gets \{(i, j) | (i, j) \in arcs, i \neq j\}$$
• while $$Q$$ is non-empty:
• select and delete any arc $$(k, m) \in Q$$
• if $$\mathrm{Revise}(k, m)$$ then $$Q \gets Q \cup \{(i, k) | (i, k) \in arcs(G), i \neq k, i \neq m\}$$
• Check for domain wipe-out!
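The AC-3 loop above can be sketched in the same style (Revise is restated so the snippet stands alone; the dict layout and `check` signature are assumptions):

```python
def revise(csp, x, y):
    """Remove from D_x every value with no support in D_y."""
    removed = False
    for a in list(csp["domains"][x]):
        if not any(csp["check"](x, a, y, b) for b in csp["domains"][y]):
            csp["domains"][x].discard(a)
            removed = True
    return removed

def ac3(csp):
    """AC-3: only re-enqueue arcs aimed at a just-revised variable."""
    arcs = set(csp["arcs"]) | {(y, x) for (x, y) in csp["arcs"]}
    queue = set(arcs)
    while queue:
        (k, m) = queue.pop()                   # select and delete any arc
        if revise(csp, k, m):
            if not csp["domains"][k]:
                return False                   # domain wipe-out
            queue |= {(i, j) for (i, j) in arcs if j == k and i != m}
    return True

# Usage: same single-constraint CSP, X < Y over {1, 2, 3}.
csp = {
    "domains": {"X": {1, 2, 3}, "Y": {1, 2, 3}},
    "arcs": [("X", "Y")],
    "check": lambda v1, a, v2, b: a < b if (v1, v2) == ("X", "Y") else b < a,
}
```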

## Lecture 9

• Remember to deal with Node Consistency – and don’t include it in the check count
• see lecture 8 notes
• AC3 is not only more powerful, it is also more efficient than AC-1
• $$2e + a[2e - n]$$
• AC3 is $$O(a^2 (2e + a(2e - n))) = O(a^3e) = O(a^3n^2)$$
• AC4 is even more efficient, but requires special bookkeeping, is $$O(a^2n^2)$$
• Variant of AC-3 called AC-2001, but requires cubic space, has yet another variant, AC-3.1$$^{m}$$, requires different bookkeeping
• AC4
• generate m, s, and counter
• m and s have as many rows as there are VVPs in the problem, m is set to true, s to empty list
• m is for alive
• s is for a list of all VVPs that support a given VVP
• counter: given a VVP, the number of supports provided by a given variable
• prepare data structures
• for every constraint all tuples:
• when allowed, update s
• increment counter
• Then process all constraints
• if counter is 0, remove from domain, and update $$m$$
• since $$x$$ is removed, then for everything such that it was a support, decrement the counter
• AC3 tends to be a bit better than AC4
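The AC-4 bookkeeping described above – support counters, support lists, and a deletion queue – can be sketched as follows (the data layout is an assumption for illustration):

```python
from collections import defaultdict

def ac4(domains, arcs, check):
    """AC-4 sketch: count the supports of every VVP, then propagate
    deletions by decrementing the counters of the VVPs they supported."""
    counter = defaultdict(int)   # counter[(x, a, y)]: supports of (x,a) in D_y
    s = defaultdict(list)        # s[(y, b)]: list of VVPs that (y, b) supports
    queue = []
    directed = list(arcs) + [(y, x) for (x, y) in arcs]
    # Initialization: one pass over all tuples of every directed arc.
    for (x, y) in directed:
        for a in list(domains[x]):
            for b in domains[y]:
                if check(x, a, y, b):
                    counter[(x, a, y)] += 1
                    s[(y, b)].append((x, a))
            if counter[(x, a, y)] == 0:        # unsupported: delete it
                domains[x].discard(a)
                queue.append((x, a))
    # Propagation: a deleted VVP withdraws its support from others.
    while queue:
        (y, b) = queue.pop()
        for (x, a) in s[(y, b)]:
            if a in domains[x]:
                counter[(x, a, y)] -= 1
                if counter[(x, a, y)] == 0:
                    domains[x].discard(a)
                    queue.append((x, a))
    return all(domains.values())               # False on any wipe-out

# Usage: X < Y over {1, 2, 3} leaves D_X = {1, 2}, D_Y = {2, 3}.
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
ok = ac4(domains, [("X", "Y")],
         lambda v1, a, v2, b: a < b if (v1, v2) == ("X", "Y") else b < a)
```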

## Lecture 10

• soundness – can trust
• complete – will always find a solution if exists
• more efficient AC variants are useful for use during search
• but when enforcing AC during search, you must be able to keep track of what was removed
• remember properties are not the same as the algorithms! there may be many algorithms to implement each property!
• most consistency methods are local initially
• node/arc remove inconsistent values
• path removes inconsistent tuples
• used when you have positive tables of supports
• they get closer to the solution
• reduce the size of the problem
• are cheap (polynomial time)
• choice of consistency properties helps to reduce time to solution

### Intelligent Backtracking Algorithms

• Read Prosser, CI 93. Hybrid Algorithms for the Constraint Satisfaction Problem
• Consider Reading 5/6 of Dechter
• Issues
• Variable/Value ordering
• Variable Instantiation
• Current Path
• Current Variable
• Past Variables
• Future Variables
• Shallow/Deep Levels/Nodes
• Search space/Search Tree
• Back-checking
• Backtracking
• As you go, un-assigned variables are the future variables/future sub-problem
• those that have been instantiated are the past variables
• the current path is current instantiations
• always check a new instantiation against past variables (back-checking)
• algorithms
• Vanilla: BT
• Improved Backward: BJ (back-jumping), CBJ (conflict-directed BJ)
• Improving Forward: BM (back-marking), FC (forward-checking)
• D-Way branching is conceptually easier to understand than 2-way
• algorithms from Prosser are iterative back-tracking and pluggable/modularized
• Kondrak and van Beek looked at the efficiency of these algorithms

## Lecture 11

• Variables $$V_i$$, $$i \in [1, n]$$
• Domains $$D_i = \{v_{i1}, v_{i2}, \ldots, v_{iM_i}\}$$
• Constraint between $$V_i$$ and $$V_j$$ : $$C_{i,j}$$
• Constraint graph $$G$$
• Arcs of $$G$$: $$\mathrm{Arc}(G)$$
• Instantiation order is static or dynamic
• lang primitives are from lisp
• Data structures
• $$v$$ is a 1 by $$n$$ array to store assignments; $$v_0$$ is the root, and backtracking to it indicates unsolvability; $$v_i$$ is the assignment to the $$i$$th variable
• $$domain$$ is the original domain
• $$current-domain$$ is the current domain, must be updated on back-tracking
• check is as before
• Generic form
• takes n, status
• set consistent to true
• status to unknown
• $$i \gets 1$$
• while the status is unknown
• if consistent
• $$i \gets \mathrm{label}(i, consistent)$$
• else $$i \gets \mathrm{unlabel}(i, consistent)$$
• if $$i > n$$
• set status to solution
• else if $$i = 0$$, set status to impossible
• label and unlabel functions are provided by the backtrack algorithm
• label is forward move, unlabel is backward move
• For BT
• label
• when $$v_i$$ is assigned a value from the $$current-domain_i$$, we perform back-checking against past variables
• if back-checking succeeds
• return $$i + 1$$
• else, remove assigned value from current-domain[i], assign next value, etc
• if no value exists, set consistent to false
• function
• takes i, consistent
• set consistent to false
• for v[i] in each element of the current domain while not consistent
• set consistent to true
• for h from 1 to i - 1 and while consistent
• set consistent to check(i, h)
• if not consistent
• then remove i from current-domain
• if consistent, return i+1, otherwise i
• unlabel
• current level is set to $$i - 1$$
• for future variables $$j$$, reset the current-domain
• if the domain of $$h$$ is not empty, set consistent to true
• Function
• takes i, consistent
• set h to i - 1
• set current-domain[i] to domain[i]
• remove v[h] from current-domain[h]
• set consistent to current-domain[h] not-equal empty-set
• return h
• label does the actual instantiation
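The generic loop with bt-label and bt-unlabel above can be sketched as below; 1-based indexing with a dummy entry at index 0 follows the notes, and the `check` predicate is assumed:

```python
def bcssp(n, domains, check):
    """Generic search loop: label moves forward, unlabel moves back.
    domains is indexed 1..n (index 0 is a dummy for the root)."""
    v = [None] * (n + 1)
    current = [list(d) for d in domains]       # current-domain copies
    consistent, status, i = True, "unknown", 1
    while status == "unknown":
        if consistent:
            i, consistent = bt_label(i, v, current, check)
        else:
            i, consistent = bt_unlabel(i, v, current, domains)
        if i > n:
            status = "solution"
        elif i == 0:
            status = "impossible"
    return status, v[1:]

def bt_label(i, v, current, check):
    """Try values of variable i, back-checking against past variables."""
    for value in list(current[i]):
        v[i] = value
        if all(check(i, value, h, v[h]) for h in range(1, i)):
            return i + 1, True
        current[i].remove(value)               # value failed: drop it
    return i, False

def bt_unlabel(i, v, current, domains):
    """Chronological backward move: undo the most recent assignment."""
    h = i - 1
    current[i] = list(domains[i])              # restore current-domain[i]
    if h >= 1:
        current[h].remove(v[h])                # discard the failed value
        return h, len(current[h]) > 0
    return h, False

# Usage: 3 variables, values {1, 2}, consecutive variables must differ.
check = lambda i, a, h, b: a != b if abs(i - h) == 1 else True
status, assignment = bcssp(3, [[], [1, 2], [1, 2], [1, 2]], check)
```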

## Lecture 12: Phase Transitions

• Cheeseman 1991
• Order parameter
• Probability solution exists for random problems is almost 1
• there’s a critical value whereby, after it, probability is almost 0
• around the critical value, the probability is around 0.5
• the cost of solving is high as problems approach the critical value, and drops sharply further to the right of it
• around probability 0.5 the cost of solving is high no matter the algorithm – this is referred to as a phase transition
• conjecture regarding the characterization of NP complete problems
• applies to detecting/implementing arc-consistency
• random graphs are almost always easy to color – conjecture from a famous paper
• graph coloring is a reduction operator
• friends are vertices with the same neighborhood
• this is the degree
• For CSPs, it’s either density or tightness
• this affects how experiments are conducted – try to deal with the hardest problems
• but be careful not to focus exclusively on the region around the critical value
• Run on random CSPs – given statistical analysis
• Vary params $$\langle n, a, t, p \rangle$$
• $$n$$ – number of variables
• $$a$$ – domain size
• $$t$$ – tightness $$\frac{|\mathrm{forbidden\ tuples}|}{|\mathrm{all\ tuples}|}$$
• $$p$$ – proportion of constraints $$p = \frac{e}{e_{max} - e_{min}}$$, also $$\frac{2e}{n(n-1)}$$
• $$t$$ and $$p$$ will be between 0 and 1
• can have issues with uniformity, difficulty (phase transition), solvability
• empirical studies can be done simply by varying one of the parameters
• to do it (Model B):
1. generate $$n$$ nodes
2. generate a list of $$\frac{n(n-1)}{2}$$ tuples of all combinations of 2 nodes
3. choose $$e$$ elements from the above list as constraints to go between the $$n$$ nodes
4. if the graph is not connected, throw it away and go back to step 3, otherwise proceed
5. generate a list of $$a^2$$ tuples of all combinations of 2 values
6. for each constraint, choose randomly a number of tuples from the list to guarantee tightness $$t$$ for the constraint
• Model B instances may be flawed, but it's rare

## Lecture 13: Back-tracking: Continued

• keep track of the path – an array of the instantiations made thus far and still to be made
• remember what unlabel does – it actually performs the backtracking
• many different ways to order variables
• when doing value ordering, use lexicographic ordering!
• eventually, variable ordering heuristics must have ties broken with lexicographic ordering

### Back-Jumping

• uses $$max-check$$, an array; when check(i, h) succeeds, $$max-check[i] \gets \max(max-check[i], h)$$
• if the current domain of $$h$$ is empty, chronological back-tracking is performed from $$h$$
• you go back to $$max-check[i]$$ to fix things when variable $$i$$ has an empty domain, unless you've already back-jumped on the current path
• BJ just modifies BT-label and BT-unlabel
• bj-label just updates $$max-check$$
• bj-unlabel
• backtracks to $$h = max-check[i]$$
• resets $$max-check[j] = 0$$ for $$j$$ in $$h+1$$ to $$i$$

### Conflict-directed BJ

• requires a new data structure, the conflict-set array
• tracks at what levels conflicts occur
• knows how to jump back again, to the deepest level of conflict
• can jump back more than once!
• of little use under good variable ordering
• conflict sets are initialized to $$\{0\}$$
• at any point, conf-set[i] is a subset of the past variables that are in conflict with $$i$$
• when check(i, h) fails, add $$h$$ to conf-set[i]
• when the domain is empty
• jump to the deepest past variable $$h$$ in conf-set[i]
• update $$conf-set[h] \gets conf-set[h] \cup conf-set[i] \setminus \{h\}$$
• to compute all solutions – don't use CBJ

## Lecture 14: Back-tracking: Continued

### CBJ Continued

• max-check – the deepest level checked against – if instantiated, will be $$i - 1$$
• the conflict set holds what is in conflict
• finding all solutions with CBJ
• use the conflict sets
• or use CBF from Kondrak
• keep track of the number of solutions
• make sure to check generated solutions with the solution checker!
• modify the algorithm so that after a solution, it backtracks to every level at least once
• in actuality, each level must be backtracked to exactly once
• look at Chris Thiel's response
• when a solution is found, force the last variable to conflict with everything before it.
In turn, this forces some chronological backtracking as the conf-sets are propagated backward
• Kondrak's response
• a cbf flag, a vector
• $$\forall i, cbf[i] \gets 0$$
• when you find a solution, $$\forall i, cbf[i] \gets 1$$
• in unlabel
• if $$cbf[i] = 1$$
• $$h \gets i - 1$$, $$cbf[i] \gets 0$$
• else $$h \gets \max(conf-set[i])$$

## Lecture 14: Forward Steps

### Backmarking

• tries to reduce the amount of consistency checking
• suppose $$v[i]$$ is to be reassigned the value $$k$$
• $$v[i] \gets k$$ was last checked against $$v[h] \gets g$$
• and $$v[h]$$ has not been modified since
• not super useful on its own – consistency enforcement is more useful
• this helps to reduce back-checking
• data structures: max back-check level, minimum backup level
• $$mcl$$, max check level, an $$n \times m$$ array; $$mcl[i, k]$$ stores the deepest variable that $$v[i] \gets k$$ was checked positively against – a finer-grained max-check
• $$mbl$$, min backup level, an $$n$$ array; gives the shallowest past variable whose value has changed since $$v[i]$$ was last the current variable
• BM does not allow dynamic variable ordering
• suppose the deepest variable that $$v[i] \gets k$$ was checked against is $$v[j]$$
• the values of the variables in the past of $$v[j]$$ ($$h < j$$) have not changed
• so we need to check $$v[i] \gets k$$ against the values of the variables between $$v[j]$$ and $$v[i]$$
• we do not need to check against the values of the variables in the past of $$v[j]$$
• when $$mcl[i,k] < mbl[i]$$: don't check – it fails again! – type A savings
• when $$mcl[i,k] \geq mbl[i]$$: don't re-check the levels before $$mbl[i]$$ – they still succeed – type B savings

### Forward checking

• looks ahead from the current variable, considering all future variables, and clears from their domains the values that aren't consistent with the current partial solution
• FC means more work at each instantiation, but causes fewer nodes to be expanded
• when it moves forward, the values in the current domains of future variables are all compatible with past assignments – saves back-checking
• FC may cause a wipe-out, but discovers conflicts early on.
• it will then backtrack
• the goal is to find failures early – avoids expanding fruitless subtrees
• it is not truly enforcing arc consistency – that's RFL
• requires a data structure to keep track of the filtering, which can be done in various ways

## Lecture 15: Forward Steps, Continued

### Forward Checking

• look-ahead – revises future variables
• does more work at instantiation, but expands fewer nodes
• as you move forward, values in future variables are consistent with past variables
• can cause wipe-out and conflict discovery early on, backtracking chronologically
• requires current-domain and a stack of values removed by previous instantiations
• future-fc is the subset of future variables that $$v[i]$$ checks against
• past-fc is the set of past variables that checked against $$v[i]$$
• requires functions
• check-forward – called when instantiating $$v[i]$$
• revise(j, i)
• returns false if current-domain[j] is empty, true otherwise
• values removed from current-domain[j] are pushed as a set onto reductions[j]
• update-current-domain
• $$current-domain[i] \gets domain[i] \setminus reductions[i]$$
• fc-label
• attempts to instantiate the current variable
• filters the domains of future variables
• on wipe-out of a future variable, uninstantiate and undo the domain filtering
• suffers from thrashing, as it is based on BT

### Variable Ordering

• dom – least domain, smallest domain first
• deg – degree of a variable (number of neighbors)
• dom/deg is most common
• degree is difficult to maintain, as it changes very rapidly
• can/should be implemented statically

### Consistency Check Summary

• CBJ
• uses backtracking, no additional structures
• backmarking uses mcl and mbl
• two types of consistency-checking savings
• forward checking
• works more at each instantiation, but expands fewer subtrees
• uses reductions, future-fc, past-fc
• FC-CBJ is the best.
Period.
• other backtracking algorithms
• graph-based back-jumping – Dechter
• pseudo-trees [Freuder, 85]
• other lookahead – DAC, MAC – DAC seems to be a good compromise between MAC and FC
• more empirical evaluations
• now by generating random problems
• theoretical comparisons

### Implementation notes

• enforce NC
• normalize constraints
• check for empty relations
• interrupt as soon as you detect a domain wipe-out
• remember dynamic variable ordering includes the domino effect
• a/k/a pick singleton domains first

## Lecture 15: Intelligent Backtracking – Theoretical Eval

• Kondrak and van Beek, IJCAI 95
• Dechter, chapters 5 and 6
• most backtracking algorithms are evaluated empirically
• performance depends heavily on the problem instance
• it is therefore average-case analysis
• representativeness of the test problems matters
• CSP is NP-complete – it is always possible to construct examples where BJ/CBJ are worse than BT
• comparison criteria differ
• the paper gives a theoretical approach
• compares by number of nodes visited
• and by number of constraint checks
• BT and CBJ perform the same number of consistency checks
• BM reduces the number of consistency checks
• BM does not affect the number of nodes visited
• FC-CBJ may visit more nodes than CBJ
• proves the correctness of BJ/CBJ
• determines a partial order among the algorithms
• proves BJ and CBJ are correct
• proves FC never visits more nodes than BJ
• improves BMJ and BM-CBJ to perform even fewer consistency checks
• provides a framework for characterizing future BT algorithms
• BT extends partial solutions – a partial solution is consistent with a set of uninstantiated variables if it can be consistently extended to those variables – that is, one can create a new partial solution extended by one variable s.t.
it is still consistent
• dead end – when all values of the current variable are rejected
• lower levels are closer to the root
• higher levels are closer to the fringe
• two BT algorithms are equivalent if on every CSP they generate the same tree and perform the same consistency checks
• Lemma 1: if BJ backtracks to variable $$x_h$$ from a dead end at $$x_i$$, then the partial solution $$(a_1, \ldots, a_h)$$ cannot be consistently extended to $$x_i$$

## Lecture 16: Theoretical Eval, Continued

• sufficient conditions
• BT visits a node if its parent is consistent
• BJ visits a node if its parent is consistent with all variables
• CBJ visits a node if its parent is consistent with all sets of variables
• FC visits a node if it is consistent and its parent is consistent with all variables
• necessary conditions
• BT visits a node only if its parent is consistent
• BJ visits a node only if its parent is consistent with all variables
• CBJ visits a node only if its parent is consistent with all sets of variables
• FC visits a node only if it is consistent and its parent is consistent with all variables
• BT visits all nodes FC visits
• BJ visits all nodes CBJ visits
• CBJ and FC are not comparable.
• FC will visit no more nodes than BJ
• BT and BM are equivalent, as are BJ and BMJ
• proof of correctness – termination, soundness, completeness
• extends to seeking all solutions
• the BM hybrids should be fixed by changing mbl to an $$n \times m$$ array, creating BM2, BMJ2, and BM-CBJ2

## Lecture 17: Search orders in CSPs

• assumptions
• finite domains
• binary constraints
• Chapter 6 of Tsang is required reading
• "Dual viewpoint heuristics for binary constraint satisfaction problems" is recommended reading
• in BT, some orders incur fewer backtracks
• in lookahead, some orderings detect failure earlier
• variable ordering is often the most constrained variable first
• value ordering is often the most promising value first
• using dynamic ordering in lookahead algorithms often removes the need for intelligent backtracking
• Bessière and Régin show that CBJ and similar aren't needed – but not so sure
• select values that don't cause constraint violations – take the most promising value first, avoiding constraint violation
• discover constraint violations quickly – select variables that don't delay the discovery of constraint violations; go with the most constrained variable first
• notation: for a given future variable $$V_j$$, the assignment $$V_i \gets x$$ partitions the domain of $$V_j$$ into a set of values consistent with the assignment, of size $$\mathrm{Left}(V_j | V_i \gets x)$$, and a set of inconsistent values, of size $$\mathrm{Lost}(V_j | V_i \gets x)$$
• value selection uses Lost and Left
• value ordering to quickly find a solution
• min-conflict – orders values according to the conflicts in which they are involved with the future variables – most popular (Minton): $$Cost(V_i \gets x) = \sum_{j \neq i} \mathrm{Lost}(V_j | V_i \gets x)$$
• cruciality (Keng and Yu, 1989): $$Cruciality(V_i \gets x) = \sum_{j \neq i} \mathrm{Lost}(V_j | V_i \gets x) / |D_{V_j}|$$
• promise – most powerful (Geelen, 1992): $$Promise(V_i \gets x) = \prod_{j \neq i} \mathrm{Left}(V_j | V_i \gets x)$$, the
number of assignments that can be done such that no constraint on $$V_i$$ is broken
• others are defined
• promise
• recognizes domain wipe-out
• the product discriminates better than the sum
• the semantics of promise gives an upper bound on the number of solutions
• the FFP (fail-first principle) name is contested
• the goal is to recognize dead-ends asap to save search effort
• choose an ordering that reduces the branching factor
• early ones
• LD
• maximal degree
• minimal ratio of domain size over degree
• promise for variables
• Brélaz heuristic
• weighted degree (Boussemart et al., ECAI 2004)
• graph-based
• minimal width
• induced width ordering $$w^*$$
• maximal cardinality ordering
• minimal bandwidth ordering
• remember to exploit the domino effect, especially under dynamic variable ordering
• always state how you are breaking ties – generally lexicographically
• these heuristics are often cheap and effective
• in most cases, suitable for both static and dynamic ordering
• promise for variables – the sum of the promises of the values; minimize the number of assignments that can be done such that no constraint on $$V_i$$ is broken.
This gives an upper bound on the number of solutions.

## Lecture 18: Ordering Heuristics

• Least Domain
• Max Domain
• Domain/Degree
• promise is based on what has the most promise given the current state
• Brélaz heuristic – originally used for graph coloring
• remember, variable ordering heuristics are cheap and effective
• generally suitable for static and dynamic ordering
• graph-based heuristics aren't often used anymore
• Brélaz
• originally designed for coloring
• assigns the most constrained nodes first
• arrange the variables in decreasing order of degree
• color a vertex of maximal degree with color 1
• choose a vertex with maximal saturation degree (number of different colors to which it is adjacent)
• if there's an equality, choose any vertex of maximal degree among the uncolored
• color the chosen vertex with the lowest-numbered color
• if all vertices are colored, stop; otherwise, choose another vertex and continue
• wdeg
• all constraints are assigned a weight, initially 1
• every time a constraint is broken during propagation with look-ahead (the constraint causing a domain wipe-out), its weight is increased
• the weight of an unassigned variable is defined as the sum of the weights of the constraints that apply to it
• the variable with the largest weight is chosen for instantiation
• weights are not reinitialized at backtrack
• refined with dom/wdeg
• inspired by the breakout heuristic in local search
• however, there may be cases where dom/deg is more efficient than dom/wdeg
• a decay function may help improve this
• graph heuristics
• ordering by elimination or instantiation
• minimal width
• given an ordering, the width of a node is the number of its parents
• the ordering's width is the largest width of any of its nodes
• the sum of the degrees is always twice the number of edges
• take the minimum width over all orderings – this is the width of the graph
• this can reduce the chance of backtracking, and an algorithm exists to find a minimum-width ordering
• algorithm:
• remove all nodes not connected to
any others • $$k \gets 0$$ • while no nodes left in the graph • $$k \gets 1$$ • while there are nodes not connected to more than k others • remove such nodes from graph, along with any edges connected to them • return $$k$$ • minimal width ordering of the nodes is obtained by taking the nodes in the reverse order that they were removed • Inducted Width Ordering • What happens when you reconnect neighbors of what you remove to each other ## Guest Speaker: Philip Walton, Workday • From Workday, SaaS company for HR management • work has primarily been in Data Analysis and optimization • go from techniques and technologies to money • Could be a Linear Program • Measures, Items (index sets) • indices i, m • variables Quantity, 0 to N • model sets, limit, value, weight • maximize value and quanitity • but must be withinn weight • big thing is assigning resources in a data center • truck scheduling • many truck loads that a company has to move • origin and destination specified • and time windows at origin and destination • many drivers, many trucks • some time horizon • DOT rules • operational rules • returning to base • equipment preventative maintenance • Taking a run at the whole thing is probably too much • run a rout model to pick loads that go togethr • schedule loads – the big thing for SCPs • drivers pick the sets of loads they prefer • modeling • discrete time forms an integer domain • can be clumpy • use start time of the task as the domain • constraints like before and after are relatively straightforward • duration of driving/time windows are prety easy to express • small disruptions are normally readily repaired by freeing up parts of the model and letting it re-solve • time windows tend to be pretty large • isn’t always a notion of an optimal schedule • side constraints can be pretty easy to represent • work rules aren’t always easy • drivers may have odd drive/sleep patterns • side resource use is tolerable • Robustness for many things has to be dealt 
with, but how that's done is very different
• And remember how important the interface is – they're both important, and very, very different
• And even more important is understanding what's going on within the problem domain
• good large software is built on top of amazing small software

## Lecture 19: Variable Orderings, continued

### Induced Width Ordering, $$\omega^{*}$$

• requires adding induced constraints – can cause much, much less backtracking
• these are called fill-in edges
• this procedure moralizes the graph
• computing the minimum induced width is NP-hard
• the MIN-FILL heuristic is used
• when removing a node, add an edge between every two nodes connected to the node but not already connected to each other
• remove the node that, after removal, has the smallest number of fill-in edges
• Remember to remove the node with the smallest count first!
• A tree is a graph with induced width 1
• An ordering has a width, as does a graph. Graph width does not depend on any one ordering, and neither does induced width; $$\omega^{*}$$ is called the induced width or tree width.
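The MIN-FILL elimination heuristic described above can be sketched in a few lines. This is a minimal illustration, not the lecture's reference code: `fill_in_count` and `min_fill_ordering` are hypothetical names, and the graph is assumed to be a dict mapping each node to a set of its neighbors.

```python
def fill_in_count(graph, node):
    """Number of fill-in edges needed to eliminate `node`:
    pairs of its neighbors that are not already connected."""
    nbrs = list(graph[node])
    return sum(1 for i in range(len(nbrs)) for j in range(i + 1, len(nbrs))
               if nbrs[j] not in graph[nbrs[i]])

def min_fill_ordering(graph):
    """Repeatedly eliminate the node needing the fewest fill-in edges,
    connecting its remaining neighbors to each other (moralization)."""
    g = {v: set(ns) for v, ns in graph.items()}  # work on a copy
    order = []
    while g:
        v = min(g, key=lambda n: fill_in_count(g, n))
        nbrs = list(g[v])
        # add the fill-in edges among v's neighbors
        for i in range(len(nbrs)):
            for j in range(i + 1, len(nbrs)):
                g[nbrs[i]].add(nbrs[j])
                g[nbrs[j]].add(nbrs[i])
        for n in nbrs:
            g[n].discard(v)
        del g[v]
        order.append(v)
    return order
```

On a tree, every leaf has fill-in count 0, so the heuristic peels leaves off first and no fill-in edges are ever added.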
• triangulated (chordal) graphs are graphs in which every cycle of length 4 or more has a chord – an edge connecting two non-consecutive vertices of the cycle
• Makes it easy to find the largest clique; can actually be done in polynomial time
• a type of perfect graph
• interval graphs are a type of triangulated graph
• finding the minimum number of triangulating edges is also NP-hard

### Maximal Cardinality ordering

• Choose an arbitrary node
• among remaining nodes
• choose the one connected to the maximum number of already chosen nodes
• break ties arbitrarily
• Repeat until done
• Obtained ordering is an instantiation ordering
• reversing the ordering yields a PEO
• this is the perfect elimination ordering when reversed
• If the graph does not need to be moralized, the graph is triangulated

## Lecture 20: Variable Orderings, continued

### Max Cardinality, Continued

• width of an induced ordering is the same as the graph's
• MWO can help to guarantee backtrack-free search
• Lots of algos came from RDBMSs
• Computing induced width is NP-hard!!!
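The max-cardinality ordering above is a one-liner per step: repeatedly pick the node adjacent to the most already-chosen nodes. A minimal sketch, assuming the graph is a dict of neighbor sets and `max_cardinality_ordering` is an illustrative name:

```python
def max_cardinality_ordering(graph, start):
    """Start anywhere; at each step pick the node connected to the
    largest number of already-chosen nodes (ties broken arbitrarily)."""
    chosen = [start]
    remaining = set(graph) - {start}
    while remaining:
        v = max(remaining, key=lambda n: len(graph[n] & set(chosen)))
        chosen.append(v)
        remaining.remove(v)
    # reversing `chosen` yields a perfect elimination ordering
    # when the graph is triangulated
    return chosen
```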
• Simplicial – looking at an elimination ordering, all parents of a node form a clique
• moralized is triangulated

### Minimal Bandwidth Ordering

• bandwidth
• of a node – largest distance between the node and its neighbors in the ordering
• of an ordering – largest bandwidth of the nodes in the ordering
• of the graph – smallest bandwidth across all possible orderings
• heuristic localizes and confines backtracking
• the smaller the bandwidth, the sooner you can backtrack
• finding a minimum bandwidth ordering is NP-hard
• but finding an ordering of a given bandwidth is polynomial
• Tsang 6.2.2, only one paper

### Ordering Heuristics

• When
• static var, val ordering
• dynamic var, static val
• dynamic var, dynamic val – dynamic VVP
• at each given level
• VVPs pertaining to the same variable across a given level – static-static
• vars are different – dyn-static
• VVPs are different at each and every level
• Computing
• sort vars at preprocessing for static
• dynamic selects the var during search, based on current domain, all uninstantiated neighbors; exploits the domino effect!!
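The three bandwidth notions above nest naturally (node, then ordering, then graph). A minimal sketch of the first two, assuming a dict-of-neighbor-sets graph and illustrative function names:

```python
def node_bandwidth(graph, order, node):
    """Largest distance, in the ordering, between `node` and a neighbor."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[node] - pos[n]) for n in graph[node]), default=0)

def ordering_bandwidth(graph, order):
    """Largest node bandwidth over the whole ordering; the graph's
    bandwidth would be the minimum of this over all orderings."""
    return max(node_bandwidth(graph, order, v) for v in order)
```

For a path a–b–c, the natural ordering [a, b, c] has bandwidth 1, while [b, a, c] has bandwidth 2 – the smaller the bandwidth, the closer culprits stay to the current variable during backtracking.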
## Lecture 21: Filtering For Constraints of Difference

• GAC – Generalized Arc Consistency, for non-binary CSPs
• Operates on table constraints, by supports or by no-goods
• $$\pi_A(R)$$ gives a new relation with only the values of A
• May be a set, or a bag
• finite CSPs are linear in space, quadratic in time when dealing with all-diff constraints
• Sometimes an all-diff may be decomposed into many constraints that together are equivalent, but not always
• GAC4 handles n-ary constraints
• has good pruning power
• but is very expensive – $$\frac{d!}{(d-p)!}$$
• a value is consistent if there exist values for all the other variables such that, together with the given value, they simultaneously satisfy the constraint
• a constraint is consistent if all values of all its variables are consistent
• a CSP is arc-consistent if all its constraints are consistent
• a CSP is diff-arc-consistent iff all its all-diff constraints are arc-consistent
• value graphs are bipartite
• vertices are $$X_C$$ and $$\cup_{x \in X_C}(D_x)$$
• edges $$(x_i, a)$$ iff $$a \in D_{x_i}$$
• Space is $$\mathcal{O}(nd)$$
• A matching is a subset of edges in G with no vertex in common
• a max matching is the biggest possible
• a matching covers a set $$X$$ – every vertex in $$X$$ is an endpoint of an edge in the matching
• An $$M$$ that covers $$X_C$$ is a max matching.
• If every edge in $$GV(C)$$ is in a matching that covers $$X_C$$, $$C$$ is consistent.
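The matching-based consistency test above can be sketched with a simple augmenting-path matching (the lecture mentions max-flow or Hopcroft/Karp; this plain version has the same result, just slower). `max_matching` and `alldiff_consistent` are illustrative names:

```python
def max_matching(domains):
    """domains: dict variable -> iterable of values (the value graph).
    Returns a dict value -> matched variable (augmenting-path method)."""
    match = {}  # value -> variable

    def try_assign(var, seen):
        for val in domains[var]:
            if val in seen:
                continue
            seen.add(val)
            # take val if it's free, or if its current variable
            # can be re-matched to some other value
            if val not in match or try_assign(match[val], seen):
                match[val] = var
                return True
        return False

    for var in domains:
        try_assign(var, set())
    return match

def alldiff_consistent(domains):
    """The all-diff is consistent iff a maximum matching covers X_C,
    i.e. every variable gets its own value."""
    return len(max_matching(domains)) == len(domains)
```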
• A CSP is diff-arc-consistent iff, for every all-diff $$C \in \mathcal{C}$$, every edge of $$GV(C)$$ belongs to a matching that covers $$X_C$$ in $$GV(C)$$
• Build the graph, compute a max matching; if the size of the max matching is smaller than $$|X_C|$$, return false; otherwise remove edges from G given the max matching and return true
• use Maximal Flow in a bipartite graph
• Or the Hopcroft/Karp algo
• Then we have an all-diff constraint, value graph, and a maximum covering
• remove edges that aren't in any matching covering
• given a matching M
• matching edge – an edge in M
• free edge – an edge not in M
• matched vertex – incident to a matching edge
• free vertex – otherwise
• alternating path (cycle) – a path (cycle) whose edges are alternately matching and free
• flip colors on an even alternating path – gives another matching of the same size
• flip colors on an alternating cycle – another matching
• length of a path – number of edges in the path
• a vital edge belongs to every maximum matching
• An edge belongs to some but not all maximum matchings iff, for an arbitrary maximum matching $$M$$, it belongs to either an even alternating cycle or an even alternating path that begins at a free vertex
• Edges to remove are those in no maximum matching: not vital, not on an even alternating path starting at a free vertex, and not on an even alternating cycle.
• every directed cycle in $$G_0$$ corresponds to an even alternating cycle of $$G$$
• every directed simple path starting at a free vertex corresponds to an even alternating path of $$G$$ starting at a free vertex
• make black edges point one way, red the other
• allows for all-diff and other global constraints

## Lecture 22: CSPs and Relational DBs

• connections
• constraint databases and constraint logic programming
• query processing and solving CSPs
• shared
• constraint DBs (deductive DBs, Datalog) and constraint logic programming share representation
• relational databases and constraint satisfaction share computations
• Constraint databases use LP – Prolog for DBs, Datalog
• Prolog was a bit of a failed experiment
• CLP(R,F) is Prolog with CSP techniques for either reals or finite domains
• Relations
• binary relation – given $$D_a$$ and $$D_b$$, any set of 2-tuples $$\langle x, y \rangle$$ with $$x \in D_a$$ and $$y \in D_b$$ defines a relation $$R_{ab}$$: $$R_{ab} = \{(x, y)\} \subseteq D_a \times D_b$$
• Function – a binary relation such that for each $$x$$ there is at most one tuple $$\langle x, y \rangle \in R_{ab}$$. $$D_a$$ is the domain, $$D_b$$ is the co-domain.
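The relation and function definitions above map directly onto Python sets of pairs. A minimal sketch (the helper name `is_function` is illustrative):

```python
def is_function(relation):
    """A binary relation (set of (x, y) pairs) is a function iff
    no x appears with two different y values."""
    seen = {}
    for x, y in relation:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True
```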
• $$k$$-ary relations – relations on $$k$$ elements
• representation
• 2-dimensional binary array, though k dimensions work
• tables
• linked lists of pairs
• hash maps of sets
• many other options
• Terms (DB – CSP)
• table, relation – constraint
• arity of relation – arity of constraint
• attribute – variable
• value of attribute – value of variable
• domain of attribute – domain of variable
• tuple in table – tuple in constraint/allowed by constraint/consistent with constraint
• Relational algebra
• intersection
• input – two relations of the same scope
• output – a new, more restrictive relation with the same scope, made of the tuples in all the input relations simultaneously
• logical and
• union
• input – two relations of the same scope
• output – a less restrictive relation with the same scope, made of the tuples that are in any input relation
• logical or
• difference – same as set theory
• selection
• projection
• join
• composition (from CSP)

## Lecture 23: CSPs and Relational DBs, continued

• Selection and projection are unary operators
• Selection
• Input – a relation $$R$$ and a predicate on attributes of $$R$$
• Output – a relation $$R^\prime$$ with the same scope as $$R$$ but having only a subset of the tuples in $$R$$
• Row selection
• $$\sigma_{p}(R)$$
• Projection
• Input – a relation $$R$$ and a subset $$s$$ of the scope
• Output – a relation $$R^\prime$$ of scope $$s$$ with the tuples rewritten such that positions not in $$s$$ are removed
• column elimination; in CSP set semantics, in DB bag semantics
• $$\pi_{\{x_1, x_2\}}(R)$$
• Join (natural join)
• Input – $$R$$ and $$R^\prime$$
• Output – a relation $$R^{\prime\prime}$$ whose scope is the union of the scopes of $$R$$ and $$R^\prime$$ and whose tuples satisfy both
• if they have no common attribute – Cartesian product
• if they have an attribute in common, compute all possibilities
• Computes all solutions to a CSP
• $$R \bowtie R^\prime$$
• Composition of two relations
• Input – two binary relations $$R_{ab}$$ and $$R_{bc}$$ with one variable in common
• Output – a new induced relation $$R_{ac}$$, to be combined by intersection with any pre-existing relation between them
• as a bit-matrix op, it is matrix multiplication
• Generalization as constraint synthesis
• Produces induced relations, a/k/a indirect, inferred, implicit
• Given $$V_1$$ and $$V_2$$ and $$C_1$$ and $$C_2$$ between them, how do $$C_1 \cap C_2$$ and $$C_1 \bowtie C_2$$ relate? – They'll be the same in this situation
• Given $$V_1$$, $$V_2$$, $$V_3$$, $$C_{V_1, V_2}$$ and $$C_{V_2, V_3}$$, write the induced $$C_{V_1, V_3}$$ in relational algebra. – $$C_{V_1, V_3} = \pi_{V_1, V_3}(C_{V_1, V_2} \bowtie C_{V_2, V_3}) = C_{V_1, V_2} \circ C_{V_2, V_3}$$
• Given $$V_1, V_2, V_3$$, $$C_{V_1, V_2}, C_{V_2, V_3}, C_{V_1, V_3}$$, $$C^{\prime}_{V_1, V_3} = (C_{V_1, V_2} \circ C_{V_2, V_3}) \cap C_{V_1, V_3}$$
• Natural join – synthesized constraint
• left/right join – synthesized constraint with some inconsistent tuples
• Differences
• number of relations
• DBs have few
• CSPs have many
• size of relations
• DBs have high arity
• CSPs are mostly binary
• bounds of relations
• DBs are very tight
• CSPs are loose
• size of domains
• DB large
• CSP small
• type of domain
• almost always finite, but CSP may be continuous
• number of solutions sought
• DBs look for almost all
• CSPs generally look for only one
• cost model
• DB – I/O, memory
• CSP – computation

## Lecture 24: Path Consistency and Global Consistency

• consistency helps to reduce the possible search space
• A path of length $$m$$ is consistent iff for any value $$x \in D_{V_0}$$ and any value $$y \in D_{V_m}$$ that are consistent, there exists a sequence of values in the domains of the variables $$V_1, V_2, \ldots, V_{m - 1}$$ such that all constraints between them (along the path, not across) are satisfied
• A CSP is path consistent if every path of any length is consistent
• However, by Mackworth AIJ77, only paths of length 2 are necessary, $$\mathcal{O}(n^3)$$
• the algo looks at every vertex as a mid-point $$k$$, and makes $$i$$ and $$j$$ together consistent given
$$k$$,
• and $$R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$$
• Closure algorithm. $$k$$ is the outer loop
• Until fixed point
• for k from 1 to n
• for i from 1 to n
• for j from 1 to n
• $$R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})$$
• Algo
• $$Y^0 \gets R$$
• Until $$Y^n = Y^0$$
• For k from 1 to n
• For i from 1 to n
• For j from 1 to n
• See slide for equation
• $$Y \gets Y^n$$
• Difference between k-consistency and strong k-consistency
• PC1 terminates, is sound and complete for discrete CSPs – $$\mathcal{O}(a^5n^5)$$
• In a complete graph, if every path of length 2 is consistent, the network is path consistent
• PC1 has two operations, composition and intersection
• the proof is done by induction
• Mackworth also developed PC-2; Mohr and Henderson, PC-3
• and Han, Lee created PC-4
• PC-5 from Singh uses support bookkeeping from AC-6
• PC-8 uses domains, not constraints; a 2001 algorithm improves on PC-8, but isn't well tested
• PC is rarely used except for certain types of constraints
• monotonic constraints – ordering based

## Lecture 25: Path Consistency and Global Consistency – continued

• completion allows you to use paths of length 2.
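The PC-1 closure above (triple loop over $$k, i, j$$, tightening $$R_{ij}$$ with the composition $$R_{ik} \cdot R_{kj}$$ until a fixed point) can be sketched over relations stored as sets of value pairs. `pc1` is an illustrative name, and the sketch assumes a relation is stored for every ordered pair of variables (identity on the diagonal, the universal relation where no constraint exists):

```python
def pc1(R, n):
    """R: dict (i, j) -> set of (a, b) value pairs, for all i, j in range(n).
    Tightens every R[(i, j)] in place until no relation changes."""
    changed = True
    while changed:                      # until fixed point
        changed = False
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    # composition R_ik . R_kj, then intersection
                    comp = {(a, b) for (a, c) in R[(i, k)]
                                   for (c2, b) in R[(k, j)] if c == c2}
                    new = R[(i, j)] & comp
                    if new != R[(i, j)]:
                        R[(i, j)] = new
                        changed = True
    return R
```

On three variables with pairwise $$\neq$$ over a two-value domain (a triangle that cannot be 2-colored), PC-1 wipes the relations out, detecting inconsistency.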
• Strong PC will ensure both PC and AC
• AC first, then if no inconsistency, PC (likely PC2)
• Go from weaker to stronger – making later steps likely cheaper
• But PC is often unused – except in some situations where it makes it much more likely to find a solution
• PC is also called 3-consistency

### Partial Path Consistency

• same as PC, but no universal constraints; defined over cycles
• graph is triangulated
• run closure only over the triangles – $$\mathcal{O}(n^3)$$
• must be careful about articulation points in the graph
• PPC is much easier to implement
• PPC is strictly weaker than PC as a property
• PC completes the graph, PPC triangulates it
• PC filters more than PPC, edge-by-edge
• PPC enforces PC on the common edges
• which means that you can find if something is not path consistent
• From Bliek and Sam-Haroud 1999
• Can PC detect unsatisfiability when PPC does not? – It seems not – PPC and PC are roughly equivalent, but there are some cases – and it was hand built

### Global Consistency properties

• minimality/decomposability
• A minimal network – a network where every tuple in a given constraint appears in at least one solution
• computing a minimal network is NP-complete
• Solving a minimal network is itself NP-complete

## Lecture 26: Consistency properties

• Local Consistencies
• AC
• GAC
• Path Consistency – modifies binary constraints, requires a completed graph
• Partial Path Consistency – triangulates the graph, modifies binary relations
• Global consistency properties
• Consistency in general – it has a solution
• minimality – constraints are as tight as possible, on binary constraints; elimination of a tuple eliminates a solution; every tuple is a member of at least one solution; idealized form of path consistency; guarantees consistency; important in interactive product configuration – NP-complete in general, Gottlob, 2011
• determining whether or not a CSP is minimal is NP-complete
• Finding a solution of a minimal CSP is still NP-complete
• occasionally called
globally consistent
• decomposability
• a CSP is decomposable if any combination of assignments to any number of variables that is consistent given the constraints can be extended to a complete solution
• guarantees backtrack-free search
• strongly $$n$$-consistent
• there are some islands of tractability
• tree structure – AC ensures tractability
• convex constraints – path consistency ensures minimality and decomposability

## Lecture 27: Consistency properties, continued

• minimality doesn't guarantee BT-free search
• decomposability ensures no backtracking
• PC approximates minimality
• when constraints are convex, PPC on a triangulated graph guarantees minimality and decomposability
• when $$\circ$$ distributes over $$\cap$$, PC1 guarantees that a CSP is minimal and decomposable, and the outer loop can be removed
• but this doesn't always hold
• constraints of bounded difference and temporal reasoning
• vars
• constraints of the form $$a \leq Y - X \leq b$$, $$Y - X = [a, b] = I$$
• composition and intersection from slides

## Lecture 28: Consistency properties, continued

• Remember, once decomposability is enforced, search is guaranteed backtrack-free
• When constraints are convex, use of PPC ensures minimality and decomposability, but triangles must be processed in a PEO
• When composition distributes over intersection
• PC-1 generalizes Floyd-Warshall (all-pairs shortest path)
• PC-1 generalizes Warshall (transitive closure)
• Don't confuse properties with algorithms
• Local consistency
• remove inconsistent values (node, arc)
• remove tuples (path consistency)
• gets closer to a solution
• reduces the size of the problem
• is cheap
• Aim for global consistency, but it's hard
• Sometimes (given certain types of constraints or graphs), local consistency guarantees global consistency
• distributivity in PC, row-convex constraints, special networks

## Lecture 28: Local Search

• Systematic/constructive search is all we've talked about
• Starts at a solution that's not
correct, and tries to repair it
• General method:
• for each variable, choose a value
• check if it's a solution
• You're done. Lucky.
• pick a value that broke a constraint and change it (using min-conflict)
• Local search is, in the worst case, $$\mathcal{O}(d^n)$$
• States laid out on a surface
• quality/cost is height
• the optimum state maximizes solution quality while minimizing cost
• move around from state to state to find a local optimum
• Gradient descent – will only ever find a local optimum
• A state is a complete assignment of values to variables
• how do you move – CSP uses min-conflict
• an eval function rates the quality of a state, generally as the number of broken constraints
• Try
• Start w/ a full but arbitrary assignment of values
• reduce inconsistencies by local repairs
• repeat until
• A solution is found (a global minimum)
• The current solution cannot be repaired (local minimum)
• You've gone through max tries (running out of patience)
• Local repair for decision problems, local optimization for optimization problems
• Greedy types are hill climbing, local beam
• Local beam
• start in parallel from a number of different points
• choose among those another number of best moves
• remove those without promise
• and repeat as necessary
• Stochastic methods
• breakout – helps to avoid getting stuck in a local minimum
• when stuck in a local minimum, walk randomly: pick a random variable, change it to a random value
• tabu search
• keep a list of tabu positions (those that aren't good), but only a certain number
• when a new state is visited, it is added to the tabu list, and the farthest back is forgotten
• if there's nothing better, go to the next state not in the tabu list
• repeat until a global minimum is found
• sometimes you may randomly ignore a heuristic
• Local search problems
• getting stuck in a local optimum
• plateaus (flat local maxima) can cause issues (be ready to move to a state with the same quality)
• ridges can force going from side to side with minimal progress
• when stuck, do a random walk
• Simulated annealing
• When stuck in a
local minimum, allow steps towards less good neighbors to escape the local minimum
• be more random at the start, less random as you go
• Start at a random state, start a countdown, and loop until it's over
• pick a neighbor at random
• $$d$$ = quality of neighbor − quality of current state
• if $$d > 0$$
• then move to the neighbor and restart
• otherwise, move to the neighbor with a transition probability $$p < 1$$
• Transition probability proportional to $$e^{\frac{d}{t}}$$
• $$d$$ is negative
• $$t$$ is the countdown
• As time passes, you're less and less likely to move towards bad neighbors
• But all forms of local search are:
• non-exhaustive
• non-systematic
• liable to get stuck in a local optimum
• non-deterministic – outcome depends on where you start
• heavy-tailed – the longer you go, the less likely you are to find a solution
• Often somewhat probabilistic

## Lecture 29: Local Search, Continued

• starts from a complete assignment, and moves to a new assignment that's better
• Very greedy approach, but can be parameterized very easily
• Tends to be very greedy, but only allows improvements
• others can be better, and allow for random walks/steps – stochastic algorithms
• stochastic algorithms are not deterministic!
• Tabu search moves to a state that may be worse, provided it's not in a list of previous states
• breakout modifies the weights in the objective function
• simulated annealing involves countdowns and loops, controlled by the quality of the neighbor and the time; but when sufficiently restricted, it is guaranteed to find an optimum
• Have issues
• non-systematic and non-exhaustive
• can get stuck in local optima
• non-deterministic!!
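The simulated-annealing move rule above (accept improving moves always; accept worsening moves with probability $$e^{d/t}$$, where $$d < 0$$) is tiny in code. A minimal sketch; `anneal_step` is an illustrative name:

```python
import math
import random

def anneal_step(current_q, neighbor_q, t):
    """Decide whether to move to a randomly chosen neighbor.
    d = neighbor quality - current quality; t = remaining 'countdown'.
    Improving moves are always taken; worsening moves with prob e^(d/t)."""
    d = neighbor_q - current_q
    if d > 0:
        return True
    return random.random() < math.exp(d / t)
```

As `t` shrinks, `math.exp(d / t)` collapses toward 0 for any `d < 0`, so bad moves become ever less likely: exactly the "less random as you go" behavior in the notes.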
• generally the probability of improving a solution gets smaller, but never actually dies out – heavy-tailed
• So, when you hit a point where improvability seems to drop
• keep the most promising solution
• and start someplace else
• Random restart
• Luby restart profile
• uses a geometric law to decide how/when/where to restart

### Genetic algos for local search

• Combine two complete assignments to generate offspring
• Start from an initial population
• encode assignments in a compact manner
• combine partial solutions to generate new solutions
• requires initial generation of many random assignments
• Choose solutions to combine based on quality
• fitness is used to describe quality; assigns a probability to each solution
• Select pairs depending on fitness
• a crossover point is chosen randomly, offspring are generated
• mutation randomly changes a state
• and this gets put back in the original population

### ERA – Environment, Rules, Agents

• Environment is an $$n \times a$$ board
• each var is an agent
• each position is a value of a domain
• an agent moves within a row on the board
• each position records the number of violations caused by the fact that agents are occupying other positions
• agents try to occupy positions where no constraints are broken
• agents move according to reactive rules
• agents may themselves have preferences
• move rules can be as simple or complex as desired
• rules:
• least-move – choose a position with minimum violations
• better-move – choose a position with fewer violations
• random-move – randomly choose a position
• others – combinations of those above
• Agents don't talk to each other, but share a common context
• everyone kicks each other out until everyone is happy
• very effective in solving very tight but solvable instances
• unstable in over-constrained cases
• agents keep kicking each other out (livelock)
• livelocks can be used to identify bottlenecks
• very stable on solvable instances
• oscillates wildly on unsolvable instances
• Agent motion
can be categorized as variable, stable, constant

## Lecture 30: Local Search, continued

### ERA, Continued

• Rules help to find a solution somewhat quickly
• Has less of an issue with getting stuck in various local optima
• May be very good for resource-allocation problems
• more dynamic, as there's no global evaluation function

### Nothing works, what to do

• Random restart
• when no progress is made, restart from a new, randomly selected state
• save the best results so far
• repeat random restart
• for a fixed number of iterations
• until the best results have not been improved on for a certain number of iterations – e.g., a geometric law

### Evaluation of Local Search

• Test on a given problem instance
• an ensemble of problem instances
• Experiment
• run the algo thousands of times
• measure some metric as a random variable – e.g., the time needed to find a solution
• Then provide
• the probability distribution function, approximated by the number of runs that took a given time to find a solution
• the cumulative distribution function, approximated by the number of runs that found a solution within a given time
• Then compare performance with statistical tests
• t-tests assume normal distribution of the metrics
• nonparametric tests don't
• Consult with a trained statistician

## Lecture 30: Structure-based methods

• CSPs represented by networks
• for binary CSPs, a graph
• constraint graph
• microstructure
• co-microstructure
• for general CSPs
• hypergraph
• dual – allowed tuples connected to each other when shared variables match
• primal graph – vertices are the original variables; the variables in a constraint form a clique

## Lecture 31: Structure-based methods, continued

| Graph | Vertices | Edges |
| --- | --- | --- |
| Binary | Variables | Constraints |
| Non-binary | Variables | Constraints |
| Dual | Constraints | Equality constraints on shared variables |
| Primal | Variables | Cliques over scopes of constraints |
| $$\mu$$-structure | VVPs | Supports |
| co-$$\mu$$-structure | VVPs | Conflicts |

• Incidence graph – Variables are one type of node,
constraints another; edges connect variables to the constraints; is bipartite

### Tree-structured CSPs

• Tree-structured CSPs can be solved in polynomial time
• apply $$revise(V_i, V_j)$$ for all nodes from leaves to root
• instantiate variables from root to leaves
• Generally
• directional consistency from leaves to root
• solutions built from root to leaves
• This is often the case for configuration problems
• The width of a tree is always one
• If a CSP graph has $$w = 1$$, it is guaranteed to be solved with (strong) 2-consistency
• Freuder's theorem: if a CSP has width $$w$$, then strong $$w+1$$-consistency is guaranteed to solve it backtrack-free.
• May have issues: enforcing consistency can change the width of the graph
• There are ways to prevent this, but it is not guaranteed

### Cycle-cutset

• identify a cycle cutset $$S$$ in the CSP
• cutset – nodes that, when removed, yield a tree
• Decompose the CSP into two partitions
• $$S$$
• $$T$$, the nodes forming a tree
• Iterate until a solution is found
• Solve the nodes in $$S$$
• Try to extend the solution to $$T$$
• $$|S| < |V|$$
• if $$|S| = c$$, then $$\mathcal{O}(d^c (n-c) d^2) = \mathcal{O}(nd^{c+2})$$
• But finding the smallest cutset is NP-hard
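The tree-CSP procedure above (directional arc consistency from leaves to root, then backtrack-free assignment from root to leaves) can be sketched directly. This is a minimal illustration, not the lecture's reference code: `solve_tree_csp` and its parameter shapes are assumptions.

```python
def solve_tree_csp(domains, parent, constraint, order):
    """order: variables listed root-first; parent[v] is v's parent
    (None for the root); constraint(x, a, y, b) is True iff the pair
    x=a, y=b is allowed."""
    doms = {v: list(vals) for v, vals in domains.items()}
    # directional AC: revise each parent against its child, leaves to root
    for v in reversed(order):
        p = parent[v]
        if p is None:
            continue
        doms[p] = [a for a in doms[p]
                   if any(constraint(p, a, v, b) for b in doms[v])]
        if not doms[p]:
            return None  # domain wipe-out: unsolvable
    # assign root to leaves; DAC guarantees no backtracking is needed
    sol = {}
    for v in order:
        p = parent[v]
        sol[v] = next(b for b in doms[v]
                      if p is None or constraint(p, sol[p], v, b))
    return sol
```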

## Lecture 32: Structure-based methods, continued

### Dynamic Dangle Identification

• Graham reduction is used to detect if a schema is acyclic
• join of acyclic schema is polynomial
• remove vertices w/ degree 1
• check connectivity
• identify dangling trees using graph reduction
• for each dangling root
• do DAC from leaf to root
• wipeout means unsolvability
• Restrict search to nodes outside of the identified dangling trees

### Directional Path Consistency

• Given a binary CSP and a fixed variable ordering
• $$\mathcal{O}(n (w^{*})^2 d^3)$$
• if $$w^{*} \leq 2$$
• DPC determines consistency
• guarantees BT-free search given the ordering
• But this only works if the induced width is at most 2; then you get strong consistency, minimality, and tractability

### Biconnected components

• decomposition into separable graphs
• consider a CSP whose graph has articulation nodes
• assume that the largest biconnected component is $$b$$
• build a tree whose nodes are biconnected components considering that the articulation node belongs to the parent (or parent and child)
• build an order using a pre-order traversal of the tree
• complexity is then bound by the size of the largest biconnected component
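The decomposition above hinges on finding the articulation nodes. A minimal sketch using the standard DFS low-link method (function name is illustrative; the graph is a dict of neighbor sets):

```python
def articulation_points(graph):
    """Return the set of articulation (cut) vertices of a simple,
    undirected graph, via DFS discovery times and low-links."""
    disc, low, points = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in graph[u]:
            if v == parent:
                continue
            if v in disc:                      # back edge
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # non-root u is a cut vertex if v's subtree
                # cannot reach above u
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
        if parent is None and children > 1:    # root with 2+ subtrees
            points.add(u)

    for u in graph:
        if u not in disc:
            dfs(u, None)
    return points
```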

## Lecture 33: Structure-based methods, continued

### Biconnected components

• must have articulation nodes – nodes common to two or more problems
• is $$d^b$$ where $$d$$ is the domain size and $$b$$ is the size of the largest component
• Use a pre-order of the variable tree, which means that BT effort is the size of the largest component
• Originally from 1982, in Journal of the ACM

### Tree Clustering

• bi-connected components, but with more than one shared variable for each cluster
• triangulates the graph
• finds maximal cliques
• create a tree structure from cliques
• Then repeat:
• solve a clique at each node in the tree
• apply DAC from leaves to root
• generates solutions in a BT-free manner
• MaxCliques from Algorithmic Graph Theory and Perfect Graphs, Golumbic
• Triangulation is $$\mathcal{O}(n^2)$$
• Max clique is $$\mathcal{O}(n + e)$$ in triangulated graph
• $$\mathcal{O}(k^r)$$ for solving clusters, $$k$$ is domain size, $$r$$ is size of largest clique ($$t = k^r$$)
• generating solution $$\mathcal{O}(nt\log t)$$, $$t$$ is tuples in each cluster (sorted domain of a ’super’ variable (sorting can help to speed things up)), in best case, we have $$n$$ cliques
• complexity bounded by size of largest clique:
• $$\mathcal{O}(n^2) + \mathcal{O}(k^r) + \mathcal{O}(nt\log t) = \mathcal{O}(nk^r\log k^r) = \mathcal{O}(nrk^r)$$

## Lecture 34: Structure-based methods, continued

### Bucket Elimination

• Built on non-binary CSPs
• Dechter 1997
• Choose a variable ordering – PEO recommended
• create buckets for each variable, placing each constraint in the bucket of the deepest variable in its scope
• create tree node for each bucket
• From bottom to top
• join relations in bucket (including on the domain)
• project join on deepest bucket with common variable names
• top-down creates solution backtrack free
• Can handle cycles!
• And is a pretty quick form of DAC, but for non-binary CSPs
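The join-and-project step at the heart of bucket elimination can be sketched over table constraints stored as (scope, tuple-set) pairs. `join` and `eliminate` are hypothetical helper names, a loose illustration of the bottom-to-top pass rather than Dechter's full algorithm:

```python
def join(c1, c2):
    """Natural join of two (scope, tuples) table constraints."""
    s1, t1 = c1
    s2, t2 = c2
    scope = s1 + [v for v in s2 if v not in s1]
    out = set()
    for a in t1:
        for b in t2:
            row, row2 = dict(zip(s1, a)), dict(zip(s2, b))
            # keep the pair only if shared variables agree
            if all(row[v] == row2[v] for v in s1 if v in row2):
                row.update(row2)
                out.add(tuple(row[v] for v in scope))
    return scope, out

def eliminate(constraint, var):
    """Project the joined relation onto everything except `var`
    (the bucket's own variable is summed out of the problem)."""
    scope, tuples = constraint
    keep = [v for v in scope if v != var]
    idx = [scope.index(v) for v in keep]
    return keep, {tuple(t[i] for i in idx) for t in tuples}
```

Each bucket joins its constraints, then `eliminate` drops the bucket's variable; the result is placed in the deepest bucket sharing a variable with it.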

## Lecture 35: Structure-based methods, continued

### Bucket Elimination/Tree Clustering

• linear in $$n$$, exponential in induced width + 1
• induced width + 1 is the number of vars in the largest cluster
• gives fixed parameter complexity

### Bucket Elimination

• very much like Gaussian elimination
• Useful for Bayesian networks – graphical models of all kinds

### Tree Decomposition

• Generalizes all of structure methods
• comes from database theory
• organizes any constraint problem into a tree
• $$(T, \chi, \psi)$$ of $$P = (X, D, C)$$
• $$T$$ – $$(V, E)$$ the tree
• $$\chi(v) \subseteq X$$ – A mapping of all variables in a node
• $$\psi(v) \subseteq C$$ – A mapping of constraints and variables
• Each constraint must appear on at least one node in the tree, and all its variables are in that node
• nodes where any variable appears induce a single connected subtree (connectedness property), the path between any two nodes with the same variable will all have the variable
• join trees and acyclic DBs from database theory
• organizes the constraint network into a tree structure, a join-tree
• labels tree nodes with variables and constraints – labels must obey some rules
• Solving
• bottom-up – joins relations and projects on parent
• top-down generates the solution
• Complexity bound by width
• tree width is the max number of variables in a node - 1
• hyperwidth is the max number of constraints in a node
• Group of variables are the variables in common to two nodes
• However, selecting the tree decomposition is NP-hard, use approximation by min-fill
• Quite a few different tree-decomposition techniques – Gottlob’s HYPERTREE seems to be one of the best
• parameters
• tree width max number of variables in any node minus 1
• hyperwidth – max number of constraints in any node
• theory – characterizes tractable csps
• complexity is bound by tree/hyper-width
• fixed parameter complexity – independent of number of variables
• practice – uses structure to solve csps
• requires balancing time and space

## Lecture 26: More on Constraint Consistency

### Global Consistency Properties

• minimality – relations join completely, $n$-consistency
• decomposability – strong $n$-consistency
• solvability

### Local Properties

• AC/GAC – Domains
• Path Consistency – Constraints, add new constraints (constraint synthesis)
• Partial Path Consistency – Constraints, add new constraints
• Directional Path Consistency – Constraints, add new constraints
• Types of constraint algorithms
• Binary
• Non-Binary
• Global Constraints
• For continuous domains, work on box-consistency
• On interval, box consistency can occasionally also be used
• weighted CSPs can also be dealt with, with a variant of AC

### Info

• arc consistency – $(1,1)$-consistency – for every variable, the solution can be extended to another consistent assignment
• path-consistency – $(2,1)$-consistency – for every two variables, the solution can be extended to another consistent assignment
• $$(i,j)$$ – for every consistent assignment of length $$i$$, the solution can be extended by $$j$$ consistent assignments ($$i + j = n$$)
• strong $k$-consistency is $k$-consistency down to $$1$$ consistency

## Lecture 27: Still more on Constraint Consistency

• domain minimality – constraints over arity 1, Gottlob 2011
• however, even if domain minimality and constraint minimality hold, solving is still NP complete
• global inverse consistency – where given a value at 1 var, you can extend it to $$n - 1$$ assignments

### $(i, j)$-consistency

• $$j = 1$$
• 2 – AC
• 3 – PC
• k – $k$-consistency
• $$(1, n - 1)$$ – domain minimality, global inverse consistency
• $$(1, k)$$ – $$d^{k + 1}$$, guaranteed that a solution can be extended over $$k$$ variables
• NIC – Neighborhood Inverse Consistency, for every variable, look at its neighbors, and make consistent with those

## Lecture 28: More Constraint Consistency

### Singleton Arc Consistency

• family of algorithms based on the singleton test
• the singleton test resembles one step of search: tentatively assign a value, then propagate
• the CSP is SAC iff it is arc-consistent after assigning any single variable-value pair (VVP)
• Algorithm
• until no change
• for each variable
• for each value
• assign the value to the variable
• if the resulting CSP is AC, keep the value
• else remove it
• when search produces lots of backtracks, it can pay to trigger a higher-level consistency such as SAC
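The loop above is essentially SAC-1; a self-contained sketch for binary CSPs (the AC-1-style inner propagation and all names are my own simplifications, not the lecture's code):

```python
def ac(domains, constraints):
    """Enforce arc consistency on a binary CSP (simple AC-1-style loop).

    constraints: dict {(x, y): predicate} for each directed arc.
    Returns False iff some domain is wiped out.
    """
    changed = True
    while changed:
        changed = False
        for (x, y), pred in constraints.items():
            removed = {vx for vx in domains[x]
                       if not any(pred(vx, vy) for vy in domains[y])}
            if removed:
                domains[x] -= removed
                changed = True
                if not domains[x]:
                    return False
    return True

def sac1(domains, constraints):
    """SAC-1: a value survives iff assigning it leaves the CSP arc-consistent."""
    changed = True
    while changed:
        changed = False
        for var in list(domains):
            for val in list(domains[var]):
                test = {v: set(d) for v, d in domains.items()}
                test[var] = {val}                 # singleton test: assign val to var
                if not ac(test, constraints):
                    domains[var].discard(val)     # no AC-consistent extension: remove
                    changed = True
    return all(domains.values())

# Example: three variables, pairwise "not equal", domains {1, 2}.
# This CSP is arc-consistent, yet has no solution; SAC detects the wipeout.
ne = lambda a, b: a != b
vars_ = ["x", "y", "z"]
constraints = {(a, b): ne for a in vars_ for b in vars_ if a != b}
domains = {v: {1, 2} for v in vars_}
ok = sac1(domains, constraints)   # False: every domain is emptied
```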

### Inverse consistency

• $$(1, 2)$$-consistency – Path Inverse Consistency
• Neighborhood Inverse Consistency – see previous lecture
• GIC – Global Inverse Consistency – ensures domain minimality

### Special Constraints – AC

• functional – a constraint $$C$$ is functional with regard to domain $$D$$ iff for all $$v \in D$$ there exists at most one $$w \in D$$ such that $$C(v,w)$$ holds
• monotonic – a constraint $$C$$ is monotonic with regard to $$D$$ iff there exists a total ordering on $$D$$ such that, for all $$v, w \in D$$, if $$C(v,w)$$ holds, then $$C(v^{\prime}, w^{\prime})$$ holds for all $$v^{\prime}, w^{\prime} \in D$$ with $$v^{\prime} \leq v$$ and $$w^{\prime} \leq w$$
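Both definitions are easy to test directly; a small sketch (function names and example constraints are my own):

```python
def is_functional(constraint, domain):
    """For every v there is at most one supporting w."""
    return all(sum(1 for w in domain if constraint(v, w)) <= 1
               for v in domain)

def is_monotonic(constraint, ordered_domain):
    """Under the given total order (a sorted list): whenever C(v, w)
    holds, C(v', w') holds for all v' <= v and w' <= w."""
    for i, v in enumerate(ordered_domain):
        for j, w in enumerate(ordered_domain):
            if constraint(v, w) and not all(
                    constraint(ordered_domain[a], ordered_domain[b])
                    for a in range(i + 1) for b in range(j + 1)):
                return False
    return True

# w = v + 1 is functional on {1, 2, 3}; "not equal" is not.
# v + w <= 4 is monotonic under <= on [1, 2, 3]; equality is not.
```

Recognizing such structure is what lets specialized AC algorithms run faster than the generic ones.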

### Non-Binary Special constraints

• almost all properties and algorithms discussed so far were restricted to binary constraints
• consistency for non-binary constraints is a topic of current research
• in particular domain filtering that doesn’t change the topology or modify constraint definitions
• relational $$m$$-consistency – adding constraints/changing constraint definitions
• Domain Filtering
• GAC
• Singleton GAC
• maxRPWC, rPIC, RPWC, etc
• Relational consistency
• Strong consistencies
• Simple Tabular Reduction

## Lecture 29: Extensions to BT Search

• bounded-backtrack search (limit on the number of backtracks)
• bounded-depth backtrack search
• limited discrepancy search
• credit-based backtrack search
• randomized backtrack search

### Credit Based Search

• start with a given amount of credit, usually $$n^3$$
• assign half the credit to the current (first) branch, half to the remaining alternatives
• keep going in a depth-first manner until the credit is used up, then chronologically backtrack from there
• used in conjunction with backtrack-bounded search
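The credit-splitting idea can be sketched as follows (a deliberate simplification; the splitting policy and all names are my own, not the lecture's):

```python
def credit_search(variables, domains, consistent, credit, assignment=None):
    """Backtrack search where each node splits its credit: half (rounded up)
    goes to the first value branch, the rest to the remaining branches.
    Once a node's credit runs out, no further alternatives are tried there."""
    if assignment is None:
        assignment = {}
    depth = len(assignment)
    if depth == len(variables):
        return dict(assignment)
    var = variables[depth]
    remaining = credit
    for k, val in enumerate(sorted(domains[var])):
        if k > 0 and remaining == 0:
            break                                # credit exhausted at this node
        branch = remaining - remaining // 2      # half, rounded up
        remaining -= branch
        assignment[var] = val
        if consistent(assignment):
            result = credit_search(variables, domains, consistent,
                                   branch, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None

# Example: x < y over {1, 2, 3}, starting credit n^3 = 8
solution = credit_search(
    ["x", "y"], {"x": {1, 2, 3}, "y": {1, 2, 3}},
    lambda a: "x" not in a or "y" not in a or a["x"] < a["y"],
    8)
```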

### Limited Discrepancy Search

• the value-ordering heuristic may be blind at shallow levels of search
• LDS disobeys the heuristic at most a given number of times (discrepancies)
• Matthew Ginsberg (with William Harvey)
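A sketch of LDS for a single discrepancy budget (names are mine; the original formulation iterates over increasing budgets, as the driver loop below does):

```python
def lds(variables, domains, consistent, discrepancies, assignment=None):
    """Follow the value-ordering heuristic (first value) for free; every
    deviation from it costs one discrepancy. Prune branches that would
    exceed the budget."""
    if assignment is None:
        assignment = {}
    depth = len(assignment)
    if depth == len(variables):
        return dict(assignment)
    var = variables[depth]
    for k, val in enumerate(sorted(domains[var])):   # heuristic order
        if k > discrepancies:
            break                                    # would exceed the budget
        assignment[var] = val
        if consistent(assignment):
            result = lds(variables, domains, consistent,
                         discrepancies - (1 if k > 0 else 0), assignment)
            if result is not None:
                return result
        del assignment[var]
    return None

# Driver: retry with an increasing budget until a solution appears.
ok = lambda a: "x" not in a or "y" not in a or a["x"] < a["y"]
doms = {"x": {1, 2, 3}, "y": {1, 2, 3}}
sol = next(s for d in range(3)
           if (s := lds(["x", "y"], doms, ok, d)) is not None)
```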

### Randomized BT

• in systematic search, the ordering of variables/values determines which solutions are explored first
• randomization lets us explore a wider portion of the search tree
• thrashing causes stagnation of BT search – interrupt it and restart
• pick variables at random during instantiation
• and random values when picking values
• use a restart schedule when thrashing:
• fixed cutoff and universal schedule – Luby
• randomization and rapid restarts (Gomes) – fixed optimal cutoff, requires a priori knowledge of the cost distribution
• randomization and geometric restarts (Walsh) – $$C_{i + 1} = r C_i$$, $$C_0 = n$$
• randomization and dynamic geometric restarts (Guddeti) – $$C_{i + 1} = \begin{cases} r C_i & \text{when the solution improves} \\ C_i & \text{otherwise}\end{cases}$$
• Bayesian approach (Kautz)
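The two fixed schedules are easy to generate; a small sketch (function names are mine):

```python
def luby(i):
    """i-th cutoff (1-indexed) of the Luby universal sequence 1,1,2,1,1,2,4,..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if (1 << k) - 1 == i:          # i ends a complete block: cutoff 2^(k-1)
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)   # otherwise recurse into the block

def geometric_cutoffs(n, r, count):
    """Walsh-style geometric schedule: C_0 = n, C_{i+1} = r * C_i."""
    c = n
    for _ in range(count):
        yield c
        c *= r

# [luby(i) for i in range(1, 8)]     -> [1, 1, 2, 1, 1, 2, 4]
# list(geometric_cutoffs(10, 2, 4))  -> [10, 20, 40, 80]
```

In practice each Luby term is multiplied by a base number of backtracks before a restart is triggered.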