
Fundamentals of Constraint Processing


Lecture 2

Resources

  • check website – has many, many links
  • conferences
  • journals
    • include logic programming and CSP
    • Linear Programming, Integer Programming, Mixed Integer Programming, Dynamic Programming
  • journals are also pretty big – and tend to be easier to get into
  • many good ways to get jobs with this

Constraint Satisfaction 101

  • examples
    • generating optimal decisions
      • have unary, binary, n-ary constraints
      • often have many choices, but restrictions can make it easier to make the right decision
  • modeling constraints remains an art
  • sets of variables with sets of choices, and a set of constraints restricting combinations of values
  • applications
    • radio resource management
    • databases
    • temporal and spatial reasoning
    • allocation
    • design/config
    • graphics, visualization, interfaces
    • verification and engineering
    • HCI and decision support
    • molecular biology
    • analytical chemistry
    • robotics, machine vision, computational linguistics
    • transportation
    • qualitative and diagnostic reasoning
  • constraint processing is modeling, plus inference and search
    • modeling – variables, domains, constraints, query
    • inference – enforcing consistency (filtering) and propagation (AC, PC)
    • search – systematic (chronological backtracking, with or without intelligence, etc) or non-systematic (hill climbing, tabu, etc)
  • Real Example
    • the Netherlands has a dense rail network
    • wanted more robust schedule that’s more effective, increasing profit
    • used OR and AI
  • Singapore
    • large container hub
    • has a custom, special-purpose loading system built using CP
  • CP is:
    • solving decision problems
    • with arbitrary constraints
    • providing concise and high-level feedback about alternatives and conflicts
  • Very Practical

Defining a problem

  • has a given and a question
  • given is a set of objects, the relations, etc
  • question ex: find \(x\) such that condition \(y\) is satisfied
    • these can be simple or advanced
  • a problem must always, always, always have a question!
  • Given \(\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})\)
    • \(\mathcal{V} = \{V_1, V_2, \ldots, V_n\}\) Variables
    • \(\mathcal{D} = \{D_{V_1}, D_{V_2}, \ldots, D_{V_n}\}\) Domains for Variables
    • Constraints \(\mathcal{C} = \{C_1, C_2, \ldots, C_l\}\) where \(C_{V_i,V_j,\ldots,V_k} = \{(x,y,\ldots,z)\} \subseteq D_{V_i} \times D_{V_j} \times \ldots \times D_{V_k}\)
  • Query: can we find a value for each variable such that all constraints are satisfied? Queries can include weights (making it a Constrained Optimization Problem)
  • Various levels of consistency
    • AC
    • GAC
    • SAC
    • SGAC
  • Sudoku can often be solved with consistency alone, though search may still be required for some instances
  • find solution – decisions
  • find the number of, or all, solutions – counting
  • find a set of constraints that can be removed so that a solution exists – optimization
  • specials
    • binary CSPs have constraints over at most two variables
    • domains restricted to \(\{0, 1\}\): Boolean CSP
    • with finite, discrete domains, enumeration can work
    • continuous domains require sophisticated algebraic techniques – consistency techniques are then applied to domain bounds
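
A minimal sketch of the \(\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})\) triple and the basic decision query, in Python (toy data and names are hypothetical, not from the course):

```python
from itertools import product

# Hypothetical toy CSP P = (V, D, C): two variables that must differ.
variables = ["V1", "V2"]
domains = {"V1": [1, 2, 3], "V2": [2, 3]}
# Each constraint: (scope, set of allowed tuples) -- i.e., given in extension.
constraints = [(("V1", "V2"), {(1, 2), (1, 3), (2, 3), (3, 2)})]

def satisfies_all(assignment):
    """Check every constraint against a complete assignment."""
    return all(tuple(assignment[v] for v in scope) in allowed
               for scope, allowed in constraints)

# The decision query: is there a value per variable satisfying all constraints?
solutions = [dict(zip(variables, combo))
             for combo in product(*(domains[v] for v in variables))
             if satisfies_all(dict(zip(variables, combo)))]
print(solutions)
```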

Lecture 3

  • always a given \(\mathcal{P} = (\mathcal{V}, \mathcal{D}, \mathcal{C})\)
  • Constraints are relations
    • relations are a subset of the cartesian product of \(n \geq 2\) sets
    • relations can be represented as a table of allowed tuples, referred to as supports – a table constraint, i.e., the constraint given in extension
    • a relation in extension is a set with all its members enumerated
    • conflicts are the set of tuples that are no good
    • Constraints can also be given in intension – defined in terms of a logical function; equivalent to extension, but can be easier to understand
    • A check function should always be a simple implementation – something easy to build with
  • Code must always be well structured!
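
A small sketch of the same constraint given both ways – in extension and in intension (hypothetical Python, illustrating their equivalence):

```python
# Hypothetical sketch of the constraint V1 < V2 given two ways.

# In extension: enumerate the allowed tuples (the supports).
supports = {(x, y) for x in range(1, 4) for y in range(1, 4) if x < y}

def check_extension(x, y):
    return (x, y) in supports

# In intension: a logical function over the scope -- equivalent,
# but no enumeration needed.
def check_intension(x, y):
    return x < y

assert all(check_extension(x, y) == check_intension(x, y)
           for x in range(1, 4) for y in range(1, 4))
```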

Lecture 4

  • \(P = (V, D, C)\)
    • \(V\) variables
    • \(D\) domains
    • \(C\) constraints
  • Constraints are defined as \(C_1 = \langle\mathrm{scope}(C_1), \mathrm{rel}(C_1)\rangle\)
    • Scope is the variables it discusses – the set of variables over which the constraint is defined
    • relation is an abstract relation – may be in extension (listed as either supports or conflicts) or intension (roughly as a function or boolean expression)
    • Also have arity, the cardinality of the scope
    • universal constraints allow any member of the relevant Cartesian product, but may be induced to become something else – equivalent to no constraint; considered for binary constraints alone
  • VVP – Variable Value Pair
  • make sure to implement check function, and to do so in a very independent manner
  • keep track of the number of constraint checks, incrementing on every check
  • will be writing abscom to parse
  • Graph representation
    • Macro representation
    • Variables are represented as nodes (vertices)
    • domains as node labels
    • constraints as arcs (edges) between nodes
    • hypergraphs allow for special constraint nodes, or use a bubble to connect them instead – the hyperedges
  • relations in intension are defined by set-builder notation
    • used when not practical/possible to list all tuples
    • define types/templates of common constraints
    • linear constraints
    • all-diff constraints
    • at most
    • TSP-constraints
    • cycle-constraints
  • constraints implemented
    • in extension
      • set of tuples (list/table)
      • binary matrix (bit-matrix)
      • multi-valued decision diagrams
    • in intension
      • constrained decision diagrams
      • predicate functions
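
A sketch of the \(\langle\mathrm{scope}, \mathrm{rel}\rangle\) view with the required check counter, plus a bit-matrix for a binary constraint in extension (hypothetical Python, not the course's abscom code):

```python
# Hypothetical structure: a constraint as <scope, rel>, with the check
# counter the notes ask for, and a bit-matrix for a binary extension.
check_count = 0  # incremented on every constraint check

class Constraint:
    def __init__(self, scope, rel):
        self.scope = scope          # tuple of variable names
        self.rel = rel              # predicate over a value tuple
        self.arity = len(scope)     # cardinality of the scope

    def check(self, *values):
        global check_count
        check_count += 1
        return self.rel(values)

# Binary constraint in extension stored as a bit-matrix: matrix[x][y] == 1
# iff (x, y) is a support (domains indexed 0..a-1).
a = 3
bit_matrix = [[1 if x != y else 0 for y in range(a)] for x in range(a)]
neq = Constraint(("V1", "V2"), lambda t: bit_matrix[t[0]][t[1]] == 1)

print(neq.check(0, 1), neq.check(2, 2), check_count)  # True False 2
```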

Lecture 5

  • examples of modeling
    • can have temporal csp
    • graph or map coloring – planar maps require at most 4 colors
    • resource allocation – airline trip planning, list coloring
    • product configuration – allowed components and what they require/forbid
    • logic puzzles
  • Constraint types
    • algebraic constraints
    • constraints of bounded difference
    • coloring
    • mutual exclusion
    • difference constraints
    • arbitrary constraints, must be made explicit
  • Databases
    • Join in DB is a CSP
    • View materialization is a CSP
  • Interactive systems
    • data-flow
    • spreadsheets
    • graphical layout systems and animation
    • graphical user interfaces
  • Molecular biology
    • threading and similar

Formal characterization

  • macrostructures
    • networks/constraint graphs, \(G = (V, E)\), may be hypergraphs
  • Micro-structure – includes every possible Variable Value Pair
    • VVPs are linked iff they are consistent (i.e., form a support)
    • cliques of size \(n\) are possible solutions
    • however, finding cliques is NP-complete
  • Co-microstructure – includes every possible VVP
    • VVPs are linked iff they are a no-good
    • cliques of size \(n\) prevent a given solution
  • Complexity
    • decision problem
    • NP-complete by reduction from 3SAT
    • can be verified in polynomial time, but solution is not generated in polynomial time

Proof of CSP into SAT

  • CSP has Variables, Domains, Constraints
  • Every VVP is a Boolean variable in SAT
  • For each variable, add a clause – the disjunction of its VVPs – to the conjunction, stating that the variable takes at least one value; for each pair of distinct VVPs \(a, b\) of the same variable, add the clause \(\neg a \vee \neg b\) (at most one value)
  • Constraints are then mapped to SAT
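
A sketch of the reduction on a toy CSP (hypothetical Python; clauses are lists of signed VVP literals):

```python
# Sketch of the CSP-to-SAT reduction above on a hypothetical toy CSP.
# Boolean variable (v, x) means "CSP variable v takes value x".
from itertools import combinations

domains = {"V1": [1, 2], "V2": [1, 2]}
conflicts = {("V1", "V2"): {(1, 1), (2, 2)}}  # no-goods of a "not equal"

clauses = []
for v, dom in domains.items():
    # at-least-one-value clause over the variable's VVPs
    clauses.append([("+", (v, x)) for x in dom])
    # at-most-one-value clauses for each pair of distinct values
    for x, y in combinations(dom, 2):
        clauses.append([("-", (v, x)), ("-", (v, y))])
# map each constraint: forbid every no-good tuple
for (v1, v2), bad in conflicts.items():
    for x, y in bad:
        clauses.append([("-", (v1, x)), ("-", (v2, y))])

for c in clauses:
    print(c)
```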

Lecture 6

Solving a CSP

  • by search
    • constructive, systematic, exhaustive – involves enumerating all possible VVPs
    • iterative repair, local search – pick an assignment and change unsatisfied constraints until all constraints are satisfied
    • Before start
      • consider the model/formulation – controls the size of the search space (defining a model is an art)
      • pre-process
        • i.e. constraint filtering/propagation, consistency checking – reduce the size of search space
  • Constraint Checking
    • arc consistency
      • removes inconsistent values from the domain of one variable
      • Revise(\(B\), \(C\)) – see the sketch after this list:
        • \(\forall x \in D_B\)
          • if support(\(x\), \(D_C\))
            • continue
          • else
            • \(D_B \gets D_B \setminus \{x\}\)
      • Support function: for a given value, if a supporting value exists in the other domain, return true; if none exists, return false
      • will not always find a solution, but will reduce problem size
    • k-consistency
      • consistency on \(k\leq n\) variables
  • Systematic search
    • for every value of \(V_i\), check consistency with previous assignments (starting from the bottom); cut off portions of the tree on failure (using back-checking to prune the search tree)
    • back-checking expands a partial solution iff it is thus-far consistent
    • chronological backtracking – DFS with only the current path, undoes the most recent variable assignment and tries a new one – can be used for only one solution
      • is complete – will always find a solution if it exists
      • is sound / correct – a solution it gives will always be correct – will not give a false positive
    • intelligent backtracking
      • jump back based on the reason for failure – generally the deepest
    • Ordering heuristics
      • deciding what to expand first
      • often most constrained variable and most promising values first
      • strategies for variable/value ordering could be static/dynamic
      • affects the branching factor – best to branch as little as possible
      • generally variable ordering is dynamic, but value is static (lexicographic)
    • Keep in mind forward-checking – removing values from future domains that aren't consistent – which can cause domain annihilation (wipe-out) and is more efficient
    • Real Full Lookahead is used to further restrict domains and search spaces more than plain forward-checking
    • balance between reasoning and search
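
A minimal sketch of the Revise/support pair from the arc-consistency item above (hypothetical Python; domains as sets, a binary constraint as a predicate):

```python
# Sketch of Revise(B, C): prune values of B that have no support in C's domain.
def has_support(x, other_domain, check):
    """True iff some value in the other domain is consistent with x."""
    return any(check(x, y) for y in other_domain)

def revise(domain_b, domain_c, check):
    """Remove unsupported values from domain_b; return True if changed."""
    removed = {x for x in domain_b if not has_support(x, domain_c, check)}
    domain_b -= removed
    return bool(removed)

# Hypothetical example: B < C
db, dc = {1, 2, 3}, {1, 2}
revise(db, dc, lambda x, y: x < y)
print(db)  # {1} -- values 2 and 3 have no support
```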

Lecture 7

Arc Consistency

  • revise should only modify one variable’s domain
  • requires support, takes VVP and domain of other
    • checks for VVP in X with every element of Y, returning true if found, false if none found
    • check always increments if there’s a constraint
    • revise should check if there’s a constraint to begin with
  • review reading!
  • VVPs can be represented as tuples, may use angle brackets
  • AC-1:
    • assume the CSP has \(n\) variables, each with domain size \(a\), and \(e\) constraints
    • \(Q\) is the set of directed edges to revise – from \(x\) to \(y\) and from \(y\) to \(x\) – of size \(2e\)
    • set revisedp to true; repeat until revisedp is false
      • set revisedp to false
      • for every edge in \(Q\)
        • if revising it removes a value, set revisedp to true
        • but if a variable's domain becomes empty, stop and report – this CSP has no solution
    • the CSP is then arc-consistent (a code sketch follows this list)
    • Make sure to sort the list of edges!
  • Unary constraints should be done first!
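
A runnable sketch of the AC-1 loop above (hypothetical Python structures: domains as a dict of sets, binary constraints as predicates; not the course's code):

```python
# Sketch of AC-1: sweep all directed arcs until a full pass changes nothing.
def revise(domains, x, y, check):
    removed = {vx for vx in domains[x]
               if not any(check(vx, vy) for vy in domains[y])}
    domains[x] -= removed
    return bool(removed)

def ac1(domains, constraints):
    # Q: both directions of every constraint edge, size 2e.
    q = [(x, y, c) for (x, y), c in constraints.items()] + \
        [(y, x, lambda vy, vx, c=c: c(vx, vy))
         for (x, y), c in constraints.items()]
    revisedp = True
    while revisedp:                 # repeat until nothing changes
        revisedp = False
        for x, y, c in q:
            if revise(domains, x, y, c):
                if not domains[x]:  # domain wipe-out: no solution
                    return False
                revisedp = True
    return True

doms = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
print(ac1(doms, {("X", "Y"): lambda a, b: a < b}), doms)
```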

Lecture 8

Property   Algorithm   Complexity
AC         AC-1        \(O(na^3e) = O(n^3a^3)\)
  • solutions found by
    • detecting components
    • NC
    • normalize constraints
    • NC, AC, other forms of AC
    • Search
    • Verification
    • Solution, Node Visits, Constraint Checks, CPU Time
  • when implementing AC1, terminate if domain wipe-out occurs
  • worst case complexity is seldom reached
  • remember, AC may discover a solution to a CSP, or may find an inconsistent problem (by domain annihilation)
  • AC3
    • for \(i \gets 1\) to \(n\), do \(NC(i)\)
    • \(Q \gets \{(i, j) | (i, j) \in arcs, i \neq j\}\)
    • while \(Q\) is non-empty:
      • select and delete any arc \((k, m) \in Q\)
      • if \(\mathrm{Revise}(k, m)\) then \(Q \gets Q \cup \{(i, k) | (i, k) \in arcs(G), i \neq k, i \neq m\}\)
    • Check for domain wipe-out!
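
A sketch of the AC-3 loop above with the selective re-enqueueing (same hypothetical structures as the AC-1 sketch):

```python
# Sketch of AC-3: only arcs pointing at a revised variable are re-enqueued.
from collections import deque

def ac3(domains, neighbors, check):
    """neighbors: dict var -> set of vars; check(x, vx, y, vy) -> bool."""
    q = deque((i, j) for i in neighbors for j in neighbors[i])
    while q:
        k, m = q.popleft()
        removed = {v for v in domains[k]
                   if not any(check(k, v, m, w) for w in domains[m])}
        if removed:
            domains[k] -= removed
            if not domains[k]:                  # domain wipe-out
                return False
            # re-enqueue only arcs pointing at k (excluding the one from m)
            q.extend((i, k) for i in neighbors[k] if i != m)
    return True

lt = {("X", "Y"), ("Y", "Z")}                   # X < Y and Y < Z

def check(x, vx, y, vy):
    if (x, y) in lt:
        return vx < vy
    if (y, x) in lt:
        return vy < vx
    return True

doms = {v: {1, 2, 3} for v in "XYZ"}
nbrs = {"X": {"Y"}, "Y": {"X", "Z"}, "Z": {"Y"}}
print(ac3(doms, nbrs, check), doms)   # True {'X': {1}, 'Y': {2}, 'Z': {3}}
```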

Lecture 9

  • Remember to deal with Node Consistency – and don’t include it in the check count
  • see lecture 8 notes
  • AC3 is not only more powerful, it is also more efficient
  • worst-case number of arc revisions: \(2e + a(2e - n)\)
  • AC3 is \(O(a^2 (2e + a(2e - n))) = O(a^3e) = O(a^3n^2)\)
  • AC4 is even more efficient, but requires special bookkeeping, is \(O(a^2n^2)\)
  • Variant of AC-3 called AC-2001, but requires cubic space, has yet another variant, AC-3.1\(^{m}\), requires different bookkeeping
  • AC4
    • generate m, s, and counter
      • m and s have as many rows as there are VVPs in the problem, m is set to true, s to empty list
      • m is for alive
      • s is for a list of all VVPs that support a given VVP
      • counter: given a VVP, the number of supports provided by a given variable
    • prepare data structures
      • for every constraint and all its tuples:
        • when a tuple is allowed, update s
          • and increment the counter
    • Then process all constraints
      • if counter is 0, remove from domain, and update \(m\)
      • since \(x\) is removed, then for everything such that it was a support, decrement the counter
  • AC3 tends to be a bit better than AC4

Lecture 10

  • soundness – can trust
  • complete – will always find a solution if exists
  • more efficient AC variants are useful for use during search
  • but when enforcing AC during search, you must be able to keep track of what was removed
  • remember properties are not the same as the algorithms! there may be many algorithms to implement each property!
  • most consistency methods are local initially
    • node/arc remove inconsistent values
    • path removes inconsistent tuples
      • used when you have positive tables of supports
    • they get closer to the solution
    • reduce the size of the problem
    • are cheap (polynomial time)
    • choice of consistency properties helps to reduce time to solution

Intelligent Backtracking Algorithms

  • Read Prosser, CI 93. Hybrid Algorithms for the Constraint Satisfaction Problem
  • Consider Reading 5/6 of Dechter
  • Issues
    • Variable/Value ordering
    • Variable Instantiation
    • Current Path
    • Current Variable
    • Past Variables
    • Future Variables
    • Shallow/Deep Levels/Nodes
    • Search space/Search Tree
    • Back-checking
    • Backtracking
  • As you go, un-assigned variables are the future variables/future sub-problem
  • those that have been instantiated are the past variables
  • the current path is current instantiations
  • always check a new instantiation against past variables (back-checking)
  • algorithms
    • Vanilla: BT
    • Improved Backward: BJ (back-jumping), CBJ (conflict-directed BJ)
    • Improving Forward: BM (back-marking), FC (forward-checking)
  • d-way branching is conceptually easier to understand than 2-way
  • algorithms from Prosser are iterative back-tracking and pluggable/modularized
  • Kondrak and van Beek looked at the efficiency of algorithms

Lecture 11

  • Variables \(V_i\), \(i \in [1, n]\)
  • Domains \(D_i = \{v_{i1}, v_{i2}, \ldots, v_{iM_i}\}\)
  • Constraint between \(V_i\) and \(V_j\) : \(C_{i,j}\)
  • Constraint graph \(G\)
  • Arcs of \(G\): \(\mathrm{Arc}(G)\)
  • Instantiation order is static or dynamic
  • lang primitives are from lisp
  • Data structures
    • \(v\) is a \(1 \times n\) array to store assignments; \(v_0\) is the root – backtracking to it indicates insolubility; \(v_i\) is the assignment to the \(i\)th variable
    • \(domain\) is the original domain
    • \(current-domain\) is the current domain, must be updated on back-tracking
    • check is as before
  • Generic form
    • takes n, status
    • set consistent to true
    • status to unknown
    • \(i \gets 1\)
    • while the status is unknown
      • if consistent, \(i \gets \mathrm{label}(i, consistent)\)
      • else \(i \gets \mathrm{unlabel}(i, consistent)\)
      • if \(i > n\), set status to solution
      • else if \(i = 0\), set status to impossible
  • label and unlabel functions are provided by the backtrack algorithm
  • label is forward move, unlabel is backward move
  • For BT
    • label
      • when \(v_i\) is assigned a value from the \(current-domain_i\), we perform back-checking against past variables
      • if back-checking succeeds, return \(i + 1\)
      • else, remove the assigned value from current-domain[i], try the next value, etc.
      • if no value is left, set consistent to false
      • function
        • takes i, consistent
        • set consistent to false
        • for v[i] in each element of the current domain while not consistent
          • set consistent to true
          • for h from 1 to i - 1 and while consistent
            • set consistent to check(i, h)
          • if not consistent
            • then remove i from current-domain
        • if consistent, return i+1, otherwise i
    • unlabel
      • current level is set to \(i - 1\)
      • for future variables \(j\), reset the current-domain
      • if the domain of \(h\) is not empty, set consistent to true
      • Function
        • takes i, consistent
        • set h to i - 1
        • set current-domain[i] to domain[i]
        • remove v[h] from current-domain[h]
        • set consistent to current-domain[h] not-equal empty-set
        • return h
    • label does the actual instantiation
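
A runnable sketch of the generic form with BT's label/unlabel pair (hypothetical toy all-different problem; 1-indexed arrays padded at index 0):

```python
# Sketch of Prosser-style chronological backtracking (BT) with label/unlabel.
n = 3
domain = [None] + [[1, 2, 3] for _ in range(n)]   # 1-indexed; index 0 unused
current_domain = [None] + [list(domain[i]) for i in range(1, n + 1)]
v = [None] * (n + 1)                              # v[0] is the root

def check(i, h):
    return v[i] != v[h]                           # toy all-different constraints

def bt_label(i):
    """Try values of current_domain[i] against past variables (back-checking)."""
    for value in list(current_domain[i]):
        v[i] = value
        if all(check(i, h) for h in range(1, i)):
            return i + 1, True
        current_domain[i].remove(value)           # value conflicts: prune it
    return i, False

def bt_unlabel(i):
    """Backward move: undo level i, drop v[h] from h's current domain."""
    h = i - 1
    current_domain[i] = list(domain[i])           # restore i's domain
    if h >= 1:
        current_domain[h].remove(v[h])
    return h, h >= 1 and bool(current_domain[h])

def bcssp(n):
    i, consistent, status = 1, True, "unknown"
    while status == "unknown":
        i, consistent = bt_label(i) if consistent else bt_unlabel(i)
        if i > n:
            status = "solution"
        elif i == 0:
            status = "impossible"
    return status, v[1:]

print(bcssp(n))   # e.g. ('solution', [1, 2, 3])
```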

Lecture 12: Phase Transitions

  • Cheeseman 1991
  • Order parameter
  • Probability solution exists for random problems is almost 1
  • there’s a critical value whereby, after it, probability is almost 0
  • around the critical value, the probability is around 0.5
  • cost of solving drops sharply to the right of the critical value, but rises steeply approaching it
  • around probability 0.5 the cost of solving is high no matter the algorithm – this is referred to as a phase transition
  • conjecture regarding the characterization of NP complete problems
  • applies to detecting/implementing arc-consistency
  • random graphs are almost always easy to color – conjecture from a famous paper
    • graph coloring has a reduction operator
    • friends are vertices with the same neighborhood
    • the order parameter here is the (average) degree
  • For CSPs, it’s either density or tightness
  • this currently affects how experiments are conducted – try to deal with the hardest problems
  • but be careful not to focus exclusively on the region around the critical value
  • Run on random CSPs – given statistical analysis
  • Vary params \(\langle n, a, t, p \rangle\)
    • \(n\) – number of variables
    • \(a\) – domain size
    • \(t\) – tightness \(t = \frac{|\mathrm{forbidden\ tuples}|}{|\mathrm{all\ tuples}|}\)
    • \(p\) – proportion of constraints \(p = \frac{e}{e_{max} - e_{min}}\), also \(\frac{2e}{n(n-1)}\)
    • \(t\) and \(p\) will be between 0 and 1
    • Can have issue with uniformity, difficulty (phase transition), solvability
    • empirical studies can be done simply by varying one of the parameters
    • do it (Model B):
      1. generate \(n\) nodes
      2. generate a list of \(\frac{n(n-1)}{2}\) tuples of all combinations of 2 nodes.
      3. choose e elements from the above list as constraints to go between the \(n\) nodes
      4. if the graph is not connected, throw it away and go back to step 3, otherwise proceed
      5. generate a list of \(a^2\) tuples of all combinations of 2 values
      6. For each constraint, choose randomly a number of tuples from the list to guarantee tightness \(t\) for the constraint.
    • Model B instances may be flawed, but it’s rare
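
A sketch of a Model B generator for \(\langle n, a, t, p \rangle\) instances (hypothetical Python; constraints stored as sets of forbidden tuples):

```python
# Sketch of the Model B generator described above.
import random
from itertools import combinations

def connected(n, edges):
    seen, stack = {0}, [0]
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j); adj[j].add(i)
    while stack:
        for j in adj[stack.pop()]:
            if j not in seen:
                seen.add(j); stack.append(j)
    return len(seen) == n

def model_b(n, a, e, t):
    """Random binary CSP: e constraints over n vars, domain size a, tightness t."""
    while True:
        pairs = list(combinations(range(n), 2))       # all n(n-1)/2 pairs
        edges = random.sample(pairs, e)               # choose e constraints
        if connected(n, edges):
            break                                     # else throw away, retry
    all_tuples = [(x, y) for x in range(a) for y in range(a)]
    k = round(t * a * a)                              # forbidden tuples per constraint
    return {edge: set(random.sample(all_tuples, k)) for edge in edges}

csp = model_b(n=6, a=4, e=8, t=0.25)                  # constraints as no-goods
print(len(csp), "constraints,", len(next(iter(csp.values()))), "no-goods each")
```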

Lecture 13: Back-tracking: Continued

  • keep track of path – an array of the instantiations thus far and to be created
  • remember what unlabel does – it actually performs the backtracking
  • many different ways to order variables
  • when doing value-ordering, use lexicographic ordering!
  • eventually, variable ordering heuristics must be broken with lexicographic ordering

Back-Jumping

  • uses \(max-check\), an array; when check(i, h) succeeds, \(max-check[i] \gets \max(max-check[i], h)\)
  • if the current domain of \(h\) is empty, chronological back-tracking is performed from h
  • you go back to \(max-check[i]\) to fix stuff when you have an empty variable, unless you’ve already back-jumped in the current path
  • BJ just modifies BT-label and BT-unlabel
    • BJ-label just updates \(max-check\)
    • bj-unlabel
      • backtrack to h = max-check[i]
      • resets max-check[j] = 0 for j in \(h+1\) to \(i\)

Conflict-directed BJ

  • requires a new data structure, the conflict-set array
    • tracking at what levels conflicts occur
  • knows how to jump-back again, to the deepest level of conflict
  • can jump-back more than once!
  • useless if good variable ordering
  • conflict sets are initialized to \(\{0\}\)
  • At any point, conf-set is a subset of past variables that are in conflict with \(i\)
  • when check(i, h) fails add h to conf-set[i]
  • when domain is empty
    • jumps to deepest past variable in conf-set
      • update \(conf-set[h] \gets conf-set[h] \cup (conf-set[i] \setminus \{h\})\)
  • To calc all solutions – don’t use CBJ
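
A sketch of the conf-set bookkeeping above (hypothetical Python harness around the data structure only):

```python
# Sketch of CBJ's conflict-set bookkeeping.
# conf_set[i]: levels of past variables in conflict with variable i.
n = 4
conf_set = [{0} for _ in range(n + 1)]   # initialized to {0} (the root)

def record_failure(i, h):
    """check(i, h) failed: remember h as a culprit for level i."""
    conf_set[i].add(h)

def jump_back(i):
    """Domain of i wiped out: jump to deepest conflicting level, merge sets."""
    h = max(conf_set[i])
    conf_set[h] |= conf_set[i] - {h}     # carry i's culprits back to h
    conf_set[i] = {0}
    return h

record_failure(4, 1); record_failure(4, 3)
print(jump_back(4), conf_set[3])         # 3 {0, 1}
```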

Lecture 14: Back-tracking: Continued

CBJ Continued

  • max-check – deepest level checked against – if instantiated, will be \(i - 1\)
  • conflict set is used for holding what’s conflicted
  • Finding all solutions with CBJ
    • use the conflict sets
    • or use CBF from Kondrak
    • keep track of number of solutions
  • make sure to check generated solutions with the solution checker!!
  • modify the algorithm so that, after a solution, it backtracks to every level at least once!
    • in actuality, each level must be backtracked to exactly once
  • look at Chris Thiel's response
    • when a solution is found, force the last var to conflict with everything before it; in turn, this forces some chronological backtracking as the conf-sets are propagated backward
  • Kondrak's response
    • cbf flag, vector
      • \(\forall i, cbf[i] \gets 0\)
      • when you find a solution, \(\forall i, cbf[i] \gets 1\)
      • In unlabel
        • if \(cbf[i] = 1\)
          • \(h \gets i - 1\), \(cbf[i] = 0\)
          • else \(h \gets max(conf-set[i])\)

Lecture 14: Forward Steps

Backmarking

  • tries to reduce the amount of consistency checking
    • suppose \(v[i]\) is to be reassigned value \(k\)
    • \(v[i] \gets k\) was last checked against \(v[h] \gets g\)
    • if \(v[h]\) has not been modified since, that check need not be repeated
  • not super useful – as consistency enforcement is more useful
  • this helps to reduce back checking
  • data structures, max back-check level, minimum backup level
    • \(mcl\) – max check level, an \(n \times m\) array; \(mcl[i, k]\) stores the deepest variable that \(v[i] \gets k\) was checked positively against (a finer-grained record than a single level)
    • \(mbl\) – min backup level, an \(n\) array; gives the shallowest past variable whose value has changed since \(v[i]\) was last the current variable
  • BM does not allow dynamic variable ordering
  • suppose the deepest variable that \(v[i] \gets k\) was checked against is \(v[j]\)
  • values of variables in the past of \(v[j]\) (\(h < j\)) have not changed
  • so we need to check \(v[i] \gets k\) against the values of the variables between \(v[j]\) and \(v[i]\)
  • we do not need to check against the values of the variables in the past of \(v[j]\)
  • when \(mcl[i,k] < mbl[i]\): don't check – it fails! (type A savings)
  • when \(mcl[i,k] \geq mbl[i]\): checks below \(mbl[i]\) are known to succeed – only re-check from \(mbl[i]\) on (type B savings)

Forward checking

  • Looks ahead from the current variable, considering all future variables, and clears from their domains the values that aren't consistent with the current partial solution
  • FC means more work in the future, but causes fewer nodes to expand
  • When it moves forward, the values of the current domains of future variables are all compatible with past assignments, saves backchecking
  • FC may cause wipe-out, but discovers conflicts early on. will backtrack
  • goal is to find failures early – avoids expanding fruitless subtrees
  • does not truly enforce arc consistency – that's RFL (real full lookahead)
  • requires a data structure to keep track of filtering; can be done in various ways

Lecture 15: Forward Steps, Continued

Forward Checking

  • look-ahead – revises future variables
  • makes more work at instantiation, but expands fewer nodes
  • as you move forward, values in future variables are consistent with past variables
  • can cause wipe-out and conflict discovery early on, backtracking chronologically
  • requires current domain, stack of variables removed by previous instantiations
  • future-fc[i] is the subset of future variables that \(v[i]\) checks against
  • past-fc is past variables that checked against v[i]
  • requires functions
    • check-forward – called when instantiating v[i]
      • revise(j, i)
      • returns false if current-domain[j] is empty, true otherwise
      • values removed from current-domain[j] are pushed as a set onto reductions[j]
    • update-current-domain
      • \(current-domain[i] \gets domain[i] \setminus reductions[i]\)
    • fc-label
      • attempt to instantiate current-variable
      • filters the domains of future variables
      • on wipe-out of future variable, uninstantiate and undo domain filtering
  • Suffers from thrashing, as it is based on BT
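
A sketch of FC's check-forward and the undo step (hypothetical Python structures; reductions as stacks of removed sets):

```python
# Sketch of forward checking's check-forward / undo pair.
def check_forward(i, v_i, future, current_domain, check, reductions):
    """Filter future domains against v[i] = v_i; False on domain wipe-out."""
    for j in future:
        removed = {w for w in current_domain[j] if not check(i, v_i, j, w)}
        current_domain[j] -= removed
        reductions[j].append(removed)        # remember what this level removed
        if not current_domain[j]:
            return False                     # wipe-out: caller must undo
    return True

def undo_reductions(future, current_domain, reductions):
    """Un-instantiate: restore the values removed by the failed instantiation."""
    for j in future:
        if reductions[j]:
            current_domain[j] |= reductions[j].pop()

cd = {2: {1, 2}, 3: {1}}
red = {2: [], 3: []}
ok = check_forward(1, 1, [2, 3], cd, lambda i, a, j, b: a != b, red)
print(ok, cd)                                # False {2: {2}, 3: set()}
undo_reductions([2, 3], cd, red)
print(cd)                                    # domains restored
```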

Variable Ordering

  • dom, least domain, smallest domain first
  • degree of variable (number of neighbors)
  • dom/deg is most common
  • Degree is difficult to maintain, as it changes very rapidly
    • can/should be implemented statically

Consistency Check Summary

  • CBJ
    • uses backtracking, no additional structures
  • backmarking uses mcl and mbl
    • two types of consistency-checking savings
  • forward checking
    • works more at each instantiation, but expands fewer subtrees
    • uses reductions, future-fc, past-fc
  • FC-CBJ is the best. Period
  • Other Backtracking algos
    • Graph-based back-jumping – Dechter
    • Pseudo-trees [Freuder, 85]
  • Other lookahead – DAC, MAC – DAC seems to be pretty good as a compromise between MAC and FC
  • More empirical evaluations
    • now by generating random problems
  • theoretical comparisons

Implementation notes

  • Enforce NC
  • Normalize constraints
  • check for empty relations
  • Interrupt as soon as you detect domain wipe-out
  • Remember dynamic variable ordering includes the domino effect
    • a/k/a pick singleton domains first

Lecture 15: Intelligent Backtracking – Theoretical Eval

  • Kondrak and Van Beek, IJCAI 95
  • Dechter 5, 6
  • most backtracking algorithms are evaluated empirically
  • performance depends heavily on problem instance
  • is therefore average case analysis
  • representativeness of the test problems
  • CSP is NP-complete – it is always possible to construct examples where BJ/CBJ are worse than BT
  • comparison criteria are different
  • the paper gives a theoretical approach
    • rates algorithms by number of nodes visited
    • number of constraint checks
  • BT, CBJ perform the same amount of consistency checks
  • BM reduces consistency checks
  • BM does not affect the number of nodes visited
  • FC-CBJ may visit more nodes than CBJ
  • proves correctness of BJ/CBJ
  • determines partial order between algos
  • proves BJ, CBJ are correct
  • proves FC never visits more nodes than BJ
  • Improves BMJ and BM-CBJ to perform even fewer consistency checks
  • provides framework for characterizing future BT algorithms
  • BT extends partial solutions – a partial solution is consistent with a set of uninstantiated variables if it can be consistently extended to them; that is, a new partial solution, larger by one variable, can be created that is still consistent
  • Dead end – when all values of current variables are rejected
  • lower level are closer to the root
  • higher levels are closer to the fringe
  • 2 BT algos are equiv if on every csp they generate the same tree and perform the same consistency checks
  • Lemma 1: If BJ backtracks to variable \(x_h\) from a dead end at \(x_i\), then \((a_1, \ldots, a_h)\) is not a valid partial solution

Lecture 16: Theoretical Eval, Continued

  • sufficient conditions
    • BT visits a node if its parent is consistent
    • BJ visits a node if its parent is consistent with all variables
    • CBJ visits a node if its parent is consistent with all sets of variables
    • FC visits a node if it is consistent and its parent is consistent with all variables
  • Necessary conditions
    • BT visits a node only if its parent is consistent
    • BJ visits a node only if its parent is consistent with all variables
    • CBJ visits a node only if its parent is consistent with all sets of variables
    • FC visits a node only if it is consistent and its parent is consistent with all variables
  • BT visits all nodes FC visits
  • BJ visits all nodes CBJ visits
  • CBJ and FC are not comparable.
  • FC will visit no more nodes than BJ
  • BT and BM are equivalent, as are BJ and BMJ
  • proof of correctness – termination, soundness, completeness
  • extends for seeking all solutions
  • BM hybrids should be fixed, by changing MBL to an \(n \times m\) array, creating BM2, BMJ2 and BM-CBJ2

Lecture 17: Search orders in CSPs

  • assumptions
    • finite domains
    • binary constraints
  • Chapter 6, Tsang is req’d reading
  • Dual viewpoint heuristics for binary constraint satisfaction problems is recommended reading
  • In BT, some orders have fewer backtracks
  • in lookahead, some ordering detect failure earlier
  • variable ordering is often the most constrained variable
  • value ordering is often the most promising value first
  • using dynamic ordering in lookahead algos often removes the need for intelligent backtracking
  • Régin and Bessière argue that CBJ and similar aren't needed – but this isn't settled
  • Select values that don’t cause constraint violation – do the most promising value first, avoiding constraint violation
  • Discover constraint violation quickly – select variables that don’t delay discovery of constraint violation, go with the most constrained variable first
  • Notation: for a given future variable \(V_j\), the assignment \(V_i \gets x\) partitions the domain of \(V_j\) into a set of values consistent with the assignment, of size \(\mathrm{Left}(V_j | V_i \gets x)\), and a set of inconsistent values, of size \(\mathrm{Lost}(V_j | V_i \gets x)\) – a code sketch follows this list
  • Value selection of Lost and Left
  • Value Ordering to quickly find a solution
    • min-conflict – orders values according to the conflicts in which they are involved with the future variables – most popular (Minton), \(Cost(V_i \gets x) = \sum_{V_j \neq i} \mathrm{Lost}(V_j | V_i \gets x)\)
    • Cruciality – (Keng and Yu, 1989) \(Cruciality(V_i \gets x) = \sum_{V_j \neq i} Lost(V_j | V_i\gets x) / |D_{V_j}|\)
    • promise – most powerful (Geelen, 1992) \(Promise(V_i \gets x) = \prod_{V_j \neq i} Left(V_j | V_i \gets x)\), the number of assignments that can be done such that no constraint on \(V_i\) is broken.
    • Others are defined
  • Promise
    • recognizes domain wipe-out
    • product discriminates better than sum
    • semantics of promise gives an upper bound on number of solutions
  • FFP name is contested
    • goal is to recognize dead-ends asap to save search effort
    • choose ordering that reduces branching factor
    • Early ones
      • LD
      • maximal degree
      • minimal ratio of domain size over degree
      • promise for variables
      • Brélaz heuristic
      • weighted degree (Boussemart et al, ECAI 2004)
      • graph-based
        • minimal width
        • induced width ordering \(w^*\)
        • maximal cardinality ordering
        • minimal bandwidth ordering
      • remember to exploit the domino effect, especially under dynamic var ordering
      • always state how you are breaking ties – generally lexicographically
      • is often cheap and effective
      • in most cases, suitable for both static and dynamic ordering
    • Promise for variables – Sum of the promises of the values, minimize the number of assignments that can be done such that no constraint on \(V_i\) is broken. Has upper bound on the number of solutions
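
A sketch of Lost/Left and the min-conflict and promise value orderings above (hypothetical Python; binary constraints as a predicate over value pairs):

```python
# Sketch of the Lost/Left counts and two value-ordering heuristics.
from math import prod

def left(vj_domain, check, x):
    return sum(1 for y in vj_domain if check(x, y))

def lost(vj_domain, check, x):
    return sum(1 for y in vj_domain if not check(x, y))

def min_conflict_cost(x, future, check):      # Minton: sum of Lost
    return sum(lost(dom, check, x) for dom in future.values())

def promise(x, future, check):                # Geelen: product of Left
    return prod(left(dom, check, x) for dom in future.values())

future = {"V2": {1, 2, 3}, "V3": {2, 3}}
neq = lambda x, y: x != y
for x in (1, 2, 3):                           # ordering values of V1
    print(x, min_conflict_cost(x, future, neq), promise(x, future, neq))
# min-conflict prefers the smallest cost; promise prefers the largest product
```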

Lecture 18: Ordering Heuristics

  • Least Domain
  • Max Domain
  • Domain/Degree
  • Promise is based on what has the most promise given the current state
  • Brélaz heuristic – originally used for graph coloring
  • Remember, variable ordering heuristics are cheap and effective
  • generally suitable for static and dynamic ordering
  • graph-based heuristics aren’t often used anymore
  • Brélaz
    • originally designed for coloring
    • assigns most constrained nodes first
    • arrange vars in decreasing order of degrees
    • color a vertex with maximal degree with color 1
    • choose a vertex with a maximal saturation degree (number of different colors it is adjacent to)
      • if there's a tie, choose any vertex of maximal degree among the uncolored
    • color chosen vertex, with lowest numbered color
    • if all vertices are colored, stop, otherwise, choose another vertex and continue
  • wdeg
    • All constraints assigned a weight, initially 1
    • every time a constraint is broken, during propagation with look-ahead (constraint causing domain wipe-out), weight is increased
    • the weight of an un-assigned variable is defined as the sum of the weights of the constraints that apply to the variable
    • the variable with the largest weight is chosen for instantiation
    • weights are not reinitialized at back-track
    • refined with dom/wdeg
    • inspired by breakout heuristic in local search
    • However, there may be places where dom/deg is more efficient than dom/wdeg
    • A decay function may help improve this
  • Graph heuristics
    • ordering by elimination or instantiation
    • minimal width
      • given an ordering, the width of a node is the number of its parents (earlier neighbors)
      • the ordering's width is the largest width of any of its nodes
      • the sum of the node degrees equals twice the number of edges
      • the width of the graph is the minimum width over all orderings
      • this can reduce chance of backtracking, can help to find minimum width ordering
      • Algo:
        • remove all nodes not connected to any others
        • \(k \gets 0\)
        • while nodes remain in the graph
          • \(k \gets k + 1\)
          • while there are nodes connected to at most \(k\) others
            • remove such nodes from the graph, along with any edges connected to them
        • return \(k\)
        • minimal width ordering of the nodes is obtained by taking the nodes in the reverse order that they were removed
    • Induced Width Ordering
      • What happens when you reconnect neighbors of what you remove to each other
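
A sketch of the minimal-width procedure above (hypothetical Python; the reverse removal order gives the ordering):

```python
# Sketch of min-width: repeatedly strip nodes of smallest degree;
# reversing the removal order yields a minimal-width ordering.
def min_width_ordering(adj):
    """adj: dict node -> set of neighbors; returns (width, ordering)."""
    adj = {u: set(vs) for u, vs in adj.items()}        # work on a copy
    removed = [u for u, vs in adj.items() if not vs]   # isolated nodes first
    for u in removed:
        del adj[u]
    k = 0
    while adj:
        k += 1
        while True:                  # nodes connected to at most k others
            low = [u for u, vs in adj.items() if len(vs) <= k]
            if not low:
                break
            for u in low:
                if u not in adj:
                    continue
                for v in adj.pop(u):
                    if v in adj:
                        adj[v].discard(u)
                removed.append(u)
    return k, list(reversed(removed))

graph = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(min_width_ordering(graph))     # (2, ['C', 'B', 'A', 'D'])
```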

Guest Speaker: Philip Walton, Workday

  • From Workday, SaaS company for HR management
  • work has primarily been in Data Analysis and optimization
  • go from techniques and technologies to money
  • Could be a Linear Program
    • Measures, Items (index sets)
    • indices i, m
    • variables Quantity, 0 to N
    • model sets, limit, value, weight
    • maximize value and quantity
    • but must be within weight
  • big thing is assigning resources in a data center
  • truck scheduling
    • many truck loads that a company has to move
      • origin and destination specified
      • and time windows at origin and destination
    • many drivers, many trucks
    • some time horizon
    • DOT rules
    • operational rules
      • returning to base
      • equipment preventative maintenance
    • Taking a run at the whole thing is probably too much
      • run a route model to pick loads that go together
      • schedule loads – the big thing for CSPs
      • drivers pick the sets of loads they prefer
  • modeling
    • discrete time forms an integer domain
      • can be clumpy
    • use start time of the task as the domain
    • constraints like before and after are relatively straightforward
    • duration of driving/time windows are pretty easy to express
    • small disruptions are normally readily repaired by freeing up parts of the model and letting it re-solve
    • time windows tend to be pretty large
    • isn’t always a notion of an optimal schedule
    • side constraints can be pretty easy to represent
      • work rules aren’t always easy
      • drivers may have odd drive/sleep patterns
      • side resource use is tolerable
  • Robustness for many things has to be dealt with, but how that's done is very different
  • And remember how important interface is – they’re both important, and very, very different
  • And even more important is understanding what’s going on within the problem domain
  • good large software is built on top of amazing small software

Lecture 19: Variable Orderings, continued

Induced Width Ordering, \(\omega^{*}\)

  • requires designing induced constraints – can cause much, much less backtracking
  • these are called fill-in edges
  • this procedure moralizes the graph
  • computing min induced width is NP hard
  • MIN FILL heuristic is used
    • when removing a node, add an edge between every two nodes connected to it but not already connected to each other
    • remove the node whose removal adds the smallest number of fill-in edges
    • Remember to remove the node with the smallest fill-in first! (sketched below)
  • A tree is a graph with induced width 1
  • An ordering has width, as does a graph. Graph width is not dependent on ordering, nor is induced width, \(\omega^{*}\) is called induced width or tree width.
  • triangulated or chordal graphs are graphs in which every cycle of length 4 or more has a chord – an edge connecting two non-consecutive vertices of the cycle
    • Makes it easy to find largest clique, can actually be done in polynomial time
    • is a type of perfect graphs
    • interval graphs are a type of triangulated graph
    • finding the minimum number of triangulating edges is also NP-hard
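
A sketch of the MIN FILL elimination heuristic above (hypothetical Python; adjacency as a dict of sets):

```python
# Sketch of MIN FILL: repeatedly eliminate the node whose elimination
# adds the fewest fill-in edges, connecting its neighbors as we go.
def fill_in(adj, u):
    """Edges needed to make u's neighborhood a clique."""
    nbrs = list(adj[u])
    return [(a, b) for i, a in enumerate(nbrs) for b in nbrs[i + 1:]
            if b not in adj[a]]

def min_fill_ordering(adj):
    adj = {u: set(vs) for u, vs in adj.items()}
    order = []
    while adj:
        u = min(adj, key=lambda x: len(fill_in(adj, x)))   # fewest fill-ins
        for a, b in fill_in(adj, u):                       # add fill-in edges
            adj[a].add(b); adj[b].add(a)
        for v in adj.pop(u):
            adj[v].discard(u)
        order.append(u)
    return order

graph = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
print(min_fill_ordering(graph))   # elimination ordering; fill edges added on the fly
```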

Maximal Cardinality ordering

  • Choose an arbitrary node
  • among remaining nodes
    • choose the one connected to the maximum number of already chosen nodes
    • break ties arbitrarily
  • Repeat until done
  • Obtained ordering is instantiation ordering
  • the reverse of the obtained ordering is a perfect elimination ordering (PEO)
  • If the graph does not need to be moralized (no fill-in edges), the graph is triangulated

Lecture 20: Variable Orderings, continued

Max Cardinality, Continued

  • width of an induced ordering is same as graph
  • MWO can help to guarantee back-track free search
  • Lots of algos came from RDBMs
  • Computing the induced width is NP-hard!
  • Simplicial graph is such that looking at an elimination ordering, all parents of a node form a clique
  • moralized is triangulated

Minimal Bandwidth Ordering

  • bandwidth
    • of a node – largest distance between the node and its neighbors in the ordering
    • of an ordering – largest bandwidth of the nodes in the ordering
    • of a graph – smallest bandwidth across all possible orderings
  • heuristic localizes and confines backtracking
  • smaller bandwidth, sooner ability to backtrack
  • finding a minimum bandwidth ordering is NP-hard
  • but finding an ordering of a given bandwidth is polynomial
  • Tsang 6.2.2, only one paper

Ordering Heuristics

  • When
    • static var, val ordering
    • dynamic var, static val
    • dynamic var, dynamic val – dynamic vvp
  • at each given level
    • VVPs pertaining to the same variable across a given level, static-static
    • Vars are different, dyn-static
    • vvps are different at each and every level
  • Computing
    • sort vars at preprocessing for static
    • dynamic selects the var during search, based on current domain and all uninstantiated neighbors – exploits the domino effect!!

Lecture 21: Filtering For Constraints of Difference

  • GAC – Generalized Arc Consistency, for non-binary CSPs
    • Operates on table constraints, by supports or by no-goods
    • \(\pi_A(R)\) gives a new relation with only values of A
      • May be a Set, or a Bag
  • finite CSPs are linear in space, quadratic in time when dealing with all-diff constraints
  • Sometimes, all diff may become many constraints that together are equivalent, but not always
  • GAC4 handles n-ary constraints
    • has good pruning power
    • but is very expensive – \(\frac{d!}{(d-p)!}\)
  • a value is consistent if there exist values for all the other variables such that, together, they simultaneously satisfy the constraint
  • a constraint is consistent if all values for all its variables are consistent
  • a CSP is arc-consistent if all its constraints are consistent
  • a CSP is diff-arc-consistent iff all its all-diff constraints are arc-consistent
  • value graphs are bipartite
    • vertices are \(X_C \cup \bigcup_{x \in X_C} D_x\)
    • edges \((x_i, a)\) iff \(a \in D_{x_i}\)
  • Space is \(\mathcal{O}(nd)\)
  • a matching is a subset of edges in \(G\) with no vertex in common
  • a maximum matching is the biggest possible matching
  • matching covers a set \(X\) – every vertex in \(X\) is an endpoint for an edge in matching
  • \(M\) that covers \(X_C\) is a max matching.
  • If every edge in \(GV(C)\) is in a matching that covers \(X_C\), \(C\) is consistent.
  • CSP is diff-arc-consistent iff, for every all-diff \(C \in \mathcal{C}\), for every edge \(GV(C)\) belongs to a matching that covers \(X_C\) in \(GV(C)\)
  • Build graph, compute max matching, if size of max matching is smaller than \(X_C\), return false, otherwise remove edges from G given the max matching and return true
    • use Maximal Flow in Bipartite graph
    • Or Hopcroft/Karp algo
  • Then we have an all-diff constraint, value graph and a maximum covering
    • remove edges that aren’t in a matching covering
  • given a Matching M
    • matching edge is an edge in M
    • free edge is an edge not in M
    • matched vertex is incident to a matching edge
    • free vertex is otherwise
    • alternating path (cycle) – a path (cycle) whose edges are alternatively matching and free
      • flip colors on even alternating path – gives another matching of same size
      • flip colors on alternating cycle – another matching
    • length of path – number of edges in path
    • vital edge belongs to every maximum matching
  • An edge belongs to some but not all maximum matchings iff, for an arbitrary maximum matching \(M\), it belongs to either an even alternating cycle or an even alternating path that begins at a free vertex
  • Edges to remove are those in no maximum matching: not vital, not on an even alternating path starting at a free vertex, and not on an even alternating cycle
  • every directed cycle in \(G_0\) corresponds to an even alternating cycle of \(G\)
  • every directed simple path starting at a free vertex corresponds to an even alternating path of \(G\) starting at a free vertex
  • make black point one way, red the other
  • allows for all-diff and other global constraints
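
A sketch of the first step above – a maximum matching of the value graph via augmenting paths (Kuhn's algorithm; hypothetical Python, simpler than Hopcroft–Karp):

```python
# Sketch: maximum matching in the all-diff value graph via augmenting paths.
def max_matching(var_domains):
    """var_domains: dict var -> iterable of values. Returns var -> value."""
    match_val = {}                       # value -> variable currently using it

    def try_augment(var, seen):
        for val in var_domains[var]:
            if val in seen:
                continue
            seen.add(val)
            # value free, or its current owner can be re-matched elsewhere
            if val not in match_val or try_augment(match_val[val], seen):
                match_val[val] = var
                return True
        return False

    for var in var_domains:
        try_augment(var, set())
    return {var: val for val, var in match_val.items()}

doms = {"X1": [1, 2], "X2": [1], "X3": [2, 3]}
m = max_matching(doms)
print(m)                                 # covers all three variables
# If |matching| < number of variables, the all-diff constraint is inconsistent.
print(len(m) == len(doms))
```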

Lecture 22: CSPs and Relational DB

  • connections
    • constraint databases and constraint logic programming
    • query processing and solving csps
  • shared
    • constraint DBs (deductive DBs, Datalog) and constraint logic programming share representation
    • relational databases and constraint satisfaction share computations
  • Constraint databases use LP – Prolog for DB, Datalog
  • Prolog was a bit of a failed experiment
  • CLP(R,F) is prolog with CSP techniques for either reals or finite domains
  • Relations
    • binary relations – given \(D_a\) and \(D_b\), a set of 2-tuples \(\langle x, y \rangle\) with \(x \in D_a\) and \(y \in D_b\) defines a relation \(R_{ab} = \{\langle x, y \rangle\} \subseteq D_a \times D_b\)
    • Function – a binary relation such that there is at most one tuple \(\langle x, ? \rangle \in R_{ab}\). \(D_a\) is the domain, \(D_b\) is the co-domain.
    • \(k\)-ary relations are relations on \(k\) elements
  • representation
    • 2-dimensional binary array, though k-dimension works
    • tables
    • linked-lists of pairs
    • hash-maps of sets
    • many other options
  • Terms
    • table, relation – constraint
    • arity of relation – arity of variable
    • attribute – variable
    • value of attribute – value of var
    • domain of attribute – domain of variable
    • tuple in table – tuple in constraint/allowed by constraint/consistent with constraint
  • Relational algebra
    • intersection
      • input two relations of same scope
      • output a new, more restrictive relation with the same scope made of tuples in all the input relations simultaneously
      • logical and
    • union
      • input two relations of the same scope
      • output a less restrictive relation with the same scope, made of tuples that are in any of the input relations
      • logical or
    • difference – same as set theory
    • selection
    • projection
    • join
    • composition (from CSP)
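
A sketch of selection, projection, and natural join over set-of-tuples relations (hypothetical Python representation; rows stored as hashable attribute/value tuples):

```python
# Sketch of relational-algebra operations; a relation is (scope, set of rows).
def selection(rel, pred):               # sigma_p(R): row selection
    scope, rows = rel
    return scope, {r for r in rows if pred(dict(r))}

def projection(rel, attrs):             # pi_s(R): column elimination (set semantics)
    scope, rows = rel
    return attrs, {tuple((a, dict(r)[a]) for a in attrs) for r in rows}

def join(r1, r2):                       # natural join
    s1, rows1 = r1
    s2, rows2 = r2
    common = [a for a in s1 if a in s2]
    out_scope = tuple(s1) + tuple(a for a in s2 if a not in s1)
    out = set()
    for a in rows1:
        for b in rows2:
            da, db = dict(a), dict(b)
            if all(da[c] == db[c] for c in common):   # cartesian product if none
                merged = {**da, **db}
                out.add(tuple((k, merged[k]) for k in out_scope))
    return out_scope, out

R1 = (("V1", "V2"), {(("V1", 1), ("V2", 2)), (("V1", 2), ("V2", 3))})
R2 = (("V2", "V3"), {(("V2", 2), ("V3", 3))})
print(join(R1, R2))                               # all solutions of the chain
print(projection(join(R1, R2), ("V1", "V3")))     # the induced/composed relation
```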

Lecture 23: CSPs and Relational DBs, continued

  • Selection and projection are unary operators
  • Selection
    • Input – a relation \(R\) and a predicate on attributes of \(R\)
    • Output a relation \(R^\prime\) with the same scope as \(R\) but having only a subset of the tuples in \(R\)
    • Row selection
    • \(\sigma_{p}(R)\)
  • Projection
    • Input – a relation \(R\) and a subset \(s\) of the scope
    • Output – a relation \(R^\prime\) of scope \(s\) with the tuples rewritten such that positions not in \(s\) are removed
    • column elimination – set semantics in CSP, bag semantics in DB
    • \(\pi_{\{x_1, x_2\}}(R)\)
  • Join (natural join)
    • Input – \(R\) and \(R^\prime\)
    • Output a relation \(R^{\prime\prime}\) whose scope is the union of \(R\) and \(R^\prime\) and tuples satisfy both
      • if they have no common attribute, cartesian product
      • if they have an attribute in common, compute all possibilities
    • Computes all solutions to a csp
    • \(R \bowtie R^\prime\)
  • Composition of two relations
    • Input – two binary relations \(R_{ab}\) and \(R_{bc}\) with 1 var in common
    • Output – a new induced relation \(R_{ac}\), to be combined by intersection with any pre-existing relation between them
    • bit matrix op is matrix multiplication
    • Generalization as constraint synthesis
    • Produces induced relations a/k/a indirect, inferred, implicit
  • Given \(V_1\) and \(V_2\) and \(C_1\) and \(C_2\) between them, how do \(C_1 \cap C_2\) and \(C_1 \bowtie C_2\) relate? – They’ll be the same in this situation
  • Given \(V_1\), \(V_2\), \(V_3\), \(C_{V_1, V_2}\) and \(C_{V_2, V_3}\), write the induced \(C_{V_1, V_3}\) in relational algebra. – \(C_{V_1, V_3} = \pi_{V1, V3}(C_{V_1, V_2} \bowtie C_{V_2, V_3}) = C_{V_1, V_2} \circ C_{V_2, V_3}\)
  • Given \(V_1, V_2, V_3\), \(C_{V_1, V_2}, C_{V_2, V_3}, C_{V_1, V_3}\), \(C^{\prime}_{V_1, V_3} = (C_{V_1, V_2} \circ C_{V_2, V_3}) \cap C_{V_1, V_3}\)
  • Natural join – synthesized constraint
  • left/right join – synth constraint with some inconsistent tuples
  • Differences
    • number of relations
      • db have few
      • csp have many
    • size of relations
      • dbs have high-arity
      • csps are mostly binary
    • bounds of relations
      • dbs are very tight
      • csps are loose
    • size of domains
      • db large
      • csp small
    • type of domain
      • DB domains are almost always finite; CSP domains may be continuous
    • number of solutions sought
      • dbs looking for almost all
      • csp only looks for one in general
    • cost-model
      • DB is I/O, memory
      • CSP is computational

Lecture 24: Path Consistency and Global Consistency

  • consistency helps to reduce possible search-space
  • A path of length \(m\) is consistent iff for any value \(x \in D_{V0}\) and for any value \(y \in D_{Vm}\) that are consistent, there exists a sequence of values in the domains of the variables \(V_1, V_2, \ldots, V_{m - 1}\) such that all constraints between them (along the path, not across) are satisfied
  • A CSP is path consistent if any path of any length is consistent
  • However, by Mackworth AIJ77, only paths of length 2 are necessary, \(\mathcal{O}(n^3)\)
  • the algorithm looks at every vertex as a mid-point \(k\), and makes \(i\) and \(j\) consistent together given \(k\),
    • and \(R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})\)
    • Closure algorithm. \(k\) is outer loop
    • Until fixed point
      • for k from 1 to n
        • for i from 1 to n
          • for j from 1 to n
            • \(R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})\)
    • Algo (PC-1)
      • \(Y^n \gets R\)
      • repeat
        • \(Y^0 \gets Y^n\)
        • for \(k\) from 1 to \(n\)
          • for \(i\) from 1 to \(n\)
            • for \(j\) from 1 to \(n\)
              • \(Y_{ij}^{k} \gets Y_{ij}^{k-1} \cap (Y_{ik}^{k-1} \cdot Y_{kk}^{k-1} \cdot Y_{kj}^{k-1})\) (the standard PC-1 update)
      • until \(Y^n = Y^0\)
      • \(Y \gets Y^n\)
  • Difference between k consistent and strong k consistency
  • PC1 terminates, is sound and complete for discrete CSPs - \(\mathcal{O}(a^5n^5)\)
  • In a complete graph, if every path of length 2 is consistent, the network is path consistent
    • PC1 has two operations, composition and intersection
    • proof is done by induction
  • Mohr and Henderson also developed PC-3 (PC-2 is Mackworth's)
  • and Han, Lee created PC4
  • PC5 from Singh uses support bookkeeping from AC6
  • PC8 uses domains, not constraints, 2001 improves on PC8, but isn’t tested
  • PC is rarely used except for certain types of constraints
  • monotonic constraints – ordering based
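
A sketch of the PC update \(R_{ij} \gets R_{ij} \cap (R_{ik} \cdot R_{kj})\) with relations as boolean matrices, composition being boolean matrix multiplication (hypothetical Python):

```python
# Sketch: path-consistency update on bit-matrix relations.
def compose(r1, r2):
    n, m, p = len(r1), len(r2), len(r2[0])
    return [[any(r1[i][k] and r2[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def intersect(r1, r2):
    return [[a and b for a, b in zip(row1, row2)] for row1, row2 in zip(r1, r2)]

# Hypothetical 3-value domains: R_ik is "<", R_kj is "<", R_ij is universal.
lt = [[x < y for y in range(3)] for x in range(3)]
universal = [[True] * 3 for _ in range(3)]
r_ij = intersect(universal, compose(lt, lt))
print(r_ij)   # only (0, 2) survives: x < k < y forces y - x >= 2
```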

Lecture 25: Path Consistency and Global Consistency – continued

  • completion allows you to use paths of length 2.
  • Strong PC will ensure both PC and AC
  • AC first, then if no inconsistency, PC (likely PC2)
  • Go from weaker first to stronger – making later steps likely cheaper
  • But PC is often unused – except some situations make it much more likely to find a solution
  • PC is also called 3 consistency

Partial Path consistency

  • same as PC, but no universal constraints, defined over cycles
  • graph is triangulated
  • run closure only over the triangles – \(\mathcal{O}(n^3)\)
  • must be careful for articulation points in the graph
  • PPC is much easier to implement
  • PPC is strictly weaker than PC as a property
  • PC completes graph, PPC will triangulate
  • PC filters more than PPC, edge-by-edge
  • PPC enforces PC on common edges
  • which means that you can find if something is not path consistent
  • From Bliek and Sam-Haroud 1999
  • Can PC detect unsatisfiability when PPC does not? It seems not – PPC and PC are roughly equivalent, but there are some cases – and the known example was hand built

Global Consistency properties

  • minimality/decomposability
  • A minimal network – a network where all tuples in a given constraint appear in at least one solution
  • computing a minimal network is NP complete
  • Solving a minimal network is itself NP complete

Lecture 26: Consistency properties

  • Local Consistencies
    • AC
    • GAC
    • Path Consistency – modifies binary constraints, requires a completed graph
    • Partial Path Consistency – triangulates graph, modifies binary relations
  • Global consistency properties
    • Consistency in general – it has a solution
    • minimality – constraints are as tight as possible: on binary constraints, eliminating a tuple eliminates a solution; every tuple is a member of at least one solution; an idealized form of path consistency; guarantees consistency; important in interactive product configuration – NP-complete in general (Gottlob, 2011)
      • determining whether or not a CSP is minimal is NP complete
      • Finding a solution of a minimal CSP is still NP complete
      • occasionally called globally consistent
    • decomposability
      • a CSP is decomposable if any consistent combination of assignments to any number of variables can be extended to a complete solution
      • guarantees back-track free search
      • strongly \(n\)-consistent
  • there are some islands of tractability
    • tree structure – AC ensures tractability
    • convex constraints – path consistency ensures minimality and decomposability

Lecture 27: Consistency properties, continued

  • minimality doesn't guarantee BT-free search
  • decomposability ensures no back tracking
  • PC approximates minimality
  • when constraints are convex, PPC on triangulated graph guarantees minimality and decomposability
  • when \(\circ\) distributes over \(\cap\), then PC1 guarantees that a CSP is minimal and decomposable, and that the outer loop can be removed
  • but this doesn’t always hold
  • constraints of bounded difference and temporal reasoning
    • vars,
    • constraints of the form \(a \leq Y - X \leq b\), i.e., \(Y - X \in [a, b] = I\)
    • composition and intersection from slides

Lecture 28: Consistency properties, continued

  • Remember, once decomposability is enforced, search is guaranteed backtrack free
  • When constraints are convex, use of PPC ensures minimality and decomposability, but triangles must be done in PEO
  • When composition distributes over intersection
    • PC1 generalizes Floyd-Warshall (all-pairs shortest path)
    • PC-1 generalizes Warshall (transitive closure)
  • Don’t confuse properties with algorithms
  • Local consistency
    • remove inconsistent values (node, arc)
    • remove tuples (path consistency)
    • get closer to solution
    • reduce size of problem
    • are cheap
  • Aim for global consistency, but it’s hard
  • Sometimes (given certain types of constraints or graphs), local consistency guarantees global consistency
    • distributivity in PC, row-convex constraints, special networks

Lecture 28: Local Search

  • Systematic/constructive search is all we’ve talked about
  • Start at a complete assignment that may not be correct, and try to repair it
  • General method:
    • for each variable, choose a value
    • check if solution
      • You’re done. Lucky.
      • pick a value that broke a constraint and change it (using min-conflict)
  • Local search is in worst case, \(\mathcal{O}(d^n)\)
  • States laid on a surface
  • quality/cost is height
  • the optimum state maximizes solution quality while minimizing cost
  • move around from state to state to find a local optimum
  • Gradient descent – will only ever find a local optimum
  • State is a complete assignment of values to variables
  • how do you move – CSP uses min-conflict
  • eval function rates the quality of a state, generally in number of broken constraints
  • Try
    • Start w/ a full but arbitrary assignment of values
    • reduce inconsistencies by local repairs
    • repeat until
      • A solution is found (a global minimum)
      • The current solution cannot be repaired (local minimum)
      • You’ve gone through max tries (running out of patience)
  • Local repair for decision problems, local optimization for optimization problems
  • Greedy types are hill climbing, local beam
  • Local beam
    • start in parallel with a number of different points
    • choose among those another number of best moves
    • remove those without promise
    • and repeat as necessary
  • Stochastic methods
    • break out, helps to avoid getting stuck in a local minimum
    • when stuck in a local minimum, walk randomly, pick a random variable, change to a random value
    • tabu search
      • keep a list of tabu positions (those that aren’t good), but only a certain number
      • when a new state is visited, added to tabu, and farthest back forgotten
      • if no better, go to the next state not tabu
      • repeat until a global minimum is found
    • sometimes you may randomly ignore a heuristic
  • Local search problems
    • get stuck in a local optimum
    • can have issues with plateaus (flat local maxima) – be ready to move to a state with the same quality
    • ridges can go from side to side with minimal progress
  • when stuck, do random walk
  • Simulated annealing
    • When stuck in a local minimum, allow steps towards less good neighbors to escape the local optimum
    • be more random at the start, be less random as you go
    • Start at a random state, start countdown and loop until over
      • pick a neighbor at random
      • \(d\) = quality of neighbor - quality of current state
      • if \(d > 0\)
        • then move to neighbor and restart
        • otherwise, move to a neighbor with a transition probability \(p<1\)
    • Transition probability proportional to \(e^{\frac{d}{t}}\)
      • \(d\) is negative
      • \(t\) is countdown
      • As time passes, you’re less and less likely to move towards bad neighbors
  • But all forms of local search are:
    • non-exhaustive
    • non-systematic
    • liable to get stuck in a local optimum
    • non-deterministic – outcome depends on where you start
    • heavy-tailed – longer you go, less likely you are to find a solution
  • Is often somewhat probabilistic
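
A sketch of the general method above – min-conflicts with an occasional random-walk step (hypothetical Python, toy all-different problem):

```python
# Sketch of min-conflicts local search with a random walk.
import random

def conflicts(assign, var, value, constraints):
    return sum(1 for (x, y), check in constraints.items()
               if var in (x, y) and not check(
                   value if x == var else assign[x],
                   value if y == var else assign[y]))

def min_conflicts(domains, constraints, max_tries=1000, walk_p=0.1):
    assign = {v: random.choice(d) for v, d in domains.items()}   # full assignment
    for _ in range(max_tries):
        broken = [v for (x, y), c in constraints.items()
                  if not c(assign[x], assign[y]) for v in (x, y)]
        if not broken:
            return assign                                        # global minimum
        var = random.choice(broken)                              # repair a culprit
        if random.random() < walk_p:                             # random-walk step
            assign[var] = random.choice(domains[var])
        else:                                                    # min-conflict value
            assign[var] = min(domains[var],
                              key=lambda val: conflicts(assign, var, val, constraints))
    return None                                                  # out of patience

doms = {v: [1, 2, 3] for v in "ABC"}
cons = {("A", "B"): lambda a, b: a != b, ("B", "C"): lambda a, b: a != b,
        ("A", "C"): lambda a, b: a != b}
print(min_conflicts(doms, cons))
```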

Lecture 29: Local Search, Continued

  • starts from a complete assignment, and moves to a new assignment that’s better
  • Very greedy approach, but can be parameterized very easily
  • Tends to be very greedy, but only allows improvements
  • others can be better, and allow for random walks/steps – stochastic algorithms
  • stochastic algorithms are not deterministic!
  • Tabu search moves to a state that may be worse, provided it’s not in a list of previous states
  • break-out modifies weight of objective function
  • simulated annealing involves countdowns and loops, controlled by the quality of the neighbor and the time; when sufficiently slowly cooled, it is guaranteed to find an optimum
  • Have issues
    • non-systematic and non-exhaustive
    • have issues getting stuck in local optima
    • non-deterministic!!
    • generally the probability of improving a solution gets smaller, but never actually dies out – heavy-tailed
  • So, when you hit a point where it seems improvability drops
    • keep the most promising solution
    • and start someplace else
    • Random restart
  • Luby restart profile
    • uses a crazy geometric law to decide how/when/where to restart

Genetic algos for local search

  • Combine two complete assignments to generate offspring
  • Start from an initial position
  • encode assignments in a compact manner
  • combine partial solutions to generate new solutions
  • requires initial generation of many random assignments
  • Choose solutions to combine based on quality
  • fitness is used to describe quality, assigns probability for solution
  • Select pairs depending on fitness
  • crossover point is chosen randomly, offspring are generated
  • mutation will randomly change a state
  • and this gets put back in the original population

ERA – Environment, Rules, Agents

  • Environ is \(n \times a\) board
  • each var is an agent
  • each position is a value of a domain
  • agent moves in a row on the board
  • each position records the number of violations caused by the fact that agents are occupying other positions
  • agents try to occupy positions where no constraints are broken
  • agents move according to reactive/reaction rules
  • agents may themselves have preferences
  • move rules can be as simple or complex as possible
  • rules:
    least-move
    choose position with min violation
    better-move
    choose position with smaller violation
    random-move
    randomly choose a position
    others
    combinations of those above
  • No one talks to each other, but share a common context
  • everyone kicks eachother out until every one is happy
  • very effective in solving very tight, but solvable instances
  • unstable in over-constrained cases
    • agents keep kcicking each other out (livelock)
    • livelocks can be used to identify bottlenecks
  • very stable on solvable instances
  • oscillates wildly on unsolveable instances
  • Agent motion can be categorized as variable, stable, constant

Lecture 30: Local Search, continued

ERA, Continued

  • Rules help to find a solution somewhat quickly
  • Has less of an issue with getting stuck in various local optima
  • May be very good for resource-allocation problems
  • more dynamic, as there’s no global evaluation function

Nothing works, what to do

  • Random restart
    • when no progress made, restart from a new srandom selected state
    • save the best results so far
  • repeat random restart
    • for a fixed number of iterations
    • until best results have not ben improved on for a certain number of iterations – e.g., a geometric law

Evaluation of Local Search

  • Test on a given problem instance
    • an ensemble of problem instances
  • Experiment
    • running algo thosands of times
    • measure some metric as a random variable – i.e., the time needed to find a solution
  • Then provide
    • probability distribution function approximated by the number of runs that took a given time to find a solution
    • the cumulative distribution function approximated by the number of runs that found a solution within a given time
  • Then compare performance with statistical tests
    • t-tests assume normal distribution of the metrics
    • nonparametric tests don’t
    • Consult with a trained statistician

Lecture 30: Structure-based methods

  • CSPs represented by networks
    • for bCSPs, graph
      • constraint graph
      • microstructure
      • co-microstructure
    • for general CSPs
      • hypergraph
      • dual – allowed tuples connected to each other when shared variables match
      • primal graph – vertices in graph are original variables, are variables in a constraint are a clique

Lecture 31: Structure-based methods, continued

Graph Vertices Edge
Binary Var Constraints
Non-binary Var constraints
Dual Constraints Equality constraints on shared variables
primal variables cliques over scope of constraints
\(\mu\) structure VVPs supports
co \(\mu\) structure VVPs conflicts
  • Incidence graph – Variables are one type of node, constraints another, edges connect variables to the constraints, is bipartite

Tree-structured CSPs

  • Tree-structured CSPs can be solved in polynomial time
    • apply \(revise(V_i, V_j)\) for all nodes from leaves to root
    • instantiate variables from root to leaves
  • Generally
    • directional consistency from leaves to root
    • solutions built from root to leaves
  • This is often the case for configuration problems
  • Width of a tree is always one for a tree
  • If a CSP graph has \(w = 1\), it is guaranteed to be solved with 2 consistency
  • Freuder has therem that if a csp is \(w\) in width, then $w+1$-consistency is guaranteed to solve it.
    • May have issues w/ changing width of graph
    • There are ways to prevent this, but is not necessarily true

Cycle-cutset

  • identify a cycle cutset \(S\) in the CSP
    • cutset – nodes when removed yield a tree
  • Decompose CSP into two partitions
    • \(S\)
    • \(T\) the nodes forming a tree
  • Iterate until a solution is found
    • Solve the nodes in \(S\)
    • Try to extend the solution to \(T\)
  • \(|S| < |V|\)
  • if \(|S| = c\), then $\mathcal{O}(dc(n-c)d2) = \(\mathcal{O}(nd^{c+2})\)
  • But finding smallest cutset is \(NP\) hard

Lecture 32: Structure-based methods, continued

Dynamic Dangle Identification

  • graham reduction used to detect if schema is acyclic
    • join of acyclic schema is polynomial
    • remove vertices w/ degree 1
  • After each var assignment/lookahead
    • check connectivity
    • identify dangling trees using graph reduction
  • for each dangling root
    • do DAC from leaf to root
    • wipeout means unsolvability
  • Restrict search to nodes outside of the identified dangling trees

Directional Path Consistency

  • Given a binary CSP and a fixed variable ordering
  • \(\mathcal{O}(nw*^2d^3)\)
  • if \(w* \leq 2\)
    • DPC determines consistency
    • guarantees BT-free search given the ordering
    • But this only works iff induced width is less than or equal to 2, you get strong consistency, minimality and tractability

Biconnected components

  • decomposition into separable graphs
  • consider a CSP whose graph has artiulation nodes
  • assume that the largest biconnected component is \(b\)
  • build a tree whose nodes are biconnected components considering that the articulation node belongs to the parent (or parent and child)
  • build an order using a pre-order traversal of the tree
  • complexity is then bound by the size of the largest biconnected component

Lecture 33: Structure-based methods, continued

Biconnected components

  • must have articulation nodes – nodes common to two or more problems
  • is \(d^b\) where \(d\) is the domain size and \(b\) is the size of the largest component
  • Use a pre-order of the variable tree, which means that BT effort is the size of the largest component
  • Originally from 1982, in Journal of the ACM

Tree Clustering

  • bi-connected components, but with more than one shared variable for each cluster
  • triangulates the graph
  • finds maximal cliques
  • create a tree structure from cliques
  • Then repeat:
    • solve a clique at each node in the tree
    • apply DAC from leaves to root
    • generates solutions in a BT-free manner
  • MaxCliques from Algorithmic Graph Theory and Perfect Graphs, Golumbic
  • Triangulation is \(\mathcal{O}(n^2)\)
  • Max clique is \(\mathcal{O}(n + e)\) in triangulated graph
  • \(\mathcal{O}(k^r)\) for solving clusters, \(k\) is domain size, \(r\) is size of largest clique (\(t = k^r\))
  • generating solution \(\mathcal{O}(nt\log t)\), \(t\) is tuples in each cluster (sorted domain of a ’super’ variable (sorting can help to speed things up)), in best case, we have \(n\) cliques
  • complexity bounded by size of largest clique:
    • \(\mathcal{O}(n^2) + \mathcal{O}(k^r) + \mathcal{O}(nt\log t) = \mathcal{O}(nk^r\log k^r) = \mathcal{O}(nrk^r)\)

Lecture 34: Structure-based methods, continued

Bucket Elimination

  • Built on non-binary CSPs
  • Dechter 1997
  • Choose a variable ordering – PEO recommended
  • create buckets for each variable, placing each constraint in the bucket of the deepest variable in its scope
  • create tree node for each bucket
  • From bottom to top
  • join relations in bucket (including on the domain)
  • project join on deepest bucket with common variable names
  • top-down creates solution backtrack free
  • Can handle cycles!
  • And is a pretty quick form of DAC, but for non-binary CSPs

Lecture 35: Structure-base methods, continued

Bucket Elimination/Tree Clustering

  • linear in \(n\), exponential in induced width + 1
  • induced width is the number of vars in the largest component
  • gives fixed parameter complexity

Bucket Elimination

  • very much like Gaussian elimination
  • Useful for Bayesian networks – graphical models of all kinds

Tree Decomposition

  • Generalizes all of structure methods
  • comes from database theory
  • organizes any constraint problem into a tree
  • \((T, \Chi, \Psi)\) of \(P = (X, D, C)\)
    • \(T\) – \((V, E)\) the tree
    • \(\Chi(v) \subseteq X\) – A mapping of all variables in a node
    • \(\Psi(v) \subseteq C\) – A mapping of constraints and variables
    • Each constraint must appear on at least one node in the tree, and all its variables are in that node
    • nodes where any variable appears induce a single connected subtree (connectedness property), the path between any two nodes with the same variable will all have the variable
  • join trees and acyclic DBs from database theory
  • organizes the constraint network into a tree structure, a join-tree
  • labels tree nodes with variables and constraints – labels must obey some rules
  • Solving
    • bottom-up – joins relations and projects on parent
    • top-down generates the solution
  • Complexity bound by width
    • tree width is the max number of variables in a node - 1
    • hyperwidth is the max number of constranits in a node
  • Group of variables are the variables in common to two nodes
  • However, selecting the tree decomposition is NP-hard, use approximation by min-fill
  • Quite a few different tree-decomposition techniques – Gottlob’s HYPERTREE seems to be one of the best
  • parameters
    • tree width max number of variables in any node minus 1
    • hypewwidth – number of constraints in any node
  • theory – characterizes tractable csps
    • complexity is bound by tree/hyper-width
    • fixed parameter complexity – independent of number of variables
  • practice – uses structure to solve csps
  • requires balancing time and space

Lecture 26: More on Constraint Consistency

Global Consistency Properties

  • minimality – relations join completely, $n$-consistency
  • decomposability – strong $n$-consistency
  • solvability

Local Properties

  • AC/GAC – Domains
  • Path Consistency – Constraints, add new constraints (constraint synthesis)
  • Partial Path Consistency – Constraints, add new constraints
  • Directional Path Consistency – Constraints, add new constraints
  • Types of constraint algorithms
    • Binary
    • Non-Binary
    • Global Constraints
  • For continuous domains, work on box-consistency
  • On interval, box consistency can occasionally also be used
  • weighted CSPs can also be dealt with, with a variant of AC

Info

  • arc consistency – $(1,1)$-consistency – for every variable, the solution can be extended to another consistent assignment
  • path-consistency – $(2,1)$-consistency – for every two variables, the solution can be extended to another consistent assignment
  • \((i,j)\) – for every consistent assignment of length \(i\), the solution can be extended by \(j\) consistent assignments (\(i + j = n\))
  • strong $k$-consistency is $k$-consistency down to \(1\) consistency

Lecture 27: Still more on Constraint Consistency

  • domain minimality – constraints over arity 1, Gottlob 2011
  • however, even if domain minimality and constraint minimality hold, solving is still NP complete
  • global inverse consistency – where given a value at 1 var, you can extend it to \(n - 1\) assignments

$(i, j)$-consistency

  • \(j = 1\)
    • 2 – AC
    • 3 – PC
    • k – $k$-consistency
  • \((1, n - 1)\) – domain minimality, global inverse consistency
  • \((1, k)\) – \(d^{k + 1}\), guaranteed that a solution can be extended over \(k\) variables
  • NIC – Neighborhood Inverse Consistency, for every variable, look at its neighbors, and make consistent with those

Lecture 28: More Constraint Consistency

Singleton Arc Consistency

  • family of algos based on singleton test
  • singleton test like search – kinda
  • The CSP is AC for every VVP
  • Algo
    • Until No change
      • for each var
        • for each value
          • assign the val to the var
          • if CSP is AC, keep
          • else Remove
  • Sometimes, when you get lots of backtracks, trigger a high level consistency

Inverse consistency

  • (1, 2) consistency – Path Inverse consistency
  • Neighborhood Inverce – see previous
  • GIC – Global Inverse Consistency – ensure domain minimality

Special Constraints – AC

  • functional – A constraint C is functional with regard to D iff for all \(v \in D\) there exists at most one \(w \in D\) such that \(C(v,w)\)
  • monotonic constraints – a constraint C is monotonic with regard to \(D\) iff there exists a total ordering on \(D\) such that, for all \(v\) and \(w \in D\), \(C(v,w)\) hold, implies \(C(v^{\prime}, w^{\prime})\) holds for all values \(v^{\prime}, w^{\prime} \in D\) such that \(v^{\prime} \leq v \land w^{\prime} \leq w\)

Non-Binary Special constraints

  • almost all properties and algos discussed were restricted to binaryies
  • consistency for non-binary is the topic of current research
    • in particular domain filtering that doesn’t change topology or modify constraint definitions
    • relational \(m\) consistency – adding constraints/changing constraint definitions
  • Domain Filtering
    • GAC
    • Singleton GAC
    • maxRPWC, rPIC, RPWC, etc
  • Relational consistency
    • Strong consistencies
    • Simple Tabular Reduction

Lecture 29: Extensions to BT Search

  • bounded number of backtrack search
  • bounded backtrack depth search
  • limited discrepency search
  • credit-based backtrack search
  • randomized backtrack search

Credit Based Search

  • start with a given credit, usually \(n^3\)
  • assign 1/2 credit to current assignment, 1/2 to the remaining
  • keep going in a depth-first manner, until credit is used up, chronologically backtrack from there
  • Used in conjunction with backtrack bounded search

Limited Discrepancy Search

  • may be blind at shallow levels of search
  • disobeys a given number of times
  • Matthew Ginsburg

Randomized BT

  • in systematic search, ordering of the variables/ values determines which of the solution is explored
  • randomization allows us to explore wider portion of search tree
  • thrashing causes stagnation of BT search, interrupt it, and restart
  • Pick random variables at instantiation
  • And Random values when you’re picking values
  • Use the restart schedule when thrashing
    • Fixed cutoff and universal – Luby
    • Randomization and Rapid restarts (Gomes), fixed optimal cutoff, priori knowledge of cost distribution required
    • randomization annd geometric restarts Walsh – \(C_{i + 1} = r C_i\), \(C_0 = n\)
    • Randomization and dynamic geometric restarts (guddeti) – $Ci + 1 = \left\{\begin{matrix} rCi & \text{when solution improved} \\ Ci & \text{otherwise}\end{matrix}\right.$
    • Bayesion approach (Kautz)

Licensed under Creative Commons Attribution-ShareAlike 4.0 International License