The Big 3: AB, TT, and ID
PSU CS410/510GAMES
Lecture 4
April 24, 2002
- Minimax continued
- Search space is too large for real games. Limit by
- Depth:
- Instead of a recursive call, use heuristics to
estimate the quality of the state (not the
best move).
- Prove an upper or lower bound on the value of
the game below the state.
- Breadth:
- Use heuristics to order moves from best to worst
for the player on move. Use more heuristics to
avoid recursive calls for ``bad'' moves.
- Prove that some move is strictly worse
than a move already considered.
- Opponent Modeling (expectimax): If your opponent
would not consider this move and its consequences,
you need not either.
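The depth-limited idea above can be sketched as plain minimax with a heuristic fallback at the cutoff. The tree representation here (a leaf is an exact terminal value; an interior node carries a heuristic estimate plus its children) is an illustrative assumption, not from the lecture:

```python
# Depth-limited minimax sketch. A node is either an int (exact
# terminal value) or a (heuristic_estimate, [children]) pair;
# this representation is an assumption for illustration.

def minimax(node, depth, maximizing):
    if isinstance(node, int):       # terminal position: exact value
        return node
    estimate, children = node
    if depth == 0:                  # depth cutoff: trust the heuristic
        return estimate
    if maximizing:
        return max(minimax(c, depth - 1, False) for c in children)
    return min(minimax(c, depth - 1, True) for c in children)

# A two-ply tree: max to move at the root, then min.
tree = (0, [(0, [3, 5]), (0, [2, 9])])
print(minimax(tree, 2, True))   # 3: max(min(3, 5), min(2, 9))
```

At depth 0 the root's own (here deliberately poor) estimate is returned instead, which is exactly the risk the notes flag for heuristic vs. backed-up values.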
- Evaluation Function
- Heuristic for estimating minimax value of a state.
- Typically just a number, but may also estimate
confidence of estimate, or be some fancy
(total-orderable) datum.
- ``Real'' backed-up values vs. heuristic values: risk aversion.
- No perfect formula. Sanity checks:
- Does a maximizer winning state get a maximum
positive score? Does a minimizer winning state
get a minimum negative score? A draw 0?
- Does the value assigned in real games have some
relationship to human estimates?
- Are there obvious game features missing?
- In games with material as a state
feature, is it dominant in the game? In
the evaluator?
- Cost of evaluation vs. cost of extra search.
- Examples:
- Aces Up
- Tic-Tac-Toe
- Chess
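For the chess example, a minimal material-counting evaluator might look like the sketch below. The piece values and the board encoding (a dict of square to piece letter, uppercase for the maximizer) are assumptions for illustration:

```python
# Material-only evaluation sketch for a chess-like game.
# Piece values and board encoding are illustrative assumptions:
# uppercase letters belong to the maximizer, lowercase to the minimizer.

PIECE_VALUE = {'p': 1, 'n': 3, 'b': 3, 'r': 5, 'q': 9, 'k': 0}

def evaluate(board):
    """Positive scores favor the maximizer (uppercase pieces)."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUE[piece.lower()]
        score += value if piece.isupper() else -value
    return score

board = {'e1': 'K', 'd1': 'Q', 'e8': 'k', 'a8': 'r'}
print(evaluate(board))   # 9 - 5 = 4
```

This passes the sanity checks above only partially: material is dominant by construction, but win/draw states and non-material features are not handled, which is the point of the checklist.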
- AB: Alpha-Beta Pruning
- Idea: Your opponent knows as much as you, and won't
give you a cheap win.
- Implementation: keep track of the best provably
achievable score so far for self and opponent (from
earlier siblings). If you find you can achieve better
than the negation of the opponent's provable best
score, the opponent will never let play reach this
state: stop searching.
- Corollary: For maximum pruning, order moves by
best-first to get tight bounds early. Leads to
narrow "AB window".
- Pending: "Zero-window" search.
- Effectiveness: effective branching factor about
b^(1/2) in the best case (perfect move ordering),
about b^(3/4) with random ordering; never worse
than plain minimax.
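A minimal sketch of the pruning rule over explicit game trees (leaves are exact values, interior nodes are lists of children; this representation is an assumption):

```python
# Alpha-beta sketch. A node is an int (leaf value) or a list of
# child nodes; this representation is an illustrative assumption.

def alphabeta(node, alpha, beta, maximizing):
    if isinstance(node, int):
        return node
    if maximizing:
        best = alpha
        for child in node:
            best = max(best, alphabeta(child, best, beta, False))
            if best >= beta:    # opponent will never allow this line:
                break           # beta cutoff, skip remaining siblings
        return best
    best = beta
    for child in node:
        best = min(best, alphabeta(child, alpha, best, True))
        if best <= alpha:
            break               # alpha cutoff
    return best

# Textbook example: after seeing 2 in the second subtree, the 9 is
# never examined (min already provably <= 2 < 3).
tree = [[3, 5], [2, 9]]
print(alphabeta(tree, float('-inf'), float('inf'), True))   # 3
```

Note how examining the best subtree first gives the tight bound (3) that triggers the cutoff; with the subtrees swapped, nothing would be pruned here.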
- TT: Transposition Tables
- Ideas: Moves transpose. Search is expensive.
- Implementation: Remember the value of a state to avoid
re-computing it when encountered again. Create a
reasonable hash function (fast yet good) and do random
replacement. Allocate as much storage for the ttable
as you can.
- Danger: states are distinguished by
- Board
- Side to move
- Search depth
- AB window
- Danger: odd termination rules will mess you up.
(Graph-History Interaction [Campbell 1985])
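A transposition table is essentially careful memoization. Below is a sketch for a one-pile subtraction game (take 1-3 stones, the player who takes the last stone wins); the game and the key shape (position, side to move, remaining depth) are assumptions chosen so that transpositions actually occur:

```python
# Transposition-table sketch for a one-pile subtraction game
# (take 1, 2, or 3 stones; taking the last stone wins). The game
# and the key shape are illustrative assumptions.

def search(stones, maximizing, depth, table):
    # States are distinguished by board, side to move, and depth.
    key = (stones, maximizing, depth)
    if key in table:
        return table[key]           # transposition hit: reuse value
    if stones == 0:                  # previous player took the last stone
        value = -1 if maximizing else 1
    elif depth == 0:
        value = 0                    # heuristic stand-in at the cutoff
    else:
        children = [search(stones - take, not maximizing, depth - 1, table)
                    for take in (1, 2, 3) if take <= stones]
        value = max(children) if maximizing else min(children)
    table[key] = value
    return value

table = {}
print(search(10, True, 10, table))   # 1: 10 is not a multiple of 4
```

A real table would replace the dict with a fixed-size hash table (random replacement, as above in the notes) rather than growing without bound.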
- ID: Iterative Deepening (again)
- Ideas: Hard to predict search time. Need to order
moves well.
- Implementation: Use ID to meet a time limit. Note
convergence of root value? Use the ttable to order
moves effectively (via values from the previous
iteration).
- Problem: Want to leverage transposition table to
decrease search, but how?
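The ID driver itself is a few lines: deepen until the budget runs out and keep the last completed result. The depth-limited search below is a stand-in stub (its "converging" value is fabricated for illustration), not a real searcher:

```python
# Iterative-deepening driver sketch: search depth 1, 2, 3, ...
# until a time budget expires, keeping the last completed result.
import time

def depth_limited_value(depth):
    # Stand-in for a real depth-limited search; here it just
    # converges toward a value as depth grows (an assumption).
    return 10 - 10 // (depth + 1)

def iterative_deepening(budget_seconds, max_depth=64):
    deadline = time.monotonic() + budget_seconds
    best = None
    for depth in range(1, max_depth + 1):
        best = depth_limited_value(depth)        # complete this iteration
        if time.monotonic() >= deadline:         # out of time:
            break                                # keep last full result
    return best

print(iterative_deepening(5.0))
```

A production driver would also stop early on a proven win/loss and feed each iteration's root values back into the ttable for move ordering.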
- Idea: Horizon effect is a nightmare. Search deeper where it matters.
- Implementation: AB already prunes out-of-bounds
leaves. Use secondary heuristic to estimate
accuracy of primary heuristic ("quietness"),
extend search where necessary.
- Problem: Hard to build/tune.
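A sketch of the extension idea: at the depth limit, trust the static heuristic only when a secondary ``quietness'' check says the position is stable, and keep searching otherwise. The node shape (estimate, quiet flag, children) is an assumption:

```python
# Search-extension sketch: extend past the nominal depth limit at
# "noisy" leaves. A node is an int (terminal value) or a
# (heuristic_estimate, is_quiet, [children]) triple -- an assumption.

def search(node, depth, maximizing):
    if isinstance(node, int):
        return node
    estimate, quiet, children = node
    if depth <= 0 and quiet:        # only trust the heuristic when quiet
        return estimate
    # depth may go negative here: that is the extension
    combine = max if maximizing else min
    return combine(search(c, depth - 1, not maximizing) for c in children)

# Max to move; the child "looks like" +5 at the depth limit, but it is
# noisy, and extending one ply reveals the recapture worth -2.
noisy = (0, True, [(5, False, [-2])])
print(search(noisy, 1, True))   # -2
```

With the flag flipped to quiet, the same search would return the misleading +5; that gap is exactly the horizon effect.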
- Putting it all together: zero-window search
- Idea: Could prove a known value quickly.
- Implementation: Start the AB window closed
(zero-width) around some guessed value. The search
then fails, proving the true value lies below the
guess ("fail-low") or above it ("fail-high");
track which, and re-search.
- Advantage: Goes very fast if you have good estimates.
- Disadvantage: Goes very slowly if you have bad estimates.
- Modern implementation: MTD(f) [Plaat 1996]
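A compact MTD(f)-style sketch: repeated zero-window probes with a fail-soft alpha-beta narrow the bounds until they meet. The explicit-tree representation is an assumption, and a real implementation would add the transposition table that makes the repeated re-searches cheap:

```python
# MTD(f)-style sketch: converge on the minimax value via zero-window
# alpha-beta probes. Trees are ints (leaves) or lists of children --
# an illustrative assumption. Alpha-beta here is fail-soft (it may
# return values outside the window), which MTD(f) requires.

def alphabeta(node, alpha, beta, maximizing):
    if isinstance(node, int):
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if value >= beta:
                break                       # fail-high cutoff
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if value <= alpha:
            break                           # fail-low cutoff
    return value

def mtdf(root, guess):
    lower, upper = float('-inf'), float('inf')
    value = guess
    while lower < upper:
        beta = value + 1 if value == lower else value
        # zero-window probe on (beta - 1, beta)
        value = alphabeta(root, beta - 1, beta, True)
        if value < beta:
            upper = value                   # fail-low: true value <= value
        else:
            lower = value                   # fail-high: true value >= value
    return value

print(mtdf([[3, 5], [2, 9]], 0))   # 3
```

With a good first guess (say, the previous ID iteration's root value), the loop typically needs only a few probes; with a bad guess it degenerates into many, matching the advantage/disadvantage noted above.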