The Big 3: AB, TT, and ID
PSU CS410/510GAMES
Lecture 4
April 24, 2002
- Minimax continued
- Search space is too large for real games. Limit by
- Depth:
- Instead of a recursive call, use heuristics to
estimate the quality of the state (not the
best move).
- Prove an upper or lower bound on the value of
the game below the state.
- Breadth:
- Use heuristics to order moves from best to worst
for the player on move. Use more heuristics to
avoid recursive calls for ``bad'' moves.
- Prove that some move is strictly worse
than a move already considered.
- Opponent Modeling (expectimax): If your opponent
would not consider this move and its consequences,
you need not either.
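The depth-limited idea above can be sketched as plain minimax with a heuristic fallback at the cutoff. The tree representation here (a leaf is an exact terminal value; an interior node carries a heuristic estimate plus its children) is an illustrative assumption, not from the lecture:

```python
# Depth-limited minimax sketch. A node is either an int (exact
# terminal value) or a (heuristic_estimate, [children]) pair;
# this representation is an assumption for illustration.

def minimax(node, depth, maximizing):
    if isinstance(node, int):       # terminal position: exact value
        return node
    estimate, children = node
    if depth == 0:                  # depth cutoff: trust the heuristic
        return estimate
    if maximizing:
        return max(minimax(c, depth - 1, False) for c in children)
    return min(minimax(c, depth - 1, True) for c in children)

# A two-ply tree: max to move at the root, then min.
tree = (0, [(0, [3, 5]), (0, [2, 9])])
print(minimax(tree, 2, True))   # 3: max(min(3, 5), min(2, 9))
```

At depth 0 the root's own (here deliberately poor) estimate is returned instead, which is exactly the risk the notes flag for heuristic vs. backed-up values.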
- Evaluation Function
- Heuristic for estimating minimax value of a state.
- Typically just a number, but may also estimate
confidence of estimate, or be some fancy
(total-orderable) datum.
- ``Real'' backed-up values vs. heuristic values: risk aversion.
- No perfect formula. Sanity checks:
- Does a maximizer winning state get a maximum
positive score? Does a minimizer winning state
get a minimum negative score? A draw 0?
- Does the value assigned in real games have some
relationship to human estimates?
- Are there obvious game features missing?
- In games with material as a state
feature, is it dominant in the game? In
the evaluator?
- Cost of evaluation vs. cost of extra search.
- Examples:
- Aces Up
- Tic-Tac-Toe
- Chess
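For the chess example, a minimal material-counting evaluator might look like the sketch below. The piece values and the board encoding (a dict of square to piece letter, uppercase for the maximizer) are assumptions for illustration:

```python
# Material-only evaluation sketch for a chess-like game.
# Piece values and board encoding are illustrative assumptions:
# uppercase letters belong to the maximizer, lowercase to the minimizer.

PIECE_VALUE = {'p': 1, 'n': 3, 'b': 3, 'r': 5, 'q': 9, 'k': 0}

def evaluate(board):
    """Positive scores favor the maximizer (uppercase pieces)."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUE[piece.lower()]
        score += value if piece.isupper() else -value
    return score

board = {'e1': 'K', 'd1': 'Q', 'e8': 'k', 'a8': 'r'}
print(evaluate(board))   # 9 - 5 = 4
```

This passes the sanity checks above only partially: material is dominant by construction, but win/draw states and non-material features are not handled, which is the point of the checklist.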
- AB: Alpha-Beta Pruning
- Idea: Your opponent knows as much as you, and won't
give you a cheap win.
- Implementation: keep track of the best provably
achievable score so far for self and opponent (from
earlier siblings). If you find you can achieve better
than the negation of the opponent's provable best
score, the opponent will never let play reach this
state: stop searching.
- Corollary: For maximum pruning, order moves by
best-first to get tight bounds early. Leads to
narrow "AB window".
- Pending: "Zero-window" search.
- Effectiveness: effective branching factor about
b^(1/2) in the best case (perfect move ordering),
about b^(3/4) with random ordering; never worse
than plain minimax.
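A minimal sketch of the pruning rule over explicit game trees (leaves are exact values, interior nodes are lists of children; this representation is an assumption):

```python
# Alpha-beta sketch. A node is an int (leaf value) or a list of
# child nodes; this representation is an illustrative assumption.

def alphabeta(node, alpha, beta, maximizing):
    if isinstance(node, int):
        return node
    if maximizing:
        best = alpha
        for child in node:
            best = max(best, alphabeta(child, best, beta, False))
            if best >= beta:    # opponent will never allow this line:
                break           # beta cutoff, skip remaining siblings
        return best
    best = beta
    for child in node:
        best = min(best, alphabeta(child, alpha, best, True))
        if best <= alpha:
            break               # alpha cutoff
    return best

# Textbook example: after seeing 2 in the second subtree, the 9 is
# never examined (min already provably <= 2 < 3).
tree = [[3, 5], [2, 9]]
print(alphabeta(tree, float('-inf'), float('inf'), True))   # 3
```

Note how examining the best subtree first gives the tight bound (3) that triggers the cutoff; with the subtrees swapped, nothing would be pruned here.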
- TT: Transposition Tables
- Ideas: Moves transpose. Search is expensive.
- Implementation: Remember the value of a state to avoid
re-computing it when encountered again. Create a
reasonable hash function (fast yet good) and do random
replacement. Allocate as much storage for the ttable
as you can.
- Danger: states are distinguished by
- Board
- Side to move
- Search depth
- AB window
- Danger: odd termination rules will mess you up.
(Graph-History Interaction [Campbell 1985])
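A transposition table is essentially careful memoization. Below is a sketch for a one-pile subtraction game (take 1-3 stones, the player who takes the last stone wins); the game and the key shape (position, side to move, remaining depth) are assumptions chosen so that transpositions actually occur:

```python
# Transposition-table sketch for a one-pile subtraction game
# (take 1, 2, or 3 stones; taking the last stone wins). The game
# and the key shape are illustrative assumptions.

def search(stones, maximizing, depth, table):
    # States are distinguished by board, side to move, and depth.
    key = (stones, maximizing, depth)
    if key in table:
        return table[key]           # transposition hit: reuse value
    if stones == 0:                  # previous player took the last stone
        value = -1 if maximizing else 1
    elif depth == 0:
        value = 0                    # heuristic stand-in at the cutoff
    else:
        children = [search(stones - take, not maximizing, depth - 1, table)
                    for take in (1, 2, 3) if take <= stones]
        value = max(children) if maximizing else min(children)
    table[key] = value
    return value

table = {}
print(search(10, True, 10, table))   # 1: 10 is not a multiple of 4
```

A real table would replace the dict with a fixed-size hash table (random replacement, as above in the notes) rather than growing without bound.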
- ID: Iterative Deepening (again)
- Ideas: Hard to predict search time. Need to order
moves well.
- Implementation: Use ID to meet a time limit. Note
convergence of root value? Use the ttable to order
moves effectively (via values from the previous
iteration).
- Problem: Want to leverage transposition table to
decrease search, but how?
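The ID driver itself is a few lines: deepen until the budget runs out and keep the last completed result. The depth-limited search below is a stand-in stub (its "converging" value is fabricated for illustration), not a real searcher:

```python
# Iterative-deepening driver sketch: search depth 1, 2, 3, ...
# until a time budget expires, keeping the last completed result.
import time

def depth_limited_value(depth):
    # Stand-in for a real depth-limited search; here it just
    # converges toward a value as depth grows (an assumption).
    return 10 - 10 // (depth + 1)

def iterative_deepening(budget_seconds, max_depth=64):
    deadline = time.monotonic() + budget_seconds
    best = None
    for depth in range(1, max_depth + 1):
        best = depth_limited_value(depth)        # complete this iteration
        if time.monotonic() >= deadline:         # out of time:
            break                                # keep last full result
    return best

print(iterative_deepening(5.0))
```

A production driver would also stop early on a proven win/loss and feed each iteration's root values back into the ttable for move ordering.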
- Idea: Horizon effect is a nightmare. Search deeper where it matters.
- Implementation: AB already prunes out-of-bounds
leaves. Use secondary heuristic to estimate
accuracy of primary heuristic ("quietness"),
extend search where necessary.
- Problem: Hard to build/tune.
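A sketch of the extension idea: at the depth limit, trust the static heuristic only when a secondary ``quietness'' check says the position is stable, and keep searching otherwise. The node shape (estimate, quiet flag, children) is an assumption:

```python
# Search-extension sketch: extend past the nominal depth limit at
# "noisy" leaves. A node is an int (terminal value) or a
# (heuristic_estimate, is_quiet, [children]) triple -- an assumption.

def search(node, depth, maximizing):
    if isinstance(node, int):
        return node
    estimate, quiet, children = node
    if depth <= 0 and quiet:        # only trust the heuristic when quiet
        return estimate
    # depth may go negative here: that is the extension
    combine = max if maximizing else min
    return combine(search(c, depth - 1, not maximizing) for c in children)

# Max to move; the child "looks like" +5 at the depth limit, but it is
# noisy, and extending one ply reveals the recapture worth -2.
noisy = (0, True, [(5, False, [-2])])
print(search(noisy, 1, True))   # -2
```

With the flag flipped to quiet, the same search would return the misleading +5; that gap is exactly the horizon effect.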
- Putting it all together: zero-window search
- Idea: Could prove a known value quickly.
- Implementation: Start the AB window closed
(zero-width) around some guessed value. The search
then fails, proving the true value lies below the
guess ("fail-low") or above it ("fail-high");
track which, and re-search.
- Advantage: Goes very fast if you have good estimates.
- Disadvantage: Goes very slowly if you have bad estimates.
- Modern implementation: MTD(f) [Plaat 1996]
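A compact MTD(f)-style sketch: repeated zero-window probes with a fail-soft alpha-beta narrow the bounds until they meet. The explicit-tree representation is an assumption, and a real implementation would add the transposition table that makes the repeated re-searches cheap:

```python
# MTD(f)-style sketch: converge on the minimax value via zero-window
# alpha-beta probes. Trees are ints (leaves) or lists of children --
# an illustrative assumption. Alpha-beta here is fail-soft (it may
# return values outside the window), which MTD(f) requires.

def alphabeta(node, alpha, beta, maximizing):
    if isinstance(node, int):
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if value >= beta:
                break                       # fail-high cutoff
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if value <= alpha:
            break                           # fail-low cutoff
    return value

def mtdf(root, guess):
    lower, upper = float('-inf'), float('inf')
    value = guess
    while lower < upper:
        beta = value + 1 if value == lower else value
        # zero-window probe on (beta - 1, beta)
        value = alphabeta(root, beta - 1, beta, True)
        if value < beta:
            upper = value                   # fail-low: true value <= value
        else:
            lower = value                   # fail-high: true value >= value
    return value

print(mtdf([[3, 5], [2, 9]], 0))   # 3
```

With a good first guess (say, the previous ID iteration's root value), the loop typically needs only a few probes; with a bad guess it degenerates into many, matching the advantage/disadvantage noted above.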