
Machine Learning and Data Mining

10. Machine Learning in Games

Luc De Raedt

Thanks to Johannes Fuernkranz for his slides

Contents
• Game playing
• What can machine learning do?
• What is (still) hard?
• Various types of games
  - Board games
  - Card games
  - Real-time games
• Some historical developments

Why Games?
• Games are an ideal environment for testing AI / ML systems
  - Progress / performance can easily be measured
  - The environment can easily be controlled

Machine Learning for Game Playing
• A long history, almost as old as AI itself
• Arthur Samuel
  - Checkers player (late 1950s, early 1960s)
  - Introduced several interesting ideas and techniques
  - Today Chinook (without learning) plays at world-champion level

State of the art
• Solved
  - Tic-tac-toe, Connect Four, Go-Moku
  - Endgames: chess (5 pieces), checkers (8 pieces)
• World-champion level
  - Chess, checkers, backgammon, Scrabble, Othello
• Humans still much better
  - Go, Shogi, Bridge, Poker

ML in games
• Learning the evaluation function
  - e.g., for minimax search
  - Essentially reinforcement learning
• Discovering patterns
  - Discovering characteristic / winning patterns from databases
• Modelling the opponent
  - Given the optimal strategy,
    find a strategy that better exploits the opponent

MENACE (Michie, 1963)
• Learns Tic-Tac-Toe
  - 287 boxes (one for every board position)
  - 9 bead colours (one for every square)
• Algorithm (see the sketch below):
  - Choose the box corresponding to the current position
  - Draw a bead from the box
  - Make the corresponding move
• Learning:
  - Lost game -> the drawn beads stay out of their boxes (negative reinforcement)
  - Won game -> add an extra bead to every box from which a bead was drawn (positive reinforcement)

[Figure: X to move. The box matching the current position
(O X O / . X . / . . .) is chosen, a bead is selected from it, and the
corresponding move is made, giving e.g. O X O / . X . / . X .]
Arthur Samuel's Checkers Player
• Rote learning
  - Learning by heart: memorizing evaluated positions
• Minimax search with alpha-beta pruning (a sketch follows below)
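A generic fail-soft negamax sketch of minimax with alpha-beta pruning (not Samuel's original code). It assumes an evaluate(state) helper, such as a learned evaluation function, scoring a position from the viewpoint of the player to move, and a children(state) helper yielding successor positions:

def alphabeta(state, depth, alpha, beta, evaluate, children):
    successors = list(children(state))
    if depth == 0 or not successors:
        return evaluate(state)                 # heuristic value at the horizon
    best = float("-inf")
    for child in successors:
        # negamax: our score is the negation of the opponent's best score
        score = -alphabeta(child, depth - 1, -beta, -alpha, evaluate, children)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:                      # cutoff: the opponent avoids this line
            break
    return best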

Minimax Search / KnightCap

Temporal difference learning

Backgammon
• Elements of chance (dice)
• TD-Gammon (Tesauro)
• Plays at a very high level
• Has changed the strategies of human players
• Why does it work? (see the TD sketch below)
  - Deep search does not seem to be very useful (due to the random element)
  - Positions can be represented compactly by a neural net over a reasonable set of features
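A hedged sketch of the temporal-difference idea behind TD-Gammon, written for a linear evaluator; TD-Gammon itself trains a neural network with TD(lambda). features(s) (returning a NumPy vector) is an assumed helper, not Tesauro's code:

import numpy as np

def td0_update(w, s, s_next, reward, features, alpha=0.1, gamma=1.0):
    v = w @ features(s)                     # value estimate before the move
    v_next = w @ features(s_next)           # estimate after the move
    delta = reward + gamma * v_next - v     # temporal-difference error
    return w + alpha * delta * features(s)  # nudge V(s) toward the target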

KnightCap (Baxter et al. 2000)
• Learns chess
  - From 1650 Elo (beginner) to 2150 Elo (master level)
    in about 300 Internet games
• Improvements over TD-Gammon (see the sketch below):
  - Integration of TD learning with search (TDLeaf)
  - Training against real opponents instead of against itself
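A hedged sketch of the TDLeaf idea: the temporal-difference update is applied at the leaf of the principal variation found by search, so the evaluation is trained on the positions search actually scores. search(state) -> (value, pv_leaf) and features are assumed helpers, not the KnightCap API:

import numpy as np

def tdleaf_step(w, state, next_state, reward, search, features,
                alpha=0.01, gamma=1.0):
    _, leaf = search(state)            # PV leaf of the current position
    _, next_leaf = search(next_state)  # PV leaf one move later
    delta = reward + gamma * (w @ features(next_leaf)) - (w @ features(leaf))
    return w + alpha * delta * features(leaf)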

Discovering patterns
• Endgame databases
  - Enormous endgame databases exist for certain combinations of pieces
  - Optimal moves are known (computed by brute force)
  - It is known whether a position is won, lost, or drawn, and in how many moves
• Can the databases be compressed?
  - Are rules + exceptions more compact than the database?
• Can they be turned into simple rules?
• Can complex optimal strategies be turned into simple but effective ones?
• Which properties of the board should be taken into account?
  - Relational representations / feature engineering
  - E.g., Quinlan, Alan Shapiro, Fuernkranz, …

• KRK: the simplest endgame (king and rook vs. king)
  - 25,620 positions
  - Won in 0-16 moves
  - 2,796 different positions
  - 18 classes
• Learning classification rules
  - Using knowledge and relations
  - 1,457 rules, 1,003 exceptions
• Not much is gained

Relational / Logical representations
• krk(-1, d, 4, h, 5, g, 5)
• Use information such as (a sketch follows below):
  - samediagonal
  - samerow
  - samecolumn
  - attacks(…)
  - etc.
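A small sketch of such relational predicates, for positions given as (file, rank) pairs. The encoding and the reading of the krk/7 fact above (white king d4, white rook h5, black king g5) are assumptions for illustration:

FILES = "abcdefgh"

def samerow(p, q):
    return p[1] == q[1]

def samecolumn(p, q):
    return p[0] == q[0]

def samediagonal(p, q):
    return abs(FILES.index(p[0]) - FILES.index(q[0])) == abs(p[1] - q[1])

def between(p, a, b):
    # True if p lies strictly between a and b on a shared row or column
    if samerow(a, b) and samerow(p, a):
        lo, hi = sorted((FILES.index(a[0]), FILES.index(b[0])))
        return lo < FILES.index(p[0]) < hi
    if samecolumn(a, b) and samecolumn(p, a):
        lo, hi = sorted((a[1], b[1]))
        return lo < p[1] < hi
    return False

def attacks(rook, target, own_king):
    # the rook attacks along a clear row or column; in KRK only the
    # rook's own king can stand in the way
    if not (samerow(rook, target) or samecolumn(rook, target)):
        return False
    return not between(own_king, rook, target)

wk, wr, bk = ("d", 4), ("h", 5), ("g", 5)
print(samerow(wr, bk))       # True: rook and black king share rank 5
print(attacks(wr, bk, wk))   # True: nothing blocks the h5-g5 line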

Discovering strategies
• Endgames are solved but hard to understand
  - Hard even for grandmasters (e.g., KQKR)
  - Many books have been written on endgames
• Goal
  - Find easy-to-understand strategies
  - Perhaps not optimal, but easy to recall and follow

Difficult games for computers
• Go?
  - Too many possible moves
  - The necessary search would be too deep
  - Intractable (a big prize to be won)
• What about endgames?
  - Simplified Go endgames have been considered (e.g., by Jan Ramon)

Modelling the opponent
• A key problem in games such as poker, bridge, …
• For simple games the optimal strategy is known (Nash equilibrium)
  - Optimal in rock-paper-scissors: play uniformly at random
  - But that is not optimal against a player who always plays rock
• Modelling the opponent (see the sketch below)
  - Trying to predict the opponent's next move
  - Or which strategy the opponent is playing
• Key to success in some games
  - Cf. poker (Jonathan Schaeffer)
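A toy opponent model for rock-paper-scissors: estimate the opponent's move frequencies from the observed history and play the best response instead of the Nash strategy. Purely illustrative; real poker opponent models are far richer:

import random
from collections import Counter

BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

def best_response(opponent_history):
    if not opponent_history:                      # no data yet: play randomly
        return random.choice(list(BEATS))
    predicted, _ = Counter(opponent_history).most_common(1)[0]
    return BEATS[predicted]                       # counter the likeliest move

# Against the always-rock player from the slide, the model locks onto
# "paper", which the uniform random strategy never does:
print(best_response(["rock", "rock", "rock"]))    # paper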

Other types of games
• Adventure games, interactive games, current computer games
• Let's look at some examples


Digger
• Learning to survive


• A key problem: representing the states; the use of relations is necessary

Real-time games
• RoboCup
  - Components can be learned
  - Using RL, e.g. for the goalie
• How to tackle these?
  - Problems:
    - Many degrees of freedom
    - Varying number of objects
    - Continuous positions …

Learning to fly
• Work by Claude Sammut et al.
  - Behavioural cloning: trying to imitate the human pilot
    (see the sketch below)
  - Reinforcement learning
  - Layered learning / bootstrapping
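A minimal behavioural-cloning sketch: fit a classifier to logged (state, action) pairs so the learner imitates the demonstrator. Decision-tree learners were used in that line of work, but the features, data, and action names below are invented for illustration:

from sklearn.tree import DecisionTreeClassifier

# each state: [altitude_ft, airspeed_kt, pitch_deg, roll_deg]
states = [[1000, 120,  2.0,  0.0],
          [ 950, 118, -1.5,  0.1],
          [ 990, 121,  0.5, -0.2]]
actions = ["pull_up", "pull_up", "level"]

policy = DecisionTreeClassifier().fit(states, actions)

# the cloned policy maps unseen states to the action the pilot
# would most plausibly have taken
print(policy.predict([[960, 119, -1.0, 0.0]]))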

Financial Games
• Predicting exchange rates
  - Daimler-Chrysler
• Predicting the stock market
  - Many models
• Time series … !

Games and ML
• A natural and challenging environment
• Several successes, a lot still to do
  - An ideal topic for a thesis / student research project (Studienarbeit)

Merry Christmas and a Happy New Year!
