James Nathan: Cameron Browne’s “Automatic generation and evaluation of recombination games”

It turns out there is an equation that likely describes games I like:

[photo: the equation, from Cameron’s thesis]

It probably doesn’t, but maybe.

If you ask me what types of games I like, I’m fairly at sea, as I don’t have the vocabulary for it and you probably don’t have time for my rambling answer. It amuses me that I have trouble with the concepts of favorite designers or favorite publishers, as I generally don’t like multiple games from the same designer or publisher. This last year at Essen, I triaged Eric’s preview by assuming I was safe to throw out themes, components, and mechanics that aren’t generally in my wheelhouse: take-that, minis, etc. Typically my answer would start with mechanics: it seems safe to say that I generally like worker placement games. But with enough time, I’d probably give you a quirkily specific answer: I like games where you are forced to sacrifice or consume your victory condition in order to compete; games where one player is disqualified at the end for meeting, or failing to meet, a certain criterion; games where you have a hand of cards and on your turn you play two of them to take actions.

Turns out mechanics may not be the best predictor of what you like. There are equations for that, too.

There’s a chance this post will be full of unscientific extrapolation and misapplication of scholarly results.

That chance is 1000%.

In preparation for a series of reviews this week, I’ve played games with titles like Bregorme, Ninniach, Elrostir, and Ndengrod. Most of the games I’ll talk about are published by the Spanish company nestorgames – a publisher whose games we haven’t reviewed since 2011, and the only publisher I know of that offers discounts if you pay with crypto. Some of the titles use more traditional morphemes, such as Feed the Ducks. I’m also going to talk about some other game-type activities – let’s call them puzzles – that I came across during my preparation.

This all came about from a discussion of what a game designer is, and then somebody said, ‘Hey, there is Yavalath’. The short version is this: Cameron Browne wrote a computer program to design games by evolving existing rulesets, and Yavalath was the second most-liked game.

(The most-liked was Ndengrod, but it’s been renamed Pentalath. Those other exotic names? The program also had a Markov-chain-based system for naming the games it evolved.)
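The thesis doesn’t reprint the namer, so here is a minimal sketch of how a letter-level Markov-chain name generator might work; the seed corpus, the chain order of 2, and the start/end markers are all my assumptions rather than LUDI’s actual implementation:

```python
import random
from collections import defaultdict

def build_model(names, order=2):
    """Map each `order`-letter context to the letters observed after it."""
    model = defaultdict(list)
    for name in names:
        padded = "^" * order + name.lower() + "$"
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate_name(model, order=2, max_len=10):
    """Walk the chain from the start context until the end marker."""
    context, letters = "^" * order, []
    while len(letters) < max_len:
        nxt = random.choice(model[context])
        if nxt == "$":
            break
        letters.append(nxt)
        context = context[1:] + nxt
    return "".join(letters).capitalize()

corpus = ["yavalath", "pentalath", "ndengrod", "ninniach", "elrostir"]
print(generate_name(build_model(corpus)))  # plausible-sounding nonsense
```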

What I found when I read Cameron’s PhD thesis – for which the program, LUDI, was written – was that it contained much more nuance than I had anticipated, and it led me to many interesting things. That’s what I’m here to talk about.

In the 239 pages of “Automatic generation and evaluation of recombination games”, Cameron doesn’t get to the experiment that evolves the games until page 153 – and everything after page 170 is Conclusion and Appendices. What I had overlooked in my general awe at Cameron and LUDI’s accomplishment is that the hypothesis of the thesis was not so much whether a program could be written to design a game, but rather about the underlying structural foundation that needs to exist first – namely, when LUDI presents you with 1389 new games, how will you playtest them? How will you skim the cream off the top?

You’ll need a couple of underlying systems: an AI capable of playing whatever LUDI brings to the table, and a system for reviewing the results of AI-vs-AI play and generating, from those games, statistics that are predictive of what human players might like. Much of the thesis is devoted to addressing these questions.
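To make the shape of that second system concrete, here is a rough sketch of its core loop. The `game` object and its methods (`initial_state`, `legal_moves`, `apply`, `evaluate`, `is_over`, `winner`) are a hypothetical interface of my own, standing in for whatever LUDI’s rule interpreter actually exposes:

```python
import random

def random_policy(game, state):
    """Weakest possible baseline: pick any legal move uniformly."""
    return random.choice(game.legal_moves(state))

def self_play(game, policy, max_moves=200):
    """Play one AI-vs-AI game, logging an evaluation after every move
    so that aesthetic statistics can be computed from the trace later."""
    state, trace = game.initial_state(), []
    for _ in range(max_moves):
        if game.is_over(state):
            break
        state = game.apply(state, policy(game, state))
        trace.append(game.evaluate(state))  # positive favors player 1
    return game.winner(state), trace
```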

I imagine it goes without saying, but in the context of LUDI, we’re talking about 2-player abstract games. Interestingly, at one point Cameron cites a 2000 post in The Games Journal positing that two-player games can be described as “puzzles that the players set each other”, and that solitaire puzzles are solitaire games between the player and the inventor.

I’ve never played much chess, and I dabbled only lightly in Go during college, but I fondly remember the curiosity I felt when I first saw one of my classmates studying a book of Go moves before class one day. It was a book of static game states, sometimes global and sometimes local; the book invariably said whose turn it was and asked what the optimal move was. At the time, that light Go dabbling was as deep into the hobby as I waded, but I was entranced by the idea of a game that presented you such puzzles in the course of play.

Today, of course, I can’t imagine such a book, say, presenting me with the information for a Power Grid game state and asking for my maximum bid on a certain plant up for auction. The literature of abstract games is littered with these koans, and Cameron figures out a way to measure a game’s capacity to produce interesting puzzles in this manner. That’s the equation. The one we started with.

But before he could write the general AI, evolve games, or measure results, Cameron started with a game language (GDL) that could serve as the framework. It breaks a game down into ludemes that describe, for instance, the board shape, the end condition, the win condition, and the placement/movement rules. These are nuggetized such that the evolutionary mating of two games, plus random mutations, could take the hex tiling of one parent game and the 4×4 size of the other parent and combine them – or turn a win condition of 3-in-a-row into a lose condition of 3-in-a-row.
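As a rough illustration – the flat dictionary and ludeme names below are my own simplification, not the GDL’s actual nested syntax – crossover and mutation over two parent rulesets might look like this:

```python
import random

# A drastically flattened stand-in for a ruleset; real ludemes form
# nested trees, but a dict is enough to show the genetic operators.
parent_a = {"tiling": "hex", "size": 5, "end": ("in-a-row", 3, "win")}
parent_b = {"tiling": "square", "size": 4, "end": ("connect", "sides", "win")}

def crossover(a, b):
    """Child inherits each ludeme from one parent or the other at random."""
    return {key: random.choice([a[key], b[key]]) for key in a}

def mutate(game, rate=0.1):
    """Occasionally flip a win condition into a lose condition."""
    if game["end"][0] == "in-a-row" and random.random() < rate:
        game = dict(game, end=("in-a-row", game["end"][1], "lose"))
    return game

child = mutate(crossover(parent_a, parent_b))
print(child)  # e.g. hex tiling from one parent, 4x4 size from the other
```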

(One of my disappointments with the project, though certainly understandable, is the number of times “this feature is not implemented” comes up during the discussion of the GDL. From what I gather, this is largely attributable to processing time. The GDL is more robust than the parts that the LUDI experiments exploited.)

Cameron establishes 57 measurements that can be used to evaluate aspects of a game – not the game-state evaluations used in the AI’s play, but rather the intrinsic and extrinsic criteria that we would use to describe a game. However, rather than the pedestrian “it was interesting” that you might get from me (or the more frequent “it was fine”), he has armed us with measurements of Convergence, Uncertainty, Drama, Stability, and Momentum, among others.

(Somewhat unexpected to me was the inclusion of measurements for the victory condition: is it an n-in-a-row game? A game where you need to connect two sides? These are boolean (0 or 1) results.)
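Other measurements are dynamic, computed from how self-play games actually unfold. Purely as flavor – this lead-change counter is my own toy stand-in, not one of Cameron’s formulas – here’s the kind of statistic a per-move evaluation trace supports:

```python
def lead_changes(trace):
    """Count sign flips in a per-move evaluation trace, where positive
    values favor player 1 and negative values favor player 2."""
    flips, last_sign = 0, 0
    for value in trace:
        sign = (value > 0) - (value < 0)
        if sign and last_sign and sign != last_sign:
            flips += 1
        if sign:
            last_sign = sign
    return flips

# A see-saw game scores higher than a one-sided rout.
print(lead_changes([0.2, 0.5, -0.1, -0.4, 0.3, 0.6]))  # prints 2
```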

Cameron had a group of subjects play, in pairs, the 79 games that would serve as LUDI’s initial gene pool – Game A against Game B – with each subject expressing a preference for one over the other, then repeating. While these trials were going on, he also had the AI run simulations against itself, logging data for the 57 measurements throughout. Ultimately, he compared the measurements from the AI trials against the games most liked by the survey participants to find the subset of the 57 criteria that best correlates with a human-preferred game.
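Mechanically, that comparison boils down to correlating each criterion’s per-game scores with the human preference data. A minimal sketch (the numbers are invented; statistics.correlation needs Python 3.10+):

```python
from statistics import correlation  # Pearson's r; Python 3.10+

# Invented data: one value per game for a single criterion, and the
# share of head-to-head survey trials that each game won.
criterion_scores = [0.31, 0.55, 0.12, 0.78, 0.44]
human_preference = [0.40, 0.62, 0.20, 0.70, 0.50]

r = correlation(criterion_scores, human_preference)
print(f"Pearson r = {r:.2f}")  # criteria with high |r| make the cut
```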

I would have expected to like games with a high amount of Coolness (“a measure of the degree to which players are forced to make moves that harm their position”), but it turns out I may be wrong.

(I did know that I would like games that reach a conclusion, but in the realm of LUDI’s zygotes, that isn’t a given.)

Cameron was able to narrow the 57 criteria down to 17 that best correlate with the games preferred by the survey participants. Some of these relate to the victory conditions (e.g. grouping pieces), some were likely predictable (e.g. duration), and the others range from “Uncertainty (late)” and “Killer moves” to “Clarity (variance)” and “Permanence”. Yours truly’s favorite, “Puzzle quality”, initially made the cut but was dropped due to the computational intensity required to calculate it.

Now that LUDI knew what to look for, it searched its family tree for the top candidates, and its 1389 game design credits were quickly narrowed to 19. A second survey (Experiment III) was sent to a subset of the original survey participants, who played the 19 games against the AI in comparative pairs, as before.

What are the 19 games? I’ll discuss several of them later this week, starting tomorrow with Yavalath, a LUDI game, and Manalath, designed by Dieter Stein. I loved reading Cameron’s thesis – I got much joy out of his creativity and capability, and out of the overall structure and theory. There are some links below if you’d like some further reading.


Further Reading:

From most concise to the whole hog, here are links to Cameron’s own words describing LUDI:

BGG Designer Diary: https://boardgamegeek.com/blogpost/2814/yavalath-evolutionary-game-design

Article for IEEE Transactions on Computational Intelligence and AI in Games summarizing the thesis: https://eprints.qut.edu.au/31909/1/c31909.pdf

PhD Thesis: https://eprints.qut.edu.au/17025/

Related Reviews:

Tuesday – Yavalath and Manalath
Wednesday – Ndengrod/Pentalath, Valion, and Elrostir
Thursday – Volo, Feed the Ducks, and assorted puzzles
Friday – Brief Interview with Dr. Cameron Browne
