Next: Remaining Notes Up: notes Previous: Groups for project 1

Monday, 17th September, 2001

Absent: DK.

These notes were taken by Suhit in GSKC's absence. GSKC was out of town due to a family emergency.

KR confirmed for the class that the project is due next Monday the 24th. Reports and presentations are both due then. The report should be made electronically available in HTML format, and a hardcopy submitted in class. Guidelines for writing the report are on the web. KR emphasized that it is better to be specific and deep about a few things than being vague about all aspects of the problem.

KR asked the class to put their current player code on the web today. That way, there can be some interesting discussion on Wednesday about how well the players play against each other. Random and Greedy are obviously not the best to test against.

DA reported that his group was implementing an incremental approach. The basic idea is to try to improve your hand one card at a time, say from two of a kind to two pairs to a full house. Right now, DA doesn't consider straights and flushes, but he plans to do so.

BHB tested his code against Greedy and Random. Greedy had a pair of Queens but traded one of them away. Greedy thus seemed to be a weak player. This was not a surprise to KR, since random and greedy were never expected to be competitive.

GY observed during his tests that Greedy players sometimes ``collaborate'' with other greedy players, at the expense of his player. Nevertheless, the more players there were (and hence the more cards are dealt), the easier it was to defeat Greedy.

HBB wondered about the appropriateness of publishing source code at this early stage of the project. KR thought it was reasonable, since if other groups used your code, they would be required to include an acknowledgement.

MD described her group's initial approach. Look at other people's cards and split them into ones we think that the other players will hold on to, and ones we think the other players would part with. For example a player with a three of a kind would probably not part with the triple, but could conceivably part with either of the other two cards. A score is computed for all exchanges based on how much it improves the player's hand, whether the desired card is one the other player would part with, how much the swap would help the other player, etc. Then choose the swap that maximizes the score.

AM added that once a swap is done, there might be a dead end. For example, the card needed might be a step towards a flush, but there may not be enough ``movable'' cards left to complete the flush. It might pay to look ahead to see if there is a chance if a swap will be useless in the future.

SP described a situation where you desire two cards, say A and B. A has higher value, in that it increases the rank of the hand by more than B, but B is harder to get, since others might want it too. In that case, one should try to get B first.

NN described her group's approach that involved building a player whose strategy depends on several parameters. Those parameters will be varied during the code's development. Hopefully, the system could ``learn'' good values of the parameters. The parameters she had in mind were things like how to compute the value of your hand and other players' hands. NN also suggested looking at the closest possible hand that would be an improvement in rank, and try to go for that first.

JD proposed that certain cards can be classified as ``strong''. For example, if a player is trying to get a set of 7s, then we say all 7s are strong. With sufficiently many players going for rank-based hands, there will be many strong cards. In such a situation, it would be very difficult to get a straight or a flush because you might have to step on other people's toes to get strong cards. KR liked the notion of ``strong'' cards; perhaps straights and flushes will be rare because of this effect.

HBB expressed his satisfaction that today we were discussing actual game strategy, as compared with last class when we spent a lot of time discussing how (or how not to) team up with other players.

SK thought that a fair degree of luck would be involved in the game, both guessing how the opponents will repsond, and how the cards are initially dealt. JW disagreed. He thought there was enough room for skill to shine out in the long run.

RM mentioned that her group had considered a traditional minimax search for the game. However, they rejected the idea because the search space after the first level would be much too big.

MD suggested that some hands might allow one to work towards two different goals. To the extent that it is possible, one might try to stay within the intersection of those goals as long as possible. RB thought that trying to achieve this two-goal philosophy would not work because the search space would then get too large.

Somebody (who?) suggested that worrying about the future wasn't worthwhile; just go with what you have right now and get the one best card. KR outlined how the class had suggested two models: one that does the best local move, and one that makes plans for the future. Both have advantages and disadvantages. HB suggested that the preferred approach depends on number of rounds left, with more planning being preferable when more rounds remain. We might want to use both depending on how many rounds are left

JDG was emphatic that there is no sensible notion of which approach is ``better''. It depends upon who plays your opponents. For example, a short-term player may do well against many long-term players, but poorly against other short-term players. Conversely, a long-term player may do well against many short-term players, but poorly against long-term players.

KR divided the class into four groups, to play a four person game. The cards were dealt as follows by GSKC's simulator. (Suits were not noted by the note taker, but nobody ended up going for a flush anyway.)

p1 - 4 4 2 6 K
p2 - 5 5 6 9 10
p3 - A 5 6 7 J
p4 - 3 7 8 9 J

Several students analyzed the state of the hands. Various kinds of strategic issues were discussed, such as whether there was a possibility of a flush, which cards were available for three of a kind (or better), whose straights might conflict with others' goals, etc. KR asked how much of this stuff that we just said can be automated?

Some of the issues (such as how many of each rank are out there) could clearly be automated. Much of the strategic thinking was more difficult to conceive of as a machine strategy. MD mentioned that, perhaps unlike a human, a machine could easily enough compute the best possible hand that can be made from the dealt cards.

After one round of the game, it looked like this:

p1 - 4 4 2 6 K
p2 - 5 5 5 9 10
p3 - 6 6 J J A
p4 - 7 7 3 8 9

JW thought it was not feasible to determine what other people are going to say for a swap. AM suggested that if players have more than one combination, there is no way to predict which way they are going to go. BG thought player 1 could do nothing since they had bad cards.

JD had an idea for player 1 (even though he was actually playing player 4). He suggested giving player 3 the 6 in exchange for a J. As a result, player 3 would improve from two pairs to three of a kind. While it might seem like player 1 is no better off, on the next round player 1 can ask for the second J, to get two pairs. While this second swap relies on player 3 showing some gratitude for earlier moves, or at least being favorable to neutral swaps, there didn't seem anything better for player 1 to do.

This move was completed. Interestingly, on player 3's turn, they asked for the J back in exchange for the A. They didn't have anything better to suggest; this offer was declined, to great laughter. IW stated that there is clearly emotion here, but the computer has no emotion.

JW wondered if players can remember things? KR said yes, within one game, but not from game to game JW mentioned that, unlike in the human version of the game, there is no possibility for threatening other players with future mistreatment.

JD disagrees with IW, in as much as he thinks computers could be programmed to behave in a way that is benevolent. KR reminded the class of JD's earlier point. Even if player 1 doesn't expect to get the J, there is still a nonzero chance of getting it, which is better than nothing.

Player 4 offered player 2 a 9 to complete a full house, even though the net gain to player 4 was almost zero.

At the start of the final round, the cards were:

p1 - 4 4 2 J K
p2 - 5 5 5 9 9
p3 - 6 6 6 A J
p4 - 7 7 3 8 10

In the final round, player 1 succeeded in getting their J, with all others passing. The final result:

p1 - 4 4 J J 2
p2 - 5 5 5 9 9
p3 - 6 6 6 A K  
p4 - 7 7 3 8 10

Ranks: p2, p3, p1, p4.

The class revisited their initial thoughts about the hands. p2 was supposed to be close to a flush but got a full house. Other players that we close to straights did not get it. SP tought that if p4 had gone for a straight, then things would have been very different, even if p4 hadn;t succeeded.

KR wondered how many people are going to try and get straights and flushes? Multiple people said yes. AC thought that p2 and p4 should have tried for straights.

p1 people thought they were worse off than they were. But it turned out they did better because of the benevolent trade in the last round.

KR suggested that improving a little when you are really bad is just as valuable as moving up when you are doing well.

The class then discussed deliverables for next time. KR wants players to be submitted on-line ASAP, both in executable and source form. Gaurav would get back to the students (assuming his imminent return) to clarify what's needed.

KR asked how many groups would like to submit more than one player. Multiple people said yes, others did not know. KR encouraged this, since it permits the exploration of more ideas. However, to be eligible to submit multiple players for the tournament, versions of all players must be available on Wednesday. Wednesday is the last class before the tournament.

If submitting more than one player, then submit them with different names. Do not submit 10 players with minor variations of a parameter. But do submit a couple if necessary if they are significantly different.

Finally, the class discussed the tournament parameters. It was agreed to run tournaments with 4, 7, and 10 players. For each of these, we'd run tournaments with 1, 3, 5 and 10 rounds. Somehow the players would be randomized so that they played a variety of opponents, but played the same number of games.

An interesting suggestion was to run many games with the same initial set of hands, varying the players playing each hand. This could be done either using a default hand in a configuration file, or by being able to control the random number seed used.

SP liked this idea, becuase we could gather data about who did best with the worst cards. For example, if a hand almost always cam in last, but some player managed to do well with it, that is more informative that doing well on a hand that everybody does well on.

KR also suggested ``rigging'' some hands with special interesting combinations. The class liked this idea. NN suggested a candidate in which all hands conatin 5 different ranks, with no initial straights or flushes. KR went further, suggesting a perfectly symmetric set of four hands, each having exactly the same ranks in them.

Next: Remaining Notes Up: notes Previous: Groups for project 1

Ken Ross 2001-10-02