Present: SK BGB JDG RB SP GY BHB NN JW JRF DK BPS JRK BH IW AC MH BG SM HBB HB Absent: AM
KR had been asked by the dean to allow for some time in class for students to discuss recent events if they wish to do so. The floor was open for students to talk about their experiences, and about people they know that have been affected by the attacks on the World Trade Center towers. After 15 minutes discussion, the class returned to project 1.
There had been another withdrawal from the course, namely RG, and 2 new people (GY and BHB) who joined. These students needed to be placed into the existing groups - it was decided that GY replace RG in group 1, and BHB join group 5. The new group assignment is at the end of these notes.
KR began the discussion by asking students for any observations that
they might have, based on the discourse from the last class. SP
wanted their player interface to be extended so that the players had
access to a ``log'' of the game. A history of what proposals were
made, and whether they were accepted or not would help players
determine the likelihood of certain proposals getting accepted. JRF
complained that she could not run the simulator, and queried about
where she could find the API for players. GSKC asked her if she had
seen the webpage he had set up for project 1, and said that a lot of
the information that the students were looking for would be available
on that webpge: http://www.cs.columbia.edu/~gskc/pps/proj1.html
.
All students still had not been able to access the webboard; KR suggested that GSKC find all students' cunix UNIs and explicitly grant students access to the webboard.
Continuing on the API extension discussion, HBB asked if it was possible to have more information regarding the state of the game in the API, perhaps a data structure of what the other players' intentions were, based on recent history, of course. SK raised the point that players didn't actually know the true identity of other players. KR elaborated on this: the simulator attempts to hide the identity [e.g., creator, strategy description] of each player from other players. However, was there a way to circumvent the simulator's efforts to maintain anonymity amongst the players? In other words, was there a way to correctly identify the other players during the course of a game based on their behaviour? KR also reminded the students that using Java introspection to identify other players was not permissible.
JW proposed that a player could profile the other players during a game - this would involve studying their behaviour under certain circumstances. Given enough knowledge of how other players carry out moves, it might be possible to determine their identity. RM reckoned that player-identification was in fact a good thing because it would permit an AI-enabled player to ``learn'' how to play better if it recognised who its opponents were. However, KR worried that such information might be used by players to form informal teams, and to gang up on other players. This would be unfair to the other players!
Even without explicit identification information, TW figured that a log of the choices made by the other players would be helpful, and asked if it could be made accessible. GSKC offered to provide a log of player decisions made during the course of the game; such information would not exist between games. In fact, student players must not save data to persistent storage for use over the course of several games.
The students then offered several means by which players could identify each other during a game. BHB thought that there was no direct way to do this unless references to other player objects were actually available. However, HB noted that players could correctly identify each other by making certain, pre-determined (probably unusual) swap requests! To which, KR added that such behaviour would allow two instances of the same player object to trivially identify each other during a game. JW brought to everyone's attention that there was a more logical way for such dual instances to recognise each other - a different player in the game is obviously an instance of the same player class if its behaviour is the same as what you would have done in the given situations for a number of rounds!
KR pointed out that a lot of different players would sometimes behave identically. For instance, any smart player would attempt to acquire the last remaining Joker card if it already has 3 Jokers in its hand! In such a scenario, a player can identify a ``brother'' player only with fairly low confidence. SP then suggested that ruling out other players as not being a brother player based on their decisions being different from what you would have done is a surer way of identifying replicas or brother players.
KR recommended that student players should maintain persistent state between many game iterations, as well as use many parameterised configurations during their testing phase. But the final submissions could not do so.
The issue of how to identify replicas having been addressed, BPS inquired if it was actually beneficial to have a brother player in the game. Would a player always help a brother, or would sibling rivalry rear its ugly head? JRF remarked that a player could always accurately predict the behaviour of a replica, so will be in a better position to decide whether to propose a swap with a brother or not. KR reminded the students that not all brothers would be exact replicas. Half-brothers might share the same logic when proposing swaps, but use different mechanisms when figuring out whether to accept other players' swap proposals. Also, completed unrelated players might employ fairly similar algorithms, and hence exhibit analogous behaviour at times.
Noting that most games reach stability with fewer players choosing to swap cards after playing about 3-4 rounds, IW observed that it might be a waste of time to actually use up your turns to send out signals to your brothers.
A general discussion started on how the number of rounds required for a game to converge was related to the number of players. Both IW and MH agreed that a smaller number of players reach lesser rounds to converge. In fact, MH had a theory that the number of rounds required to converge was approximately equal to the number of players subtracted by 2! KR cautioned that exact numeric values are hard to prove, but the general idea (i.e., fewer rounds for convergence for smaller games) is probably correct. Also, it is not guaranteed that all games will converge early as IW observed; in fact, some players remain relatively inactive during the first rounds, and then use the last remaining rounds to converge massively. JRF disagreed because she believed that while human emotion does play a part when people play, a computer program will not play emotionally.
AC returned to the fact that a brother could be evil, especially if it is placed earlier in the playlist. For the tournament, since the aggregate score of all instances will be applied towards any student submission, KR indicated that an ``evil twin'' player would not help as far as the tournament was concerned.
So, disregarding the possibility of evil or jealous replicas, BH offered that brothers that do identify each other could work towards getting one of them higher placed in the game than both of them trying equally hard to to get a good hand, perhaps the instance that started out with the better hand.
The discussion on brother players continued - BGB inquired about the likelihood of a player actually being faced with a replica during the final tournament. KR responded that with multiple submissions from each group, it should not be difficult to play 10 different players against each other in every game. There will be no danger of twins in a game if we don't wish it.
DK arrived late, having biked all the way from New Jersey. (Bridges and tunnels were closed.)
SP wanted to know if any swap can really be considered neutral. KR suggested that a swap that is not really advantageous or disadvantageous should still be proposed if it might help out a brother player. JDG repeated the fact that brother players should not spend many rounds trying to identify each other; this reason behind this is more apparent for games with smaller number of rounds.
KR then asked the students to vote whether to allow brothers in any single game or not: the results were in favour of disallowing replicas by 11 to 4. KR asked the students to think whether, by excluding replicas altogether, the students had gotten rid of the fact that some players might have an unfair advantage by having some knowledge of other players' logic coded into their own algorithms. SM stated that multiple submissions of players by the student groups does not eliminate the given risk because the different players submitted by the same group might still be able to identify each other during the game. Different players from different groups also have the possibility of teaming together to gang up on the other players.
SP remarked that having looked at the source files for other groups' players, it would be possible to identify these players based on their runtime behaviour. However, should this be deemed unfair, and hence made illegal? RM suggested that it would be an interesting idea to be able to figure out who the other players are based on their behaviour, but it would be unfair to team up during a game. BGB proposed making it harder for players to be identified during a game, perhaps by incorporating some randomisation in the behaviour; this will certainly reduce the frequency of correct identification taking place. DK reckoned that the main intention of a player is to better its score, not necessarily to come out on top. Teaming up with a different player, and even helping them out at times should not be considered to be a bad thing, as long as it is trying to better its own hand.
JDG argued that student groups that work together to write collaborating players might discover that their partner group has suddenly changed their code in secret!
KR continued illustrating how seemingly weird swaps might actually make sense: two replica players will do better overall if their combined ranking is improved after a swap. This would be preferred to making swaps that improve the proposers hand without much improving the partner's hand.
NN suggested that players should be ``nice'' towards those players that have been favorable towards them; this will create an overall incentive for all players to be friendly towards others. KAR reminded the students that they had to end the discussion and come to an agreement on what sort of collaboration between players would constitute fair game. BGB expected any explicit inter-team collaboration to be disallowed - this included members from different groups coding together. KR reminded the students that even the act of sharing ideas in class was some form of explicit collaboration. JW argued that not all players would demonstrate gratitude!
SP considered passive information gathering during a game based on the other players' behaviour to be legal. To ensure that the player do not have hard-coded logic to identify other players, GSKC and KR will look at the source files for all students' submissions. As far as passive information gathering logic is concerned, the only permitted logic will be of the form that can identify that a certain player always likes to go for a full-house. On the other hand, logic to identify that a certain player is from a certain student group is illegal.
DK reckoned that helping other players do better would be harmful to the player itself, and this should be outlawed. JDG added that saving information to persistent storage from testing phases for use at the tournament should be disallowed. Somebody pointed out that in real human games, the players' personal characteristics and behaviour during the game would definitely be used to guess those players' decisions.
KAR explained to the students how in one of the projects from a previous year when this course was taught, one of the students did real well on the tournament by observing that all the other groups were using a particular strategy, and then developing a counter-strategy! Such a thing does constitute explicit identification, but this is permissible.
JRF expected the students should aim towards developing the best player. KR responded that the best performance in the tournament did not necessarily guarantee the best grades. Also, the students should not feel bad if their player is not the ultimate best player in the world. A player with a certain strategy might be good at beating a certain class of other players, but there will be another class of players that it cannot beat. KR will not be grading on a curve, and he stated that at the end of the course, it might turn out to be that all the students get an A, or maybe, everyone gets a C! Everything will be on an absolute scale.
GY defined the term ``best player'' to indicate the player that will do well against most other players, ie. the player that has the hightest probability of winning a game, based on its past performance. KR added that it might even be impossible to program the absolute best player in the world, since there may not be such a thing.
Shifting the focus to game particulars, KR asked about how things would differ for different numbers of players in the game. DK pointed out that there will be more cards not in play, and this will mean that there will be less room for players to improve their hands. BG remarked that with less cards to play with, the task of deciding which cards to swap will become harder, and there will be fewer rounds in which players actively propose swaps. Also, a change in rank amongst fewer players will become more significant.
KR observed that in a game with only a few players, the higher-ranked hands will occur less frequently at the end of the game. It's more likely that the winning hand will only have a pair. SP added that trading will become so important for smaller-player games. it might even be necessary for a player to try to get rid of all the cards it started with in order to get a decent hand! BG thought that this would give the other players an advantage if they could use these cards to attain better hands of their own. DK considered that convergence would be quicker with smaller-player games since the players would not be motivated to make many swaps. JDG figured that a sensible top-ranked player will never propose nor accept any swap that could potentially lower their ranking, thus making the analysis of the whole game more tractable. KR supplemented this argument with the model of a 2-player game: the higher ranked player will never lose its top slot, even if it may be leading with only a TEN-HIGH hand!
The deliverable for the class was to have implemented some version of a player. For the next class, students will have to have players with some degree of strategy, preferably able to beat RandomPlayer, maybe even the GreedyPlayer.
JDG proposed that the different student groups share strategies for ranking hands, after all, even if the different groups were to do this seperately, most of the implementation be very similar. Having thought about it for a while, SK remarked that a certain degree of randomness in decision making would make strategy-determinism and consequently, runtime identification more difficult for other playesr.
KR decided that a broad test of the capabilities of the different players required the tournament to have different parameters: eg. small, medium, large number of rounds, the full 10-players, fewer players, etc.
NN asked GSKC to add a method for comparing hands of different players to the API.
BH had a few questions at the end: were there to be no brother players? could players save state about games to persistent storage? The answer was 'no' on both counts. The class had made up a list of different strategies that could help support/avoid taking advantage of brother instances during a game: