COMP3160 ARTIFICIAL INTELLIGENCE Assignment 2 (Weight: 20%)
Evolutionary Algorithms for Adversarial Game Playing
Draft Submission Due: 11:55pm, Oct 21, 2022 (Friday, Week 11)
Final Submission Due: 11:55pm, Oct 28, 2022 (Friday, Week 12)
The goal of this assignment is to appreciate the efficacy of Evolutionary Algorithms, specifically the Genetic Algorithm (GA), in the context of game theory. In this assignment you will use the DEAP package for Genetic Algorithms to evolve strategies for repeatedly playing the 3-Person Volunteers Dilemma (3VD):
Three bystanders, P1, P2 and P3, witness a hit-and-run accident, with the victim seriously injured. None of these bystanders has the appropriate (medical) skill to help the injured. They are fully aware of their civic duty – to immediately call the emergency number 000 (Triple-Zero) and report the accident so that both police and ambulance can be dispatched to the accident site to help the victim as well as carry out the police investigation. Each of the bystanders can either volunteer (V) to call Triple-Zero, or remain apathetic (A) to the victim and do nothing. The assumed facts are:
1. If none of three bystanders volunteers to call Triple-Zero, the victim will die, and each of P1, P2 and P3 will suffer from lifelong shame and guilt.
2. There is a cost to calling Triple-Zero: the volunteer would need to provide their own details, wait at the accident site until the ambulance/police arrive, and will likely be called to visit the police station and provide more evidence regarding the accident in future if necessary.
3. The cost decreases if there is more than one volunteer – individually they will need to provide less detail, and they will not have to make a lonely trip to the police station in future if it becomes necessary.
4. The three bystanders are generally sociable – they prefer to be in each other’s company than on their own.
5. If at least one bystander volunteers, the victim will be treated, and the one(s) who don’t volunteer would not have to suffer from lifelong guilt and shame.
If you were one of those witnesses, would you call Triple-Zero, or remain a mute spectator?
We model this 3VD game by the payoff matrix given in Table 1 below.
                  Number of {Pj, Pk} playing V
                  0 V      1 V      2 Vs
Pi plays V         5        6        8
Pi plays A         0        9        7
Table 1: Payoffs to Pi, under how many of {Pj, Pk} play V.
As an illustration of how this table is meant to be used, suppose both P1 and P2 choose to volunteer, but P3 remains apathetic. We want to calculate P2’s payoff. Setting i = 2 and {j,k} = {1,3}, we see Pi plays V (so payoffs in top row), and only one of Pj and Pk plays V (so we restrict the focus to the column under 1V); and determine that P2’s payoff is 6. The payoffs to each of the three witnesses under alternative arrangements (e.g., if all of them opt V) can be similarly verified.
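The lookup described above can be sketched in Python as follows. This is only an illustration of how Table 1 is read: the names TABLE_1 and payoff_from_table1 are assumptions made for this example, not part of the interface required in the tasks below.

```python
# Table 1 payoffs to Pi, indexed by Pi's own move and then by the
# number of volunteers among the other two players (0, 1 or 2).
TABLE_1 = {
    "V": (5, 6, 8),
    "A": (0, 9, 7),
}


def payoff_from_table1(own_move, other_moves):
    """Return the payoff to the player who plays own_move.

    other_moves holds the two moves ("V" or "A") of the other players.
    This helper is hypothetical and only illustrates reading Table 1.
    """
    volunteers = sum(1 for move in other_moves if move == "V")
    return TABLE_1[own_move][volunteers]


# Worked illustration from the text: P1 and P2 play V, P3 plays A.
moves = {"P1": "V", "P2": "V", "P3": "A"}
p2_payoff = payoff_from_table1(moves["P2"], (moves["P1"], moves["P3"]))
print(p2_payoff)  # 6
```

Indexing by the count of other volunteers, rather than by who volunteered, mirrors the column layout of Table 1.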
The payoff matrix can be more explicitly represented as in Table 2 below. In the tasks specified in this assignment you can use either Table 1 or Table 2 (or both) as you find convenient.
                                 (Pj, Pk)
              (V, V)       (V, A)       (A, V)       (A, A)
Pi plays V    (8, 8, 8)    (6, 6, 7)    (6, 7, 6)    (5, 9, 9)
Pi plays A    (7, 6, 6)    (9, 5, 9)    (9, 9, 5)    (0, 0, 0)
Table 2: Payoffs to (Pi, Pj, Pk).
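As a consistency check, the entries of Table 2 can be derived from Table 1 by applying the Table 1 lookup to each player in turn. The sketch below does this for all eight move profiles; the names TABLE_1, TABLE_2 and payoffs_for_profile are illustrative assumptions, and you are free to represent the game differently in your own code.

```python
from itertools import product

# Table 1: payoff to a player, indexed by the player's own move and
# the number of volunteers among the other two players.
TABLE_1 = {"V": (5, 6, 8), "A": (0, 9, 7)}


def payoffs_for_profile(profile):
    """Return the payoff triple (Pi, Pj, Pk) for a tuple of three moves."""
    result = []
    for index, own_move in enumerate(profile):
        # The other two players' moves, with this player's move left out.
        others = profile[:index] + profile[index + 1:]
        result.append(TABLE_1[own_move][others.count("V")])
    return tuple(result)


# Build every (Pi, Pj, Pk) move profile and its payoff triple.
TABLE_2 = {
    profile: payoffs_for_profile(profile)
    for profile in product("VA", repeat=3)
}

for profile, payoffs in TABLE_2.items():
    print(profile, payoffs)
```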
You will be using the DEAP package for Genetic Algorithm in order to evolve strategies for playing Iterated 3-Player Volunteers Dilemma (3IVD). In order to assist you on this task, two papers on the application of GA to Prisoners Dilemma – one to IPD and the other to nIPD – are provided in the Assignment folder.
Task Specification
Note: Apart from the class notes, you are advised to go through the two supplied papers, in the given order, before proceeding with the assignment tasks:
i. “Using GA to Develop Strategies for IPD,” by A Haider, and
ii. “An Experimental Study of N-Person IPD Games,” by X Yao and PJ Darwen.
Give particular attention to Sections 4.1 and 2.1 of the respective works.
1. BACKGROUND KNOWLEDGE ASSESSMENT [6 marks]
(a) Determine if a Dominant Strategy Equilibrium (weak or otherwise) exists for the game 3VD. If so, identify at least one of its Dominant Strategy Equilibria, and explain why it is so.
(b) Determine if the strategy profile (V, A, V) is a Nash Equilibrium for the game 3VD. Clearly justify your answer.
(c) Determine if a Nash Equilibrium exists for the game 3VD. If so, identify all the Nash Equilibria for the game 3VD. Clearly justify your answer.
(d) Based on the analysis you have done as part of Tasks (1a)-(1c) above, describe in simple English, with explanation, the predicted behaviour of a rational player in the game 3VD.
(e) Consider a strategy (individual/chromosome) of memory-depth 2 for playing 3IVD. Explain how you would represent the memory bits and the default moves in this individual.
2. IMPLEMENTATION IN PYTHON [10 marks]
(a) Implement the function:
payoff_to_ind1(individual1, individual2, individual3, game):
returns payoff to individual1
Note: The payoff is determined by the latest moves, obtained from the appropriate memory locations of the respective individuals, and by the payoff matrix for the game given by game. (Assume that the game is 3VD and the memory depth is 2.)
(b) Implement the function:
move_by_ind1(individual1, individual2, individual3, round):
returns individual1’s move
Note: individual1’s next move is based on all the three individuals’ earlier moves and individual1’s strategy (encoded in individual1’s chromosome). The move to be returned can be V/A, or 0/1 depending on your representation. Note that in early rounds some default moves are made. Assume memory-depth of 2.
(c) Implement the function:
process_move(individual, move, memory_depth): returns success / failure
Note: individual’s relevant memory bits are appropriately updated based on its latest move move and memory depth.
(d) Implement the function:
eval_function(individual1, individual2, individual3, m_depth, n_rounds):
returns score to individual1
Note: individual1 iteratively plays 3VD against the other two for the number of rounds given by n_rounds, and its score is returned.
(e) Implement, using the DEAP package, genetic evolution of strategies for playing 3IVD. Assume a memory depth of 2. Based on your implementation, briefly describe the best 3IVD-individual you generated via GA, and what parameters (fitness function, type of crossover, mutation rate, etc.) you used for that purpose. Explain why you chose those specific parameters.
3. ANALYSIS [4 marks]
(a) Describe in English the behaviour of the 3IVD-strategy you obtained via Task (2e) above. Exploit any pattern you notice in it for this purpose.
(b) We know that, although the equilibrium in one-shot Prisoners Dilemma is mutual defection, IPD leads to the evolution of mutual cooperation. As part of Task (1d) you have analysed the behaviour that rationality dictates for a player in one-shot 3VD. Discuss the evolution of any deviation from such behaviour that your analysis of 3IVD in Task (3a) suggests. Clearly explain your answer.
4. OPTIONAL NOTES [0 marks]
If applicable, record in this section anything else that is relevant to your submission.
What to Submit, and When
You will submit two files: a Python code file, and a report in pdf. Your code file should include all the Python code you wrote for this assignment. Your report file, in pdf, should include all the answers (including the Python code, copied and pasted). The report file must use as its cover page the one that has been supplied (as part of the document template), duly filled in and signed. You will submit the files in two stages.
STAGE ONE: DRAFT SUBMISSION
In the first stage you must submit two draft files (that you will be able to update) by 11:55pm, Friday Week 11:
(a) Draft program file, to be named yourLastnameyourFirstname draftcode.py, that includes the implementation of functions specified in Tasks (2a) and (2b).
(b) Draft report file, to be named yourLastnameyourFirstname draftreport.pdf, that includes answers to Tasks (1a)-(1e).
STAGE TWO: FINAL SUBMISSION
The final versions of your code and report must be submitted by 11:55pm, Friday Week 12:
(a) The program file, to be named yourLastnameyourFirstname code.py.
(b) The report (in pdf), to be named yourLastnameyourFirstname report.pdf.