This project extends a project that the students in last year's class worked on. Last year, students had to guide a single agent through the grid using a controller. This year, you'll be trying to guide a stream of agents through the grid.
In biology, chemotaxis is the process by which cells follow a gradient in the concentration of a chemical signal to home in on a destination. For example, a damaged tissue might emit a signal that attracts other kinds of cells that repair the tissue, or that fight bacteria. Alternatively, one cell might emit a chemical that repels other kinds of cells whose presence might be detrimental to the host. In this project you're going to guide a single agent through a large square grid using only virtual chemical signals. There are three chemicals available, which we'll call red, green and blue.
Chemicals diffuse through the grid. Initially every cell of the grid has a concentration of 0.0 of each chemical. At various points in the game, your controller may choose to apply a chemical at a particular grid location. Say you choose to apply blue to (10,10). Immediately after you do this, the concentration of blue at (10,10) is set to 1.0. In subsequent turns, the chemical diffuses in the following manner: the concentration at (i,j) at time t+1 is the average of the concentrations at time t at (i-1,j), (i+1,j), (i,j-1), (i,j+1), and (i,j). In other words, the new concentration is the average over the 5 cells orthogonally adjacent to and including the cell. So on the next turn, the concentration of blue will be 0.2 at each of (9,10), (11,10), (10,9), (10,11), and (10,10). On the next turn, the concentration at (9,9) would be 0.08. (Can you see why?) Each of the three chemicals diffuses independently, meaning that a cell will have an (r,g,b) triple defining its concentration of the three chemicals. If at a later time you again applied blue to a cell, the new concentration would be set to 1.0, irrespective of what the blue concentration of the cell was beforehand.
The square grid may not be completely open. There may be cells that are blocked off, so that chemicals (and agents) cannot move to that cell. In such a case, the average calculation described above is over the open cells. The pattern of blocked cells will be specified by a map. We'll consider a variety of maps, and one of the first tasks for each group will be to submit a map that you think would be interesting for the project. A map has a size (anywhere up to 100x100) and a set of blocked squares. The format for maps will be provided with the simulator. A map also has a start cell and a target cell. Every unblocked cell in a map must be reachable via orthogonal moves from every other unblocked cell.
You will actually code two coorperating components for this project. You will write code for your agent which will be used for every map and every simulation. In other words, your agent is independent of the map. New in 2021: There may be multiple instances of an agent in the grid at any point in time. Each agent instance runs the identical agent code, with each agent having its own local state that is not visible to other agents.
You will also write code for your controller. The controller is aware of the map, and places chemicals strategically over time, to induce appropriate movement from the agents. The goal is to move n of the agents from the start to target cell in the shortest time possible. Here, n is a parameter that we'll vary in the course of the project.
Agents spawn at the start cell every r turns, where r is a parameter that we'll also vary. The only time an agent won't spawn is if the start cell is already occupied by another agent. When an agent reaches the target cell, it disappears from the grid.
Agents can move one cell orthogonally (not diagonally) on a turn. An agent receives inputs indicating the (r,g,b) values for the currently occupied cell, as well as for all unblocked orthogonal neighbors. The agent then makes a move decision based solely on these (r,g,b) values. Possible moves are north, south, east, west, and rest (i.e., stay put). So a simple agent might choose to ignore the red and green chemicals, and move to the neighboring cell with the highest blue concentration. Agents do not know their x or y coordinates, but they can sense neighboring blocked cells. Also, agents have a limited sensitivity to chemical concentrations: any concentration that is less than 0.001 will be sensed as zero. Agents can retain limited state information (one byte) that is preserved from turn to turn. So agents can't build maps as they explore, for example. Each agent also get access to a distinct random number (32-bit integer) on each turn.
If an agent tries to move to a cell where another agent already exists, then the move fails. This may be an important factor in your strategy, because it is possible that the grid becomes clogged. Moves are resolved in the order in which the agents spawned, so that the agent that appeared first moves first, etc. So an agent may choose a move to a cell with another agent on it, and will succeed if that other agent is an earlier-appearing agent that moves away from that cell. Importantly, agents do not sense other agents, so cannot adjust their move algorithm based on the degree of crowding.
Your controller places chemicals and observes the behavior of the agents. The controller also has complete knowledge of the map, the simulation parameters n, r and b (see below), the start cell, the target cell, all of the agents' coordinates and spawning times, and all cell concentrations. The controller has a chemical budget b, i.e., the total number of chemicals that can be placed during the simulation. There is no bonus for using fewer chemicals: you're just trying to minimize the time. We'll try simulations where the budget is plentiful, and others where the budget is barely enough to complete the task. You can choose which combination of chemical types you want to use -- there is one common budget rather than a separate budget for each type.
We'll put a (large) upper limit on the simulation time. If you don't manage to complete the task, your score will be this limit. We'll also put a limit (say 1 CPU second) on the time that a controller can spend on determining a move.
At the end of the project we'll run a tournament with various configurations, including some maps that you won't have seen before.
Ken Ross 2021-09-28