In biology, chemotaxis is the process by which cells follow a gradient in the concentration of a chemical signal to home in on a destination. For example, a damaged tissue might emit a signal that attracts other kinds of cells that repair the tissue, or that fight bacteria. Alternatively, one cell might emit a chemical that repels other kinds of cells whose presence might be detrimental to the host. In this project you're going to guide a single agent through a large square grid using only virtual chemical signals. There are three chemicals available, which we'll call red, green and blue.
Chemicals diffuse through the grid. Initially every cell of the grid has a concentration of 0.0 of each chemical. At various points in the game, your controller may choose to apply a chemical at a particular grid location. Say you choose to apply blue to (10,10). Immediately after you do this, the concentration of blue at (10,10) is set to 1.0. In subsequent turns, the chemical diffuses in the following manner: the concentration at (i,j) at time t+1 is the average of the concentrations at time t at (i-1,j), (i+1,j), (i,j-1), (i,j+1), and (i,j). In other words, the new concentration is the average over the 5 cells orthogonally adjacent to and including the cell. So on the next turn, the concentration of blue will be 0.2 at each of (9,10), (11,10), (10,9), (10,11), and (10,10). On the next turn, the concentration at (9,9) would be 0.08. (Can you see why?) Each of the three chemicals diffuses independently, meaning that a cell will have an (r,g,b) triple defining its concentration of the three chemicals. If at a later time you again applied blue to a cell, the new concentration would be set to 1.0, irrespective of what the blue concentration of the cell was beforehand.
The square grid may not be completely open. There may be cells that are blocked off, so that chemicals (and agents) cannot move to that cell. In such a case, the average calculation described above is over the open cells. The pattern of blocked cells will be specified by a map. We'll consider a variety of maps, and one of the first tasks for each group will be to submit a map that you think would be interesting for the project. A map has a size (anywhere up to 100x100) and a set of blocked squares. The format for maps will be provided with the simulator. A map also has a start cell and a target cell. Every unblocked cell in a map must be reachable via orthogonal moves from every other unblocked cell.
You will actually code two coorperating components for this project. You will write code for your agent which will be used for every map and every simulation. In other words, your agent is independent of the map. You will also write code for your controller. The controller is aware of the map, and places chemicals strategically over time, to induce appropriate movement from the agent. The goal is to move the agent from the start to target cell in the shortest time possible.
Agents can move one cell orthogonally (not diagonally) on a turn. An agent receives inputs indicating the (r,g,b) values for the currently occupied cell, as well as for all unblocked orthogonal neighbors. The agent then makes a move decision based solely on these (r,g,b) values. Possible moves are north, south, east, west, and rest (i.e., stay put). So a simple agent might choose to ignore the red and green chemicals, and move to the neighboring cell with the highest blue concentration. Agents do not know their x or y coordinates, but they can sense neighboring blocked cells. Also, agents have a limited sensitivity to chemical concentrations: any concentration that is less than 0.01 will be sensed as zero. Agents can retain limited state information (one byte) that is preserved from turn to turn. So agents can't build maps as they explore, for example. Agents also get access to a random number (32-bit integer) on each turn.
Your controller places chemicals and observes the behavior of the agent. The controller also has complete knowledge of the map, the start cell, the target cell, the player's coordinates, and all cell concentrations. The controller has a chemical budget b, i.e., the total number of chemicals that can be placed during the simulation. There is no bonus for using fewer chemicals: you're just trying to minimize the time. We'll try simulations where the budget is plentiful, and others where the budget is barely enough to complete the task. You can choose which combination of chemical types you want to use -- there is one common budget rather than a separate budget for each type.
We'll put a (large) upper limit on the simulation time. If you don't manage to complete the task, your score will be this limit. We'll also put a limit (say 1 CPU second) on the time that a controller can spend on determining a move.
At the end of the project we'll run a tournament with various configurations, including some maps that you won't have seen before.
Ken Ross 2020-09-09