There is a circular trench at the bottom of the ocean that is 100 kilometers in circumference [0 .. 100] where 0 is also where the 100 mark is. Six of those kilometers [d..(d+5) mod 100] (the d will be chosen randomly and will be unique for each game) are deemed a 'red-alert region'. Everywhere else is a yellow-alert region.
The whole game lasts m time units (m is up to 1000) In each time unit, the submarine can move one kilometer along the trench or stay in its current position (+1/-1/0 movement).
There are two players, the submarine captain S and the trench manager T. T must guarantee to be at the red-alert level whenever the submarine is in the red-alert region over the m time units. This is called the Safety Condition. However, T wants to achieve this Safety Condition at the lowest cost possible.
Going to yellow alert costs y per time unit and going to red alert costs r per time unit with r (significantly) greater than y. In addition there are probes wihin at most +/- L kilometers (where L is a value chosen for each probe by T less than or equal to 12). That is, a probe either returns True, which means 'the submarine is within L kilometers of where I am dropped in the trench' or False, which means 'the submarine is more than L kilometers away from where I am dropped in the trench'. Let's say the trench manager T puts a probe at x = 5 and sets L to 2. Then the probe would return True for positions 3, 4, 5, 6, 7. Each probe costs p (no greater than y) and deploys to any location, reports instantly, and then is never heard from again. At every time unit, T may deploy zero or more probes and at any location(s) within 20 kilometers of the red-alert region. Each probe will have an L value (12 or less) chosen by T. T considers the results and then emits an alert level for that time unit. Even though the probes deploy immediately, T must decide on how many and where to deploy the set of probes for that time unit just once; there is no second pass of probe deployment within the single time unit.
The variables d, y, r, m, and p will be given to T on the day of the contest. However, S will be given only L and m and its initial position. As the game proceeds, T will deploy probes at various locations and the simulator built by the architect will tell T which probes return True and which return False. The simulator will also tell S whether its position has been probed or not.
The score of T will be its cost over the m time units provided T achieves the safety condition for all m time units. If T does not achieve the safety condition at any time during the m time units, then the score of T will be the cost of red-alert for m time units plus the cost of 5 m probes.
In each pairwise competition between two groups G1 and G2, there will be two games. In one game G1 will be S and G2 will be T. In the other, G1 will be T and G2 will be S. The group with the lowest score wins.
Example to train your intuition: suppose T knows that S is in the closed interval [30..39] and the red-alert zone is [40..45] and there are three time units left. If T puts a probe at 41 with L = 4 and the probe returns False, then the submarine must be at [30..36] so T can go to yellow-alert till the end. In general, T might want to work backwards from 1 time unit left, then 2 time units left etc. That is a dynamic programming approach.
Dennis published a related puzzle in the Communications of the ACM.
Show the layout of the circular trench. Indicate the red-alert zone with red. Upon receiving the values of the various parameters ( d, y, r, m and p), send the appropriate ones to S and T. Show the position of the submarine at all times. Show the position of probes as time goes on as well as their range. Tell T whether each probe returns True or False at the time it is dropped (given that probe's range as provided by T) and tell S whether any probe has pinged S and what the range of that probe was. Keep track of the score and each player's time.