cRi3D Game Theory

denkbots’ cRi3D [Part 1] Game Theory

Game Theory is a tool used in several fields of Science including Political Science and Economics. For this article a very simplified version will be used where the model consists of a rational agent (it always chooses to perform the action with the optimal expected outcome for itself from among all feasible actions) competing in a mathematically defined competition.

To put it more plainly, we are going to try and find the theoretical scoring maximums for each part of the game.

Using a game theory based evaluation of the scoring system accomplishes three things:

Makes sure the students understand the scoring rules
Positions optimized scoring strategies at the front of the discussion leading into system strategy brainstorming
Provides the team with an objective reference that can be used as in unbiased evaluation of strategies in later phases of design

To start this article, it is assumed that the reader is familiar with this season’s game and corresponding game manual.

To begin, we will break the game into three sections for evaluation: Autonomous Mode (15 Seconds), Teleoperation Mode (2 Minutes), End Game (15 Seconds).

In each section we ask: What are all of the ways our robot can score points? Are any of these scoring methods mutually exclusive (for example, any given ball can only be put in one goal or the other)? Do any of the scoring methods share casual relationships (for example, a BREACH is caused when four of the five DEFENSES have been DAMAGED)?

For reference, the game manual scoring table (Table 3-1: Point Values) is provided below:

Autonomous Mode

In Autonomous Mode, the robot is placed in the Neutral Zone with a Boulder. The game manual (Table 1-1: Auto Point Values) details that a robot can score points in this mode by:

Reaching a Defense (2 pts)
Crossing an undamaged Defense (10 pts)
Boulder in the Low Tower Goal (5 pts)
Boulder in the High Tower Goal (10 pts)

If the robot scores a Low Tower Goal it can not score a High Tower Goal, and vice-versa, without going back and getting another Boulder. Also, a robot that has Crossed an undamaged Defense must have already Reached that Defense.

Treating the robot as a rational agent, it chooses to perform the options with the optimal expected outcomes and Reaches a Defense then Cross the undamaged Defense (10 pts) and finally toss the Boulder in the High Tower Goal (10 pts).

This optimal Autonomous Mode achieves a score of 20 pts.

Teleoperation Mode

In Teleoperation Mode, the robot is located in the enemy Courtyard from the end of Autonomous Mode. The game manual (Table 1-2: Teleop Point Values) details that a robot can score points in this mode by:

Crossing an undamaged Defense (5 pts)
Boulder in the Low Tower Goal (2 pts)
Boulder in the High Tower Goal (5 pts)

Also, the game manual (Table 3-1: Point Values) details that a robot (or robots) can also score Rank Points (RP) or playoff points (PoP) by:

BREACH (four of the five DEFENSES have been DAMAGED) (1 RP or 20 PoPs)
CAPTURE (three enemy robots SCALE or CHALLENGE a WEAKENED tower) (1 RP or 25 PoPs)

If the robot scores a Low Tower Goal it can not score a High Tower Goal, and vice-versa. If the robot has already crossed a DEFENSE once, on the second crossing it gets 5 points and the DEFENSE becomes DAMAGED. Once four of the five DEFENSES have been DAMAGED, a BREACH occurs.

Treating the robot as a rational agent, it chooses to perform the options with the optimal expected outcomes and returns to the Neutral Zone to retrieve a Boulder, Crosses an undamaged Defense (5 pts), and hurls a Boulder in the High Tower (5 pts). Back at its original location, the robot again chooses to perform the options with the optimal expected outcomes and repeats the previous actions.

In FRC games, this repeatable set of optimal actions is called a Cycle. A challenging part of FRC Game Theory is estimating cycle time. One way to estimate cycle time is by humans acting out the actions with game pieces, simulating a robot by “duck walking” (this forces the student/mentor to move at approximately the same rate as a kit-chassis) while not being allowed to bend their elbows while manipulating game pieces (this forces the student/mentor to articulate their shoulders with a game piece at approximately the same rate as a geared Fisher-Price motor).

Another way to estimate cycle time is by assumption. For example if a robot must be designed to do something in Autonomous Mode and a cycle consists of repeating those same actions in Teleoperation Mode, it would be safe to assume a cycle time approximately equal to the length of the Autonomous Mode.

While some DEFENSES are more difficult than others, for this exercise we will assume an average Cycle time of 15 seconds.

Given a 15 second Cycle time and 2 minute match, the robot will be able to complete 8 cycles in a match.

Since one DEFENSE was CROSSED in Autonomous mode, the robot will be able to complete 4 cycles in which it can CROSS an un-CROSSED DEFENSE. These first four Cycles will score 10 pts (5 pts for Crossing an undamaged Defense and 5 pts for a High Tower Goal) each for a total of 40 pts.

With one minute remaining, the robot will be able to complete 4 cycles in which it can DAMAGE a DEFENSE, achieving a BREACH. These cycles will score 10 pts (5 pts for Crossing an undamaged Defense + 5 pts for a High Tower Goal) each for a total of 20 points and 1 RP.

This optimal Teleoperation Mode achieves a score of 80 pts + 1 RP.

End Game

In the End Game, the robot is at the end of its last completed cycle in Teleoperation Mode. The game manual (Table 1-2: Teleop Point Values) details that a robot can score points in this mode by:

Challenging the tower (per Robot) (5 pts)
Scaling the tower (per Robot) (15 pts)

If the robot is Scaling the tower it can not also be Challenging the tower, and vice-versa.

Treating the robot as a rational agent, it chooses to perform the option with the optimal expected outcome and Scales the tower.

This optimal End Game achieves a score of 15 pts.

Final Score

Optimal Autonomous Mode: 20 pts
Optimal Teleoperation Mode: 80 pts + 1 RP
Optimal End Game: 15 pts

TOTAL: 115 pts + 1 RP

Summary

So what good has this exercise done? By walking through the match as a rational agent, the flow of a match can be better understood. By picking the optimal actions, necessary robot functions begin to take shape. By approaching the game systematically, rules and strategic advantages provided by those rules can be uncovered.

This exercise has also provided a baseline to evaluate other game strategies against and to help define robot functions from.

Please feel free to join the conversation on our Facebook or Twitter with your questions, thoughts, and feedback on these articles!

*Updated 20160112 to fix second DEFENSE crossing error

*Updated 20160113 to fix REACH vs CROSS scoring error

cRi3D Game Theory

Like this:

Leave a ReplyCancel reply

Share this:

Like this:

Leave a ReplyCancel reply