denkbots' cRi3D_2019

cRi3D DEEPSPACE Game Theory

Game Theory is a tool used in several fields of Science including Political Science and Economics.  For this article a very simplified version will be used where the model consists of a rational agent (it always chooses to perform the action with the optimal expected outcome for itself from among all feasible actions) competing in a mathematically defined competition.

To put it more plainly, we are going to try to find the theoretical scoring maximums for each part of the game.

Using a game theory based evaluation of the scoring system accomplishes three things:

  1. Makes sure the students understand the scoring rules
  2. Positions optimized scoring strategies at the front of the discussion leading into system strategy brainstorming
  3. Provides the team with an objective reference that can be used as an unbiased evaluation of strategies in later phases of design

This article assumes the reader is familiar with this season’s game and the corresponding game manual.

To begin, we will break the game into three sections for evaluation:

  • Sandstorm (15 Seconds)
  • Teleoperation Mode (135 Seconds)
  • End Game (T-minus 30 Seconds).

In each section we ask: What are all of the ways our robot can score points?  Are any of these scoring methods mutually exclusive (for example, we can only start with a HATCH or a CARGO)?  Do any of the scoring methods share causal relationships (for example, a HATCH must be installed before a CARGO will stay inside the CARGO BAY)?

For reference we will add a copy of “Figure 4-4 DESTINATION: DEEP SPACE FIELD” from the game manual to clarify location terminology.


Figure 4-4 DESTINATION: DEEP SPACE FIELD

Autonomous Mode

In Autonomous Mode, teams stage their ROBOT on their HAB PLATFORM such that it is fully and only supported by HAB PLATFORM Levels 1 or 2.

The game manual (Table 5-1 DESTINATION: DEEP SPACE scoring opportunities) details that a robot can score points in this mode by:

  • SANDSTORM Bonus (Level One: 3 points, Level Two: 6 points)
  • Secure a HATCH (2 points)
  • Load a CARGO (3 points)

If the robot is pre-loaded with a HATCH it cannot be pre-loaded with a CARGO, and vice versa (per Section 5.1.1 of the Game Manual).

The first decision we need to make is how we should pre-load the CARGO BAY.

We can pre-load the CARGO BAY with a null-HATCH (worth no points per Table 5-1 DESTINATION: DEEP SPACE scoring opportunities) and our robot with a CARGO which, if loaded, would net us 3 points.

The other option is to pre-load the CARGO BAY with a CARGO and our robot with a HATCH which, if secured, would net us 5 points (2 for the HATCH plus 3 for the CARGO it holds in place).

NOTE: Once the first HATCH is secured, there is a HATCH pre-loaded at each LOADING STATION on the field, so any other scoring we would want to do in this mode could use those, or we could load a null-HATCH on the CARGO BAY.

Our rational agent chooses the actions with the optimal expected outcomes: it crosses the HAB LINE from Level 2 (6 pts), then places the HATCH on the pre-loaded CARGO BAY.

First, we assume it takes 1s to depart the second level of the HAB, scoring 6pts. Next, if we assume an average speed of 10 ft/s for the robot and evaluate the distance to the CARGO BAY (~18ft), it takes approximately 1.8s to reach the CARGO BAY. Finally, we assume it takes 1s to secure the HATCH on the CARGO BAY. The drive and the placement put us at 2.8s of SANDSTORM used, with 5pts scored on top of the 6pt SANDSTORM Bonus.

Choosing the shortest path to scoring, we will have pre-loaded another CARGO BAY door with a null-HATCH. With that known, we can travel to the DEPOT (19’8″) which will take 2s, collect a CARGO (which we will assume takes 1s), return to the CARGO BAY which takes 2s, then deposit the CARGO (which we will assume takes 1s). This cycle takes 6s to complete, so we have now used 8.8s of SANDSTORM with an additional 3pts scored.

Again, since we want to choose the shortest path to score, we use our final pre-load option to supply the CARGO BAY door with another null-HATCH. Then we can travel back to the DEPOT (21’6″) which will take 2.2s, collect a CARGO (which we will assume takes 1s), return to the CARGO BAY which takes 2.2s, then deposit the CARGO (which we will assume takes 1s). This cycle takes 6.4s to complete, so we have now used all 15s* of SANDSTORM with an additional 3pts scored.

This MAX* Autonomous Mode achieves a score of 17 pts.
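
The running totals above can be checked with a short ledger. This is only a sketch in Python: the durations and point values are the ones stated in the text, while the step labels and the `sandstorm_ledger` name are made up for illustration. Times are stored in tenths of a second so the sums stay exact, and the SANDSTORM Bonus is entered with no time cost because the running total in the text starts at 2.8s.

```python
from itertools import accumulate

# Each step: (label, duration in tenths of a second, points scored).
# Durations and points are the article's figures; labels are illustrative.
SANDSTORM_STEPS = [
    ("SANDSTORM Bonus, leaving HAB Level 2", 0, 6),   # not counted in the text's running time
    ("secure pre-loaded HATCH on CARGO BAY", 28, 5),  # 1.8s drive + 1s placement
    ("first DEPOT CARGO cycle", 60, 3),               # 2s + 1s + 2s + 1s
    ("second DEPOT CARGO cycle", 64, 3),              # 2.2s + 1s + 2.2s + 1s
]

def sandstorm_ledger(steps):
    """Return (label, elapsed seconds, cumulative points) for every step."""
    elapsed = accumulate(t for _, t, _ in steps)
    points = accumulate(p for _, _, p in steps)
    return [(label, t / 10, p) for (label, _, _), t, p in zip(steps, elapsed, points)]

for label, seconds, pts in sandstorm_ledger(SANDSTORM_STEPS):
    print(f"{seconds:5.1f}s  {pts:2d} pts  {label}")
```

The last row lands at 15.2s and 17 pts, slightly past the 15s period, which is presumably what the asterisks on 15s* and MAX* acknowledge.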

Teleop Mode

In Teleoperation Mode, the robot is located at the CARGO BAY from the end of SANDSTORM. The game manual (Table 5-1 DESTINATION: DEEP SPACE scoring opportunities) details that a robot can score points in this mode by:

  • Secure a HATCH (2 points)
  • Load a CARGO (3 points)
  • HAB Climb Bonus (Level One: 3 points, Level Two: 6 points, Level Three: 12 points)

Also, the game manual (Table 5-1 DESTINATION: DEEP SPACE scoring opportunities) details that a robot (or robots) can also score Ranking Points (RP) by:

  • One Complete ROCKET (1 RP)
  • HAB Docking (1 RP)

Our rational agent again chooses the actions with the optimal expected outcomes: it turns and travels to the LOADING STATION (~27’ at 10ft/s is 2.7s), then collects a HATCH (1s), then travels to the ROCKET (19’ at 10ft/s is 1.9s), and secures the HATCH to the ROCKET (1s). This cycle takes 6.6s and scores us 2pts.

Now that we are located at the ROCKET, which has one secured HATCH, our rational agent turns and travels to the LOADING STATION (~19’ at 10ft/s is 1.9s), then collects a CARGO (1s), then travels to the ROCKET (19’ at 10ft/s is 1.9s), and loads the CARGO into the ROCKET (1s). This cycle takes 5.8s (total time 12.4s) and scores us 3pts.
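
Both cycle times come from the same simple model: drive out, pick up, drive back, place. A minimal sketch of that model follows, reading the distances as feet and using the 10 ft/s speed and 1s handling time assumed in this article; the function and constant names are hypothetical.

```python
ROBOT_SPEED_FT_PER_S = 10.0  # average speed assumed in the article
HANDLING_TIME_S = 1.0        # assumed time to collect or to place one game piece

def cycle_time(out_ft, back_ft, speed=ROBOT_SPEED_FT_PER_S, handling=HANDLING_TIME_S):
    """Seconds to drive out, pick up a piece, drive back, and place it."""
    return out_ft / speed + handling + back_ft / speed + handling

# First HATCH: CARGO BAY -> LOADING STATION (~27 ft), then LOADING STATION -> ROCKET (19 ft).
first_hatch_s = cycle_time(27, 19)
# Following CARGO: ROCKET -> LOADING STATION (19 ft) and back (19 ft).
cargo_s = cycle_time(19, 19)

print(f"{first_hatch_s:.1f}s + {cargo_s:.1f}s = {first_hatch_s + cargo_s:.1f}s")
```

Changing the two defaults is a quick way to see how sensitive the cycle times are to drivetrain speed versus manipulator speed.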

With this data, we can extrapolate that one HATCH-plus-CARGO pair (two round trips between the LOADING STATION and the ROCKET at 5.8s each) takes 11.6s. Therefore it will take us 23.2s (total time 35.6s) to fill the two remaining positions on this half of the ROCKET, which will score us another 10pts.

If we repeat this exercise for the other half of the ROCKET, it will take us 34.8s (total time 70.4s), which will score us another 15pts and 1RP.
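
As a check on the ROCKET totals, the sketch below adds up the time and points for a full ROCKET, taken here as six HATCH-plus-CARGO positions (three per half, which is what the 15pts per half implies). The names are illustrative, and the pair times are the 6.6s and 5.8s cycles worked out above.

```python
HATCH_PTS, CARGO_PTS = 2, 3
POSITIONS_PER_ROCKET = 6      # three HATCH-plus-CARGO positions per half
FIRST_PAIR_S = 6.6 + 5.8      # first HATCH starts from the CARGO BAY, so it is longer
REPEAT_PAIR_S = 2 * 5.8       # every later pair is two LOADING STATION round trips

def rocket_totals(positions=POSITIONS_PER_ROCKET):
    """Return (seconds, points) to fill the given number of ROCKET positions."""
    time_s = FIRST_PAIR_S + (positions - 1) * REPEAT_PAIR_S
    points = positions * (HATCH_PTS + CARGO_PTS)
    return round(time_s, 1), points

print(rocket_totals(3))  # first half of the ROCKET
print(rocket_totals())   # complete ROCKET
```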

With the first ROCKET complete, our rational agent knows there is one more null-HATCH from SANDSTORM that does not have a CARGO. Therefore we travel from the ROCKET to the DEPOT (23’3″ at 10ft/s is 2.3s), collect a CARGO (1s), then travel to the CARGO BAY (23’4″ at 10ft/s is 2.3s), and deposit the CARGO (1s), scoring 3pts. This cycle takes 6.6s (total time 77s).

Now, our rational agent looks to the next most optimal scoring location. With the shorter distance from the ROCKET to the LOADING STATION, our rational agent begins filling the second ROCKET.

The robot turns and travels to the LOADING STATION (~27’ at 10ft/s is 2.7s), then collects a HATCH (1s), then travels to the ROCKET (19’ at 10ft/s is 1.9s), and secures the HATCH to the ROCKET (1s). This cycle takes 6.6s (total time 83.6s) and scores us 2pts.

Now we just mirror the initial ROCKET cycle: loading a CARGO to finish the first position takes 5.8s (total time 89.4s) and scores us 3pts. Filling the two remaining positions on that half of the ROCKET then takes 23.2s (total time 112.6s) and scores us another 10pts.

With 22.4s remaining, our rational agent can either complete another 3 cycles (which take 6.6s apiece) and score a maximum of 7pts, or go for the END GAME, which has a potential of 12pts. Our rational agent heads to the HAB ZONE.
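
The choice above is a small expected-value comparison, sketched below with hypothetical names. It assumes pieces must alternate HATCH, CARGO, HATCH (a CARGO needs a HATCH in place first) and uses the 6.6s cycle time and 12pt Level Three climb quoted in the text.

```python
HATCH_PTS, CARGO_PTS = 2, 3
LEVEL_3_CLIMB_PTS = 12

def extra_cycle_points(remaining_s, cycle_s=6.6):
    """Whole cycles that fit in the remaining time, and the points they score."""
    cycles = int(remaining_s // cycle_s)
    # Pieces alternate HATCH, CARGO, HATCH, ... because each CARGO needs a HATCH first.
    points = sum(HATCH_PTS if i % 2 == 0 else CARGO_PTS for i in range(cycles))
    return cycles, points

def choose(remaining_s):
    """Pick whichever option scores more points."""
    _, points = extra_cycle_points(remaining_s)
    return "END GAME" if LEVEL_3_CLIMB_PTS > points else "keep cycling"

remaining = 135.0 - 112.6
print(extra_cycle_points(remaining), choose(remaining))
```

This sketch compares points only; whether the climb itself fits in the remaining time is a separate check.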

Therefore, the MAX TELEOP Mode using this model achieves a score of 48 pts and 1 RP.
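
For reference, the 48 pts can be traced back to the individual segments walked through above. The listing below is just a sketch that sums them; the segment labels are illustrative, and times are in tenths of a second to keep the arithmetic exact.

```python
# Each segment: (label, duration in tenths of a second, points), taken from the text.
TELEOP_SEGMENTS = [
    ("first ROCKET: first HATCH", 66, 2),
    ("first ROCKET: first CARGO", 58, 3),
    ("first ROCKET: rest of first half", 232, 10),
    ("first ROCKET: second half", 348, 15),
    ("CARGO BAY: CARGO behind the null-HATCH", 66, 3),
    ("second ROCKET: first HATCH", 66, 2),
    ("second ROCKET: first CARGO", 58, 3),
    ("second ROCKET: rest of first half", 232, 10),
]

total_tenths = sum(t for _, t, _ in TELEOP_SEGMENTS)
total_points = sum(p for _, _, p in TELEOP_SEGMENTS)

print(f"{total_tenths / 10:.1f}s used, {total_points} pts scored")
```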

End Game

We turn and drive to the HAB PLATFORM (23’3″ at 10ft/s is 2.3s), climb onto the second level (5s), then climb onto the third level (10s). This takes 17.3s (total time 129.9s) and scores us 12pts.

NOTE: While we assume we are playing “by ourselves” in an imagined scenario where both of the other robots are just KOP Chassis bots, we can safely assume in any given match that we will still gain at least 3pts from each of the other bots driving onto the HAB.

This MAX End Game achieves a score of 18 pts and 1 RP.
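
A quick sketch of the End Game budget, using the step times from the text. The reading that the 18 pts is our 12pt climb plus 3pts from each of two partners, and the 15pt threshold used for the HAB Docking RP, are assumptions made for this illustration and should be checked against the game manual.

```python
TELEOP_END_TENTHS = 1350   # 135s of Teleoperation Mode, in tenths of a second
START_TENTHS = 1126        # 112.6s used before heading to the HAB ZONE

# (label, duration in tenths of a second), taken from the text.
END_GAME_STEPS = [
    ("drive to HAB PLATFORM", 23),
    ("climb to Level 2", 50),
    ("climb to Level 3", 100),
]

OWN_CLIMB_PTS = 12              # Level Three HAB Climb Bonus
PARTNER_LEVEL_1_PTS = 3         # each partner driving onto Level One
PARTNERS = 2
HAB_DOCKING_THRESHOLD_PTS = 15  # assumption: climb points needed for the RP

end_game_tenths = sum(t for _, t in END_GAME_STEPS)
finish_tenths = START_TENTHS + end_game_tenths
climb_points = OWN_CLIMB_PTS + PARTNERS * PARTNER_LEVEL_1_PTS
ranking_points = 1 if climb_points >= HAB_DOCKING_THRESHOLD_PTS else 0

print(f"End Game takes {end_game_tenths / 10:.1f}s, finishing at {finish_tenths / 10:.1f}s")
print(f"Fits in the match: {finish_tenths <= TELEOP_END_TENTHS}")
print(f"{climb_points} pts and {ranking_points} RP")
```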

Final Score

  • MAX SANDSTORM Mode: 17 pts
  • MAX TELEOP Mode: 48 pts and 1 RP
  • MAX End Game: 18 pts and 1 RP

MAX TOTAL: 83 pts and 2 RP

Summary

So what good has this exercise done?  By walking through the match as a rational agent, the flow of a match can be better understood.  By picking the optimal actions, necessary robot functions begin to take shape.  By approaching the game systematically, rules and strategic advantages provided by those rules can be uncovered.

This exercise has also provided a baseline to evaluate other game strategies against and to help define robot functions from.

Check out Part 2 – Strategy and Research for more fun!

If you want to learn more about this process, check out our presentation from the 2018 Purdue FIRST Forums on Robot Requirements!

Please feel free to join the conversation on our Facebook or Twitter with your questions, thoughts, and feedback on these articles!
