Ruling StarCraft Game Spitefully -- Exploiting the Blind Spot of AI-Powered Game Bots

Conference:  BlackHat USA 2020



The presentation discusses an attack on master agents in deep reinforcement learning environments, showing how weaknesses in an agent's learned policy and in the game rules can be exploited. The authors propose adversarial training as a defense strategy but note its limitations.
  • Attacking master agents in deep reinforcement learning environments is possible and can exploit vulnerabilities in game rules.
  • The proposed attack trains an adversarial agent to defeat the victim agent in a two-party game scenario.
  • The attack does not tamper with the victim agent's training process; instead, it exploits weaknesses in the victim's learned policy.
  • Adversarial training is a possible defense strategy but may not always succeed.
  • The authors plan to release the code for the attack as open source in the near future.
The authors provide videos of the victim agent playing against the adversarial agent in different games to demonstrate both the effectiveness of the attack and the limitations of the defense strategy. In one game, the victim agent learns to ignore the adversarial agent and head for the finish line; in another, it recognizes the trick played by the adversarial agent and stays put to force a draw. In a third game, however, the victim agent performs worse, and is more likely to fall to the ground and trigger a loss.
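The key idea behind the attack (train an adversary against a frozen victim policy and let it discover the victim's weaknesses) can be illustrated with a toy sketch. This is not the authors' algorithm or released code: the game, the victim's bias, and the simple bandit learner below are all hypothetical stand-ins.

```python
import random

random.seed(0)

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def victim_policy():
    # Frozen victim with a policy weakness: it over-plays "rock".
    return random.choices(ACTIONS, weights=[0.6, 0.2, 0.2])[0]

def train_adversary(episodes=5000, eps=0.1):
    # Epsilon-greedy bandit: track the average reward per action and
    # exploit whichever action best punishes the victim's bias.
    totals = {a: 0.0 for a in ACTIONS}
    counts = {a: 0 for a in ACTIONS}
    for _ in range(episodes):
        if random.random() < eps:
            a = random.choice(ACTIONS)  # explore
        else:                           # exploit best average so far
            a = max(ACTIONS, key=lambda x: totals[x] / counts[x] if counts[x] else 0.0)
        v = victim_policy()
        reward = 1.0 if BEATS[a] == v else (-1.0 if BEATS[v] == a else 0.0)
        totals[a] += reward
        counts[a] += 1
    return max(ACTIONS, key=lambda x: totals[x] / max(counts[x], 1))

best = train_adversary()
print(best)
```

The adversary never modifies the victim; it simply plays against the fixed policy until it finds the response (here, "paper") that exploits the victim's bias, mirroring how the attack exploits a master bot's policy weaknesses rather than its training.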


With recent breakthroughs of deep neural networks in problems like computer vision, machine translation, and time series prediction, we have witnessed great advances in the area of reinforcement learning. By integrating deep neural networks into reinforcement learning algorithms, the machine learning community has designed various deep reinforcement learning algorithms and demonstrated their success in a variety of games, ranging from defeating world champions of Go to mastering the most challenging real-time strategy game -- StarCraft. Unlike conventional deep learning, deep reinforcement learning refers to goal-oriented algorithms, through which one can train an agent to attain a complex objective (e.g., in StarCraft, balancing big-picture management of the economy while handling low-level control of individual worker units). Like a kid incentivized by spankings and candy, a reinforcement learning algorithm penalizes a game agent when it takes the wrong actions and rewards it when it takes the right ones.

In light of the success of many reinforcement-learning-powered games, we recently devoted our energies to investigating the security risks of reinforcement learning algorithms in the context of video games. More specifically, we explore how to design an effective learning algorithm to train an adversarial agent (in other words, an adversarial bot), which can automatically discover and exploit the weaknesses of master game bots driven by a reinforcement learning algorithm. In this talk, we will introduce how we design and develop such a learning algorithm. Then, we will demonstrate how we use this algorithm to train an adversarial agent to beat a world-class AI bot in one of the longest-played video games -- StarCraft. In addition to StarCraft, we explore the effectiveness of our adversarial learning algorithm in the context of other games powered by AI, such as Roboschool's Pong and MuJoCo's games.
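The reward-and-penalty loop described above can be sketched with a minimal tabular Q-learning example. The corridor environment, reward values, and hyperparameters below are illustrative assumptions, not details from the talk; they just show how penalizing wrong moves and rewarding right ones shapes a policy.

```python
import random

random.seed(1)

N = 5               # corridor cells 0..4; the goal is cell 4
ACTIONS = [-1, 1]   # move left or move right
alpha, gamma, eps = 0.5, 0.9, 0.2

# Q-table: expected return for taking action a in state s.
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

for _ in range(2000):  # training episodes
    s = 0
    while s != N - 1:
        # Epsilon-greedy action selection.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2 = min(max(s + a, 0), N - 1)
        # Reward at the goal, small penalty for every other step.
        r = 1.0 if s2 == N - 1 else -0.01
        future = max(Q[(s2, b)] for b in ACTIONS) * (s2 != N - 1)
        Q[(s, a)] += alpha * (r + gamma * future - Q[(s, a)])
        s = s2

# Greedy policy in the non-goal cells: the agent learns to always move right.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N - 1)]
print(policy)
```

Even in this tiny setting, the agent converges on the goal-directed behavior purely from penalties and rewards; deep reinforcement learning replaces the Q-table with a neural network so the same loop scales to games like StarCraft.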
Along with the talk, we will publicly release our code and a variety of adversarial AI bots. By using our code, researchers and white-hat hackers can train their own adversarial agents to master many -- if not all -- multi-party video games. To help the BlackHat technical board assess our work, we release some demo videos at https://tinyurl.com/ugun2m3, showing how our adversarial agents play against world-class AI bots.