Abstract:
Recently, deep reinforcement learning (DRL) has achieved remarkable success in many practical sequential decision problems, such as Go, chess, and real-time strategy games. However, before we deploy DRL policies in real-world applications, a question must be asked: are these learned policies safe and robust enough to deploy? Unfortunately, many works have shown that DRL can be susceptible to adversarial attacks: an agent's performance can degrade sharply when the environment or its observations change only slightly. Therefore, in order to safely deploy DRL policies in safety-critical real-world tasks, we must find ways to defend against these adversarial attacks.
Based on the basic components of a (PO)MDP, these adversarial attacks can be divided into four types: attacks on the state/observation, the policy, the environment, and the reward:
- Attack on the state/observation: the attacker adds a small perturbation to the agent's input, analogous to adversarial examples in computer vision (see the sketch after this list).
- Attack on the policy: the attacker perturbs the agent's policy, either continuously or only occasionally.
- Attack on the environment: the attacker changes the environment's dynamics (transition function).
- Attack on the reward: the attacker alters the reward signal the agent receives.
The first three kinds of attack can be applied at both training and test time, while the last kind applies only at training time.
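As a concrete illustration of the first category, below is a minimal sketch of an FGSM-style observation attack on a policy network. It assumes a differentiable PyTorch policy (here called `policy_net`) that maps observations to action logits; the function name, the cross-entropy objective, and the single-step perturbation are illustrative assumptions rather than a specific method from the surveyed works.

```python
import torch
import torch.nn.functional as F

def fgsm_observation_attack(policy_net, obs, epsilon=0.01):
    """FGSM-style perturbation of a single observation (illustrative sketch).

    policy_net: a torch.nn.Module mapping an observation tensor to action logits
                (hypothetical; any differentiable policy works).
    obs:        observation tensor of shape (obs_dim,) or (1, obs_dim).
    epsilon:    L-infinity budget of the perturbation.
    """
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy_net(obs)
    # Treat the currently preferred action as the "label" and move the
    # observation in the direction that decreases its probability.
    target_action = logits.argmax(dim=-1)
    loss = F.cross_entropy(
        logits.unsqueeze(0) if logits.dim() == 1 else logits,
        target_action.view(-1),
    )
    loss.backward()
    # One signed-gradient ascent step on the loss, bounded by epsilon.
    adv_obs = obs + epsilon * obs.grad.sign()
    return adv_obs.detach()
```

At test time the perturbed observation `adv_obs` would be fed to the agent in place of the clean one; defenses such as adversarial training reuse the same construction during training.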
In this talk, we will present a survey of adversarial attacks in DRL organized by this classification, together with robust DRL methods that defend against these attacks. We will also discuss open issues and challenges in robust DRL.