MADDPG_simpletag

A MADDPG implementation for solving OpenAI's 'simple_tag' environment.
Three predators (the default) chase one prey. A catch earns a reward of 10, plus a shaped reward based on the distance between predators and prey. The three predators choose their actions with the MADDPG algorithm, while the prey acts randomly, sampling each action uniformly from -1 to 1.
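
The prey's random behaviour can be sketched as below. This is a minimal illustration, not the code from this repository: the function name `random_prey_action` and the two-dimensional action size are assumptions made for the example.

```python
import random


def random_prey_action(action_dim=2, low=-1.0, high=1.0, rng=random):
    """Sample every action component independently and uniformly from [low, high].

    action_dim=2 is an assumption (one value per movement axis); it is not
    taken from the repository.
    """
    return [rng.uniform(low, high) for _ in range(action_dim)]


# One random action for the prey at a single time step.
action = random_prey_action()
```

Because the prey ignores its observation, only the predators learn in this setup.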

Dependency

pytorch==1.0.1
tensorboardX
Use the environment provided in envs, or:
- Install OpenAI's environment and edit the following code

# environment.py L29

# self.discrete_action_space = True
self.discrete_action_space = False

# simple_tag.py L92
def agent_reward(self, agent, world):
    # Agents are negatively rewarded if caught by adversaries
    rew = 0
    # shape = False
    shape = True

# simple_tag.py L118
def adversary_reward(self, agent, world):
    # Adversaries are rewarded for collisions with agents
    rew = 0
    # shape = False
    shape = True
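
To make the effect of `shape = True` concrete, here is a rough, self-contained sketch of a shaped predator reward. It is not the code from simple_tag.py: the helper names, the `catch_radius` parameter and the 0.1 scale on the distance term are assumptions for illustration. Only the catch reward of 10 comes from the description above.

```python
import math

COLLISION_REWARD = 10.0  # reward for a catch, as stated in this README
SHAPING_SCALE = 0.1      # assumption: weight of the distance-based term


def distance(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def predator_reward(predators, prey, catch_radius, shape=True):
    """Reward for the predator team (hypothetical helper).

    predators: list of (x, y) positions, prey: one (x, y) position.
    """
    rew = 0.0
    if shape:
        # Shaped term: penalise the distance from the closest predator to the prey.
        rew -= SHAPING_SCALE * min(distance(p, prey) for p in predators)
    # Sparse term: each predator within catch_radius counts as one catch.
    catches = sum(distance(p, prey) < catch_radius for p in predators)
    rew += COLLISION_REWARD * catches
    return rew
```

With `shape=False` the predators only receive a signal when they actually catch the prey, which makes early learning much harder. The distance term gives them feedback on every step.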

Getting started

Train

python train.py --tensorboard
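
How train.py handles the `--tensorboard` flag is not shown in this README. As a hedged sketch, a boolean flag like this is commonly wired up with `argparse` and a tensorboardX `SummaryWriter`. The function names `parse_args` and `make_writer` and the `runs` log directory are assumptions, not names from this repository.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options (hypothetical; train.py may differ)."""
    parser = argparse.ArgumentParser(description="MADDPG training (sketch)")
    parser.add_argument(
        "--tensorboard",
        action="store_true",
        help="log training curves with tensorboardX",
    )
    return parser.parse_args(argv)


def make_writer(args, log_dir="runs"):
    """Return a SummaryWriter when logging is enabled, otherwise None."""
    if not args.tensorboard:
        return None
    # Imported lazily so the script still runs without tensorboardX installed.
    from tensorboardX import SummaryWriter

    return SummaryWriter(log_dir)
```

In a training loop, a writer created this way could record a value per episode with `writer.add_scalar("reward", value, episode)`, and the curves could then be viewed by pointing TensorBoard at the log directory.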

Result

Acknowledgement

shariqiqbal2810's PyTorch implementation (the main inspiration)
Kostrikov's PyTorch implementation

