
Diversity is all you need

⬅️ [Evolution Strategies at the Hyperscale](<./Evolution Strategies at the Hyperscale.md>) | ⬆️ [Reading List](<./README.md>) | [Decision Tree Policy Optimization](<./Decision Tree Policy Optimization.md>) ➡️

diversity_is_all_you_need.pdf

The point of the method is to maximize the mutual information between states and skills.

Lol what does this even mean?

I already know that training a discriminator to predict the skill maximizes a lower bound on the mutual information between the skill and some aspect of the trajectory.
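To make the link between the discriminator and mutual information concrete, here is a sketch of the standard variational bound. The symbols are my own notation, not taken from these notes: $Z$ is the skill, $S$ the state, $p(z)$ the fixed skill prior, and $q_\phi(z \mid s)$ the learned discriminator.

$$
\begin{aligned}
I(S; Z) &= H(Z) - H(Z \mid S) \\
        &= H(Z) + \mathbb{E}_{z \sim p(z),\, s \sim \pi(z)}\left[\log p(z \mid s)\right] \\
        &\ge H(Z) + \mathbb{E}_{z \sim p(z),\, s \sim \pi(z)}\left[\log q_\phi(z \mid s)\right]
\end{aligned}
$$

The last step holds because the KL divergence between the true posterior $p(z \mid s)$ and $q_\phi(z \mid s)$ is non-negative. So pushing up the discriminator's log-likelihood tightens the bound, and rewarding the policy with $\log q_\phi(z \mid s)$ raises it. Read this way, "maximize mutual information between states and skills" roughly means: make the visited states give away which skill is running.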

Use maximum entropy policies to force skill diversity.

At the start of each episode, sample a skill. The skill is fed to the policy, which executes it, and it also serves as the label in the loss of the discriminator, which predicts the skill from the trajectory.
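The loop described above can be sketched roughly as follows. This is a toy illustration under loud assumptions, not the paper's implementation: `LinearDiscriminator`, `intrinsic_reward`, and `run_episode` are hypothetical names, the skill prior is assumed uniform, the pseudo-reward is assumed to be $\log q(z \mid s) - \log p(z)$, and the policy plus environment are replaced by a stand-in that emits noisy states centered on the skill's one-hot vector. The maximum entropy policy update is left out entirely.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


class LinearDiscriminator:
    """Hypothetical linear softmax discriminator q(z | s)."""

    def __init__(self, state_dim, num_skills, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((state_dim, num_skills))
        self.lr = lr

    def log_probs(self, states):
        # states: (batch, state_dim) -> log q(z | s): (batch, num_skills)
        return log_softmax(states @ self.W)

    def update(self, states, skills):
        # One gradient step on cross-entropy, with the sampled skill as the label.
        probs = np.exp(self.log_probs(states))
        onehot = np.eye(self.W.shape[1])[skills]
        grad = states.T @ (probs - onehot) / len(states)
        self.W -= self.lr * grad


def intrinsic_reward(disc, state, skill, num_skills):
    # Assumed pseudo-reward: log q(z | s) - log p(z), with uniform p(z) = 1 / num_skills.
    return disc.log_probs(state[None, :])[0, skill] + np.log(num_skills)


def run_episode(disc, num_skills, rng, horizon=20):
    skill = rng.integers(num_skills)  # sampled once, at the start of the episode
    states, rewards = [], []
    for _ in range(horizon):
        # Stand-in for policy + environment (assumption): a real agent would
        # condition its policy on `skill` and step an actual environment here.
        state = np.eye(num_skills)[skill] + 0.1 * rng.standard_normal(num_skills)
        rewards.append(intrinsic_reward(disc, state, skill, num_skills))
        states.append(state)
    return skill, np.array(states), np.array(rewards)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    disc = LinearDiscriminator(state_dim=4, num_skills=4)
    for _ in range(200):
        skill, states, rewards = run_episode(disc, 4, rng)
        # The same sampled skill labels every state of the episode.
        disc.update(states, np.full(len(states), skill))
    print("mean pseudo-reward after training:", rewards.mean())
```

In a real setup the rewards returned by `run_episode` would be handed to the RL algorithm training the skill-conditioned policy, so policy and discriminator improve together. Here only the discriminator learns, which is enough to see the pseudo-reward rise as skills become distinguishable.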

