Contributor
Thanks for the implementations!
feynmanix
pushed a commit
that referenced
this pull request
Nov 29, 2022
61e6cea to
2ab0780
Compare
9426e0b to
1ec229c
Compare
1ec229c to
d23d98b
Compare
d23d98b to
73b1e36
Compare
AdamGleave
reviewed
Dec 7, 2022
77c85db to
a3369d4
Compare
…ardNets can be injected from the outside
efc5ae0 to
a0bacca
Compare
a0bacca to
74ba96b
Compare
Contributor
@AdamGleave: replying to your comments together here:
OK, it required a larger refactor, but you can see how it looks in the last couple of commits. A nice side effect is that this change also addresses your other comment. It simplifies the entropy reward classes (the entropy reward is now separate from the switch away from the pre-training reward) and allows for more configurability, at the expense of making the wiring in train_preference_comparison.py a little more complicated. It also results in two changes internally:
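For readers following the refactor discussion, here is a minimal sketch of the general idea of keeping the entropy reward separate from the logic that switches away from it after pre-training. All names here (SwitchableReward, finish_pretraining, the two callables) are hypothetical and chosen for illustration; they are not the classes or signatures introduced by this pull request.

```python
from typing import Callable, Sequence

# Hypothetical type alias: a reward function maps a state to a scalar reward.
RewardFn = Callable[[Sequence[float]], float]


class SwitchableReward:
    """Delegates to a pre-training reward until told to switch (illustrative only)."""

    def __init__(self, pretraining_fn: RewardFn, learned_fn: RewardFn) -> None:
        # Both reward functions are injected from the outside, so each one
        # can be configured and tested independently of the switch.
        self.pretraining_fn = pretraining_fn
        self.learned_fn = learned_fn
        self.pretraining = True

    def finish_pretraining(self) -> None:
        # After this call, every reward comes from the learned reward function.
        self.pretraining = False

    def __call__(self, state: Sequence[float]) -> float:
        fn = self.pretraining_fn if self.pretraining else self.learned_fn
        return fn(state)
```

In a design like this, the switch knows nothing about how entropy is computed, which is one way to get the extra configurability mentioned above at the cost of more wiring where the objects are constructed.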
7434ee6 to
4fd0758
Compare
4fd0758 to
b344cbd
Compare
Description
Creates an entropy reward replay wrapper to support the unsupervised, state-entropy-based pre-training of an agent, as described in the PEBBLE paper.
https://sites.google.com/view/icml21pebble
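As a rough illustration of what a state-entropy-based reward can look like, the sketch below rewards a state by the log of its distance to its k-th nearest neighbour among previously seen states, which is the kind of particle-based entropy estimate PEBBLE builds on. It is a standalone, pure-Python sketch and not the code from this pull request: the function name, the plain list used as a buffer, and the +1 inside the logarithm (added here to avoid log(0)) are all assumptions, and any reward normalization is omitted.

```python
import math
from typing import Sequence


def knn_entropy_reward(
    state: Sequence[float],
    buffer: Sequence[Sequence[float]],
    k: int = 5,
) -> float:
    """Return log(1 + distance from `state` to its k-th nearest neighbour in `buffer`)."""
    if not buffer:
        raise ValueError("buffer must contain at least one state")
    if k < 1:
        raise ValueError("k must be at least 1")
    # Euclidean distance from the query state to every stored state, ascending.
    distances = sorted(math.dist(state, other) for other in buffer)
    # Fall back to the farthest stored state when the buffer holds fewer than k states.
    kth_distance = distances[min(k, len(distances)) - 1]
    # Assumption: the +1 keeps the reward finite when the distance is zero.
    return math.log(kth_distance + 1.0)


# States far from everything seen so far receive a larger reward,
# which pushes the agent towards unvisited regions during pre-training.
seen = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
near = knn_entropy_reward((0.0, 0.0), seen, k=2)
far = knn_entropy_reward((10.0, 10.0), seen, k=2)
```

This brute-force version sorts all distances for each query, so it only suits small buffers; a batched or tree-based nearest-neighbour search would be the natural replacement at scale.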
Testing
Added unit tests.