Add IRIS #30883
base: main
Conversation
merge:upstream current main
…eshaping func for HF
Hi @RUFFY-369, thanks for opening this PR! This is a mammoth piece of work.
I've just done a very high-level pass and more review rounds would be needed. At the moment, the structure of the modeling file is very far away from the standard library patterns. In particular, there's a lot of logic for things which should be handled elsewhere, e.g. agents, environments, downloading files.
I think the best and easiest way to make this model available is by adding the model directly on the hub. I would refer to the decision transformer to see what classes should be added and how to add them into the library to make them transformers-compatible.
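For readers following along, here is a rough sketch of the kind of structure being asked for, loosely mirroring the decision transformer layout. All class names and fields below (IrisConfig, IrisModel, etc.) are illustrative assumptions, not the final API of this PR:

```python
# Illustrative sketch only -- names and fields are assumptions, not the final API.
from dataclasses import dataclass
from typing import Optional

import torch
from transformers import PretrainedConfig, PreTrainedModel
from transformers.utils import ModelOutput


class IrisConfig(PretrainedConfig):
    model_type = "iris"

    def __init__(self, vocab_size=512, embed_dim=256, num_layers=10, num_heads=4, **kwargs):
        self.vocab_size = vocab_size
        self.embed_dim = embed_dim
        self.num_layers = num_layers
        self.num_heads = num_heads
        super().__init__(**kwargs)


@dataclass
class IrisOutput(ModelOutput):
    # Hypothetical output fields; the real model would expose its own heads.
    logits: Optional[torch.FloatTensor] = None
    hidden_states: Optional[tuple] = None


class IrisPreTrainedModel(PreTrainedModel):
    config_class = IrisConfig
    base_model_prefix = "iris"


class IrisModel(IrisPreTrainedModel):
    def __init__(self, config: IrisConfig):
        super().__init__(config)
        # Submodules (tokenizer/VQ-VAE, world model, actor-critic) would be built here
        # as plain nn.Module components, without agent/environment/download logic.
        self.post_init()
```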
```python
IRIS_PRETRAINED_MODEL_ARCHIVE_LIST = [
    "ruffy369/iris-breakout",
    # See all Iris models at https://huggingface.co/models?filter=iris
]
```
Model archive lists have been deprecated
Suggested change: delete this block entirely.
removed and committed
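For context, with the archive list gone, a checkpoint is simply referenced by its hub id when loading; a minimal sketch (IrisModel is the hypothetical class from the sketch above, and the repo id is the one listed in this PR):

```python
# Sketch only: with archive lists removed, checkpoints are referenced directly by hub id.
# IrisModel is the (hypothetical) class this PR adds; "ruffy369/iris-breakout" is the repo above.
model = IrisModel.from_pretrained("ruffy369/iris-breakout")
```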
```python
    tokens: torch.LongTensor


class Slicer(nn.Module):
```
All model-specific submodules and layers should have the model prefix, and the prefix should be camel-case.
Suggested change:
```python
class IrisSlicer(nn.Module):
```
working on it asap 👍
```python
        return (y, att)


class WorldModelEnv:
```
Definition of the environment is outside the scope of the modeling file - this should just be the model
Okay, will make the required changes, thank you for pointing that out 💯 Also, this class is not a gym env but a simulation of the env inside the world model: the world model is used to train the actor-critic component in imagination, without an actual env. The `self.env` was only there for the `reset()` function in the original code and it is not used in the modeling file, so I will modify the code to use just the `IrisWorldModel` without any outer dependency on any kind of env.
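To make the "training in imagination" idea concrete, the loop roughly looks like the sketch below. This is a simplified illustration with hypothetical method names (`act`, `predict_step`), not the PR's actual code:

```python
# Simplified sketch of training in imagination: the world model alone plays the role
# of the environment. All method names here are hypothetical placeholders.
def imagine_rollout(world_model, actor_critic, initial_obs_tokens, horizon: int):
    obs_tokens = initial_obs_tokens
    trajectory = []
    for _ in range(horizon):
        # The actor picks an action from the imagined observation.
        action = actor_critic.act(obs_tokens)
        # The world model predicts the next observation tokens, reward and done flag,
        # replacing env.step(); no gym environment (and no self.env) is involved.
        obs_tokens, reward, done = world_model.predict_step(obs_tokens, action)
        trajectory.append((obs_tokens, action, reward, done))
    return trajectory
```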
```python
        self.register_buffer("mask", causal_mask if config.attention == "causal" else block_causal_mask)

    def forward(self, x: torch.Tensor, kv_cache: Optional[KVCache] = None) -> torch.Tensor:
        B, T, C = x.size()
```
No single-letter variable names - they should all be explicit, e.g. batch_size
will do that just in a while 👍
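For instance, the unpacking above would become something like this (a small sketch, assuming the same `[batch, sequence, channels]` tensor layout):

```python
# Sketch: explicit names instead of B, T, C (assuming x has a [batch, seq, channels] layout).
batch_size, sequence_length, embed_dim = x.size()
```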
```python
        self._size += x.size(2)


class KVCache:
```
Cache definition is outside the scope of the modeling file - the model should use the library's cache
alright, will change that, thanks for mentioning 👍
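For reference, a minimal sketch of leaning on the library's `DynamicCache` instead of a model-specific `KVCache` class; the tensor shapes and `layer_idx` below are placeholders for what an attention layer would pass in:

```python
# Sketch: using transformers' built-in cache instead of a custom KVCache class.
import torch
from transformers import DynamicCache

past_key_values = DynamicCache()

# Inside an attention layer's forward (key/value tensors and layer_idx are placeholders):
key_states = torch.randn(1, 4, 10, 64)    # [batch, heads, seq, head_dim]
value_states = torch.randn(1, 4, 10, 64)
layer_idx = 0
key_states, value_states = past_key_values.update(key_states, value_states, layer_idx)
```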
```python
def nonlinearity(x: torch.Tensor) -> torch.Tensor:
    # swish
```
You can use the swish activation already defined in the library
oh! sure, thanks for mentioning 👍
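For reference, the library exposes this through `ACT2FN`, so the free `nonlinearity` function can be dropped; a minimal sketch:

```python
# Sketch: reuse the library's activation registry instead of defining nonlinearity().
from transformers.activations import ACT2FN

swish = ACT2FN["silu"]  # SiLU == x * sigmoid(x), i.e. swish
# hidden = swish(hidden)
```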
```python
class ScalingLayer(nn.Module):
    def __init__(self) -> None:
        super(ScalingLayer, self).__init__()
```
Suggested change:
```python
super().__init__()
```
"""A single linear layer which does a 1x1 conv""" | ||
|
||
def __init__(self, chn_in: int, chn_out: int = 1, use_dropout: bool = False) -> None: | ||
super(NetLinLayer, self).__init__() |
Suggested change:
```python
super().__init__()
```
```python
        layers = (
            [
                nn.Dropout(),
            ]
            if (use_dropout)
            else []
        )
        layers += [
            nn.Conv2d(chn_in, chn_out, 1, stride=1, padding=0, bias=False),
        ]
        self.model = nn.Sequential(*layers)
```
Suggested change:
```python
layers = [nn.Dropout()] if (use_dropout) else []
layers += [nn.Conv2d(chn_in, chn_out, 1, stride=1, padding=0, bias=False)]
self.model = nn.Sequential(*layers)
```
make style
did it in the correction
```python
@dataclass
class TransformerConfig:
    tokens_per_block: int
    max_blocks: int
    attention: str

    num_layers: int
    num_heads: int
    embed_dim: int

    embed_pdrop: float
    resid_pdrop: float
    attn_pdrop: float

    @property
    def max_tokens(self):
        return self.tokens_per_block * self.max_blocks
```
Suggested change: remove this `TransformerConfig` dataclass entirely.
removed and committed
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
…s for vgg16 lpips
…missed hidden state layer
Hi @amyeroberts, I have made the suggested changes and done some refactoring in the modeling file to improve compatibility. Please review the updated files at your convenience. Thank you! This is a huge piece of work, and it represents only the third model that integrates Reinforcement Learning (RL) with transformers. If this model is successfully ported, it will significantly benefit the transformers + RL community. It is SOTA in sample-efficient RL among methods without lookahead search on the Atari 100k benchmark, so its successful integration will provide considerable value for fine-tuning it or training it from scratch on various tasks with Transformers. Following the successful porting, I will create and hyperlink a Colab notebook with a detailed guide on using the model with transformers, including training it from scratch, and I will be more than happy to maintain the model here. 👍
How did you convert the model file into the Hugging Face format? Usually, transformers requires a conversion script as well. https://github.com/eloialonso/iris_pretrained_models/tree/main/pretrained_models Maybe uploading a few more models would be good for other people :)
@SangbumChoi Thank you for your comment. Basically I converted the initial model weights file of
I just pushed all the converted models to the hub. You can check them out here
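For anyone curious, such a conversion typically boils down to loading the original checkpoint, remapping state-dict keys to the new module names, and saving with `save_pretrained`. A very rough sketch follows; the key mapping, paths, and `IrisConfig`/`IrisModel` classes are placeholders, not the actual conversion used here:

```python
# Rough sketch of a typical weight-conversion flow; key names and paths are placeholders.
import torch

def convert_iris_checkpoint(original_ckpt_path: str, output_dir: str):
    original_state_dict = torch.load(original_ckpt_path, map_location="cpu")

    # Rename original keys to the names used by the transformers implementation.
    converted_state_dict = {}
    for old_key, tensor in original_state_dict.items():
        new_key = old_key  # placeholder: the real mapping depends on the final module names
        converted_state_dict[new_key] = tensor

    config = IrisConfig()                      # hypothetical config class from this PR
    model = IrisModel(config)                  # hypothetical model class from this PR
    model.load_state_dict(converted_state_dict, strict=False)
    model.save_pretrained(output_dir)          # can then be pushed with push_to_hub()
```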
soft cc @amyeroberts, thank you
What does this PR do?
This PR adds Iris, a reinforcement learning agent for sample-efficient RL.
Fixes #30882
Who can review?
@amyeroberts @younesbelkada @NielsRogge @kashif
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
Ready for review!!!