Add IRIS #30883
base: main
Conversation
merge:upstream current main
…eshaping func for HF
Hi @RUFFY-369, thanks for opening this PR! This is a mammoth piece of work.
I've just done a very high-level pass and more review rounds would be needed. At the moment, the structure of the modeling file is very far away from the standard library patterns. In particular, there's a lot of logic for things which should be handled elsewhere, e.g. agents, environments, downloading files.
I think the best and easiest way to make this model available is by adding the model directly on the hub. I would refer to the decision transformer to see what classes should be added and how to add them into the library to make them transformers-compatible.
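For readers following along, here is a rough sketch of the kind of structure being asked for, loosely mirroring the decision transformer layout. All class names and fields below (IrisConfig, IrisModel, etc.) are illustrative assumptions, not the final API of this PR:

```python
# Illustrative sketch only -- names and fields are assumptions, not the final API.
from dataclasses import dataclass
from typing import Optional

import torch
from transformers import PretrainedConfig, PreTrainedModel
from transformers.utils import ModelOutput


class IrisConfig(PretrainedConfig):
    model_type = "iris"

    def __init__(self, vocab_size=512, embed_dim=256, num_layers=10, num_heads=4, **kwargs):
        self.vocab_size = vocab_size
        self.embed_dim = embed_dim
        self.num_layers = num_layers
        self.num_heads = num_heads
        super().__init__(**kwargs)


@dataclass
class IrisOutput(ModelOutput):
    # Hypothetical output fields; the real model would expose its own heads.
    logits: Optional[torch.FloatTensor] = None
    hidden_states: Optional[tuple] = None


class IrisPreTrainedModel(PreTrainedModel):
    config_class = IrisConfig
    base_model_prefix = "iris"


class IrisModel(IrisPreTrainedModel):
    def __init__(self, config: IrisConfig):
        super().__init__(config)
        # Submodules (tokenizer/VQ-VAE, world model, actor-critic) would be built here
        # as plain nn.Module components, without agent/environment/download logic.
        self.post_init()
```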
```python
IRIS_PRETRAINED_MODEL_ARCHIVE_LIST = [
    "ruffy369/iris-breakout",
    # See all Iris models at https://huggingface.co/models?filter=iris
]
```
Model archive lists have been deprecated
Suggested change: delete this block entirely.
removed and committed
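For context, with the archive list gone, a checkpoint is simply referenced by its hub id when loading; a minimal sketch (IrisModel is the hypothetical class from the sketch above, and the repo id is the one listed in this PR):

```python
# Sketch only: with archive lists removed, checkpoints are referenced directly by hub id.
# IrisModel is the (hypothetical) class this PR adds; "ruffy369/iris-breakout" is the repo above.
model = IrisModel.from_pretrained("ruffy369/iris-breakout")
```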
```python
    tokens: torch.LongTensor


class Slicer(nn.Module):
```
All model-specific submodules and layers should have the model prefix, and the prefix should be camel-case.
Suggested change:
```python
class IrisSlicer(nn.Module):
```
working on it asap 👍
```python
        return (y, att)


class WorldModelEnv:
```
Definition of the environment is outside the scope of the modeling file - this should just be the model
Okay, will make the required changes, thank you for pointing that out 💯 Also, this class is not a gym env but a simulation of the env inside the world model: the world model is used to train the actor-critic component in imagination, without an actual env. The `self.env` was only there for the `reset()` function in the original code and it is not used in the modeling file, so I will modify the code to use just the `IrisWorldModel` without any outer dependency on any kind of env.
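To make the "training in imagination" idea concrete, the loop roughly looks like the sketch below. This is a simplified illustration with hypothetical method names (`act`, `predict_step`), not the PR's actual code:

```python
# Simplified sketch of training in imagination: the world model alone plays the role
# of the environment. All method names here are hypothetical placeholders.
def imagine_rollout(world_model, actor_critic, initial_obs_tokens, horizon: int):
    obs_tokens = initial_obs_tokens
    trajectory = []
    for _ in range(horizon):
        # The actor picks an action from the imagined observation.
        action = actor_critic.act(obs_tokens)
        # The world model predicts the next observation tokens, reward and done flag,
        # replacing env.step(); no gym environment (and no self.env) is involved.
        obs_tokens, reward, done = world_model.predict_step(obs_tokens, action)
        trajectory.append((obs_tokens, action, reward, done))
    return trajectory
```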
```python
        self.register_buffer("mask", causal_mask if config.attention == "causal" else block_causal_mask)

    def forward(self, x: torch.Tensor, kv_cache: Optional[KVCache] = None) -> torch.Tensor:
        B, T, C = x.size()
```
No single-letter variable names - they should all be explicit, e.g. batch_size
will do that just in a while 👍
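For instance, the unpacking above would become something like this (a small sketch, assuming the same `[batch, sequence, channels]` tensor layout):

```python
# Sketch: explicit names instead of B, T, C (assuming x has a [batch, seq, channels] layout).
batch_size, sequence_length, embed_dim = x.size()
```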
```python
        self._size += x.size(2)


class KVCache:
```
Cache definition is outside the scope of the modeling file - the model should use the library's cache
alright, will change that, thanks for mentioning 👍
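For reference, a minimal sketch of leaning on the library's `DynamicCache` instead of a model-specific `KVCache` class; the tensor shapes and `layer_idx` below are placeholders for what an attention layer would pass in:

```python
# Sketch: using transformers' built-in cache instead of a custom KVCache class.
import torch
from transformers import DynamicCache

past_key_values = DynamicCache()

# Inside an attention layer's forward (key/value tensors and layer_idx are placeholders):
key_states = torch.randn(1, 4, 10, 64)    # [batch, heads, seq, head_dim]
value_states = torch.randn(1, 4, 10, 64)
layer_idx = 0
key_states, value_states = past_key_values.update(key_states, value_states, layer_idx)
```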
```python
def nonlinearity(x: torch.Tensor) -> torch.Tensor:
    # swish
```
You can use the swish activation already defined in the library
oh! sure, thanks for mentioning 👍
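For reference, the library exposes this through `ACT2FN`, so the free `nonlinearity` function can be dropped; a minimal sketch:

```python
# Sketch: reuse the library's activation registry instead of defining nonlinearity().
from transformers.activations import ACT2FN

swish = ACT2FN["silu"]  # SiLU == x * sigmoid(x), i.e. swish
# hidden = swish(hidden)
```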
```python
class ScalingLayer(nn.Module):
    def __init__(self) -> None:
        super(ScalingLayer, self).__init__()
```
Suggested change:
```python
super().__init__()
```
"""A single linear layer which does a 1x1 conv""" | ||
|
||
def __init__(self, chn_in: int, chn_out: int = 1, use_dropout: bool = False) -> None: | ||
super(NetLinLayer, self).__init__() |
Suggested change:
```python
super().__init__()
```
```python
        layers = (
            [
                nn.Dropout(),
            ]
            if (use_dropout)
            else []
        )
        layers += [
            nn.Conv2d(chn_in, chn_out, 1, stride=1, padding=0, bias=False),
        ]
        self.model = nn.Sequential(*layers)
```
Suggested change:
```python
layers = [nn.Dropout()] if (use_dropout) else []
layers += [nn.Conv2d(chn_in, chn_out, 1, stride=1, padding=0, bias=False)]
self.model = nn.Sequential(*layers)
```
make style
did it in the correction
```python
@dataclass
class TransformerConfig:
    tokens_per_block: int
    max_blocks: int
    attention: str

    num_layers: int
    num_heads: int
    embed_dim: int

    embed_pdrop: float
    resid_pdrop: float
    attn_pdrop: float

    @property
    def max_tokens(self):
        return self.tokens_per_block * self.max_blocks
```
Suggested change: remove this `TransformerConfig` dataclass entirely.
removed and committed
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
…s for vgg16 lpips
…missed hidden state layer
Hi @amyeroberts, I have made the suggested changes and done some refactoring in the modeling file to improve compatibility. Please review the updated files at your convenience. Thank you! This is a huge piece of work, and it represents only the third model that integrates Reinforcement Learning (RL) with transformers. If this model is successfully ported, it will significantly benefit the transformers + RL community. It is SOTA in sample-efficient RL among methods without lookahead search on the Atari 100k benchmark, so its successful integration will provide considerable value for fine-tuning it or training it from scratch on various tasks with Transformers. Following the successful porting, I will create and hyperlink a Colab notebook with a detailed guide on using the model with transformers, including training it from scratch, and I will be more than happy to maintain the model here. 👍
How did you convert the model file into the Hugging Face format? Usually, transformers requires a conversion script as well. https://github.com/eloialonso/iris_pretrained_models/tree/main/pretrained_models Maybe uploading a few more models would be good for other people :)
@SangbumChoi Thank you for your comment. Basically I converted the initial model weights file of
I just pushed all the converted models to the hub. You can check them out here
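For anyone curious, such a conversion typically boils down to loading the original checkpoint, remapping state-dict keys to the new module names, and saving with `save_pretrained`. A very rough sketch follows; the key mapping, paths, and `IrisConfig`/`IrisModel` classes are placeholders, not the actual conversion used here:

```python
# Rough sketch of a typical weight-conversion flow; key names and paths are placeholders.
import torch

def convert_iris_checkpoint(original_ckpt_path: str, output_dir: str):
    original_state_dict = torch.load(original_ckpt_path, map_location="cpu")

    # Rename original keys to the names used by the transformers implementation.
    converted_state_dict = {}
    for old_key, tensor in original_state_dict.items():
        new_key = old_key  # placeholder: the real mapping depends on the final module names
        converted_state_dict[new_key] = tensor

    config = IrisConfig()                      # hypothetical config class from this PR
    model = IrisModel(config)                  # hypothetical model class from this PR
    model.load_state_dict(converted_state_dict, strict=False)
    model.save_pretrained(output_dir)          # can then be pushed with push_to_hub()
```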
soft cc @amyeroberts, thank you
What does this PR do?
This PR adds Iris, a reinforcement learning agent for sample-efficient RL.
Fixes #30882
Who can review?
@amyeroberts @younesbelkada @NielsRogge @kashif
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
Ready for review!!!