Add VQ-BeT #166

Draft · wants to merge 74 commits into main

Conversation

@jayLEE0301 commented May 10, 2024

What this does

Add VQ-BeT for PushT env.

How it was tested

  • Added configuration_vqbet.py and modeling_vqbet.py in the vqbet folder.

How to checkout & try? (for the reviewer)

python lerobot/scripts/train.py policy=vqbet env=pusht dataset_repo_id=lerobot/pusht

@alexander-soare (Collaborator) left a comment

@jayLEE0301 thanks so much for being the first to PR a model to LeRobot! The paper for VQ-BeT was a really nice read.

So, for the review: I've left a bunch of comments (many of them nits, but some blockers), and actually decided to stop reviewing partway through. That's because I noticed there are some high-level points I can share here. So instead, please consider these high-level comments as my primary review, and my inline comments as supporting examples.

So, our goal is to make this code highly accessible to the community, meaning it's easy to read and understand, and is easily hackable. A side effect of aiming for these goals is usually that the code is maintainable.

With that overarching goal in mind, here are 3 high-level points:

  1. Consider the VQBeTPolicy class as the only "public" object in the modeling file. Everything else is there for the sole purpose of VQBeTPolicy. This means:

    • Go minimalist. We should drop any kwargs, conditional branching, or other logic that is unused. The other functions and logic should only be as dynamic as needed to serve VQBeTPolicy. Rule of thumb: if it can't be activated via the configuration parameters, it can go.
    • Use the config instead of many kwargs. Most of the other modules can take a config argument and make a self.config (this avoids relisting parameters twice and makes sure there's one source of truth for what the params mean - no need to repeat documentation or type-hinting). See the sketch after this list.
  2. Consolidate code: We want to avoid too much nesting or duplication of code. Consider for example my inline comment about the MLPs. I think it's reasonable to use one class for MLPs (and it can be simpler and shorter than the 3 existing classes now). This is just an example though, there may be more opportunities for consolidation.

  3. Documentation and naming: We want to make sure that everything is well understood by a first-time reader. Wear the hat of someone who has read through your paper once, and enters the code via the VQBeTPolicy class. They should be able to traverse the submodule hierarchy, understanding what everything is as they go. And they should be able to make sense of what's happening in the forward function.

    • Above all, please make sure the VQBeTConfig documentation is solid.
    • Please add docstrings to classes and methods when it wouldn't be obvious what they are in relation to the main policy and paper.
    • Please separate long methods into logical blocks with comments so that one doesn't get lost along the way. (btw: this doesn't mean separating them into smaller functions)
    • Please make sure it's easy to follow what's happening with tensor dimensions. einops is also helpful for that.
    • Favor full words over abbreviations: embd -> embed and try to match the terminology/naming in your paper.
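
For illustration, here is a minimal sketch of the config-over-kwargs pattern from point 1. VQBeTHead and the config fields shown are hypothetical stand-ins, not the PR's actual classes:

from dataclasses import dataclass

import torch.nn as nn


@dataclass
class VQBeTConfig:
    # Field names here are illustrative, not the PR's actual config schema.
    gpt_hidden_dim: int = 512
    action_dim: int = 2


class VQBeTHead(nn.Module):
    """Hypothetical submodule: takes the whole config rather than re-listing kwargs."""

    def __init__(self, config: VQBeTConfig):
        super().__init__()
        self.config = config  # single source of truth for parameter meanings and defaults
        self.proj = nn.Linear(config.gpt_hidden_dim, config.action_dim)

    def forward(self, x):
        return self.proj(x)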

When in doubt, please take inspiration from LeRobot's ACT and TD-MPC (Diffusion Policy is good too but may need a little more work).

@aliberts added the 🧠 Policies (Something policies-related) label on May 12, 2024

@jayLEE0301 (Author)

Thank you for the review!

Following these high-level points, across all parts of this PR we have:

  1. removed as many kwargs as possible, keeping only the configs,
  2. consolidated all the similar functions,
  3. added comments and changed names.

@alexander-soare (Collaborator) left a comment

Checkpoint. I will continue next week.


# queues are populated during rollout of the policy, they contain the n latest observations and actions
self._queues = None

Collaborator

nit: perhaps a call to self.reset() at the bottom of the __init__ would be more appropriate? See

Author

added self.reset() at the bottom of __init__

Collaborator

Any chance we can also drop self._queues = None please? It doesn't hurt from a logic perspective, but it does potentially confuse someone who will wonder why there's a redundant line of code here.
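
A minimal sketch of the suggested pattern (queue keys, maxlen values, and config attribute names are illustrative, not the PR's exact code):

from collections import deque

import torch.nn as nn


class VQBeTPolicy(nn.Module):  # simplified stand-in for the real class
    def __init__(self, config):
        super().__init__()
        self.config = config
        # ... build submodules ...
        self.reset()  # populates self._queues, so no separate `self._queues = None` is needed

    def reset(self):
        """Clear the rollout queues. Called once at the end of __init__ and before each rollout."""
        self._queues = {
            "observation.image": deque(maxlen=self.config.n_obs_steps),
            "observation.state": deque(maxlen=self.config.n_obs_steps),
            "action": deque(maxlen=self.config.n_action_pred_chunk),
        }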

features = self.policy(observation_feature)
historical_act_pred_index = np.arange(0, n_obs_steps) * (self.config.gpt_num_obs_mode+1) + self.config.gpt_num_obs_mode

# only extract the output tokens at the position of action query

Collaborator

Curious: if this is the case, what function do the other action tokens serve, other than to increase compute for a forward pass?

Author

This increases computation, but it can help improve overall learning performance and avoid overfitting (though not always).

You can think of it as similar to how a diffusion policy predicts a longer sequence of actions than the sequence that is actually executed.

Collaborator

Cool! Mind adding that in as a comment (if it's not already mentioned in your paper)?

@jayLEE0301 (Author) commented Jun 7, 2024

I added:
Behavior Transformer (BeT) and VQ-BeT are both sequence-to-sequence prediction models, mapping sequential observations to sequential actions (please refer to section 2.2 of the BeT paper https://arxiv.org/pdf/2206.11251). Thus, they predict the historical action sequence in addition to current and future actions (predicting future actions is optional).
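
To make the quoted indexing above concrete, here is a small worked example (the value of gpt_num_obs_mode is just illustrative):

import numpy as np

n_obs_steps = 5
gpt_num_obs_mode = 2  # observation tokens per timestep (illustrative value)

# each timestep contributes gpt_num_obs_mode observation tokens followed by 1 action token,
# so the action-query position of timestep t is t * (gpt_num_obs_mode + 1) + gpt_num_obs_mode
historical_act_pred_index = np.arange(0, n_obs_steps) * (gpt_num_obs_mode + 1) + gpt_num_obs_mode
print(historical_act_pred_index)  # [ 2  5  8 11 14]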

spatial_softmax_num_keypoints: int = 32
# VQ-VAE
discretize_step: int = 3000
vqvae_groups: int = 2

Collaborator

In some places in the code this is handled statically, meaning that changing this number will break things. Can we please either remove it as a parameter or make sure the code can handle it dynamically?

One example is the cbet_loss.

Author

Made the code handle vqvae_groups dynamically.
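
For illustration only, a minimal sketch of what handling vqvae_groups dynamically could look like in a loss like cbet_loss, looping over RVQ layers instead of hardcoding two of them (shapes and names are illustrative, not the PR's implementation):

import torch
import torch.nn.functional as F


def cbet_loss_sketch(bin_logits: torch.Tensor, target_codes: torch.Tensor) -> torch.Tensor:
    """bin_logits: (batch, vqvae_groups, vqvae_n_embed); target_codes: (batch, vqvae_groups)."""
    n_groups = target_codes.shape[1]
    loss = bin_logits.new_zeros(())
    for layer in range(n_groups):  # works for any number of RVQ layers, not just 2
        loss = loss + F.cross_entropy(bin_logits[:, layer], target_codes[:, layer])
    return loss / n_groups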

Collaborator

I'm still seeing the use of "primary" and "secondary" in the code. For example VQBeTOptimizer.__init__. Am I misunderstanding something?

@alexander-soare (Collaborator) left a comment

Just publishing my responses in a batch. Thanks for resolving these :D

Now moving on with the review.


# VQ-BeT discretizes action using VQ-VAE before training BeT (please refer to section 3.2 in the VQ-BeT paper https://arxiv.org/pdf/2403.03181)
if not self.check_discretized():
    loss, n_different_codes, n_different_combinations = self.vqbet.discretize(self.config.discretize_step, batch['action'])
    return {"loss": loss, "n_different_codes": n_different_codes, "n_different_combinations": n_different_combinations}

Collaborator

Thanks! I think I understand this. Can you let me know if my understanding is correct?

  • n_different_codes: how many of the total possible VQ codes are being used (how many of them have at least one encoder embedding as a nearest neighbor). This can be at most `vqvae_n_embed`.
  • n_different_combinations: how many different code combinations are being used out of all possible combinations. This can be at most `vqvae_n_embed ^ vqvae_groups` (hint: consider the RVQ as a decision tree).

But shouldn't `n_different_codes` max out at `vqvae_n_embed * vqvae_groups`? That's how many codes there are in total. Or are you only referring to the codes of the first RVQ layer?

Btw: I think this is a great metric to track!
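
For reference, a hedged sketch of how such coverage metrics could be computed from RVQ code indices. This version counts codes across all layers (maximum vqvae_n_embed * vqvae_groups); it is an illustration, not the PR's implementation:

import torch


def code_usage_metrics(codes: torch.Tensor) -> tuple[int, int]:
    """codes: (batch, vqvae_groups) integer code indices in [0, vqvae_n_embed)."""
    # a code counts as "used" if at least one sample maps to it in that layer
    n_different_codes = sum(
        int(torch.unique(codes[:, layer]).numel()) for layer in range(codes.shape[1])
    )
    # distinct code tuples across layers (think of the RVQ as a decision tree)
    n_different_combinations = int(torch.unique(codes, dim=0).shape[0])
    return n_different_codes, n_different_combinations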


self.map_to_cbet_preds_bin: outputs probability of each code (for each layer).
The input dimension of `self.map_to_cbet_preds_bin` is same with the output of GPT,
and the output dimension of `self.map_to_cbet_preds_bin` is `self.config.vqvae_groups * self.config.vqvae_n_embed`, where
`self.config.vqvae_groups` is number of RVQ layers, and

Collaborator

nit: Same as an earlier revision above, can we please remove these duplicated explanations of what these variables mean?

Author

I removed the duplicated parts:) Thank you


}
return loss_dict

class VQBeTOptimizer(nn.Module):

Collaborator

Can I please ask that we consolidate and simplify here? Consider where we can get away with using one optimizer instead of many. I count 4 optimizers being initialized here and I'm not sure all of them are needed. I'll let you double check, but I think we might be able to get away with 2 or even just 1 (if you no_grad the quantizer when the discretization is done).

Feel free to let me know if this is not possible. I checked briefly, but not exhaustively.

At a higher level, we have a plan to have some way of the policy code providing the optimizer and scheduler, so I think you have made a good step towards that here. Right now we have train.py handling this logic and that's not nice. Ideally, what I think we want here is one method in the top-level policy class, make_optimizer, which handles everything. That way train.py can just call make_optimizer without having to know which specific policy it is. Here, this would mean taking Karpathy's configure-optimizers logic and consolidating it into that same make_optimizer method. I don't think we want the optimizer creation distributed throughout various modules of the file.

Happy to get your input on all these thoughts.
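
To sketch the make_optimizer idea (the method name, the bias-based parameter grouping, and the stand-in class are illustrative, not an existing LeRobot API):

import torch
import torch.nn as nn


class VQBeTPolicy(nn.Module):
    """Simplified stand-in: only the optimizer-construction method is sketched."""

    def make_optimizer(self, cfg) -> torch.optim.Optimizer:
        # group parameters by whether they should receive weight decay (illustrative rule)
        decay, no_decay = [], []
        for name, param in self.named_parameters():
            (no_decay if name.endswith("bias") else decay).append(param)
        optim_groups = [
            {"params": decay, "weight_decay": cfg.training.adam_weight_decay},
            {"params": no_decay, "weight_decay": 0.0},
        ]
        return torch.optim.Adam(optim_groups, lr=cfg.training.lr, betas=cfg.training.adam_betas)


# train.py would then only need something like:
# optimizer = policy.make_optimizer(cfg)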

@jayLEE0301 (Author) commented Jun 8, 2024

Thank you for the suggestions!
I removed all the redundant optimizers and merged the optimizers for phase 1 and phase 2 into one, leaving only one optimizer. I also deleted def step and def zero_grad (and put all the parameters for phase 2 in the same scheduler).

We haven't done much analysis on how this affects the stability of training at this time, but (after running two seeds) we have found that it can produce similar performance to the uploaded model (https://huggingface.co/JayLee131/vqbet_pusht) based on the best checkpoint.

Perhaps a more diverse hyperparameter search may be needed in the future.

class VQBeTOptimizer(torch.optim.Adam):
    def __init__(self, policy, cfg):
        vqvae_params = (
            list(policy.vqbet.action_head.vqvae_model.encoder.parameters())
            + list(policy.vqbet.action_head.vqvae_model.decoder.parameters())
            + list(policy.vqbet.action_head.vqvae_model.vq_layer.parameters())
        )
        decay_params, no_decay_params = policy.vqbet.policy.configure_parameters()
        decay_params = (
            decay_params
            + list(policy.vqbet.rgb_encoder.parameters())
            + list(policy.vqbet.state_projector.parameters())
            + list(policy.vqbet.rgb_feature_projector.parameters())
            + [policy.vqbet._action_token]
            + list(policy.vqbet.action_head.map_to_cbet_preds_offset.parameters())
        )

        if cfg.policy.sequentially_select:
            decay_params = (
                decay_params
                + list(policy.vqbet.action_head.map_to_cbet_preds_primary_bin.parameters())
                + list(policy.vqbet.action_head.map_to_cbet_preds_secondary_bin.parameters())
            )
        else:
            decay_params = (
                decay_params
                + list(policy.vqbet.action_head.map_to_cbet_preds_bin.parameters())
            )

        optim_groups = [
            {
                "params": decay_params,
                "weight_decay": cfg.training.adam_weight_decay,
                "lr": cfg.training.lr,
            },
            {
                "params": vqvae_params,
                "weight_decay": 0.0001,
                "lr": cfg.training.vqvae_lr,
            },
            {
                "params": no_decay_params,
                "weight_decay": 0.0,
                "lr": cfg.training.lr,
            },
        ]
        super(VQBeTOptimizer, self).__init__(
            optim_groups,
            cfg.training.lr,
            cfg.training.adam_betas,
            cfg.training.adam_eps,
        )

else:
    self.eval()

def draw_logits_forward(self, encoding_logits):

Collaborator

Can we please add a docstring here or change the function name to something more apparent? I'm not sure what it means to draw logits forward.

Note: I think most of the function names are self-explanatory, so I really do just mean this one and draw_code_forward.

Author

Thank you for your opinion :) I removed def draw_logits_forward since it is not used, and changed def draw_code_forward to def get_embeddings_from_code.
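
For readers unfamiliar with residual VQ, here is a hedged sketch of what a get_embeddings_from_code-style lookup does (attribute names and shapes are illustrative, not the PR's exact implementation):

import torch


def get_embeddings_from_code(codes: torch.Tensor, codebooks: list[torch.Tensor]) -> torch.Tensor:
    """codes: (batch, n_rvq_layers) integer indices; codebooks[i]: (vqvae_n_embed, embed_dim)."""
    embeddings = torch.zeros(codes.shape[0], codebooks[0].shape[1])
    for layer, codebook in enumerate(codebooks):
        embeddings = embeddings + codebook[codes[:, layer]]  # residual layers' vectors sum up
    return embeddings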

@jayLEE0301 (Author)

Thank you for the review!

I've resolved all the comments. At a high level:

  • I removed and consolidated all the redundant optimizers and schedulers (merged all the optimizers for phase 1 and phase 2 into one, leaving only one optimizer). I also deleted def step and def zero_grad.
  • I removed redundant functions and now use the original load_state_dict, train, and eval of nn.Module. These functions existed to prevent EMA updates after RVQ training has ended; this is now implemented via self.vq_layer.freeze_codebook = torch.tensor(True) and torch.no_grad().
  • I added comments, changed some confusing function names, and removed unused parts.
