Did you solve this, bro?
I'm running into the same problem.
I found this also happens on a normal exit; other than that there are no issues.
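An "Exception in thread Thread-1" that appears even on an otherwise normal exit is typical of a background thread still running while the interpreter tears down. A minimal sketch of the usual mitigation (this is a hypothetical illustration, not Chinese-CLIP's actual code): signal worker threads to stop and join them before the main thread returns, so none of them execute during interpreter shutdown.

```python
import threading
import time

def worker(stop_event, results):
    # Loop until the main thread asks us to stop.
    while not stop_event.is_set():
        results.append("tick")
        time.sleep(0.01)

stop_event = threading.Event()
results = []
t = threading.Thread(target=worker, args=(stop_event, results))
t.start()
time.sleep(0.05)

# Signal the worker and wait for it to finish before exiting,
# so it never runs while the interpreter is shutting down.
stop_event.set()
t.join()
```

After `t.join()` returns, the worker is guaranteed to be finished, which avoids the shutdown-time `_bootstrap_inner` exception shown in the traceback above.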
(torch) ppop@DESKTOP-NMJBJQC:~/Chinese-CLIP$ sudo bash run_scripts/muge_finetune_vit-b-16_rbt-base.sh ~/Chinese-CLIP/datapath
Loading vision model config from cn_clip/clip/model_configs/ViT-L-14.json
Loading text model config from cn_clip/clip/model_configs/RoBERTa-wwm-ext-base-chinese.json
2024-04-18,22:23:46 | INFO | Rank 0 | train LMDB file contains 35000 images and 105000 pairs.
2024-04-18,22:23:46 | INFO | Rank 0 | val LMDB file contains 7500 images and 22500 pairs.
2024-04-18,22:23:46 | INFO | Rank 0 | Params:
2024-04-18,22:23:46 | INFO | Rank 0 | accum_freq: 1
2024-04-18,22:23:46 | INFO | Rank 0 | aggregate: True
2024-04-18,22:23:46 | INFO | Rank 0 | batch_size: 128
2024-04-18,22:23:46 | INFO | Rank 0 | bert_weight_path: None
2024-04-18,22:23:46 | INFO | Rank 0 | beta1: 0.9
2024-04-18,22:23:46 | INFO | Rank 0 | beta2: 0.98
2024-04-18,22:23:46 | INFO | Rank 0 | checkpoint_path: /home/ppop/Chinese-CLIP/datapath/experiments/muge_finetune_vit-H-14_roberta-base_bs128_1gpu/checkpoints
2024-04-18,22:23:46 | INFO | Rank 0 | clip_weight_path: None
2024-04-18,22:23:46 | INFO | Rank 0 | context_length: 52
2024-04-18,22:23:46 | INFO | Rank 0 | debug: False
2024-04-18,22:23:46 | INFO | Rank 0 | device: cuda:0
2024-04-18,22:23:46 | INFO | Rank 0 | distllation: False
2024-04-18,22:23:46 | INFO | Rank 0 | eps: 1e-06
2024-04-18,22:23:46 | INFO | Rank 0 | freeze_vision: False
2024-04-18,22:23:46 | INFO | Rank 0 | gather_with_grad: False
2024-04-18,22:23:46 | INFO | Rank 0 | grad_checkpointing: False
2024-04-18,22:23:46 | INFO | Rank 0 | kd_loss_weight: 0.5
2024-04-18,22:23:46 | INFO | Rank 0 | local_device_rank: 0
2024-04-18,22:23:46 | INFO | Rank 0 | log_interval: 1
2024-04-18,22:23:46 | INFO | Rank 0 | log_level: 20
2024-04-18,22:23:46 | INFO | Rank 0 | log_path: /home/ppop/Chinese-CLIP/datapath/experiments/muge_finetune_vit-H-14_roberta-base_bs128_1gpu/out_2024-04-18-14-23-43.log
2024-04-18,22:23:46 | INFO | Rank 0 | logs: /home/ppop/Chinese-CLIP/datapath/experiments/
2024-04-18,22:23:46 | INFO | Rank 0 | lr: 5e-05
2024-04-18,22:23:46 | INFO | Rank 0 | mask_ratio: 0
2024-04-18,22:23:46 | INFO | Rank 0 | max_epochs: 3
2024-04-18,22:23:46 | INFO | Rank 0 | max_steps: 2463
2024-04-18,22:23:46 | INFO | Rank 0 | name: muge_finetune_vit-H-14_roberta-base_bs128_1gpu
2024-04-18,22:23:46 | INFO | Rank 0 | num_workers: 4
2024-04-18,22:23:46 | INFO | Rank 0 | precision: amp
2024-04-18,22:23:46 | INFO | Rank 0 | rank: 0
2024-04-18,22:23:46 | INFO | Rank 0 | report_training_batch_acc: True
2024-04-18,22:23:46 | INFO | Rank 0 | reset_data_offset: False
2024-04-18,22:23:46 | INFO | Rank 0 | reset_optimizer: False
2024-04-18,22:23:46 | INFO | Rank 0 | resume: /home/ppop/Chinese-CLIP/datapath/pretrained_weights/clip_cn_vit-l-14.pt
2024-04-18,22:23:46 | INFO | Rank 0 | save_epoch_frequency: 1
2024-04-18,22:23:46 | INFO | Rank 0 | save_step_frequency: 999999
2024-04-18,22:23:46 | INFO | Rank 0 | seed: 123
2024-04-18,22:23:46 | INFO | Rank 0 | skip_aggregate: False
2024-04-18,22:23:46 | INFO | Rank 0 | skip_scheduler: False
2024-04-18,22:23:46 | INFO | Rank 0 | teacher_model_name: None
2024-04-18,22:23:46 | INFO | Rank 0 | text_model: RoBERTa-wwm-ext-base-chinese
2024-04-18,22:23:46 | INFO | Rank 0 | train_data: /home/ppop/Chinese-CLIP/datapath/datasets/yyut/lmdb/train
2024-04-18,22:23:46 | INFO | Rank 0 | use_augment: False
2024-04-18,22:23:46 | INFO | Rank 0 | use_bn_sync: False
2024-04-18,22:23:46 | INFO | Rank 0 | use_flash_attention: False
2024-04-18,22:23:46 | INFO | Rank 0 | val_data: /home/ppop/Chinese-CLIP/datapath/datasets/yyut/lmdb/valid
2024-04-18,22:23:46 | INFO | Rank 0 | valid_batch_size: 128
2024-04-18,22:23:46 | INFO | Rank 0 | valid_epoch_interval: 1
2024-04-18,22:23:46 | INFO | Rank 0 | valid_num_workers: 1
2024-04-18,22:23:46 | INFO | Rank 0 | valid_step_interval: 150
2024-04-18,22:23:46 | INFO | Rank 0 | vision_model: ViT-L-14
2024-04-18,22:23:46 | INFO | Rank 0 | warmup: 100
2024-04-18,22:23:46 | INFO | Rank 0 | wd: 0.001
2024-04-18,22:23:46 | INFO | Rank 0 | world_size: 1
2024-04-18,22:23:46 | INFO | Rank 0 | Use GPU: 0 for training
2024-04-18,22:23:46 | INFO | Rank 0 | => begin to load checkpoint '/home/ppop/Chinese-CLIP/datapath/pretrained_weights/clip_cn_vit-l-14.pt'
2024-04-18,22:23:47 | INFO | Rank 0 | train LMDB file contains 35000 images and 105000 pairs.
2024-04-18,22:23:47 | INFO | Rank 0 | val LMDB file contains 7500 images and 22500 pairs.
Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/ppop/miniconda3/envs/torch/lib/python3.8/threading.py", line 932, in _bootstrap_inner