I think that all the tensors in the llama-2 model files distributed by Meta are BF16. When converting or quantizing the model to GGUF, some of these tensors (the small one-dimensional ones, such as norm weights) are always exported as FP32, regardless of the requested `--outtype`.
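One way to check this for a given file is to list each tensor's on-disk type with the `gguf` Python package's `GGUFReader`. A minimal sketch (the `model.gguf` path is a placeholder):

```python
from gguf import GGUFReader  # pip install gguf

# Hypothetical path to a converted model file.
reader = GGUFReader("model.gguf")

# Print the stored type of every tensor; in an F16 conversion the
# 1-D norm weights typically still show up as F32.
for tensor in reader.tensors:
    print(f"{tensor.name:40s} {tensor.tensor_type.name:6s} {tuple(tensor.shape)}")
```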
-
As I understand it, models like `meta-llama/Llama-2-13b-chat-hf` contain both fp16 and fp32 tensors, so I am wondering: with `--outtype fp16`, do all the fp32 tensors in the model get converted to fp16, while the tensors that are already fp16 are left unchanged? And with `--outtype fp32`, do all the fp16 tensors get converted to fp32, while the fp32 tensors are left unchanged? Thanks!
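To make the question concrete, here is a minimal sketch of the per-tensor casting rule being asked about. The `convert_tensor` helper is hypothetical, not the converter's actual code, and it assumes the always-FP32 behavior for 1-D tensors described in the reply above:

```python
import numpy as np

def convert_tensor(tensor: np.ndarray, outtype: str) -> np.ndarray:
    """Sketch of the dtype logic in question: cast every tensor to the
    requested output type, except tensors the converter is assumed to
    keep in FP32 (e.g. small 1-D norm weights)."""
    if tensor.ndim == 1:
        return tensor.astype(np.float32)  # assumed always-FP32 case
    target = np.float16 if outtype == "fp16" else np.float32
    # Casting to a dtype the tensor already has leaves values unchanged.
    return tensor.astype(target)
```

Under this rule, `--outtype fp16` would down-cast fp32 tensors and pass fp16 tensors through unchanged, and `--outtype fp32` would up-cast fp16 tensors, matching both cases the question asks about.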