Weird variable name mistakes in code generation

#1
by krana - opened

Hi, I have been using this model and Llama-3 8B in BF16 for code generation. This quantized version of Llama-3 70B produces odd variable names and string errors in the generated code, which is surprising, as I have not seen this issue with the 8B variant on the exact same prompt.
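For reference, here is a minimal sketch of the kind of side-by-side comparison I am running, assuming both models are loaded through transformers and using greedy decoding so differences cannot be attributed to sampling; the quantized repo id and the prompt below are placeholders:

```python
# Compare greedy generations from the BF16 8B baseline and the quantized 70B model.
# Model ids and prompt are placeholders for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "Write a Python function that parses a CSV file into a list of dicts."

for model_id in [
    "meta-llama/Meta-Llama-3-8B-Instruct",                # BF16 baseline
    "neuralmagic/Meta-Llama-3-70B-Instruct-quantized",    # placeholder quantized repo id
]:
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    print(model_id)
    print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```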

Is this an expected behaviour of the model due to weight quantization?
Does the quantization also quantize the token embeddings and the lm_head layers?

Neural Magic org

Hi. The token embeddings and lm_head layers are not quantized for this model.

We strive to produce quantized versions that do not deviate much from the original model. Can you please share more details about your prompt so we can investigate further?
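As a quick way to confirm which modules are excluded from quantization, you can inspect the quantization_config in the repo's config.json. A minimal sketch, assuming the config carries a quantization_config entry (the exact field names, e.g. an "ignore" or "modules_to_not_convert" list, vary by quantization scheme, and the model id below is a placeholder):

```python
# Dump the quantization config declared in the repo's config.json to see
# which modules (e.g. embeddings, lm_head) are excluded from quantization.
from transformers import AutoConfig

model_id = "neuralmagic/Meta-Llama-3-70B-Instruct-quantized"  # placeholder repo id
config = AutoConfig.from_pretrained(model_id)

qcfg = getattr(config, "quantization_config", None)
if qcfg is None:
    print("No quantization_config found in config.json")
else:
    # quantization_config may be a plain dict or a config object depending on the scheme
    print(qcfg if isinstance(qcfg, dict) else qcfg.to_dict())
```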
