pad_token_id=-1 now throws errors in HF

#2
by johngiorgi - opened

Thanks for uploading this! Super helpful for writing unit tests.

On a recent version of HF (I didn't track down exactly which version), this model now causes the following error to be thrown due to its use of pad_token_id=-1:

Thrown during validation:
[UserWarning('`pad_token_id` should be positive but got -1. This will cause errors when batch generating, if there is padding. Please set `pas_token_id` explicitly by `model.generation_config.pad_token_id=PAD_TOKEN_ID` to avoid errors in generation, and ensure your `input_ids` input does not have negative values.')]

Maybe using the same ID as the UNK/EOS token would be a better default? Happy to open a PR if you agree.
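
For reference, here is a rough sketch of the kind of change I have in mind, written against a plain dict so it runs without loading the model. `patch_pad_token_id` is a hypothetical helper, not part of HF, and the IDs in the example are made up rather than taken from this repo's config. It also assumes `eos_token_id` is a single integer.

```python
import json


def patch_pad_token_id(config: dict) -> dict:
    """Return a copy of `config` whose pad_token_id is non-negative.

    Hypothetical helper: falls back to eos_token_id when pad_token_id
    is missing or negative. Assumes eos_token_id is a single int.
    """
    patched = dict(config)
    pad = patched.get("pad_token_id")
    if pad is None or pad < 0:
        eos = patched.get("eos_token_id")
        if eos is None or eos < 0:
            raise ValueError("no usable eos_token_id to fall back on")
        patched["pad_token_id"] = eos
    return patched


# Illustrative values only, not the real contents of this model's config.json.
example = {"pad_token_id": -1, "eos_token_id": 2}
print(json.dumps(patch_pad_token_id(example)))
```

As a stopgap on the user side, the warning itself suggests setting `model.generation_config.pad_token_id` explicitly after loading.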
