How to use with batch size > 1?

#9
by RylanSchaeffer - opened

The demo code uses a batch size of 1. When I instead pass a list of strings to the tokenizer, I receive this error:

{ValueError}ValueError("Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=Tr...h. Perhaps your features (`input_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected).")

How do I fix this?

What is an acceptable padding token to choose?
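For reference, here is the pattern that usually resolves this error, sketched with a hypothetical model id (`"gpt2"`; substitute the model from the demo). The assumption is that this model's tokenizer, like many GPT-style tokenizers, ships without a pad token, in which case reusing the EOS token for padding is a common choice:

```python
from transformers import AutoTokenizer

# Hypothetical model id; substitute the checkpoint used in the demo.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Many causal-LM tokenizers define no pad token; reusing EOS is a
# common workaround (assumption: this model behaves the same way).
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

batch = tokenizer(
    ["first prompt", "a somewhat longer second prompt"],
    padding=True,        # pad shorter sequences to the longest in the batch
    truncation=True,     # truncate anything beyond the model max length
    return_tensors="pt", # ragged lists can't become a tensor without padding
)
print(batch["input_ids"].shape)       # (2, longest_sequence_in_batch)
print(batch["attention_mask"].shape)  # same shape; 0 marks padding positions
```

The original error occurs because without `padding=True` the two prompts tokenize to different lengths, and the ragged list of lists cannot be packed into a single tensor. The attention mask returned alongside `input_ids` tells the model to ignore the padded positions.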
