hishab/bn-tokenizer-llama2-extend
Tags: Transformers · Bengali
Bangla Tokenizer with Extending Llama2
Details
Base: the Llama 2 tokenizer, with a vocabulary size of 32,000.
New Bangla tokens added: 48,667.
Resulting tokenizer vocabulary size: 80,665.
Bangla tokens taken from https://huggingface.co/hishab/bn_sentencepiece_vs_50k_58GB
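The merged vocabulary (80,665) is slightly smaller than 32,000 + 48,667 because tokens already present in the base vocabulary are not added twice. A minimal sketch of that deduplicating merge, using small hypothetical token lists in place of the real Llama 2 and Bangla SentencePiece vocabularies:

```python
def extend_vocab(base_vocab, new_tokens):
    """Append new tokens to the base vocab, skipping duplicates,
    so existing token IDs keep their positions."""
    merged = list(base_vocab)
    seen = set(base_vocab)
    for tok in new_tokens:
        if tok not in seen:
            merged.append(tok)
            seen.add(tok)
    return merged

# Stand-ins for the 32,000 Llama 2 tokens and 48,667 Bangla tokens.
base = ["<s>", "</s>", "the", "and"]
bangla = ["বাংলা", "টোকেন", "the"]  # "the" overlaps with the base vocab

merged = extend_vocab(base, bangla)
# Overlapping tokens are dropped, so the merged size is less than
# len(base) + len(bangla) — the same reason 32,000 + 48,667 yields 80,665.
print(len(merged))  # 6
```

The same idea applies at full scale when merging the Bangla SentencePiece model's pieces into the Llama 2 tokenizer.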