robinq committed on
Commit edeb5e1
1 parent: d21f40f

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ language:

  This BERT model was trained using the 🤗 transformers library.
  The size of the model is a regular BERT-base with 110M parameters.
- The model was trained on about 70GB of data, consisting mostly of OSCAR (25GB) and Swedish newspaper text curated by the National Library of Sweden.
+ The model was trained on about 70GB of data, consisting mostly of OSCAR and Swedish newspaper text curated by the National Library of Sweden.
  To avoid excessive padding documents shorter than 512 tokens were concatenated into one large sequence of 512 tokens, and larger documents were split into multiple 512 token sequences, following https://github.com/huggingface/transformers/blob/master/examples/pytorch/language-modeling/run_mlm.py

  Training was done for a bit more than 8 epochs with a batch size of 2048, resulting in a little less than 125k training steps.
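
The padding-avoidance step described in the README text can be illustrated with a minimal sketch. It assumes documents are already tokenized into lists of token ids. The function name `group_texts`, the `block_size` parameter, and the choice to drop the trailing remainder are assumptions made for illustration, not the exact code used to train this model.

```python
from itertools import chain


def group_texts(tokenized_docs, block_size=512):
    """Concatenate token-id lists and cut them into fixed-size blocks.

    Hypothetical helper: short documents are merged with their neighbours,
    long documents are split, so no block needs padding.
    """
    # Flatten all documents into one long stream of token ids.
    concatenated = list(chain.from_iterable(tokenized_docs))
    # Assumption: drop the trailing remainder so every block is full.
    total_length = (len(concatenated) // block_size) * block_size
    return [
        concatenated[i:i + block_size]
        for i in range(0, total_length, block_size)
    ]


# Toy usage with a tiny block size instead of 512.
docs = [[1, 2, 3], [4, 5, 6, 7, 8], [9]]
print(group_texts(docs, block_size=4))  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

With `block_size=512`, every returned sequence has exactly 512 tokens, which is why no padding is needed. The cost is that a block may span document boundaries.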