Merge branch 'main' of https://huggingface.co/onlplab/alephbert-base into main
README.md
CHANGED
@@ -45,8 +45,8 @@ To optimize training time we split the data into 4 sections based on max number
 3. 64 <= num tokens < 128 (10M sentences)
 4. 128 <= num tokens < 512 (70M sentences)
 
-Each section was trained for 5 epochs with an initial learning rate set to 1e-4.
+Each section was first trained for 5 epochs with an initial learning rate set to 1e-4. Then each section was trained for another 5 epochs with an initial learning rate set to 1e-5, for a total of 10 epochs.
 
-Total training time was
+Total training time was 8 days.
 
 
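
The two-phase schedule described in the added README lines can be sketched roughly as follows. This is only an illustration, not the authors' training script: the names `Section`, `PHASES`, `build_schedule`, and `epochs_per_section` are hypothetical, only the two sections visible in this hunk are listed (sections 1 and 2 fall outside the diff context), and the phase-major ordering (all sections at 1e-4, then all sections at 1e-5) is an assumption, since the README does not state how the passes were interleaved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """A bucket of sentences with min_tokens <= num tokens < max_tokens."""
    min_tokens: int  # inclusive lower bound
    max_tokens: int  # exclusive upper bound


# Only the sections visible in this hunk; sections 1 and 2 are not shown in the diff.
SECTIONS = [Section(64, 128), Section(128, 512)]

# (epochs, initial learning rate) per phase, as stated in the updated README.
PHASES = [(5, 1e-4), (5, 1e-5)]


def build_schedule(sections, phases):
    """Return a list of (section, epochs, lr) stages.

    Assumption: phase-major order, i.e. every section runs the first phase
    before any section starts the second phase.
    """
    return [(section, epochs, lr) for epochs, lr in phases for section in sections]


def epochs_per_section(schedule):
    """Sum the epochs each section receives across all phases."""
    totals = {}
    for section, epochs, _lr in schedule:
        totals[section] = totals.get(section, 0) + epochs
    return totals


schedule = build_schedule(SECTIONS, PHASES)
for section, epochs, lr in schedule:
    print(f"[{section.min_tokens}, {section.max_tokens}) tokens: {epochs} epochs at lr={lr}")
```

Summing the stages with `epochs_per_section(schedule)` gives 10 epochs for each listed section, which matches the "total of 10 epochs" in the new wording.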