Update README.md
README.md
CHANGED
@@ -30,9 +30,9 @@ alephbert.eval()
 ```
 
 ## Training data
-1. OSCAR [(Ortiz, 2019)](https://oscar-corpus.com/) Hebrew section (
-2. Hebrew dump of [Wikipedia](https://dumps.wikimedia.org/hewiki/latest/) (650 MB text,
-3. Hebrew Tweets collected from the Twitter sample stream (
+1. OSCAR [(Ortiz, 2019)](https://oscar-corpus.com/) Hebrew section (10 GB text, 20 million sentences).
+2. Hebrew dump of [Wikipedia](https://dumps.wikimedia.org/hewiki/latest/) (650 MB text, 3 million sentences).
+3. Hebrew Tweets collected from the Twitter sample stream (7 GB text, 70 million sentences).
 
 ## Training procedure
 