Commit f842695 by kyujinpy (parent: ac5ae86)

Upload README.md

Files changed (1): README.md (+37 -9)
README.md CHANGED
@@ -24,14 +24,38 @@ license: cc-by-nc-4.0
 
 **Model Architecture**
 
- KO-Platypus2-13B is an auto-regressive language model based on the LLaMA2 transformer architecture.
+ KO-Platypus2-7B-ex is an auto-regressive language model based on the LLaMA2 transformer architecture.
+
+ **Base Model**
+ [Llama-2-ko-7b](https://huggingface.co/beomi/llama-2-ko-7b)
 
 **Training Dataset**
 
 I use [KOpen-platypus](https://huggingface.co/datasets/kyujinpy/KOpen-platypus).
 It is a high-quality Korean translation of the [open-platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) dataset.
 
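The training set can be pulled straight from the Hub for inspection; a minimal sketch using the `datasets` library (the `train` split name is the library default, not stated in this card):

```python
from datasets import load_dataset

# KOpen-platypus: Korean translation of Open-Platypus (linked above).
kopen = load_dataset("kyujinpy/KOpen-platypus", split="train")

print(kopen)     # number of rows and column names
print(kopen[0])  # one example record
```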
- I use A100 GPU 40GB and COLAB, when trianing.
+ I trained on an A100 40GB GPU in Colab.
+
+ **Vocab Expansion**
+
+ | Model Name | Vocabulary Size | Description |
+ | --- | --- | --- |
+ | Original Platypus2 | NaN | Sentencepiece BPE |
+ | **Expanded KO-Platypus-ex** | NaN | Sentencepiece BPE; added Korean vocab and merges |
+
+ **Tokenizing "안녕하세요, 오늘은 날씨가 좋네요." ("Hello, the weather is nice today.")**
+
+ | Model | Tokens |
+ | --- | --- |
+ | Platypus2-7b | `[NaN]` |
+ | KO-Platypus2-7b-ex | `[NaN]` |
+
+ **Tokenizing "Platypus: Quick, Cheap, and Powerful Refinement of LLMs"**
+
+ | Model | Tokens |
+ | --- | --- |
+ | Platypus2-7b | `[NaN]` |
+ | KO-Platypus2-7b-ex | `[NaN]` |
 
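The token columns above are placeholders (`NaN`) in this revision. A minimal sketch for filling them in and for checking the vocabulary sizes; the original-tokenizer repo id `garage-bAInd/Platypus2-7B` is an assumption, since the card only says "Platypus2-7b":

```python
from transformers import AutoTokenizer

# Assumed repo id for the original Platypus2 tokenizer; the card only says "Platypus2-7b".
base = AutoTokenizer.from_pretrained("garage-bAInd/Platypus2-7B")
expanded = AutoTokenizer.from_pretrained("kyujinpy/KO-Platypus2-7B-ex")

# Vocabulary sizes for the "Vocab Expansion" table.
print(len(base), len(expanded))

for text in ("안녕하세요, 오늘은 날씨가 좋네요.",
             "Platypus: Quick, Cheap, and Powerful Refinement of LLMs"):
    print(base.tokenize(text))      # original vocab: Korean falls back to byte-level pieces
    print(expanded.tokenize(text))  # expanded vocab: fewer, whole-syllable pieces
```

The added Korean pieces presumably come from the Llama-2-ko-7b base tokenizer listed above, which already carries the expanded sentencepiece vocab and merges.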
  # **Model Benchmark**

@@ -49,8 +73,9 @@
 | [Polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) | 0.7937 | 0.8108 | 0.8037 | 0.8369 |
 | [Llama-2-Ko-7b 20B](https://huggingface.co/beomi/llama-2-ko-7b) | 0.7388 | 0.7626 | 0.7808 | 0.7979 |
 | [Llama-2-Ko-7b 40B](https://huggingface.co/beomi/llama-2-ko-7b) | 0.7436 | 0.7927 | 0.8037 | 0.8259 |
- | **KO-platypus2-13B(ours)** | 0.5820 | 0.6269 | 0.6267 | 0.6527 |
+ | [KO-platypus2-13B](https://huggingface.co/kyujinpy/KO-Platypus2-13B) | 0.5820 | 0.6269 | 0.6267 | 0.6527 |
+ | **KO-platypus2-7B-EX(ours)** | NaN | NaN | NaN | NaN |
 
 > Natural Language Inference (NLI; 자연어 추론 평가)
 ### HellaSwag (F1)
 
@@ -62,7 +87,8 @@
 | [Polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) | 0.5954 | 0.6306 | 0.6098 | 0.6118 |
 | [Llama-2-Ko-7b 20B](https://huggingface.co/beomi/llama-2-ko-7b) | 0.4518 | 0.4668 | 0.4726 | 0.4828 |
 | [Llama-2-Ko-7b 40B](https://huggingface.co/beomi/llama-2-ko-7b) | 0.4562 | 0.4657 | 0.4698 | 0.4774 |
- | **KO-platypus2-13B(ours)** | 0.3912 | 0.4129 | 0.4144 | 0.4330 |
+ | [KO-platypus2-13B](https://huggingface.co/kyujinpy/KO-Platypus2-13B) | 0.3912 | 0.4129 | 0.4144 | 0.4330 |
+ | **KO-platypus2-7B-EX(ours)** | NaN | NaN | NaN | NaN |
 
 > Question Answering (QA)
 ### BoolQ (F1)
 
@@ -75,7 +101,8 @@
 | [Polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) | 0.4818 | 0.6041 | 0.6289 | 0.6448 |
 | [Llama-2-Ko-7b 20B](https://huggingface.co/beomi/llama-2-ko-7b) | 0.3607 | 0.6797 | 0.6801 | 0.6622 |
 | [Llama-2-Ko-7b 40B](https://huggingface.co/beomi/llama-2-ko-7b) | 0.5786 | 0.6977 | 0.7084 | 0.7144 |
- | **KO-platypus2-13B(ours)** | 0.3539 | 0.7168 | 0.7328 | 0.7172 |
+ | [KO-platypus2-13B](https://huggingface.co/kyujinpy/KO-Platypus2-13B) | 0.3539 | 0.7168 | 0.7328 | 0.7172 |
+ | **KO-platypus2-7B-EX(ours)** | NaN | NaN | NaN | NaN |
 
 > Classification
 ### SentiNeg (F1)
 
@@ -88,15 +115,16 @@
 | [Polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) | 0.9117 | 0.9015 | 0.9345 | 0.9723 |
 | [Llama-2-Ko-7b 20B](https://huggingface.co/beomi/llama-2-ko-7b) | 0.4855 | 0.8295 | 0.8711 | 0.8513 |
 | [Llama-2-Ko-7b 40B](https://huggingface.co/beomi/llama-2-ko-7b) | 0.4594 | 0.7611 | 0.7276 | 0.9370 |
- | **KO-platypus2-13B(ours)** | 0.5216 | 0.8236 | 0.8487 | 0.8789 |
+ | [KO-platypus2-13B](https://huggingface.co/kyujinpy/KO-Platypus2-13B) | 0.5216 | 0.8236 | 0.8487 | 0.8789 |
+ | **KO-platypus2-7B-EX(ours)** | NaN | NaN | NaN | NaN |
 
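For the KOBEST tables above (COPA, HellaSwag, BoolQ, SentiNeg), a sketch of how the scores could be reproduced with EleutherAI's lm-evaluation-harness. The `kobest_*` task names and the `simple_evaluate` call follow recent harness releases, and the per-column few-shot settings are not visible in this diff, so treat both as assumptions:

```python
import lm_eval  # EleutherAI lm-evaluation-harness (pip install lm-eval)

# Task names and API are assumptions based on recent harness releases;
# rerun with each few-shot setting used by the table columns.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=kyujinpy/KO-Platypus2-7B-ex,dtype=float16",
    tasks=["kobest_copa", "kobest_hellaswag", "kobest_boolq", "kobest_sentineg"],
    num_fewshot=5,
)
print(results["results"])
```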
 # Implementation Code
 ```python
 ### KO-Platypus
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
 
- repo = "kyujinpy/KO-Platypus2-13B"
+ repo = "kyujinpy/KO-Platypus2-7B-ex"
 ko_platypus = AutoModelForCausalLM.from_pretrained(
         repo,
         return_dict=True,
 ```
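The snippet above is cut off at the diff's hunk boundary. A complete, runnable version follows; the `torch_dtype`/`device_map` arguments and the generation call are assumptions rather than the card's exact code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "kyujinpy/KO-Platypus2-7B-ex"

# dtype/device settings are assumed; the card's own snippet is truncated above.
ko_platypus = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the `accelerate` package
)
tokenizer = AutoTokenizer.from_pretrained(repo)

prompt = "안녕하세요, 오늘은 날씨가 좋네요."
inputs = tokenizer(prompt, return_tensors="pt").to(ko_platypus.device)
with torch.no_grad():
    out = ko_platypus.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```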
 