Commit 6e563cc (1 parent: 5791b0d), committed by daekeun-ml

Update README.md

README.md CHANGED
@@ -20,7 +20,7 @@ pipeline_tag: text-generation
 ## Model Details
 This model is trained using unsloth toolkit based on Microsoft's phi-3 model with some Korean instruction data added to enhance its Korean generation performance
 
-Since my role is not as a working developer, but as ML Technical Specialist helping customers with quick PoCs/prototypes, and I was limited by Azure GPU resources available, I only trained with 40,000 samples on a single
+Since my role is not as a working developer, but as ML Technical Specialist helping customers with quick PoCs/prototypes, and I was limited by Azure GPU resources available, I only trained with 40,000 samples on a single VM Azure Standard_NC24ads_A100_v4 for PoC purposes. Because I have not done any tokenizer extensions, you need a lot more tokens than English for text generation.
 
 ### Dataset
 
@@ -32,6 +32,8 @@ The dataset used for training is as follows. To prevent catastrophic forgetting,
 
 
 ## How to Get Started with the Model
+
+### Code snippets
 ```python
 ### Load model
 import torch
@@ -67,6 +69,7 @@ params = {
 ### Inference
 FastLanguageModel.for_inference(model) # Enable native 2x faster inference
 
+# 1st example
 messages = [
     {"from": "human", "value": "Continue the fibonnaci sequence in Korean: 1, 1, 2, 3, 5, 8,"},
     {"from": "assistant", "value": "피보나치 수열의 다음 숫자는 13, 21, 34, 55, 89 등입니다. 각 숫자는 앞의 두 숫자의 합입니다."},
@@ -82,6 +85,7 @@ inputs = tokenizer.apply_chat_template(
 text_streamer = TextStreamer(tokenizer)
 _ = model.generate(input_ids = inputs, streamer = text_streamer, **params)
 
+# 2nd example
 messages = [
     {"from": "human", "value": "What is Machine Learning in Korean?"},
     {"from": "assistant", "value": "인공지능의 한 분야로 방대한 데이터를 분석해 향후 패턴을 예측하는 기법입니다."},
@@ -99,6 +103,29 @@ text_streamer = TextStreamer(tokenizer)
 _ = model.generate(input_ids = inputs, streamer = text_streamer, **params)
 ```
 
+### Inference results
+```
+# 1st example
+<s><|user|> Continue the fibonnaci sequence in Korean: 1, 1, 2, 3, 5, 8,<|end|><|assistant|> 피보나치 수열의 다음 숫자는 13, 21, 34, 55, 89 등입니다. 각 숫자는 앞의 두 숫자의 합입니다.<|end|><|user|> Compute 2x+3=12 in Korean<|end|><|assistant|> 방정식 2x + 3 = 12에서 x를 풀려면 다음 단계를 따르십시오.
+
+1. 방정식의 양쪽에서 3을 빼서 방정식의 한쪽에 대해 x를 분리합니다.
+
+2x + 3 - 3 = 12 - 3
+
+2x = 9
+
+2. 이제 방정식의 양쪽을 2로 나누어 x의 값을 구합니다.
+
+2배 / 2 = 9 / 2
+
+x = 4.5
+
+따라서 방정식 2x + 3 = 12에 대한 해는 x = 4.5입니다.<|end|>
+
+# 2nd example
+<s><|user|> What is Machine Learning in Korean?<|end|><|assistant|> 인공지능의 한 분야로 방대한 데이터를 분석해 향후 패턴을 예측하는 기법입니다.<|end|><|user|> What is Deep Learning in Korean?<|end|><|assistant|> 복잡한 데이터 세트를 분석하고 복잡한 패턴을 인식하고 학습하는 데 사용되는 딥러닝은 많은 레이어로 구성된 신경망의 하위 집합입니다. 이 기술은 이미지 인식, 자연어 처리 및 자율 운전과 같은 다양한 응용 분야에서 큰 발전을 이뤘습니다.<|end|>
+```
+
 ### References
 - Base model: [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
 
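The inference results added in this commit show each `{"from": ..., "value": ...}` turn from the `messages` list flattened into a single string of the form `<s><|user|> ...<|end|><|assistant|> ...<|end|>`. The sketch below reproduces that mapping by hand so the relationship between the two is easier to see. It is only an illustration: `render_sharegpt` and `ROLE_TAGS` are hypothetical names that do not appear in the README, the single-space separator is inferred from the streamed output, and the real formatting is done by `tokenizer.apply_chat_template`, whose template may differ in whitespace.

```python
# Hypothetical helper, not part of the README: it mimics the flattened layout
# seen in the streamed inference results. The real formatting is produced by
# tokenizer.apply_chat_template and may differ in whitespace.
ROLE_TAGS = {"human": "<|user|>", "assistant": "<|assistant|>"}


def render_sharegpt(messages, add_generation_prompt=True):
    """Render ShareGPT-style turns ({"from", "value"}) as one prompt string."""
    parts = ["<s>"]
    for turn in messages:
        tag = ROLE_TAGS[turn["from"]]  # raises KeyError on an unknown role
        parts.append(f"{tag} {turn['value']}<|end|>")
    if add_generation_prompt:
        # Leave the assistant tag open so the model writes the next turn.
        parts.append("<|assistant|>")
    return "".join(parts)


messages = [
    {"from": "human", "value": "What is Machine Learning in Korean?"},
    {"from": "assistant", "value": "인공지능의 한 분야로 방대한 데이터를 분석해 향후 패턴을 예측하는 기법입니다."},
    {"from": "human", "value": "What is Deep Learning in Korean?"},
]
prompt = render_sharegpt(messages)
print(prompt)
```

The printed string matches the start of the "2nd example" in the inference results, up to the open `<|assistant|>` tag where the model's own answer begins.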