jgrosjean committed on
Commit
25f607a
1 Parent(s): d8f35d4

Update README.md

  1. README.md +3 -52
README.md CHANGED
@@ -120,26 +120,9 @@ This model has been trained on news articles only. Hence, it might not perform a
 
  #### Training Hyperparameters
 
- - **Training regime:** python3 train_simcse_multilingual.py \
- --seed 54699 \
- --model_name_or_path zurichNLP/swissbert \
- --train_file /srv/scratch2/grosjean/Masterarbeit/data_subsets \
- --output_dir /srv/scratch2/grosjean/Masterarbeit/model \
- --overwrite_output_dir \
- --save_strategy no \
- --do_train \
- --num_train_epochs 1 \
- --learning_rate 1e-5 \
- --per_device_train_batch_size 4 \
- --gradient_accumulation_steps 128 \
- --max_seq_length 512 \
- --overwrite_cache \
- --pooler_type avg \
- --pad_to_max_length \
- --temp 0.05 \
- --fp16 <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- [More Information Needed]
+ Number of epochs: 1
+ Learning rate: 1e-5
+ Batch size: 512
 
  ## Evaluation
 
@@ -190,35 +173,3 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
  ### Model Architecture and Objective
 
  [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]
-
- ## Model Card Contact
-
- [More Information Needed]
-
-
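
Note on the hyperparameter change: the new `Batch size: 512` entry appears consistent with the removed command if it is read as an effective batch size, since `--per_device_train_batch_size 4` multiplied by `--gradient_accumulation_steps 128` gives 512. The commit does not state how many devices were used, so the sketch below assumes a single device; `effective_batch_size` is a hypothetical helper written for illustration and is not part of `train_simcse_multilingual.py`.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step.

    num_devices=1 is an assumption; the commit does not mention the device count.
    """
    if per_device < 1 or grad_accum_steps < 1 or num_devices < 1:
        raise ValueError("all arguments must be positive integers")
    return per_device * grad_accum_steps * num_devices


# Values taken from the removed training command.
batch = effective_batch_size(per_device=4, grad_accum_steps=128)
print(batch)
```

With more than one device the product would be larger than 512, so the single-device reading is the one that matches the value in the updated README.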