We train the model from scratch. The learning rate is 1e-5 with a cosine decay schedule throughout training.
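The cosine decay described above can be sketched as follows. This is a minimal illustration in pure Python, not the training code itself; the step count and minimum learning rate are hypothetical.

```python
import math

def cosine_lr(step, total_steps, peak_lr=1e-5, min_lr=0.0):
    """Cosine-decay learning rate from peak_lr at step 0 down to min_lr at total_steps."""
    progress = step / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

# At step 0 the lr equals the peak; at the final step it reaches min_lr.
print(cosine_lr(0, 10_000))       # 1e-05
print(cosine_lr(5_000, 10_000))   # 5e-06 (halfway through decay)
print(cosine_lr(10_000, 10_000))  # 0.0
```

In practice the equivalent schedule is usually obtained from a library helper (e.g. a cosine scheduler in the training framework) rather than written by hand.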
Model architecture: Qwen2-1.5B

Pretraining dataset:
- CC-100: English (300k), German (300k)

SFT dataset:
- AgentWaller/german-oasst1-qa-format (10k)
- LeoLM/OpenSchnabeltier (10k)
- WMT19 de2en (300k)