metadata
license: cc-by-sa-4.0
language:
  - ko
  - en
pipeline_tag: text-generation
tags:
  - meta
  - llama-2
  - llama-2-ko-en
  - sheared llama

Model Details

Model Architecture:

urLLM-KO_EN-2.7B is an auto-regressive language model that leverages an optimized transformer architecture derived from princeton-nlp/Sheared-LLaMA-2.7B.
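Because the model is a causal (auto-regressive) LLaMA-style checkpoint, it can presumably be loaded with the generic `transformers` auto classes. The following is a minimal sketch under assumptions, not an official usage recipe: `REPO_ID` is a placeholder whose namespace must be replaced with the actual Hub namespace, and the `generate` helper and its default arguments are hypothetical names introduced only for illustration.

```python
# Placeholder repo id: replace "<namespace>" with the real Hub namespace (assumption).
REPO_ID = "<namespace>/urllm-ko_en-2.7b"


def generate(prompt: str, repo_id: str = REPO_ID, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: load the checkpoint and continue `prompt`."""
    # Imported lazily so that defining this helper does not require
    # transformers/torch to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("서울은")` or `generate("Seoul is")` would download the weights on first use, so it needs network access and enough memory for a 2.7B-parameter model.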

Training Corpus

The model was trained on selected datasets from Modu Corpus, Korean Wikipedia, and Kaggle English News (approximately 36 GB in total).

Vocab Expansion

After vocabulary expansion, the tokenizer contains 51,385 tokens.
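To put that number in context, the short calculation below estimates how many tokens and embedding parameters the expansion adds. It is a rough sketch under stated assumptions: the starting point is taken to be the 32,000-token LLaMA-2 tokenizer, the hidden size of Sheared-LLaMA-2.7B is assumed to be 2,560, and the input embedding and output head are assumed to be untied. None of these values are stated in this model card.

```python
BASE_VOCAB_SIZE = 32_000      # LLaMA-2 tokenizer size (assumed starting point)
EXPANDED_VOCAB_SIZE = 51_385  # stated in this model card
HIDDEN_SIZE = 2_560           # assumed hidden size of Sheared-LLaMA-2.7B

# Number of tokens added by the vocabulary expansion.
added_tokens = EXPANDED_VOCAB_SIZE - BASE_VOCAB_SIZE

# Assuming untied weights, the input embedding and the output head
# each gain one row of HIDDEN_SIZE values per new token.
extra_params = 2 * added_tokens * HIDDEN_SIZE

print(f"added tokens: {added_tokens}")
print(f"extra embedding parameters: {extra_params:,}")
```

Under these assumptions the expansion adds 19,385 tokens and roughly 99 million embedding parameters; the actual figures depend on the real configuration of the checkpoint.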