wenge-research committed
Commit 2844b2c
1 Parent(s): e60ca2e

Update README.md

Files changed (1)
  1. README.md +5 -22
README.md CHANGED
@@ -1,9 +1,11 @@
 ---
-license: apache-2.0
+license: other
 ---
 
 <div align="center">
-<img src="./assets/yayi_dark_small.png" alt="YAYI" style="width: 30%; display: block; margin: auto;">
+<h1>
+YAYI 2
+</h1>
 <br>
 
 [![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-brightgreen.svg)](./LICENSE)
@@ -14,35 +16,16 @@ license: apache-2.0
 [[🤗HF Repo](https://huggingface.co/wenge-research)]
 [[🔗网页端](https://yayi.wenge.com)]
 
-中文 | [English](./README_EN.md)
 
 </div>
 
 
-## 目录
-
-- [目录](#目录)
-- [介绍](#介绍)
-- [模型地址](#模型地址)
-- [评测结果](#评测结果)
-- [推理](#推理)
-- [模型微调](#模型微调)
-- [环境安装](#环境安装-1)
-- [全参训练](#全参训练)
-- [LoRA 微调](#lora-微调)
-- [预训练数据](#预训练数据)
-- [分词器](#分词器)
-- [Loss 曲线](#loss-曲线)
-- [相关协议](#相关协议)
-- [开源协议](#开源协议)
-- [引用](#引用)
-
 ## 介绍/Introduction
 YAYI 2 是中科闻歌研发的**新一代开源大语言模型**,包括 Base 和 Chat 版本,参数规模为 30B,并采用了超过 2 万亿 Tokens 的高质量、多语言语料进行预训练。针对通用和特定领域的应用场景,我们采用了百万级指令进行微调,同时借助人类反馈强化学习方法,以更好地使模型与人类价值观对齐。本次开源的模型为 YAYI2-30B Base 模型。我们希望通过雅意大模型的开源来促进中文预训练大模型开源社区的发展,并积极为此做出贡献。通过开源,我们与每一位合作伙伴共同构建雅意大模型生态。更多技术细节,敬请期待我们的技术报告🔥。
 
 YAYI 2 is the new generation of open-source large language models launched by Wenge Technology. It has been pretrained on 2.65 trillion tokens of high-quality multilingual data. The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback (RLHF). We open-source the pre-trained language model in this release, namely **YAYI2-30B**. By open-sourcing the YAYI 2 model, we aim to contribute to the development of the Chinese pre-trained large language model open-source community. Through open source, we aspire to collaborate with every partner in building the YAYI large language model ecosystem. Stay tuned for more technical details in our upcoming technical report! 🔥
 
-## 模型地址/Model download
+## 模型/Model
 
 | Model Name | Context Length | 🤗 HF Model Name |
 |:----------|:----------:|:----------:|