frankminors123/Chinese-CodeLlama-7B-SFT-V2

frankminors123 committed on Nov 25, 2023

Commit 7d62839 (1 parent: 8bcfb35)

Update README.md

Files changed (1)

README.md +1 -1

@@ -11,7 +11,7 @@ the base period of rotary positional embeddings (RoPE) from 10000 to 1000000.
 We use a sequence length of 1k for pre-training, and continue training based on this length during the fine-tuning stage. Based on a larger base period of RoPE, it can support up 15k context length extrapolation at inference time.
-Based on this [dataset](https://huggingface.co/datasets/code_search_net), we calculate the average of PPL on 1k length text to be 5.44. However, this value is 148.70 based on our pre-trained model.
+  Based on this [dataset](https://huggingface.co/datasets/code_search_net) (Python-test), we calculate the average of PPL on 1k length text to be 5.44. However, this value is 148.70 based on our pre-trained model.
 The Chinese prompt template used is as follows:
 ```python