yinsong1986 committed
Commit: 114c6dc
Parent: 52d2d5a

Update README.md

Files changed (1):
1. README.md (+4, -4)
README.md CHANGED
@@ -81,7 +81,7 @@ there were some limitations on its performance on longer context. Motivated by i
 - **Contact:** [GitHub issues](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/issues)
 - **Inference Code** [Github Repo](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/MistralLite/)
 
-## How to Use MistralFlite from Python Code (HuggingFace transformers) ##
+## How to Use MistralLite from Python Code (HuggingFace transformers) ##
 
 **Important** - For an end-to-end example Jupyter notebook, please refer to [this link](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/MistralLite/huggingface-transformers/example_usage.ipynb).
 
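For readers landing on this hunk, a minimal sketch of what the renamed "How to Use MistralLite from Python Code (HuggingFace transformers)" section covers might look like the following. The model id `amazon/MistralLite` and the generation settings are illustrative assumptions, not values from this commit; the example notebook linked in the hunk remains the authoritative reference, and the prompt format is the one shown in the next hunk.

```python
# Minimal sketch (not from this commit): load MistralLite with HuggingFace
# transformers and run a single generation. Model id and parameters are assumed.
import torch
from transformers import AutoTokenizer, pipeline

model_id = "amazon/MistralLite"  # assumed Hub id for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prompt format as shown in the README excerpt in the next hunk.
prompt = "<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>"
outputs = generator(prompt, max_new_tokens=400, do_sample=False, return_full_text=False)
print(outputs[0]["generated_text"])
```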
@@ -132,7 +132,7 @@ for seq in sequences:
 <|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>
 ```
 
-## How to Serve MistralFlite on TGI ##
+## How to Serve MistralLite on TGI ##
 **Important:**
 - For an end-to-end example Jupyter notebook using the native TGI container, please refer to [this link](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/MistralLite/tgi/example_usage.ipynb).
 - If the **input context length is greater than 12K tokens**, it is recommended using a custom TGI container, please refer to [this link](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/MistralLite/tgi-custom/example_usage.ipynb).
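As a companion to the renamed TGI section, here is a rough sketch of querying a running TGI server over its REST `/generate` API with the prompt format shown above. The endpoint URL and generation parameters are placeholders, and the `invoke_tgi` helper only mirrors the name visible in the next hunk header; it is not code from this commit.

```python
# Minimal sketch (not from this commit): call a TGI server's /generate endpoint
# with the MistralLite prompt format. Endpoint URL and parameters are assumed.
import requests

TGI_URL = "http://127.0.0.1:8080/generate"  # placeholder endpoint

def invoke_tgi(prompt: str, max_new_tokens: int = 400) -> str:
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "do_sample": False},
    }
    response = requests.post(TGI_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["generated_text"]

prompt = "<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>"
result = invoke_tgi(prompt)
print(result)
```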
@@ -199,7 +199,7 @@ result = invoke_tgi(prompt)
 **Important** - When using MistralLite for inference for the first time, it may require a brief 'warm-up' period that can take 10s of seconds. However, subsequent inferences should be faster and return results in a more timely manner. This warm-up period is normal and should not affect the overall performance of the system once the initialisation period has been completed.
 
 
-## How to Deploy MistralFlite on Amazon SageMaker ##
+## How to Deploy MistralLite on Amazon SageMaker ##
 **Important:**
 - For an end-to-end example Jupyter notebook using the SageMaker built-in container, please refer to [this link](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/MistralLite/sagemaker-tgi/example_usage.ipynb).
 - If the **input context length is greater than 12K tokens**, it is recommended using a custom docker container, please refer to [this link](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/MistralLite/sagemaker-tgi-custom/example_usage.ipynb).
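To accompany the renamed SageMaker section, the following sketches a deployment with the SageMaker Python SDK and the built-in Hugging Face LLM (TGI) container. The execution role, container version, instance type, and environment values are assumptions for illustration, not values taken from this commit; the linked notebooks show the supported setup.

```python
# Minimal sketch (not from this commit): deploy MistralLite to a SageMaker
# endpoint using the Hugging Face LLM (TGI) container. All values are assumed.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role is available
image_uri = get_huggingface_llm_image_uri("huggingface", version="1.1.0")  # assumed version

model = HuggingFaceModel(
    image_uri=image_uri,
    role=role,
    env={
        "HF_MODEL_ID": "amazon/MistralLite",  # assumed Hub id
        "SM_NUM_GPUS": "1",
        "MAX_INPUT_LENGTH": "16000",   # illustrative long-context limits
        "MAX_TOTAL_TOKENS": "16384",
    },
)
predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")

result = predictor.predict({
    "inputs": "<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>",
    "parameters": {"max_new_tokens": 400, "do_sample": False},
})
print(result)
```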
@@ -307,7 +307,7 @@ print(result)
 ```
 
 
-## How to Serve MistralFlite on vLLM ##
+## How to Serve MistralLite on vLLM ##
 Documentation on installing and using vLLM [can be found here](https://vllm.readthedocs.io/en/latest/).
 
 **Important** - For an end-to-end example Jupyter notebook, please refer to [this link](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/MistralLite/vllm/example_usage.ipynb).
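Finally, as a companion to the renamed vLLM section, a minimal offline-inference sketch with vLLM's Python API could look like this; the model id and sampling settings are again illustrative assumptions, and the linked notebook is the authoritative example.

```python
# Minimal sketch (not from this commit): offline generation with vLLM.
# Model id and sampling parameters are illustrative assumptions.
from vllm import LLM, SamplingParams

llm = LLM(model="amazon/MistralLite")  # assumed Hub id
sampling_params = SamplingParams(temperature=0.0, max_tokens=400)

prompt = "<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>"
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```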
 