Update README.md
README.md
CHANGED
@@ -42,11 +42,18 @@ This repo contains GGUF quants for [Flow-Judge-v0.1](https://huggingface.co/flow
 
 ## Quantization config
 
-
+Version used: github:ggerganov/llama.cpp/8e6e2fbe1458ac91387266241262294a964d6b95?narHash=sha256-Z3Rg43p8G9MdxiGvSl9m43KsJ1FvvhQwtzRy/grg9X0%3D
+```
+llama-convert-hf-to-gguf ./flowaicom/Flow-Judge-v0.1 --outfile flow-judge-v0.1-bf16.gguf --outtype auto
+llama-quantize flow-judge-v0.1-bf16.gguf flow-judge-v0.1-Q4_K_M.gguf Q4_K_M
+```
 
 ## Running the GGUF file
 
-
+```shell
+llama-server -ngl 33 -t 16 -m Flow-Judge-v0.1-GGUF/flow-judge-v0.1-Q4_K_M.gguf -c 8192 -n 8192 -fa
+
+```
 
 # Original model card: Flow-Judge-v0.1
 
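For readers unfamiliar with the short flags in the `llama-server` command added in this change, the same invocation can be spelled out with long-form option names. This is a sketch and not part of the commit. It assumes the long names that llama.cpp lists for these short flags in builds near the pinned version. Newer builds may expect a value after `--flash-attn`, so check `llama-server --help` for your build. The `MODEL` and `ARGS` names are illustrative, and the script only prints the command.

```shell
#!/usr/bin/env bash
# Long-form equivalent of the llama-server command from the README change.
# Assumption: option names as shown by `llama-server --help` for builds
# near the pinned llama.cpp commit; verify against your own build.
MODEL="Flow-Judge-v0.1-GGUF/flow-judge-v0.1-Q4_K_M.gguf"

ARGS=(
  --n-gpu-layers 33    # -ngl: number of model layers to offload to the GPU
  --threads 16         # -t: CPU threads used for generation
  --model "$MODEL"     # -m: path to the quantized GGUF file
  --ctx-size 8192      # -c: context window size in tokens
  --n-predict 8192     # -n: maximum number of tokens to generate
  --flash-attn         # -fa: enable flash attention
)

# Print the command for review; drop the leading `echo` to start the server.
echo llama-server "${ARGS[@]}"
```

Keeping the options in an array means each flag carries its own comment and the model path stays safely quoted if it ever contains spaces.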