Update README.md
README.md
CHANGED
This is a quantized gguf version of [google/gemma-2-2b-jpn-it](https://huggingface.co/google/gemma-2-2b-jpn-it).

I hope it retains more Japanese support.

When [compared with the 4-bit quantized version of gemma-2-9b-it](https://huggingface.co/google/gemma-2-2b-jpn-it), we found that the perplexity score improved slightly.

# How to Use

There are many tools that support the gguf format, so use whichever you prefer. For example, here is how to use it with [llama.cpp](https://github.com/ggerganov/llama.cpp).

Please use a browser, because Japanese characters get garbled in the Windows 11 terminals (CMD, PowerShell).

Build llama.cpp according to the official manual.
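
As a rough sketch (the official build instructions are the authority here, and the exact steps depend on your compiler and platform), a typical CMake build looks like this:

```
# clone and build llama.cpp with CMake; Release binaries end up under build\bin\Release on Windows
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
```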

Specify the downloaded model and run the command below:

```
llama.cpp\build\bin\Release\llama-server -m .\gemma-2-9b-it-Q4_K_M-fp16.gguf
```
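
If needed, the server can be tuned with additional flags, for example `-c` for the context size and `--port` for the listening port. A hedged example (flag names can differ between llama.cpp versions, so check `llama-server --help`):

```
# assumption: -c (context size) and --port are supported by your llama-server build
llama.cpp\build\bin\Release\llama-server -m .\gemma-2-9b-it-Q4_K_M-fp16.gguf -c 4096 --port 8080
```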

Open http://127.0.0.1:8080 in your browser.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630469550907b9a115c91e62/PHli0VVox8bt6ziQoP02B.png)
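
If you prefer the command line to the browser UI, recent llama-server builds also expose an OpenAI-compatible chat endpoint. A minimal sketch, assuming such a build (quoting shown for CMD; PowerShell needs different escaping):

```
# send one chat message to the running server and print the JSON response
curl http://127.0.0.1:8080/v1/chat/completions -H "Content-Type: application/json" -d "{\"messages\": [{\"role\": \"user\", \"content\": \"こんにちは\"}]}"
```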

# Which model should I use?

Opinions vary from person to person, but here are some guidelines:

- Preferably Q4 or higher
- As large a model as memory allows (for example, about 70% of available memory)
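
If you want to compare quantizations yourself, for example to check the perplexity difference mentioned above, llama.cpp ships a perplexity tool. A sketch, assuming a recent build (older builds name the binary `perplexity`) and using a placeholder name for your plain-text evaluation file:

```
# lower perplexity on the same evaluation text generally indicates less quality loss
llama.cpp\build\bin\Release\llama-perplexity -m .\gemma-2-9b-it-Q4_K_M-fp16.gguf -f eval-text.txt
```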