
How to split a model

#6
by nib12345 - opened

Hi guys,
Does anyone have an idea?

How to split
mpt-30b-chat.ggmlv0.q4_1.bin

to

mpt-30b-chat.ggmlv0.q4_1_00001_of_00004.bin
mpt-30b-chat.ggmlv0.q4_1_00002_of_00004.bin
mpt-30b-chat.ggmlv0.q4_1_00003_of_00004.bin
mpt-30b-chat.ggmlv0.q4_1_00004_of_00004.bin

so it can be loaded on Kaggle (Kaggle has a RAM limit).

If anyone has an idea, please share.
I don't know much about developing models; I'm just a full-stack developer.
Thanks.

That's not possible: GGML does not support multi-part GGML files.

Using KoboldCpp you can offload some of the model to GPU (if you have one), which will reduce RAM usage accordingly.

But there's no GPU support for MPT GGML models from Python code at this time; it only works through the KoboldCpp UI.

nib12345 changed discussion status to closed
