byroneverson committed on
Commit 9dff2b0
1 Parent(s): 43e7302

Update README.md

Files changed (1)
  1. README.md +9 -4
README.md CHANGED
@@ -15,11 +15,16 @@ library_name: transformers
 
 
  NOTE: This is a current WIP (work in progress).
- This abliteration was 1/2 performed with llama-cpp-python (obtain direction vector) and 1/2 performed with torch (modify .safetensors one at a time).
+
+ Abliteration method:
+
+ 1. Obtain the refusal direction with llama-cpp-python.
+ 2. Perform orthogonalization with torch directly on the .safetensors files (one at a time).
+
  It is a rather large model, so it may take me another day or two to figure out which layer I should be using for the direction vector.
- This is round 1, layer 20 was used for the direction vector.
- I have not tested it yet and cannot test it until I can make a GGUF of this repo.
- From there I will determine if I need to change the layer used, etc.
+
+ First attempt: Layer 20 was used to obtain the refusal direction vector. Refusal mitigation partly worked but was not perfect.
+ Second attempt (current): Layer 23 was used (the midpoint of the model). The halfway point has proven to work with other models, so this should be fine.
 
  # gemma-2-27b-it-abliterated
  Check out the <a href="https://huggingface.co/byroneverson/gemma-2-27b-it-abliterated/blob/main/abliterate-gemma-2-27b-it.ipynb">jupyter notebook</a> for details of how this model was abliterated from gemma-2-27b-it.
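
The two steps named in the updated README (obtain a refusal direction, then orthogonalize the weights) can be sketched roughly as below. This is a minimal NumPy illustration, not the code from the linked notebook. It assumes the direction is the normalized difference between mean hidden states for two prompt sets, which is a common choice but is not stated in the commit. It also assumes each modified matrix writes into the residual stream with shape `(d_model, d_in)`. The function names `refusal_direction` and `orthogonalize` and the random stand-in activations are hypothetical.

```python
import numpy as np


def refusal_direction(harmful_acts: np.ndarray, harmless_acts: np.ndarray) -> np.ndarray:
    """Return a unit vector along the difference of mean activations.

    Assumption: both inputs have shape (n_prompts, d_model) and were
    captured at one chosen layer (layer 20 or 23 in the README's attempts).
    """
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def orthogonalize(weight: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Project the direction out of a matrix's output space.

    Assumption: weight has shape (d_model, d_in), so its output lives in
    the residual stream. Computes W - r (r^T W) for unit vector r.
    """
    return weight - np.outer(direction, direction @ weight)


# Random stand-ins for real hidden states and a real weight matrix.
rng = np.random.default_rng(0)
d_model, d_in = 8, 5
harmful = rng.normal(size=(16, d_model)) + 1.0
harmless = rng.normal(size=(16, d_model))

r = refusal_direction(harmful, harmless)
W = rng.normal(size=(d_model, d_in))
W_abl = orthogonalize(W, r)

# After orthogonalization the matrix can no longer write along r.
max_leak = float(np.abs(r @ W_abl).max())
print(f"largest remaining component along r: {max_leak:.2e}")
```

Processing one `.safetensors` shard at a time, as the README describes, would amount to applying `orthogonalize` to each relevant tensor in a shard and saving it before loading the next, which keeps peak memory near the size of a single shard.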