---
datasets:
- HuggingFaceH4/CodeAlpaca_20K
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- code
- LLaMa2
---

# LLaMaCoder

## Model Description

`LLaMaCoder` is based on the LLaMa2 7B language model, fine-tuned with LoRA adapters on the CodeAlpaca_20K dataset.

## Usage

Generate code with LLaMaCoder in 4-bit mode using the following Python snippet:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer
import torch

MODEL_NAME = "Sakuna/LLaMaCoderAll"
device = "cuda:0"

# 4-bit NF4 quantization config for memory-efficient inference
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Quantized models are placed on the GPU via device_map;
# calling .to(device) on a 4-bit model is not supported.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

prompt = "Write a Java program to calculate the factorial of a given number k"
prompt_text = f"{prompt}\n### Solution:\n"

inputs = tokenizer(prompt_text, return_tensors="pt").to(device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_length=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
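
## Fine-tuning Setup (illustrative)

This card only states that the model was LoRA fine-tuned from LLaMa2 7B; the actual training hyperparameters are not published here. The snippet below is a minimal sketch of a typical `peft` LoRA setup for a LLaMa-style model, with assumed rank, alpha, dropout, and target modules (`q_proj`, `v_proj`), not the values used to train this checkpoint.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

# Base checkpoint assumed to be the standard LLaMa2 7B release
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                  # illustrative rank
    lora_alpha=32,                         # illustrative scaling factor
    lora_dropout=0.05,                     # illustrative dropout
    target_modules=["q_proj", "v_proj"],   # common targets for LLaMa-style models
)

# Wrap the frozen base model with trainable LoRA adapters
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```

With this setup only the adapter weights are updated during training, which is what makes LoRA fine-tuning of a 7B model feasible on a single GPU.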