---
license: llama3
datasets:
- mzbac/glaive-function-calling-v2-llama-3-format
language:
- en
---

# Model

This model is fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct via mlx-lm.

**Note:** The glaive-function-calling-v2 dataset contains some invalid JSON and uses single quotes around the arguments' values. I have re-trained the model on cleaned-up data. If you encounter issues with the function-calling JSON format, you may try the new version here: https://huggingface.co/mzbac/llama-3-8B-Instruct-function-calling-v0.2

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "mzbac/llama-3-8B-Instruct-function-calling"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

tool = {
    "name": "search_web",
    "description": "Perform a web search for the given search terms.",
    "parameter": {
        "type": "object",
        "properties": {
            "search_terms": {
                "type": "array",
                "items": {"type": "string"},
                "description": "The search queries for which the search is performed.",
                "required": True,
            }
        },
    },
}

messages = [
    {
        "role": "system",
        "content": f"You are a helpful assistant with access to the following functions. Use them if required - {str(tool)}",
    },
    {"role": "user", "content": "Today's news in Melbourne, just for your information, today is April 27, 2014."},
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.1,
)
response = outputs[0]
print(tokenizer.decode(response))

# <|begin_of_text|><|start_header_id|>system<|end_header_id|>
# You are a helpful assistant with access to the following functions. Use them if required - {'name': 'search_web', 'description': 'Perform a web search for the given search terms.', 'parameter': {'type': 'object', 'properties': {'search_terms': {'type': 'array', 'items': {'type': 'string'}, 'description': 'The search queries for which the search is performed.', 'required': True}}}}<|eot_id|><|start_header_id|>user<|end_header_id|>
# Today's news in Melbourne, just for your information, today is April 27, 2014.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
# {"name": "search_web", "arguments": '{"search_terms": ["Melbourne news", "April 27, 2014"]}'}<|eot_id|>
```
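The assistant turn at the end of the decoded output is a Python-style dict whose `arguments` value is a JSON string wrapped in single quotes (the quirk mentioned in the note above). Below is a minimal parsing sketch for that format; the hard-coded `raw_call` string stands in for the final assistant turn, which you would extract from the decoded text yourself:

```python
import ast
import json

# Stand-in for the final assistant turn; in practice, slice it out of the
# decoded output (e.g. the text before the trailing <|eot_id|>).
raw_call = """{"name": "search_web", "arguments": '{"search_terms": ["Melbourne news", "April 27, 2014"]}'}"""

# The turn is a Python dict literal (note the single quotes around `arguments`),
# so ast.literal_eval parses it where json.loads would fail.
call = ast.literal_eval(raw_call)
arguments = json.loads(call["arguments"])

print(call["name"])               # search_web
print(arguments["search_terms"])  # ['Melbourne news', 'April 27, 2014']
```

If the arguments value is itself valid JSON (as in the v0.2 model linked above), a plain `json.loads` of the whole turn should work instead.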
## Training hyperparameters

lora_config.yaml

```yaml
# The path to the local model directory or Hugging Face repo.
model: "meta-llama/Meta-Llama-3-8B-Instruct"
# Whether or not to train (boolean)
train: true

# Directory with {train, valid, test}.jsonl files
data: "data"

# The PRNG seed
seed: 0

# Number of layers to fine-tune
lora_layers: 32

# Minibatch size.
batch_size: 1

# Iterations to train for.
iters: 6000

# Number of validation batches, -1 uses the entire validation set.
val_batches: 25

# Adam learning rate.
learning_rate: 1e-6

# Number of training steps between loss reporting.
steps_per_report: 10

# Number of training steps between validations.
steps_per_eval: 200

# Load path to resume training with the given adapter weights.
resume_adapter_file: null

# Save/load path for the trained adapter weights.
adapter_path: "adapters"

# Save the model every N iterations.
save_every: 1000

# Evaluate on the test set after training.
test: false

# Number of test set batches, -1 uses the entire test set.
test_batches: 100

# Maximum sequence length.
max_seq_length: 8192

# Use gradient checkpointing to reduce memory use.
grad_checkpoint: false

# LoRA parameters can only be specified in a config file.
lora_parameters:
  # The layer keys to apply LoRA to.
  # These will be applied for the last lora_layers
  keys: ['mlp.gate_proj', 'mlp.down_proj', 'self_attn.q_proj', 'mlp.up_proj', 'self_attn.o_proj', 'self_attn.v_proj', 'self_attn.k_proj']
  rank: 128
  alpha: 256
  scale: 10.0
  dropout: 0.05

# Schedule can only be specified in a config file, uncomment to use.
# lr_schedule:
#   name: cosine_decay
#   warmup: 100           # 0 for no warmup
#   warmup_init: 1e-7     # 0 if not specified
#   arguments: [1e-6, 1000, 1e-7]  # passed to scheduler
```
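This config is meant to be passed to mlx-lm's LoRA fine-tuning entry point (in recent versions, something like `python -m mlx_lm.lora --config lora_config.yaml`; check the CLI of your installed version). Below is a minimal sketch, not from the original card, of trying the resulting adapters with mlx-lm's Python API; the `adapters` path comes from `adapter_path` above, and the exact `load`/`generate` signatures may differ between mlx-lm releases:

```python
# Minimal sketch (assumes a recent mlx-lm): load the base model together with
# the LoRA adapters trained by the config above and run a quick generation.
from mlx_lm import load, generate

model, tokenizer = load(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    adapter_path="adapters",  # matches adapter_path in lora_config.yaml
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```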