# NeuralHermes-2.5-Mistral-7B

**Developer Portal:** <https://api.market/store/bridgeml/mlabonne>

<figure><img src="https://blog.api.market/wp-content/uploads/2024/06/qIhaFNM-1.png" alt=""><figcaption></figcaption></figure>

## NeuralHermes 2.5 - Mistral 7B

NeuralHermes is based on the [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) model that has been further fine-tuned with Direct Preference Optimization (DPO) using the [mlabonne/chatml\_dpo\_pairs](https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs) dataset. It surpasses the original model on most benchmarks (see results).

It is directly inspired by the RLHF process that the authors of [Intel/neural-chat-7b-v3-1](https://huggingface.co/Intel/neural-chat-7b-v3-1) described to improve performance. I used the same dataset and reformatted it to apply the ChatML template.
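ChatML wraps each conversation turn in `<|im_start|>` / `<|im_end|>` markers, with the role name on the first line. A minimal sketch of that reformatting step (the exact preprocessing in the linked notebook may differ):

```python
def to_chatml(system: str, prompt: str, response: str = "") -> str:
    """Format one conversation turn with the ChatML template used by OpenHermes.

    Leaving `response` empty produces a generation prompt that ends with the
    assistant header, ready for the model to complete.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n{response}"
    )

sample = to_chatml("You are a helpful assistant.", "hello")
```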

The code to train this model is available on [Google Colab](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing) and [GitHub](https://github.com/mlabonne/llm-course/tree/main). It required an A100 GPU for about an hour.

### Quantized models

* **GGUF**: <https://huggingface.co/TheBloke/NeuralHermes-2.5-Mistral-7B-GGUF>
* **AWQ**: <https://huggingface.co/TheBloke/NeuralHermes-2.5-Mistral-7B-AWQ>
* **GPTQ**: <https://huggingface.co/TheBloke/NeuralHermes-2.5-Mistral-7B-GPTQ>
* **EXL2**:
  * 3.0bpw: <https://huggingface.co/LoneStriker/NeuralHermes-2.5-Mistral-7B-3.0bpw-h6-exl2>
  * 4.0bpw: <https://huggingface.co/LoneStriker/NeuralHermes-2.5-Mistral-7B-4.0bpw-h6-exl2>
  * 5.0bpw: <https://huggingface.co/LoneStriker/NeuralHermes-2.5-Mistral-7B-5.0bpw-h6-exl2>
  * 6.0bpw: <https://huggingface.co/LoneStriker/NeuralHermes-2.5-Mistral-7B-6.0bpw-h6-exl2>
  * 8.0bpw: <https://huggingface.co/LoneStriker/NeuralHermes-2.5-Mistral-7B-8.0bpw-h8-exl2>

### Results

**Update:** NeuralHermes-2.5 became the best Hermes-based model on the Open LLM Leaderboard and one of the best 7B models overall. 🎉

[![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/yWe6VBFxkHiuOlDVBXtGo.png)](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/yWe6VBFxkHiuOlDVBXtGo.png)

Teknium (author of OpenHermes-2.5-Mistral-7B) benchmarked the model ([see his tweet](https://twitter.com/Teknium1/status/1729955709377503660)).

Results improved on every benchmark: **AGIEval** (from 43.07% to 43.62%), **GPT4All** (from 73.12% to 73.25%), and **TruthfulQA** (see the charts below).

#### AGIEval

[![](https://i.imgur.com/7an3B1f.png)](https://i.imgur.com/7an3B1f.png)

#### GPT4All

[![](https://i.imgur.com/TLxZFi9.png)](https://i.imgur.com/TLxZFi9.png)

#### TruthfulQA

[![](https://i.imgur.com/V380MqD.png)](https://i.imgur.com/V380MqD.png)

You can check the Weights & Biases project [here](https://wandb.ai/mlabonne/DPO/runs/axe71gr0?nw=nwusermlabonne).

### Training hyperparameters

**LoRA**:

* r=16
* lora\_alpha=16
* lora\_dropout=0.05
* bias="none"
* task\_type="CAUSAL\_LM"
* target\_modules=\['k\_proj', 'gate\_proj', 'v\_proj', 'up\_proj', 'q\_proj', 'o\_proj', 'down\_proj']

**Training arguments**:

* per\_device\_train\_batch\_size=4
* gradient\_accumulation\_steps=4
* gradient\_checkpointing=True
* learning\_rate=5e-5
* lr\_scheduler\_type="cosine"
* max\_steps=200
* optim="paged\_adamw\_32bit"
* warmup\_steps=100

**DPOTrainer**:

* beta=0.1
* max\_prompt\_length=1024
* max\_length=1536
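Collected as plain Python dicts, the hyperparameters above map onto the keyword arguments of `peft.LoraConfig`, `transformers.TrainingArguments`, and `trl.DPOTrainer` as used in the linked notebook; the dicts below are only a convenient summary of the values listed, not the training script itself.

```python
# LoRA adapter configuration (kwargs for peft.LoraConfig)
lora_config = dict(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["k_proj", "gate_proj", "v_proj", "up_proj",
                    "q_proj", "o_proj", "down_proj"],
)

# Optimizer / schedule (kwargs for transformers.TrainingArguments)
training_args = dict(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    optim="paged_adamw_32bit",
    warmup_steps=100,
)

# DPO-specific settings (kwargs for trl.DPOTrainer)
dpo_config = dict(
    beta=0.1,                # strength of the KL penalty toward the reference model
    max_prompt_length=1024,  # tokens reserved for the prompt
    max_length=1536,         # prompt + completion budget
)

# Effective batch size seen by the optimizer per device:
effective_batch = (training_args["per_device_train_batch_size"]
                   * training_args["gradient_accumulation_steps"])  # 16
```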

[Source](https://github.com/mlabonne)

### Request and Response

#### Request

{% code overflow="wrap" %}

```bash
curl -X 'POST' \
  'https://prod.api.market/api/v1/bridgeml/mlabonne/bridgeml/mlabonne' \
  -H 'accept: application/json' \
  -H 'x-api-market-key: API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
  "messages": [
    {
      "role": "user",
      "content": "hello"
    },
    {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    }
  ],
  "temperature": 1,
  "max_tokens": 256,
  "top_p": 1,
  "frequency_penalty": 0,
  "stream": false
}'
```

{% endcode %}
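For reference, the same request can be sent from Python using only the standard library. The endpoint, headers, and field names below are copied from the cURL example; `API_KEY` is a placeholder for your own key.

```python
import json
import urllib.request

API_URL = "https://prod.api.market/api/v1/bridgeml/mlabonne/bridgeml/mlabonne"

def build_payload(messages, temperature=1, max_tokens=256,
                  top_p=1, frequency_penalty=0, stream=False):
    """Assemble the JSON body shown in the cURL example."""
    return {
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "frequency_penalty": frequency_penalty,
        "stream": stream,
    }

def chat(api_key, messages, **kwargs):
    """POST the payload and return the decoded JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(messages, **kwargs)).encode("utf-8"),
        headers={
            "accept": "application/json",
            "x-api-market-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Usage: `chat("API_KEY", [{"role": "user", "content": "hello"}])`.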

#### Response

{% code overflow="wrap" %}

```json
{
  "id": "mlabonne/NeuralHermes-2.5-Mistral-7B-eab3ca77-e1d0-41cb-b43e-e8195af77dc7",
  "object": "text_completion",
  "created": 1718905083,
  "model": "mlabonne/NeuralHermes-2.5-Mistral-7B",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "It seems like you have a question or need assistance with something. Please feel free to provide more information or context so I can do my best to help you.",
        "tool_calls": null,
        "tool_call_id": null
      },
      "index": 0,
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 76,
    "completion_tokens": 33,
    "total_tokens": 109
  }
}
```

{% endcode %}
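The reply text lives at `choices[0].message.content`, and token accounting under `usage`. A small helper, assuming the response shape shown above:

```python
def extract_reply(response: dict) -> tuple:
    """Return the assistant's text and total token count from an API response."""
    text = response["choices"][0]["message"]["content"]
    total_tokens = response["usage"]["total_tokens"]
    return text, total_tokens

# Trimmed-down version of the sample response above, for illustration.
sample = {
    "choices": [{"message": {"role": "assistant", "content": "Hi there!"},
                 "index": 0, "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 76, "completion_tokens": 33, "total_tokens": 109},
}
```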

You can try this easy-to-use, low-cost LLM API at <https://api.market/store/bridgeml/mlabonne>.
