Recommended sampler?

#4
by mratsim - opened

There is no mention of sampler settings in the model card or in generation_config.json, apart from a temperature of 0.1 in the example snippets, which seems very low.

Without explicit settings, samplers will default to a temperature of 1 and top-k 40 in vLLM and SGLang, I think.

But in llama.cpp it will be:

  • temperature: 0.8
  • top k: 40
  • top p: 0.9
  • min p: 0.1

potentially leading to a non-ideal experience for users.

The default llama.cpp sampling parameters are, in this order:

  1. top-k 40
  2. top-p 0.95
  3. min-p 0.05
  4. temperature 0.80

I agree it would be nice to know the recommended samplers for this model.

It is not just the parameters but also the order in which they are applied that matters. No one ever specifies that.
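To make the point about ordering concrete, here is a minimal sketch of the default llama.cpp chain listed above (top-k, then top-p, then min-p, then temperature) applied to raw logits; this is an illustrative reimplementation, not llama.cpp's actual code, and it omits the other samplers (typical-p, penalties, etc.) that the real chain supports.

```python
import math
import random

def sample_llamacpp_defaults(logits, top_k=40, top_p=0.95, min_p=0.05,
                             temperature=0.8, rng=None):
    """Sketch of llama.cpp's default sampler order:
    top-k -> top-p -> min-p -> temperature, then draw a token id."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility

    def softmax(cands):
        m = max(l for _, l in cands)
        exps = [(i, math.exp(l - m)) for i, l in cands]
        z = sum(e for _, e in exps)
        return [(i, e / z) for i, e in exps]

    # Pair token ids with logits, sorted by logit descending.
    candidates = sorted(enumerate(logits), key=lambda kv: kv[1], reverse=True)

    # 1. top-k: keep only the k highest-logit tokens.
    candidates = candidates[:top_k]

    # 2. top-p (nucleus): keep the smallest prefix whose mass >= top_p.
    probs = softmax(candidates)
    kept, cum = set(), 0.0
    for i, p in probs:
        kept.add(i)
        cum += p
        if cum >= top_p:
            break
    candidates = [c for c in candidates if c[0] in kept]

    # 3. min-p: drop tokens with prob < min_p * (max prob), renormalized.
    probs = softmax(candidates)
    pmax = max(p for _, p in probs)
    keep_ids = {i for i, p in probs if p >= min_p * pmax}
    candidates = [c for c in candidates if c[0] in keep_ids]

    # 4. temperature: rescale surviving logits, re-softmax, then sample.
    probs = softmax([(i, l / temperature) for i, l in candidates])
    r, cum = rng.random(), 0.0
    for i, p in probs:
        cum += p
        if r <= cum:
            return i
    return probs[-1][0]
```

Swapping the order (e.g. applying temperature before top-p) changes which tokens survive the probability-based filters, which is why the sequence matters as much as the values.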

Mistral AI org

0.1 is the temperature we recommend for an Instruct checkpoint. If you'd like more creative responses, and perhaps to leave more room for the model's thinking trace, you can try increasing it.

Depending on your use case it is worth trying out different values, but starting low is most likely the most appropriate.

@patrickvonplaten updated the model card with this:

Recommended Settings

  • Reasoning Effort:
    • 'none' → do not use reasoning
    • 'high' → use reasoning; recommended for complex prompts (reasoning_effort="high")
  • Temperature: 0.7 for reasoning_effort="high"; between 0.0 and 0.7 for reasoning_effort="none", depending on the task.
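The recommended settings above could be wired into a request for an OpenAI-compatible server (e.g. vLLM or llama.cpp's llama-server) roughly like this; the model name is a placeholder, and how `reasoning_effort` is actually forwarded to the chat template varies by serving stack, so treat this as a sketch rather than a confirmed API.

```python
def build_request(prompt: str, reasoning_effort: str = "high") -> dict:
    """Sketch of a chat-completions request body following the model
    card guidance: temperature 0.7 with reasoning, and a low value
    (0.0 used here as a conservative pick from the 0.0-0.7 range)
    without reasoning."""
    temperature = 0.7 if reasoning_effort == "high" else 0.0
    return {
        "model": "placeholder-model-name",  # assumption: replace with the served model id
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        # Assumption: shown as a top-level field for illustration only;
        # some stacks pass reasoning_effort as a chat-template kwarg instead.
        "reasoning_effort": reasoning_effort,
    }
```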

Hope it can help!
