Recommended sampler?
There is no mention of a sampler in the model card or in generation_config.json, apart from temp 0.1 in the example snippets, which seems very low.
Without explicit settings, I think samplers will default to temp 1 and top-k 40 in vLLM and SGLang.
But in llama.cpp it will be:
- temperature: 0.8
- top k: 40
- top p: 0.9
- min p: 0.1
potentially leading to a non-ideal experience for people
The default llama.cpp sampling parameters are, in this order:
- top-k 40
- top-p 0.95
- min-p 0.05
- temperature 0.80
I agree it would be nice to know the recommended samplers for this model.
It is not just parameters, but also the sequence of application that matters. No one ever specifies that.
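To make the ordering point concrete, here is a minimal NumPy sketch of a sampler chain applied in the order listed above (top-k → top-p → min-p → temperature). The exact filtering rules in llama.cpp differ in detail, so treat this as an illustration of why order matters, not a reimplementation:

```python
import numpy as np

def sample_chain(logits, top_k=40, top_p=0.95, min_p=0.05, temperature=0.8, seed=0):
    """Apply top-k, then top-p, then min-p, then temperature, in that order."""
    logits = np.asarray(logits, dtype=np.float64)
    order = np.argsort(-logits)  # token ids, most to least likely

    # 1. top-k: keep only the k highest-logit tokens
    keep = order[:top_k]

    # 2. top-p: keep the smallest prefix whose cumulative probability >= top_p
    p = np.exp(logits[keep] - logits[keep].max())
    p /= p.sum()
    cutoff = int(np.searchsorted(np.cumsum(p), top_p)) + 1
    keep, p = keep[:cutoff], p[:cutoff]

    # 3. min-p: drop tokens whose prob is below min_p * (prob of best token)
    p /= p.sum()
    keep = keep[p >= min_p * p.max()]

    # 4. temperature: rescale the surviving logits, renormalize, sample
    z = logits[keep] / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return int(np.random.default_rng(seed).choice(keep, p=p))
```

Swapping, say, min-p before top-p changes which tokens survive, which is exactly why the application order should be documented alongside the values.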
0.1 is the temperature we recommend for an Instruct checkpoint. If you'd like more creative responses, and maybe want to leave more room for the model's thinking trace, you can try increasing it.
Depending on your use case it can be worth trying different values, but starting low is most likely the most appropriate.
@patrickvonplaten updated the model card with this:
Recommended Settings
- Reasoning Effort:
  - `'none'`: Do not use reasoning
  - `'high'`: Use reasoning (recommended for complex prompts)

  Use `reasoning_effort="high"` for complex tasks.
- Temperature: 0.7 for `reasoning_effort="high"`. Temperature between 0.0 and 0.7 for `reasoning_effort="none"`, depending on the task.
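If the model is served behind an OpenAI-compatible endpoint (e.g. via `vllm serve`), both knobs can be set per request. The URL and model name below are placeholders, and `reasoning_effort` is only honored by servers that implement that field of the OpenAI Chat Completions API:

```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MODEL_NAME",
    "messages": [{"role": "user", "content": "Your complex prompt here"}],
    "temperature": 0.7,
    "reasoning_effort": "high"
  }'
```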
Hope it can help!