Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated-mlx-bf16
MLX-VLM bf16 export of huihui-ai/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated for Apple Silicon workflows, including LM Studio and local mlx_vlm usage.
Overview
- Variant: bf16
- Repository payload at upload time: 9.1 GB
- Repository file count: 10
- Format: mlx-vlm model package
Compatibility
- Uses the corrected Qwen VL chat template with image token placeholders.
- Uses <|im_end|>-compatible stop token settings for cleaner chat termination in MLX/LM Studio.
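As a quick local sanity check of the stop-token claim above, one can inspect the exported tokenizer config. The snippet below is a minimal sketch that assumes standard Hugging Face-style `tokenizer_config.json` keys (`eos_token`, `added_tokens_decoder`); the exact keys in a given export may differ.

```python
def declares_im_end(tokenizer_config: dict) -> bool:
    """Return True if <|im_end|> is the eos token or appears among the added tokens."""
    eos = tokenizer_config.get("eos_token")
    if isinstance(eos, dict):  # some exports wrap the token in an object
        eos = eos.get("content")
    if eos == "<|im_end|>":
        return True
    added = tokenizer_config.get("added_tokens_decoder", {})
    return any(tok.get("content") == "<|im_end|>" for tok in added.values())

# A real check would json.load the tokenizer_config.json from the model directory
# and pass the resulting dict to declares_im_end.
```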
Validation
- Local text generation smoke test: passed
- Local image-input smoke test: passed
- Local black-box abliterated check: 6/6 non-refused
- Refusal rate: 0.0
- Actionable non-refused cases: 6
- Median cleaned response length: 559 chars
Behavior Notes
- This variant preserved the abliterated behavior on the local 6-case regression set used during conversion validation.
- These checks are behavioral acceptance tests, not a formal guarantee of identical outputs to the source checkpoint.
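A black-box refusal check of this kind can be approximated with simple phrase matching. The sketch below is illustrative only: the refusal marker list, the cleaning step, and the 6-case prompt set actually used during conversion validation are not published in this card, so every specific below is an assumption.

```python
import statistics

# Hypothetical refusal markers; the real phrase list used in validation is not published.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "as an ai")

def clean(text: str) -> str:
    # Strip a leading <think>...</think> block so scoring uses the final answer,
    # not the thinking trace (see the --processor-kwargs note in Usage).
    if "</think>" in text:
        text = text.split("</think>", 1)[1]
    return text.strip()

def score(responses):
    cleaned = [clean(r) for r in responses]
    refused = [r for r in cleaned if any(m in r.lower() for m in REFUSAL_MARKERS)]
    return {
        "refusal_rate": len(refused) / len(cleaned),
        "non_refused": len(cleaned) - len(refused),
        "median_len": statistics.median(len(r) for r in cleaned),
    }
```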
Usage
```shell
python -m mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated-mlx-bf16 \
  --prompt "你好" \
  --max-tokens 256
```
For behavior-focused checks, it is safer to disable thinking output so refusal scoring is based on the final answer instead of the thinking trace.
```shell
python -m mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated-mlx-bf16 \
  --prompt "你好" \
  --max-tokens 256 \
  --processor-kwargs '{"enable_thinking": false}'
```
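For scripting repeated smoke tests, the invocations above can be assembled programmatically. `build_generate_cmd` below is a hypothetical helper (not part of mlx-vlm) that only constructs the argument list shown above; pass the result to `subprocess.run` to execute it.

```python
import json

def build_generate_cmd(model_path, prompt, max_tokens=256, enable_thinking=None):
    # Assemble the python -m mlx_vlm.generate invocation used above.
    cmd = [
        "python", "-m", "mlx_vlm.generate",
        "--model", model_path,
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
    ]
    if enable_thinking is not None:
        # Matches the --processor-kwargs JSON form shown above.
        cmd += ["--processor-kwargs", json.dumps({"enable_thinking": enable_thinking})]
    return cmd
```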
LM Studio
- If LM Studio has an older cached copy, refresh or re-download the repository so the latest chat template and config are picked up.
- These repositories are meant for mlx-vlm / Apple MLX runtimes rather than Transformers CPU inference.
Model tree for vanch007/Huihui-Qwen3.5-4B-Claude-4.6-Opus-abliterated-mlx-bf16
- Base model: Qwen/Qwen3.5-4B-Base
- Finetuned: Qwen/Qwen3.5-4B