# Qwen3.5-122B-A10B-abliterated-4bit-vlm-mlx-cs2764-pass2
4-bit MLX VLM release of a Qwen3.5-122B-A10B abliterated checkpoint, with an additional broad `cs2764/mlx-abliteration` pass applied directly to the MLX model.
A newer, more speed-balanced variant (the `final` repo, compared below) is also available.
## Summary

- Architecture: `qwen3_5_moe` / `Qwen3_5MoeForConditionalGeneration`
- Modality: vision-language model
- Quantization: 4-bit MLX (`group_size=64`, `mode=affine`)
- Model size on disk: about 65.1 GB
- Weight shards: 14
- Toolkit used for the extra ablation pass: `cs2764/mlx-abliteration`
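As a rough sanity check on the reported size (my own back-of-envelope arithmetic, not taken from the repo): 4-bit affine quantization with `group_size=64` typically stores a 16-bit scale and a 16-bit bias per group of 64 weights, i.e. about 4.5 effective bits per weight.

```python
def quantized_gib(n_params: float, bits: int = 4, group_size: int = 64) -> float:
    """Approximate on-disk size in GB of an n_params model quantized to `bits`,
    assuming a 16-bit scale and 16-bit bias per quantization group."""
    per_group_overhead_bits = 16 + 16  # scale + bias for each group
    bits_per_weight = bits + per_group_overhead_bits / group_size
    return n_params * bits_per_weight / 8 / 1e9

estimate = quantized_gib(122e9)
print(f"{estimate:.1f} GB")  # about 68.6 GB before per-tensor exceptions
```

The estimate lands somewhat above the reported 65.1 GB, which is plausible since not every tensor in the checkpoint need follow this exact scheme.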
## When To Use This Repo vs the Newer `final` Repo
| Aspect | This `pass2` repo | Newer `final` repo |
|---|---|---|
| Extra ablation scope | broader pass over layers 0..47 | targeted hot-path pass on layers 36..47 |
| Goal | maximize refusal reduction | better speed/behavior tradeoff |
| Local behavior spot checks | stronger refusal weakening | milder than `pass2`, but still weaker refusal than the direct 4-bit base |
| Local short-generation speed | about 12.6 tok/s | about 24.0 tok/s |

Recommendation:

- Use this `pass2` repo if you want the more aggressive refusal-removal behavior.
- Use the newer `final` repo if you want a faster model with a narrower, targeted extra pass.
## Lineage
The release chain for this repository is:
- Foundation model: Qwen/Qwen3.5-122B-A10B
- Input checkpoint: a user-supplied abliterated Qwen3.5-122B-A10B VLM checkpoint
- MLX VLM conversion: quantized to 4-bit MLX while preserving `vision_config`, `preprocessor_config.json`, `processor_config.json`, and `video_preprocessor_config.json`
- Extra MLX abliteration pass: run with `cs2764/mlx-abliteration` on the converted MLX model
## What Is In This Repo
This repository contains a complete MLX VLM checkpoint:
- `config.json` with `vision_config`
- `model.safetensors.index.json`
- `model-00001-of-00014.safetensors` through `model-00014-of-00014.safetensors`
- `tokenizer.json`, `tokenizer_config.json`, `vocab.json`
- `preprocessor_config.json`, `processor_config.json`, `video_preprocessor_config.json`
- `abliteration_log.json` for the pass-2 MLX abliteration run
Note that `ablation_meta.json` is inherited from the input checkpoint. For the extra MLX pass in this repository, use `abliteration_log.json` and this model card as the authoritative record.
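The 14 shard files follow the standard safetensors multi-file naming scheme; a small helper (hypothetical, useful for sanity-checking a partially completed download) can generate the expected file names:

```python
def shard_names(total: int) -> list[str]:
    """Expected safetensors shard names for a checkpoint split into `total` files."""
    return [f"model-{i:05d}-of-{total:05d}.safetensors" for i in range(1, total + 1)]

names = shard_names(14)
print(names[0], names[-1])
# model-00001-of-00014.safetensors model-00014-of-00014.safetensors
```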
## Pass-2 Abliteration Configuration
The additional MLX abliteration pass was run with the following settings:
| Parameter | Value |
|---|---|
| Toolkit | `cs2764/mlx-abliteration` |
| Refusal vector policy | per-layer |
| Ablation vector source | per-layer |
| Ablation strength | 2.0 |
| Refusal direction method | projected |
| Probed layers | 0..47 |
| Adaptive search | False |
| Attention only | False |
| MoE safe mode | True |
| Timestamp | 2026-03-07T02:40:35Z |
This run was done on the MLX model, not on a raw Transformers checkpoint.
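For reference, the same settings expressed as a Python dict (the key names here are illustrative only; the actual schema of `abliteration_log.json` is whatever `cs2764/mlx-abliteration` writes):

```python
# Pass-2 settings from the table above. Key names are hypothetical,
# chosen for readability rather than to match any on-disk schema.
PASS2_CONFIG = {
    "toolkit": "cs2764/mlx-abliteration",
    "refusal_vector_policy": "per-layer",
    "ablation_vector_source": "per-layer",
    "ablation_strength": 2.0,
    "refusal_direction_method": "projected",
    "probed_layers": list(range(0, 48)),  # layers 0..47 inclusive
    "adaptive_search": False,
    "attention_only": False,
    "moe_safe_mode": True,
    "timestamp": "2026-03-07T02:40:35Z",
}
```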
## Compatibility
This repository is for MLX / Apple Silicon usage. It is not a standard Transformers-only release.
Verified locally:
- `mlx_vlm.load(..., lazy=True)` loads successfully
- `processor_class = Qwen3VLProcessor`
- `has_vision_tower = True`
- `config.json` retains `vision_config`
## Usage
### Python with mlx-vlm
```python
from mlx_vlm import load, generate

model, processor = load(
    "vanch007/Qwen3.5-122B-A10B-abliterated-4bit-vlm-mlx-cs2764-pass2",
    lazy=True,
)

result = generate(
    model,
    processor,
    prompt="Describe the image briefly.",
    image="/absolute/path/to/example.jpg",
    max_tokens=128,
    verbose=False,
)
print(result.text)
```
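mlx-vlm also ships a command-line generate entry point; a typical invocation would look like the following (flag names should be checked against your installed `mlx-vlm` version):

```shell
python -m mlx_vlm.generate \
  --model vanch007/Qwen3.5-122B-A10B-abliterated-4bit-vlm-mlx-cs2764-pass2 \
  --prompt "Describe the image briefly." \
  --image /absolute/path/to/example.jpg \
  --max-tokens 128
```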
## Safety Notice
This repository contains a model with reduced refusal behavior. It may produce harmful, offensive, or unsafe content.
Do not use it in consumer-facing or safety-sensitive systems without independent safety controls.
You are responsible for ensuring compliance with applicable law, policy, and platform rules.