Qwen3.5-122B-A10B-abliterated-4bit-vlm-mlx-cs2764-pass2

4-bit MLX VLM release of a Qwen3.5-122B-A10B abliterated checkpoint, with an additional broad cs2764/mlx-abliteration pass applied on the MLX model.

A newer, more speed-balanced variant (the final repo compared below) is also available.

Summary

  • Architecture: qwen3_5_moe / Qwen3_5MoeForConditionalGeneration
  • Modality: vision-language model
  • Quantization: 4-bit MLX (group_size=64, mode=affine)
  • Model size on disk: about 65.1 GB
  • Weight shards: 14
  • Toolkit used for the extra ablation pass: cs2764/mlx-abliteration
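The quoted on-disk size can be sanity-checked with simple arithmetic. The sketch below is an estimate, not a byte-accurate accounting: it assumes each 64-weight group stores a 16-bit scale and a 16-bit bias alongside its 4-bit values, and that all 122B parameters are quantized that way. The real checkpoint lands a bit under this figure because not every tensor uses exactly this layout.

```python
# Back-of-envelope size estimate for 4-bit affine quantization with
# group_size=64: each group of 64 weights carries a 16-bit scale and a
# 16-bit bias in addition to the 4-bit quantized values.
def bits_per_weight(bits=4, group_size=64, scale_bits=16, bias_bits=16):
    return bits + (scale_bits + bias_bits) / group_size

bpw = bits_per_weight()            # 4.5 bits per quantized weight
est_gb = 122e9 * bpw / 8 / 1e9     # ~68.6 GB if every tensor were quantized
print(f"{bpw} bits/weight, roughly {est_gb:.1f} GB upper estimate")
```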

When to Use This pass2 Repo vs. the Newer final Repo

| Aspect | This pass2 repo | Newer final repo |
|---|---|---|
| Extra ablation scope | broader pass over layers 0..47 | targeted hot-path pass on layers 36..47 |
| Goal | maximize refusal reduction | better speed/behavior tradeoff |
| Local behavior spot checks | stronger refusal weakening | milder than pass2, but still weaker refusal than the direct 4-bit base |
| Local short-generation speed | about 12.6 tok/s | about 24.0 tok/s |
Recommendation:
  • Use this pass2 repo if you want the more aggressive refusal-removal behavior.
  • Use the newer final repo if you want a faster model with a narrower targeted extra pass.
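The tok/s figures above came from local short-generation spot checks. A minimal, model-agnostic timing harness for reproducing that kind of number might look like the sketch below; `tokens_per_second` and the callable wrapper are illustrative names, and wiring mlx_vlm's `generate` into the callable (returning the token count it produced) is left to the caller.

```python
import time

def tokens_per_second(generate_fn, max_tokens=64):
    """Time one short generation and return tokens/sec.

    generate_fn(max_tokens) must run a generation and return the number
    of tokens actually produced."""
    start = time.perf_counter()
    n_tokens = generate_fn(max_tokens)
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed
```

Because short runs are dominated by prompt processing and warm-up, averaging a few repeats after a warm-up call gives a steadier figure.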

Lineage

The release chain for this repository is:

  1. Foundation model: Qwen/Qwen3.5-122B-A10B
  2. Input checkpoint: a user-supplied abliterated Qwen3.5-122B-A10B VLM checkpoint
  3. MLX VLM conversion: quantized to 4-bit MLX while preserving vision_config, preprocessor_config.json, processor_config.json, and video_preprocessor_config.json
  4. Extra MLX abliteration pass: run with cs2764/mlx-abliteration on the converted MLX model

What Is In This Repo

This repository contains a complete MLX VLM checkpoint:

  • config.json with vision_config
  • model.safetensors.index.json
  • model-00001-of-00014.safetensors through model-00014-of-00014.safetensors
  • tokenizer.json, tokenizer_config.json, vocab.json
  • preprocessor_config.json, processor_config.json, video_preprocessor_config.json
  • abliteration_log.json for the pass-2 MLX abliteration run

Note that ablation_meta.json is inherited from the input checkpoint. For the extra MLX pass in this repository, use abliteration_log.json and this model card as the authoritative record.

Pass-2 Abliteration Configuration

The additional MLX abliteration pass was run with the following settings:

| Parameter | Value |
|---|---|
| Toolkit | cs2764/mlx-abliteration |
| Refusal vector policy | per-layer |
| Ablation vector source | per-layer |
| Ablation strength | 2.0 |
| Refusal direction method | projected |
| Probed layers | 0..47 |
| Adaptive search | False |
| Attention only | False |
| MoE safe mode | True |
| Timestamp | 2026-03-07T02:40:35Z |

This run was done on the MLX model, not on a raw Transformers checkpoint.
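A "projected" refusal direction method with a strength parameter corresponds, in the common directional-ablation formulation, to removing each weight matrix's output component along a refusal direction; at strength 1.0 the component is zeroed, and at strength 2.0 (as configured here) it is reflected. The NumPy sketch below illustrates that general formulation only; it is not the toolkit's actual implementation.

```python
import numpy as np

def ablate(W, v, strength=2.0):
    """Project the refusal direction v out of weight matrix W.

    W maps inputs to the residual stream, so its output rows live in the
    same space as v. strength=1.0 removes the component along v;
    strength=2.0 reflects it."""
    v = v / np.linalg.norm(v)           # unit refusal direction
    return W - strength * np.outer(v, v) @ W
```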

Compatibility

This repository is for MLX / Apple Silicon usage. It is not a standard Transformers-only release.

Verified locally:

  • mlx_vlm.load(..., lazy=True) loads successfully
  • processor_class = Qwen3VLProcessor
  • has_vision_tower = True
  • config.json retains vision_config

Usage

Python with mlx-vlm

```python
from mlx_vlm import load, generate

model, processor = load(
    "vanch007/Qwen3.5-122B-A10B-abliterated-4bit-vlm-mlx-cs2764-pass2",
    lazy=True,
)

result = generate(
    model,
    processor,
    prompt="Describe the image briefly.",
    image="/absolute/path/to/example.jpg",
    max_tokens=128,
    verbose=False,
)

print(result.text)
```

Safety Notice

This repository contains a model with reduced refusal behavior. It may produce harmful, offensive, or unsafe content.

Do not use it in consumer-facing or safety-sensitive systems without independent safety controls.

You are responsible for ensuring compliance with applicable law, policy, and platform rules.
