Quant optimized for both quality and speed on a Strix Halo 128 GiB system. It may also be beneficial on DGX Spark and similar systems.
TL;DR: this quant achieves both better quality and higher speed than a homogeneous Q6_K quant.
See the GLM version for more details on theory and comparisons.