March 5 updates compared to Feb ones?

#11

by tnuvkeg - opened 13 days ago

I see you guys are re-uploading all quants.
I wonder why UD-Q5_K_XL is currently about 7 GB bigger than the February one. Should I update it? (Although I will probably need to go for Q4 instead.)

edit: thank you for all the quants and updates!!!

langezwieper

13 days ago

edited 13 days ago

> I see you guys are re-uploading all quants.
> I wonder why UD-Q5_K_XL is currently about 7 GB bigger than the February one. Should I update it? (Although I will probably need to go for Q4 instead.)
>
> edit: thank you for all the quants and updates!!!

I updated yesterday but couldn't fit it anymore, so I did some testing and ended up on Q5_K_S, which at least doesn't perform worse than the previous Q5_K_XL and leaves a bit more VRAM free. So I went with that. But I was wondering the same.

danielhanchen

Unsloth AI org 13 days ago

Hey, yes - we were going to release benchmarks today; hopefully soon!

sofiageo

13 days ago

I downloaded UD-Q5_K_XL yesterday for my 128 GB system and I really like it so far. I have no experience with the old files though; this is the first time I've used it.

igor255

13 days ago

edited 13 days ago

Yesterday there were more quants. I downloaded iq3_s (also a fresh upload), which works nicely on my 64 GB setup. Are they going to be re-uploaded?

tnuvkeg

13 days ago

> I downloaded UD-Q5_K_XL yesterday for my 128GB system and I really like it so far. I have no experience with the old files though, first time I use it

that's actually one of the "old" ones, since the current ones are only 2 hours old :D

danielhanchen

Unsloth AI org 13 days ago

Yes, sorry, they'll be back up - we had some uploading issues.

qcsmire

13 days ago

I had some issues with the Q4_xx quants and ik_llama.cpp, where the model became corrupt after some usage.
Then I downloaded the UD_Q5_K_XL, and I have not had any issues whatsoever.
Has anyone else seen the same on an RTX6000 with 96 GB VRAM?
Are there any changes in the new Q4_xxx quants that might change things?

danielhanchen

Unsloth AI org 13 days ago

The new ones will be the final ones for this line of updates, so they should all be available now - sorry for the issues and the delay!

igor255

13 days ago

edited 13 days ago

Actually, the one I downloaded yesterday was q3_k_s, and it had an iq3_s/iq3_xxs mix for the experts. I chose this mix based on your research showing that iq3_xxs provides a large benefit for expert weights over q2. Restoring it would be good if possible. It was 52 GB in size (which leaves a good margin for running the OS on 64 GB).

Update: I can see it was restored - many thanks!

danielhanchen

Unsloth AI org 13 days ago

edited 13 days ago

Hey folks, see https://www.reddit.com/r/LocalLLaMA/comments/1rlkptk/final_qwen35_unsloth_gguf_update/

sofiageo

13 days ago

> > I downloaded UD-Q5_K_XL yesterday for my 128GB system and I really like it so far. I have no experience with the old files though, first time I use it
>
> that's actually one of the "old" ones, since the current ones are only 2 hours old :D

I just verified that the files from yesterday have the same sha256 hashes that are in the commit, so I believe it's ok! :)
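
For anyone who wants to repeat that kind of check, here is a minimal sketch of hashing a local file and comparing it to a published digest. The function names and the file path in the usage note are hypothetical, and the expected hash has to be copied by hand from the repo's file page or commit; this is not an official Unsloth or Hugging Face tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks
    so that multi-GB GGUF files never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the local digest with a digest pasted from the repo.
    Whitespace and letter case in the pasted value are ignored."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Usage would look like `matches_expected("model.gguf", "<sha256 copied from the repo>")`, where `model.gguf` stands in for whatever file you downloaded. A `True` result means the local file is byte-identical to the one the digest was computed from.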

qcsmire

12 days ago

An update on the repetition issue: it turns out to be an issue in ik_llama.cpp. After switching to llama.cpp this morning (after they fixed a problem with VRAM allocation), everything works VERY well!
It is stable, and the Q5_K_M fits very nicely on my RTX6000 PRO at about 75 tokens/second!

ttruong1889

8 days ago

Is there any way I can get the pre-March 5 quants?

BingoBird

7 days ago

Good job unsloth devs. You can hide in my basement.
