3 Comments
Feb 14 · Liked by Benjamin Marie

Nice piece, as usual!

Am I reading the results correctly? I see that AWQ's perplexity is only slightly higher.


Yes, that's correct. AWQ is slightly but consistently behind, except for the 7B 3-bit model, where the gap is larger.

Feb 12 · Liked by Benjamin Marie

vLLM recently optimized its use of AWQ. I wonder if/when they'll do the same for SqueezeLLM. https://github.com/vllm-project/vllm/pull/2566
