
Unexpected perplexity of Baichuan2 #4670

Closed · QwertyJack opened this issue Dec 28, 2023 · 2 comments

@QwertyJack commented Dec 28, 2023

Version: llama.cpp@f679349, tag: b1708
Model: Baichuan2-13B-Chat
GPU: T4

Summary

I want to deploy Baichuan2 with a suitable quantization, but perplexity testing shows confusing results. Advice would be appreciated.

Steps to reproduce

# llama.cpp@f679349 compiled with `-DLLAMA_CUBLAS=1 -DLLAMA_CUDA_MMV_Y=4`

# Convert the HF checkpoint to a GGUF file named "f16".
/app/convert-hf-to-gguf.py --outfile f16 /models/Baichuan2-13B-Chat

# Quantize to each target type, then measure wikitext-2 perplexity.
for quant in f16 Q4_0 Q4_1 Q4_K_S Q4_K_M Q5_0 Q5_1 Q5_K_S Q5_K_M Q6_K Q8_0
do
    [ "$quant" = f16 ] || /app/quantize f16 "$quant" "$quant"
    /app/perplexity -m "$quant" -f /path/to/wikitext-2-raw/wiki.test.raw -ngl 64
done
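
The byte sizes in the table below are presumably the sizes of the GGUF output files; a minimal sketch to collect them (assumes the files from the loop above sit in the current directory and GNU `stat` is available):

# Sketch: print the size in bytes of each quantized GGUF file
# produced by the loop above (GNU coreutils `stat`).
for quant in f16 Q4_0 Q4_1 Q4_K_S Q4_K_M Q5_0 Q5_1 Q5_K_S Q5_K_M Q6_K Q8_0
do
    printf '%-8s %s\n' "$quant" "$(stat -c %s "$quant")"
done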

Result

Quant    Size (bytes)   Perplexity
Fp16     27797048992    10.5931 +/- 0.10915
Q4_0      7987103456    10.6613 +/- 0.10784
Q4_1      8815396576    10.7079 +/- 0.10837
Q4_K_S    8368359136    10.7842 +/- 0.11014
Q4_K_M    8998815456    10.6721 +/- 0.10821
Q5_0      9643689696    10.5271 +/- 0.10765
Q5_1     10471982816    10.6211 +/- 0.10884
Q5_K_S    9818998496    10.6195 +/- 0.10920
Q5_K_M   10326902496    10.6085 +/- 0.10892
Q6_K     12083134144    10.6267 +/- 0.11008
Q8_0     14769311424    10.5689 +/- 0.10885

Surprisingly, Qx_0 is better than Qx_1 and even Qx_K*, and Q6_K seems worse than Q5_0.

Did I miss anything?
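
For context on how large these gaps are, a rough back-of-the-envelope check (a sketch in awk; the numbers are copied from the Fp16 and Q4_0 rows above) suggests the differences fall within the reported error bars:

# Sketch: compare the Fp16 vs Q4_0 perplexity gap against the
# combined standard error of the two measurements. Treating the
# errors as independent overstates the variance of the gap, since
# both runs use the same test set, so this is only a rough bound.
awk 'BEGIN {
    d = 10.6613 - 10.5931                # perplexity difference
    e = sqrt(0.10915^2 + 0.10784^2)      # combined std. error
    printf "diff=%.4f combined_err=%.4f within_error=%s\n",
           d, e, (d < 2 * e ? "yes" : "no")
}'
# Prints: diff=0.0682 combined_err=0.1534 within_error=yes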

@ggerganov (Owner) commented
Wikitext perplexity tests do not make sense for fine-tuned models - only base models

@turian commented May 6, 2024

"Wikitext perplexity tests do not make sense for fine-tuned models - only base models"

Can you explain why? Because fine-tuning brings the model away from being able to generate wikitext?

Related: #7066
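
For a chat-tuned model like Baichuan2-13B-Chat, the implication would be to evaluate on text closer to its post-fine-tuning distribution. A minimal sketch, reusing the flags from the reproduction script above; `/path/to/chat-eval.txt` is a hypothetical chat-formatted evaluation file:

# Sketch: measure perplexity of a chat-tuned quant on chat-style
# data instead of raw wikitext; the eval file path is hypothetical.
/app/perplexity -m Q5_K_M -f /path/to/chat-eval.txt -ngl 64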
