oobabooga benchmark

Within each score, models are sorted by size on disk, from smallest to largest.

Highlighted = Pareto frontier.
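
For reference, the Pareto frontier can be recomputed from the table itself. The sketch below is a hypothetical helper (not part of the benchmark's code): a row stays on the frontier when no other row scores at least as high while being no larger on disk.

```python
# Hypothetical sketch: recompute the Pareto frontier from (score, size_gb) pairs.
# A row is dominated if another row scores at least as high, is no larger on disk,
# and is strictly better in at least one of the two.

def pareto_frontier(rows):
    """rows: list of dicts with 'score' (int) and 'size_gb' (float)."""
    frontier = []
    for a in rows:
        dominated = any(
            b is not a
            and b["score"] >= a["score"]
            and b["size_gb"] <= a["size_gb"]
            and (b["score"] > a["score"] or b["size_gb"] < a["size_gb"])
            for b in rows
        )
        if not dominated:
            frontier.append(a)
    return frontier

# Example with a few rows from the table below:
rows = [
    {"model": "Meta-Llama-3.1-70B-Instruct-Q4_K_M", "score": 38, "size_gb": 42.5},
    {"model": "Meta-Llama-3.1-70B-Instruct-Q3_K_M", "score": 37, "size_gb": 34.3},
    {"model": "Qwen2.5-72B-Instruct-Q4_K_M", "score": 37, "size_gb": 47.42},
]
print([r["model"] for r in pareto_frontier(rows)])
```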

| Score | Model | Parameters | Size (GB) | Loader | Additional info |
| --- | --- | --- | --- | --- | --- |
| 38/48 | Meta-Llama-3.1-70B-Instruct-Q4_K_M | 70B | 42.5 | llamacpp_HF | [link] |
| 38/48 | Mistral-Large-Instruct-2407.i1-Q4_K_S | 123B | 69.5 | llamacpp_HF | [link] |
| 37/48 | Meta-Llama-3.1-70B-Instruct-Q3_K_M | 70B | 34.3 | llamacpp_HF | [link] |
| 37/48 | Meta-Llama-3.1-70B-Instruct-Q3_K_L | 70B | 37.1 | llamacpp_HF | [link] |
| 37/48 | Meta-Llama-3.1-70B-Instruct-Q3_K_XL | 70B | 38.1 | llamacpp_HF | [link] |
| 37/48 | Meta-Llama-3.1-70B-Instruct-Q4_K_L | 70B | 43.3 | llamacpp_HF | [link] |
| 37/48 | Qwen2.5-72B-Instruct-Q4_K_M | 72B | 47.42 | llamacpp_HF | |
| 37/48 | Athene-V2-Chat-Q4_K_M | 72B | 47.42 | llamacpp_HF | [link] |
| 37/48 | Mistral-Large-Instruct-2407.i1-IQ4_XS | 123B | 65.4 | llamacpp_HF | [link] |
| 37/48 | Mistral-Large-Instruct-2407.i1-Q6_K | 123B | 100.7 | llamacpp_HF | [link] |
| 36/48 | Meta-Llama-3.1-70B-Instruct-Q3_K_S | 70B | 30.9 | llamacpp_HF | [link] |
| 36/48 | Meta-Llama-3.1-70B-Instruct-Q4_K_S | 70B | 40.3 | llamacpp_HF | [link] |
| 36/48 | turboderp_Llama-3.1-70B-Instruct-exl2_6.0bpw | 70B | 54.12 | ExLlamav2_HF | |
| 35/48 | Meta-Llama-3.1-70B-Instruct-IQ3_XS | 70B | 29.3 | llamacpp_HF | [link] |
| 35/48 | NyxKrage_Microsoft_Phi-4 | 14B | 29.32 | Transformers | |
| 35/48 | Meta-Llama-3.1-70B-Instruct-IQ4_XS | 70B | 37.9 | llamacpp_HF | [link] |
| 35/48 | hugging-quants_Meta-Llama-3.1-70B-Instruct-AWQ-INT4 | 70B | 39.77 | Transformers | |
| 35/48 | turboderp_Llama-3.1-70B-Instruct-exl2_4.5bpw | 70B | 41.41 | ExLlamav2_HF | |
| 35/48 | Meta-Llama-3.1-70B-Instruct-Q5_K_S | 70B | 48.7 | llamacpp_HF | [link] |
| 35/48 | RYS-XLarge-Q4_K_M | 72B | 50.7 | llamacpp_HF | [link] |
| 35/48 | Meta-Llama-3.1-70B-Instruct-Q6_K | 70B | 57.9 | llamacpp_HF | [link] |
| 35/48 | Meta-Llama-3.1-70B-Instruct-Q6_K_L | 70B | 58.4 | llamacpp_HF | [link] |
| 35/48 | Mistral-Large-Instruct-2411-IQ4_XS | 123B | 65.4 | llamacpp_HF | [link] |
| 35/48 | turboderp_Mistral-Large-Instruct-2407-123B-exl2_4.25bpw | 123B | 65.83 | ExLlamav2_HF | |
| 34/48 | Meta-Llama-3.1-70B-Instruct-Q2_K | 70B | 26.4 | llamacpp_HF | [link] |
| 34/48 | Meta-Llama-3.1-70B-Instruct-Q2_K_L | 70B | 27.4 | llamacpp_HF | [link] |
| 34/48 | Meta-Llama-3.1-70B-Instruct-IQ3_M | 70B | 31.9 | llamacpp_HF | [link] |
| 34/48 | Qwen2.5-32B-Instruct-Q8_0 | 32B | 34.82 | llamacpp_HF | |
| 34/48 | platypus-yi-34b.Q8_0 | 34B | 36.5 | llamacpp_HF | |
| 34/48 | Meta-Llama-3-70B-Instruct-Q4_K_S | 70B | 40.3 | llamacpp_HF | [link] |
| 34/48 | Llama-3.1-Nemotron-70B-Instruct-HF-Q4_K_M | 70B | 42.52 | llamacpp_HF | |
| 34/48 | Llama-3.3-70B-Instruct-Q4_K_M | 70B | 42.52 | llamacpp_HF | [link] |
| 34/48 | Meta-Llama-3.1-70B-Instruct-Q5_K_M | 70B | 50.0 | llamacpp_HF | [link] |
| 34/48 | Meta-Llama-3.1-70B-Instruct-Q5_K_L | 70B | 50.6 | llamacpp_HF | [link] |
| 34/48 | LoneStriker_OpenBioLLM-Llama3-70B-6.0bpw-h6-exl2 | 70B | 54.27 | ExLlamav2_HF | |
| 33/48 | gemma-2-27b-it-Q4_K_S | 27B | 15.74 | llamacpp_HF | [link] |
| 33/48 | gemma-2-27b-it-Q4_K_M | 27B | 16.65 | llamacpp_HF | [link] |
| 33/48 | gemma-2-27b-it-Q5_K_S | 27B | 18.88 | llamacpp_HF | [link] |
| 33/48 | gemma-2-27b-it-Q5_K_M | 27B | 19.41 | llamacpp_HF | [link] |
| 33/48 | microsoft_Phi-3.5-MoE-instruct | 16x3.8B | 20.94 | Transformers | --load-in-4bit |
| 33/48 | ISTA-DASLab_Meta-Llama-3-70B-Instruct-AQLM-2Bit-1x16 | 70B | 21.92 | Transformers | |
| 33/48 | gemma-2-27b-it-Q6_K | 27B | 22.34 | llamacpp_HF | [link] |
| 33/48 | gemma-2-27b-it-Q6_K_L | 27B | 22.63 | llamacpp_HF | [link] |
| 33/48 | Meta-Llama-3-70B-Instruct-IQ3_XXS | 70B | 27.5 | llamacpp_HF | [link] |
| 33/48 | gemma-2-27b-it-Q8_0 | 27B | 28.94 | llamacpp_HF | [link] |
| 33/48 | Meta-Llama-3-70B-Instruct-IQ3_XS | 70B | 29.3 | llamacpp_HF | [link] |
| 33/48 | Meta-Llama-3-70B-Instruct-Q3_K_S | 70B | 30.9 | llamacpp_HF | [link] |
| 33/48 | 01-ai_Yi-1.5-34B-Chat | 34B | 34.39 | Transformers | --load-in-8bit |
| 33/48 | 01-ai_Yi-1.5-34B-Chat-16K | 34B | 34.39 | Transformers | --load-in-8bit |
| 33/48 | Undi95_Meta-Llama-3-70B-Instruct-hf | 70B | 35.28 | Transformers | --load-in-4bit |
| 33/48 | Mistral-Large-Instruct-2407-IQ2_XS | 123B | 36.1 | llamacpp_HF | [link] |
| 33/48 | dolphin-2.9.3-Yi-1.5-34B-32k-Q8_0 | 34B | 36.5 | llamacpp_HF | [link] |
| 33/48 | Llama3-TenyxChat-70B.i1-Q4_K_S | 70B | 40.3 | llamacpp_HF | |
| 33/48 | Smaug-Llama-3-70B-Instruct.i1-Q4_K_S | 70B | 40.3 | llamacpp_HF | |
| 33/48 | Meta-Llama-3-70B-Instruct.Q4_K_M | 70B | 42.5 | llamacpp_HF | |
| 33/48 | turboderp_Cat-Llama-3-70B-instruct-exl2_5.0bpw | 70B | 45.7 | ExLlamav2_HF | |
| 33/48 | turboderp_Llama-3-70B-Instruct-exl2_5.0bpw | 70B | 45.72 | ExLlamav2_HF | |
| 33/48 | magnum-72b-v1-Q4_K_M | 72B | 47.4 | llamacpp_HF | [link] |
| 33/48 | turboderp_Llama-3-70B-Instruct-exl2_6.0bpw | 70B | 54.17 | ExLlamav2_HF | |
| 33/48 | LoneStriker_dolphin-2.9-llama3-70b-6.0bpw-h6-exl2 | 70B | 54.27 | ExLlamav2_HF | |
| 33/48 | google_gemma-2-27b-it | 27B | 54.45 | Transformers | --bf16 --use_eager_attention |
| 33/48 | Meta-Llama-3-70B-Instruct.Q8_0 | 70B | 75.0 | llamacpp_HF | |
| 33/48 | Meta-Llama-3.1-405B-Instruct.i1-IQ2_XXS | 405B | 109.0 | llamacpp_HF | [link] |
| 32/48 | Phi-3-medium-128k-instruct-Q4_K_S | 14B | 7.95 | llamacpp_HF | [link] |
| 32/48 | Phi-3-medium-128k-instruct-Q5_K_S | 14B | 9.62 | llamacpp_HF | [link] |
| 32/48 | Phi-3-medium-128k-instruct-Q6_K | 14B | 11.45 | llamacpp_HF | [link] |
| 32/48 | gemma-2-27b-it-IQ4_XS | 27B | 14.81 | llamacpp_HF | [link] |
| 32/48 | gemma-2-27b-it-Q3_K_XL | 27B | 14.81 | llamacpp_HF | [link] |
| 32/48 | Phi-3-medium-128k-instruct-Q8_0 | 14B | 14.83 | llamacpp_HF | [link] |
| 32/48 | gemma-2-27b-it-Q4_K_L | 27B | 16.93 | llamacpp_HF | [link] |
| 32/48 | Meta-Llama-3-70B-Instruct-IQ2_S | 70B | 22.2 | llamacpp_HF | [link] |
| 32/48 | Meta-Llama-3-70B-Instruct-IQ2_M | 70B | 24.1 | llamacpp_HF | [link] |
| 32/48 | turboderp_gemma-2-27b-it-exl2_8.0bpw | 27B | 26.7 | ExLlamav2_HF | --no_flash_attn --no_xformers --no_sdpa |
| 32/48 | Meta-Llama-3-70B-Instruct-IQ3_S | 70B | 30.9 | llamacpp_HF | [link] |
| 32/48 | Meta-Llama-3-70B-Instruct-Q3_K_M | 70B | 34.3 | llamacpp_HF | [link] |
| 32/48 | Meta-Llama-3-70B-Instruct-Q3_K_L | 70B | 37.1 | llamacpp_HF | [link] |
| 32/48 | turboderp_Llama-3.1-70B-Instruct-exl2_4.0bpw | 70B | 37.15 | ExLlamav2_HF | |
| 32/48 | Meta-Llama-3-70B-Instruct-IQ4_XS | 70B | 37.9 | llamacpp_HF | [link] |
| 32/48 | Meta-Llama-3-70B-Instruct-IQ4_NL | 70B | 40.1 | llamacpp_HF | [link] |
| 32/48 | Llama-3-Giraffe-70B-Instruct.i1-Q4_K_S | 70B | 40.3 | llamacpp_HF | [link] |
| 32/48 | Athene-70B.i1-Q4_K_M | 70B | 42.5 | llamacpp_HF | [link] |
| 32/48 | Qwen2-72B-Instruct-Q4_K_M | 72B | 47.4 | llamacpp_HF | [link] |
| 32/48 | dnhkng_RYS-Gemma-2-27b-it | 27B | 58.98 | Transformers | |
| 31/48 | Phi-3-medium-4k-instruct-IQ3_XXS | 14B | 5.45 | llamacpp_HF | [link] |
| 31/48 | Phi-3-medium-128k-instruct-IQ3_M | 14B | 6.47 | llamacpp_HF | [link] |
| 31/48 | Phi-3-medium-128k-instruct-IQ4_XS | 14B | 7.47 | llamacpp_HF | [link] |
| 31/48 | Phi-3-medium-128k-instruct-IQ4_NL | 14B | 7.90 | llamacpp_HF | [link] |
| 31/48 | Phi-3-medium-128k-instruct-Q4_K_M | 14B | 8.57 | llamacpp_HF | [link] |
| 31/48 | Phi-3-medium-128k-instruct-Q5_K_M | 14B | 10.07 | llamacpp_HF | [link] |
| 31/48 | gemma-2-27b-it-Q3_K_S | 27B | 12.17 | llamacpp_HF | [link] |
| 31/48 | gemma-2-27b-it-IQ3_M | 27B | 12.45 | llamacpp_HF | [link] |
| 31/48 | gemma-2-27b-it-Q3_K_L | 27B | 14.52 | llamacpp_HF | [link] |
| 31/48 | gemma-2-27b-it-Q5_K_L | 27B | 19.69 | llamacpp_HF | [link] |
| 31/48 | Meta-Llama-3-70B-Instruct-IQ2_XS | 70B | 21.1 | llamacpp_HF | [link] |
| 31/48 | Meta-Llama-3-70B-Instruct-Q2_K | 70B | 26.4 | llamacpp_HF | [link] |
| 31/48 | microsoft_Phi-3-medium-128k-instruct | 14B | 27.92 | Transformers | |
| 31/48 | Meta-Llama-3-70B-Instruct-IQ3_M | 70B | 31.9 | llamacpp_HF | [link] |
| 31/48 | Qwen_QwQ-32B-Preview | 32B | 32.76 | Transformers | --load-in-8bit |
| 31/48 | miqu-1-70b.q4_k_m | 70B | 41.4 | llamacpp_HF | |
| 31/48 | Reflection-Llama-3.1-70B-Q4_K_M | 70B | 42.52 | llamacpp_HF | |
| 31/48 | miqu-1-70b.q5_K_M | 70B | 48.8 | llamacpp_HF | |
| 31/48 | oobabooga_miqu-1-70b-sf-EXL2-6.000b | 70B | 52.1 | ExLlamav2_HF | |
| 31/48 | cloudyu_Phoenix_DPO_60B | 60B | 60.81 | Transformers | --load-in-8bit |
| 31/48 | Meta-Llama-3-120B-Instruct.Q4_K_M | 120B | 73.22 | llamacpp_HF | [link] |
| 30/48 | Phi-3-medium-128k-instruct-IQ3_XXS | 14B | 5.45 | llamacpp_HF | [link] |
| 30/48 | Phi-3-medium-128k-instruct-IQ3_S | 14B | 6.06 | llamacpp_HF | [link] |
| 30/48 | Phi-3-medium-128k-instruct-Q3_K_L | 14B | 7.49 | llamacpp_HF | [link] |
| 30/48 | gemma-2-27b-it-Q3_K_M | 27B | 13.42 | llamacpp_HF | [link] |
| 30/48 | google_gemma-2-27b-it | 27B | 13.61 | Transformers | --load-in-4bit --bf16 --use_eager_attention |
| 30/48 | Qwen_Qwen2.5-7B-Instruct | 7B | 15.23 | Transformers | |
| 30/48 | Meta-Llama-3.1-70B-Instruct-IQ2_M | 70B | 24.1 | llamacpp_HF | [link] |
| 30/48 | Meta-Llama-3-70B-Instruct-abliterated-v3.5.i1-Q4_K_S | 70B | 40.3 | llamacpp_HF | [link] |
| 30/48 | Senku-70B-Full-Q4_K_M | 70B | 41.4 | llamacpp_HF | |
| 30/48 | microsoft_Phi-3.5-MoE-instruct | 16x3.8B | 41.87 | Transformers | --load-in-8bit |
| 30/48 | Llama3-70B-ShiningValiant2.i1-Q4_K_M | 70B | 42.5 | llamacpp_HF | [link] |
| 30/48 | Rhea-72b-v0.5-Q4_K_M | 72B | 43.8 | llamacpp_HF | |
| 30/48 | qwen1_5-72b-chat-q4_k_m | 72B | 44.2 | llamacpp_HF | |
| 30/48 | Phi-3-medium-128k-instruct-f32 | 14B | 55.80 | llamacpp_HF | [link] |
| 30/48 | Dracones_WizardLM-2-8x22B_exl2_4.0bpw | 8x22B | 70.67 | ExLlamav2_HF | |
| 30/48 | turboderp_Mixtral-8x22B-Instruct-v0.1-exl2_4.0bpw | 8x22B | 70.68 | ExLlamav2_HF | |
| 30/48 | miquliz-120b-v2.0.Q4_K_M | 120B | 72.1 | llamacpp_HF | |
| 30/48 | falcon-180b-chat.Q4_K_M | 180B | 108.6 | llamacpp_HF | |
| 29/48 | Phi-3-medium-128k-instruct-IQ3_XS | 14B | 5.81 | llamacpp_HF | [link] |
| 29/48 | gemma-2-9b-it-Q5_K_S | 9B | 6.48 | llamacpp_HF | [link] |
| 29/48 | Phi-3-medium-128k-instruct-Q3_K_M | 14B | 6.92 | llamacpp_HF | [link] |
| 29/48 | Phi-3-medium-4k-instruct-Q4_K_S | 14B | 7.95 | llamacpp_HF | [link] |
| 29/48 | Phi-3-medium-4k-instruct-Q4_K_M | 14B | 8.57 | llamacpp_HF | [link] |
| 29/48 | Phi-3-medium-4k-instruct-Q5_K_S | 14B | 9.62 | llamacpp_HF | [link] |
| 29/48 | Phi-3-medium-4k-instruct-Q5_K_M | 14B | 10.07 | llamacpp_HF | [link] |
| 29/48 | Phi-3-medium-4k-instruct-Q6_K | 14B | 11.45 | llamacpp_HF | [link] |
| 29/48 | gemma-2-27b-it-IQ3_XS | 27B | 11.55 | llamacpp_HF | [link] |
| 29/48 | Phi-3-medium-4k-instruct-Q8_0 | 14B | 14.83 | llamacpp_HF | [link] |
| 29/48 | google_gemma-2-9b-it | 9B | 18.48 | Transformers | --bf16 --use_eager_attention |
| 29/48 | UCLA-AGI_Gemma-2-9B-It-SPPO-Iter3 | 9B | 18.48 | Transformers | --bf16 --use_eager_attention |
| 29/48 | bartowski_Qwen1.5-32B-Chat-exl2_5_0 | 32B | 21.52 | ExLlamav2_HF | |
| 29/48 | turboderp_Llama-3.1-70B-Instruct-exl2_2.5bpw | 70B | 24.32 | ExLlamav2_HF | |
| 29/48 | microsoft_Phi-3-medium-4k-instruct | 14B | 27.92 | Transformers | |
| 29/48 | Mistral-Large-Instruct-2407-IQ2_XXS | 123B | 32.4 | llamacpp_HF | [link] |
| 29/48 | LoneStriker_Yi-34B-Chat-8.0bpw-h8-exl2 | 34B | 34.87 | ExLlamav2_HF | |
| 29/48 | 34b-beta.Q8_0 | 34B | 36.5 | llamacpp_HF | |
| 29/48 | Dracones_Llama-3-Lumimaid-70B-v0.1_exl2_4.5bpw | 70B | 41.43 | ExLlamav2_HF | |
| 29/48 | Llama3-ChatQA-1.5-70B.Q4_K_M | 70B | 42.5 | llamacpp_HF | Alpaca template. |
| 29/48 | turboderp_Llama-3-70B-exl2_5.0bpw | 70B | 45.7 | ExLlamav2_HF | |
| 29/48 | turboderp_command-r-plus-103B-exl2_3.0bpw | 104B | 47.72 | ExLlamav2_HF | |
| 29/48 | daybreak-miqu-1-70b-v1.0-q5_k_m | 70B | 48.8 | llamacpp_HF | |
| 29/48 | Phi-3-medium-4k-instruct-f32 | 14B | 55.80 | llamacpp_HF | [link] |
| 29/48 | c4ai-command-r-plus.i1-IQ4_XS | 104B | 56.2 | llamacpp_HF | [link] |
| 29/48 | Qwen1.5-110B-Chat-Q4_K_M | 110B | 67.2 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-IQ3_M | 9B | 4.49 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q3_K_M | 9B | 4.76 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q3_K_L | 9B | 5.13 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q3_K_L-Q8 | 9B | 5.35 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q3_K_XL | 9B | 5.35 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q4_K_S | 9B | 5.48 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q4_K_M | 9B | 5.76 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q4_K_L | 9B | 5.98 | llamacpp_HF | [link] |
| 28/48 | Phi-3-medium-128k-instruct-Q3_K_S | 14B | 6.06 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q5_K_M | 9B | 6.65 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q4_K_M-fp16 | 9B | 6.84 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q5_K_L | 9B | 6.87 | llamacpp_HF | [link] |
| 28/48 | Phi-3-medium-4k-instruct-Q3_K_M | 14B | 6.92 | llamacpp_HF | [link] |
| 28/48 | gemma-2-27b-it-IQ2_M | 27B | 9.40 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q8_0 | 9B | 9.83 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q8_0-f16 | 9B | 10.69 | llamacpp_HF | [link] |
| 28/48 | gemma-2-9b-it-Q8_0_L | 9B | 10.69 | llamacpp_HF | [link] |
| 28/48 | tiiuae_Falcon3-7B-Instruct | 7B | 14.91 | Transformers | |
| 28/48 | CohereForAI_aya-expanse-32b | 32B | 32.30 | Transformers | --load-in-8bit |
| 28/48 | gemma-2-9b-it-f32 | 9B | 36.97 | llamacpp_HF | [link] |
| 28/48 | dolphin-2.7-mixtral-8x7b.Q8_0 | 8x7B | 49.6 | llamacpp_HF | |
| 28/48 | turboderp_command-r-plus-103B-exl2_3.5bpw | 104B | 54.2 | ExLlamav2_HF | |
| 28/48 | LoneStriker_Smaug-72B-v0.1-6.0bpw-h6-exl2 | 72B | 55.82 | ExLlamav2_HF | |
| 28/48 | command-r-plus-Q4_K_M | 104B | 62.81 | llamacpp_HF | |
| 28/48 | turboderp_command-r-plus-103B-exl2_4.5bpw | 104B | 67.18 | ExLlamav2_HF | |
| 27/48 | gemma-2-9b-it-Q3_K_S | 9B | 4.34 | llamacpp_HF | [link] |
| 27/48 | gemma-2-9b-it-IQ4_XS | 9B | 5.18 | llamacpp_HF | [link] |
| 27/48 | Phi-3-medium-4k-instruct-IQ3_XS | 14B | 5.81 | llamacpp_HF | [link] |
| 27/48 | Phi-3-medium-4k-instruct-IQ3_S | 14B | 6.06 | llamacpp_HF | [link] |
| 27/48 | Phi-3-medium-4k-instruct-Q3_K_L | 14B | 7.49 | llamacpp_HF | [link] |
| 27/48 | gemma-2-9b-it-Q6_K | 9B | 7.59 | llamacpp_HF | [link] |
| 27/48 | gemma-2-9b-it-Q6_K_L | 9B | 7.81 | llamacpp_HF | [link] |
| 27/48 | gemma-2-9b-it-Q6_K-Q8 | 9B | 7.81 | llamacpp_HF | [link] |
| 27/48 | Phi-3-medium-4k-instruct-IQ4_NL | 14B | 7.90 | llamacpp_HF | [link] |
| 27/48 | google_gemma-2-9b-it | 9B | 9.24 | Transformers | --load-in-8bit --bf16 --use_eager_attention |
| 27/48 | gemma-2-9b-it-Q6_K-f32 | 9B | 10.51 | llamacpp_HF | [link] |
| 27/48 | Qwen_Qwen2-7B-Instruct | 7B | 15.23 | Transformers | |
| 27/48 | 01-ai_Yi-1.5-9B-Chat | 9B | 17.66 | Transformers | |
| 27/48 | Meta-Llama-3.1-70B-Instruct-IQ2_S | 70B | 22.2 | llamacpp_HF | [link] |
| 27/48 | miqu-1-70b.q2_K | 70B | 25.5 | llamacpp_HF | |
| 27/48 | ISTA-DASLab_c4ai-command-r-plus-AQLM-2Bit-1x16 | 104B | 31.94 | Transformers | |
| 27/48 | Platypus2-70B.i1-Q4_K_M | 70B | 41.4 | llamacpp_HF | [link] |
| 27/48 | Midnight-Miqu-70B-v1.0.Q4_K_M | 70B | 41.7 | llamacpp_HF | |
| 27/48 | Llama3-ChatQA-1.5-70B.Q4_K_M | 70B | 42.5 | llamacpp_HF | NVIDIA-ChatQA template. |
| 27/48 | mixtral-8x7b-instruct-v0.1.Q8_0 | 8x7B | 49.6 | llamacpp_HF | |
| 27/48 | TheBloke_Helion-4x34B-GPTQ | 4x34B | 58.31 | ExLlamav2_HF | |
| 26/48 | Phi-3-mini-4k-instruct-Q5_K_S | 3.8B | 2.64 | llamacpp_HF | [link] |
| 26/48 | gemma-2-9b-it-IQ3_XS | 9B | 4.14 | llamacpp_HF | [link] |
| 26/48 | google_gemma-2-9b-it | 9B | 4.62 | Transformers | --load-in-4bit --bf16 --use_eager_attention |
| 26/48 | Phi-3-medium-4k-instruct-IQ3_M | 14B | 6.47 | llamacpp_HF | [link] |
| 26/48 | Phi-3-medium-4k-instruct-IQ4_XS | 14B | 7.47 | llamacpp_HF | [link] |
| 26/48 | gemma-2-27b-it-IQ2_XS | 27B | 8.40 | llamacpp_HF | [link] |
| 26/48 | gemma-2-27b-it-Q2_K | 27B | 10.45 | llamacpp_HF | [link] |
| 26/48 | gemma-2-27b-it-Q2_K_L | 27B | 10.74 | llamacpp_HF | [link] |
| 26/48 | gemma-2-27b-it-IQ3_XXS | 27B | 10.75 | llamacpp_HF | [link] |
| 26/48 | turboderp_gemma-2-27b-it-exl2_3.0bpw | 27B | 13.06 | ExLlamav2_HF | --no_flash_attn --no_xformers --no_sdpa |
| 26/48 | microsoft_Phi-3-small-8k-instruct | 7B | 14.78 | Transformers | |
| 26/48 | internlm_internlm2_5-7b-chat | 7B | 15.48 | Transformers | |
| 26/48 | Meta-Llama-3-70B-Instruct-IQ2_XXS | 70B | 19.1 | llamacpp_HF | |
| 26/48 | tiiuae_Falcon3-10B-Instruct | 10B | 20.61 | Transformers | |
| 26/48 | Qwen_Qwen1.5-14B-Chat | 14B | 28.33 | Transformers | |
| 26/48 | Mistral-Large-Instruct-2407-IQ1_M | 123B | 28.4 | llamacpp_HF | [link] |
| 26/48 | LoneStriker_Yi-34B-200K-8.0bpw-h8-exl2 | 34B | 34.84 | ExLlamav2_HF | |
| 26/48 | LoneStriker_dolphin-2.2-yi-34b-200k-8.0bpw-h8-exl2 | 34B | 34.89 | ExLlamav2_HF | |
| 26/48 | CausalLM-RP-34B.q8_0 | 34B | 37.0 | llamacpp_HF | |
| 26/48 | lzlv_70b_fp16_hf.Q5_K_M | 70B | 48.8 | llamacpp_HF | |
| 26/48 | nous-hermes-2-mixtral-8x7b-dpo.Q8_0 | 8x7B | 49.6 | llamacpp_HF | |
| 26/48 | Mixtral-8x22B-Instruct-v0.1.Q4_K_M | 8x22B | 85.5 | llamacpp_HF | |
| 25/48 | Phi-3-mini-4k-instruct-Q5_K_M | 3.8B | 2.82 | llamacpp_HF | [link] |
| 25/48 | Phi-3-mini-4k-instruct-old-Q5_K_M | 3.8B | 2.82 | llamacpp_HF | [link] |
| 25/48 | gemma-2-9b-it-IQ2_S | 9B | 3.21 | llamacpp_HF | [link] |
| 25/48 | gemma-2-9b-it-IQ2_M | 9B | 3.43 | llamacpp_HF | [link] |
| 25/48 | gemma-2-9b-it-IQ3_XXS | 9B | 3.80 | llamacpp_HF | [link] |
| 25/48 | gemma-2-9b-it-Q2_K | 9B | 3.81 | llamacpp_HF | [link] |
| 25/48 | gemma-2-9b-it-Q2_K_L | 9B | 4.03 | llamacpp_HF | [link] |
| 25/48 | Phi-3-medium-4k-instruct-IQ2_S | 14B | 4.34 | llamacpp_HF | [link] |
| 25/48 | gemma-2-27b-it-IQ2_S | 27B | 8.65 | llamacpp_HF | [link] |
| 25/48 | NousResearch_Nous-Hermes-2-SOLAR-10.7B | 10.7B | 21.46 | Transformers | |
| 25/48 | turboderp_Llama-3-70B-Instruct-exl2_2.4bpw | 70B | 23.47 | ExLlamav2_HF | |
| 25/48 | internlm_internlm2_5-20b-chat | 20B | 39.72 | Transformers | |
| 25/48 | LoneStriker_Llama-3-70B-Instruct-Gradient-524k-6.0bpw-h6-exl2 | 70B | 54.11 | ExLlamav2_HF | |
| 25/48 | goliath-120b.Q4_K_M | 120B | 70.6 | llamacpp_HF | |
| 25/48 | Meta-Llama-3.1-405B-Instruct.i1-IQ1_M | 405B | 95.1 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-IQ4_XS | 3.8B | 2.06 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-IQ4_NL | 3.8B | 2.18 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-old-IQ4_NL | 3.8B | 2.18 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-old-Q5_K_S | 3.8B | 2.64 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-Q6_K | 3.8B | 3.14 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-old-Q6_K | 3.8B | 3.14 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-Q8_0 | 3.8B | 4.06 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-old-Q8_0 | 3.8B | 4.06 | llamacpp_HF | [link] |
| 24/48 | Phi-3-mini-4k-instruct-old-fp16 | 3.8B | 7.64 | llamacpp_HF | [link] |
| 24/48 | tiiuae_Falcon3-Mamba-7B-Instruct | 7B | 14.55 | Transformers | |
| 24/48 | Phi-3-mini-4k-instruct-fp32 | 3.8B | 15.29 | llamacpp_HF | [link] |
| 24/48 | meta-llama_Meta-Llama-3.1-8B-Instruct | 8B | 16.06 | Transformers | |
| 24/48 | Meta-Llama-3.1-70B-Instruct-IQ2_XS | 70B | 21.1 | llamacpp_HF | [link] |
| 24/48 | upstage_SOLAR-10.7B-Instruct-v1.0 | 10.7B | 21.46 | Transformers | |
| 24/48 | mistralai_Mistral-Nemo-Instruct-2407 | 12B | 24.5 | Transformers | |
| 24/48 | dnhkng_RYS-Phi-3-medium-4k-instruct | 14B | 35.42 | Transformers | |
| 24/48 | MultiVerse_70B.Q4_K_M | 70B | 45.2 | llamacpp_HF | |
| 24/48 | maid-yuzu-v8-alter.Q8_0 | 8x7B | 49.8 | llamacpp_HF | |
| 24/48 | LoneStriker_Llama-3-70B-Instruct-Gradient-262k-6.0bpw-h6-exl2 | 70B | 54.27 | ExLlamav2_HF | |
| 23/48 | Phi-3-mini-4k-instruct-old-IQ4_XS | 3.8B | 2.06 | llamacpp_HF | [link] |
| 23/48 | Phi-3-mini-4k-instruct-Q4_K_M | 3.8B | 2.39 | llamacpp_HF | [link] |
| 23/48 | Phi-3-medium-128k-instruct-Q2_K | 14B | 5.14 | llamacpp_HF | [link] |
| 23/48 | Phi-3-medium-4k-instruct-Q3_K_S | 14B | 6.06 | llamacpp_HF | [link] |
| 23/48 | microsoft_Phi-3-mini-4k-instruct | 3.8B | 7.64 | Transformers | |
| 23/48 | Mistral-Nemo-Instruct-2407-Q8_0 | 12B | 13.0 | llamacpp_HF | [link] |
| 23/48 | Zyphra_Zamba2-7B-Instruct | 7B | 15.05 | Transformers | |
| 23/48 | bhenrym14_airoboros-3_1-yi-34b-200k | 34B | 34.39 | Transformers | --load-in-8bit |
| 23/48 | xwin-lm-70b-v0.1.Q4_K_M | 70B | 41.4 | llamacpp_HF | |
| 22/48 | Phi-3-mini-4k-instruct-old-IQ3_XXS | 3.8B | 1.51 | llamacpp_HF | [link] |
| 22/48 | Phi-3-mini-4k-instruct-Q4_K_S | 3.8B | 2.19 | llamacpp_HF | [link] |
| 22/48 | Phi-3-mini-4k-instruct-old-Q4_K_M | 3.8B | 2.39 | llamacpp_HF | [link] |
| 22/48 | Phi-3-medium-128k-instruct-IQ2_S | 14B | 4.34 | llamacpp_HF | [link] |
| 22/48 | Meta-Llama-3-8B-Instruct-Q4_K_S | 8B | 4.69 | llamacpp_HF | |
| 22/48 | Phi-3-medium-128k-instruct-IQ2_M | 14B | 4.72 | llamacpp_HF | [link] |
| 22/48 | Phi-3-medium-4k-instruct-IQ2_M | 14B | 4.72 | llamacpp_HF | [link] |
| 22/48 | Phi-3-medium-4k-instruct-Q2_K | 14B | 5.14 | llamacpp_HF | [link] |
| 22/48 | Qwen_Qwen1.5-7B-Chat | 7B | 15.44 | Transformers | |
| 22/48 | THUDM_glm-4-9b-chat | 9B | 18.80 | Transformers | |
| 22/48 | liuhaotian_llava-v1.5-13b | 13B | 26.09 | Transformers | |
| 22/48 | turboderp_command-r-v01-35B-exl2_6.0bpw | 35B | 32.09 | ExLlamav2_HF | |
| 22/48 | turboderp_command-r-plus-103B-exl2_2.5bpw | 104B | 41.23 | ExLlamav2_HF | |
| 22/48 | tulu-2-dpo-70b.Q4_K_M | 70B | 41.4 | llamacpp_HF | |
| 22/48 | wizardlm-70b-v1.0.Q4_K_M | 70B | 41.4 | llamacpp_HF | |
| 22/48 | MoMo-72B-lora-1.8.6-DPO-Q4_K_M | 72B | 43.8 | llamacpp_HF | |
| 22/48 | meraGPT_mera-mix-4x7B | 4x7B | 48.31 | Transformers | |
| 21/48 | Phi-3-mini-4k-instruct-IQ3_XXS | 3.8B | 1.51 | llamacpp_HF | [link] |
| 21/48 | Phi-3-mini-4k-instruct-old-Q3_K_L | 3.8B | 2.09 | llamacpp_HF | [link] |
| 21/48 | Phi-3-mini-4k-instruct-old-Q4_K_S | 3.8B | 2.19 | llamacpp_HF | [link] |
| 21/48 | Phi-3.1-mini-4k-instruct-Q4_K_M | 3.8B | 2.39 | llamacpp_HF | [link] |
| 21/48 | Phi-3.1-mini-4k-instruct-Q5_K_S | 3.8B | 2.64 | llamacpp_HF | [link] |
| 21/48 | gemma-2-9b-it-IQ2_XS | 9B | 3.07 | llamacpp_HF | [link] |
| 21/48 | Phi-3-medium-4k-instruct-IQ2_XS | 14B | 4.13 | llamacpp_HF | [link] |
| 21/48 | Meta-Llama-3.1-8B-Instruct-Q4_K_S | 8B | 4.69 | llamacpp_HF | [link] |
| 21/48 | Meta-Llama-3.1-8B-Instruct-Q5_K_M | 8B | 5.73 | llamacpp_HF | [link] |
| 21/48 | Meta-Llama-3.1-8B-Instruct-Q6_K | 8B | 6.6 | llamacpp_HF | [link] |
| 21/48 | Meta-Llama-3.1-8B-Instruct-Q6_K_L | 8B | 6.85 | llamacpp_HF | [link] |
| 21/48 | microsoft_Phi-3-vision-128k-instruct | 4.2B | 8.29 | Transformers | |
| 21/48 | Meta-Llama-3-8B-Instruct-Q8_0 | 8B | 8.54 | llamacpp_HF | |
| 21/48 | 01-ai_Yi-1.5-6B-Chat | 6B | 12.12 | Transformers | |
| 21/48 | microsoft_Phi-3-small-128k-instruct | 7B | 14.78 | Transformers | |
| 21/48 | gustavecortal_oneirogen-7B | 7B | 15.23 | Transformers | |
| 21/48 | NurtureAI_Meta-Llama-3-8B-Instruct-64k | 8B | 16.06 | Transformers | |
| 21/48 | Undi95_Meta-Llama-3-8B-Instruct-hf | 8B | 16.06 | Transformers | |
| 21/48 | Salesforce_SFR-Iterative-DPO-LLaMA-3-8B-R | 8B | 16.06 | Transformers | |
| 21/48 | Meta-Llama-3-8B-Instruct-fp16 | 8B | 16.1 | llamacpp_HF | |
| 21/48 | c4ai-command-r-v01-Q8_0 | 35B | 37.2 | llamacpp_HF | |
| 21/48 | internlm_internlm2-chat-20b | 20B | 39.72 | Transformers | |
| 21/48 | falcon-180b.Q4_K_M | 180B | 108.6 | llamacpp_HF | |
| 20/48 | gemma-2-2b-it-Q4_K_S | 2.6B | 1.64 | llamacpp_HF | [link] |
| 20/48 | Phi-3-mini-4k-instruct-Q3_K_M | 3.8B | 1.96 | llamacpp_HF | [link] |
| 20/48 | Phi-3-mini-4k-instruct-Q3_K_L | 3.8B | 2.09 | llamacpp_HF | [link] |
| 20/48 | Phi-3.1-mini-4k-instruct-Q4_K_S | 3.8B | 2.19 | llamacpp_HF | [link] |
| 20/48 | Phi-3.1-mini-4k-instruct-Q5_K_M | 3.8B | 2.82 | llamacpp_HF | [link] |
| 20/48 | Phi-3.1-mini-4k-instruct-Q6_K | 3.8B | 3.14 | llamacpp_HF | [link] |
| 20/48 | Phi-3.1-mini-4k-instruct-Q6_K_L | 3.8B | 3.18 | llamacpp_HF | [link] |
| 20/48 | Phi-3-medium-128k-instruct-IQ2_XS | 14B | 4.13 | llamacpp_HF | [link] |
| 20/48 | Meta-Llama-3.1-8B-Instruct-Q4_K_M | 8B | 4.92 | llamacpp_HF | [link] |
| 20/48 | Meta-Llama-3.1-8B-Instruct-Q4_K_L | 8B | 5.31 | llamacpp_HF | [link] |
| 20/48 | Meta-Llama-3.1-8B-Instruct-Q5_K_L | 8B | 6.06 | llamacpp_HF | [link] |
| 20/48 | Meta-Llama-3-8B-Instruct-Q6_K | 8B | 6.6 | llamacpp_HF | |
| 20/48 | TheBloke_llava-v1.5-13B-GPTQ | 13B | 7.26 | ExLlamav2_HF | |
| 20/48 | microsoft_Phi-3-mini-4k-instruct-20240701 | 3.8B | 7.64 | Transformers | |
| 20/48 | openchat_openchat_3.5 | 7B | 14.48 | Transformers | |
| 20/48 | Weyaxi_Einstein-v6.1-Llama3-8B | 8B | 16.06 | Transformers | |
| 20/48 | BAAI_Bunny-Llama-3-8B-V | 8B | 16.96 | Transformers | |
| 20/48 | CohereForAI_aya-23-35B | 35B | 34.98 | Transformers | --load-in-8bit |
| 20/48 | llama-2-70b-chat.Q4_K_M | 70B | 41.4 | llamacpp_HF | |
| 20/48 | Ein-72B-v0.1-full.Q4_K_M | 72B | 45.2 | llamacpp_HF | |
| 19/48 | Phi-3.1-mini-128k-instruct-IQ3_XXS | 3.8B | 1.51 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-128k-instruct-IQ3_XS | 3.8B | 1.63 | llamacpp_HF | [link] |
| 19/48 | gemma-2-2b-it-Q4_K_M | 2.6B | 1.71 | llamacpp_HF | [link] |
| 19/48 | Phi-3-mini-4k-instruct-IQ3_M | 3.8B | 1.86 | llamacpp_HF | [link] |
| 19/48 | Phi-3-mini-4k-instruct-old-IQ3_M | 3.8B | 1.86 | llamacpp_HF | [link] |
| 19/48 | gemma-2-2b-it-Q5_K_S | 2.6B | 1.88 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-4k-instruct-Q3_K_M | 3.8B | 1.96 | llamacpp_HF | [link] |
| 19/48 | Phi-3-mini-4k-instruct-old-Q3_K_M | 3.8B | 1.96 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-4k-instruct-IQ4_XS | 3.8B | 2.06 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-128k-instruct-Q3_K_L | 3.8B | 2.09 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-4k-instruct-Q3_K_L | 3.8B | 2.09 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-128k-instruct-Q3_K_XL | 3.8B | 2.17 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-4k-instruct-Q3_K_XL | 3.8B | 2.17 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-128k-instruct-Q4_K_L | 3.8B | 2.47 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-4k-instruct-Q4_K_L | 3.8B | 2.47 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-128k-instruct-Q5_K_S | 3.8B | 2.64 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-128k-instruct-Q5_K_M | 3.8B | 2.82 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-128k-instruct-Q5_K_L | 3.8B | 2.88 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-4k-instruct-Q5_K_L | 3.8B | 2.88 | llamacpp_HF | [link] |
| 19/48 | Meta-Llama-3-8B-Instruct-IQ3_S | 8B | 3.68 | llamacpp_HF | [link] |
| 19/48 | Phi-3.1-mini-4k-instruct-Q8_0 | 3.8B | 4.06 | llamacpp_HF | [link] |
| 19/48 | mistral-7b-instruct-v0.2.Q4_K_S | 7B | 4.14 | llamacpp_HF | |
| 19/48 | Meta-Llama-3-8B-Instruct-IQ4_XS | 8B | 4.45 | llamacpp_HF | |
| 19/48 | Meta-Llama-3-8B-Instruct-IQ4_NL | 8B | 4.68 | llamacpp_HF | |
| 19/48 | Meta-Llama-3-8B-Instruct-Q4_K_M | 8B | 4.92 | llamacpp_HF | |
| 19/48 | Meta-Llama-3-8B-Instruct-Q5_K_S | 8B | 5.6 | llamacpp_HF | |
| 19/48 | Meta-Llama-3-8B-Instruct-Q5_K_M | 8B | 5.73 | llamacpp_HF | |
| 19/48 | microsoft_Phi-3-mini-128k-instruct | 3.8B | 7.64 | Transformers | |
| 19/48 | microsoft_Phi-3.5-vision-instruct | 3.8B | 8.29 | Transformers | |
| 19/48 | Nexusflow_Starling-LM-7B-beta | 7B | 14.48 | Transformers | |
| 19/48 | NousResearch_Hermes-2-Pro-Mistral-7B | 7B | 14.48 | Transformers | |
| 19/48 | Phi-3.1-mini-4k-instruct-f32 | 3.8B | 15.29 | llamacpp_HF | [link] |
| 19/48 | internlm_internlm2-chat-7b | 7B | 15.48 | Transformers | |
| 19/48 | lightblue_suzume-llama-3-8B-multilingual | 8B | 16.06 | Transformers | |
| 19/48 | UCLA-AGI_Llama-3-Instruct-8B-SPPO-Iter3 | 8B | 16.06 | Transformers | |
| 19/48 | ai21labs_Jamba-v0.1 | 52B | 25.79 | Transformers | --load-in-4bit |
| 19/48 | Meta-Llama-3.1-8B-Instruct-f32 | 8B | 32.1 | llamacpp_HF | [link] |
| 19/48 | internlm_internlm2-chat-20b-sft | 20B | 39.72 | Transformers | |
| 19/48 | llama-2-70b.Q5_K_M | 70B | 48.8 | llamacpp_HF | |
| 19/48 | DeepSeek-Coder-V2-Instruct-IQ2_XS | 236B | 68.7 | llamacpp_HF | [link] |
| 19/48 | zephyr-orpo-141b-A35b-v0.1.Q4_K_M | 141B | 85.59 | llamacpp_HF | |
| 18/48 | gemma-2-2b-it-IQ4_XS | 2.6B | 1.57 | llamacpp_HF | [link] |
| 18/48 | Phi-3.1-mini-4k-instruct-IQ3_XS | 3.8B | 1.63 | llamacpp_HF | [link] |
| 18/48 | Phi-3-mini-4k-instruct-IQ3_S | 3.8B | 1.68 | llamacpp_HF | [link] |
| 18/48 | Phi-3-mini-4k-instruct-Q3_K_S | 3.8B | 1.68 | llamacpp_HF | [link] |
| 18/48 | Phi-3-mini-4k-instruct-old-IQ3_S | 3.8B | 1.68 | llamacpp_HF | [link] |
| 18/48 | Phi-3-mini-4k-instruct-old-Q3_K_S | 3.8B | 1.68 | llamacpp_HF | [link] |
| 18/48 | Phi-3.1-mini-128k-instruct-IQ3_M | 3.8B | 1.86 | llamacpp_HF | [link] |
| 18/48 | Phi-3.1-mini-4k-instruct-IQ3_M | 3.8B | 1.86 | llamacpp_HF | [link] |
| 18/48 | gemma-2-2b-it-Q5_K_M | 2.6B | 1.92 | llamacpp_HF | [link] |
| 18/48 | Phi-3.1-mini-128k-instruct-Q3_K_M | 3.8B | 1.96 | llamacpp_HF | [link] |
| 18/48 | gemma-2-2b-it-Q6_K | 2.6B | 2.15 | llamacpp_HF | [link] |
| 18/48 | Phi-3.1-mini-128k-instruct-Q4_K_S | 3.8B | 2.19 | llamacpp_HF | [link] |
| 18/48 | gemma-2-2b-it-Q6_K_L | 2.6B | 2.29 | llamacpp_HF | [link] |
| 18/48 | Phi-3.1-mini-128k-instruct-Q4_K_M | 3.8B | 2.39 | llamacpp_HF | [link] |
| 18/48 | gemma-2-2b-it-Q8_0 | 2.6B | 2.78 | llamacpp_HF | [link] |
| 18/48 | Phi-3.1-mini-128k-instruct-Q6_K | 3.8B | 3.14 | llamacpp_HF | [link] |
| 18/48 | Phi-3.1-mini-128k-instruct-Q6_K_L | 3.8B | 3.18 | llamacpp_HF | [link] |
| 18/48 | Meta-Llama-3-8B-Instruct-Q3_K_M | 8B | 4.02 | llamacpp_HF | |
| 18/48 | Meta-Llama-3-8B-Instruct-Q3_K_L | 8B | 4.32 | llamacpp_HF | |
| 18/48 | google_gemma-2-2b-it | 2B | 5.23 | Transformers | |
| 18/48 | tiiuae_Falcon3-3B-Instruct | 3B | 6.46 | Transformers | |
| 18/48 | microsoft_Phi-3-mini-128k-instruct | 3.8B | 7.64 | Transformers | --load-in-8bit |
| 18/48 | microsoft_Phi-3.5-mini-instruct | 3.8B | 8.29 | Transformers | |
| 18/48 | Meta-Llama-3.1-8B-Instruct-Q8_0 | 8B | 8.54 | llamacpp_HF | [link] |
| 18/48 | gemma-2-2b-it-f32 | 2.6B | 10.46 | llamacpp_HF | [link] |
| 18/48 | mistralai_Mistral-7B-Instruct-v0.2 | 7B | 14.48 | Transformers | |
| 18/48 | jieliu_Storm-7B | 7B | 14.48 | Transformers | |
| 18/48 | failspy_kappa-3-phi-abliterated | 3.8B | 15.28 | Transformers | |
| 18/48 | Phi-3.1-mini-128k-instruct-f32 | 3.8B | 15.29 | llamacpp_HF | [link] |
| 18/48 | turboderp_llama3-turbcat-instruct-8b | 8B | 16.06 | Transformers | |
| 18/48 | Meta-Llama-3-70B-Instruct-IQ1_M | 70B | 16.8 | llamacpp_HF | |
| 18/48 | Orenguteng_Lexi-Llama-3-8B-Uncensored | 8B | 17.67 | Transformers | |
| 18/48 | Meta-Llama-3.1-70B-Instruct-IQ2_XXS | 70B | 19.1 | llamacpp_HF | [link] |
| 18/48 | LoneStriker_Nous-Capybara-34B-4.65bpw-h6-exl2 | 34B | 20.76 | ExLlamav2_HF | |
| 18/48 | Qwen_Qwen1.5-MoE-A2.7B-Chat | 14.3B | 28.63 | Transformers | |
| 18/48 | TheProfessor-155b.i1-IQ3_XS | 155B | 63.2 | llamacpp_HF | [link] |
| 17/48 | gemma-2-2b-it-IQ3_M | 2.6B | 1.39 | llamacpp_HF | [link] |
| 17/48 | Phi-3-mini-4k-instruct-v0.3-IQ3_XXS | 3.8B | 1.51 | llamacpp_HF | [link] |
| 17/48 | gemma-2-2b-it-Q3_K_L | 2.6B | 1.55 | llamacpp_HF | [link] |
| 17/48 | Phi-3-mini-4k-instruct-v0.3-Q3_K_S | 3.8B | 1.68 | llamacpp_HF | [link] |
| 17/48 | microsoft_Phi-3-mini-128k-instruct | 3.8B | 1.91 | Transformers | --load-in-4bit |
| 17/48 | turboderp_Phi-3-mini-128k-instruct-exl2_5.0bpw | 3.8B | 2.54 | ExLlamav2_HF | |
| 17/48 | Phi-3-mini-4k-instruct-v0.3-Q5_K_S | 3.8B | 2.64 | llamacpp_HF | [link] |
| 17/48 | turboderp_Phi-3-mini-128k-instruct-exl2_6.0bpw | 3.8B | 2.99 | ExLlamav2_HF | |
| 17/48 | Phi-3.1-mini-128k-instruct-Q8_0 | 3.8B | 4.06 | llamacpp_HF | [link] |
| 17/48 | Meta-Llama-3.1-8B-Instruct-Q3_K_L | 8B | 4.32 | llamacpp_HF | [link] |
| 17/48 | Meta-Llama-3.1-8B-Instruct-IQ4_XS | 8B | 4.45 | llamacpp_HF | [link] |
| 17/48 | Meta-Llama-3.1-8B-Instruct-Q5_K_S | 8B | 5.6 | llamacpp_HF | [link] |
| 17/48 | amazingvince_Not-WizardLM-2-7B | 7B | 14.48 | Transformers | |
| 17/48 | Undi95_Toppy-M-7B | 7B | 14.48 | Transformers | |
| 17/48 | internlm_internlm2-chat-7b-sft | 7B | 15.48 | Transformers | |
| 17/48 | mzbac_llama-3-8B-Instruct-function-calling | 8B | 16.06 | Transformers | |
| 17/48 | openchat_openchat-3.6-8b-20240522 | 8B | 16.06 | Transformers | |
| 17/48 | kubernetes-bad_Mistral-7B-Instruct-v0.3 | 7B | 28.99 | Transformers | |
| 17/48 | ZeusLabs_L3-Aethora-15B-V2 | 15B | 30.02 | Transformers | |
| 17/48 | ggml-alpaca-dragon-72b-v1-q4_k_m | 72B | 43.8 | llamacpp_HF | |
| 17/48 | DeepSeek-V2-Chat-0628-IQ2_XS | 236B | 68.7 | llamacpp_HF | [link] |
| 17/48 | Meta-Llama-3.1-405B-Instruct.i1-IQ1_S | 405B | 86.8 | llamacpp_HF | [link] |
| 17/48 | grok-1-IQ2_XS | 314B | 93.31 | llamacpp_HF | [link] |
| 16/48 | Phi-3.1-mini-4k-instruct-Q3_K_S | 3.8B | 1.68 | llamacpp_HF | [link] |
| 16/48 | Phi-3.1-mini-128k-instruct-IQ4_XS | 3.8B | 2.06 | llamacpp_HF | [link] |
| 16/48 | Phi-3-mini-4k-instruct-v0.3-Q4_K_S | 3.8B | 2.19 | llamacpp_HF | [link] |
| 16/48 | Meta-Llama-3-8B-Instruct-IQ3_M | 8B | 3.78 | llamacpp_HF | |
| 16/48 | Meta-Llama-3.1-8B-Instruct-Q3_K_M | 8B | 4.02 | llamacpp_HF | [link] |
| 16/48 | TheBloke_Mistral-7B-Instruct-v0.2-GPTQ | 7B | 4.16 | ExLlamav2_HF | |
| 16/48 | Meta-Llama-3.1-8B-Instruct-Q3_K_XL | 8B | 4.78 | llamacpp_HF | [link] |
| 16/48 | microsoft_Phi-3-mini-128k-instruct-20240701 | 3.8B | 7.64 | Transformers | |
| 16/48 | mixtral-8x7b-instruct-v0.1.Q2_K | 8x7B | 15.6 | llamacpp_HF | |
| 15/48 | Phi-3-mini-4k-instruct-IQ3_XS | 3.8B | 1.63 | llamacpp_HF | [link] |
| 15/48 | Phi-3-mini-4k-instruct-old-IQ3_XS | 3.8B | 1.63 | llamacpp_HF | [link] |
| 15/48 | Phi-3.1-mini-128k-instruct-Q3_K_S | 3.8B | 1.68 | llamacpp_HF | [link] |
| 15/48 | Phi-3-mini-4k-instruct-v0.3-Q3_K_L | 3.8B | 2.09 | llamacpp_HF | [link] |
| 15/48 | Phi-3-mini-4k-instruct-v0.3-Q4_K_M | 3.8B | 2.39 | llamacpp_HF | [link] |
| 15/48 | Phi-3-mini-4k-instruct-v0.3-Q5_K_M | 3.8B | 2.82 | llamacpp_HF | [link] |
| 15/48 | Meta-Llama-3-8B-Instruct-IQ3_XS | 8B | 3.52 | llamacpp_HF | |
| 15/48 | Meta-Llama-3.1-8B-Instruct-Q3_K_S | 8B | 3.66 | llamacpp_HF | [link] |
| 15/48 | hjhj3168_Llama-3-8b-Orthogonalized-exl2 | 8B | 6.7 | ExLlamav2_HF | |
| 15/48 | cognitivecomputations_dolphin-2.9-llama3-8b | 8B | 16.06 | Transformers | |
| 15/48 | xtuner_llava-llama-3-8b-v1_1 | 8B | 16.06 | Transformers | |
| 15/48 | NousResearch_Hermes-2-Pro-Llama-3-8B | 8B | 16.06 | Transformers | |
| 15/48 | CohereForAI_c4ai-command-r-v01-4bit | 35B | 22.69 | Transformers | |
| 14/48 | turboderp_Phi-3-mini-128k-instruct-exl2_4.0bpw | 3.8B | 2.09 | ExLlamav2_HF | |
| 14/48 | Meta-Llama-3-8B-Instruct-IQ3_XXS | 8B | 3.27 | llamacpp_HF | |
| 14/48 | Meta-Llama-3.1-8B-Instruct-IQ3_XS | 8B | 3.52 | llamacpp_HF | [link] |
| 14/48 | Meta-Llama-3-8B-Instruct-Q3_K_S | 8B | 3.66 | llamacpp_HF | |
| 14/48 | Meta-Llama-3.1-8B-Instruct-IQ3_M | 8B | 3.78 | llamacpp_HF | [link] |
| 14/48 | mattshumer_Llama-3-8B-16K | 8B | 16.06 | Transformers | |
| 14/48 | nvidia_ChatQA-1.5-8B | 8B | 16.06 | Transformers | Alpaca template. |
| 14/48 | Gryphe_MythoMax-L2-13b | 13B | 26.03 | Transformers | |
| 14/48 | microsoft_Orca-2-13b | 13B | 52.06 | Transformers | |
| 14/48 | Undi95_ReMM-SLERP-L2-13B | 13B | 52.06 | Transformers | |
| 13/48 | Phi-3-mini-4k-instruct-v0.3-Q3_K_M | 3.8B | 1.96 | llamacpp_HF | [link] |
| 13/48 | Phi-3-mini-4k-instruct-v0.3-Q6_K | 3.8B | 3.14 | llamacpp_HF | [link] |
| 13/48 | Phi-3-medium-4k-instruct-IQ2_XXS | 14B | 3.72 | llamacpp_HF | [link] |
| 13/48 | gradientai_Llama-3-8B-Instruct-262k | 8B | 16.06 | Transformers | |
| 13/48 | gradientai_Llama-3-8B-Instruct-Gradient-1048k | 8B | 16.06 | Transformers | |
| 13/48 | nvidia_ChatQA-1.5-8B | 8B | 16.06 | Transformers | NVIDIA-ChatQA template. |
| 13/48 | alpindale_gemma-7b-it | 7B | 17.08 | Transformers | |
| 13/48 | internlm_internlm2-wqx-20b | 20B | 39.72 | Transformers | |
| 13/48 | GeorgiaTechResearchInstitute_galactica-30b-evol-instruct-70k | 30B | 59.95 | Transformers | GALACTICA template. |
| 13/48 | turboderp_dbrx-instruct-exl2_3.75bpw | 132B | 62.83 | ExLlamav2_HF | Without the "You are DBRX..." system prompt. |
| 12/48 | Phi-3.1-mini-128k-instruct-Q2_K | 3.8B | 1.42 | llamacpp_HF | [link] |
| 12/48 | Phi-3.1-mini-128k-instruct-Q2_K_L | 3.8B | 1.51 | llamacpp_HF | [link] |
| 12/48 | Phi-3-mini-4k-instruct-v0.3-Q8_0 | 3.8B | 4.06 | llamacpp_HF | [link] |
| 12/48 | ISTA-DASLab_Meta-Llama-3-8B-Instruct-AQLM-2Bit-1x16 | 8B | 4.08 | Transformers | |
| 12/48 | HuggingFaceH4_zephyr-7b-beta | 7B | 14.48 | Transformers | |
| 12/48 | Phi-3-mini-4k-instruct-v0.3-f32 | 3.8B | 15.29 | llamacpp_HF | [link] |
| 12/48 | Meta-Llama-3-70B-Instruct-IQ1_S | 70B | 15.3 | llamacpp_HF | |
| 12/48 | NousResearch_Llama-2-13b-chat-hf | 13B | 26.03 | Transformers | |
| 12/48 | mistralai_mathstral-7B-v0.1 | 7B | 28.99 | Transformers | |
| 12/48 | llama-65b.Q5_K_M | 65B | 46.2 | llamacpp_HF | |
| 11/48 | Phi-3-mini-4k-instruct-Q2_K | 3.8B | 1.42 | llamacpp_HF | [link] |
| 11/48 | Phi-3-mini-4k-instruct-v0.3-IQ3_M | 3.8B | 1.86 | llamacpp_HF | [link] |
| 11/48 | Phi-3-mini-4k-instruct-v0.3-IQ4_XS | 3.8B | 2.06 | llamacpp_HF | [link] |
| 11/48 | Meta-Llama-3-8B-Instruct-IQ2_M | 8B | 2.95 | llamacpp_HF | |
| 11/48 | Meta-Llama-3.1-8B-Instruct-IQ2_M | 8B | 2.95 | llamacpp_HF | [link] |
| 11/48 | Meta-Llama-3-8B-Instruct-Q2_K | 8B | 3.18 | llamacpp_HF | |
| 11/48 | Phi-3-medium-128k-instruct-IQ2_XXS | 14B | 3.72 | llamacpp_HF | [link] |
| 11/48 | meta-llama_Llama-3.2-3B-Instruct | 3B | 6.43 | Transformers | |
| 11/48 | mlabonne_phixtral-2x2_8 | 2x2.8B | 8.92 | Transformers | |
| 11/48 | Meta-Llama-3.1-70B-Instruct-IQ1_M | 70B | 16.8 | llamacpp_HF | [link] |
| 11/48 | mlfoundations_tabula-8b | 8B | 32.12 | Transformers | |
| 11/48 | nyunai_nyun-c2-llama3-56B | 56B | 55.62 | Transformers | --load-in-8bit |
| 10/48 | ISTA-DASLab_c4ai-command-r-v01-AQLM-2Bit-1x16 | 35B | 12.72 | Transformers | |
| 10/48 | facebook_galactica-30b | 30B | 60.66 | Transformers | |
| 9/48 | Phi-3-mini-4k-instruct-old-Q2_K | 3.8B | 1.42 | llamacpp_HF | [link] |
| 9/48 | Qwen_Qwen2-1.5B-Instruct | 1.5B | 3.09 | Transformers | |
| 9/48 | tiiuae_Falcon3-1B-Instruct | 1B | 3.34 | Transformers | |
| 9/48 | Meta-Llama-3.1-8B-Instruct-Q2_K_L | 8B | 3.69 | llamacpp_HF | [link] |
| 9/48 | microsoft_phi-2 | 2.7B | 5.56 | Transformers | |
| 9/48 | TheBloke_vicuna-33B-GPTQ | 33B | 16.94 | ExLlamav2_HF | |
| 9/48 | qingy2019_NaturalLM | 12B | 24.50 | Transformers | |
| 8/48 | Phi-3-mini-4k-instruct-v0.3-Q2_K | 3.8B | 1.42 | llamacpp_HF | [link] |
| 8/48 | Meta-Llama-3-8B-Instruct-IQ2_S | 8B | 2.76 | llamacpp_HF | |
| 8/48 | ISTA-DASLab_Meta-Llama-3-8B-Instruct-AQLM-2Bit-1x16 | 8B | 4.08 | Transformers | |
| 8/48 | NousResearch_Llama-2-7b-chat-hf | 7B | 13.48 | Transformers | |
| 8/48 | mistralai_Mistral-7B-Instruct-v0.1 | 7B | 14.48 | Transformers | |
| 8/48 | NousResearch_Nous-Capybara-7B-V1.9 | 7B | 14.48 | Transformers | |
| 8/48 | gradientai_Llama-3-8B-Instruct-Gradient-1048k | 8B | 16.06 | Transformers | Revision of 2024/05/04. |
| 7/48 | Phi-3.1-mini-4k-instruct-Q2_K | 3.8B | 1.42 | llamacpp_HF | [link] |
| 7/48 | Phi-3-mini-4k-instruct-v0.3-IQ3_XS | 3.8B | 1.63 | llamacpp_HF | [link] |
| 7/48 | Meta-Llama-3.1-8B-Instruct-Q2_K | 8B | 3.18 | llamacpp_HF | [link] |
| 7/48 | h2oai_h2o-danube3-4b-chat | 4B | 7.92 | Transformers | |
| 7/48 | tiiuae_falcon-11B | 11B | 22.21 | Transformers | |
| 7/48 | GeorgiaTechResearchInstitute_galactica-30b-evol-instruct-70k | 30B | 59.95 | Transformers | Alpaca template. |
| 6/48 | Phi-3-mini-4k-instruct-IQ2_S | 3.8B | 1.22 | llamacpp_HF | [link] |
| 6/48 | Phi-3.1-mini-4k-instruct-Q2_K_L | 3.8B | 1.51 | llamacpp_HF | [link] |
| 6/48 | meta-llama_Meta-Llama-3.1-8B | 8B | 16.06 | Transformers | |
| 6/48 | tiiuae_falcon-40b-instruct | 40B | 41.84 | Transformers | --load-in-8bit; falcon-180B-chat instruction template. |
| 5/48 | Phi-3-mini-4k-instruct-old-IQ2_S | 3.8B | 1.22 | llamacpp_HF | [link] |
| 5/48 | Meta-Llama-3-8B-Instruct-IQ2_XS | 8B | 2.61 | llamacpp_HF | |
| 5/48 | Phi-3-medium-128k-instruct-IQ1_S | 14B | 2.96 | llamacpp_HF | [link] |
| 5/48 | internlm_internlm2-chat-1_8b-sft | 1.8B | 3.78 | Transformers | |
| 5/48 | TheBloke_Llama-2-13B-GPTQ | 13B | 7.26 | ExLlamav2_HF | |
| 5/48 | LoneStriker_deepseek-coder-33b-instruct-6.0bpw-h6-exl2 | 33B | 25.32 | ExLlamav2_HF | |
| 5/48 | NousResearch_Llama-2-13b-hf | 13B | 26.03 | Transformers | |
| 5/48 | unsloth_llama-3-70b-bnb-4bit | 70B | 39.52 | Transformers | |
| 4/48 | Phi-3-mini-4k-instruct-IQ2_M | 3.8B | 1.32 | llamacpp_HF | [link] |
| 4/48 | turboderp_Phi-3-mini-128k-instruct-exl2_3.0bpw | 3.8B | 1.63 | ExLlamav2_HF | |
| 4/48 | internlm_internlm2-chat-1_8b | 1.8B | 3.78 | Transformers | |
| 4/48 | TheBloke_deepseek-coder-33B-instruct-AWQ | 33B | 18.01 | AutoAWQ | |
| 3/48 | Phi-3-mini-4k-instruct-IQ2_XXS | 3.8B | 1.04 | llamacpp_HF | [link] |
| 3/48 | Phi-3-mini-4k-instruct-old-IQ2_XXS | 3.8B | 1.04 | llamacpp_HF | [link] |
| 3/48 | Phi-3.1-mini-4k-instruct-IQ2_M | 3.8B | 1.32 | llamacpp_HF | [link] |
| 3/48 | Phi-3-mini-4k-instruct-old-IQ2_M | 3.8B | 1.32 | llamacpp_HF | [link] |
| 3/48 | Phi-3-medium-128k-instruct-IQ1_M | 14B | 3.24 | llamacpp_HF | [link] |
| 3/48 | tiiuae_Falcon3-7B-Instruct-1.58bit | 7B | 3.27 | Transformers | |
| 3/48 | TheBloke_Llama-2-7B-GPTQ | 7B | 3.9 | ExLlamav2_HF | |
| 3/48 | turboderp_dbrx-instruct-exl2_3.75bpw | 132B | 62.83 | ExLlamav2_HF | |
| 2/48 | Phi-3-mini-4k-instruct-v0.3-IQ2_S | 3.8B | 1.22 | llamacpp_HF | [link] |
| 2/48 | Meta-Llama-3-8B-Instruct-IQ2_XXS | 8B | 2.4 | llamacpp_HF | |
| 2/48 | Phi-3-medium-4k-instruct-IQ1_M | 14B | 3.24 | llamacpp_HF | [link] |
| 2/48 | facebook_galactica-6.7b | 6.7B | 13.72 | Transformers | |
| 1/48 | Phi-3.1-mini-128k-instruct-IQ2_XS | 3.8B | 1.15 | llamacpp_HF | [link] |
| 1/48 | Phi-3-mini-4k-instruct-v0.3-IQ2_XS | 3.8B | 1.15 | llamacpp_HF | [link] |
| 1/48 | Phi-3-mini-4k-instruct-IQ2_XS | 3.8B | 1.15 | llamacpp_HF | [link] |
| 1/48 | Phi-3-mini-4k-instruct-old-IQ2_XS | 3.8B | 1.15 | llamacpp_HF | [link] |
| 1/48 | Phi-3.1-mini-128k-instruct-IQ2_S | 3.8B | 1.22 | llamacpp_HF | [link] |
| 1/48 | Phi-3.1-mini-128k-instruct-IQ2_M | 3.8B | 1.32 | llamacpp_HF | [link] |
| 1/48 | turboderp_Phi-3-mini-128k-instruct-exl2_2.5bpw | 3.8B | 1.41 | ExLlamav2_HF | |
| 1/48 | Phi-3-medium-4k-instruct-IQ1_S | 14B | 2.96 | llamacpp_HF | [link] |
| 1/48 | tiiuae_Falcon3-10B-Instruct-1.58bit | 10B | 3.99 | Transformers | |
| 1/48 | bartowski_CodeQwen1.5-7B-Chat-exl2_8_0 | 7B | 7.63 | ExLlamav2_HF | |
| 1/48 | NousResearch_Llama-2-7b-hf | 7B | 13.48 | Transformers | |
| 1/48 | Qwen_CodeQwen1.5-7B-Chat | 7B | 14.5 | Transformers | |
| 0/48 | facebook_galactica-125m | 0.125B | 0.25 | Transformers | |
| 0/48 | openai-community_gpt2 | 0.124B | 0.55 | Transformers | |
| 0/48 | Phi-3-mini-4k-instruct-IQ1_S | 3.8B | 0.84 | llamacpp_HF | [link] |
| 0/48 | Phi-3-mini-4k-instruct-old-IQ1_S | 3.8B | 0.84 | llamacpp_HF | [link] |
| 0/48 | Phi-3-mini-4k-instruct-IQ1_M | 3.8B | 0.92 | llamacpp_HF | [link] |
| 0/48 | Phi-3-mini-4k-instruct-old-IQ1_M | 3.8B | 0.92 | llamacpp_HF | [link] |
| 0/48 | Qwen_Qwen2-0.5B-Instruct | 0.5B | 0.99 | Transformers | |
| 0/48 | h2oai_h2o-danube3-500m-chat | 0.5B | 1.03 | Transformers | |
| 0/48 | Phi-3-mini-4k-instruct-v0.3-IQ2_M | 3.8B | 1.32 | llamacpp_HF | [link] |
| 0/48 | openai-community_gpt2-medium | 0.355B | 1.52 | Transformers | |
| 0/48 | Meta-Llama-3-8B-Instruct-IQ1_S | 8B | 2.02 | llamacpp_HF | |
| 0/48 | Meta-Llama-3-8B-Instruct-IQ1_M | 8B | 2.16 | llamacpp_HF | |
| 0/48 | TinyLlama_TinyLlama-1.1B-Chat-v1.0 | 1.1B | 2.2 | Transformers | |
| 0/48 | ISTA-DASLab_Llama-2-7b-AQLM-2Bit-1x16-hf | 7B | 2.38 | Transformers | |
| 0/48 | meta-llama_Llama-3.2-1B-Instruct | 1B | 2.47 | Transformers | |
| 0/48 | facebook_galactica-1.3b | 1.3B | 2.63 | Transformers | |
| 0/48 | openai-community_gpt2-large | 0.774B | 3.25 | Transformers | |
| 0/48 | EleutherAI_gpt-neo-1.3B | 1.3B | 5.31 | Transformers | |
| 0/48 | openai-community_gpt2-xl | 1.5B | 6.43 | Transformers | |
| 0/48 | EleutherAI_gpt-neo-2.7B | 2.7B | 10.67 | Transformers | |
| 0/48 | facebook_opt-6.7b | 6.7B | 13.32 | Transformers | |
| 0/48 | gpt4chan_model_float16 | 6B | 24.21 | Transformers | |
| 0/48 | EleutherAI_gpt-j-6b | 6B | 24.21 | Transformers | |
| 0/48 | facebook_opt-13b | 13B | 25.71 | Transformers | |
| 0/48 | EleutherAI_gpt-neox-20b | 20B | 41.29 | Transformers | |
| 0/48 | facebook_opt-30b | 30B | 59.95 | Transformers | |

Updates

The benchmark was updated on the following dates: 2024/12/18, 2024/12/13, 2024/12/09, 2024/12/08, 2024/11/18, 2024/10/24, 2024/10/16, 2024/10/14, 2024/09/27, 2024/08/20, 2024/08/10, 2024/08/08, 2024/08/05, 2024/08/04, 2024/08/01, 2024/07/29, 2024/07/27, 2024/07/25, 2024/07/24, 2024/07/23, 2024/07/22, 2024/07/20, 2024/07/18, 2024/07/15, 2024/07/11, 2024/07/03, 2024/07/02, 2024/07/01, 2024/06/28, 2024/06/26, 2024/06/25, 2024/05/24, 2024/05/23, 2024/05/21, 2024/05/20, 2024/05/19, 2024/05/12, 2024/05/10, 2024/05/07, 2024/05/06, 2024/05/05, 2024/05/04, 2024/05/03, 2024/04/28, 2024/04/27, 2024/04/26, 2024/04/25, 2024/04/24, 2024/04/23.

About

This test consists of 48 manually written multiple-choice questions. It evaluates a combination of academic knowledge and logical reasoning.

Compared to MMLU, it has the advantage of not being present in any training dataset, and the disadvantage of being much smaller. Compared to the LMSYS Chatbot Arena, it is harsher on small models like Starling-LM-7B-beta, which write nicely formatted replies but do not have much knowledge.

The correct Jinja2 instruction template is used for each model, as autodetected by text-generation-webui from the model's metadata. For base models without a template, Alpaca is used. The questions are evaluated using the /v1/internal/logits endpoint in the project's API.
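
As a rough illustration of this logit-based scoring, the sketch below queries a local text-generation-webui instance and picks the answer letter with the highest logit for the next token. It is a hypothetical sketch, not the benchmark's actual code: the server address, the request fields ("prompt", "top_logits"), and the response shape are assumptions to be checked against the project's API schema.

```python
# Minimal sketch (assumptions noted above): score one multiple-choice question
# by asking /v1/internal/logits for the next-token logits and comparing the
# candidate answer letters.
import requests

API_URL = "http://127.0.0.1:5000/v1/internal/logits"  # assumed local server

def pick_answer(prompt: str, choices=("A", "B", "C", "D")) -> str:
    # The prompt should already have the model's instruction template applied
    # and end exactly where the answer letter is expected.
    response = requests.post(API_URL, json={"prompt": prompt, "top_logits": 100})
    response.raise_for_status()
    logits = response.json()  # assumed shape: {token_string: logit, ...}
    # Keep only the candidate answer letters and take the highest-scoring one.
    scores = {c: logits.get(c, float("-inf")) for c in choices}
    return max(scores, key=scores.get)

# Usage (hypothetical): print(pick_answer(formatted_question))
```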

The questions are private.

Limitations

This benchmark does not evaluate code generation, non-English languages, role-playing, RAG, or long-context understanding. Performance in those areas may correlate only weakly, or not at all, with what is measured here.

Do you find this useful? Consider making a donation.