Releases: tatsu-lab/alpaca_eval

Release v0.6.5

17 Aug 23:39

What's Changed

  • Add Llama-3-Instruct-8B-WPO-HB-v2 to AlpacaEval by @wzhouad in #377
  • [ENH] add llama 3.1 by @YannDubs in #378
  • [ENH] add example for Llama 3 vLLM by @YannDubs in #381
  • Add Infinity-Instruct-7M-0729-Llama3_1-70B, Infinity-Instruct-7M-0729-Llama3_1-8B, Infinity-Instruct-7M-0729-mistral-7B to AlpacaEval by @cszhengyh in #383
  • Add gemma-2-9b-it-WPO-HB to AlpacaEval by @wzhouad in #384
  • Add link to gemma-2-9b-it-WPO-HB by @wzhouad in #385
  • Change the name of the Infinity-Instruct-7M-0729-Models to Infinity-Instruct-7M-Gen-Models by @cszhengyh in #387
  • Add blendaxai-gm-l3-v35 to AlpacaEval by @ym-blendax-ai in #389
  • [ENH] OpenAI use tools instead of functions by @YannDubs in #391
  • [ENH] enable base_dir to be a list by @YannDubs in #392
  • [ENH] add mistral v0.3, Qwen2 70b, gpt4 mini by @YannDubs in #393

Full Changelog: v0.6.4...v0.6.5

Release v0.6.4

18 Jul 18:01

What's Changed

Full Changelog: v0.6.3...v0.6.4

Release v0.6.3

24 Jun 00:58

What's Changed

Full Changelog: v0.6.2...v0.6.3

Release v0.6.2

19 Apr 06:28

What's Changed

Full Changelog: v0.6.1...v0.6.2

Release v0.6.1

13 Apr 05:40

What's Changed

Full Changelog: v0.6...v0.6.1

Release v0.6

20 Mar 02:50

What's Changed

Full Changelog: v0.5.4...v0.6

Release v0.5.4

24 Feb 08:56

What's Changed

  • Add Qwen1.5-72B-Chat to AlpacaEval by @Lukeming-tsinghua in #226
  • Add claude-instant-1.2, deepseek-llm-67b-chat, wizardlm-70b, Qwen-14B-Chat (config + outputs without annotations) by @gblazex in #228
  • [DATA] Adding annotations for the arena models by @YannDubs in #229
  • Update README.md - Add missing "Y" to "ou" by @yoderj in #230
  • [DEV] Analyzing length-controlled metrics. by @YannDubs in #231
  • [DOC] add annotation interpretation by @YannDubs in #232
  • [DATA] add results from the Arena openai models by @YannDubs in #234
  • update ELO for llama-2-13b-chat-hf by @gblazex in #235
  • [NOTEBOOK] add length-corrected GLM by @YannDubs in #237
  • [ENH] add inverse mapper to make sure in and out types are the same by @YannDubs in #240
  • [ENH] update to allow AF to use AE by @YannDubs in #241

Full Changelog: v0.5.3...v0.5.4

Release v0.5.3

01 Feb 08:54

What's Changed

Full Changelog: v0.5.2...v0.5.3

Release v0.5.2

10 Jan 23:57

What's Changed

Full Changelog: v0.5.1...v0.5.2

Release v0.5.1

10 Jan 06:16

What's Changed

Full Changelog: v0.5.0...v0.5.1