
Add the new multi-modal model of Mistral AI: pixtral-12b #3535

Open
SuperPat45 opened this issue Sep 12, 2024 · 2 comments
Labels: enhancement (New feature or request), roadmap

Comments

@SuperPat45

Add the new multi-modal model from Mistral AI, pixtral-12b:

https://huggingface.co/mistral-community/pixtral-12b-240910

It includes an image encoder; could it also be added to the image generation API as an alternative to Stable Diffusion?

@SuperPat45 SuperPat45 added the enhancement New feature or request label Sep 12, 2024
@AlexM4H

AlexM4H commented Sep 13, 2024

Since yesterday, vLLM has InternVL2 support. :-)

vllm-project/vllm/releases/tag/v0.6.1

@mudler mudler added the roadmap label Sep 13, 2024
@mudler
Owner

mudler commented Sep 13, 2024

I guess that would already work with llama.cpp GGUF models if/when it gets supported there (see also ggerganov/llama.cpp#9440).

I'd change the focus of this one to be more generic and add support for multimodal models with vLLM. Examples:

https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_pixtral.py
https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_vision_language_multi_image.py
