Understanding How Much Memory You Need to Serve a VLM
vLLM is currently one of the fastest inference engines for large language models (LLMs). It supports a wide range of model architectures and quantization methods.
vLLM also supports vision-language models (VLMs) with multimodal inputs containing both images and text prompts. For instance, vLLM can now serve models like Phi-3.5 Vision and Pixtral, which excel at tasks such as image captioning, optical character recognition (OCR), and visual question answering (VQA).
In this article, I'll show you how to use VLMs with vLLM, focusing on the key parameters that impact memory consumption. We'll see why VLMs consume much more memory than standard LLMs. We'll use Phi-3.5 Vision and Pixtral as case studies for a multimodal application that processes prompts containing text and images.
The code for running Phi-3.5 Vision and Pixtral with vLLM is available in this notebook:
Get the notebook (#105)
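To give a concrete idea of what this looks like, here is a minimal sketch of serving Phi-3.5 Vision with vLLM's offline Python API. The parameter values (context length, GPU memory fraction, image limit) and the image path are illustrative assumptions, not the settings used in the notebook:

```python
# Minimal sketch: running a VLM (Phi-3.5 Vision) with vLLM.
# Values below are illustrative assumptions, not the notebook's settings.
from vllm import LLM, SamplingParams
from PIL import Image

llm = LLM(
    model="microsoft/Phi-3.5-vision-instruct",
    trust_remote_code=True,            # Phi-3.5 Vision ships custom code
    max_model_len=4096,                # caps the KV cache size
    gpu_memory_utilization=0.9,        # fraction of GPU memory vLLM may use
    limit_mm_per_prompt={"image": 1},  # at most one image per prompt
)

# Phi-3.5 Vision's chat template references images as <|image_1|>, <|image_2|>, ...
prompt = "<|user|>\n<|image_1|>\nDescribe this image.<|end|>\n<|assistant|>\n"
image = Image.open("example.jpg")      # hypothetical local image

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

The three constructor arguments highlighted in the comments are the ones we will keep coming back to, because they largely determine how much GPU memory the engine reserves.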
In transformer models, generating text token by token is slow because each prediction depends on all previous tokens…