Friday, June 14, 2024

Deploying Large Language Models With HuggingFace TGI | by Ram Vegiraju | Jul, 2023

Another way to efficiently host and scale your LLMs with Amazon SageMaker

Towards Data Science
Image from Unsplash

Large Language Models (LLMs) continue to soar in popularity, with a new one released nearly every week. As the number of these models grows, so do the options for hosting them. In my previous article we explored how to use DJL Serving within Amazon SageMaker to efficiently host LLMs. In this article we explore another optimized model server and solution: HuggingFace Text Generation Inference (TGI).

NOTE: For those of you new to AWS, make sure you create an account at the following link if you want to follow along. The article also assumes an intermediate understanding of SageMaker Deployment; I’d suggest following this article to understand Deployment/Inference in more depth.

DISCLAIMER: I’m a Machine Learning Architect at AWS and my opinions are my own.

Why HuggingFace Text Generation Inference? How Does It Work With Amazon SageMaker?

TGI is a Rust, Python, and gRPC model server created by HuggingFace that can be used to host specific large language models. HuggingFace has long been the central hub for NLP, and TGI contains a large set of optimizations for LLMs specifically; a few are listed below, and the documentation has an extensive list.

  • Tensor Parallelism for efficient hosting across multiple GPUs
  • Token Streaming with SSE
  • Quantization with bitsandbytes
  • Logits warper (different parameters such as temperature, top-k, top-p, etc.)
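
To make the logits warper item more concrete, here is a minimal pure-Python sketch of how temperature, top-k, and top-p reshape a next-token distribution. It is a simplified illustration and not TGI's actual implementation; the function name `warp_logits` and the toy logits are assumptions made for this example.

```python
import math

def warp_logits(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Illustrative only: return {token_id: probability} after warping."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = {i: e / total for i, e in enumerate(exps)}

    # Rank token ids from most to least likely.
    order = sorted(probs, key=probs.get, reverse=True)

    # Top-k: keep only the k most likely tokens, then renormalize.
    if top_k > 0:
        order = order[:top_k]
        mass = sum(probs[i] for i in order)
        probs = {i: probs[i] / mass for i in order}

    # Top-p (nucleus): keep the smallest prefix whose cumulative
    # probability reaches top_p, then renormalize.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

toy_logits = [2.0, 1.0, 0.0, -1.0]  # hypothetical scores for 4 tokens
print(warp_logits(toy_logits, temperature=0.7, top_k=3, top_p=0.9))
```

Sampling then draws the next token from the returned distribution; tighter settings (lower temperature, smaller top-k or top-p) make the output more deterministic.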

A big positive of this solution that I noted is its ease of use. TGI currently supports the following optimized model architectures, which you can deploy directly using the TGI containers.
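
As a rough sketch of what deploying with a TGI container involves, the snippet below builds the two pieces of configuration you typically supply: the container environment and a request payload. The helper names `build_tgi_env` and `build_request`, the model ID, and the numeric defaults are assumptions for illustration; check the environment variable names against the documentation for the container version you use.

```python
import json

def build_tgi_env(model_id, num_gpus, max_input_length=1024, max_total_tokens=2048):
    """Hypothetical helper: environment variables for a TGI container."""
    if max_input_length >= max_total_tokens:
        raise ValueError("max_input_length must be smaller than max_total_tokens")
    return {
        "HF_MODEL_ID": model_id,            # model to pull from the HuggingFace Hub
        "SM_NUM_GPUS": str(num_gpus),       # GPUs to shard across (tensor parallelism)
        "MAX_INPUT_LENGTH": str(max_input_length),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
    }

def build_request(prompt, max_new_tokens=64, temperature=0.7, top_k=50, top_p=0.95):
    """Hypothetical helper: JSON body in the inputs/parameters shape TGI accepts."""
    return json.dumps({
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "do_sample": True,
            "temperature": temperature,
            "top_k": top_k,
            "top_p": top_p,
        },
    })

# Assumed example values; substitute your own model and GPU count.
env = build_tgi_env("google/flan-t5-xxl", num_gpus=4)
print(env)
print(build_request("What is Amazon SageMaker?"))
```

With the SageMaker Python SDK, a dictionary like `env` would typically be passed as the `env` argument of `sagemaker.huggingface.HuggingFaceModel`, and the JSON body sent to the resulting endpoint.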
