PwC and AWS Alliance

Generative AI chatbot using Llama 2 on AWS

  • Blog
  • 15 Minute Read
  • September 30, 2023

Jayant Raj

Director, Cloud & Digital, AWS Ambassador, PwC US


This post demonstrates how to build a GenAI chatbot using a private instance of the open source Llama 2 model, deployed on Amazon SageMaker using the AWS Cloud Development Kit (CDK) and fronted by AWS Lambda and API Gateway. Llama 2 is a family of pretrained and fine-tuned large language models (LLMs) released by Meta in July 2023. Llama 2 was pretrained on 2 trillion tokens and has a 4k-token context length.
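
To make the request path concrete, here is a minimal sketch of a Lambda handler that sits behind an API Gateway proxy integration and forwards a prompt to the SageMaker endpoint. It is an illustration under stated assumptions, not the code from the full post: the endpoint name, the `prompt` and `answer` field names, and the generation parameters are hypothetical, and the request and response shapes assume the Hugging Face Text Generation Inference container (a JumpStart deployment expects a different payload).

```python
import json
import os

# Hypothetical endpoint name; the post does not state one.
ENDPOINT_NAME = os.environ.get("SAGEMAKER_ENDPOINT_NAME", "llama-2-chat-endpoint")


def build_payload(prompt, max_new_tokens=256):
    """Build a request body in the shape the TGI container accepts (assumed here)."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,  # illustrative values, not tuned
            "temperature": 0.6,
            "top_p": 0.9,
        },
    }


def extract_text(raw_body):
    """Read the generated text from a TGI-style response: [{"generated_text": "..."}]."""
    return json.loads(raw_body)[0]["generated_text"]


def _response(status, payload):
    """Format a response for an API Gateway proxy integration."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context, runtime_client=None):
    """Lambda entry point: parse the request, invoke the endpoint, return the answer."""
    if runtime_client is None:
        # Imported lazily so the helpers above can be exercised without AWS access.
        import boto3

        runtime_client = boto3.client("sagemaker-runtime")

    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "Request body must be valid JSON."})

    if not isinstance(body, dict) or not str(body.get("prompt", "")).strip():
        return _response(400, {"error": "Missing 'prompt' field."})

    result = runtime_client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(build_payload(str(body["prompt"]).strip())),
    )
    return _response(200, {"answer": extract_text(result["Body"].read())})
```

Passing the runtime client as an optional argument keeps the handler easy to test locally with a stub, while Lambda itself calls it with only `event` and `context`.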

This post outlines how to deploy an open source LLM on SageMaker and how to use Chainlit, an open source Python package, to build a ChatGPT-like user interface for LLM applications. The Llama 2 model is deployed on SageMaker using two different approaches. The first approach uses the SageMaker Studio console to deploy the model via SageMaker JumpStart. The second approach uses AWS CDK to deploy the Llama 2 model from Hugging Face on SageMaker using the Hugging Face Text Generation Inference (TGI) container.
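
For the CDK approach, much of the model-specific configuration ends up as environment variables on the TGI container. The sketch below shows one way to assemble them in Python. It is a hedged illustration rather than the post's actual stack: the helper name, the default token limits, and the example model ID are assumptions, and the only constraint taken from the post is the 4k-token context length.

```python
LLAMA2_CONTEXT_TOKENS = 4096  # the 4k context length mentioned in the post


def tgi_environment(model_id, num_gpus, max_input_length=2048,
                    max_total_tokens=4096, hub_token=None):
    """Return the environment variables for a TGI container serving Llama 2.

    max_input_length caps the prompt; max_total_tokens caps prompt plus
    generated tokens, so it cannot exceed the model's context window.
    """
    if not 0 < max_input_length < max_total_tokens:
        raise ValueError("max_input_length must be positive and below max_total_tokens")
    if max_total_tokens > LLAMA2_CONTEXT_TOKENS:
        raise ValueError("max_total_tokens exceeds the Llama 2 context length")

    env = {
        "HF_MODEL_ID": model_id,          # model repository on the Hugging Face Hub
        "SM_NUM_GPUS": str(num_gpus),     # GPUs used to shard the model
        "MAX_INPUT_LENGTH": str(max_input_length),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
    }
    if hub_token:
        # Llama 2 repositories are gated, so a Hub access token is typically needed.
        env["HUGGING_FACE_HUB_TOKEN"] = hub_token
    return env


# Example: settings for a 7B chat model on a single-GPU instance (assumed sizing).
example_env = tgi_environment("meta-llama/Llama-2-7b-chat-hf", num_gpus=1)
```

In a CDK stack written in Python, a dictionary like this would typically be supplied as the container `environment` when defining the SageMaker model, for example through `aws_cdk.aws_sagemaker.CfnModel`. In practice the Hub token should come from a secret store rather than being hard-coded.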

