PwC and AWS Alliance

Generative AI chatbot using Llama 2 on AWS

Blog
15 minute read
September 30, 2023

Jayant Raj

Director, Cloud & Digital, AWS Ambassador, PwC US

This post demonstrates how to build a generative AI (GenAI) chatbot using a private instance of the open source Llama 2 model, deployed on Amazon SageMaker with the AWS Cloud Development Kit (CDK) and fronted by AWS Lambda and Amazon API Gateway. Llama 2 is a family of pretrained and fine-tuned large language models (LLMs) released by Meta in July 2023. It was pretrained on 2 trillion tokens and has a context length of 4k tokens.
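To make the request path concrete, here is a minimal sketch of a Lambda handler that sits behind API Gateway and forwards a prompt to the SageMaker endpoint. It is an illustration under stated assumptions, not the code from the post: the endpoint name, the `prompt` field in the request body, and the generation parameters are hypothetical, and the payload shape assumes an endpoint that accepts the `inputs`/`parameters` JSON format used by the Hugging Face Text Generation Inference container.

```python
import json
import os

# Hypothetical endpoint name; the post does not state one.
ENDPOINT_NAME = os.environ.get("SAGEMAKER_ENDPOINT_NAME", "llama-2-chat-endpoint")


def build_payload(prompt, max_new_tokens=256, temperature=0.6):
    """Build a request body in the inputs/parameters shape (assumed TGI format)."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def lambda_handler(event, context, client=None):
    """Handle an API Gateway proxy event and call the SageMaker endpoint."""
    if client is None:
        # Imported lazily so the module can be loaded without AWS dependencies.
        import boto3

        client = boto3.client("sagemaker-runtime")

    # API Gateway proxy integration delivers the request body as a JSON string.
    body = json.loads(event.get("body") or "{}")
    prompt = body.get("prompt", "")
    if not prompt:
        return {"statusCode": 400, "body": json.dumps({"error": "prompt is required"})}

    response = client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(build_payload(prompt)),
    )
    # The response body is a stream; read and decode it before returning.
    result = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(result)}
```

The optional `client` argument exists only so the handler can be exercised locally with a stand-in object; in Lambda it is left unset and the real `sagemaker-runtime` client is created. An endpoint deployed through SageMaker JumpStart may expect a different payload and additional request attributes, so the request shape would need to be adapted for that path.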

The post outlines how to deploy an open source LLM on SageMaker and how to use Chainlit, an open source Python package, to build a ChatGPT-like user interface for LLM applications. The Llama 2 model is deployed on SageMaker in two ways. The first uses the SageMaker Studio console to deploy the model via SageMaker JumpStart. The second uses AWS CDK to deploy the Llama 2 model from Hugging Face on SageMaker using the Hugging Face Text Generation Inference (TGI) container.
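For the second approach, the TGI container is configured largely through environment variables. The sketch below builds such a configuration as a plain dictionary. The helper name, the default token limits, and the example model ID are assumptions chosen for illustration; the post itself does not list the values it uses. In a CDK stack, a dictionary like this would typically be passed as the container environment of the SageMaker model definition.

```python
def tgi_container_environment(
    model_id,
    num_gpus,
    hf_token=None,
    max_input_length=3072,
    max_total_tokens=4096,
):
    """Return environment variables for a TGI container serving a Hugging Face model.

    All defaults are illustrative assumptions. max_total_tokens is set to 4096
    to stay within the 4k-token context length of Llama 2.
    """
    if max_input_length >= max_total_tokens:
        # The prompt must leave room for at least one generated token.
        raise ValueError("max_input_length must be smaller than max_total_tokens")

    env = {
        "HF_MODEL_ID": model_id,  # model repository on the Hugging Face Hub
        "SM_NUM_GPUS": str(num_gpus),  # number of GPUs to shard the model across
        "MAX_INPUT_LENGTH": str(max_input_length),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
    }
    if hf_token:
        # Llama 2 is a gated model on the Hub, so an access token is needed.
        env["HUGGING_FACE_HUB_TOKEN"] = hf_token
    return env


if __name__ == "__main__":
    # Example model ID; substitute the variant you have been granted access to.
    print(tgi_container_environment("meta-llama/Llama-2-7b-chat-hf", num_gpus=1))
```

Values are stringified because container environment variables are strings. In practice the access token is better read from a secrets store than hard-coded in the stack.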


