
Cloud engineering with PwC and AWS Ambassadors
Our Cloud Engineering team, led by PwC’s AWS Ambassadors, is dedicated to guiding businesses through their cloud transformation journey.
Jayant Raj
Director, Cloud & Digital, AWS Ambassador, PwC US
This post demonstrates building a generative AI (GenAI) chatbot using a private instance of the open source Llama 2 model deployed on Amazon SageMaker with the AWS Cloud Development Kit (CDK) and fronted by AWS Lambda and Amazon API Gateway. Llama 2 is a family of pretrained and fine-tuned large language models (LLMs) released by Meta in July 2023. Llama 2 was pretrained on 2 trillion tokens and has a 4K context length.
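The architecture implies a thin Lambda integration between API Gateway and the private SageMaker endpoint. Below is a minimal sketch of such a handler using boto3; the endpoint name, environment variable, and generation parameters are illustrative assumptions, not values prescribed by the walkthrough.

```python
import json
import os

import boto3

# Hypothetical endpoint name; in practice this is the endpoint created when
# Llama 2 is deployed to SageMaker and passed to Lambda as an environment variable.
ENDPOINT_NAME = os.environ.get("SAGEMAKER_ENDPOINT_NAME", "llama-2-7b-chat-endpoint")

sagemaker_runtime = boto3.client("sagemaker-runtime")


def handler(event, context):
    """Lambda handler behind API Gateway: forwards the prompt to the
    private Llama 2 endpoint and returns the generated text."""
    body = json.loads(event.get("body") or "{}")
    prompt = body.get("prompt", "")

    payload = {
        "inputs": prompt,
        # Generation parameters are examples only; tune them for your use case.
        "parameters": {"max_new_tokens": 256, "temperature": 0.6, "top_p": 0.9},
    }

    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    result = json.loads(response["Body"].read())

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }
```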
This post outlines how to deploy an open source LLM on SageMaker and how to use Chainlit, an open source Python package, to build a ChatGPT-like user interface for LLM applications. The Llama 2 model is deployed on SageMaker using two different approaches. The first approach deploys the model through SageMaker JumpStart from the SageMaker Studio console. The second approach uses AWS CDK to deploy the Llama 2 model from Hugging Face on SageMaker using the Hugging Face Text Generation Inference (TGI) container, as sketched below.
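For the second approach, a CDK stack in Python can define the model, endpoint configuration, and endpoint with the L1 SageMaker constructs. This is a minimal sketch assuming the Hugging Face TGI container; the image URI, instance type, model ID, and TGI environment variables are placeholder assumptions to adjust for your Region and the Llama 2 variant you deploy (the gated meta-llama weights also require a Hugging Face access token).

```python
from aws_cdk import Stack
from aws_cdk import aws_iam as iam
from aws_cdk import aws_sagemaker as sagemaker
from constructs import Construct

# Illustrative TGI inference image; look up the current Hugging Face TGI image
# URI for your Region and framework version before deploying.
TGI_IMAGE_URI = (
    "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
    "huggingface-pytorch-tgi-inference:2.0.1-tgi0.9.3-gpu-py39-cu118-ubuntu20.04"
)


class Llama2EndpointStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Execution role SageMaker assumes to pull the container and serve the model.
        role = iam.Role(
            self,
            "ExecutionRole",
            assumed_by=iam.ServicePrincipal("sagemaker.amazonaws.com"),
            managed_policies=[
                iam.ManagedPolicy.from_aws_managed_policy_name("AmazonSageMakerFullAccess")
            ],
        )

        # Model definition: the TGI container pulls the weights identified by
        # HF_MODEL_ID from the Hugging Face Hub at startup.
        model = sagemaker.CfnModel(
            self,
            "Llama2Model",
            execution_role_arn=role.role_arn,
            primary_container=sagemaker.CfnModel.ContainerDefinitionProperty(
                image=TGI_IMAGE_URI,
                environment={
                    "HF_MODEL_ID": "meta-llama/Llama-2-7b-chat-hf",
                    "SM_NUM_GPUS": "1",
                    "MAX_INPUT_LENGTH": "2048",
                    "MAX_TOTAL_TOKENS": "4096",
                },
            ),
        )

        endpoint_config = sagemaker.CfnEndpointConfig(
            self,
            "Llama2EndpointConfig",
            production_variants=[
                sagemaker.CfnEndpointConfig.ProductionVariantProperty(
                    model_name=model.attr_model_name,
                    variant_name="AllTraffic",
                    initial_instance_count=1,
                    instance_type="ml.g5.2xlarge",
                    initial_variant_weight=1.0,
                )
            ],
        )

        sagemaker.CfnEndpoint(
            self,
            "Llama2Endpoint",
            endpoint_config_name=endpoint_config.attr_endpoint_config_name,
        )
```

Running `cdk deploy` on a stack like this provisions the model, endpoint configuration, and endpoint; the resulting endpoint name is what the Lambda function shown earlier would invoke.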