Guidewire migration and data considerations in AWS

Insurers are evaluating options to migrate from Guidewire’s on-premises solution, or from another on-premises solution, to Guidewire’s hosted solution that runs on Amazon Web Services (AWS). This change represents a large shift for the organization as a whole, and especially for the Information Technology (IT) groups. Moving a system as central as Guidewire affects the many other systems that interact with it.

In this article, we will focus on the impact on an organization’s data backbone when Guidewire is moved into the hosted solution. Many things need to be considered as part of this move, and we would like to share leading practices for addressing the data needs of the business. The illustrative diagram below shows a potential pattern for the new landscape after the customer migration to Guidewire’s SaaS solution is complete.

Guidewire data ecosystem

The Guidewire system runs in its own AWS account that is owned and maintained by the Guidewire team (denoted by the Guidewire icon in the Guidewire data ecosystem graphic). This puts substantial distance and latency between the Guidewire system and any applications or data systems that remain on-premises. The leading practice is to plan on moving those applications and the data backbone into an AWS account that sits “adjacent” to the Guidewire environment. Leveraging AWS native services such as Amazon S3, AWS Lake Formation, AWS Glue, and Amazon Redshift, organizations can establish a scalable data architecture in AWS.

Let's explore what this new environment could look like, map where the data from the on-premises systems would live, and identify where AWS cloud-native approaches can be applied.

The biggest difference in building a Data Lake architecture in AWS compared to the typical on-premises data warehouse is that we are no longer bound to schema on write. In a traditional configuration, data loads have to be translated to match the predefined schema of the data warehouse. That approach limits what we can do with the data and is time-consuming for the developers who translate schemas between systems. In an AWS Data Lake architecture with Amazon S3 at the center, we can store all of our data from various systems at very low cost, in the format of the source system, and declare the schema only when the data is read (schema on read). This gives us the flexibility to combine data from siloed systems more easily and to explore data for patterns or insights without first performing data conversions.
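As a minimal sketch of what schema on read can look like, the Python snippet below declares an Amazon Athena table over raw JSON files that already sit in S3; the files themselves are never rewritten to fit a warehouse schema. The bucket, database, and column names are illustrative assumptions, not part of any actual Guidewire schema.

# Illustrative sketch of schema on read with Amazon Athena (names are placeholders).
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Declare a schema over raw JSON files already sitting in the data lake;
# the underlying files are left in their source format.
ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS raw_zone.claims (
    claim_id       string,
    policy_id      string,
    loss_date      string,
    reserve_amount double
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://example-insurer-raw-zone/claims/'
"""

athena.start_query_execution(
    QueryString=ddl,
    QueryExecutionContext={"Database": "raw_zone"},
    ResultConfiguration={"OutputLocation": "s3://example-insurer-athena-results/"},
)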

The first step in the new environment is to put ingestion processes in place to load data from the source systems into S3 buckets within the unified landing zone on AWS, often called the “raw zone”. The sources can be the various databases and document stores running in the customer’s larger ecosystem, with Guidewire potentially being one of them. Because of the economics of data storage in S3, we can bring more data into our cloud ecosystem; this is another advantage of a cloud-based Data Lake. In an on-premises environment, what we can store in the data warehouse is limited by its schema and the disk space that is available. A simple illustration of landing data in the raw zone follows below.
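The sketch below shows one way a nightly source extract could land in the raw zone with a date-based prefix. The bucket, prefix, and file names are assumptions for illustration; in practice, services such as AWS DMS, AWS Glue, or Amazon AppFlow would typically handle ingestion rather than a hand-rolled script.

# Minimal sketch of landing a nightly source extract in the S3 raw zone.
# Bucket, prefix, and file names are placeholders for illustration only.
from datetime import date
import boto3

s3 = boto3.client("s3")

extract_file = "policy_extract.parquet"          # produced by the source system
raw_bucket = "example-insurer-raw-zone"
raw_key = f"policy_admin/ingest_date={date.today():%Y-%m-%d}/{extract_file}"

# Keep the source format as-is; no schema translation happens on write.
s3.upload_file(extract_file, raw_bucket, raw_key)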

The second step is to build a data catalog, or data dictionary, of the data that is in Amazon S3; AWS Glue is designed for this purpose. The data catalog is then usable by other AWS services, such as Amazon Athena for ad hoc exploration of the data, and provides a mechanism for data governance.
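One common way to populate the catalog is with a Glue crawler that scans the raw zone and infers table definitions. The sketch below shows the idea; the crawler name, IAM role ARN, database, and S3 path are placeholders.

# Sketch: register raw-zone data in the AWS Glue Data Catalog with a crawler.
# The crawler name, IAM role ARN, database, and S3 path are placeholders.
import boto3

glue = boto3.client("glue")

glue.create_crawler(
    Name="raw-zone-crawler",
    Role="arn:aws:iam::123456789012:role/example-glue-crawler-role",
    DatabaseName="raw_zone",
    Targets={"S3Targets": [{"Path": "s3://example-insurer-raw-zone/"}]},
)

# Each run scans the bucket, infers schemas, and creates or updates catalog
# tables that Athena, Glue ETL jobs, and Lake Formation permissions can reference.
glue.start_crawler(Name="raw-zone-crawler")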

The third step is to determine how the data should be transformed and where it needs to be moved for consumption by other business processes. In a traditional configuration, the data warehouse might be the single system where all reporting, analytics, and transactional work occurs. Using AWS’s extensive set of data services, we can identify the right destination for the data based on its usage patterns; Amazon RDS and Amazon Redshift are strong candidates for this use case. Unlike traditional data warehouse technology, Redshift separates data storage from compute: with RA3 nodes, Redshift managed storage is backed by Amazon S3, and compute can be scaled up and down as needed rather than provisioned for peak usage.
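As an illustration of this step, the sketch below loads a curated Parquet data set from S3 into Redshift with a COPY command issued through the Redshift Data API. The cluster identifier, database, user, IAM role, and table names are assumptions for the example.

# Sketch: load curated Parquet files from S3 into Amazon Redshift via the Data API.
# Cluster identifier, database, user, IAM role, and table names are placeholders.
import boto3

redshift_data = boto3.client("redshift-data")

copy_sql = """
COPY analytics.policy_facts
FROM 's3://example-insurer-curated-zone/policy_facts/'
IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-copy-role'
FORMAT AS PARQUET;
"""

redshift_data.execute_statement(
    ClusterIdentifier="example-analytics-cluster",
    Database="insurance_dw",
    DbUser="etl_user",
    Sql=copy_sql,
)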

The final consideration is reporting and AI/ML. This layer doesn’t necessarily need a technology change, depending on the tools currently in use, but a customer should consider whether there is a better, more cloud-native solution for reporting. Amazon QuickSight is AWS’s reporting tool; it integrates with multiple types of sources, has AI and ML capabilities built in, and has a pricing structure that distinguishes authors from readers. One of the benefits of getting the data into the cloud is the ability to use AWS’s AI and ML services to process the data, providing new insights and new ways to address business issues. The cornerstone service for this work is Amazon SageMaker Studio, where data can be explored and analyzed and models can be deployed for continuous processing to provide deeper insights or new services for the business.
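As a small sketch of that exploration workflow, the snippet below pulls curated data from the Glue Data Catalog into a pandas DataFrame inside a SageMaker Studio notebook using the AWS SDK for pandas (awswrangler). The availability of awswrangler and the curated_zone database and claims table are assumptions for illustration.

# Sketch: explore curated data in SageMaker Studio before model development.
# Assumes the AWS SDK for pandas (awswrangler) is installed and that a
# curated_zone database with a claims table exists in the Glue Data Catalog.
import awswrangler as wr

df = wr.athena.read_sql_query(
    sql="SELECT policy_id, loss_date, reserve_amount FROM claims LIMIT 1000",
    database="curated_zone",
)

# From here the data can be profiled, visualized, and fed into SageMaker
# training jobs or deployed models for ongoing scoring.
print(df.describe())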

As part of moving to the Guidewire hosted solution, customers should consider moving their data ecosystem into AWS as well. In doing so, they should consider adopting an AWS native solution based on a Data Lake architecture. This allows the organization to expand what it can do with data rather than being hampered by legacy on-premises systems.

Contact us

Scott Weber

Managing Director, Cloud & Digital, AWS Ambassador, PwC US

Justin Guse

Director, Cloud & Digital, AWS Ambassador, PwC US
