Developing a Data Warehouse Solution with Scalable Data Ingestion Process 

Industry: iGaming
Client since: 2021
Technological Strategies Implemented
Centralized Data Repository
Insights and Optimization Tools
Services used
Data & Analytics Managed Infrastructure

Background 

The client is a renowned developer of fun table and sportsbook pool-based casino games. To support the growth and success of its business, the company recognized the need for a robust data warehouse solution that could handle the vast amounts of data generated by its gaming operations. A data warehouse is a specialized database system designed for efficient querying and analysis of large volumes of data, typically in support of business intelligence and data analytics activities.

To design and build a data warehouse solution that met their needs, the company turned to Symphony Solutions, a leading provider of data engineering and data analytics services.

Client 

The client is a leading developer of casino gaming products with a portfolio of pool-based games, each with a unique twist, powered by a custom engine. These games, including table and sportsbook options, are sophisticated and engaging, offering dynamic odds, social elements, and immersive entertainment. The company works with top-notch operators to offer these games on its own platform as well as on third-party platforms.

Challenge 

One of the main challenges faced by the Symphony Solutions team was that the data warehouse solution was being developed for a product that was not yet in production and was still under active development. This meant the team had to work closely with the client to understand the future needs of the product and design a solution that could support them. In addition, the Symphony Solutions team had to consult with the client on how the system should be used and how best to leverage it, as the client lacked expertise in this area.

Another challenge faced by the Symphony Solutions team was the need to coordinate with the various vendors contributing data to the data warehouse. This required careful coordination and communication to ensure that the data was ingested and transformed in a consistent and reliable manner. The team also had to adjust the architecture as the environment and business needs changed, and to support the evolving data sources, which were being developed in parallel with the data warehouse.

In addition to the above challenges, the Symphony Solutions team had to design a scalable batch data ingestion process with the possibility of introducing near real-time data in the future. This required the team to carefully consider the performance and scalability requirements of the data warehouse and design a solution that would be able to handle the expected volume and velocity of data. The team also had to create a flexible data warehouse baseline that would allow the business to build insightful analytics on full-system data and deliver it to multiple third-party systems using a single integration point.

Solution 

To address the challenges of the data warehouse development, the team implemented several technologies and approaches. First, the data warehouse had a direct connection to all data sources. This approach required more effort to develop the data ingestion layer, but it allowed the team to deliver data to the data warehouse as quickly as possible without additional development overhead for the teams owning other system components.

To further improve the data ingestion process, all data sources were required to push data to Amazon EventBridge, a fully managed event bus service that makes it easy to route data between applications and services. Some data sources could do this automatically using MongoDB change streams or DynamoDB Streams, while others needed dedicated EventBridge push mechanisms. This allowed the team to unify the data ingestion process and use Kinesis Firehose, a fully managed service that ingests streaming data and loads it into data stores and analytics tools, for scalable data stream consumption.
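For sources without a native change stream, a dedicated push mechanism boils down to building EventBridge `PutEvents` entries. The sketch below shows one way this could look in Python with boto3; the bus name, source, and detail-type values are illustrative assumptions, not the project's actual naming conventions.

```python
import json
from datetime import datetime, timezone


def build_event_entry(source: str, detail_type: str, detail: dict,
                      bus_name: str = "game-data-bus") -> dict:
    """Build one EventBridge PutEvents entry.

    The bus name ("game-data-bus"), source, and detail-type values are
    hypothetical examples; EventBridge only requires that Detail is a
    JSON string and that Source/DetailType identify the event.
    """
    return {
        "Time": datetime.now(timezone.utc),
        "Source": source,
        "DetailType": detail_type,
        "Detail": json.dumps(detail),   # EventBridge expects a JSON string
        "EventBusName": bus_name,
    }


# Actually sending the entry requires boto3 and AWS credentials:
#
#   import boto3
#   events = boto3.client("events")
#   events.put_events(Entries=[
#       build_event_entry("game.bets", "BetPlaced", {"bet_id": 1, "stake": 5.0}),
#   ])
```

Keeping the entry construction separate from the `put_events` call makes the push mechanism easy to unit-test without touching AWS.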

The use of EventBridge and Kinesis Firehose, combined with Redshift COPY, a bulk-loading command that enables fast data ingestion from various sources into Amazon Redshift, ensured scalability at all stages of the data ingestion process. In the future, the team plans to extend the data ingestion process to serve near real-time data by leveraging more features of Kinesis.

In addition to the data ingestion process, the Symphony Solutions team also focused on designing a smart data model that would add value to the data warehouse. The data model included multiple layers, including historical and keep-latest tables as a core layer and a star schema as a top layer. This design allowed the team to serve different business cases and provide the client with the flexibility to build insightful analytics on full-system data. 
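The layered model described above can be sketched as DDL. The table and column names below (bets, players, etc.) are hypothetical stand-ins for the client's actual schema; the point is the shape of the layers: an append-only historical table and a keep-latest table in the core layer, and a star-schema fact table on top.

```python
# Illustrative DDL for the layered data model, held as Python strings
# as they might appear in an ingestion or migration script.

CORE_HISTORICAL = """
CREATE TABLE core.bet_history (          -- append-only: every change event
    bet_id      BIGINT,
    player_id   BIGINT,
    stake       DECIMAL(12, 2),
    status      VARCHAR(20),
    event_ts    TIMESTAMP                -- when the source emitted the change
);
"""

CORE_KEEP_LATEST = """
CREATE TABLE core.bet_latest (           -- one row per bet: latest state only
    bet_id      BIGINT,
    player_id   BIGINT,
    stake       DECIMAL(12, 2),
    status      VARCHAR(20),
    updated_ts  TIMESTAMP
);
"""

STAR_FACT = """
CREATE TABLE mart.fact_bet (             -- star-schema top layer for BI tools
    bet_id      BIGINT,
    player_key  INT,                     -- joins to mart.dim_player
    date_key    INT,                     -- joins to mart.dim_date
    stake       DECIMAL(12, 2)
);
"""
```

The historical tables serve audit and replay cases, the keep-latest tables serve operational queries, and the star schema serves BI reporting, which is how one model can cover multiple business cases.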

Finally, the Symphony Solutions team chose RudderStack, an open-source event stream processing platform, to ease the integration of the data warehouse with multiple third-party tools such as Braze and Google Analytics. This helped the client deliver data to these systems through a single integration point, streamlining the data flow and reducing the complexity of the system.

To ensure the smooth flow of data into the data warehouse, the Symphony Solutions team implemented a few technical solutions, including EventBridge, Kinesis Firehose, and Airflow DAGs. Multiple data sources, some of which have a CDC (change data capture) nature, now send data to EventBridge, where it is buffered into batches by Kinesis Firehose and stored in Amazon S3. From there, Airflow DAGs run Redshift COPY commands to ingest the data files into the data warehouse.
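The final step of that pipeline, an Airflow DAG issuing Redshift COPY commands over the Firehose output in S3, could be sketched as below. The bucket layout, IAM role, and file format are assumptions for illustration; the actual settings depend on how Firehose was configured to write the batches.

```python
def build_copy_sql(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build a Redshift COPY statement for one batch of Firehose output.

    Assumes Firehose wrote JSON files under the given S3 prefix; the
    table name, prefix, and role ARN here are illustrative only.
    """
    return (
        f"COPY {table} "
        f"FROM 's3://{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS JSON 'auto' "
        "TIMEFORMAT 'auto';"
    )


# In an Airflow DAG this would typically run on a schedule, e.g. (sketch):
#
#   from airflow import DAG
#   from airflow.providers.amazon.aws.operators.redshift_data import (
#       RedshiftDataOperator,
#   )
#
#   with DAG("ingest_bets", schedule="@hourly", catchup=False) as dag:
#       copy_bets = RedshiftDataOperator(
#           task_id="copy_bets",
#           database="dwh",
#           sql=build_copy_sql("core.bet_history",
#                              "my-firehose-bucket/bets/",
#                              "arn:aws:iam::123456789012:role/redshift-copy"),
#       )
```

Building the COPY statement in a plain function keeps the SQL testable independently of Airflow and Redshift.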

One of the key benefits of the data warehouse system implemented by Symphony Solutions is the ability to build an OLAP (online analytical processing) layer that is ready for consumption from QuickSight reports and other systems. This layer is created through a process of extracting, loading, and transforming data from multiple sources, using Redshift as the data warehouse and Airflow as the orchestration tool. 

By using Redshift orchestrated by Airflow, the Symphony Solutions team was able to build an OLAP layer that is scalable, flexible, and easy to use, providing the client with a powerful tool for gaining insights into its business and driving growth. 

Results 

The data warehouse services provided by Symphony Solutions will play a crucial role in the success and growth of the client’s business. With access to a centralized repository of structured data, the company will be able to gain insights into the performance and popularity of its games, understand customer preferences and behavior, and optimize business operations. 

The data warehouse will enable the client’s teams to have ready-to-use BI dashboards to analyze product performance and make informed decisions. The marketing team will have the capability to run conversion-optimized campaigns thanks to the data integrations made by Symphony Solutions, while the CRM team will be able to create engaging communication campaigns using many different channels, target future users based on their segments, run trigger-based campaigns, and personalize messages with relevant content. 

Overall, the data warehouse solution provides the client with the tools and capabilities it needs to drive growth and success in its gaming operations, and to better understand its customers and optimize its business. 
