Red Pill Analytics Builds a Modern Data Platform for a National Health Registry

A leading national health registry company and nonprofit organization (Healthcare Company) operates a registry of volunteer health donors in the United States. In addition to providing access to the registry, they provide support resources to donors, financial assistance to patients, and funding for research opportunities. In 2018, Healthcare Company matched thousands of donors and provided millions in financial assistance. With the phenomenal success of their life-saving work came challenges.

A BI & Analytics Bottleneck

While Healthcare Company enjoyed many successes with their BI platform in the past, the gap between ongoing enhancement requests and the capabilities of the existing system became increasingly wider. Further, due to the inflexible nature of the existing toolset, the BI team could not quickly respond to customer requests for bug fixes or enhancements which became continuously more frustrating for both the BI team and their customers. These deficits typically resulted in two outcomes: waterfall-style projects or the business creating their own solutions; neither of which was entirely desirable. Healthcare Company faced a decision point: upgrade the existing tools or reimagine the technology stack.

With the support of senior leadership, Healthcare Company decided to reimagine the analytics toolset and hired Red Pill Analytics to help create a modern data platform. The ensuing journey included: evaluating and selecting new technologies, executing proofs-of-concept, procurement, implementation, development, and various degrees of change management.

Evaluating a new BI Stack

During the initial phase of the project, a Red Pill Analytics Solution Architect assisted Healthcare Company in an objective evaluation of all components of a modern BI & Analytics stack. This included data integration/transformation tools and a data warehouse. The front-end reporting tool evaluation was pushed out to allow representatives from the user community to have more time to review in order to provide valuable input. An exhaustive fact-finding effort resulted in several documents which, combined with discussions between the BI team and various stakeholders, were used to rate and compare tools. While there were several categories and sub-categories, at a high level, the tools were selected based on these driving factors:

  • The data replication tool should have native connectivity to all of the source database platforms and the target warehouse and should replicate the data in real time, using change-log reading mechanisms whenever possible to ensure that data was consumed without interfering with core database workloads.
  • The data transformation tool needed to have native connectivity to the target data warehouse and should use “push-down” data processing. It is also preferred to have robust track-back and documentation functionality.
  • The data warehouse should require low up-front costs and be able to scale up as the environment grows. The data warehouse should also be SQL-based and be able to accommodate relational as well as semi-structured data.
  • All tools must be within budget.

Data Integration & Transformation

After thoughtful consideration, Attunity Replicate and Fivetran were selected as the replication tools, WhereScape RED as the transformation tool, and Snowflake as the data warehouse.

  • Attunity Replicate performs an initial full load and then uses database log mining (whenever possible) to process CDC and has connectors for the required source databases.
  • Fivetran was included to replicate Salesforce and positioned for other applications not supported by Attunity (e.g. JIRA, Google Analytics).
  • Snowflake uses a consumption-based pricing model so there was not a large up-front investment needed to start the project. The environment can grow over time as more data is introduced and user adoption goes up.
  • WhereScape RED has a Snowflake connector, uses push-down (ELT) processing that leverages the Snowflake investment, and provides auto-generation of documentation.

After the tools were selected, a proof-of-concept was conducted by building out a pilot use case that was representative of the larger vision. With the POC successful, Red Pill assisted Healthcare Company in the next steps of procurement and implementation of the new tools. Implementation activities included architecture design, set-up/configuration, and development for each component of the platform. Throughout the entire process, Red Pill and the BI team worked closely with an Information Security Architect to ensure that the environment adhered to all of Healthcare Company security policies.

 

It is worth noting that the overall guiding principle was to replicate the entirety of Healthcare Company’s databases that contain business data in Snowflake, as opposed to replicating data based on user requirements. This strategy resulted in the BI team being better equipped for conversations with business users in the future as any requested data would already be available in the new platform. It was also mutually beneficial as queries run by analysts and data integration processes could be shifted over to Snowflake and given dedicated compute resources that could be appropriately sized and those queries were no longer running on source system databases, consuming potentially valuable resources for the core application.

Data Viz

With the effort to replicate source data into Snowflake well underway, the remaining task was to select and implement a front-end reporting tool. On par with the evaluation for the rest of the tools in the stack, Red Pill worked with Healthcare Company to gather information for various reporting tools and compiled a list of potential candidates. A POC was carried out for each of the three finalists and after several iterations involving demos, hands-on development, and discussion, Looker was selected to round out the BI & Analytics platform. Ultimately, Looker was chosen for its ability to deliver consistent results across the enterprise, drill-down functionality, and relative ease of use.

Recapping the Tools

  • Attunity Replicate runs on-premises, reads the database changelogs for CDC, and pushes data to Snowflake. It also includes data transformation functionality using SQLite syntax.
  • Fivetran runs in the cloud and has connectors for replicating both on-premises and cloud-based databases and applications. Fivetran includes a light data transformation layer as well.
  • WhereScape RED provides a graphical user interface for development and push-down (ELT) processing which leveraged Healthcare Company’s investment in Snowflake.
  • Snowflake is a cloud-native data warehouse that separates compute from storage, allowing the environment to grow over time as more data is introduced and user adoption increases.

The Final Results

Attunity, Fivetran, WhereScape, Snowflake, and Looker collectively created a feature-rich, flexible, and robust Business Intelligence & Analytics platform to effectively and efficiently serve Healthcare Company and their customers. 

Since Healthcare Company was new to Looker, Red Pill and the BI team took an iterative approach to dashboard design and development. Following a rapid-delivery/rapid-feedback cadence, each iteration progressed toward more refined reports and dashboards as the BI team learned requirements from the business and the business grew more familiar with Looker.

To align with Healthcare Company’s vision to shift towards a more data-driven culture, Red Pill Analytics and the BI team leveraged dashboard development as a means to train power users on Looker in an effort to enable discovery and self-service analytics culture throughout the enterprise.

  • A single data source for business data, replicated in real-time
  • Federated Looker data models
  • Vertical-specific Looker Explores
  • Pre-built Looker dashboards for consumption by less-than-power users. An org-wide KPI Looker dashboard displayed on every user’s laptop each morning

While Red Pill Analytics met Healthcare Company’s requirements for the project, it should be noted that the implementation is ongoing.

One More Thing

Adjusting to a modern data platform can be challenging. Fortunately, Healthcare Company‘s data science team was already using tools like R, SAS, and various third-party SQL query clients. RPA collaborated with the team to connect these tools to Snowflake, which allowed them to continue using their tools of choice in the new platform. Red Pill Analytics also assisted with converting queries and processes that had been running on the source databases to ensure continuity. In the end, the successful collaboration between both teams ensured a smooth transition to a modern data platform.

THE GOALS

  • Build a modern data platform 
  • Avoid a BI & Analytics bottleneck
  • Provide a single source of data truth
  • Promote data governance
  • Allow users to focus on analysis, rather than procurement and curation of data

THE APPROACH

  • Replicate data from various transactional source systems into Snowflake using Attunity and Fivetran
  • Focus on replicating all data right away as opposed to user-driven data requirements
  • Use WhereScape to transform and load data into a data marketplace and data warehouse
  • Use an iterative approach to define user requirements for reports and dashboards
  • Train power users for self-sufficiency

THE RESULTS

  • A single source of truth from a modern data platfor
  • A Data Lake with replicated raw business data
  • A Data Marketplace with lightly-curated data sets for easier consumption
  • A Data Warehouse with dimensional data models, federated across business verticals
  • Looker Explores along with “starter” reports and dashboards delivered by the BI team
  • A dashboard with org-wide KPIs, automatically displayed on each user’s laptop every day