This interactive hub is your guide to the campus case study. Please select the track assigned to you to explore a representative problem, data files, deliverables, and a complete solution framework.
1. Analytics Track
Play the role of an **Analytics Consultant**. This track involves deep, hands-on data analysis, ETL, and visualization. You'll wrangle messy, real-world data to solve complex business problems like calculating rebates, assessing profitability, or segmenting customers.
▶Skills: SQL/Python, Advanced Excel, ETL Logic
▶Focus: Data Wrangling, Business Rule Application, Insight Generation
2. Technology Track
Play the role of a **Technology Consultant**. This track is strategic and architectural. Your client needs you to design a complete technology platform from the ground up, such as a commercial data and analytics platform for a new product launch.
▶Skills: Data Architecture, Cloud Tech (AWS/Azure), Data Modeling
▶Focus: Strategy, System Design, Tool Justification
Please select a track above
A representative case study will be loaded here.
Analytics Track: Business Analysis Case
You are an Analytics Consultant for **"The Client,"** a large pharmaceutical company. They have just completed Q1'25 and need you to analyze their complex sales data. Your goal is to navigate their messy data, correctly apply intricate business rules, and deliver a series of key reports and insights.
The Challenge: "Product X"
The Client's product 'Product X' is used for treating blood cancers. They need to calculate territory-level sales and rebates for each Account (Hospital) for Q1'25.
Territory sales are complex, weighted by Group Purchasing Organization (GPO) and Account Type.
Rebates are calculated at a System level based on GPO criteria and special memberships (like ICOP).
Your data is spread across more than 10 different files, with known (and unknown) discrepancies.
**Your Goal:** Create a single, reliable source of truth by cleaning, joining, and processing this data to answer 8 specific business questions.
Explore Your Data Files
Click on a file name to see its description and a sample of its structure. This is your primary challenge: linking these files together.
Master Data.csv
Child Id to System Id mapping.csv
Child ID Address mapping.csv
Zip to Terr Alignment.csv
ICOP Member.csv
Sales crediting rule.csv
Rebate Criteria.csv
Key Business Logic
You must correctly apply these rules. Errors here will cascade through all your results.
1. Sales Crediting Rule
Weighted Sales must be used for all questions *except* rebate calculation and profit calculation. The multiplier depends on GPO and Account Type.
GPO
Account Type
Multiplier
APXRV01PHS
340-B Clinic
0.2
NO CONTRACT
Hospital
1.5
NO CONTRACT
Clinic
1.0
IONRV01
Clinic
1.0
...and 20+ more combinations...
2. Rebate Criteria
Rebates are calculated using **raw sales (Sold Qty)** at the System ID level. Tiers vary by GPO. ICOP members have their own special contract.
Contract
Tier
Volume Range
Rebate/Unit
Unity (UNIRV01)
Base Rebate
1+
$800
Ally, ION, Onmark, etc.
1
1 - 24
$600
2
25 - 49
$750
3
50+
$800
ICOP Special Contract
1
1 - 24
$600
2
25 - 49
$750
3
50+
$800
Your 8 Solution Deliverables
Your final Excel file must contain these 8 "Solution" tabs, correctly populated. Your presentation will summarize the insights from them.
List of data discrepancies and resolutions.
Quarterly sales by territory (with graph).
Top 10 system accounts by recent 2-quarter volume.
ICOP member sales and total % share.
System account level rebate calculations for Q1'25.
Top 5 accounts (and Top 3 ICOP) by rebate.
Top GPOs by rebate (with graph).
Total Q1'25 profit calculation and key insight.
Example Deliverable: Solution 2(A)
Your output for Solution 2(A) will be a graph similar to this mock-up of total sales.
Solution Framework & Methodology
This framework outlines the solution for each of the 8 deliverables. Use the navigation to see the mock-up and methodology for each solution.
Solution 1: Data Discrepancies
Methodology: Load `Master Data.csv`. Filter `Sold Qty` for non-numeric values. Check for blank `GPO`s. Cross-reference `Child ID`s in Master Data against mapping files to find orphans.
Discrepancy
Resolution
Non-numeric `Sold Qty` (e.g., 'ABC')
Filtered out and excluded from all calculations.
Blank `GPO`
Treated as 'NO CONTRACT' per business rule.
Missing `Child ID` in address map
Sales for this ID cannot be mapped to a territory. Excluded from territory roll-ups.
Solution 2: Quarterly Sales by Territory
Methodology: 1) Join `Master Data` with `Child ID Address` (on `Child ID`) to get `Zip`. 2) Join with `Zip to Terr Alignment` (on `Zip`) to get `Full Territory`. 3) Join with `Sales Crediting Rule` (on `GPO` & `Account Type`) to get `Multiplier`. 4) Calculate `Weighted Sales` = `Sold Qty` * `Multiplier`. 5) Create a Pivot Table of `Weighted Sales` by `Full Territory` and `Quarter`.
Solution 3: Top 10 Accounts by Recent 2-Quarter Volume
Methodology: Use the enriched Master Data. 1) Join with `Child Id to System Id` (on `Child ID`) to get `System Name`. 2) Filter for Q1'25 and Q4'24. 3) Pivot `Weighted Sales` by `System Name` and `Quarter`. 4) Create a `Total` column. 5) Sort by `Total` descending and take top 10.
System Name
Q4'24 Sales
Q1'25 Sales
Total
ONEONCOLOGY, LLC
850
920
1,770
FLORIDA CANCER SPECIALISTS
710
750
1,460
NEW YORK ONCOLOGY
650
680
1,330
...and 7 more...
Solution 4: ICOP Sales Share
Methodology: 1) Join `Master Data` with `ICOP Member` (on `Child ID`) to create an `ICOP Member (Y/N)` flag. 2) Calculate `SUM(Weighted Sales)` for Q1'25 where `ICOP Member` = 'Y'. 3) Calculate `SUM(Weighted Sales)` for all Q1'25. 4) Share = (ICOP Sales / Total Sales).
Metric
Value
Total ICOP Member Sales (Q1'25)
2,150 Units
Total All Sales (Q1'25)
8,600 Units
% ICOP Share
25.0%
Solution 5: System-Level Rebate Calculation
Methodology: **Use RAW Sales (Sold Qty)**. 1) Aggregate `Sold Qty` by `System ID`, `GPO`, and `ICOP Flag`. 2) Apply logic: If `ICOP Flag` = 'Y', use ICOP tiers. 3) Else, use `GPO` tiers from `Rebate Criteria` file. 4) Calculate `Rebate` = `Sold Qty` * `Rebate per unit`.
System ID
GPO
ICOP
Q1'25 Qty
Tier
Rebate/Unit
Total Rebate
300041984
IONRV01
N
45
Tier 2
$750
$33,750
300017707
IONRV01
Y
30
ICOP T2
$750
$22,500
300003333
UNIRV01
N
20
Base
$800
$16,000
...and 100+ more...
Solution 6: Top Accounts by Rebate
Methodology: Use the table from Solution 5. (A) Sort by `Total Rebate` descending, take top 5. (B) Filter for `ICOP` = 'Y', sort by `Total Rebate` descending, take top 3.
(A) Top 5 All Accounts
Rank
System Name
Rebate
1
FLORIDA CANCER SPECIALISTS
$55,000
...
...
...
(B) Top 3 ICOP Accounts
Rank
System Name
Rebate
1
ICOP System A
$22,500
...
...
...
Solution 7: Top GPOs by Rebate
Methodology: Use the table from Solution 5. Create a Pivot Table to `SUM(Total Rebate)` by `GPO`. Calculate % Share.
The ICOP members, while driving 25% of sales, account for over 35% of total rebates. This suggests their contract is less profitable per unit than standard GPO contracts, warranting a review before renewal.
Technology Track: System Design Case
You are a Technology Consultant for **"The Client,"** an emerging company preparing to launch its first product. **They have no existing IT systems.** Your objective is to design their entire commercial data and analytics foundation from scratch to support their go-to-market strategy.
Your Key Tasks & Deliverables
Your presentation must provide a comprehensive plan that addresses:
A high-level **Go-to-Market (GTM) strategy** for the product launch.
A list of all necessary **pharmaceutical data sources**.
A complete **Data Architecture Design** (see example).
**Recommended tools** (e.g., Cloud, ETL, Warehouse, BI) with justification.
A proposed **Data Model** (e.g., Star Schema) and its layers.
A strategy for **data quality assurance** and restatements.
An assessment of potential **risks and mitigation** strategies.
Conceptual Architecture Example
Your core deliverable is a diagram like this, built with your recommended tools. You must be able to explain and justify every component and data flow.
CRM Data (e.g., Veeva)
Sales Data (Wholesaler)
3rd Party Data (e.g., IQVIA)
Marketing Data
→
Ingestion (e.g., AWS S3, Fivetran)
→
Data Warehouse (e.g., Snowflake)
→
BI Tool (e.g., Tableau)
Proposed Solution Framework
The following is a high-level framework for a robust, scalable commercial data platform for "The Client."
1. Go-to-Market (GTM) Strategy
**Market Sizing:** Identify total addressable market (patients, physicians).
**Segmentation:** Group physicians by specialty, patient volume, and location.
**Targeting:** Prioritize high-value segments for sales team focus.
**Salesforce Deployment:** Align sales territories to targeted segments.
**Patient Data (Claims):** Anonymized data from sources like IQVIA or Symphony to track prescriptions.
**Sales Data:** Wholesaler data (e.g., McKesson, AmerisourceBergen) to track product flow to pharmacies.
**CRM Data:** (e.g., Veeva, Salesforce) To track sales rep interactions with physicians.
**Physician Data (Master):** A master list of all target physicians and their attributes.
**Marketing Data:** Data from email campaigns, website visits, and ad spend.
3. Proposed Data Architecture & Tools
Veeva (CRM)
IQVIA (Claims)
McKesson (Sales)
→
Ingestion (AWS S3 + Fivetran)
→
Warehouse (Snowflake)
→
BI Tool (Tableau)
Component
Recommended Tool
Justification
Cloud
AWS
Mature, comprehensive services (S3, Redshift, etc.), strong industry adoption.
Ingestion
Fivetran / S3
Fivetran for pre-built connectors (e.g., Salesforce), S3 as a scalable data lake for raw files.
Warehouse
Snowflake
Excellent scalability, separates storage and compute, strong performance for analytics.
BI Tool
Tableau
Highly intuitive, powerful visualizations, strong mobile support for sales reps.
4. Recommended Data Model
A **Star Schema** is recommended for the analytics layer.
- **Fact Tables:** `Prescriptions` (daily/weekly), `Sales_Rep_Activity`.
- **Dimension Tables:** `Physician_Master`, `Patient_Master` (de-identified), `Territory_Dimension`, `Date_Dimension`, `Product_Dimension`.
This model is simple, fast for querying, and well-understood by BI tools like Tableau.