Add pdc calculation to pharmacy models #770
Open. saywurdson wants to merge 32 commits into tuva-health:main from saywurdson:saywurdson/pdc
Commits (32)
f984b13 add product_to_ingredient value set (saywurdson)
abb431c Create pharmacy__product_to_ingredient.csv (saywurdson)
37f3fab Update pharmacy_models.yml with pdc config (saywurdson)
c87ccc7 Create pharmacy__stg_add_ingredient_concepts.sql (saywurdson)
4b50497 Create pharmacy__int_calculate_sub_exposures.sql (saywurdson)
46b667d Create pharmacy__pdc.sql (saywurdson)
5175540 Refactor drug exposure calculations in SQL (saywurdson)
a03d507 Merge branch 'main' into saywurdson/pdc (saywurdson)
b96620d Refactor SQL for drug exposure calculations (saywurdson)
50721d7 Refactor SQL for drug exposure adjustments and grouping (saywurdson)
eded42a Update models/pharmacy/intermediate/pharmacy__int_calculate_sub_expos… (saywurdson)
66d81e8 Enhance data quality checks and simplify date calculations (saywurdson)
2ba90ab Refactor SQL to use dbt date functions (saywurdson)
04f3b99 Merge branch 'main' into saywurdson/pdc (saywurdson)
dd0d381 Refactor SQL for ingredient adjustment calculations (saywurdson)
9519db8 Refactor SQL for calculating sub-exposures (saywurdson)
6558c76 Refactor SQL for date calculations and PDC logic (saywurdson)
faf106b Enhance claim line processing with therapy grouping (saywurdson)
e6135c0 Enhance claim line processing with ingredient therapy keys (saywurdson)
6a4fe41 Simplify sub-exposure calculation logic in SQL (saywurdson)
2211b76 Refactor end_date calculation and improve comments (saywurdson)
21d0b2a Refactor claim line and ingredient processing logic (saywurdson)
f3e8984 Merge branch 'main' into saywurdson/pdc (saywurdson)
385e6c3 Refactor ingredient_name to use min function (saywurdson)
485247e Update pharmacy__pdc.sql (saywurdson)
fb1979c Refactor pharmacy exposure calculation SQL logic (saywurdson)
72fe137 Merge branch 'main' into saywurdson/pdc (saywurdson)
38cb228 Delete models/pharmacy/staging/pharmacy__stg_add_ingredient_concepts.sql (saywurdson)
e61c7eb Refactor claims exposure calculation logic (saywurdson)
0c5822c Merge branch 'main' into saywurdson/pdc (saywurdson)
4c5511e Merge branch 'main' into saywurdson/pdc (saywurdson)
e616db7 Merge branch 'main' into saywurdson/pdc (aneiderhiser)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| {{ config( | ||
| enabled = var('brand_generic_enabled', var('claims_enabled', var('tuva_marts_enabled', False))) | as_bool | ||
| ) }} | ||
|
|
||
| with sub_exposures as ( | ||
| select * from {{ ref('pharmacy__int_calculate_sub_exposures') }} | ||
| ), | ||
|
|
||
| -- Apply a 30-day persistence window | ||
| -- Sub-exposures within 30 days are combined into the same era | ||
| get_end_dates as ( | ||
| select | ||
| person_id, | ||
| ingredient_rxcui, | ||
| ingredient_name, | ||
| event_date as end_date | ||
| from ( | ||
| select | ||
| person_id, | ||
| ingredient_rxcui, | ||
| ingredient_name, | ||
| event_date, | ||
| event_type, | ||
| max(start_ordinal) over ( | ||
| partition by person_id, ingredient_rxcui | ||
| order by event_date, event_type | ||
| rows unbounded preceding | ||
| ) as start_ordinal, | ||
| row_number() over ( | ||
| partition by person_id, ingredient_rxcui | ||
| order by event_date, event_type | ||
| ) as overall_ord | ||
| from ( | ||
| -- Start events for sub-exposures | ||
| select | ||
| person_id, | ||
| ingredient_rxcui, | ||
| ingredient_name, | ||
| drug_sub_exposure_start_date as event_date, | ||
| -1 as event_type, | ||
| row_number() over ( | ||
| partition by person_id, ingredient_rxcui | ||
| order by drug_sub_exposure_start_date | ||
| ) as start_ordinal | ||
| from sub_exposures | ||
|
|
||
| union all | ||
|
|
||
| -- End events padded by 30 days for the grace period | ||
| -- This +30 days creates the persistence window | ||
| select | ||
| person_id, | ||
| ingredient_rxcui, | ||
| ingredient_name, | ||
| {{ dbt.dateadd('day', 30, 'drug_sub_exposure_end_date') }} as event_date, | ||
| 1 as event_type, | ||
| null as start_ordinal | ||
| from sub_exposures | ||
| ) raw_data | ||
| ) e | ||
| -- Event matching: keep an end event only when starts seen so far equal ends seen so far (2 * start_ordinal = overall_ord), meaning no sub-exposure is still open | ||
| where (2 * e.start_ordinal) - e.overall_ord = 0 | ||
| ), | ||
|
|
||
| -- Determine the final drug era end date by joining sub-exposures with padded end dates | ||
| drug_era_ends as ( | ||
| select | ||
| se.person_id, | ||
| se.ingredient_rxcui, | ||
| min(se.ingredient_name) as ingredient_name, | ||
| se.drug_sub_exposure_start_date, | ||
| min(e.end_date) as drug_era_end_date, | ||
| se.drug_exposure_count, | ||
| se.days_exposed | ||
| from sub_exposures se | ||
| inner join get_end_dates e | ||
| on se.person_id = e.person_id | ||
| and se.ingredient_rxcui = e.ingredient_rxcui | ||
| and e.end_date >= se.drug_sub_exposure_start_date | ||
| group by | ||
| se.person_id, | ||
| se.ingredient_rxcui, | ||
| se.drug_sub_exposure_start_date, | ||
| se.drug_exposure_count, | ||
| se.days_exposed | ||
| ), | ||
|
|
||
| -- Aggregate sub-exposures that share an era end date into one drug era | ||
| final_eras as ( | ||
| select | ||
| person_id, | ||
| ingredient_rxcui, | ||
| min(ingredient_name) as ingredient_name, | ||
| min(drug_sub_exposure_start_date) as drug_era_start_date, | ||
| drug_era_end_date, | ||
| sum(drug_exposure_count) as drug_exposure_count, | ||
| greatest(0, {{ dbt.datediff('min(drug_sub_exposure_start_date)', 'drug_era_end_date', 'day') }} + 1 - sum(days_exposed)) as gap_days, | ||
| sum(days_exposed) as total_days_exposed, | ||
| {{ dbt.datediff('min(drug_sub_exposure_start_date)', 'drug_era_end_date', 'day') }} + 1 as era_duration_in_days | ||
| from drug_era_ends | ||
| group by | ||
| person_id, | ||
| ingredient_rxcui, | ||
| drug_era_end_date | ||
| ) | ||
|
|
||
|
|
||
| -- Calculate PDC and select final columns | ||
| select | ||
| person_id, | ||
| ingredient_rxcui, | ||
| ingredient_name, | ||
| cast(drug_era_start_date as date) as drug_era_start_date, | ||
| cast(drug_era_end_date as date) as drug_era_end_date, | ||
| cast(drug_exposure_count as integer) as drug_exposure_count, | ||
| cast(total_days_exposed as integer) as total_days_exposed, | ||
| cast(era_duration_in_days as integer) as era_duration_in_days, | ||
| cast(gap_days as integer) as gap_days, | ||
| -- Safety guard for division by zero | ||
| round( | ||
| case | ||
| when era_duration_in_days > 0 | ||
| then (cast(total_days_exposed as {{ dbt.type_float() }}) / cast(era_duration_in_days as {{ dbt.type_float() }})) * 100 | ||
| else 0 | ||
| end, | ||
| 2 | ||
| ) as pdc, | ||
|
|
||
| '{{ var('tuva_last_run') }}' as tuva_last_run | ||
| from final_eras | ||
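The era-building and PDC logic in the model above can be sketched in plain Python for a single person and ingredient. This is an illustrative sketch only, not part of the PR: the function names `find_era_ends` and `build_eras`, the tuple layout, and the dictionary keys are assumptions chosen to mirror the SQL column names. It mirrors the diff as written, so the era end date keeps the 30-day padding and the PDC denominator is the padded era duration.

```python
from collections import defaultdict
from datetime import date, timedelta

GRACE_DAYS = 30  # persistence window, matching the +30 days in the diff


def find_era_ends(sub_exposures):
    """Return padded era end dates for one person and one ingredient.

    sub_exposures: list of (start_date, end_date, days_exposed) tuples
    (hypothetical layout, standing in for the sub_exposures CTE).
    """
    events = []
    for start, end, _days in sub_exposures:
        events.append((start, -1))                            # start event
        events.append((end + timedelta(days=GRACE_DAYS), 1))  # padded end event
    events.sort()  # by date, then starts (-1) before ends (1)

    era_ends = []
    starts_seen = 0
    for position, (event_date, event_type) in enumerate(events, start=1):
        if event_type == -1:
            starts_seen += 1
        # Same test as (2 * start_ordinal) - overall_ord = 0:
        # starts seen so far equal ends seen so far, so nothing is still open.
        if 2 * starts_seen == position:
            era_ends.append(event_date)
    return era_ends


def build_eras(sub_exposures):
    """Group sub-exposures into eras and compute PDC as a percentage."""
    era_ends = find_era_ends(sub_exposures)

    # Each sub-exposure belongs to the earliest era end on or after its start.
    grouped = defaultdict(list)
    for start, _end, days in sub_exposures:
        era_end = min(e for e in era_ends if e >= start)
        grouped[era_end].append((start, days))

    eras = []
    for era_end in sorted(grouped):
        members = grouped[era_end]
        era_start = min(s for s, _ in members)
        total_days = sum(d for _, d in members)
        duration = (era_end - era_start).days + 1
        eras.append({
            "drug_era_start_date": era_start,
            "drug_era_end_date": era_end,
            "drug_exposure_count": len(members),
            "total_days_exposed": total_days,
            "era_duration_in_days": duration,
            "gap_days": max(0, duration - total_days),
            # Guard against division by zero, as in the SQL
            "pdc": round(total_days / duration * 100, 2) if duration > 0 else 0,
        })
    return eras
```

The sketch counts one exposure per sub-exposure, whereas the SQL sums `drug_exposure_count`; ties between start events on the same date may also be ordered differently than in the warehouse.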
183 changes: 183 additions & 0 deletions
models/pharmacy/intermediate/pharmacy__int_calculate_sub_exposures.sql
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,183 @@ | ||
| {{ config( | ||
| enabled = var('brand_generic_enabled', var('claims_enabled', var('tuva_marts_enabled', False))) | as_bool | ||
| ) }} | ||
|
|
||
| with pharmacy_claim_input as ( | ||
| select * from {{ ref('pharmacy_claim') }} | ||
| ), | ||
|
|
||
| product_to_ingredient as ( | ||
| select * from {{ ref('pharmacy__product_to_ingredient') }} | ||
| ), | ||
|
|
||
| -- Step 1: Prepare claim lines with calculated end dates | ||
| prepared_claim_lines as ( | ||
| select | ||
| claim_id, | ||
| claim_line_number, | ||
| data_source, | ||
| person_id, | ||
| ndc_code, | ||
| dispensing_date, | ||
| days_supply, | ||
| -- Calculate original end date at claim line level | ||
| case | ||
| when days_supply > 0 | ||
| then {{ dbt.dateadd('day', 'days_supply - 1', 'dispensing_date') }} | ||
| else dispensing_date | ||
| end as original_end_date | ||
| from pharmacy_claim_input | ||
| -- Filter invalid data upfront for data quality | ||
| where days_supply is not null | ||
| and days_supply > 0 | ||
| ), | ||
|
|
||
| -- Step 2: Explode claim lines to ingredient level | ||
| claim_line_ingredients as ( | ||
| select | ||
| pcl.claim_id, | ||
| pcl.claim_line_number, | ||
| pcl.data_source, | ||
| pcl.person_id, | ||
| pcl.ndc_code, | ||
| pcl.dispensing_date as original_start_date, | ||
| pcl.original_end_date, | ||
| pcl.days_supply, | ||
| pti.ingredient_rxcui, | ||
| pti.ingredient_name | ||
| from prepared_claim_lines pcl | ||
| inner join product_to_ingredient pti | ||
| on pcl.ndc_code = pti.ndc | ||
| ), | ||
|
|
||
| -- Step 3: Work at claim_line level (not claim level) to preserve granularity | ||
| -- Each claim line may have different dates, so we treat them independently | ||
| unique_claim_lines as ( | ||
| select distinct | ||
| claim_id, | ||
| claim_line_number, | ||
| data_source, | ||
| person_id, | ||
| original_start_date, | ||
| original_end_date, | ||
| days_supply | ||
| from claim_line_ingredients | ||
| ), | ||
|
|
||
| -- Step 4: Get all ingredients for each claim line | ||
| claim_line_ingredients_distinct as ( | ||
| select distinct | ||
| claim_id, | ||
| claim_line_number, | ||
| data_source, | ||
| person_id, | ||
| ingredient_rxcui, | ||
| ingredient_name | ||
| from claim_line_ingredients | ||
| ), | ||
|
|
||
| -- Step 5: Use LAG to find the prior exposure's end date for each ingredient (replaces the earlier self-join and recursive CTE logic) | ||
| ingredient_with_prior as ( | ||
| select | ||
| cli.claim_id, | ||
| cli.claim_line_number, | ||
| cli.data_source, | ||
| cli.person_id, | ||
| cli.ingredient_rxcui, | ||
| cli.ingredient_name, | ||
| ucl.original_start_date, | ||
| ucl.original_end_date, | ||
| ucl.days_supply, | ||
| lag(ucl.original_end_date) over ( | ||
| partition by cli.person_id, cli.ingredient_rxcui | ||
| order by ucl.original_start_date, cli.claim_id, cli.claim_line_number | ||
| ) as prior_end_date | ||
| from claim_line_ingredients_distinct cli | ||
| inner join unique_claim_lines ucl | ||
| on cli.claim_id = ucl.claim_id | ||
| and cli.claim_line_number = ucl.claim_line_number | ||
| and cli.data_source = ucl.data_source | ||
| and cli.person_id = ucl.person_id | ||
| ), | ||
|
|
||
| -- Step 6: For each claim line, find the latest prior_end_date across all its ingredients | ||
| -- This ensures all ingredients from the same claim line stay synchronized | ||
| claim_line_dependencies as ( | ||
| select | ||
| claim_id, | ||
| claim_line_number, | ||
| data_source, | ||
| person_id, | ||
| original_start_date, | ||
| original_end_date, | ||
| days_supply, | ||
| max(prior_end_date) as max_prior_end_date | ||
| from ingredient_with_prior | ||
| group by | ||
| claim_id, | ||
| claim_line_number, | ||
| data_source, | ||
| person_id, | ||
| original_start_date, | ||
| original_end_date, | ||
| days_supply | ||
| ), | ||
|
|
||
| -- Step 7: Adjust claim lines based on prior exposures | ||
| adjusted_claim_lines as ( | ||
| select | ||
| claim_id, | ||
| claim_line_number, | ||
| data_source, | ||
| person_id, | ||
| original_start_date, | ||
| original_end_date, | ||
| days_supply, | ||
| -- Adjusted start: later of (original start, prior end + 1 day) | ||
| case | ||
| when max_prior_end_date is not null | ||
| then greatest( | ||
| original_start_date, | ||
| {{ dbt.dateadd('day', 1, 'max_prior_end_date') }} | ||
| ) | ||
| else original_start_date | ||
| end as adjusted_start_date | ||
| from claim_line_dependencies | ||
| ), | ||
|
|
||
| -- Step 8: Calculate adjusted end date for each claim line | ||
| adjusted_claim_lines_with_end as ( | ||
| select | ||
| claim_id, | ||
| claim_line_number, | ||
| data_source, | ||
| person_id, | ||
| adjusted_start_date, | ||
| -- Adjusted end: adjusted start + days_supply - 1 | ||
| {{ dbt.dateadd('day', 'days_supply - 1', 'adjusted_start_date') }} as adjusted_end_date, | ||
| days_supply | ||
| from adjusted_claim_lines | ||
| ), | ||
|
|
||
| -- Step 9: Join adjusted dates back to all claim line ingredients | ||
| -- All ingredients from the same claim line get the same adjusted dates | ||
| final_ingredient_exposures as ( | ||
| select | ||
| cli.claim_id, | ||
| cli.claim_line_number, | ||
| cli.data_source, | ||
| cli.person_id, | ||
| cli.ingredient_rxcui, | ||
| cli.ingredient_name, | ||
| ac.adjusted_start_date as drug_exposure_start_date, | ||
|
|
||
| ac.adjusted_end_date as drug_exposure_end_date, | ||
| cli.days_supply | ||
| from claim_line_ingredients cli | ||
| inner join adjusted_claim_lines_with_end ac | ||
| on cli.claim_id = ac.claim_id | ||
| and cli.claim_line_number = ac.claim_line_number | ||
| and cli.data_source = ac.data_source | ||
| and cli.person_id = ac.person_id | ||
| ) | ||
|
|
||
| select * from final_ingredient_exposures | ||
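Steps 1, 5, 7, and 8 of the model above (filter invalid days supply, look up the prior fill's end date, push overlapping fills forward, recompute the end date) can be sketched in Python for one person and one ingredient. This is a hedged illustration, not the PR's implementation: `adjust_fills` and its input layout are hypothetical, fills are ordered by date only rather than by claim ID and line number, and the step 6 synchronization across ingredients on the same claim line is left out.

```python
from datetime import date, timedelta


def adjust_fills(fills):
    """Shift overlapping fills forward for one person and one ingredient.

    fills: list of (dispensing_date, days_supply) tuples (hypothetical layout).
    Returns a list of (adjusted_start_date, adjusted_end_date) tuples.
    """
    # Step 1: drop rows with a missing or non-positive days supply, then order by date
    valid = sorted((d, s) for d, s in fills if s is not None and s > 0)

    adjusted = []
    prior_end = None  # plays the role of LAG(original_end_date)
    for dispensing_date, days_supply in valid:
        # Step 7: adjusted start is the later of the original start
        # and the day after the prior fill's original end date
        start = dispensing_date
        if prior_end is not None:
            start = max(dispensing_date, prior_end + timedelta(days=1))

        # Step 8: adjusted end = adjusted start + days_supply - 1
        end = start + timedelta(days=days_supply - 1)
        adjusted.append((start, end))

        # As in the diff, the look-back uses the original (unadjusted) end date,
        # so a shift on one fill does not cascade to the next one
        prior_end = dispensing_date + timedelta(days=days_supply - 1)
    return adjusted
```

For example, a 30-day fill on 2024-01-01 followed by an early refill on 2024-01-21 would have the second fill's start moved to the day after the first fill's original end date.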