Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] incremental loading method for satellites missing data #233

Open
crystalgtaylor91 opened this issue May 13, 2024 · 1 comment
Open
Assignees
Labels
bug Something isn't working

Comments

@crystalgtaylor91
Copy link

crystalgtaylor91 commented May 13, 2024

Describe the bug
Incremental loading for a satellite with batch loaded data (multiple rows for the same pk, with different hashdiff, but the same load_datetime) doesn't work as it should. Because of the way the CTE is written this skips rows of unique data that should be included. This is because the window functions don't maintain ordering.

Environment

dbt version: 1.7.13
automate_dv version: 0.10.2
Database/Platform: Snowflake

To Reproduce
Steps to reproduce the behavior:

  1. Create a raw staging table with multiple entries with the same PK and load_datetime, and different payloads
  2. Create a stg table using the staging table as source, hash the pk, and the payload columns
  3. Create a sat table, with incremental load and source filtering
  4. Check the row count for the stg and satelitte tables, these will be different from expected.

Example I used to replicate this error + potential solution.
example.zip

Expected behavior
All unique rows should be selected. This can be fixed by applying the row_number() function in the source_data CTE, and then using it in the order by clause of the LAG function in the unique_source_records CTE. See attached files for example. Unique rows are being filtered out because applying the window functions in two separate CTE's means that the row ordering is different for each CTE.

AB#5344

@crystalgtaylor91 crystalgtaylor91 added the bug Something isn't working label May 13, 2024
@DVAlexHiggs
Copy link
Member

DVAlexHiggs commented May 13, 2024

Thank you for this report! On the surface this does look like a legitimate bug - we will look to test this our end as soon as we can and get back to you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants