-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: add optional --schema_location
argument to check_model_parents_schema
#132
base: main
Are you sure you want to change the base?
Conversation
Not sure how to ensure test coverage if it's needed – would be very grateful for any advice as I kept getting coverage failures for this section:
For now, I have escaped the coverage for the new optional |
Codecov ReportAttention:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #132 +/- ##
===========================================
- Coverage 100.00% 98.39% -1.61%
===========================================
Files 47 50 +3
Lines 2194 2486 +292
Branches 272 321 +49
===========================================
+ Hits 2194 2446 +252
- Misses 0 24 +24
- Partials 0 16 +16 ☔ View full report in Codecov by Sentry. |
Potential fix for #135 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kiliantscherny thanks for contributing to dbt-checkpoint
Would you be able to extend existing test check model parents schema with this new argument?
@BAntonellini I've just had a go at adding the test coverage for this new argument, but please just let me know if it wasn't done correctly. Thanks! |
Hi @BAntonellini would you be able to take another look at this when you get the chance? |
@@ -40,7 +42,13 @@ def check_parents_schema( | |||
) | |||
) | |||
for parent in parents: | |||
db = parent.node.get("schema") | |||
# Selecting the location of the model schema | |||
if schema_location == "config": # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why adding the # pragma: no cover
? Please cover this case as part of the tests
if schema_location == "config": # pragma: no cover | ||
# Chooses the schema inside in the model config (nodes -> model -> config -> schema) | ||
db = parent.node.get("config").get("schema") | ||
else: # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
I'm proposing a small change to the
check_model_parents_schema
check to fix an issue with the parsing of the parent models' schema.Current state
When this check is run, it parses the
schema
from nodes → schema, which is prefixed by an individual's configuration inprofiles.yml
– for example:For a BigQuery-based profile:
For a Snowflake-based profile:
In
manifest.json
, for any given model,nodes → schema
inherits the individual's[dev_dataset_name]
and prefixes it in front of the "true" schema name, which itself comes fromdbt_project.yml
, e.g. for a "dim" folder in an example repo:This then causes the check to fail if we set the args as:
Because the schema is parsed from
nodes → schema
asdev_dim_schema_name
rather thandim_schema_name
(which is found innodes → config → schema
), the check doesn't work as expected.Proposed solution
In our organisation, we have had issues with this check because we actually need it to parse the schema from nodes → config → schema instead, as the different dbt developers each have their own dev dataset name which prefixes the model schema in their dev environment. This must have changed when the manifest was updated to a newer version.
By selecting from the "nested" schema that's inside
config
, we get the "true" schema name reliably every time, for each developer and in production.The additional argument I'm adding to
args
is--schema_location
to optionally specify if the intended schema is located insideconfig
.How it works
It's completely optional, and only requires the user to add
"--schema_location", "config"
to theargs
after the white-/blacklist argument, like:Note on testing coverage
I'm not really too familiar with Pytest or how to ensure complete coverage, so if there are any other contributors with an idea of how to help with this I'd be hugely grateful!