[sc-28901] Validate query logs as they're parsed #986
☂️ Python Coverage
Overall Coverage
New Files: No new covered files...
Modified Files
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #986 +/- ##
==========================================
- Coverage 89.38% 89.37% -0.01%
==========================================
Files 191 191
Lines 12575 12578 +3
==========================================
+ Hits 11240 11242 +2
- Misses 1335 1336 +1
Didn't we validate them during ingestion?
No, ingestion does not look at what's in the query log files.
Got it. Can we also validate the entities in the MCE? We have hit MCE validation errors during ingestion many times, so it would be better to catch them earlier.
MCEs are partially validated: https://github.com/MetaphorData/connectors/blob/main/metaphor/common/sink.py#L23C1-L28C14 That check only validates against the JSON schema though, so it's kind of limited.
…validate-parsed-query-logs
🤔 Why?
We should validate the query logs as they're parsed. Otherwise, bugs are harder to find once they reach the lineage parser.
🤓 What?
Validate the parsed query logs before writing them to file.
Tested this against our own Snowflake instance (13,589 queries)
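To illustrate the idea of validating each parsed record before it is written out, here is a minimal sketch. It is not the code from this PR: `ParsedQueryLog`, `validate_query_log`, and `write_validated_logs` are hypothetical names, and the field set and rules are assumptions chosen only for the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, List, Optional, TextIO


@dataclass
class ParsedQueryLog:
    """Hypothetical stand-in for a parsed query log record."""

    query_id: str
    sql: str
    platform: str
    duration_seconds: Optional[float] = None


def validate_query_log(log: ParsedQueryLog) -> List[str]:
    """Return a list of problems; an empty list means the record is valid."""
    errors: List[str] = []
    if not isinstance(log.query_id, str) or not log.query_id:
        errors.append("query_id must be a non-empty string")
    if not isinstance(log.sql, str) or not log.sql.strip():
        errors.append("sql must be a non-empty string")
    if not isinstance(log.platform, str) or not log.platform:
        errors.append("platform must be a non-empty string")
    d = log.duration_seconds
    # bool is a subclass of int, so reject it explicitly.
    if d is not None and (
        isinstance(d, bool) or not isinstance(d, (int, float)) or d < 0
    ):
        errors.append("duration_seconds must be a non-negative number")
    return errors


def write_validated_logs(logs: Iterable[ParsedQueryLog], out: TextIO) -> int:
    """Validate each record, then write it as one JSON line.

    Raises ValueError on the first invalid record, so a bad record is
    reported at parse time rather than by a downstream consumer.
    """
    written = 0
    for log in logs:
        errors = validate_query_log(log)
        if errors:
            raise ValueError(
                f"invalid query log {log.query_id!r}: {'; '.join(errors)}"
            )
        out.write(json.dumps(asdict(log)) + "\n")
        written += 1
    return written
```

Failing fast on the first invalid record is one possible design choice. Collecting all errors and skipping bad records would be an equally reasonable alternative, depending on how the lineage parser is expected to handle partial input.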
🧪 Tested?
Tested against
☑️ Checks
pyproject.toml