Add service-level check and fct table #2206
Conversation
@lauriemerrell, could you take a look at this when you are back? I created a new fct table as you suggested, but I'm still using service_key as a unique ID. I think I need some high-level assistance from you. For example, should I be grouping by "provider" rather than "service"? Thank you!
Ok yeah sorry, I did not understand the context when I was weighing in on Slack yesterday, sorry if I muddied the waters. There are a couple of things here:
TLDR: I think that you and @atvaccaro need to figure out who is making
Ok, update: I missed that @atvaccaro already made this: https://github.com/cal-itp/data-infra/blob/main/warehouse/models/intermediate/gtfs/gtfs_quality/int_gtfs_quality__services_guideline_index.sql
Thank you! I think I can get this updated today based on your feedback here.
@lauriemerrell, this is taking quite a long time to run. I wonder if the join with the BETWEEN statement is slowing things down.
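For readers unfamiliar with the pattern, here is a minimal sketch of the kind of range join being discussed. It is not the query from this PR: the table and column names (`date_index`, `services`, `check_date`, `valid_from`, `valid_to`) are hypothetical, and SQLite via Python is used only to keep the example runnable, whereas the actual models run in the project's warehouse.

```python
import sqlite3

# In-memory database with two tiny, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_index (check_date TEXT);
    CREATE TABLE services (service_key TEXT, valid_from TEXT, valid_to TEXT);
    INSERT INTO date_index VALUES ('2023-01-01'), ('2023-01-02'), ('2023-01-03');
    INSERT INTO services VALUES
        ('svc_a', '2023-01-01', '2023-01-02'),
        ('svc_b', '2023-01-03', '2023-01-03');
""")

# Range join: each date is matched to every service whose validity
# window contains it. BETWEEN is inclusive on both ends, and ISO-8601
# date strings compare correctly as text.
rows = conn.execute("""
    SELECT d.check_date, s.service_key
    FROM date_index AS d
    JOIN services AS s
      ON d.check_date BETWEEN s.valid_from AND s.valid_to
    ORDER BY d.check_date, s.service_key
""").fetchall()

for row in rows:
    print(row)
```

Because the join condition has no equality key, an engine may have to compare many row pairs, so the cost of this pattern can grow with the product of the two table sizes. Whether that is the bottleneck here depends on how large the joined tables actually are.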
I think I will hold off on this PR until #2205 is done, as it will speed things up dramatically for me.
@owades re:
The tables in my sample SQL above are both tiny (~MB), so I don't think that step is the issue --
Starting fresh on a new branch.
Description
New check to see whether this guideline is met:
I might be able to incorporate the observed_trip data structure proposed here to reduce the cost of my big join.
Type of change
How has this been tested?
Not yet