Defer backend registration to validation time #1818

Merged on Sep 24, 2024 (9 commits)


cosmicBboy
Collaborator

Fixes #1810.

This PR speeds up pandera schema definition by deferring the registration of validation backends to validation time. Previously, schema backends were registered at schema initialization time, which slowed down code that only defines schemas and never validates data with them.

It also refactors the pandas register.py logic so that the dask, modin, pyspark, and geopandas modules are imported only when the corresponding dataframe type is passed in at validation time.
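
One way to import an optional integration only when its dataframe type shows up is sketched below. This is an assumption about the general technique, not the contents of register.py: `OPTIONAL_INTEGRATIONS`, `LOADED`, and `ensure_integration` are hypothetical names, and standard-library modules stand in for dask, modin, pyspark, and geopandas so that the sketch runs without those dependencies.

```python
import importlib
from typing import Any, Set

# Hypothetical mapping: top-level package of the data object -> module to import.
# Standard-library modules stand in for the optional dataframe libraries.
OPTIONAL_INTEGRATIONS = {
    "decimal": "decimal",
    "fractions": "fractions",
}
LOADED: Set[str] = set()  # integrations imported so far


def ensure_integration(data: Any) -> None:
    """Import the integration matching the package of data, at most once."""
    # The class of the object tells us which library it came from.
    package = type(data).__module__.split(".")[0]
    module_name = OPTIONAL_INTEGRATIONS.get(package)
    if module_name is None or module_name in LOADED:
        return  # not an optional type, or already handled
    importlib.import_module(module_name)
    LOADED.add(module_name)
```

The check relies on the fact that an object can only come from a library the caller has already imported, so inspecting `type(data).__module__` never forces an import of a library that is not in use.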

cosmicBboy merged commit 3a2ff78 into main on Sep 24, 2024
144 checks passed
cosmicBboy deleted the bugfix/1810-profiling branch on September 25, 2024, 01:31
Successfully merging this pull request may close these issues.

Initial schema creation is very slow