Break up rawcovariates.csv output #102

Open
brenmous opened this issue Jul 2, 2020 · 0 comments

brenmous commented Jul 2, 2020

One output of learning is a rawcovariates.csv file, originally intended to show the untransformed value of each covariate at each position. It now contains several extra fields as well, such as target values, prediction values from cross-validation, and user-defined fields.

It's overloaded and in a bad state at the moment because it gets written initially when covariate/target intersection occurs, and is then opened and written to again after cross-validation is performed. It might be a better idea to break up this file and write multiple files instead, or to carry a pandas DataFrame through the workflow, adding results to it and outputting it as one big results table.
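To make the two options concrete, here is a rough sketch of carrying a single DataFrame through the workflow and splitting it into per-topic tables only when writing. Everything in it is an assumption for illustration: the column names (`x`, `y`, `cov_a`, `target`, `cv_prediction`), the function names, and the output table names are hypothetical and are not taken from the current code.

```python
import pandas as pd

# Hypothetical position columns; the real file may name these differently.
POSITION_COLUMNS = ["x", "y"]


def intersect_covariates_and_targets(covariates, targets):
    """Keep only the positions that have both covariate and target values."""
    return covariates.merge(targets, on=POSITION_COLUMNS, how="inner")


def attach_cv_predictions(results, predictions):
    """Return a copy of the results table with a cross-validation column."""
    if len(predictions) != len(results):
        raise ValueError("expected one prediction per intersected position")
    return results.assign(cv_prediction=list(predictions))


def split_results(results, covariate_columns):
    """Break the combined table into one smaller table per kind of output."""
    return {
        "rawcovariates": results[POSITION_COLUMNS + covariate_columns],
        "targets": results[POSITION_COLUMNS + ["target"]],
        "cv_predictions": results[POSITION_COLUMNS + ["cv_prediction"]],
    }


# Toy data standing in for the real covariate and target inputs.
covariates = pd.DataFrame({"x": [0, 1, 2], "y": [0, 1, 2], "cov_a": [1.0, 2.0, 3.0]})
targets = pd.DataFrame({"x": [0, 2], "y": [0, 2], "target": [10.0, 30.0]})

results = intersect_covariates_and_targets(covariates, targets)
results = attach_cv_predictions(results, [9.5, 31.0])
tables = split_results(results, covariate_columns=["cov_a"])

# Each table could then be written once, at the end of the workflow, e.g.:
# for name, table in tables.items():
#     table.to_csv(f"{name}.csv", index=False)
```

With this shape, nothing is reopened and appended to mid-run: the table only grows in memory, and each file is written exactly once.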

If embarking on this, be aware that a lot of the diagnostics.py functions (plotting) read this file and rely on the column ordering.
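One way to loosen that dependency might be to have readers select columns by name rather than by position. The sketch below is only an illustration: the column names and the `load_target_and_prediction` helper are hypothetical, and it does not reflect how diagnostics.py reads the file today.

```python
import io

import pandas as pd


def load_target_and_prediction(source):
    """Read only the named columns, wherever they sit in the file."""
    frame = pd.read_csv(source, usecols=["target", "cv_prediction"])
    return frame["target"], frame["cv_prediction"]


# Two toy files with the same data but a different column order.
original_order = "x,y,cov_a,target,cv_prediction\n0,0,1.0,10.0,9.5\n2,2,3.0,30.0,31.0\n"
shuffled_order = "cv_prediction,x,target,y,cov_a\n9.5,0,10.0,0,1.0\n31.0,2,30.0,2,3.0\n"

target_a, prediction_a = load_target_and_prediction(io.StringIO(original_order))
target_b, prediction_b = load_target_and_prediction(io.StringIO(shuffled_order))
```

Both calls return the same values, so adding or reordering fields in the output would not break a reader written this way.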
