Wrap_up_plot fails when it has to handle unsuccessful runs #5

Open
rcap107 opened this issue Jan 4, 2024 · 1 comment

Comments

@rcap107
Owner

rcap107 commented Jan 4, 2024

`wrap_up_plot` fails because `read_logs` cannot vstack the log files it reads: in some files, columns such as `time_join_train` have dtype `str` instead of `f64`. These string columns are produced when the corresponding run fails, so the root cause is the poor handling of failed runs.

    161 def wrap_up_plot(exp_name, task="regression", variable_of_interest=None):
    162     """Prepare and save the plots relevant to the task under consideration.
    163     If the task is `regression`, plot `r2score`, if the task is `classification`,
    164     plot `f1score`.
   (...)
    169         `classification`. Defaults to "regression".
    170     """
--> 171     df_raw = read_logs(exp_name=exp_name)
    173     if task == "regression":
    174         current_score = "r2score"

File ~/work/benchmark-join-suggestions/src/utils/logging.py:75, in read_logs(exp_name, exp_path)
     73 for f in path_agg_logs.glob("*.log"):
     74     logs.append(pl.read_csv(f))
---> 75 df_agg = pl.concat(logs)
     77 return df_agg

File ~/mambaforge/envs/bench-repro/lib/python3.10/site-packages/polars/functions/eager.py:170, in concat(items, how, rechunk, parallel)
    168 if isinstance(first, pl.DataFrame):
    169     if how == "vertical":
--> 170         out = wrap_df(plr.concat_df(elems))
    171     elif how == "vertical_relaxed":
    172         out = wrap_ldf(
    173             plr.concat_lf(
    174                 [df.lazy() for df in elems],
   (...)
    178             )
    179         ).collect(no_optimization=True)

ShapeError: unable to vstack, dtypes for column "time_join_train" don't match: `f64` and `str`
@rcap107
Owner Author

rcap107 commented Jan 4, 2024

Related to: #4
