paradox compatibility, remove holdout rows
* check with new paradox

* new paradox syntax

* paradox::

* update vignette to not use holdout role

* add workflows

* fix workflow

* trigger actions

* dev cmd check with paradox master

* news

* Update vignettes/mcboost_basics_extensions.Rmd

* delete workflows

* paradox compatibility, vignette

* update maintainer

* release 0.4.3

---------

Co-authored-by: mb706 <[email protected]>
sebffischer and mb706 committed Apr 16, 2024
1 parent 8ac638b commit 7eccb17
Showing 7 changed files with 88 additions and 56 deletions.
44 changes: 44 additions & 0 deletions .github/workflows/r-cmd-check-paradox.yml
@@ -0,0 +1,44 @@
# r cmd check workflow of the mlr3 ecosystem v0.1.0
# https://github.com/mlr-org/actions
on:
  workflow_dispatch:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

name: r-cmd-check-paradox

jobs:
  r-cmd-check:
    runs-on: ${{ matrix.config.os }}

    name: ${{ matrix.config.os }} (${{ matrix.config.r }})

    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: ubuntu-latest, r: 'devel'}
          - {os: ubuntu-latest, r: 'release'}

    steps:
      - uses: actions/checkout@v3

      - name: paradox
        run: 'echo -e "Remotes:\n mlr-org/paradox,\n mlr-org/mlr3learners,\n mlr-org/mlr3pipelines,\n mlr-org/mlr3oml" >> DESCRIPTION'

      - uses: r-lib/actions/setup-r@v2
        with:
          r-version: ${{ matrix.config.r }}

      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
          needs: check
      - uses: r-lib/actions/check-r-package@v2
19 changes: 11 additions & 8 deletions DESCRIPTION
@@ -1,11 +1,11 @@
Package: mcboost
Type: Package
Title: Multi-Calibration Boosting
Version: 0.4.2
Version: 0.4.3
Authors@R:
c(person(given = "Florian",
family = "Pfisterer",
role = c("cre", "aut"),
role = "aut",
email = "[email protected]",
comment = c(ORCID = "0000-0001-8867-762X")),
person(given = "Susanne",
@@ -18,18 +18,21 @@ Authors@R:
role = "ctb",
email = "[email protected]",
comment = c(ORCID = "0000-0001-7363-4299")),
person(given = "Carolin",
person(given = "Carolin",
family = "Becker",
role = "ctb"),
person(given = "Bernd",
family = "Bischl",
role = "ctb",
email = "[email protected]",
comment = c(ORCID = "0000-0001-6002-6980"))
comment = c(ORCID = "0000-0001-6002-6980")),
person(given = "Sebastian",
family = "Fischer",
role = c("ctb", "cre"),
email = "[email protected]")
)
Maintainer: Florian Pfisterer <[email protected]>
Description: Implements 'Multi-Calibration Boosting' (2018) <https://proceedings.mlr.press/v80/hebert-johnson18a.html> and
'Multi-Accuracy Boosting' (2019) <arXiv:1805.12317> for the multi-calibration of a machine learning model's prediction.
'Multi-Accuracy Boosting' (2019) <doi:10.48550/arXiv.1805.12317> for the multi-calibration of a machine learning model's prediction.
'MCBoost' updates predictions for sub-groups in an iterative fashion in order to mitigate biases like poor calibration or large accuracy differences across subgroups.
Multi-Calibration works best in scenarios where the underlying data & labels are unbiased, but resulting models are.
This is often the case, e.g. when an algorithm fits a majority population while ignoring or under-fitting minority populations.
@@ -66,9 +69,9 @@ Suggests:
covr,
testthat (>= 3.1.0)
Roxygen: list(markdown = TRUE, r6 = TRUE)
RoxygenNote: 7.2.1
RoxygenNote: 7.3.1
VignetteBuilder: knitr
Collate:
Collate:
'AuditorFitters.R'
'MCBoost.R'
'PipelineMCBoost.R'
7 changes: 5 additions & 2 deletions NEWS.md
@@ -1,7 +1,10 @@
# mcboost (development version)
# mcboost 0.4.3

* Compatibility with upcoming 'paradox' release.
* Change the vignette to not use the holdout task.

# mcboost 0.4.2
* Removed new functionality for survival tasks added in `0.4.0`.
* Removed new functionality for survival tasks added in `0.4.0`.
A dependency, `mlr3proba` was removed from CRAN for now.
The functionality will be added back when `mlr3proba` is re-introduced to CRAN.
Users who wish to use `mcboost` for `survival` are advised to use version `0.4.1` together with the GitHub version of `mlr3proba`.
26 changes: 13 additions & 13 deletions R/PipeOpMCBoost.R
@@ -65,19 +65,19 @@ PipeOpMCBoost = R6Class("PipeOpMCBoost",
#' @param param_vals [`list`] \cr
#' List of hyperparameters for the `PipeOp`.
initialize = function(id = "mcboost", param_vals = list()) {
param_set = paradox::ParamSet$new(list(
paradox::ParamInt$new("max_iter", lower = 0L, upper = Inf, default = 5L, tags = "train"),
paradox::ParamDbl$new("alpha", lower = 0, upper = 1, default = 1e-4, tags = "train"),
paradox::ParamDbl$new("eta", lower = 0, upper = 1, default = 1, tags = "train"),
paradox::ParamLgl$new("partition", tags = "train", default = TRUE),
paradox::ParamInt$new("num_buckets", lower = 1, upper = Inf, default = 2L, tags = "train"),
paradox::ParamLgl$new("rebucket", default = FALSE, tags = "train"),
paradox::ParamLgl$new("multiplicative", default = TRUE, tags = "train"),
paradox::ParamUty$new("auditor_fitter", default = NULL, tags = "train"),
paradox::ParamUty$new("subpops", default = NULL, tags = "train"),
paradox::ParamUty$new("default_model_class", default = ConstantPredictor, tags = "train"),
paradox::ParamUty$new("init_predictor", default = NULL, tags = "train")
))
param_set = paradox::ps(
max_iter = paradox::p_int(lower = 0L, upper = Inf, default = 5L, tags = "train"),
alpha = paradox::p_dbl(lower = 0, upper = 1, default = 1e-4, tags = "train"),
eta = paradox::p_dbl(lower = 0, upper = 1, default = 1, tags = "train"),
partition = paradox::p_lgl(tags = "train", default = TRUE),
num_buckets = paradox::p_int(lower = 1, upper = Inf, default = 2L, tags = "train"),
rebucket = paradox::p_lgl(default = FALSE, tags = "train"),
multiplicative = paradox::p_lgl(default = TRUE, tags = "train"),
auditor_fitter = paradox::p_uty(default = NULL, tags = "train"),
subpops = paradox::p_uty(default = NULL, tags = "train"),
default_model_class = paradox::p_uty(default = ConstantPredictor, tags = "train"),
init_predictor = paradox::p_uty(default = NULL, tags = "train")
)
super$initialize(id,
param_set = param_set, param_vals = param_vals, packages = character(0),
input = data.table(name = c("data", "prediction"), train = c("TaskClassif", "TaskClassif"), predict = c("TaskClassif", "TaskClassif")),
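
Reviewer note: the hunk above swaps `paradox::ParamSet$new(list(paradox::ParamInt$new(...)))` for the `paradox::ps()` shorthand, where each parameter is passed as a named argument built with `p_int()`, `p_dbl()`, `p_lgl()` or `p_uty()`. The following is a minimal sketch of that style on a reduced, illustrative parameter set (three of the names from the diff, not the full `PipeOpMCBoost` set), assuming a `paradox` version that exports `ps()` and the `p_*()` constructors.

```r
library(paradox)

# Illustrative subset of the hyperparameters from the diff, written with ps().
# The parameter id is taken from the argument name instead of a string.
param_set = ps(
  max_iter = p_int(lower = 0L, upper = Inf, default = 5L, tags = "train"),
  alpha    = p_dbl(lower = 0, upper = 1, default = 1e-4, tags = "train"),
  rebucket = p_lgl(default = FALSE, tags = "train")
)

# Ids follow the order in which the parameters were declared.
ids = param_set$ids()

# Defaults are available as a named list.
default_iter = param_set$default$max_iter

# Values are checked against the declared bounds; alpha = 2 is out of range.
alpha_ok = param_set$test(list(alpha = 2))
```

Because the id comes from the argument name, a parameter cannot be declared under one name and registered under another, which is one plausible reason the shorthand is preferred here.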
30 changes: 3 additions & 27 deletions cran-comments.md
@@ -1,30 +1,6 @@
## Reason for resubmission

Removed dependency on package mlr3proba that was removed from CRAN.
Apologies for not being able to upload a new version in time.

## R CMD check

Results in one NOTE:

CRAN repository db overrides:
X-CRAN-Comment: Archived on 2022-05-16 as requires archived package 'mlr3proba'.

The dependency on 'mlr3proba' has been removed in the updated version.


There is one NOTE that is only found on Windows (Server 2022, R-devel 64-bit):

```
* checking for detritus in the temp directory ... NOTE
Found the following files/directories:
'lastMiKTeXException'
```

As noted in R-hub issue #503, this could be due to a bug/crash in MiKTeX and can likely be ignored.

- WARNINGs or ERRORs

## R-HUB
0 errors | 0 warnings | 1 note

All checks show "Status: success"
New maintainer:
Sebastian Fischer <[email protected]>
9 changes: 7 additions & 2 deletions man/mcboost-package.Rd

9 changes: 5 additions & 4 deletions vignettes/mcboost_basics_extensions.Rmd
@@ -517,10 +517,11 @@ summary(data$ViolentCrimesPerPop)
```

We again split our task into **train** and **test**.
We do this in `mlr3` by simply setting some (here 500) row roles to `"holdout"`.
We do this in `mlr3` by creating a 2/3 - 1/3 split using `mlr3::partition()` and assigning the train ids to the row role `"use"`.

```{r}
tsk$set_row_roles(sample(tsk$row_roles$use, 500), "holdout")
split = partition(tsk)
tsk$set_row_roles(split$train, "use")
```
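
Reviewer note: a minimal sketch of a train/test split with `mlr3::partition()`, using the built-in `sonar` task as a stand-in for the vignette's `tsk` object (the task choice, the seed and the explicit `ratio = 2/3` are assumptions for illustration). It assumes the goal is that only the training ids remain in the `"use"` role, and therefore assigns to `row_roles$use` directly, the same form this diff uses for the test task further down; direct assignment replaces the whole set of ids in that role.

```r
library(mlr3)

set.seed(1)

# Built-in example task; stands in for the vignette's `tsk` object.
task = tsk("sonar")

# partition() returns row ids for the train and test sets.
split = partition(task, ratio = 2/3)

# Restrict a clone of the task to the training rows.
train_task = task$clone()
train_task$row_roles$use = split$train

# Restrict another clone to the test rows.
test_task = task$clone()
test_task$row_roles$use = split$test
```

Working on clones keeps the original task untouched, so the full set of rows stays available for later steps.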

### 6.1 Preprocessing
@@ -571,13 +572,13 @@ mc$multicalibrate(data, labels)

### 6.3 Evaluation on Test Data

We first create the test task by setting the `holdout` rows to `use`, and then
We first create the test task by assigning the test ids to the row role `"use"`, and then
use our preprocessing `pipe's` predict function to also impute missing values
for the validation data. Then we again extract features `X` and target `y`.

```{r}
test_task = tsk$clone()
test_task$row_roles$use = test_task$row_roles$holdout
test_task$row_roles$use = split$test
test_task = pipe$predict(list(test_task))[[1]]
test_data = test_task$data(cols = tsk$feature_names)
test_labels = test_task$data(cols = tsk$target_names)[[1]]