update vignette to not use holdout role #44

Merged: 15 commits, Apr 16, 2024
43 changes: 43 additions & 0 deletions .github/workflows/test-task-1.yml
@@ -0,0 +1,43 @@
# r cmd check workflow of the mlr3 ecosystem v0.1.0
# https://github.com/mlr-org/actions
on:
  workflow_dispatch:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

name: mlr3 & mlr3pipelines change

jobs:
  r-cmd-check:
    runs-on: ${{ matrix.config.os }}

    name: ${{ matrix.config.os }} (${{ matrix.config.r }})

    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: ubuntu-latest, r: 'release'}

    steps:
      - uses: actions/checkout@v3

      - name: mlr3
        run: 'echo -e "Remotes:\n mlr-org/mlr3@feat/train-predict,\n mlr-org/mlr3pipelines@fix/uses_test_task" >> DESCRIPTION'

      - uses: r-lib/actions/setup-r@v2
        with:
          r-version: ${{ matrix.config.r }}

      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
          needs: check
      - uses: r-lib/actions/check-r-package@v2
43 changes: 43 additions & 0 deletions .github/workflows/test-task-2.yml
@@ -0,0 +1,43 @@
# r cmd check workflow of the mlr3 ecosystem v0.1.0
# https://github.com/mlr-org/actions
on:
  workflow_dispatch:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

name: mlr3 & mlr3pipelines change

jobs:
  r-cmd-check:
    runs-on: ${{ matrix.config.os }}

    name: ${{ matrix.config.os }} (${{ matrix.config.r }})

    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}

    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: ubuntu-latest, r: 'release'}

    steps:
      - uses: actions/checkout@v3

      - name: mlr3
        run: 'echo -e "Remotes:\n mlr-org/mlr3@feat/train-predict,\n mlr-org/mlr3pipelines@feat/test-rows" >> DESCRIPTION'

      - uses: r-lib/actions/setup-r@v2
        with:
          r-version: ${{ matrix.config.r }}

      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
          needs: check
      - uses: r-lib/actions/check-r-package@v2
9 changes: 5 additions & 4 deletions vignettes/mcboost_basics_extensions.Rmd
@@ -517,10 +517,11 @@ summary(data$ViolentCrimesPerPop)
```

We again split our task into **train** and **test**.
We do this in `mlr3` by simply setting some (here 500) row roles to `"holdout"`.
We do this in `mlr3` by creating a 2/3 - 1/3 split using `mlr3::partition()` and assigning the train ids to the row role `"use"`.

```{r}
tsk$set_row_roles(sample(tsk$row_roles$use, 500), "holdout")
split = partition(tsk)
tsk$set_row_roles(split$train, "use")
```
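
The train/test split described above can be sketched on a small built-in task. This is a minimal illustration rather than the vignette's code: the `"sonar"` task and the names `task`, `train_task` and `test_task` are assumptions chosen for the example. As far as I understand `set_row_roles()`, it adds the given ids to a role and removes them from the other roles, but does not drop ids that already hold that role, so the sketch restricts each clone with `$filter()` instead.

```{r}
library(mlr3)

set.seed(1)

# Stand-in task for illustration (the vignette uses its own `tsk` object).
task = tsk("sonar")

# partition() returns row ids; `ratio` is the share assigned to `train`.
split = partition(task, ratio = 2 / 3)

# Restrict one clone to the training rows and another to the test rows.
# $filter() keeps only the given ids in the "use" row role.
train_task = task$clone(deep = TRUE)
train_task$filter(split$train)

test_task = task$clone(deep = TRUE)
test_task$filter(split$test)

# The two sets of ids are disjoint and together cover the original task.
length(intersect(split$train, split$test))
train_task$nrow + test_task$nrow == task$nrow
```

Working on clones leaves the original task untouched, so the same `split` object can be reused later to build the test task.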

### 6.1 Preprocessing
@@ -571,13 +572,13 @@ mc$multicalibrate(data, labels)

### 6.3 Evaluation on Test Data

We first create the test task by setting the `holdout` rows to `use`, and then
We first create the test task by assigning the test ids to the row role `"use"`, and then
use our preprocessing `pipe`'s predict function to also impute missing values
for the validation data. Then we again extract features `X` and target `y`.

```{r}
test_task = tsk$clone()
test_task$row_roles$use = test_task$row_roles$holdout
test_task$row_roles$use = split$test
test_task = pipe$predict(list(test_task))[[1]]
test_data = test_task$data(cols = tsk$feature_names)
test_labels = test_task$data(cols = tsk$target_names)[[1]]