Building a model needs write access to cmdstan installation #995

Open
weshinsley opened this issue Jun 12, 2024 · 15 comments

Comments

@weshinsley
Contributor

Describe the bug
Not sure if this is an issue with cmdstanr, or cmdstan - and is something of a design problem, rather than a bug per se.

When we run

mod <- cmdstanr::cmdstan_model('code.stan', dir = path)

These two files (and perhaps others) get modified within the cmdstan installation :-
cmdstan-2.35.0\src\cmdstan\main.o
cmdstan-2.35.0\stan\lib\stan_math\make\ucrt

Therefore, this can only work if users have write-access to the cmdstan installation - something we need to avoid. See below.

To Reproduce
Run any stan model - we used the usual one in the start-up documentation, and put it in code.stan, and then...

path <- tempfile()
dir.create(path)
mod <- cmdstanr::cmdstan_model('code.stan', dir = path)

The above two files get modified, and their date will be changed.

Expected behavior
We should not have to grant users write-access to the cmdstan installation for them to run a model.

Operating system
Windows 10

CmdStanR version number
8.0.1

Additional context
We are running Stan in an HPC cluster context, and the issue causes us two problems:-

  1. It requires all users to have write-access to the installation of cmdstan, otherwise they get an access denied error writing the ucrt file and their job fails.

  2. Further, multiple users, or even the same user, might run several Stan jobs on the same node at the same time. These jobs will conflict with each other if a shared cmdstan install folder is both read from and written to as part of the process.

We definitely do not want to install a new separate private cmdstan installation for every cluster job, and it's surprising to us that running a model makes actual file changes within the cmdstan installation folders. Is there a way of ensuring this happens elsewhere? We had hoped providing a dir argument would allow this.

@weshinsley weshinsley added the bug Something isn't working label Jun 12, 2024
@andrjohns
Collaborator

These files are not created/modified every time that a cmdstan model is executed, only the first time. If you install cmdstan and compile a model, these files will be created and will not be modified in subsequent calls.

@andrjohns andrjohns removed the bug Something isn't working label Jun 12, 2024
@weshinsley
Contributor Author

weshinsley commented Jun 12, 2024

We are observing something different to that - here is the relevant directory:-

 Directory of I:\cmdstan\cmdstan-2.35.0\stan\lib\stan_math\make

11/06/2024  14:03    <DIR>          .
11/06/2024  14:03    <DIR>          ..
03/06/2024  15:43             2,061 clang-tidy
03/06/2024  15:43            15,664 compiler_flags
03/06/2024  15:43               818 cpplint
03/06/2024  15:43               305 dependencies
03/06/2024  15:43            11,248 libraries
03/06/2024  15:43             1,137 standalone
03/06/2024  15:43             7,120 tests
12/06/2024  18:33                16 ucrt
               8 File(s)         38,369 bytes

And after running the same model again:-

 Directory of I:\cmdstan\cmdstan-2.35.0\stan\lib\stan_math\make

11/06/2024  14:03    <DIR>          .
11/06/2024  14:03    <DIR>          ..
03/06/2024  15:43             2,061 clang-tidy
03/06/2024  15:43            15,664 compiler_flags
03/06/2024  15:43               818 cpplint
03/06/2024  15:43               305 dependencies
03/06/2024  15:43            11,248 libraries
03/06/2024  15:43             1,137 standalone
03/06/2024  15:43             7,120 tests
12/06/2024  22:13                16 ucrt
               8 File(s)         38,369 bytes

and if I remove my permissions so I don't have write-access to this directory, then I get this error early in the job.

/bin/sh: line 1: stan/lib/stan_math/make/ucrt: Permission denied

@andrjohns
Collaborator

@WardBrian it looks like the UCRT flag cache step is actually still running the detection and write step even when the file already exists

@weshinsley to work around this for now, you can delete/comment L194-204 in stan/lib/stan_math/make/compiler_flags:

  make/ucrt:
    pound := \#
    UCRT_STRING := $(shell echo '$(pound)include <windows.h>' | $(CXX) -E -dM -  | $(STR_SEARCH) _UCRT)
    ifneq (,$(UCRT_STRING))
      IS_UCRT ?= true
    else
      IS_UCRT ?= false
    endif
    $(shell echo "IS_UCRT ?= $(IS_UCRT)" > $(MATH)make/ucrt)

  include make/ucrt

And add the following to your make/local file (you're using rtools44, so we know it's UCRT):

IS_UCRT=true

@weshinsley
Contributor Author

weshinsley commented Jun 12, 2024

Thanks - that seems to solve the problem for cmdstan-2.35.0\stan\lib\stan_math\make\ucrt - but cmdstan-2.35.0\src\cmdstan\main.o is still getting rewritten every time. If I don't have write-access there, I get...

Assembler messages:
Fatal error: can't create src/cmdstan/main.o: Permission denied

make: *** [make/program:14: src/cmdstan/main.o] Error 1

@weshinsley
Contributor Author

(I searched the entire set of files for ones with a date change, and it was only those two examples. I am not sure if anything else gets created/deleted as part of the build process; only these two gave me permission problems)

@weshinsley
Contributor Author

Also just to say - thank you for the rapid responses/suggestions with this. I am on UK time so signing off tonight, but will try any suggestions in the morning and report back.

@WardBrian
Member

There are two issues here:

  • in general, cmdstan’s current build system is just not set up to work without write access to its own directory. See “Error compiling a simple model with read-only access to CmdStan” (cmdstan#1175) for some discussion. An alternative build system using something like cmake, which would allow for this, is on our wish list.
  • separate from the fact that it expects write access in general, it may also be the case that it is writing in some situations where it shouldn’t. The UCRT file is one such example.

The first will continue to give you headaches even if we fix all the examples of the second. For example, the first time someone wants to build a model with multithreading support, cmdstan will attempt to create a src/cmdstan/main_threads.o. With our current build system, there is really no good way around it. In a shared environment, the best you could currently do is probably to keep one copy of the Stan math library somewhere that all users point their cmdstan/stan installation at, and give each user their own cmdstan in a writable location.

@weshinsley
Contributor Author

Wouldn't each user potentially have to have a separate cmdstan install per cluster job that might concurrently run? For users running a lot of jobs, that's potentially quite a burden.

If writing those .o files in some other working directory is not possible, then perhaps the way for now (hinted at in the other issue I think) is to have our shared read-only cmdstan and make a complete copy of it in a temp directory at the start of every stan cluster job, then delete that after the job has finished.
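The copy-per-job idea could be sketched roughly as follows. This is only an illustration under several assumptions: a POSIX shell is available on the node, the job locates CmdStan through the CMDSTAN environment variable, and the function name and paths are hypothetical rather than anything provided by cmdstan or cmdstanr.

```shell
#!/bin/sh
# Sketch: run one job against a private, writable copy of a read-only CmdStan.
# run_with_private_cmdstan is a hypothetical helper, not part of cmdstan.

run_with_private_cmdstan() {
    shared="$1"; shift
    # A fresh directory per job, so concurrent jobs never share object files.
    scratch=$(mktemp -d) || return 1
    # Copy the whole installation tree into the scratch directory.
    cp -R "$shared" "$scratch/cmdstan" || { rm -rf "$scratch"; return 1; }
    # The copy may inherit read-only modes from the source; make it writable.
    chmod -R u+w "$scratch/cmdstan"
    # Run the job command with CMDSTAN pointing at the private copy
    # (assumption: the job reads the installation path from CMDSTAN).
    CMDSTAN="$scratch/cmdstan" "$@"
    status=$?
    # Delete the private copy once the job has finished.
    rm -rf "$scratch"
    return "$status"
}

# Hypothetical usage:
#   run_with_private_cmdstan /shared/cmdstan-2.35.0 Rscript fit_model.R
```

The cost is one full copy of the installation per job, which is the storage and time burden discussed in this thread.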

@WardBrian
Member

That might be necessary if different models are needed at different points during the cluster run.

In my experience (with cmdstanpy, not cmdstanr, but I imagine the same is possible) I have usually compiled my model once, on the head node, and then just copied the executable to each worker node and instantiated my cmdstanmodel object with just that exe, not the path to the .stan file. This will not invoke the cmdstan build system at all during the job, unlike passing the path to the stan file, which does invoke it to make sure everything is “up to date”.
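At the command-line level, the compile-once approach amounts to running the already-built executable directly on each worker. The following is a minimal sketch, assuming a POSIX shell and the standard CmdStan executable arguments (`sample`, `data file=`, `output file=`); the helper name and all paths are hypothetical.

```shell
#!/bin/sh
# Sketch: run a model executable that was compiled earlier (e.g. on the head node).
# run_prebuilt_model and its arguments are hypothetical names for illustration.

run_prebuilt_model() {
    exe="$1"      # path to the compiled model executable
    data="$2"     # path to the data file for this job
    out_dir="$3"  # per-job writable output directory
    mkdir -p "$out_dir" || return 1
    # Only the output directory is written to; no build step is involved.
    "$exe" sample data file="$data" output file="$out_dir/output.csv"
}

# Hypothetical usage:
#   run_prebuilt_model /shared/models/code /scratch/job42/data.json /scratch/job42/out
```

Because nothing is rebuilt here, the job itself does not need write access to the cmdstan installation; only the earlier compile step does.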

@andrjohns
Collaborator

Given that cmdstan_main.o is just being repeatedly linked against when compiling a model, could we instead build cmdstan_main_* as a shared/static library when build is called? That way the model compilation shouldn't modify the file

@WardBrian
Member

There are technically combinatorially many main.o options - THREADS, MPI, OPENCL, and NORANGE, plus all possible combinations of them…

@weshinsley
Contributor Author

> In my experience (with cmdstanpy, not cmdstanr, but I imagine the same is possible) I have usually compiled my model once, on the head node, and then just copied the executable to each worker node and instantiated my cmdstanmodel object with just that exe, not the path to the .stan file.

This helps once you've got the exe, but to build it, you have to have write-access - hence your own private cmdstan copy on the head-node, so that it can change its own main.o when it builds your model, and so that no-one else uses the same cmdstan installation at the same time for building their model.

While this could be configured, it's very unusual to open up write-access to tools in that way, and with a lot of users, at 2.5 GB per current cmdstan installation, it is likely to cause a lot of messy storage use and not be scalable.

Ultimately, just being able to set a working directory for the build seems to me the ideal angle to try and improve this. Clearly just in our conversation here, the current approach of stan(cmd) is difficult to work with, and HPC cluster use of Stan is only going to increase.

@WardBrian
Member

> Ultimately, just being able to set a working directory for the build seems to me the ideal angle to try and improve this.

Yes, this would be ideal. Unfortunately re-writing the build system is difficult, but some work is ongoing.

I think one should think of the source files in the CmdStan installation as part of the user's code, for now. I believe the vast majority of the file size is from the math library and its dependencies, so having a built version of that somewhere which other installations can point to should cut down on the directory issues.

@weshinsley
Contributor Author

OK - are there any docs/pointers for how to build the stan math library on its own, and then get that wired into cmdstan for when it builds your model?

It sounds like a way forward if that can reduce the size of the cmdstan installation we'd need to replicate for all jobs to something much more minimal. It would also have the advantage of speeding up model-building, which is taking a few minutes in our tests so far.

@WardBrian
Member

The minimal set up is:

  • clone https://github.com/stan-dev/math to the desired location
  • someone with write permissions there should cd to that location and run make -f make/standalone math-libs to build the dependencies
  • people who want to use this version from inside cmdstan can set a variable called MATH to point to that location, either on the command line, as an environment variable, or in $cmdstan/make/local

The same could be done with stan-dev/stan and the STAN setting if all your users set PRECOMPILED_HEADERS=false; otherwise writes to the stan subfolder are necessary.
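The steps above might be scripted along these lines. This is a sketch only: the clone and build commands are the ones listed above, while the helper name and the /shared/stan-math path are hypothetical. The trailing slash on MATH is an assumption based on the makefile excerpt earlier in this thread, which concatenates `$(MATH)make/ucrt` directly.

```shell
#!/bin/sh
# One-time step, run by someone with write access to the shared location
# (/shared/stan-math is a hypothetical path):
#   git clone https://github.com/stan-dev/math /shared/stan-math
#   cd /shared/stan-math && make -f make/standalone math-libs

# Per-installation step: record MATH in <cmdstan>/make/local.
# set_math_in_local is a hypothetical helper, not part of cmdstan.
set_math_in_local() {
    cmdstan_dir="$1"
    math_dir="${2%/}/"   # strip one trailing slash if present, then add one
    local_file="$cmdstan_dir/make/local"
    mkdir -p "$cmdstan_dir/make" || return 1
    touch "$local_file" || return 1
    # If the file does not end in a newline, add one so lines do not merge.
    if [ -s "$local_file" ] && [ -n "$(tail -c 1 "$local_file")" ]; then
        echo >> "$local_file"
    fi
    # Append the setting only if the identical line is not already present.
    if ! grep -qxF "MATH=$math_dir" "$local_file"; then
        printf 'MATH=%s\n' "$math_dir" >> "$local_file"
    fi
}

# Hypothetical usage:
#   set_math_in_local "$HOME/cmdstan-2.35.0" /shared/stan-math
```

Setting MATH on the command line or as an environment variable, as mentioned above, avoids editing make/local at all.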


3 participants