Cache incorrectly initialized when no space left on device #2

Open
ahendriksen opened this issue Apr 21, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@ahendriksen
Contributor

Describe the bug
The volume hosting my .cache directory was almost full, so constructing the cache failed (see Backtrace 0). After I freed up space and reran nixglhost, it failed again, this time with a different error (see Backtrace 1). I had expected it to rebuild the cache first.

To resolve, I deleted the cache and everything worked fine again:

$ rm -rf .cache/nix-gl-host
$ rm -rf .cache/nix-gl-host.lock
$ nixglhost -p
$ nixglhost nvidia-smi

Backtrace 0

$ nixglhost nvidia-smi
Traceback (most recent call last):
  File "/nix/store/c1vb2z3c64i0sd92iz7fv0lb720qcvhb-python3-3.10.6/lib/python3.10/shutil.py", line 815, in move
    os.rename(src, real_dst)
OSError: [Errno 18] Invalid cross-device link: '/tmp/tmpdf59_vtp/nix-gl-host' -> '/home/ahendriksen/.cache/nix-gl-host'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 680, in <module>
    ret = main(args)
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 628, in main
    new_env = nvidia_main(cache_dir, host_dsos_paths, args.print_ld_library_path)
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 576, in nvidia_main
    shutil.move(tmp_cache_dir, os.path.split(cache_dir)[0])
  File "/nix/store/c1vb2z3c64i0sd92iz7fv0lb720qcvhb-python3-3.10.6/lib/python3.10/shutil.py", line 831, in move
    copytree(src, real_dst, copy_function=copy_function,
  File "/nix/store/c1vb2z3c64i0sd92iz7fv0lb720qcvhb-python3-3.10.6/lib/python3.10/shutil.py", line 558, in copytree
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
  File "/nix/store/c1vb2z3c64i0sd92iz7fv0lb720qcvhb-python3-3.10.6/lib/python3.10/shutil.py", line 512, in _copytree
    raise Error(errors)
shutil.Error: [('/tmp/tmpdf59_vtp/nix-gl-host/[.. snip ..]',
'/home/ahendriksen/.cache/nix-gl-host/[.. snip ..]',
'[Errno 28] No space left on device'),
[.. snip more such tuples ..] ]

Backtrace 1

[nix-shell:~]$ nixglhost nvidia-smi
Traceback (most recent call last):
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 680, in <module>
    ret = main(args)
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 628, in main
    new_env = nvidia_main(cache_dir, host_dsos_paths, args.print_ld_library_path)
  File "/nix/store/0vgv4lwdr98y5m7ffxi3dlf5s3azn02f-nix-gl-host-0.1/bin/nixglhost", line 583, in nvidia_main
    assert nix_gl_ld_library_path, "The nix-host-gl LD_LIBRARY_PATH is not set"
AssertionError: The nix-host-gl LD_LIBRARY_PATH is not set

To Reproduce
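I have not verified these steps, but I guess: limit the space available in the volume hosting the .cache directory, run nixglhost -p so that building the cache fails, free up space, and then run nixglhost -p a second time.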

Expected behavior

I had expected nixglhost to rebuild the cache successfully on the second run of nixglhost -p (after I had made enough space).
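
One way to get this behavior is to make sure a failed build can never leave a partial cache in place. The following is only a sketch, not nixglhost's actual code: `install_cache` and the `build_cache` callable are hypothetical names. It stages the cache in a temporary directory next to the final location (so both are on the same filesystem and `os.rename` cannot fail with "Invalid cross-device link") and moves it into place only after the build has finished.

```python
import os
import shutil
import tempfile


def install_cache(build_cache, cache_dir):
    """Build the cache in a sibling staging dir, then rename it into place.

    `build_cache` is a hypothetical callable that populates the directory
    it is given; it stands in for whatever actually generates the cache.
    """
    parent = os.path.dirname(os.path.abspath(cache_dir))
    os.makedirs(parent, exist_ok=True)
    # Same parent directory means same filesystem, so the rename below does
    # not need a file-by-file copy that could fail halfway through.
    staging = tempfile.mkdtemp(prefix=".cache-staging-", dir=parent)
    try:
        build_cache(staging)
        # Drop any leftover (possibly incomplete) cache from an earlier run.
        shutil.rmtree(cache_dir, ignore_errors=True)
        os.rename(staging, cache_dir)
    except BaseException:
        # A failed build (e.g. "No space left on device") is cleaned up,
        # so the next run starts from scratch instead of a partial cache.
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

With this shape, a run that hits "No space left on device" leaves no cache directory behind, so the next run rebuilds it. Note that removing the old cache and renaming the new one are two separate steps, so this is not fully atomic when a cache already exists.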

Additional context

Thanks for developing this tool! I have only run it once and I already have the feeling it will be invaluable!

@ahendriksen ahendriksen added the "bug" label Apr 21, 2023
@ahendriksen ahendriksen changed the title from "More incomplete cache issues" to "Cache incorrectly initialized when no space left on device" Apr 21, 2023
picnoir added a commit that referenced this issue Apr 22, 2023
@picnoir
Member

picnoir commented Apr 22, 2023

Haha, I did not see that one coming. I see what's happening here.

Thanks for the detailed bug report! I need to try to find a way to test that.


Notes to future self: create small tmpfs for cache dir, couple that with large tmpdir to repro.
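
Where mounting a tmpfs is not practical in a test, the same failure mode might be simulated without one. A rough sketch under stated assumptions: `copy_until_full` and `simulate_failed_cache_copy` are made-up helper names, and the fake `.so` files are placeholders. It calls `shutil.copytree` directly, since that is what `shutil.move` falls back to in Backtrace 0 once `os.rename` fails across devices, and it passes a copy function that starts raising "No space left on device" after a given number of files.

```python
import errno
import os
import shutil
import tempfile


def copy_until_full(max_files):
    """Return a copy function that raises ENOSPC after `max_files` copies.

    This stands in for a cache volume running out of space.
    """
    copied = 0

    def _copy(src, dst):
        nonlocal copied
        if copied >= max_files:
            raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
        copied += 1
        return shutil.copy2(src, dst)

    return _copy


def simulate_failed_cache_copy(n_files=3, max_files=1):
    """Copy a fake cache tree, fail midway, and report what is left in dst."""
    src = tempfile.mkdtemp()
    dst = os.path.join(tempfile.mkdtemp(), "nix-gl-host")
    for i in range(n_files):
        with open(os.path.join(src, f"lib{i}.so"), "w") as f:
            f.write("fake DSO")
    try:
        shutil.copytree(src, dst, copy_function=copy_until_full(max_files))
    except shutil.Error as err:
        # copytree collects per-file failures and raises them as one list.
        failures = err.args[0]
    else:
        failures = []
    return sorted(os.listdir(dst)), len(failures)
```

After the simulated failure the destination directory exists but holds only some of the files, which is the kind of half-initialized cache a later run would need to detect and rebuild.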

[Edit]: thanks for the kind words btw!
