Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad interaction between selinux and an APFS-backed directory shared over virtiofs #70

Open
cfergeau opened this issue Dec 6, 2023 · 17 comments

Comments

@cfergeau
Copy link
Collaborator

cfergeau commented Dec 6, 2023

Creating an issue for the record, when using virtio-fs:

xattr/setfattr is working and persisted across reboots, but with one big caveat: on linux, security.selinux is special, and can be set even on read-only files (0444 permissions). On macOS, this is not true, if the file is read-only, xattr can't be used to set the security.selinux xattr.

This causes issues with podman-remote run -v /Users/teuf/some-folder:/some/path:Z as this this involves some setfattr calls which failed (Error: error preparing container a1961ae610561be62638aef839ad9f7f2735a16f3fcf15ece95b09865167e415 for attach: setxattr /Users/teuf/dev/crc/.git/objects/00/63a6fe0324d63105a6649c0b46fb27ad767b17: permission denied).

To avoid this, the filesystem in the VM can be mounted with: mount -t virtiofs -o context=system_u:object_r:container_file_t:s0 and the container started with podman-remote run -v /Users/teuf/dev:/test --rm registry.access.redhat.com/ubi8 ls -alZ /test/crc from macOS

@cfergeau
Copy link
Collaborator Author

cfergeau commented Dec 6, 2023

When sharing a directory between a linux guest and the macos host with virtiofs, the behaviour for setting xattrs on a read-only file inside the guest differs between the linux fs and the virtiofs share.

For example, on a fedora coreos VM:

[core@podman ~]$ mkdir ~/selinux-test-ro
[core@podman ~]$ chmod 400 selinux-test-ro/
[core@podman ~]$ getfattr -m - -d /home/core/selinux-test-ro/
getfattr: Removing leading '/' from absolute path names 
# file: home/core/selinux-test-ro/ security.selinux="unconfined_u:object_r:user_home_t:s0" 
[core@podman ~]$ setfattr -n security.selinux -v "system_u:object_r:var_log_t:s0" selinux-test-ro 
[core@podman ~]$ getfattr -m - -d /home/core/selinux-test-ro getfattr: Removing leading '/' from absolute path names 
# file: home/core/selinux-test-ro/ security.selinux="system_u:object_r:var_log_t:s0" 


# the virtiofs share is mounted on /mnt: 
# vfkit-share on /mnt type virtiofs (rw,relatime,seclabel) 
[core@podman ~]$ mkdir /mnt/selinux-test 
[core@podman ~]$ mkdir /mnt/selinux-test-ro 
[core@podman ~]$ chmod 400 /mnt/selinux-test-ro/ 
[core@podman ~]$ getfattr -m - -d /mnt/selinux-test 
[core@podman ~]$ getfattr -m - -d /mnt/selinux-test-ro 
[core@podman ~]$ setfattr -n security.selinux -v "system_u:object_r:var_log_t:s0" /mnt/selinux-test 
[core@podman ~]$ getfattr -m - -d /mnt/selinux-test 
getfattr: Removing leading '/' from absolute path names 
# file: mnt/selinux-test security.selinux="system_u:object_r:var_log_t:s0" 

[core@podman ~]$ setfattr -n security.selinux -v "system_u:object_r:var_log_t:s0" /mnt/selinux-test-ro/ 
setfattr: /mnt/selinux-test-ro/: Permission denied 
[core@podman ~]$ getfattr -m - -d /mnt/selinux-test-ro 
[core@podman ~]$ 

This is an issue as when using podman run --volume with the z|Z option, a virtiofs share is used under the hood, and podman tries to change the selinux labels of all the files under the shared volume. The selinux label is changed by setting the security.selinux xattr, and it's unexpected that it's not possible for read-only files.

@rhatdan
Copy link

rhatdan commented Dec 6, 2023

Can the return be changes to ENOSUP rather then EPERM? Podman and container tools now handle ENOSUP on :Z and :z gracefully.

@rhatdan
Copy link

rhatdan commented Dec 6, 2023

BTW It seems like the MAC version makes more sense then the Linux one.

@rhvgoyal
Copy link

rhvgoyal commented Dec 6, 2023

virtiofs is simply a passthrough filesystem. If MacOS host does not allow setfattr operation, it will return that error back to the client.

So what you are seeing is not really related to virtiofs. Its just a setfattr behavior difference between macos and linux.

I find it odd that one can set xattr on read-only file. Say I want to change selinux label of a read-only file so that a certain process can 't get access to it. How would I do that?

My understanding of linux read-only mode is that it is about file data. You can't modify file's data. But it does not prohibit you from changing file's metadata.

If you are trying to make metadata also immutable, I think that will be "immutable" file. And one can use "chattr" for that.

So I would argue that tcalling it "virtio-fs limitation" is not the correct title for this issue.

@cfergeau cfergeau changed the title virtio-fs limitations Bad interactions between selinux and an APFS-backed directory shared over virtiofs Dec 7, 2023
@cfergeau
Copy link
Collaborator Author

cfergeau commented Dec 7, 2023

Testing this directly on the mac gives:

% xattr -w security.selinux 'system_u:object_r:var_log_t:s0' ./selinux-test-ro
xattr: [Errno 13] Permission denied: './selinux-test-ro'

errno 13 is EACCES, not EPERM (errno 1)


> So I would argue that tcalling it "virtio-fs limitation" is not the correct title for this issue.

Fair enough, I renamed it to " Bad interactions between selinux and an APFS-backed directory shared over virtiofs" which should be more accurate.

@cfergeau cfergeau changed the title Bad interactions between selinux and an APFS-backed directory shared over virtiofs Bad interaction between selinux and an APFS-backed directory shared over virtiofs Dec 7, 2023
@baude
Copy link
Collaborator

baude commented Dec 7, 2023

is there a solution in mind?

@cfergeau
Copy link
Collaborator Author

cfergeau commented Dec 7, 2023

So far I was thinking that for this one we were highly dependent on Apple, but seeing these comments from Dan Walsh:

Can the return be changes to ENOSUP rather then EPERM? Podman and container tools now handle ENOSUP on :Z and :z gracefully.
BTW It seems like the MAC version makes more sense then the Linux one.

maybe podman could be changed to also ignore EACCES when doing these mappings?

@rhatdan
Copy link

rhatdan commented Dec 7, 2023

Can we check if the underlying file system is read-only?

@rhatdan
Copy link

rhatdan commented Dec 7, 2023

Why do you have read-only content in your homedir?

@baude
Copy link
Collaborator

baude commented Dec 7, 2023

Why do you have read-only content in your homedir?

is it not more than a little likely users will have this kind of thing whether they know it or not ?

@rhatdan
Copy link

rhatdan commented Dec 8, 2023

Well they have to have a read-only partition on their homedir, and they have to attempt to share it into a container which they want to trigger a relabel. The issue is there is not a clear place to say ignore the relabel on ENOACCESS, if the partition is mounted read-only on a virtiofsd.

@cfergeau
Copy link
Collaborator Author

cfergeau commented Dec 8, 2023

Why do you have read-only content in your homedir?

Files in .git/objects/ and in ~/go/pkg/mod/ are read-only on my mac system, they were created this way by git/golang.

@rhatdan
Copy link

rhatdan commented Dec 9, 2023

Could you look at the labels of those files within the VM with a ls -lZ to see what label they had?

@cfergeau
Copy link
Collaborator Author

These read-only files exist on the macOS side, on the APFS filesystem it uses, they don't have any labels nor xattrs. podman-machine shares the full home directory over virtio-fs, and when the user of podman wants to expose one of these dirs in a container using -v...:Z, the failure occurs.

@rhatdan
Copy link

rhatdan commented Dec 13, 2023

I understand, the question is if they are not relabeled will they fail to be used within the Container, since SELinux will hit them with a permission denied when a process within a container attempts to read the content, If I know the label, then I can at least tell if SELinux will block reading.

As far as the Relabel, we could change the code to only warn on failures to relabel, but this could get mighty noisy.
No ideal solution to this issue.

@rhatdan
Copy link

rhatdan commented Jan 18, 2024

@slp could you check if libkrun has the same problems?

@slp
Copy link

slp commented Jan 18, 2024

@slp could you check if libkrun has the same problems?

No, it doesn't. In the virtio-fs implementation in libkrun we deliberately exclude macOS from participating in file permission decisions by storing the permission and ownership bits from Linux PoV as extended attributes:

bash-5.2# mkdir /work/ro
bash-5.2# chmod 400 /work/ro
bash-5.2# ls -l /work
total 0
dr-------- 2 root root 64 Jan 18 18:23 ro
bash-5.2# getfattr -m - -d /work/ro
bash-5.2# setfattr -n security.selinux -v "system_u:object_r:var_log_t:s0" /work/ro
bash-5.2# getfattr -m - -d /work/ro
getfattr: Removing leading '/' from absolute path names
# file: work/ro
security.selinux="system_u:object_r:var_log_t:s0"

bash-5.2#
exit
slp@Sergios-MacBook-Air selinux-test % ls -l
total 0
drwx------@ 2 slp  staff  64 Jan 18 19:23 ro
slp@Sergios-MacBook-Air selinux-test % xattr -l ro
security.selinux: system_u:object_r:var_log_t:s0
user.containers.override_stat: 0:0:040400

I'm afraid that if the virtio-fs implementation in vfkit tries to store the permission bits in Linux as actual permission bits in APFS we'll find plenty of corner cases like this, as the filesystem semantics between Linux and macOS aren't 1:1.

Probably with libkrun we simply noticed those issues earlier because the root filesystem is also mounted via virtio-fs, so we already adopted this alternative strategy for storing the permission and ownership bits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants