$ docker run -ti fedora:rawhide bash
[root@fcf191dbace3 /]# dnf update filesystem
Fedora - Rawhide - Developmental packages f 708 kB/s | 60 MB 01:26 0
Last metadata expiration check: 0:01:00 ago on Fri Feb 23 12:26:00 2018.
Dependencies resolved.
============================================================================
 Package Arch Version Repository Size
============================================================================
Upgrading:
 filesystem x86_64 3.8-2.fc28 rawhide 1.1 M
Transaction Summary
============================================================================
Upgrade 1 Package
Total download size: 1.1 M
Is this ok [y/N]: y
Downloading Packages:
filesystem-3.8-2.fc28.x86_64.rpm 520 kB/s | 1.1 MB 00:02
----------------------------------------------------------------------------
Total 110 kB/s | 1.1 MB 00:10
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Running scriptlet: filesystem-3.8-2.fc28.x86_64 1/1
  Preparing : 1/1
  Upgrading : filesystem-3.8-2.fc28.x86_64 1/2
Error unpacking rpm package filesystem-3.8-2.fc28.x86_64
Error unpacking rpm package filesystem-3.8-2.fc28.x86_64
error: unpacking of archive failed on file /proc: cpio: chown
filesystem-3.8-2.fc28.x86_64 was supposed to be installed but is not!
  Verifying : filesystem-3.8-2.fc28.x86_64 1/2
filesystem-3.5-1.fc28.x86_64 was supposed to be removed but is not!
  Verifying : filesystem-3.5-1.fc28.x86_64 2/2
Failed:
  filesystem.x86_64 3.8-2.fc28
Error: Transaction failed
This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle. Changing version to '29'.
From the discussion in mock pull request [1], related to Fedora Toolbox, it seems that a work-around for this issue could be to list /proc:/sys in %_netsharedpath. From /usr/lib/rpm/macros:
# A colon separated list of paths where files should *not* be installed.
# Usually, these are network file system mount points.
#
#%_netsharedpath
Would it be possible to make the /proc directory %ghost or something, and let some scriptlet create it?
[1] https://github.com/rpm-software-management/mock/pull/234
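To make the %_netsharedpath idea concrete, here is a rough Python sketch of the kind of matching it implies: the macro value is a colon-separated list of paths, and a packaged file is left alone when it sits at or below one of them. The function names are hypothetical and the whole-component prefix rule is an assumption made for illustration; this is not rpm's code.

```python
def parse_netsharedpath(value):
    """Split a colon-separated macro value such as "/proc:/sys" into paths."""
    return [p.rstrip("/") or "/" for p in value.split(":") if p]


def is_skipped(path, shared_paths):
    """Return True if path equals or lies below one of the shared paths.

    Assumption: matching works on whole path components, so "/process"
    is not treated as being under "/proc".
    """
    path = path.rstrip("/") or "/"
    for prefix in shared_paths:
        if prefix == "/":
            return True
        if path == prefix or path.startswith(prefix + "/"):
            return True
    return False
```

With the value "/proc:/sys", both "/proc" and "/sys/kernel" would be skipped, while "/usr" would still be installed as usual.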
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle. Changing version to '31'.
*** Bug 1837787 has been marked as a duplicate of this bug. ***
An easy reproducer: dnf upgrade filesystem --releasever=<N+1>
It may be off-topic, dunno, but glibc-common installation fails too if filesystem installation fails:
...
Error unpacking rpm package filesystem-3.14-2.fc32.x86_64
  Installing : basesystem-11-9.fc32.noarch 14/142
error: unpacking of archive failed on file /proc: cpio: chown
error: filesystem-3.14-2.fc32.x86_64: install failed
  Installing : coreutils-common-8.32-4.fc32.1.x86_64 15/142
....
  Installing : glibc-common-2.31-2.fc32.x86_64 25/142
Error unpacking rpm package glibc-common-2.31-2.fc32.x86_64
  Running scriptlet: glibc-2.31-2.fc32.x86_64 26/142
error: unpacking of archive failed on file /usr/lib/locale/C.utf8: cpio: mkdir
error: glibc-common-2.31-2.fc32.x86_64: install failed
...
This is happening again in rawhide now. Any chance this can be addressed?
https://pipelines.actions.githubusercontent.com/0v9771Hs5cPQUVTnXwXZ0gqk9opLP1Q0hyLl3htdE40JtWxNLO/_apis/pipelines/1/runs/1145/signedlogcontent/3?urlExpires=2020-08-03T18%3A27%3A13.4572678Z&urlSigningMethod=HMACV1&urlSignature=tS6uOPk0ieLeYl8hNtkofiYbaxPtrXgV9%2BI120WjE0Q%3D
2020-08-03T17:53:59.0319645Z Failed:
2020-08-03T17:53:59.0320145Z filesystem-3.14-2.fc32.x86_64 filesystem-3.14-3.fc33.x86_64
2020-08-03T17:53:59.0320227Z
2020-08-03T17:53:59.1949990Z Error: Transaction failed
At minimum, the rawhide container should be respun every time filesystem gets updated.
This is not a bug in the filesystem package. The filesystem package states that /proc is owned by root:root, and thus rpm tries to fix the directory ownership when upgrading the filesystem package. But /proc is a mount point owned by a name space where the user ID does not map to a superuser in the container. In other words, this is a limitation of how UID mapping is configured when spawning the container. *** This bug has been marked as a duplicate of bug 1708249 ***
Well, I've seen similar problems in docker and systemd-nspawn (with mock). And the problems existed even when user namespaces were not enabled.
And was /proc owned by root:root inside the container? If it was, then rpm could be taught not to call the idempotent chown(). I tried it with docker now. Because docker still does not support user name space, running it as a normal user fails on DNF insisting on being run by super user, and running it as root ("docker run -u root") works for me. Exactly as recommended in the bug #1708249 ("If you need to install/update the filesystem package you have to run podman/buildah as root").
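The "skip the idempotent chown()" idea can be sketched in a few lines of Python. This is only an illustration of the logic, not rpm's implementation, and the helper name chown_if_needed is hypothetical. It helps only in the case asked about above, where the directory already has the ownership the package declares; when /proc shows up as nobody:nobody, the chown would still be attempted and would still fail.

```python
import os


def chown_if_needed(path, uid, gid):
    """Change ownership only when it differs; return True if chown was called."""
    st = os.lstat(path)
    if st.st_uid == uid and st.st_gid == gid:
        # Ownership already matches, so skip the (possibly forbidden) chown().
        return False
    os.chown(path, uid, gid, follow_symlinks=False)
    return True
```

For example, chown_if_needed("/proc", 0, 0) would return False without touching anything in a container where /proc is already owned by root:root.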
I am too lazy to reproduce the original issues with docker (no docker on my box these days, sorry). Comment #2 is running mock from toolbox, and it failed. But that falls into the same category you describe (toolbox should be mostly unprivileged). So accepted. Though please don't claim it isn't a bug (somewhere). No matter whether privileged or non-privileged, buildah or normal 'podman run': when I'm running a container and I am able to update any other package, I should be able to update filesystem as well. And we should do something about it, shouldn't we?
The major benefit of user namespaces in OpenShift is that we can actually run "root" (in container) processes and install/update RPM packages. Otherwise people tend to use our images -- but install external software (PyPI, rubygems,...) instead of RPMs. We should at least give a proper name to this problem, and create some guideline so we don't create package updates that can not be installed/updated inside non-privileged containers. In the worst case, it can just mean "no filesystem.rpm updates during one Fedora release" guideline. But I personally don't see anything wrong on relaxing filesystem's %files section with %ghost (+ %post scriptlet). Or inventing e.g. some new %files %verify-like option.
There is nothing special about the "filesystem" package. You can observe the same failure with any package when you mount an unshared filesystem over one of its files. Therefore changing this package is not a fix. It's just a workaround. And thus I understand why the package maintainer does not want to do it. In addition, your proposal with RPM post scripts is also bad, because it goes against the trend of converting scripts into a declarative definition. That is required by OSTree. In my opinion it's a deficiency in the rpm tool. Changing the ownership of an i-node on the proc pseudo file system does not make sense, because the change disappears after remounting the file system. The meaning of the /proc entry in the filesystem package is to create /proc on the root file system, not on the proc file system. Thus in an ideal world, rpm should recognize that it's a pseudo file system, enter a new mount name space, unmount /proc there, install the /proc entry (this time on the root file system), and exit the name space. Otherwise, the mission of rpm is not completed. Imagine you delete /proc from the root file system, then successfully install the filesystem package; after rebooting, mounting /proc will fail because there is no /proc mount point. So if you really want to fix it, you need to reassign this bug to the rpm component.
> In addition, your proposal with RPM post scripts is also bad, because it
> goes against a trend with converting scripts into a declarative
> definition. It's required by OSTree.
I agree. It's ugly, but that's the only possible workaround (except for the %_netsharedpath hack, which is even uglier). Apart from being ugly, are there any practical problems in doing so?
> And thus I understand why the package maintainer does not want to do it.
I do as well. That's why I didn't submit a PR until now... filesystem seems to be the only package causing real-life headaches though, and I'm changing my mind: https://src.fedoraproject.org/rpms/filesystem/pull-request/5
> Thus in ideal world, rpm should recognize that it's a pseudo file
> system, enter a new mount name space, unmount /proc there, install /proc
> entry (this time on the root file system), and exit the name space.
This is a good way of thinking, thank you. (a) Many times we benefit from RPM touching the actual mount points (it is a very common practice to mount an empty volume on a package path and reinstall the package to pre-populate the data on the mount point). But (b) users may as well want to do what you propose even with normal mount points; does that make sense? Is this detail really worth special-casing a certain set of filesystems? Considering that the special filesystems may be mounted anywhere, is it also worth special-casing a concrete set of paths in RPM? I think it would be much easier to have a way to declare the purpose of certain paths, and keep them untouched by RPM if someone else takes care of them (as container environments take care of /proc, /sys, etc.).
Thus I wrote "recognize that it's a pseudo file system". The proposed exercise with mounts does not make sense for a normal file system, because a normal file system preserves the changes and, as you wrote, people want to slice the virtual file system among more file systems. I don't think rpm should special-case on a path (/proc). It should decide on a file system type (proc). You are right that it will be difficult.
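Deciding by file system type rather than by path could look roughly like the Python sketch below. It takes text in the /proc/mounts format and reports the type of whatever is mounted at a given path. The PSEUDO_FSTYPES set is an illustrative assumption, not a list taken from rpm, and the sketch assumes mount points without escaped characters such as spaces.

```python
# Assumption: an illustrative subset of pseudo file system types.
PSEUDO_FSTYPES = {"proc", "sysfs", "devtmpfs"}


def fstype_of_mount_point(path, mounts_text):
    """Return the type of the file system mounted last at path, or None."""
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, type, options, dump, pass
        if len(fields) >= 3 and fields[1] == path:
            fstype = fields[2]  # later entries shadow earlier mounts
    return fstype


def is_pseudo_mount(path, mounts_text):
    """True if path is a mount point of one of the pseudo file systems."""
    return fstype_of_mount_point(path, mounts_text) in PSEUDO_FSTYPES
```

On a live system the text could come from reading /proc/self/mounts; a tool working this way would treat /proc as special because its type is proc, wherever it happens to be mounted.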
This bug appears to have been reported against 'rawhide' during the Fedora 33 development cycle. Changing version to 33.
Clearing needinfo... as Petr said, this is not clearly a bug in the filesystem package, but rather a limitation of rpm and of how it handles some special situations with shared mount points. There is a possible workaround with ghosting the /proc and /sys dirs plus a lua post scriptlet... but it is just a workaround for a broader problem.
> as Petr said, it is not clear bug in filesystem package, more limitation
> of the rpm - and how it handles some special situations with shared
> mountpoints. There is possible workaround with ghosting /proc and /sys
> dirs and lua post scriptlet... but it is just workaround of broader
> problem.
Even though I sort of agree, we have no realistic way out other than accepting some work-around for the foreseeable future. At least I don't see anyone pushing the RPM upstream development forward WRT this issue. Yes, any package can be affected by this problem -- but I haven't seen any other complaint so far. The thing probably is that so far nobody does 'dnf update' in production environments during container _runtime_; the majority of use-cases run 'dnf' at container image _build time_. And at that point, it is really unlikely that there's any mount point other than /proc and /sys causing problems. I'm afraid delaying the workaround doesn't make sense in this case (if it does, please tell); it only causes unnecessary headaches for users.
Hi Pavel, LGTM and I understand the problem (at least I think so :) ) There was similar issue with setup package recently which was fixed in RPM. I wonder if it may cause some issues in installation process/Anaconda. Do we have test coverage for this? Is rawhide enough for now?
From what I've heard, Anaconda uses mock somehow, and mock installs the packages into a separate chroot where (at installation time) there are no preexisting mount points like /proc. The main difference between this case and the others is that here the packages are installed into the container from within the container itself; elsewhere we install from the outside with 'rpm --installroot <directory>'.
> Is rawhide enough for now?
Definitely. It would be nice to have this in ELN so the next RHEL is fixed.
> Is rawhide enough for now? From what I observe, we rarely update filesystem package through bodhi anyway.
> Do we have test coverage for this?
I doubt so, but testing this should be about as easy as:
$ podman run --rm -ti fedora:33 bash
[root@41d55ba8e20c /]# dnf distro-sync --releasever=34
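If someone wanted to automate such a check, the captured dnf output could be scanned for the two messages seen throughout this bug. The Python below is a minimal sketch under that assumption; the function names are hypothetical, and the patterns are taken from the logs quoted in the earlier comments rather than from any documented dnf or rpm interface.

```python
import re

# Matches lines such as:
#   error: unpacking of archive failed on file /proc: cpio: chown
UNPACK_ERROR = re.compile(
    r"unpacking of archive failed on file (?P<path>\S+): cpio: (?P<op>\w+)"
)


def unpack_failures(dnf_output):
    """Return (path, operation) pairs for every rpm unpack error in the output."""
    return [(m.group("path"), m.group("op")) for m in UNPACK_ERROR.finditer(dnf_output)]


def transaction_failed(dnf_output):
    """True if dnf reported a failed transaction."""
    return "Error: Transaction failed" in dnf_output
```

A test job could run the reproducer, feed its combined output to these helpers, and fail when transaction_failed() is true or when unpack_failures() mentions /proc or /sys.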
Adding myself to this, as I've just done a heap of LXC container upgrades from F32 -> F33 and every one failed with the upgrade of filesystem. I think this will probably stop dnf-automatic from doing its thing as well... Whilst it may not be a bug per se, it's certainly something that needs to be fixed, likely in all streams (Fedora and RHEL), especially as LXC / Docker etc. are becoming much more popular with time...
The proposed PR was merged some time ago, thank you @pzhukov! The fix is applied in F34+ at this moment. While I don't see any issue in merging it back to F33, doing a Bodhi update only to fix Bodhi updates would be a bit weird. It is questionable how important the use-case of upgrading Fedora N to Fedora N+1 _inside_ containers is. From my point of view, the fix should be backported to Fedora <= 33 if and only if we had to do the update anyway (because of something else). Otherwise, IMO this should be fixed in ELN.
With respect, I disagree. Fedora 32 -> 33 is a valid upgrade path and well within support lifetime windows. I would suggest that maybe backporting as far as Fedora 32 is a possibility - but kind of pushing the limit. I feel that backporting to at least Fedora 33 should be required - simply because it is currently the supported version. Everything newer is really pre-release.
(In reply to Pavel Raiskup from comment #25)
> While I don't see any issue in merging back to F33, doing the Bodhi update
> only to fix Bodhi updates would be a bit weird. It is questionable how
> much important is the use-case of upgrading Fedora N to Fedora N+1
> _inside_ containers.
This still affects updating my F32 toolbox container (to latest F32). This is not about upgrades. (I think you mentioned upgrades as an easy way to reproduce.)
I meant that upgrading container F32 to F33 is something rather rare (and that the patch currently in Rawhide may be a bit risky to backport); it's IMO easier to build the container image from scratch against the newer repo. Sorry for confusion, I didn't find at that time a better reproducer than just bump the --releasever. But yes, FWIW, the fact that e.g. update FEDORA-2020-7fec35fe20 in F32 is rather fresh - I can admit it may be worth backporting. It's up to the maintainers though.
I wish to point out that the adoption of solutions like Proxmox has been bringing LXC containers into many hobby situations (the Fedora bread and butter) where they are usually treated as VMs by the users. I don't believe it's true anymore to assume that containers are easily replaceable just because that's how it's been in the CentOS / RHEL environment traditionally. There have been a lot of changes over the last year or so in how LXC is being used and implemented; as such, I don't think the old assumptions are correct anymore. The advantage is that LXC installs are much more lightweight on the hardware than traditional KVM style VMs, so we're seeing LXC used in everything from home automation to remote 'workstations'.
(In reply to Pavel Raiskup from comment #28)
> But yes, FWIW, the fact that e.g. update FEDORA-2020-7fec35fe20 in F32 is
> rather fresh - I can admit it may be worth backporting. It's up to the
> maintainers though.
I prefer NOT to backport this kind of "corner case fix" into stable releases. A system upgrade inside of a container is a corner case. Filesystem is a core package, and pushing changes into it requires more testing than resting in the testing repo for 7 days.
> System upgrade inside of the container is corner case. System update, not upgrade
For what it's worth, and given that more people will likely hit this issue as the Fedora 33 release gets older and more people upgrade (F32 -> F33 etc.), the following workaround allows things to work:
# dnf upgrade filesystem --releasever=34
This will allow the upgrade of this package without causing other issues with normal updates etc... Not ideal, but at least it can be used as a workaround...
FWIW, /dev is also a mount point that will have the same problems as /proc and /sys, in a toolbox container for example:
$ toolbox create -r 34
[...]
$ toolbox enter -r 34
$ mount | grep "/dev "
devtmpfs on /dev type devtmpfs (rw,nosuid,noexec,seclabel,size=12226400k,nr_inodes=3056600,mode=755,inode64)
devtmpfs on /home/hadess/.local/share/containers/storage/overlay/90c78f938c7caa3cd027e22f6652e05b413c39376fa0fc5b4485caf121ae1477/merged/dev type devtmpfs (rw,nosuid,noexec,seclabel,size=12226400k,nr_inodes=3056600,mode=755,inode64)
$ sudo dnf update filesystem
Fedora 34 openh264 (From Cisco) - x86_64 4.5 kB/s | 2.5 kB 00:00
Fedora - Modular Rawhide - Developmental packages for the next Fedora release 3.3 MB/s | 5.3 MB 00:01
Fedora - Rawhide - Developmental packages for the next Fedora release 16 MB/s | 73 MB 00:04
Dependencies resolved.
============================================================================
 Package Architecture Version Repository Size
============================================================================
Upgrading:
 filesystem x86_64 3.14-4.fc34 rawhide 1.1 M
Transaction Summary
============================================================================
Upgrade 1 Package
Total download size: 1.1 M
Is this ok [y/N]: y
Downloading Packages:
filesystem-3.14-4.fc34.x86_64.rpm 4.6 MB/s | 1.1 MB 00:00
----------------------------------------------------------------------------
Total 3.1 MB/s | 1.1 MB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Running scriptlet: filesystem-3.14-4.fc34.x86_64 1/1
  Preparing : 1/1
  Upgrading : filesystem-3.14-4.fc34.x86_64 1/2
Error unpacking rpm package filesystem-3.14-4.fc34.x86_64
  Verifying : filesystem-3.14-4.fc34.x86_64 1/2
  Verifying : filesystem-3.14-3.fc33.x86_64 2/2
Failed:
  filesystem-3.14-3.fc33.x86_64 filesystem-3.14-4.fc34.x86_64
Error: Transaction failed
The difference is that podman doesn't set the ownership to 'nobody:nobody' for /dev. So updating RPM shouldn't fail on "fixing" the ownership (as with /proc and /sys) as that should be just fine. What is 'rpm -V filesystem' saying before the attempt to upgrade?
(In reply to Pavel Raiskup from comment #34)
> The difference is that podman doesn't set the ownership
> to 'nobody:nobody' for /dev. So updating RPM shouldn't
> fail on "fixing" the ownership (as with /proc and /sys)
> as that should be just fine. What is 'rpm -V filesystem'
> saying before the attempt to upgrade?
$ rpm -V filesystem
.M.......    /
.....UG..    /dev
.....UG..    /media
.....UG..    /mnt
.....UG..    /proc
.....UG..    /sys
.....UG..    /tmp
missing     /usr/share/locale/en
missing     /usr/share/locale/en/LC_MESSAGES
missing     /usr/share/locale/en@arabic
missing     /usr/share/locale/en@arabic/LC_MESSAGES
missing     /usr/share/locale/en@boldquot
missing     /usr/share/locale/en@boldquot/LC_MESSAGES
missing     /usr/share/locale/en@cyrillic
missing     /usr/share/locale/en@cyrillic/LC_MESSAGES
missing     /usr/share/locale/en@greek
missing     /usr/share/locale/en@greek/LC_MESSAGES
missing     /usr/share/locale/en@hebrew
missing     /usr/share/locale/en@hebrew/LC_MESSAGES
missing     /usr/share/locale/en@piglatin
missing     /usr/share/locale/en@piglatin/LC_MESSAGES
missing     /usr/share/locale/en@quot
missing     /usr/share/locale/en@quot/LC_MESSAGES
missing     /usr/share/locale/en@shaw
missing     /usr/share/locale/en@shaw/LC_MESSAGES
missing     /var/cache/bpf
Note that the 2-command reproducer is listed in my comment, if you need more information.
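For anyone comparing such 'rpm -V' output across containers, a small Python sketch like the one below can pull out the paths with user or group mismatches (the U and G positions of the nine-character flag string) and the missing ones. The function name parse_verify is hypothetical, and the sketch assumes paths without spaces.

```python
def parse_verify(output):
    """Parse `rpm -V` output into (ownership_mismatches, missing) path lists."""
    ownership, missing = [], []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # First field is the flag string (or "missing"), last field is the path.
        flags, path = fields[0], fields[-1]
        if flags == "missing":
            missing.append(path)
        elif len(flags) == 9 and (flags[5] == "U" or flags[6] == "G"):
            # Flag order is SM5DLUGTP: index 5 is user, index 6 is group.
            ownership.append(path)
    return ownership, missing
```

Applied to the output above, the first list would contain /dev, /media, /mnt, /proc, /sys and /tmp, which are the directories an upgrade of filesystem would try to chown.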
Indeed, toolbox does a different thing...
$ ls -alh / | grep nobody
drwxr-xr-x. 22 nobody nobody 4.2K Dec 10 09:03 dev
drwxr-xr-x. 2 nobody nobody 6 Jul 27 20:22 media
drwxr-xr-x. 4 nobody nobody 42 Jul 27 20:22 mnt
dr-xr-xr-x. 467 nobody nobody 0 Dec 7 08:28 proc
dr-xr-xr-x. 13 nobody nobody 0 Dec 7 08:28 sys
drwxrwxrwt. 55 nobody nobody 2.5K Dec 11 14:26 tmp
and from the (even --privileged) rootless podman:
# ls -alh / | grep nobody
dr-xr-xr-x. 470 nobody nobody 0 Dec 11 13:28 proc
dr-xr-xr-x. 13 nobody nobody 0 Dec 7 07:28 sys
So it looks like only toolbox is affected. This should be discussed with the toolbox people, and I think it deserves another bug report against filesystem if this is really expected.
toolbox just uses podman. I've filed a new bug and CC:ed the toolbox maintainer on it: https://bugzilla.redhat.com/show_bug.cgi?id=1906833
Crossposting from https://bugzilla.redhat.com/show_bug.cgi?id=1906833#c12 > Is there some reason this would be a problem again upgrading a F35 toolbox container to F36 (on a silverblue F36 host)? The workaround doesn't seem to make a difference now
I think there was a combination of factors, as I was able to resolve it by:
- dnf autoremove
- dnf list --installed | rg fc35
- dnf remove fc35 packages
- remove the /etc/rpm/macros.dist file
https://ask.fedoraproject.org/t/how-to-upgrade-fedora-version-inside-a-toolbox-container/10463/8