Description of problem:
On X11, when the X server dies, gnome-shell is able to restart it. On wayland, when the wayland compositor dies, all GUI applications crash and the session dies, so all work is lost.

On X11, when gnome-shell crashes, it can be restarted. Apart from some windows not remembering their previous position or workspace (happens sometimes), no work is lost. On wayland, when gnome-shell crashes, all GUI applications crash as a result and the session dies, so all work is lost.

This is a regression of Gnome+Wayland compared to Gnome+X11 sessions. Since both wayland and gnome-shell happen to crash a lot, this regression seriously affects user experience with Gnome. Please consider working on this before making Gnome+Wayland the default.

Version-Release number of selected component (if applicable):
Any gnome-shell version up to 3.20.x (3.21.x not tested) with wayland support.
Any wayland version up to 1.10.0 (newer versions not yet tested).

How reproducible:
Always due to design.

Steps to Reproduce:
1. Press Alt+F2
2. enter "r", confirm with [Enter] key

Actual results:
On wayland, gnome-shell tells you "Restarting is not available on Wayland".

Expected results:
Restart gnome-shell

Additional info:
For getting an idea how often gnome-shell crashes, have a look at the retrace server:
https://retrace.fedoraproject.org/faf/problems/?component_names=gnome-shell&associate=__None&daterange=2016-08-03%3A2016-08-17&bug_filter=None&function_names=&binary_names=&source_file_names=&since_version=&since_release=&to_version=&to_release=
https://retrace.fedoraproject.org/faf/stats/last_week/#Fedora%2024 (gnome-shell is often within top 5 of bugs per Fedora release)
https://retrace.fedoraproject.org/faf/reports/?component_names=gnome-shell&associate=__None&first_occurrence_daterange=&last_occurrence_daterange=&order_by=last_occurrence

Thanks to Sébastien Wilmet for bringing this up.

One idea to achieve this goal is to split up gnome-shell into two processes:
1. the compositor / window manager
2. the GUI which is using crash-risky technologies like JavaScript
An old bug that was closed by disabling restart on Wayland: https://bugzilla.gnome.org/show_bug.cgi?id=741665
(In reply to Christian Stadelmann from comment #0)
> On X11, when the X server dies, gnome-shell is able to restart it. On
> wayland, when the wayland compositor dies, all GUI applications crash and
> the session dies, so all work is lost.

Nope, that sentence is not accurate. On X11, if the X server dies, all X11 applications lose their connection with the X server, including the session manager, so the entire session goes along with it. Easy to try: log in to GNOME on X11 and kill the X server...

If the window manager dies, though, the session manager can restart it, but that is not as bad as the X server crashing.

On Wayland, the window manager (mutter/gnome-shell) also plays the role of Wayland compositor, so losing the compositor in Wayland will take the session with it (just like losing the X server in X11).
Another issue that affects mutter/gnome-shell is its dependency on Xwayland: if Xwayland dies in GNOME on Wayland, mutter/gnome-shell also dies and the rest of the session follows. Weston, for example, can survive a crash of Xwayland; X11 apps will be lost, but native Wayland clients will continue to work. See https://bugzilla.gnome.org/show_bug.cgi?id=759538
(In reply to Christian Stadelmann from comment #0)
> One idea to achieve this goal is to split up gnome-shell into two processes:
> 1. the compositor / window manager
> 2. the GUI which is using crash-risky technologies like JavaScript

Is it possible to architect wayland compositors in this way? Certainly. Is it feasible to change gnome-shell in this way as a 'bug fix'? Almost certainly not.

Contributions to gnome-shell stability are more than welcome, but as it is, the suggestion is WONTFIX, I think.
(In reply to Matthias Clasen from comment #4)
> (In reply to Christian Stadelmann from comment #0)
>
> > One idea to achieve this goal is to split up gnome-shell into two processes:
> > 1. the compositor / window manager
> > 2. the GUI which is using crash-risky technologies like JavaScript
>
> Is it possible to architect wayland compositors in this way ? Certainly. Is
> it feasible to change gnome-shell in this way as a 'bug fix' ? Almost
> certainly not.
>
> Contributions to gnome-shell stability are more than welcome, but as it is,
> the suggestion is WONTFIX, I think.

I know I have been talking about this before, but I think eventually we really want to do the compositing/UI split. Not now, since it's a huge undertaking and will involve a massive amount of work, but eventually it's the solution that I think we should work towards, long term. The reason is not only stability, but responsiveness and other things as well.

I think for this bug, a WONTFIX is appropriate. There is no single "fix" for stability to do here. We just need to hunt down as many mutter/gnome-shell crashes as we can.
FWIW, for just making Alt-F2 "r" work, we could probably do something similar to what Enlightenment does: https://blogs.s-osg.org/recovery-journey-discovery/ . It'd involve protocol work as well as compositor and client support. Making it work for Xwayland is a completely different story (we would need to design the protocol to take Xwayland into account, implement support in Xwayland, and rewrite how Xwayland and mutter integrate so that Xwayland can survive a mutter restart). So it's also a non-trivial task that would take a lot of resources to fulfill, and wouldn't work for all clients anyway.
(In reply to Jonas Ådahl from comment #5)
> I know I have been talking about this before but I think eventually we
> really want to do the compositing/UI split.

Any upstream plans on this?

> I think for this bug, a WONTFIX is appropriate. There is no single "fix" for
> stability to do here. We just need to hunt down as many mutter/gnome-shell
> crashes as we can.

This won't help much unless gnome-shell gets something like the "Session Recovery Extension" from the article you linked to. Right now, even if gnome-shell@wayland had just 10% of the bugs gnome-shell@x11 has (or had), user experience would suffer. The current design on wayland is "crash-prone", but it doesn't have to be this way.

(In reply to Jonas Ådahl from comment #6)
> FWIW, for just making Alt-F2 "r" work, we could probably do something
> similar to what Enlightenment does:
> https://blogs.s-osg.org/recovery-journey-discovery/

That looks like a possible resolution for this bug.
(In reply to Matthias Clasen from comment #4)
> (In reply to Christian Stadelmann from comment #0)
>
> > One idea to achieve this goal is to split up gnome-shell into two processes:
> > 1. the compositor / window manager
> > 2. the GUI which is using crash-risky technologies like JavaScript
>
> Is it possible to architect wayland compositors in this way ? Certainly. Is
> it feasible to change gnome-shell in this way as a 'bug fix' ? Almost
> certainly not.
>
> Contributions to gnome-shell stability are more than welcome, but as it is,
> the suggestion is WONTFIX, I think.

It should be possible, or why use Wayland? Separation is necessary for a good, reliable system. Hunting down bugs is a good thing, but that does not replace the need for a good recovery strategy.

I have been experiencing a memory leak in gnome-shell since Fedora 23. This results in a general slowdown. I recover the performance by restarting gnome-shell, and I can do this without crashing applications. I am not sure what the issue is, but it seems to have something to do with the GPU drivers leaking memory when playing video content in Firefox. This issue has been reported elsewhere, but that is not my point. My point is that gnome-shell must have the ability to perform a warm restart.

The compositor part of gnome-shell needs to be the simplest, most bullet-proof thing you can imagine. Keep the features to a minimum and separate the window instances from the gnome-shell GUI stuff. Keep enough information so that the gnome-shell GUI can find the windows again by discovery when it restarts. It may be done by having these compositor resources owned by the applications. Then gnome-shell just needs to query the running applications to rebuild the desktop. It might be as simple as sending each application a "window-damage" event.

Above all, keep the display and keyboard alive to support system maintenance and recovery.
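To make the proposed split a bit more concrete, here is a minimal sketch in Python of the general idea: a long-lived, deliberately simple process keeps the window list and supervises a separate, crash-prone UI process, restarting it when it exits abnormally. This is only a toy model and not how mutter or gnome-shell are structured; `supervise_ui`, `UI_CHILD` and the `windows` list are hypothetical names invented for the illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for the crash-prone shell UI: it exits with a
# non-zero status ("crashes") on its first two runs, then exits cleanly.
UI_CHILD = "import sys; sys.exit(1 if int(sys.argv[1]) < 2 else 0)"


def supervise_ui(windows, max_restarts=5):
    """Run the UI child and restart it whenever it exits abnormally.

    `windows` lives in the supervising (compositor-like) process, so a
    UI crash never touches it. Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        # Pass the restart count so the toy child knows when to stop crashing.
        result = subprocess.run(
            [sys.executable, "-c", UI_CHILD, str(restarts)]
        )
        if result.returncode == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError("UI keeps crashing, giving up")
        restarts += 1


if __name__ == "__main__":
    windows = ["terminal", "editor", "browser"]
    count = supervise_ui(windows)
    print(f"UI restarted {count} times; windows still tracked: {windows}")
```

The point of the sketch is only that the state worth preserving sits in the process that does not run the risky code, so a UI restart becomes a warm restart rather than a session loss.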
Not going to happen in f24
Well, this will also affect F25 and any further version until this is fixed (it was reported upstream as https://bugzilla.gnome.org/show_bug.cgi?id=741665, but just disabling the ability for the user to restart the shell is not fixing the core problem, just hiding it).
Maybe we should investigate implementing something similar to what EFL-Wayland has? That is, app startup would not be handled by the compositor (systemd-user would be the obvious candidate here), and we would add some code to reconnect apps to the compositor when it crashes. Something like: https://blogs.s-osg.org/recovery-journey-discovery/

To a lot of people, this is a pretty bad blocker to using Wayland, as gnome-shell will never be stable enough to be a "never crashes" component of the system.
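A rough sketch of the reconnect idea, written in Python as an in-process toy model: it does not use the Wayland protocol or any EFL API, and the classes `Compositor`, `Session` and `Client` are hypothetical. The assumption it illustrates is that each client keeps enough of its own state to replay it, so that when the compositor is replaced by a fresh instance the client re-attaches instead of exiting.

```python
class Compositor:
    """Toy compositor: holds per-client surface state only in memory."""

    def __init__(self):
        self.surfaces = {}
        self.alive = True

    def attach(self, client_id, surface):
        if not self.alive:
            raise ConnectionError("compositor is gone")
        self.surfaces[client_id] = surface


class Session:
    """Toy session: owns the current compositor instance."""

    def __init__(self):
        self.compositor = Compositor()

    def crash_and_restart(self):
        # The old instance dies; a fresh one starts with no client state.
        self.compositor.alive = False
        self.compositor = Compositor()


class Client:
    """Toy client that keeps its own surface state and can replay it."""

    def __init__(self, client_id, surface, session):
        self.client_id = client_id
        self.surface = surface
        self.session = session
        self.conn = session.compositor
        self.reconnects = 0

    def commit(self):
        try:
            self.conn.attach(self.client_id, self.surface)
        except ConnectionError:
            # Instead of exiting, look up the new compositor and replay state.
            self.conn = self.session.compositor
            self.reconnects += 1
            self.conn.attach(self.client_id, self.surface)


if __name__ == "__main__":
    session = Session()
    client = Client("editor", {"title": "notes.txt"}, session)
    client.commit()
    session.crash_and_restart()
    client.commit()
    print(client.reconnects, session.compositor.surfaces)
```

In a real implementation the hard parts are exactly what this model hides: the protocol support, the toolkit support in every client, and Xwayland.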
If this is feasible before F25, it would be great, but I think it would take a lot of rewriting and refactoring of code that was never written with this feature in mind.

I believe that GNOME Shell on Wayland is now very stable. It hasn't crashed on me except one time, when I tried to use HDMI in the middle of a session and it sent me back to GDM; after that it worked just fine. But it's annoying that I lose the shell extensions every time one of them crashes, and I need to save my work and close all apps to log out and log in again in order to restore them.

I know it's not GNOME Shell's fault that some extensions are buggy, but those extensions make our workflow more usable and I need some of them in my work.

I used to keep my session up for maybe a month without rebooting the laptop on Xorg, because if GNOME Shell crashed, or one of the extensions caused others to be disabled, or GNOME Shell just got slow over time, I could just use `Alt + F2 + 'r'` to fix the issue and move on.
(In reply to Olivier Crête from comment #11)
> To a lot of people, this is a pretty bad blocker to using Wayland, as
> gnome-shell will never be stable enough to be a "never crashes" component of
> the system.

Pretty negative attitude... good luck engineering workarounds then.

As far as I am concerned, this is still WONTFIX. I'll leave this bug open now, but don't expect action here.
the efl thing doesn't actually help for crashes. and it is quite a bit of engineering just for a restart button...
Matthias, where's the best place for upstream discussion of this?
I've had about 1-2 gnome-shell crashes per month in Fedora 25. It's extremely frustrating when it happens since it kills everything I'm working on. From a functional standpoint I'd say it's about 75% as painful as a kernel crash, whereas a gnome-shell crash under Xorg was < 5% as painful as a kernel crash.

In general, achieving crash independence between applications and gnome-shell is a great goal, and Gnome 3 + Wayland's current behavior is a regression in this regard. Remember how painful web browsing was before we had separate processes per tab/page? It was a killer feature when introduced.

I agree with Matthias though that it makes sense to be a WONTFIX for the Red Hat / Fedora bugzillas. Fixing this is a big upstream change that should probably be discussed on https://bugzilla.gnome.org/ or gnome-shell-list.
*** Bug 1479408 has been marked as a duplicate of this bug. ***
Could we consider prioritizing this? I currently see 332 open abrt bugs against gnome-shell, and the FAF report consistently shows thousands of crashes a week. I don't think gnome-shell is particularly crashy as software goes, but it's in a uniquely prominent position, and the fact that this goes from "minor blip" to "lose your session" is a cause for concern in the Wayland shift.

https://retrace.fedoraproject.org/faf/summary/?component_names=gnome-shell&daterange=2016-11-04%3A2017-11-03&resolution=w

We're also seeing a lot of F27 crash reports (see chart) despite F27 still being in beta and having an install base about ¹⁄₁₀₀th of F26 or F25. I'm concerned that this will lead to a negative perception of the release overall.
For what it's worth, this issue and the lack of pen support in XWayland are the only reasons I'm still running the GNOME Xorg session on all of my Fedora computers.
(In reply to Matthew Miller from comment #18)
> Could we consider prioritizing this? I currently see 332 open abrt bugs
> against gnome-shell, and the FAF report consistently shows thousands of
> crashes a week. I don't think gnome-shell is particularly crashy as software
> goes, but it's in a uniquely prominent position, and the fact that this goes
> from "minor blip" to "lose your session" is a cause for concern in the
> Wayland shift.

+1, so I've nominated it.
This bug has been accepted on the list of Prioritized bugs: https://meetbot.fedoraproject.org/fedora-meeting/2017-11-08/fedora_prioritized_bugs_and_issues.2017-11-08-15.06.log.html#l-30
This message is a reminder that Fedora 25 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '25'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 25 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
As this is an architectural issue, and upstream is aware and working on it, we are removing this bug from the Prioritized list.
Will this include considering a universal compositor <-> wm protocol for this, and unifying the KDE/GNOME/... compositor efforts into a single project? Right now there seems to be a lot of duplicated effort (which, at least to me as an outsider, looks like a bad idea), and all the smaller window managers like xfce, i3, ... that lack the manpower to play along with this appear to be left behind.
@Jonas I'll certainly bring up that idea with the developers, but I think that's out of scope for this bug.
Hi Matthew, since Jan (who de-c.c.ed) said two years ago that "upstream is aware and working on it" but it's been really quiet here, do you or one of your colleagues have a link to "the" upstream place (a gitlab ticket or whatever system it is) that we can track, simply for those who would like to be able to follow along but don't have the luxury of having access to a hallway track / watercooler?

I tried searching for:

site:"gitlab.gnome.org" gnome shell crash wayland session "restart" OR "restarting"

...but couldn't find an equivalent upstream metabug, so am I looking in the wrong place entirely? Surely there's something in the open out there, but I can't find it in the haystack.

Every six months I try Fedora Workstation with the GNOME Wayland session in the hope that it will work "this time", and every time I have to go back to the Xorg version after a few days because of stability issues... and that makes me sad because I would _want_ to be running the Wayland version. I just _can't_, not if I want to be able to work. I cannot entrust my work to a system where a random crash kills all the running apps and therefore makes me lose data. It happened to me again with Fedora 32 twice in two days, on a ThinkPad with Intel graphics (and no hybrid graphics), which you would think is the "ideal" system to be running this on.
> couldn't find an equivalent upstream metabug, so am I looking at the wrong place entirely? Surely there's something in the open out there, but I can't find it in the haystack.

If you couldn't find it, then you can safely assume there isn't one and report it. If someone figures it's a duplicate, they'll mark it as such. Having a duplicate in such a case would be useful IMO, because it makes the report easier to find.
Alright, as per Konstantin's suggestion above, I have now filed a ticket upstream: https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/5634
*** Bug 1334226 has been marked as a duplicate of this bug. ***