• SupportQuestion
  • Major difficulties recovering system after week 27 and 28 updates

Matt_Nico I have since rebooted my desktop system again, and just in case ran the sudo eopkg check | grep Broken ... command one last time to make sure everything was in order. It came back with a clr-boot-manager error yet again. This time the error is as follows:

[✗] Updating clr-boot-manager failed

A copy of the command output follows:

[FATAL] cbm (../src/bootman/bootman.c:L562): FATAL: Cannot mount boot device /dev/nvme0n1p1 on /boot: No such device

I tried to run sudo clr-boot-manager update to see if that was able to do anything, and it appeared to fix things when I ran the command to repair broken packages again. I will now reboot the system and see if the fix sticks.
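
(As a sanity check, the next time the error shows up I can also try mounting the ESP by hand first, just to confirm the device is actually reachable at that point; something along these lines:)

    findmnt /boot                      # is anything currently mounted at /boot?
    sudo mount /dev/nvme0n1p1 /boot    # try mounting the ESP manually
    ls /boot                           # confirm the boot files are visible
    sudo umount /boot                  # unmount again so clr-boot-manager can manage it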

    Matt_Nico After rebooting the system yet again and running the sudo eopkg check | grep Broken ... command, I was met with this error yet again:

    [✗] Updating clr-boot-manager failed

    A copy of the command output follows:

    [FATAL] cbm (../src/bootman/bootman.c:L562): FATAL: Cannot mount boot device /dev/nvme0n1p1 on /boot: No such device

    I have no clue why this error has returned after it was seemingly fixed. Running sudo clr-boot-manager update once again got the error to go away. I've done this reboot sequence a few times now; the error always returns upon reboot but goes away once I run clr-boot-manager update again. I have also tried running sudo usysconf run -f prior to rebooting the system, but the error still returns.

    Absolutely no clue where to go from here; I guess I just won't turn off either my desktop or laptop for the time being.

      Also, can we get the output of: sudo env CBM_DEBUG=1 clr-boot-manager update?

        infinitymdm silke
        Here is the output of lsblk; nvme0n1p1 would be my EFI partition (a variant that also shows UUIDs is sketched after the listing):

        sda           8:0    0   5.5T  0 disk 
        ├─sda1        8:1    0   5.4T  0 part /mnt/quaternary
        └─sda2        8:2    0  94.1G  0 part 
        sdb           8:16   0 465.8G  0 disk 
        ├─sdb1        8:17   0   499M  0 part 
        ├─sdb2        8:18   0   100M  0 part 
        ├─sdb3        8:19   0    16M  0 part 
        └─sdb4        8:20   0 195.4G  0 part 
        sdc           8:32   0 931.5G  0 disk 
        └─sdc1        8:33   0 931.5G  0 part /mnt/secondary
        sdd           8:48   0   3.6T  0 disk 
        └─sdd3        8:51   0   3.6T  0 part /mnt/tertiary
        sde           8:64   1  14.5G  0 disk 
        └─sde1        8:65   1  14.5G  0 part 
        zram0       252:0    0     8G  0 disk [SWAP]
        nvme0n1     259:0    0 931.5G  0 disk 
        ├─nvme0n1p1 259:1    0   512M  0 part 
        ├─nvme0n1p2 259:2    0  70.8G  0 part /
        ├─nvme0n1p3 259:3    0 161.4G  0 part /home
        ├─nvme0n1p4 259:4    0  29.8G  0 part [SWAP]
        └─nvme0n1p5 259:5    0 668.9G  0 part /mnt/fast storage
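
        A variant that also prints filesystem types and UUIDs, which might make it easier to cross-check against whatever clr-boot-manager resolves (just a sketch; the column list can be adjusted):

        lsblk -o NAME,SIZE,FSTYPE,UUID,PARTUUID,MOUNTPOINT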

        and here is the output of sudo env CBM_DEBUG=1 clr-boot-manager update

        [DEBUG] cbm (../src/cli/cli.c:L142): No such file: //etc/kernel/update_efi_vars
        [INFO] cbm (../src/bootman/bootman.c:L787): Current running kernel: 6.9.8-294.current
        [INFO] cbm (../src/bootman/sysconfig.c:L179): Discovered UEFI ESP: /dev/disk/by-partuuid/e9fc2609-be10-4546-ab1b-f7beebb9167e
        [INFO] cbm (../src/bootman/sysconfig.c:L256): Fully resolved boot device: /dev/nvme0n1p1
        [DEBUG] cbm (../src/bootman/bootman.c:L141): shim-systemd caps: 0x26, wanted: 0x26
        [DEBUG] cbm (../src/bootman/bootman.c:L156): UEFI boot now selected (shim-systemd)
        [INFO] cbm (../src/bootman/bootman.c:L807): path ///etc/kernel/initrd.d does not exist
        [INFO] cbm (../src/bootman/bootman.c:L807): path ///usr/lib/initrd.d does not exist
        [INFO] cbm (../src/bootman/bootman.c:L503): Checking for mounted boot dir
        [INFO] cbm (../src/bootman/bootman.c:L555): Mounting boot device /dev/nvme0n1p1 at /boot
        [SUCCESS] cbm (../src/bootman/bootman.c:L568): /dev/nvme0n1p1 successfully mounted at /boot
        [DEBUG] cbm (../src/bootman/update.c:L164): Now beginning update_native
        [DEBUG] cbm (../src/bootman/update.c:L173): update_native: 1 available kernels
        [DEBUG] cbm (../src/bootman/update.c:L191): update_native: Running kernel is (current) ///usr/lib/kernel/com.solus-project.current.6.9.8-294
        [SUCCESS] cbm (../src/bootman/update.c:L205): update_native: Bootloader updated
        [DEBUG] cbm (../src/bootman/kernel.c:L617): installing extra initrd: /usr/lib64/kernel/initrd-com.solus-project.current.6.9.8-294.nvidia
        [DEBUG] cbm (../src/bootloaders/systemd-class.c:L219): adding extra initrd to bootloader: initrd-com.solus-project.current.6.9.8-294.nvidia
        [SUCCESS] cbm (../src/bootman/update.c:L220): update_native: Repaired running kernel ///usr/lib/kernel/com.solus-project.current.6.9.8-294
        [DEBUG] cbm (../src/bootman/update.c:L230): update_native: Checking kernels for type current
        [INFO] cbm (../src/bootman/update.c:L243): update_native: Default kernel for type current is ///usr/lib/kernel/com.solus-project.current.6.9.8-294
        [DEBUG] cbm (../src/bootman/kernel.c:L617): installing extra initrd: /usr/lib64/kernel/initrd-com.solus-project.current.6.9.8-294.nvidia
        [DEBUG] cbm (../src/bootloaders/systemd-class.c:L219): adding extra initrd to bootloader: initrd-com.solus-project.current.6.9.8-294.nvidia
        [SUCCESS] cbm (../src/bootman/update.c:L255): update_native: Installed tip for current: ///usr/lib/kernel/com.solus-project.current.6.9.8-294
        [DEBUG] cbm (../src/bootman/kernel.c:L617): installing extra initrd: /usr/lib64/kernel/initrd-com.solus-project.current.6.9.8-294.nvidia
        [DEBUG] cbm (../src/bootloaders/systemd-class.c:L219): adding extra initrd to bootloader: initrd-com.solus-project.current.6.9.8-294.nvidia
        [SUCCESS] cbm (../src/bootman/update.c:L269): update_native: Installed last_good kernel (current) (///usr/lib/kernel/com.solus-project.current.6.9.8-294)
        [DEBUG] cbm (../src/bootman/update.c:L280): update_native: Analyzing for type current: ///usr/lib/kernel/com.solus-project.current.6.9.8-294
        [DEBUG] cbm (../src/bootman/update.c:L285): update_native: Skipping running kernel
        [INFO] cbm (../src/bootman/bootman.c:L503): Checking for mounted boot dir
        [INFO] cbm (../src/bootman/bootman.c:L510): boot_dir is already mounted: /boot
        [SUCCESS] cbm (../src/bootman/update.c:L338): update_native: Default kernel for current is ///usr/lib/kernel/com.solus-project.current.6.9.8-294
        [DEBUG] cbm (../src/bootman/update.c:L353): No kernel removals found
        [INFO] cbm (../src/bootman/bootman.c:L469): Attempting umount of /boot
        [SUCCESS] cbm (../src/bootman/bootman.c:L473): Unmounted boot directory

        Which appears to work as intended?
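
        For what it's worth, the next time the update fails I could also check whether the by-partuuid link that cbm resolves is still present at that moment, e.g.:

        ls -l /dev/disk/by-partuuid/e9fc2609-be10-4546-ab1b-f7beebb9167e
        findmnt /boot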

        minh I was about to chroot and try this boot rescue when my computer booted as normal unexpectedly. I am not sure if it would still be worthwhile to chroot in; shouldn't I be able to do everything from inside my system now? If it's still worthwhile, let me know and I will give it a shot!

        Also, weirdly, I haven't been having any issues with my nvidia card since the initial error with lightdm. I have tested out a few different games as well, and the system is definitely running on the GPU rather than integrated graphics, as performance is as expected.

        We pushed a hotfix for the nvidia driver; if you installed that, it should be working (assuming it's the same issue). We're still trying to figure out the root cause of the issue.

          Which appears to work as intended?

          Yep, strange that it sometimes complains about /dev/nvme0n1p1 not existing.

            ReillyBrogan If I try to update my system now it appears everything is up to date. The issue doesn't appear to be with the nvidia driver anymore, but clr-boot-manager is still giving me errors.

            For now it seems I am still able to restart my computer and use it as normal. I believe it is able to find the EFI partition when actually booting, since my system starts, so I am not sure why it cannot properly detect it once the system is up and running.

            silke It really is bizarre. I was also having a very similar issue with my laptop today when I went to start it, except it was unable to detect my /home partition and so was only reaching the terminal. From the terminal, if I rebooted it one or two times, it would catch the partition and start as normal. But again, if I turned it off, I would risk it "losing" the partition again.

            Incredibly odd that it is happening across both of my main Solus devices. The laptop is a T480s without any dedicated graphics, for what it is worth, so nvidia should most definitely not be playing a role on my laptop.

            *On Friday I believe I will be able to hop into the Matrix room at some point, as I have more time available. I just need to sign up for it still.

              Matt_Nico I am wondering if it may be something to do with these drives being NVMe devices. The drives with my Solus installs on both my laptop and my desktop are fast NVMe drives. This is purely conjecture, but I wonder if the speed of these drives may be causing the issue. Could things be moving faster than clr-boot-manager or eopkg can keep up? That would explain why the issue is intermittent, as the system may be able to grab the information in time on some boot sequences but not on others.
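
              If it were a timing issue, I suppose one way to check would be to look at how long device setup and the affected mounts take during boot; maybe something like this:

                 systemd-analyze blame | head -n 20                 # slowest units during boot
                 systemd-analyze critical-chain home.mount          # timing chain for /home, if that unit exists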

                Matt_Nico I've run Solus on a variety of PCIe x4 gen 3 and gen 4 drives. The speed of the drive is probably not your problem.

                Matt_Nico I'm also using NVMe devices and have no issues. My guess is that there's something weird going on. You can check the kernel logs (journalctl -k) for any suspicious information.

                You could try updating the firmware using fwupd (eopkg install fwupd). Make sure you have a good backup beforehand though (I haven't seen it brick a system yet, but someone has to be the first).

                1. Ensure /boot is mounted:
                     clr-boot-manager mount-boot
                   This is normally done automatically, but it can't hurt to double-check seeing as the partitions seem a bit flaky.
                2. Check for updates (a command to list what fwupd detects first is sketched after these steps):
                     fwupdmgr refresh
                     fwupdmgr get-updates
                3. Install them:
                     fwupdmgr update
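
                (If you want to see what fwupd detects before refreshing, it also doesn't hurt to list the devices and their current firmware versions first:)
                     fwupdmgr get-devices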

                  silke here is the output of journalctl -k. To me it didn't look like there was anything glaringly wrong, but I am definitely out of my depth here.

                  Jul 18 14:09:59 solus kernel: Command line: initrd=\EFI\com.solus-project\initrd-com.solus-project.current.6.9.8-294 initrd=>
                  Jul 18 14:09:59 solus kernel: BIOS-provided physical RAM map:
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000000000000-0x0000000000057fff] usable
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000000058000-0x0000000000058fff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000000059000-0x000000000009dfff] usable
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x000000000009e000-0x00000000000fffff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000000100000-0x000000003fffffff] usable
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000040000000-0x00000000403fffff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000040400000-0x0000000069bd7fff] usable
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000069bd8000-0x0000000069bd8fff] ACPI NVS
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000069bd9000-0x0000000069bd9fff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x0000000069bda000-0x000000007b1befff] usable
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x000000007b1bf000-0x000000007b68efff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x000000007b68f000-0x000000007b6fefff] ACPI data
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x000000007b6ff000-0x000000007bb2efff] ACPI NVS
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x000000007bb2f000-0x000000007cffefff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x000000007cfff000-0x000000007cffffff] usable
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x000000007d000000-0x000000007fffffff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
                  Jul 18 14:09:59 solus kernel: BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
                  (output truncated after 24 lines)

                  When I am home later I would be happy to try to update the firmware. I have my important files saved on a separate drive but have not done a proper backup. Would something like this guide be acceptable to create a backup?

                  IIRC this installation was done years ago with a Solus 4.2 or 4.3 ISO; would that point to a firmware issue?

                    Matt_Nico silke I didn't end up going through with the fwupdmgr update command yet, as I am not sure if it will actually do anything given the output of the fwupdmgr get-updates command.

                    matt@matt-solus-desktop ~ $ fwupdmgr get-updates
                    WARNING: This package has not been validated, it may not work properly.
                    Devices with no available firmware updates: 
                     • SSD 850 EVO 500GB
                     • SSD 860 EVO 1TB
                     • WD BLACK SN750 SE 1TB
                     • WDC WD40EZRZ-75GXCB0
                     • WDC WD60EZAZ-00SF3B0
                    No updatable devices

                    It appears as though all my drives are up to date firmware-wise, so would it be worth it to go through and run the fwupdmgr update command?

                    Still running into these issues on both my laptop and my desktop.

                      Matt_Nico If there are no updates available, running fwupdmgr update will just tell you the same thing that get-updates did. No need to run it.

                      Just to be clear, you're still getting this error you mentioned in your earlier post, correct?

                      Matt_Nico
                      [✗] Updating clr-boot-manager failed

                      A copy of the command output follows:

                      [FATAL] cbm (../src/bootman/bootman.c:L562): FATAL: Cannot mount boot device /dev/nvme0n1p1 on /boot: No such device

                      Are there any other errors you're seeing, or any other behavior that you don't think is normal?

                      Could we get the output of sudo journalctl -k | grep nvme? That should filter the system logs for kernel messages containing the string "nvme". There may be better search strings to try; this is just where I would start.
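
                      A couple of variations might also be worth a look: -b -1 reads the previous boot's log (which is where a failed mount from the last startup should show up), and -p err filters for error-level messages.

                      sudo journalctl -k -b -1 | grep -iE 'nvme|mount'
                      sudo journalctl -b -1 -p err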

                        infinitymdm On my laptop there is additional weird behaviour occurring. Namely, it is failing to mount the /home partition on startup around 50% of the time. If I just reboot after the error, it will boot just fine most times, but sometimes it takes multiple attempts. Here is an image of the error:
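
                        (If a text capture is more useful than the screenshot, I believe the same failure should show up in the previous boot's journal; something like this should pull it, assuming systemd generates a home.mount unit from the fstab entry:)

                        sudo journalctl -b -1 -u home.mount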

                        And yes, to be clear, the other error is still persisting; this is as it appears on my laptop:

                         [✗] Updating clr-boot-manager                                           failed
                        
                        A copy of the command output follows:
                        
                        [FATAL] cbm (../src/bootman/bootman.c:L562): FATAL: Cannot mount boot device /dev/nvme0n1p1 on /boot: No such device
                        
                        
                         [✗] Updating clr-boot-manager                                           failed
                        
                        A copy of the command output follows:
                        
                        [FATAL] cbm (../src/bootman/bootman.c:L562): FATAL: Cannot mount boot device /dev/nvme0n1p1 on /boot: No such device
                        
                        
                         [✓] Running depmod on kernel 6.9.8-294.current                         success

                        Here is the output of sudo journalctl -k | grep nvme on my laptop (I will also respond with my desktop's output in a few minutes):

                        Jul 19 15:42:35 solus kernel: nvme nvme0: 8/0/0 default/read/poll queues
                        Jul 19 15:42:35 solus kernel:  nvme0n1: p1 p2 p3 p4
                        Jul 19 15:42:37 solus kernel: EXT4-fs (nvme0n1p3): mounted filesystem 9aa71c0a-9d55-4f79-b370-a3f554f8eb80 r/w with ordered data mode. Quota mode: none.
                        Jul 19 15:42:35 solus kernel: nvme nvme0: pci function 0000:3e:00.0
                        Jul 19 15:42:35 solus kernel: nvme nvme0: 8/0/0 default/read/poll queues
                        Jul 19 15:42:35 solus kernel:  nvme0n1: p1 p2 p3 p4
                        Jul 19 15:42:37 solus kernel: EXT4-fs (nvme0n1p3): mounted filesystem 9aa71c0a-9d55-4f79-b370-a3f554f8eb80 r/w with ordered data mode. Quota mode: none.
                        Jul 19 15:42:38 matt-solus-t480s kernel: EXT4-fs (nvme0n1p3): re-mounted 9aa71c0a-9d55-4f79-b370-a3f554f8eb80 r/w. Quota mode: none.
                        Jul 19 15:42:38 matt-solus-t480s kernel: Adding 11718652k swap on /dev/nvme0n1p2.  Priority:-2 extents:1 across:11718652k SS
                        Jul 19 15:42:40 matt-solus-t480s kernel: EXT4-fs (nvme0n1p4): mounted filesystem fa0ff32f-4b57-468f-981f-e79f6fed9aa7 r/w with ordered data mode. Quota mode: none.

                        What I find most bizarre is that the error is almost exactly replicated on both of my systems.

                        infinitymdm One additional weird thing occurring on my desktop is that I will sometimes be unable to log in from the lock screen after the computer has gone into standby mode. The password field will not accept any input. This behaviour stops if I log out of the account and then log back into the system. It happens when the system is left in standby for longer than 6 hours, so I had just switched to powering off my system when this behaviour occasionally flares up (usually it will happen a few days in a row and then I will switch to powering the system down). I don't think this would be related.

                        Here is the error as it has been appearing on my desktop system:

                         [✓] Updating dynamic library cache                                     success
                         [✗] Updating clr-boot-manager                                           failed
                        
                        A copy of the command output follows:
                        
                        [FATAL] cbm (../src/bootman/bootman.c:L562): FATAL: Cannot mount boot device /dev/nvme0n1p1 on /boot: No such device
                        
                        
                         [✗] Updating clr-boot-manager                                           failed
                        
                        A copy of the command output follows:
                        
                        [FATAL] cbm (../src/bootman/bootman.c:L562): FATAL: Cannot mount boot device /dev/nvme0n1p1 on /boot: No such device
                        
                        
                         [✗] Updating clr-boot-manager                                           failed
                        
                        A copy of the command output follows:
                        
                        [FATAL] cbm (../src/bootman/bootman.c:L562): FATAL: Cannot mount boot device /dev/nvme0n1p1 on /boot: No such device
                        
                        
                         [✓] Running depmod on kernel 6.9.8-294.current                         success
                         [✓] Updating hwdb                                                      success
                         [✓] Updating system users                                              success
                         [✓] Updating systemd tmpfiles                                          success
                         [✓] Reloading systemd configuration                                    success
                         [✓] Re-starting vendor-enabled .socket units                           success
                         [✓] Compiling and Reloading AppArmor profiles                          success
                         [✓] Updating manpages database                                         success
                         [✓] Reloading udev rules                                               success
                         [✓] Applying udev rules                                                success

                        and the output of sudo journalctl -k | grep nvme on my desktop system:

                        Jul 19 16:01:31 solus kernel: nvme nvme0: allocated 64 MiB host memory buffer.
                        Jul 19 16:01:31 solus kernel: nvme nvme0: 6/0/0 default/read/poll queues
                        Jul 19 16:01:31 solus kernel:  nvme0n1: p1 p2 p3 p4 p5
                        Jul 19 16:01:33 solus kernel: EXT4-fs (nvme0n1p2): mounted filesystem 2456cde0-a7e1-4af1-99ca-c30eb65f868a r/w with ordered data mode. Quota mode: none.
                        Jul 19 16:01:31 solus kernel: nvme nvme0: pci function 0000:03:00.0
                        Jul 19 16:01:31 solus kernel: nvme nvme0: allocated 64 MiB host memory buffer.
                        Jul 19 16:01:31 solus kernel: nvme nvme0: 6/0/0 default/read/poll queues
                        Jul 19 16:01:31 solus kernel:  nvme0n1: p1 p2 p3 p4 p5
                        Jul 19 16:01:33 solus kernel: EXT4-fs (nvme0n1p2): mounted filesystem 2456cde0-a7e1-4af1-99ca-c30eb65f868a r/w with ordered data mode. Quota mode: none.
                        Jul 19 16:01:34 matt-solus-desktop kernel: EXT4-fs (nvme0n1p2): re-mounted 2456cde0-a7e1-4af1-99ca-c30eb65f868a r/w. Quota mode: none.
                        Jul 19 16:01:34 matt-solus-desktop kernel: Adding 31250428k swap on /dev/nvme0n1p4.  Priority:-2 extents:1 across:31250428k SS
                        Jul 19 16:01:34 matt-solus-desktop kernel: EXT4-fs (nvme0n1p5): mounted filesystem 81b8fa97-0b82-4375-a08a-6d7ec6daa7af r/w with ordered data mode. Quota mode: none.
                        Jul 19 16:01:35 matt-solus-desktop kernel: EXT4-fs (nvme0n1p3): mounted filesystem 168e8227-9af4-4f3a-b2be-3bdf0875dece r/w with ordered data mode. Quota mode: none.
                        Jul 19 16:01:36 matt-solus-desktop kernel: block nvme0n1: No UUID available providing old NGUID

                          Matt_Nico Now I am running into the same issue with the /home partition on my desktop computer. I snapped a pic of it as well:

                          It seems to be identical to the issue which is present on the laptop. So now I can say that both systems are exhibiting the exact same set of errors.

                          I have not yet applied the W39 updates as I do not know what will happen when I do. Should I just go for it?