My system sometimes freezes for up to a minute - ways to diagnose or even solutions? (Lenovo T15g G2, Debian 12)

rhialto

Hello there,

Last year I bought a Lenovo ThinkPad T15g Gen 2 laptop. It came with Windows 10, which I used for a few hours, mainly to gauge performance in some games and applications, before I installed Linux.
Between August and November I had used Ubuntu 21.04 (since I was waiting for a newer kernel on Debian which fully supported bluetooth on the Intel AX210 wifi card) and after that installed Debian 12 (Bookworm, current testing version) like I usually do. The issue I am about to describe did not happen on Windows and happened on both Linux distributions.

Sometimes the system freezes. It does not freeze completely, since I can still ping the machine and ssh to it when it happens. I still hear sound and I can move the mouse around. But the system does not accept key presses. I think I cannot use Magic SysRq, I cannot switch to a virtual terminal and I cannot interact with any windows. The X server seems to halt, too. The clock in KDE stops ticking and everything stops moving. But like I said, the system does not freeze completely. It's also not permanent: I've never had a freeze from which it did not recover.
MOST of the time this happens when I start a game on Steam (doesn't matter whether it's native or uses Proton) and it also happens when I try to use wine with any application.

The problem is that I have no idea what's causing this or how to find out what the culprit might be. This system has two M.2 drives installed. Is it one of those, i.e. is it maybe just bad firmware? I read that it could happen. Or is this related to the dedicated NVIDIA RTX 3070 that's in this laptop? I am running it in hybrid mode, i.e. X is running on the integrated Intel video card but I can run any application using the dedicated video card.

This issue is not a dealbreaker (I've been living with it for a year now), since it mostly happens when I start a game or an application using wine; it never happens once something is already running. I think I've had a few occurrences where it happened for some other reason, but those were so rare that I cannot pinpoint why.

So yeah, does anyone have any ideas on how to debug this, or even suggestions as to what the cause might be?

Thanks and cheers
 


Without any logs or system info, it will be hard to tell. Can you post the output of inxi -Fxxz here, please? Also check sudo journalctl -b 0 for any segfaults occurring during the freeze. Thanks...
 
Thank you for your message!

Sure thing, the output of inxi -Fxxz is:
Code:
System:
  Kernel: 5.18.0-3-amd64 arch: x86_64 bits: 64 compiler: gcc v: 11.3.0 Console: pty pts/4 DM:
    1: GDM3 2: SDDM Distro: Debian GNU/Linux bookworm/sid
Machine:
  Type: Laptop System: LENOVO product: 20YS0004GE v: ThinkPad T15g Gen 2i serial: <filter>
    Chassis: type: 10 serial: <filter>
  Mobo: LENOVO model: 20YS0004GE v: SDK0J40697 WIN serial: <filter> UEFI: LENOVO v: N37ET37W
    (1.18 ) date: 12/24/2021
Battery:
  ID-1: BAT0 charge: 92.1 Wh (100.0%) condition: 92.1/94.0 Wh (98.0%) volts: 12.9 min: 11.5
    model: Celxpert 5B10W13959 serial: <filter> status: not charging
CPU:
  Info: 8-core model: 11th Gen Intel Core i7-11800H bits: 64 type: MT MCP arch: Tiger Lake rev: 1
    cache: L1: 640 KiB L2: 10 MiB L3: 24 MiB
  Speed (MHz): avg: 1517 high: 1915 min/max: 800/4600 cores: 1: 1278 2: 1048 3: 1819 4: 1614
    5: 1627 6: 1433 7: 1416 8: 1794 9: 1249 10: 1915 11: 1636 12: 1352 13: 1568 14: 1495 15: 1693
    16: 1344 bogomips: 73728
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
  Device-1: Intel TigerLake-H GT1 [UHD Graphics] vendor: Lenovo driver: i915 v: kernel
    arch: Gen-12.1 ports: active: eDP-1 empty: DP-1, DP-2, DP-3, DP-4, DP-5, HDMI-A-1
    bus-ID: 00:02.0 chip-ID: 8086:9a60
  Device-2: NVIDIA GA104M [GeForce RTX 3070 Mobile / Max-Q] vendor: Lenovo driver: nvidia
    v: 470.129.06 arch: Ampere pcie: speed: 2.5 GT/s lanes: 16 ports: active: none off: HDMI-A-2
    empty: DP-6,DP-7,DP-8 bus-ID: 01:00.0 chip-ID: 10de:249d
  Device-3: Acer Integrated Camera type: USB driver: uvcvideo bus-ID: 3-4:3 chip-ID: 5986:212b
  Display: x11 server: X.org v: 1.21.1.4 with: Xwayland v: 22.1.3 compositor: kwin_x11 driver:
    X: loaded: modesetting,nvidia unloaded: fbdev,nouveau,vesa alternate: nv
    gpu: i915,nvidia,nvidia-nvswitch tty: 236x52
  Monitor-1: HDMI-A-2 model: NEC EA244WMi res: 1920x1200 dpi: 94 diag: 612mm (24.1")
  Monitor-2: eDP-1 model: BOE Display 0x086e res: 1920x1080 dpi: 142 diag: 395mm (15.5")
  Message: GL data unavailable in console for root.
Audio:
  Device-1: Intel Tiger Lake-H HD Audio vendor: Lenovo driver: sof-audio-pci-intel-tgl
    bus-ID: 3-7.3:10 chip-ID: 0b05:183c
  Device-2: NVIDIA GA104 High Definition Audio vendor: Lenovo driver: snd_hda_intel v: kernel
    pcie: speed: 2.5 GT/s lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:228b
  Device-3: ASUSTek Xonar U7 MKII type: USB driver: hid-generic,snd-usb-audio,usbhid
  Sound Server-1: ALSA v: k5.18.0-3-amd64 running: yes
  Sound Server-2: JACK v: 1.9.21 running: no
  Sound Server-3: PulseAudio v: 15.0 running: no
  Sound Server-4: PipeWire v: 0.3.56 running: yes
Network:
  Device-1: Intel Wi-Fi 6 AX210/AX211/AX411 160MHz driver: iwlwifi v: kernel pcie: speed: 5 GT/s
    lanes: 1 bus-ID: 09:00.0 chip-ID: 8086:2725
  IF: wlp9s0 state: up mac: <filter>
  Device-2: Intel Ethernet I225-V vendor: Lenovo driver: igc v: kernel pcie: speed: 5 GT/s
    lanes: 1 port: N/A bus-ID: 0b:00.0 chip-ID: 8086:15f3
  IF: enp11s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Bluetooth:
  Device-1: Intel AX210 Bluetooth type: USB driver: btusb v: 0.8 bus-ID: 3-14:7
    chip-ID: 8087:0032
  Report: hciconfig ID: hci0 rfk-id: 1 state: up address: <filter> bt-v: 3.0 lmp-v: 5.2
    sub-v: 3756
Drives:
  Local Storage: total: 3.19 TiB used: 2.87 TiB (89.9%)
  ID-1: /dev/nvme0n1 vendor: KIOXIA model: N/A size: 476.94 GiB speed: 63.2 Gb/s lanes: 4
    serial: <filter> temp: 46.9 C
  ID-2: /dev/nvme1n1 vendor: Samsung model: SSD 980 1TB size: 931.51 GiB speed: 31.6 Gb/s
    lanes: 4 serial: <filter> temp: 42.9 C
  ID-3: /dev/sda type: USB vendor: Inateck model: NS1066 size: 1.82 TiB serial: <filter>
Partition:
  ID-1: / size: 466.95 GiB used: 372.21 GiB (79.7%) fs: ext4 dev: /dev/nvme0n1p2
  ID-2: /boot/efi size: 511 MiB used: 29.9 MiB (5.8%) fs: vfat dev: /dev/nvme0n1p1
Swap:
  ID-1: swap-1 type: partition size: 976 MiB used: 79.5 MiB (8.1%) priority: -2
    dev: /dev/nvme0n1p3
Sensors:
  System Temperatures: cpu: 50.0 C mobo: N/A
  Fan Speeds (RPM): N/A
Info:
  Processes: 433 Uptime: 2h 25m Memory: 31.08 GiB used: 9.54 GiB (30.7%) Init: systemd v: 251
  target: graphical (5) default: graphical Compilers: gcc: 11.3.0 alt: 10/11 Packages: 5879
  note: see --pkg apt: 5851 flatpak: 28 Shell: Bash v: 5.1.16 running-in: pty pts/4 inxi: 3.3.20

A quick way I found to reproduce this is to run openttd. My computer then freezes for about 3 seconds.
Code:
Aug 08 10:17:33 hostname rtkit-daemon[1227]: Supervising 19 threads of 14 processes of 1 users.
Aug 08 10:17:33 hostname rtkit-daemon[1227]: Supervising 19 threads of 14 processes of 1 users.
Aug 08 10:17:33 hostname rtkit-daemon[1227]: Supervising 19 threads of 14 processes of 1 users.
Aug 08 10:17:33 hostname rtkit-daemon[1227]: Successfully made thread 25283 of process 25264 owned by '1000' high priority at nice level -15.
Aug 08 10:17:33 hostname rtkit-daemon[1227]: Supervising 20 threads of 15 processes of 1 users.
Aug 08 10:17:34 hostname plasmashell[2190]: qml: temp unit: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25698, resource id: 178257928, major code: 19 (DeleteProperty), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25710, resource id: 178257928, major code: 19 (DeleteProperty), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25711, resource id: 178257928, major code: 18 (ChangeProperty), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25712, resource id: 178257928, major code: 19 (DeleteProperty), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25713, resource id: 178257928, major code: 19 (DeleteProperty), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25714, resource id: 178257928, major code: 19 (DeleteProperty), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25715, resource id: 178257928, major code: 7 (ReparentWindow), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25716, resource id: 178257928, major code: 6 (ChangeSaveSet), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25717, resource id: 178257928, major code: 2 (ChangeWindowAttributes), minor code: 0
Aug 08 10:17:35 hostname kwin_x11[2096]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 25718, resource id: 178257928, major code: 10 (UnmapWindow), minor code: 0

I ran openttd at 10:17:30 and the clock stood frozen until 10:17:33. All these messages appear later and I'm not sure they are related.
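
Since the messages above are stamped after the freeze ended, it may help to pull everything the journal recorded in a short window around the moment the freeze began, including kernel messages. The following is a minimal sketch, not a definitive procedure: it only builds a `journalctl` command line using the standard `-b`, `--since`, `--until`, `--no-pager` and `-k` options. The function name `journal_window_cmd`, the window sizes and the year 2022 are assumptions made for illustration.

```python
import shlex
from datetime import datetime, timedelta


def journal_window_cmd(freeze_start, before_s=10, after_s=30, kernel_only=False):
    """Build a journalctl command covering a window around a freeze.

    freeze_start is the wall-clock time at which the freeze began.
    The window sizes are arbitrary defaults, not recommendations.
    """
    start = freeze_start - timedelta(seconds=before_s)
    end = freeze_start + timedelta(seconds=after_s)
    fmt = "%Y-%m-%d %H:%M:%S"  # timestamp format accepted by --since/--until
    cmd = [
        "journalctl", "-b", "0", "--no-pager",
        "--since", start.strftime(fmt),
        "--until", end.strftime(fmt),
    ]
    if kernel_only:
        cmd.append("-k")  # kernel ring buffer messages only, like dmesg
    return cmd


if __name__ == "__main__":
    # Assumed date: the log lines only say "Aug 08", the year is a guess.
    freeze = datetime(2022, 8, 8, 10, 17, 30)
    print(shlex.join(journal_window_cmd(freeze)))
    print(shlex.join(journal_window_cmd(freeze, kernel_only=True)))
```

The printed commands can be pasted into a terminal (with sudo if needed). Comparing the full output with the kernel-only output shows whether a driver logged anything while the clock was frozen.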

I also don't know what other logs to look at to figure out this problem. I recently saw a talk about ftrace and it could probably help but I have a feeling that with something that microscopic it would take me weeks.
In the case of openttd I could probably also compile it myself and attach a debugger, but I'm not sure whether that would get me anywhere, since the problem is not tied to any specific application and seems rather related to a piece of hardware or firmware.
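
Because ssh keeps working during a freeze, one lightweight check short of ftrace is a stall detector started from an ssh session: a loop that sleeps for a short interval and reports when it wakes up much later than expected. This is only a sketch under assumptions (the names `find_stalls` and `watch`, the 0.1 s interval and the 0.5 s threshold are all made up for illustration). If it reports nothing while the desktop is frozen, ordinary processes were still being scheduled, which would point at the graphical session rather than at the drives or the kernel as a whole.

```python
import time


def find_stalls(timestamps, interval, threshold):
    """Return (start, gap) pairs where two consecutive samples are
    further apart than interval + threshold."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > interval + threshold:
            stalls.append((prev, gap))
    return stalls


def watch(duration_s=60.0, interval=0.1, threshold=0.5):
    """Sleep in a loop for duration_s seconds and print a line whenever
    this process woke up noticeably later than it asked to."""
    deadline = time.monotonic() + duration_s
    prev = time.monotonic()
    while prev < deadline:
        time.sleep(interval)
        now = time.monotonic()
        for _, gap in find_stalls([prev, now], interval, threshold):
            late = gap - interval
            print(f"{time.strftime('%H:%M:%S')} woke up {late:.2f}s late", flush=True)
        prev = now
```

To use it, save the block to a file, start `python3 -i` on that file over ssh, call `watch()`, and then launch openttd on the desktop while it runs.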
 
I'm still not sure what's going on with your kwin and the 'bad window' errors; I don't get those lines when I start OpenTTD (yes, I'm also a fan of this great game) :) . Searches on Google revealed no useful info either.
Code:
Display: x11 server: X.org v: 1.21.1.4 with: Xwayland v: 22.1.3 compositor: kwin_x11 driver:
Are you running your system through Wayland instead of X11? Have you already tested Wayland instead of X11?
I'm guessing it must be somehow related to the graphics (Mesa) acceleration drivers, as it only occurs when playing games, on both video cards...
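
A quick way to confirm which display protocol a session actually uses is to look at its environment. The sketch below assumes the usual variables are set by the login manager: `XDG_SESSION_TYPE`, with `WAYLAND_DISPLAY` and `DISPLAY` as fallbacks. The function name `session_type` is made up for illustration. It has to be run from a terminal inside the graphical session, because over ssh the session type is typically reported as a tty.

```python
import os


def session_type(env=None):
    """Guess whether the current session is X11 or Wayland from its
    environment variables."""
    env = os.environ if env is None else env
    declared = env.get("XDG_SESSION_TYPE", "").lower()
    if declared in ("x11", "wayland"):
        return declared
    # Check WAYLAND_DISPLAY first: a Wayland session usually also sets
    # DISPLAY so that X11 applications can run through Xwayland.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


if __name__ == "__main__":
    print(session_type())
```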
 
Another idea, if you open a konsole and start openttd on the CLI, do you see any (error) messages after start of the game ?
Cheers, Eddy
 
Yeah, it's a great little game. Have been playing it since the mid 90s and I still keep coming back to it. :) There are no error messages when I start it. Probably because, well...

I think that was a good call, about X11. I love KDE so I'm still using X11 but the problem is probably related to that.
I quickly fired up Gnome using Wayland and lo and behold I did not witness a single freeze. The OpenTTD window opened instantly, no problem. I could start steam without a freeze and then I tried three games (art of rally, X4 and Noita) and they all loaded without problems. With X4 and art of rally Gnome complained that the game window probably froze (it didn't, it just took a while to load) but the rest of the system worked fine meanwhile.
I then tried the same with Gnome on X11 and I got the same issues as with KDE.

Which is a shame for me, because I doubt it will be fixed. I have a feeling that it might be somehow related to my Intel/NVIDIA card setup, since this whole Optimus thing has always been a bit wonky on Linux (right now there is also a bug, which has been reintroduced, where you cannot run anything in fullscreen mode because of a failed page flipping loop), and with the advent of Wayland I kinda doubt that something like that will ever be fixed.
And there are still quite a few things that stop KDE from working on Wayland.

Again, thanks for your tips. I think that's the problem and I'll probably have to live with it for a while.
Except if there are workarounds for that, of course. :D
 
