Linux is making my SSD really hot compared to Windows, and my BIOS says the S.M.A.R.T. status is bad after using Linux long enough

The other way around is just "N/A", so it doesn't really matter anyway, but thanks for noticing it. I thought "N/A" was like undefined, so it would be sorted lowest, but the app is sorting it highest for some reason?
That's fine then, you're not having an I/O issue.

Also, you're saying my reports are incorrect and invalid in the first screenshot, but if you look at the Reddit post, you will see the same command, where the sensor is around 55-60 degrees.
The software is bad. It reports the max temp as 65K°C, which is insane; we're on planet Earth, not Venus :)
It also reports the minimum temp as -273°C, which is equally insane. At these temps your drive should either melt or totally freeze.

The only value that's probably correct is "Composite" but I would not trust the software with such crazy min/max numbers.

Can you please run smartctl and tell us what it says?
 


Bash:
sudo smartctl -a /dev/nvme1n1
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.65-1-lts] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       KINGSTON SNV2S500G
Serial Number:                      50026B7382C34F49
Firmware Version:                   CRTP3011
PCI Vendor/Subsystem ID:            0x2646
IEEE OUI Identifier:                0x0026b7
Total NVM Capacity:                 500.107.862.016 [500 GB]
Unallocated NVM Capacity:           0
Controller ID:                      0
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          500.107.862.016 [500 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            0026b7 382c34f495
Local Time is:                      Fri Dec 20 17:07:40 2024 +07
Firmware Updates (0x02):            1 Slot
Optional Admin Commands (0x0016):   Format Frmw_DL Self_Test
Optional NVM Commands (0x0016):     Wr_Unc DS_Mngmt Sav/Sel_Feat
Log Page Attributes (0x02):         Cmd_Eff_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     60 Celsius
Critical Comp. Temp. Threshold:     65 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     4.00W       -        -    0  0  0  0        1       1
 1 +     4.00W       -        -    1  1  1  1       10      10
 2 +     4.00W       -        -    2  2  2  2       50      50
 3 -     0.50W       -        -    3  3  3  3    10000    5000
 4 -     0.50W       -        -    4  4  4  4    35000  175000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        55 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    2.400.128 [1,22 TB]
Data Units Written:                 1.880.648 [962 GB]
Host Read Commands:                 12.515.565
Host Write Commands:                9.991.776
Controller Busy Time:               14
Power Cycles:                       81
Power On Hours:                     283
Unsafe Shutdowns:                   13
Media and Data Integrity Errors:    0
Error Information Log Entries:      44
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               91 Celsius
Temperature Sensor 2:               70 Celsius
Temperature Sensor 3:               70 Celsius

Error Information (NVMe Log 0x01, 4 of 4 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message
  0         44     0  0x1015  0x4004      -            0     0     -  Invalid Field in Command

Read Self-test Log failed: Invalid Field in Command (0x002)



Here you go.
 
That looks better.



A little high, but not that bad.
Yeah, it's running higher than normal; normal temp is somewhere around 25-35°C with good cooling.
But 55°C is not too extreme IMO, and I think the drive won't suffer damage.

It might be due to drivers not handling cooling properly.

So there's no way to fix it? I mean, the composite sensor is at 55, but what about Sensor 1, which is around 90?

If there's truly no way to fix it, I'll really have to believe it won't do any damage to the drive :(
 
How is your drive formatted? Which file system and which distro are you using?
Is Linux installed on the drive that reports 55°C, or is it used as a backup drive?
 
What I find really weird, and correct me if I'm wrong, is this is also happening when you're using Linux in a live environment. There should be little to no disk activity during this, unless you're actively mounting the drive and using it to save or read.

Everything else should be in RAM.

If it's a live USB/DVD/CD then the only activity the drive should see is when you write something to it or read something off of it. You should be able to unplug that drive and still have the same experience. Not even temp files should be written to the drive.
 
This is all from past experience; like I said, I don't have Linux installed at the moment. Tomorrow I will reinstall and answer again.

But my drive was ext4. I used Debian, EndeavourOS and Arch, and it was ext4 every time.
Is Linux installed on the drive that reports 55°C, or is it used as a backup drive?
Yes and no. I installed Debian on the other drive and it was still hot; I installed EndeavourOS and Arch on the hot drive and it was still hot.

I will have to reinstall to get more info for you guys.

I never thought this website would have such good support, so thanks for all your help. It's already 3 AM in my country, so I would like to continue with this problem later.

Thanks so much for helping, but I have to rest; I'm pretty tired right now.
 
Yes, you are correct, the SSD gets hotter and hotter even when I'm in a live environment. That's why it feels weird haha.
 
I just looked at my buddy's page (you know him as Google) and the extremes for operating temps would be:

-40°C to 85°C

However, the ideal range is 0°C to 70°C.

That's a generalized answer.

Which is to say, you're probably just fine.

It is still very weird.
 
"Usually" how do you like that word usually?

When you run these commands it will show thresholds.
These are set by the manufacturer of the device.

root@absTower:~# smartctl -a /dev/nvme0n1 | grep Thres
Warning Comp. Temp. Threshold: 90 Celsius
Critical Comp. Temp. Threshold: 94 Celsius
Available Spare Threshold: 10%

My understanding is that the warning temp is OK for a short period of time, but how long that period is seems to vary.
The critical temp means you had better shut down pretty quickly, within a few seconds if possible.
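As a rough sketch of how the comparison against those thresholds could be automated: the helper below reads `smartctl -a` output on stdin and reports whether the composite temperature is below, at or above the drive's own warning and critical thresholds. The function name `nvme_temp_status` is made up for this example, and the parsing assumes the NVMe output layout shown in this thread.

```shell
# Hypothetical helper: classify the composite temperature found in
# "smartctl -a" output (read from stdin) against the drive's own thresholds.
# Prints one of: ok, warning, critical, unknown.
nvme_temp_status() {
    awk '
        /^Warning +Comp\. Temp\. Threshold:/  { warn = $(NF-1) }
        /^Critical +Comp\. Temp\. Threshold:/ { crit = $(NF-1) }
        /^Temperature:/                       { temp = $(NF-1) }
        END {
            # Any missing value means the output did not look as expected.
            if (temp == "" || warn == "" || crit == "") { print "unknown"; exit }
            if (temp + 0 >= crit + 0)      print "critical"
            else if (temp + 0 >= warn + 0) print "warning"
            else                           print "ok"
        }
    '
}

# Example use (needs root and smartmontools; the device name is an assumption):
#   sudo smartctl -a /dev/nvme0n1 | nvme_temp_status
```

Only the composite "Temperature:" line is compared, since the thresholds smartctl prints are labelled as composite thresholds; the individual "Temperature Sensor N" lines are ignored.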
 
Yes, you are correct, the SSD gets hotter and hotter even when I'm in a live environment.
Therefore the problem is coming from the SSD itself.

Is it fitted with a heatsink, thermal pad, etc.?

This is in a laptop. Have the vents, front and back, been cleared out, with dust and cruft removed?

Is the CPU fan clean?

Etc.
 
Everything seems normal on Windows, so I don't think it's an SSD issue if Windows is fine with it.

And yes, I cleaned my laptop about 3 months ago, and to be honest I keep it really clean, so I don't think there's dust :) I clean my laptop about once a year. Last time I cleaned it, there was little to no dust, so with the way I use it, there won't be much.
 
1736917771794.png

I booted into a Linux live environment, used it for 20 minutes and checked the temps: around 50 degrees. Thinking that might be wrong, I went back to Windows, and it showed 50 degrees as well, but Windows then makes it cooler and cooler. This is after I let Windows cool it down for 5-10 minutes:

1736918355278.png
 
So the 30C we are seeing there in Windows is after it has been sitting for some time?

These are my temps:
1736923849604.png

This is from CPU Temperature Indicator. It is an applet in Linux Mint: right-click the task bar (the "panel" in Linux language) and select Applets.

There are no nasty effects from these temps. They occasionally rise to 60C, but because the critical temp is 100C I have zero problems. The CPU and NVMe M.2 have been in place for 2+ years.

I will ask another 2 members here for their valued input. No action necessary from you.

@osprey
@GatorsFan
 
The temperatures reported in post #16 do seem odd, with ranges from -273.1C to 65261.8C, as pointed out by @CaffeineAddict in post #19 and post #21.

Unfortunately, the Linux software does behave in that odd manner, as shown here too:
Code:
[root@min ~]# sensors
<snip>
nvme-pci-0200
Adapter: PCI adapter
Composite:    +42.9°C  (low  =  -0.1°C, high = +84.8°C)
                       (crit = +94.8°C)
Sensor 1:     +42.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +52.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 8:     +42.9°C  (low  = -273.1°C, high = +65261.8°C)

nouveau-pci-0100
Adapter: PCI adapter
GPU core:    912.00 mV (min =  +0.80 V, max =  +1.19 V)
temp1:        +48.0°C  (high = +95.0°C, hyst =  +3.0°C)
                       (crit = +105.0°C, hyst =  +5.0°C)
                       (emerg = +135.0°C, hyst =  +5.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +33.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:        +30.0°C  (high = +80.0°C, crit = +100.0°C)
Core 4:        +31.0°C  (high = +80.0°C, crit = +100.0°C)
Core 8:        +32.0°C  (high = +80.0°C, crit = +100.0°C)
Core 12:       +30.0°C  (high = +80.0°C, crit = +100.0°C)
Core 16:       +32.0°C  (high = +80.0°C, crit = +100.0°C)
Core 20:       +30.0°C  (high = +80.0°C, crit = +100.0°C)
Core 24:       +31.0°C  (high = +80.0°C, crit = +100.0°C)
Core 25:       +31.0°C  (high = +80.0°C, crit = +100.0°C)
Core 26:       +31.0°C  (high = +80.0°C, crit = +100.0°C)
Core 27:       +31.0°C  (high = +80.0°C, crit = +100.0°C)
Core 28:       +31.0°C  (high = +80.0°C, crit = +100.0°C)
Core 29:       +30.0°C  (high = +80.0°C, crit = +100.0°C)
Core 30:       +30.0°C  (high = +80.0°C, crit = +100.0°C)
Core 31:       +30.0°C  (high = +80.0°C, crit = +100.0°C)

spd5118-i2c-0-51
Adapter: SMBus I801 adapter at efa0
temp1:        +35.2°C  (low  =  +0.0°C, high = +55.0°C)
                       (crit low =  +0.0°C, crit = +85.0°C)

Looking at the output, though, the temperatures of the GPU, CPU cores and SMBus device look quite reasonable and, prima facie, plausible on this well-running machine, despite the absurd-looking ranges shown for Sensors 1, 2 and 8, which are the same values as those in the output by @hoanghieubrant in post #16.

This suggests that the software has a problem with the ranges for NVMe disks, but maybe not with the current temperatures it is showing.
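One plausible explanation for those two odd limits, offered as an assumption rather than something confirmed in this thread: NVMe temperature thresholds are 16-bit values in Kelvin, so an unset low limit of 0 K and an unset high limit of 65535 K (0xFFFF) would convert to exactly the figures that sensors prints. The arithmetic can be checked with a small helper (the function name is made up for this sketch):

```shell
# Convert a temperature in whole Kelvin to millidegrees Celsius,
# using integer arithmetic only (0 C = 273.15 K).
kelvin_to_millicelsius() {
    echo $(( $1 * 1000 - 273150 ))
}

kelvin_to_millicelsius 0       # prints -273150   (i.e. -273.15 C)
kelvin_to_millicelsius 65535   # prints 65261850  (i.e. 65261.85 C)
```

If that reading is right, the -273.1°C and +65261.8°C values are simply "no limit set" placeholders for the individual sensors, not measurements, which would fit with the Composite line showing sensible limits.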

@hoanghieubrant reports the temperatures rising, and the outputs provided show that rising phenomenon when comparing the outputs in post #16 and post #22:
From post #16:
Sensor 1 +80.8C
Sensor 2 +40.9C
....
Sensor 3 +40.9C

From post #22:
Code:
Temperature Sensor 1:               91 Celsius
Temperature Sensor 2:               70 Celsius
Temperature Sensor 3:               70 Celsius

Sensor 1 appears to have risen to a near critical level of 91C where critical on the machine here looks like 94.8C (see output above). Sensor 1 on my motherboard appears to be sensing the nvme disk. These disks have their own internal sensors and although the adapter is mentioned in the output, it's not the adapter that is being measured.

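It's worth pointing out that smartctl and sensors report the same underlying temperature data from the drive. sensors reads it from the /sys filesystem, e.g. at /sys/class/hwmon/*, while smartctl, as far as I know, queries the drive's SMART/Health log directly. For example on the machine here: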
Code:
[root@min ~]# smartctl -a /dev/nvme0n1 | grep -i Temper
Temperature:                        43 Celsius
<snip>
Temperature Sensor 1:               43 Celsius
Temperature Sensor 2:               52 Celsius
Temperature Sensor 8:               43 Celsius
is outputting virtually the same values as:
Code:
[root@min ~]# sensors
<snip>
nvme-pci-0200
Adapter: PCI adapter
Composite:    +42.9°C  (low  =  -0.1°C, high = +84.8°C)
                       (crit = +94.8°C)
Sensor 1:     +42.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +52.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 8:     +42.9°C  (low  = -273.1°C, high = +65261.8°C)
The tiny differences can be explained by the slightly different times the programs were run, one after the other, or rounding by the software.
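To look at the raw values behind the sensors output, something like the following sketch could be used. It walks the hwmon directories, picks the ones whose `name` file says `nvme`, and prints each temperature input with its label. The function name and the optional root-directory argument are inventions for this example; the `tempN_input` and `tempN_label` files hold millidegrees Celsius and the sensor name respectively.

```shell
# Hypothetical helper: list NVMe temperatures straight from hwmon in /sys.
# $1 is an optional hwmon root (default /sys/class/hwmon).
nvme_hwmon_temps() {
    root=${1:-/sys/class/hwmon}
    for dev in "$root"/hwmon*; do
        [ -r "$dev/name" ] || continue
        [ "$(cat "$dev/name")" = "nvme" ] || continue
        for input in "$dev"/temp*_input; do
            [ -r "$input" ] || continue
            # Prefer the human-readable label (e.g. "Composite") if present.
            label_file=${input%_input}_label
            if [ -r "$label_file" ]; then
                label=$(cat "$label_file")
            else
                label=$(basename "${input%_input}")
            fi
            milli=$(cat "$input")
            # Integer division: fractions of a degree are dropped.
            echo "$label: $(( milli / 1000 )) C"
        done
    done
}
```

Running `nvme_hwmon_temps` with no argument reads the live system. Because the division truncates, a raw value of 42850 shows as 42 C rather than the +42.9°C that sensors displays.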

The data suggests that Sensor 1 is sensing the temperature of the nvme disk itself in the output from the sensors command, both at 42.9C, on this particular motherboard. The Sensors may vary on different motherboards as to what they are sensing. One needs to read the motherboard documentation to be certain.

In the output of temperatures in post #16 from @hoanghieubrant, the nvme disk is shown at 38.0C, but none of the Sensors show the same, so it's difficult to surmise what Sensors 1, 2 and 3 are measuring, unlike the motherboard here.

At this point my suggestion would be to install the Linux distribution and configure the system to shut down automatically when a certain temperature threshold is reached, to prevent overheating. Since there doesn't appear to be a built-in, direct command for setting an automatic shutdown temperature, one is left with achieving the functionality through scripting, using the temperature monitoring tools and system management commands. Some options for running such a script are a systemd unit written for the purpose, a cron job, or /etc/rc.local.

If such a script were created and run, and the Linux system never shut itself down, then there's probably no problem. Such a script would be a safe way to protect things and is possibly a reasonable test for this particular problem.
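A minimal sketch of what such a watchdog script might look like is below. It is not a tested production script: the device path, the 65°C limit (taken from the Critical Comp. Temp. Threshold in the smartctl output in post #22) and the 30-second polling interval are all illustrative assumptions, and the function names are made up. It needs root and smartmontools.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: power off when the NVMe composite temperature
# reaches a limit. DEVICE, LIMIT_C and INTERVAL_S are illustrative defaults.
DEVICE=${DEVICE:-/dev/nvme0n1}
LIMIT_C=${LIMIT_C:-65}
INTERVAL_S=${INTERVAL_S:-30}

# Extract the composite temperature (whole degrees Celsius) from
# "smartctl -a" output read on stdin.
composite_temp() {
    awk '/^Temperature:/ { print $(NF-1); exit }'
}

# Succeed (exit status 0) when temperature $1 is at or above limit $2.
over_limit() {
    [ "$1" -ge "$2" ]
}

# Poll the drive and shut down once the limit is reached.
watch_nvme_temp() {
    while :; do
        temp=$(smartctl -a "$DEVICE" | composite_temp)
        if [ -n "$temp" ] && over_limit "$temp" "$LIMIT_C"; then
            logger "NVMe at ${temp}C (limit ${LIMIT_C}C), shutting down"
            shutdown -h now
            break
        fi
        sleep "$INTERVAL_S"
    done
}

# Uncomment to start monitoring, e.g. when run from a systemd unit:
# watch_nvme_temp
```

The loop is left commented out so the file can be sourced and the helpers tried by hand first, for example with `sudo smartctl -a /dev/nvme0n1 | composite_temp`, before trusting it to power the machine off.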
 
I use the lm-sensors temperature reader; this is the result on my Latitude with Mint LMDE 6 [it's a bit long]:
Adapter: ISA adapter
in0: 5.00 V (min = +5.00 V, max = +5.00 V)
curr1: 0.00 A (max = +0.00 A)

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1: +30.0°C

pch_skylake-virtual-0
Adapter: Virtual device
temp1: +37.0°C

nvme-pci-0300
Adapter: PCI adapter
Composite: +25.9°C (low = -273.1°C, high = +78.8°C)
(crit = +83.8°C)
Sensor 1: +25.9°C (low = -273.1°C, high = +65261.8°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1: +25.0°C (crit = +107.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +44.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +44.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +40.0°C (high = +100.0°C, crit = +100.0°C)
Core 2: +42.0°C (high = +100.0°C, crit = +100.0°C)
Core 3: +43.0°C (high = +100.0°C, crit = +100.0°C)

dell_smm-isa-0000
Adapter: ISA adapter
fan1: 0 RPM (min = 0 RPM, max = 5300 RPM)
temp1: +44.0°C
temp2: +36.0°C
temp3: +33.0°C
temp4: +28.0°C

BAT0-acpi-0
Adapter: ACPI interface
in0: 7.54 V
curr1: 3.54 A
 
And this is the short report for my desktop. NOTE: neither machine has a heatsink on the NVMe.
acpitz-acpi-0
Adapter: ACPI interface
temp1: +27.8°C (crit = +97.0°C)
temp2: +29.8°C (crit = +97.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +41.0°C (high = +86.0°C, crit = +92.0°C)
Core 0: +34.0°C (high = +86.0°C, crit = +92.0°C)
Core 1: +36.0°C (high = +86.0°C, crit = +92.0°C)
Core 2: +39.0°C (high = +86.0°C, crit = +92.0°C)
Core 3: +35.0°C (high = +86.0°C, crit = +92.0°C)

nvme-pci-0100
Adapter: PCI adapter
Composite: +39.9°C (low = -273.1°C, high = +82.8°C)
(crit = +84.8°C)

If your NVMe or CPU is running hot and the surrounding airflow is not clogged, then I would look for erroneous applications thrashing away in the background and, in the case of the NVMe, additionally consider a fault. But if both are OK in Windows, I fall back on apps running in the background.
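One way to check for background disk activity without extra tools is to sample /proc/diskstats twice and compare. The sketch below does that; the function names and the default 10-second interval are assumptions for this example, and the device name (e.g. nvme0n1) has to match your system.

```shell
# Print "<sectors_read> <sectors_written>" for block device $1, parsed from a
# diskstats-format file ($2, default /proc/diskstats).
disk_sectors() {
    awk -v dev="$1" '$3 == dev { print $6, $10 }' "${2:-/proc/diskstats}"
}

# Print KiB read and written on device $1 over $2 seconds (default 10).
io_over_interval() {
    dev=$1
    secs=${2:-10}
    before=$(disk_sectors "$dev")
    sleep "$secs"
    after=$(disk_sectors "$dev")
    # diskstats counts 512-byte sectors, so 2 sectors = 1 KiB.
    echo "$before $after" | awk '{
        printf "read: %d KiB, written: %d KiB\n", ($3 - $1) / 2, ($4 - $2) / 2
    }'
}

# Example: io_over_interval nvme0n1 30
```

If the numbers stay near zero while the drive keeps warming up, background applications are probably not the cause. If they are not near zero, a per-process view such as `sudo iotop -o` (where installed) can help find which program is responsible.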
 
Another thought: if this is in a desktop, is it next to the GPU? If so, can one or the other be moved?
 

