Linux makes my SSD run much hotter than Windows, and after running Linux long enough the BIOS reports a bad S.M.A.R.T. status

I thought the SSD was simply faulty, so I took it in under warranty, but they told me it was perfectly fine.
Did they perform any actual test?....I know they told you they did, but are you actually sure?

Who is "they"...a local retailer ?

When you have Linux reinstalled, be sure to install smartmontools. In Linux Mint you will find it in Software Manager. The short test should be enough.
Until then, the SMART tests are available when you boot from the USB stick: type "disks" in the menu, highlight the drive you are going to check, click the three vertical dots, and select SMART Data & Self-Tests. Make sure the NVMe drive is the one selected.
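If you would rather use the terminal than Disks, something like the sketch below should pull the same health and temperature info out of smartmontools. The device name is only an assumption on my part (check yours with lsblk), and extract_temps is a made-up helper name, so treat this as a starting point rather than the one right way to do it.

```shell
#!/bin/bash
# Sketch: pull the temperature readings out of `smartctl -A` output.
# The device is passed as an argument because its name differs per machine.

# Reads smartctl attribute output on stdin and prints "label=value" for each
# line that starts with "Temperature" and is reported in Celsius.
extract_temps() {
    awk -F: '/^Temperature/ && /Celsius/ {
        label = $1
        value = $2
        gsub(/^[ \t]+/, "", value)     # trim leading spaces
        sub(/ Celsius.*/, "", value)   # drop the unit
        print label "=" value
    }'
}

# Only query a drive when a device argument is given, for example:
#   sudo ./ssd-temp.sh /dev/nvme0     (hypothetical script name and device)
if [ -n "${1:-}" ]; then
    smartctl -H "$1"                    # overall health verdict
    smartctl -A "$1" | extract_temps    # current temperature(s)
fi
```

Running it a few times while the drive is idle and again under load should show whether the temperature really climbs under Linux.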
 


if the sensors are reporting erroneously (or potentially erroneously), I'd troubleshoot it by bypassing the sensors entirely and use a 3rd party sensor, like an infrared thermometer gun - crack the case open & use that to directly monitor the ssd's temp.

it's really odd that the sensors would report a significant temperature rise in linux but not in windows on the exact same hardware.
 
I don't have a thermometer gun, but I did take the cover off and touch the drive, and it really does feel hotter, so I guess it does get hotter.

I think I read somewhere that Kingston SSDs have some closed-source stuff that Linux cannot detect, and maybe that is why the temp is high? IDK, but yeah...
 
They took my drive, formatted it, put it in their own computer and tested it, and it all worked out. I told them to wait a bit to see if any problem showed up, and none did. They put it back in my laptop, I booted it into Windows, and nothing happened. I told them Linux made it hot; they told me they don't support Linux, so that was pointless.
 
Read Self-test Log failed: Invalid Field in Command (0x002)
This is a bug in smartmontools 7.4 itself and should be fixed when 7.5 is released, hopefully sometime later this year.

from your post #22
I would look at the main power supply. If it is not putting out enough power, that can cause things to heat up as well because of the draw, which means you would need to measure the amp draw.

There is a power supply calculator here - https://outervision.com/power-supply-calculator

Also, using an ammeter on the +12 V ATX four- or eight-pin cable will give you a number close to the amps the CPU is drawing. Multiply that by the voltage actually measured on that cable (nominally 12 V) and you have a good estimate of the wattage.

I have seen this on several occasions where a person upgrades some things but their old power supply does not put out enough power to drive the new gear and hence it starts to overheat - mostly from upgrading the cpu or gpu
 
@hoanghieubrant

At this point my suggestion would be to install the Linux distribution and configure the system to shut down automatically when a certain temperature threshold is reached, to prevent overheating. Since there doesn't appear to be a built-in, direct command for setting an automatic shutdown temperature, the functionality has to be scripted using the temperature monitoring tools and system management commands. Such a script could be run from a systemd unit written for the purpose, from a cron job, or from /etc/rc.local.

If such a script were created and run, and the Linux system never shut itself down, then there's probably no problem. Such a script would be a safe way to protect things and is possibly a reasonable test for this particular problem.
At least this suggestion is solid. It takes any guesswork etc out of the equation. It is beyond my abilities to make such a script/cron job for you.
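For the systemd option, a rough sketch of what the unit could look like is below, written as a small shell helper that generates the file. The names temp-guard.service and /usr/local/bin/temp-guard.sh are hypothetical, as is the write_unit helper itself; the monitoring script is assumed to already exist at the path you pass in.

```shell
#!/bin/bash
# Write a minimal systemd service file that runs a monitoring script.
#   $1 = path of the script to run (assumed to exist and be executable)
#   $2 = where to write the unit file
write_unit() {
    local script_path="$1" unit_file="$2"
    cat > "$unit_file" <<EOF
[Unit]
Description=Shut down when the temperature threshold is exceeded

[Service]
Type=simple
ExecStart=$script_path
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Usage, as root (hypothetical names):
#   write_unit /usr/local/bin/temp-guard.sh /etc/systemd/system/temp-guard.service
#   systemctl daemon-reload
#   systemctl enable --now temp-guard.service
```

A service started this way runs as root, so the script would not need sudo in front of its shutdown command.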
 
Does Mint have /etc/acpi/sleep.sh ?

Maybe something like this. I wouldn't have it check every 60 seconds but that's just me.
Code:
#!/bin/bash

# Set the temperature threshold in Celsius
TEMP_THRESHOLD=80

# Check the temperature every 60 seconds
while true; do
    # Get the current CPU package temperature (whole degrees) from lm-sensors
    TEMP=$(sensors | grep "Package id 0:" | cut -d "+" -f 2 | cut -d "." -f 1)

    # Shut down if a reading was found and it exceeds the threshold
    if [ -n "$TEMP" ] && [ "$TEMP" -gt "$TEMP_THRESHOLD" ]; then
        echo "Temperature exceeded threshold ($TEMP_THRESHOLD°C). Shutting down..."
        sudo shutdown -h now
    fi

    # Wait before the next check
    sleep 60
done
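
Note that the script reads the CPU package temperature, while the original complaint is about the SSD. On reasonably recent kernels the NVMe driver also exposes the drive temperature under /sys/class/hwmon, so a variant could read that instead. Rough sketch only: nvme_temp_c is a name I made up, and it assumes the usual hwmon layout (a name file containing nvme, and temp1_input in millidegrees Celsius).

```shell
#!/bin/bash
# Print the first NVMe drive temperature in whole degrees Celsius.
# Optional argument: hwmon root directory (defaults to /sys/class/hwmon).
nvme_temp_c() {
    local root="${1:-/sys/class/hwmon}" dir
    for dir in "$root"/hwmon*; do
        [ -r "$dir/name" ] || continue
        if [ "$(cat "$dir/name")" = "nvme" ] && [ -r "$dir/temp1_input" ]; then
            # temp1_input holds millidegrees Celsius
            echo $(( $(cat "$dir/temp1_input") / 1000 ))
            return 0
        fi
    done
    return 1   # no NVMe sensor found
}
```

In the loop, the sensors line could then be swapped for TEMP=$(nvme_temp_c), with a lower threshold that suits an SSD rather than a CPU.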


 
when you boot, do you see any acpi errors going past on the screen ?

Run this in terminal and post the result:

Code:
grep -Ev "^[ ]*0" /sys/firmware/acpi/interrupts/gpe?? | sort --field-separator=: --key=2 --numeric --reverse | head -1




Also....which Linux are you booting ?
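
In case it is not obvious what the grep one-liner does: each gpeXX file under /sys/firmware/acpi/interrupts starts with a counter of how many times that ACPI event has fired. The grep drops the entries with a zero count, the sort orders the rest by count, and head keeps the busiest one. Here is the same thing as a small function; top_gpe is just a name I picked, and -H is my addition so the file name is always printed.

```shell
#!/bin/bash
# Print the GPE entry with the highest interrupt count.
# Optional argument: directory to scan (defaults to the real sysfs path).
top_gpe() {
    local dir="${1:-/sys/firmware/acpi/interrupts}"
    # -v drops lines whose count is 0; -H prefixes each line with its file name
    grep -EvH "^[ ]*0" "$dir"/gpe?? 2>/dev/null |
        sort --field-separator=: --key=2 --numeric-sort --reverse |
        head -n 1
}

# Usage:
#   top_gpe
```

A count that keeps climbing quickly between two runs is the thing to look for; a small count that stays put is probably not the culprit.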
 
I will reinstall EndeavourOS and run this command, alright? Or do I only need to boot from a Linux live USB?
 
oh wow thanks so much, this is like a gift from you guys haha. I'll try it out using linux mint
Do you mean you will Install Linux Mint..... or only run it live?
You can try it in Live mode......I am not sure it will throw errors booting to Live....I have only encountered them when it is fully installed.

Try out the systemd script first and see what happens there.
 
Okay, then I think I'll install Linux Mint first and run all the scripts so that I can provide info. When that's finished I guess I will use EndeavourOS. Wait for me.
 
I now have Linux Mint installed and running. This is the result of the command:

Code:
/sys/firmware/acpi/interrupts/gpe6E:       9  EN     enabled      unmasked
 
Code:
echo disable | sudo tee /sys/firmware/acpi/interrupts/gpe6E

Run the above command.

I'm not certain this is going to help a great deal, but it will do no harm

Edit to Add: I have @osprey to thank for those commands. They got me out of a great deal of trouble approx 2 years ago.
My problem was related to very high cpu use....which of course induces heat problems.
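
One thing to watch when writing to files under /sys: in a command of the form sudo echo ... > file, the > redirection is carried out by your own shell, not by sudo, so from a normal user account the write is refused. Below is a small sketch of a helper that checks for write access first. disable_gpe is a name I invented, and gpe6E is simply the entry from this thread; yours may well differ.

```shell
#!/bin/bash
# Write "disable" to a GPE control file, after checking it is writable.
#   $1 = path to the GPE file, e.g. /sys/firmware/acpi/interrupts/gpe6E
disable_gpe() {
    local gpe_file="$1"
    if [ ! -w "$gpe_file" ]; then
        echo "cannot write $gpe_file (not root, or no such GPE)" >&2
        return 1
    fi
    echo disable > "$gpe_file"
}

# From a normal user account, either run the whole shell as root so the
# redirection itself has permission:
#   sudo bash -c 'echo disable > /sys/firmware/acpi/interrupts/gpe6E'
# or let tee do the writing:
#   echo disable | sudo tee /sys/firmware/acpi/interrupts/gpe6E
```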
 
Ohhh, it does seem to make the temps so much lower. Not quite as low as on Windows, but the temp is now around 36 degrees, with sensor 1 around 60 degrees, which is really similar to Windows.


How the hell did that command fix the problem? That's amazing dude. I think the fix will be able to be used in many distros right?
 
@Brickwizard said somewhere he would put his money on something still thrashing around...or words to that effect. That reminded me of my own experience. In my case it was an acpi error that did not stop.....which in turn affected the cpu....running at 98% constantly.....which also in turn raised the temps

The command "singled out" the offending 'interrupt'....which was an acpi error....these come from motherboards, not the distros (as far as I can remember).....So, the problem will be fixed for all distros.

But the fix so far may not be permanent, so follow the steps below.

I needed to make the fix permanent, so I added it to root's crontab. Run this as root (the # is a root prompt):
# crontab -e
This opens your default editor. Add this line:

@reboot echo "disable" > /sys/firmware/acpi/interrupts/gpe6E
  • If the editor is nano, press Ctrl+O (or F3) to save, then Ctrl+X (or F2) to exit.
  • If you press Ctrl+X (or F2) first, you will be asked whether you want to save.

That's it.
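
If you prefer not to go through the editor, roughly the same thing can be done non-interactively. Sketch only: add_cron_line is a helper name I invented, and it simply adds the line to whatever crontab text it is given unless an identical line is already there.

```shell
#!/bin/bash
# Read an existing crontab on stdin and print it with LINE appended,
# unless an identical line is already present.
#   $1 = the crontab line to add
add_cron_line() {
    local line="$1" current
    current=$(cat)
    # Pass the existing entries through unchanged
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # Append the new line only when no exact match exists
    if ! printf '%s\n' "$current" | grep -Fxq -- "$line"; then
        printf '%s\n' "$line"
    fi
}

# Usage, as root ("crontab -" installs whatever arrives on stdin):
#   crontab -l 2>/dev/null \
#     | add_cron_line '@reboot echo "disable" > /sys/firmware/acpi/interrupts/gpe6E' \
#     | crontab -
```

Running it twice leaves a single copy of the line, so it is safe to repeat after a reinstall or on another distro.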

Those temps are fine, btw

Enjoy your Linux.
 
Thanks so much for all the help guys :) I will try out Endeavour OS with this fix, hoping it will work so that I can remove windows.

Thanks everyone so much.
 

