Automatic hardware probe

Re: Automatic hardware probe

Post by Davide » Tue Jul 02, 2013 6:22 pm

It's scheduled at 5:00 UTC every Monday, when the number of users online is at its lowest.
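For anyone who wants to set up something similar, here is a rough sketch of a matching cron entry. The script path is a placeholder, not the real location on the server, and cron normally interprets times in the server's local time zone, so this corresponds to 5:00 UTC only if the server clock is set to UTC.

```sh
#!/bin/sh
# Hypothetical location of the tester script; adjust to your setup.
script_path=/path/to/hardware-test.sh

# Crontab fields: minute hour day-of-month month day-of-week command.
# "0 5 * * 1" means 05:00 on every Monday.
cron_entry="0 5 * * 1 $script_path"

# Print the entry so it can be reviewed before being installed.
printf '%s\n' "$cron_entry"
```

The printed line can then be pasted into the crontab of the user that should run the test (`crontab -e`).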

Re: Automatic hardware probe

Post by Major Nimrod » Tue Jul 02, 2013 6:03 pm

What's the impact on CPU and usability? Is it scheduled to run off-peak?

Automatic hardware probe

Post by Davide » Mon Jul 01, 2013 12:56 am

To stay one step ahead of hardware failures, I wrote a hardware tester script that probes the server every week and sends email alerts.

It's very simple and uses the GCC compiler to detect segfaults; in my experience there's no better tool than GCC for probing hardware. Memtest86, Cpuburn, Memtester: name any tool, and it has failed me at least once, whereas even the most latent, asymptomatic hardware fault always makes GCC segfault, and of course it also causes undetected data corruption in the regular services.

A GCC build simultaneously exercises the hard drive, IDE buses, RAM, motherboard, CPU and power supply, whereas specialised tools mostly test a single component at a time and so cannot detect combined problems. For example, if the power supply is slightly undersized, adequate under moderate load but unable to power the system at full load, a single-component tester won't detect it, because it only loads one component.


Code:
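#!/bin/sh
# Complete computer hardware test with email report.

# Requires: heirloom-mailx to send email notifications;
#           make and GCC;
#           a source tree to compile in ./linux, next to this script.

# Recipient of the reports; must be set in the environment before running.
email=${email:?set email to the address that should receive the reports}

basedir=$(dirname "$0")
cd "$basedir" || exit 1

# Start from an empty log, then build with the lowest CPU priority.
: >log
cd ./linux || exit 1
if { nice -n 19 make clean &&
     nice -n 19 make &&
     nice -n 19 make clean
   } >../log 2>&1
then
    # success
    mail -s "Hardware test success" -r "Greatturn_Hardware_Tester" "$email" <<EOF
Periodic hardware tester running on server succeeded the GCC compilation test.
EOF
else
    # fail
    mail -s "Possible hardware failure" -r "Greatturn_Hardware_Tester" "$email" <<EOF
Periodic hardware tester running on server failed the GCC compilation test.

Latest output from compilation (stdout, stderr):
$(tail -n100 ../log)
EOF
fi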
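
The script reports any failed build as a possible hardware failure. As an optional refinement, the log could be scanned for crash signatures to tell a compiler crash from an ordinary build error. The sketch below is only an illustration: the function name `classify_build_log` and the two labels it prints are made up for this example, and the pattern list is an assumption about what typically appears in a log when GCC or another build tool is killed by a signal. An internal compiler error can also be a genuine compiler bug, so a match is a hint rather than proof.

```sh
#!/bin/sh
# Hypothetical helper: classify a build log by looking for crash signatures.
# Prints "suspect-hardware" if the log mentions a crash, else "ordinary-failure".
classify_build_log() {
    log_file=$1
    # GCC reports its own crashes as "internal compiler error"; the other
    # patterns are the usual messages for processes killed by a fatal signal.
    if grep -Eq 'internal compiler error|Segmentation fault|Bus error|Illegal instruction' "$log_file"
    then
        echo "suspect-hardware"
    else
        echo "ordinary-failure"
    fi
}
```

In the failure branch it could be called as `classify_build_log ../log`, with the result included in the subject or body of the alert.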