
Temperature monitoring script. -Find out what it mainly does.-

Overview: After one of our servers overheated during a heat wave and nobody noticed, I decided to add a simple script to our monitoring stack that watches the temperature and notifies us when something goes wrong.

Basically, the script runs sensors, parses its output, emails us if the temperature is too high, and otherwise prints a simple log line so we can keep an eye on typical temperatures.
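To make the parsing step concrete, here is a minimal sketch that runs the same grep/awk/cut/tail pipeline against canned text instead of a live sensors call. The sample output is made up for illustration (real output depends on your hardware), and last_core_temp is a hypothetical helper name, not part of the script.

```bash
#!/bin/bash
# Illustrative sample of `sensors` output (made-up values, not real measurements).
sample_output='coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +45.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:       +47.0°C  (high = +80.0°C, crit = +100.0°C)'

# Hypothetical helper: reads sensors-style text on stdin and prints the
# two-digit temperature found on the last "Core" line.
last_core_temp() {
  grep Core | awk '{print $3}' | cut -b2-3 | tail -1
}

# Third field of "Core 1:  +47.0°C ..." is "+47.0°C"; bytes 2-3 are "47".
printf '%s\n' "$sample_output" | last_core_temp
```

Because only bytes 2 and 3 of the reading are kept, this approach assumes a two-digit temperature with a leading sign.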

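Packages you’ll need: lm-sensors, which provides the sensors command. The script also calls mail to send alerts, so a working mail command is needed too.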

To install sensors and be able to run this script:

  1. Install the lm-sensors package.
  2. Run sudo sensors-detect and choose YES to all YES/no questions.
  3. At the end of sensors-detect, a list of modules that need to be loaded will be displayed. Type “yes” to have sensors-detect add those modules to /etc/modules, or edit /etc/modules yourself.
  4. Next, run sudo service module-init-tools restart. This reads the changes made to /etc/modules in step 3 and inserts the new modules into the kernel.
  5. Copy this script to root's home directory (/root) and make it executable: chmod u+x (scriptname).sh
  6. Run crontab -e and add the line “@reboot /root/(scriptname).sh”
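Once the steps above are done, a quick sanity check can confirm that the commands the script relies on are actually on the PATH. This is only a sketch: require_cmds is a hypothetical helper, and it checks nothing beyond the presence of the commands.

```bash
#!/bin/bash
# Hypothetical helper: succeeds only if every named command is on PATH.
require_cmds() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "Missing command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# The monitoring script calls both sensors and mail.
if require_cmds sensors mail; then
  echo "All required commands found"
else
  echo "Install the missing packages before scheduling the script"
fi
```

Running this before adding the crontab entry avoids finding out about a missing mail command only when the first alert fails to arrive.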
#!/bin/bash
######################################################################################
# Derek Demuro, this script is given as is, CopyLEFT                                 #
######################################################################################
######################################################################################
# README                                                                             #
######################################################################################
# This script checks the temperature in a loop and sends a mail if it is too high.
#
############################To Set Up#################################################
#
# To set this script up, you'll need to add it to a cronjob to run on boot
# Most linux distros will allow a param @reboot /(path)/servermon.sh
# So... crontab -e
# At the bottom add: @reboot /(path)/servermon.sh
# REMEMBER TO MAKE THE FILE EXECUTABLE (chmod u+x servermon.sh)
######################################################################################
# SCRIPT CONFIGURATION                                                               #
######################################################################################
### Where should the log be saved?
readonly LOGNAME='servertemp.log'
### Who should we mail on error
readonly MAILTO='mail@derekdemuro.me'
### How much time between checks (seconds)?
readonly SLEEPTIME=30
### Alert if the temperature is above this value
readonly MAXTEMP=75
### Server name
readonly SERVNAME='UYMF1DEB'
### How long to sleep after an alert is sent (seconds; 216000 s = 60 hours)
readonly SLEEPERROR=216000
### How many loop iterations before the log is cleared
readonly CLEARLOGTMS=1000
######################################################################################
#################################FUNCTIONS START################################
 
###Function to clear the log; $1 is the number of runs between clears
function clearLog() {
  echo 'Log cleared' > "$LOGNAME"
  echo "Script will run $1 times then will clear itself" >> "$LOGNAME"
  return 0
}
 
#################################FUNCTIONS FINISH################################
#################################MAIN SCRIPT FUNC################################
times=0
while true; do
    #add 1 to times
    times=$((times + 1))
    ##CLEAR THE LOG
    if [ "$times" -eq "$CLEARLOGTMS" ]; then
        clearLog "$CLEARLOGTMS"
        times=0
    fi
    ##Read the temperature (two digits) from the last "Core" line of sensors
    currentTemp=$(sensors | grep Core | awk '{print $3}' | cut -b2-3 | tail -1)
    if [ "$currentTemp" -gt "$MAXTEMP" ]; then
        echo -e "\e[00;31m`date +'%Y-%m-%d %H:%M:%S'` ALERT: Current temperature: $currentTemp, at server: $SERVNAME \e[39m"
        mail -s "ALERT: Temperature above threshold" "$MAILTO" -a "Reply-To: " <<< "$(date +'%Y-%m-%d %H:%M:%S') ALERT: Current temperature: $currentTemp, at server: $SERVNAME"
        sleep $SLEEPERROR
    else
        echo -e "\033[38;5;148m `date +'%Y-%m-%d %H:%M:%S'` All good: $currentTemp at server: $SERVNAME \033[39m"
    fi
    ##We sleep till new run
    sleep $SLEEPTIME
done
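The parsing step in the script keeps only two digits from the last Core line, so it ignores hotter cores listed earlier and would misread a three-digit value. As an alternative sketch (not the script's own method), the hottest core could be picked with awk alone. max_core_temp is a hypothetical helper, and the sample readings are made up for illustration.

```bash
#!/bin/bash
# Illustrative sensors-style lines (made-up values, not real measurements).
sample_readings='Core 0:       +45.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:       +105.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:       +47.0°C  (high = +80.0°C, crit = +100.0°C)'

# Hypothetical helper: reads sensors-style text on stdin and prints the
# highest "Core" temperature as an integer (0 if no Core line is found).
max_core_temp() {
  awk '/^Core/ {
         t = $3                    # e.g. "+45.0°C"
         gsub(/[^0-9.]/, "", t)    # strip sign and unit, leaving "45.0"
         if (t + 0 > max) max = t + 0
       }
       END { printf "%d\n", max }'
}

printf '%s\n' "$sample_readings" | max_core_temp
```

The result of a call such as sensors | max_core_temp could be assigned to currentTemp in place of the existing pipeline, since it is still a plain integer that the -gt comparison accepts.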
ddemuro
administrator

Sr. Software Engineer with over 10 years of experience. Hobbist photographer and mechanic. Tinkering soul in an endeavor to better understand this world. Love traveling, drinking coffee, and investments.
