Thursday, December 9, 2010

Hot Copy File System Snapshot - Installation, Setup and Use

What is HotCopy

(Description taken from the R1Soft website)
R1Soft Hot Copy (hcp) is the answer to taking online point-in-time disk and volume snapshots in Linux.  Use the hcp command line utility to take an instant snapshot of any mounted file system on almost any block device.
·        Add point-in-time open file backups to your existing backup scripts for free e.g. tar and rsync
·        Check your disk for errors with fsck without rebooting and without unmounting your file system!
·        Test scripts and programs in an instant snapshot of your live environment before you use them on real data
·        Keep instantly recoverable snapshots available by taking periodic snapshots via cron

Installation

This document assumes the tool will be installed on a 64-bit operating system.  At the time of writing, HotCopy was at version 3.6.1.
There are two components needed for install.
1.      r1soft-hotcopy-3.6.1.x86_64.rpm
2.      r1soft-setup-3.6.1.x86_64.rpm
The first package is the HotCopy utility.  The second package is the setup utility to get and install the kernel driver.

r1soft-hotcopy-3.6.1.x86_64.rpm

To install this package you need to be root.  Copy the packages to the server’s /tmp directory.
# cd /tmp
# rpm -ivh r1soft-hotcopy-3.6.1.x86_64.rpm

r1soft-setup-3.6.1.x86_64.rpm

To install this package you need to be root.  Copy the packages to the server’s /tmp directory.
# cd /tmp
# rpm -ivh r1soft-setup-3.6.1.x86_64.rpm

Setup

Now that the packages are installed you need to download and install the kernel driver.  Your server will need access to the Internet to run the following command.
As root, run the following command; you should see output like this.
# /usr/sbin/hcp-setup --get-module
Checking for binary module
Waiting
Complete.
Saving kernel module to ‘/lib/modules/r1soft/hcpdriver-cki-2.6.18-194.17.4.el5.ko’
#

Usage

Creating a Snapshot

Verify the filesystem you want to snapshot.
# df -h
Filesystem                       Size   Used  Avail  Use%  Mounted on
/dev/mapper/VolGroup00-LogVol00  7.8G   3.4G  4.0G   47%   /
/dev/sda1                        99M    19M   75M    21%   /boot
tmpfs                            1006M  0     1006M  0%    /dev/shm

Create the snapshot

Verify the Snapshot Exists

Example of  “foo”

In this example we’ll “accidentally” delete a file in the /tmp directory but be able to retrieve it from the snapshot.

Good thing I took a snapshot before I began messing around.  I will now retrieve the file from the snapshot.
We’re back!  File has been retrieved.  Disaster averted.

Removing a Snapshot

Additional Cool Things

You can have the Snapshot created and mounted to a completely different filesystem.
Why would you want to do that?
Backups of course.  Let’s assume we have a system that can have very little downtime and cannot be backed up while its special application is running.  Backups take 2 or 3 hours and that kind of downtime is not acceptable.  Here is what we would do (assuming there is a separate disk to snap to):
1.      Bring down the application
2.      Create a Snap of the filesystem to another disk.  This process takes a second or two.
3.      Bring the application up
4.      Take a backup of the Snap instead of the live filesystem.
Your application would be down only for a matter of seconds, and backing up the Snap would not affect disk I/O on the production filesystem.

Example:

Verify the New Disk is Seen

[root@centos55-test /]# fdisk -lu /dev/sdb
Disk /dev/sdb: 12.8 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders, total 25165824 sectors
Units = sectors of 1 * 512 = 512 bytes
   Device Boot      Start         End      Blocks   Id  System

Partition the Disk

[root@centos55-test /]# fdisk /dev/sdb

The number of cylinders for this disk is set to 1566.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): u
Changing display/entry units to sectors

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First sector (63-25165823, default 63):
Using default value 63
Last sector or +size or +sizeM or +sizeK (63-25165823, default 25165823):
Using default value 25165823

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Create a Linux FileSystem on the New Partition

[root@centos55-test /]# mkfs.ext3 /dev/sdb1
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
1572864 inodes, 3145720 blocks
157286 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=3221225472
96 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208

Writing inode tables: done                           
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

Create a Directory to Mount the New Filesystem to

[root@centos55-test /]# mkdir -p /mnt/bkup

Mount it

[root@centos55-test /]# mount /dev/sdb1 /mnt/bkup

Verify it is Mounted

[root@centos55-test /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      7.8G  3.4G  4.1G  46% /
/dev/sda1              99M   19M   75M  21% /boot
tmpfs                1006M     0 1006M   0% /dev/shm
/dev/sdb1              12G  159M   12G   2% /mnt/bkup

Stop Application or Database

Do whatever is needed to stop the application or database.

Create a Snapshot of the FileSystem we want to backup

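The -m option sets the directory where the snapshot is mounted, and the -c option names the device where the changed blocks are stored.
Storing the changed blocks on /dev/sdb1 takes all “additional” snapshot I/O off of the production filesystem.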
[root@centos55-test /]# hcp -m /mnt/bkup -c /dev/sdb1 /dev/mapper/VolGroup00-LogVol00
R1Soft Hot Copy    3.6.1 build 10439 (http://www.r1soft.com)
Documentation      http://wiki.r1soft.com
Forums             http://forum.r1soft.com

Thank you for using Hot Copy!
R1Soft makes the only Continuous Data Protection software for Linux.

Starting Hot Copy: /dev/mapper/VolGroup00-LogVol00.
Changed blocks stored: /dev/sdb1
Snapshot completed: 0.001 seconds
File system frozen: 0.128 seconds
Hot Copy created: Fri Oct 15:12:05 CDT 2010
Creating Hot Copy snapshot device: /dev/hcp1, Please Wait...

Hot Copy created at: /dev/hcp1
Mounting /dev/hcp1 read-write
Hot Copy mounted at: /mnt/bkup
[root@centos55-test /]#

Start Application or Database again

Do whatever is needed to start the database or application again.

Verify the Snap has been made and mounted.

[root@centos55-test /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      7.8G  3.4G  4.1G  46% /
/dev/sda1              99M   19M   75M  21% /boot
tmpfs                1006M     0 1006M   0% /dev/shm
/dev/sdb1             7.8G  3.4G  4.1G  46% /mnt/bkup
/dev/hcp1             7.8G  3.4G  4.1G  46% /mnt/bkup
[root@centos55-test /]#

Run the backup on the Snap

Run Backup Against /mnt/bkup

Once the backup is complete you can remove the Snap

[root@centos55-test /]# hcp -r /dev/hcp1
R1Soft Hot Copy    3.6.1 build 10439 (http://www.r1soft.com)
Documentation      http://wiki.r1soft.com
Forums             http://forum.r1soft.com

Thank you for using Hot Copy!
R1Soft makes the only Continuous Data Protection software for Linux.

Hot Copy Session has successfully been stopped.

All active Hot Copy sessions have been stopped. It is now safe to restart the R1Soft Backup Agent.
[root@centos55-test /]#
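
The whole sequence above could be wrapped in one script.  What follows is only a minimal sketch, assuming the device names and mount point from this walkthrough.  The STOP_CMD and START_CMD values (and the "myapp" service name) are hypothetical placeholders, and the only hcp options used are the ones shown above (-m, -c and -r).  With DRY_RUN=1, the default, it only prints the commands it would run.

```shell
#!/bin/bash
# Sketch only. Device names come from the walkthrough above; STOP_CMD,
# START_CMD and "myapp" are hypothetical placeholders for your application.
SRC_DEV=${SRC_DEV:-/dev/mapper/VolGroup00-LogVol00}  # filesystem to snapshot
CHANGE_DEV=${CHANGE_DEV:-/dev/sdb1}                  # where changed blocks are stored
SNAP_MNT=${SNAP_MNT:-/mnt/bkup}                      # where the snapshot is mounted
SNAP_DEV=${SNAP_DEV:-/dev/hcp1}                      # snapshot device reported by hcp
STOP_CMD=${STOP_CMD:-service myapp stop}
START_CMD=${START_CMD:-service myapp start}

# Print the command when DRY_RUN=1 (default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: snapshot_backup ARCHIVE   (ARCHIVE must live outside $SNAP_MNT)
snapshot_backup() {
    archive=$1
    run $STOP_CMD || return 1
    run hcp -m "$SNAP_MNT" -c "$CHANGE_DEV" "$SRC_DEV"
    snap_status=$?
    # Restart the application whether or not the snapshot worked.
    run $START_CMD
    [ "$snap_status" -eq 0 ] || return 1
    # Back up the snapshot instead of the live filesystem, then remove it.
    run tar cvfz "$archive" -C "$SNAP_MNT" .
    run hcp -r "$SNAP_DEV"
}
```

Calling snapshot_backup with an archive path prints the plan; set DRY_RUN=0 only after checking that the printed commands match your system.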

Friday, August 6, 2010

Backing Up a Django Web Server

I have a webserver I want to back up weekly, keeping 4 weeks of backup history.
The webserver runs CentOS 5.4 and Django.
In the event of a server crash I'd like to be able to recover my website rather easily.  I figure it would be relatively easy to reinstall the OS and install the applications.  Then all I would need to do is restore the configuration files and the database.

Assumptions:
You have Django installed and working
$DBDUMPDIR = wherever you want the Database Dump files to go
$BKUPDIR = wherever you want the Backup files to go.  Note:  This backup directory could be on a remote server, such as an NFS mount or SMB mount.  This is suggested so that if this server crashes, your backups are on another server.
Edit the scripts below for "your" appropriate directories and file names.



Here are the four items I will need to back up (the paths match the tar command in the script below):
*  /etc/httpd/conf/httpd.conf
*  /etc/httpd/conf.d/mysite.conf
*  /var/www/django
*  $DBDUMPDIR/data.json

I created a script called dumpfiles.bash and put it in root's bin directory "/root/bin/dumpfiles.bash".
Make sure that permissions are right to be able to execute the file.
chmod 770 /root/bin/dumpfiles.bash

Here is the script:
----- start of script -----
#!/bin/bash
#
# SCRIPT: dumpfiles.bash
# AUTHOR: Bob
# DATE: 07/01/2010
# REV: 1.blah
#
# PURPOSE: This script is used to backup webserver specific data
#
# set -x # Uncomment to debug this script
#
# set -n # Uncomment to check the script's syntax
#        # without any execution. Do not forget to
#        # recomment this line!
#
####################
# Define Variables #
####################
# Capture the shell script file name
THIS_SCRIPT=$(basename $0)
#
# Define the start time of the script
STARTTIME=`date +%T`
#
#Set Backup and DB Dump Directories
#Change the directories below to match your environment
BKUPDIR=/mnt/backup/webserver
DBDUMPDIR=/root
#
#####################
# Increment Backups #
#####################
rm -f $BKUPDIR/bkup4.tar.gz
mv -f $BKUPDIR/bkup3.tar.gz $BKUPDIR/bkup4.tar.gz
mv -f $BKUPDIR/bkup2.tar.gz $BKUPDIR/bkup3.tar.gz
mv -f $BKUPDIR/bkup1.tar.gz $BKUPDIR/bkup2.tar.gz
#
###################
# Dump Database #
###################
# export PYTHONPATH for dumpdata script
export PYTHONPATH='/var/www/django':'/var/www/django/apps'
#
#Change Directories to /var/www/django/mysite
cd /var/www/django/mysite
#
#Backup the Django database and files to a flat file
python manage.py dumpdata > $DBDUMPDIR/data.json
#
#Get out of the /var/www/django/mysite directory
#Let us go home
cd /root

#
###############################
# Backup and compress important files #
###############################
tar cvfz $BKUPDIR/bkup1.tar.gz /etc/httpd/conf/httpd.conf /etc/httpd/conf.d/mysite.conf /var/www/django $DBDUMPDIR/data.json
----- end of script -----

I want to schedule a weekly backup so I'll use root's cron to do this.
Edit root's cron (assuming you are logged in as root) by running crontab -e.

Add the following line to root's cron.
0 4 * * 1 /root/bin/dumpfiles.bash

This will run the script every Monday morning at 4:00 AM.
Each time the script runs, it creates a new bkup1.tar.gz.  Before doing so, it shifts the existing archives up by one (bkup1 becomes bkup2, and so on) and deletes the old bkup4, so you always keep four weeks of backup files.
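
The rotation step can also be written as a loop.  This is only a sketch: rotate_backups is a hypothetical helper, not part of dumpfiles.bash.  It does the same rm/mv sequence as the script, but skips archives that do not exist yet, so the first few runs do not print mv errors.

```shell
#!/bin/bash
# Sketch only: rotate_backups is a hypothetical helper, not part of dumpfiles.bash.
# Usage: rotate_backups DIR [KEEP]   (KEEP defaults to 4 generations)
rotate_backups() {
    dir=$1
    keep=${2:-4}
    # Drop the oldest archive, then shift the others up by one.
    rm -f "$dir/bkup$keep.tar.gz"
    i=$((keep - 1))
    while [ "$i" -ge 1 ]; do
        # Skip generations that do not exist yet (e.g. during the first runs).
        if [ -e "$dir/bkup$i.tar.gz" ]; then
            mv -f "$dir/bkup$i.tar.gz" "$dir/bkup$((i + 1)).tar.gz"
        fi
        i=$((i - 1))
    done
}
```

In dumpfiles.bash, a call such as rotate_backups "$BKUPDIR" 4 would take the place of the rm and mv lines.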

In the event of a disaster, once you have rebuilt a new server, you would untar the backup files, copy them back to their appropriate places, and use the manage.py script to restore the Django database.
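
A restore along those lines might look like the sketch below.  It rests on several assumptions: restore_site and run are hypothetical helpers, the archive was created by the tar command in dumpfiles.bash (GNU tar stores the paths without the leading "/"), Django is already reinstalled, the database and its tables already exist, and PYTHONPATH is set the same way as in the dump script.  Django's loaddata command is the counterpart of the dumpdata command used for the backup.  With DRY_RUN=1, the default, the commands are only printed.

```shell
#!/bin/bash
# Sketch only: restore_site and run are hypothetical helpers.
# Print the command when DRY_RUN=1 (default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: restore_site ARCHIVE DBDUMPDIR
restore_site() {
    archive=$1
    dbdumpdir=$2
    # The archive holds paths relative to /, so extracting from / puts
    # every file back where it came from.
    run tar xvfz "$archive" -C / || return 1
    run cd /var/www/django/mysite || return 1
    # Load the JSON dump; assumes the database and its tables already exist.
    run python manage.py loaddata "$dbdumpdir/data.json"
}
```

For example, restore_site /mnt/backup/webserver/bkup1.tar.gz /root prints the three commands for the newest backup.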

Hope this helps someone out there.

Tuesday, June 29, 2010

Expect - Using expect to Automate Processes or Generate Reports

I have 51 Linux servers that I manage (soon to grow to well over 70).  Over the past year the company I am with has moved and we have redesigned the network (a few times).  During this redesign we changed which servers provide DNS and NTP services.  I like to think I am a pretty thorough person and believe I updated all 51 servers with the correct DNS and NTP IP addresses, but I also want to validate my thoroughness as a sanity check and a c.y.a.  By the way, I use IP addresses for the DNS and NTP settings just in case DNS is unavailable.

I really don't want to log in to 51 different servers and verify the contents of 3 different configuration files on each of these servers.  It would be nice if I could spend a few minutes writing a script that could poll each server and write out a report that I could review.  So that is just what I did.

A few things I needed to have in place before I got started.
1.  A Linux account defined on all 51 servers that has remote SSH permissions and the ability to read the three configuration files I am interested in.  I don't allow root to remotely SSH to any server.
2.  On the computer I will be running the script from (my Linux laptop), a Linux utility called expect.
3.  A list of all 51 servers in a text file.

Number 1 is easy as I have a service account (we'll call it saccount) that has access to every server but has very few permissions (just enough to read the files I am interested in).  For number 2 I had to install expect on my laptop, which is running a flavor of Red Hat Linux.  Expect should be available in your package repository.  Number 3 was easy too.  I had a file containing all of my Linux servers.

I ended up with 3 files (not including the report file generated after running the script/s).
File 1:  serverlist.txt - this file contains a list of my servers.  One server name per line.  Example:
serverA
serverB
serverC
server1
server2
server3
   you get the idea...

File 2:  dnsntpreport.exp - you can call it anything you want.  Just make sure it is executable.  The contents of the file are as follows:

   #!/usr/bin/expect -f
   spawn ./dnsntpreport.ksh
   expect {
   "*re you sure you want to continue connecting (yes/no)?"
   {send -- "yes\r\n"
   exp_continue}
   "*assword:*"
   {send "#######\r\n"
   exp_continue}
   }
   exit

Where you see #######, you would put the actual password for the user you are using.  This script watches for certain prompts and automatically answers them with the text you entered.

File 3:  dnsntpreport.ksh - you can call it whatever you want but notice that the above script will call this script so if you change the file name you will need to edit the script above.  The contents of this script are as follows:

   #!/bin/ksh
   for line in $(cat ./serverlist.txt)
   do
   echo -e "\n###$line###" >> dnsntp_report.txt
   echo -e "/etc/resolv.conf file" >> dnsntp_report.txt
   ssh saccount@$line 'grep -e "10\." /etc/resolv.conf' >> dnsntp_report.txt
   echo -e "\n/etc/ntp.conf file" >> dnsntp_report.txt
   ssh saccount@$line 'grep -e "10\." /etc/ntp.conf' >> dnsntp_report.txt
   echo -e "\n/etc/ntp/step-tickers file" >> dnsntp_report.txt
   ssh saccount@$line 'grep -e "10\." /etc/ntp/step-tickers' >> dnsntp_report.txt
   echo -e "###" >> dnsntp_report.txt
   done

So what is going on here?  File 3 will SSH to each server, look through three files for IP addresses containing "10." and record its findings in a file called dnsntp_report.txt.  During our moves and reconfigurations the first octet has remained "10" but the others have changed.  Of course, when you SSH to a server (assuming you do not have passwordless SSH set up) you are sometimes asked whether you trust the key and then prompted for a password.  This is where File 2 comes in.  It is the file you actually execute from the command line, since it calls File 3.  File 2 looks for two specific prompts and answers them automatically so we don't have to respond 51 or more times.  Obviously, where you see "saccount" in the above script, replace it with the account you are using.  Remember, the password is stored in File 2.
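
The per-server part of File 3 can be pulled into a function, which makes it easy to try the report format without touching a real server.  This is only a sketch: report_host, ssh_fetch and the FETCH variable are hypothetical names, and the output layout is slightly simpler than the one File 3 produces.  The remote grep is passed as a single quoted string so that the backslash in "10\." reaches grep on the far side.

```shell
#!/bin/bash
# Sketch only: report_host, ssh_fetch and FETCH are hypothetical names.
# FETCH names the function that runs a command on a host; override it to
# try the report format without SSH.
FETCH=${FETCH:-ssh_fetch}

ssh_fetch() {
    host=$1
    shift
    # stdin comes from /dev/null so ssh cannot swallow a server list that
    # the caller may be reading in a while-read loop.
    ssh "saccount@$host" "$@" < /dev/null
}

# Usage: report_host HOST >> dnsntp_report.txt
report_host() {
    host=$1
    printf '\n###%s###\n' "$host"
    for f in /etc/resolv.conf /etc/ntp.conf /etc/ntp/step-tickers; do
        printf '%s file\n' "$f"
        # One quoted string, so the remote shell keeps the regex intact.
        "$FETCH" "$host" "grep -e '10\\.' $f"
    done
    printf '###\n'
}
```

A driver loop would then call report_host once per line of serverlist.txt and append the output to dnsntp_report.txt.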

Assuming you have all three files in the same directory and File 2 and File 3 are executable, all you need to do is run File 2 from the command line.  After the script runs you should have a text file called dnsntp_report.txt that shows the settings you were (or in this case I was) interested in.

I hope this helps someone else out there.

Wednesday, June 2, 2010

A Quick and Dirty Virtual IP (VIP) Address for HA Purposes

A virtual IP address?  Why do I want one of those?  I have two servers (not clustered) set up to run an application.  High Availability is important, but I do not need automatic failover.  So I have the application run on a specific IP address on Server-A while the application is off on Server-B.  In the event Server-A needs to be brought down for maintenance or has an issue, I want to be able to start the application on Server-B with the same IP Address.


Here is the quick and dirty way to do it.
Assumption:  IP Network is 192.168.1.0/255.255.255.0
We will pick 192.168.1.100 for our Virtual IP (VIP) Address


On Server-A:
1.  Create a file called start_adm_vip.sh in /usr/local/sbin
Its contents should be as follows:
/sbin/ifconfig eth0:1 192.168.1.100 netmask 255.255.255.0
/sbin/arping -q -U -c 3 -I eth0 192.168.1.100

2.  Create a file called shutdown_adm_vip.sh in /usr/local/sbin
Its contents should be as follows:
/sbin/ifconfig eth0:1 down

3.  Modify permissions on these files so they are executable. 750 should suffice.
chmod 750 *adm_vip.sh

4.  Edit the /etc/hosts file to add a friendly name to the VIP.  Obviously use the name and the IP Address you choose here:
192.168.1.100     appadm.mydomain.net     appadm

5.  Copy the two scripts you just created to Server-B and place them in /usr/local/sbin as well.

On Server-B:
1.  Edit the /etc/hosts file to add a friendly name to the VIP.  Obviously use the name and the IP Address you choose here:

192.168.1.100     appadm.mydomain.net     appadm

Starting the VIP

Now run the script "start_adm_vip.sh" on Server-A.  You should be able to ping "appadm" from both Server-A and Server-B.  Do not go and run the start script on Server-B.  If you do you will have duplicate IP Addresses on the network.  If you want to move the VIP to Server-B, shut it down on Server-A first.

Shutting the VIP down
If you want to manually move the VIP to Server-B you need to shut it down on Server-A first.
On Server-A run "shutdown_adm_vip.sh".  Now "appadm" should not be pingable from either server.

Go to Server-B and run the script "start_adm_vip.sh".  "appadm" should now be live on Server-B and pingable from both Server-A and Server-B.
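
Because starting the VIP on both servers leads to duplicate IP addresses, a small guard in front of the start script can help.  This is only a sketch: safe_start_vip and vip_in_use are hypothetical helpers, and the check is a plain ping, so it will not notice a VIP that is up but blocked by a firewall.

```shell
#!/bin/bash
# Sketch only: safe_start_vip and vip_in_use are hypothetical helpers.
VIP=${VIP:-192.168.1.100}
PING=${PING:-ping}
START_CMD=${START_CMD:-/usr/local/sbin/start_adm_vip.sh}

# Succeeds if something on the network already answers for the VIP.
vip_in_use() {
    "$PING" -c 2 "$VIP" > /dev/null 2>&1
}

# Start the VIP only when nothing else is answering for it.
safe_start_vip() {
    if vip_in_use; then
        echo "$VIP is already answering; shut it down on the other server first." >&2
        return 1
    fi
    $START_CMD
}
```

Run safe_start_vip instead of calling start_adm_vip.sh directly; it refuses to start the VIP while the address still responds.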


Like I said, this is a quick and dirty way to have a VIP.  Hope this helps someone out there.

Thursday, April 29, 2010

snmpd Information Filling up the /var/log/messages File

I am using net-snmp on my Linux servers so Cacti can poll for data and graph statistics.  I noticed that the /var/log/messages file was filling up with snmpd messages, all of which were merely informational and benign.  I know snmpd works and is configured properly, and those log messages in my /var/log/messages file make it hard to find anything useful in it.

I found that (at least in the version of net-snmp that I am using) debug logging is turned on by default.  Well, I don't want debug level logging.  In fact I don't want any logging for snmpd to go to my /var/log/messages file.

I run Oracle Enterprise Linux and Ubuntu.

On any flavor of Red Hat Enterprise Linux (example:  RHEL, OEL, CentOS) modify the /etc/sysconfig/snmpd.options file.  If it doesn't exist, create it.

The contents of that file should be changed to this:

     # snmpd command line options
     OPTIONS="-Lf /dev/null -p /var/run/snmpd.pid -a"

This will turn off all logging for snmpd.  Remember to restart snmpd for the changes to take effect.
     service snmpd restart

On Ubuntu edit the /etc/init.d/snmpd file and change the line that looks like this:
     SNMPDOPTS='-Lsd -Lf /dev/null -p /var/run/snmpd.pid'
to this
     SNMPDOPTS='-Lf /dev/null -p /var/run/snmpd.pid'

That's it.  Remember to restart snmpd for the changes to take effect.
     /etc/init.d/snmpd restart
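
If you have several Ubuntu servers to change, the edit can be scripted with sed.  This is only a sketch: disable_snmpd_syslog is a hypothetical helper, and it assumes the SNMPDOPTS line looks like the one shown above.  It keeps a copy of the original file with a .bak suffix.

```shell
#!/bin/bash
# Sketch only: disable_snmpd_syslog is a hypothetical helper.
# Usage: disable_snmpd_syslog FILE   (e.g. /etc/init.d/snmpd)
disable_snmpd_syslog() {
    file=$1
    # Keep the original as FILE.bak, then remove "-Lsd " from the
    # SNMPDOPTS line only; every other line is left untouched.
    sed -i.bak "/^[[:space:]]*SNMPDOPTS=/s/-Lsd //" "$file"
}
```

After running it against /etc/init.d/snmpd, compare the file with its .bak copy before restarting snmpd.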

Wednesday, February 24, 2010

Interactive Tape Backups using TAR and Linux

I sometimes want to run an on-demand backup of either a particular directory or file system.  I wrote an interactive script to do this and thought I would share it.
Assumptions:
1.  You have a tape device attached to your Computer or Server.
2.  You know what device your tape drive is.  Example: /dev/st0
3.  You have the mt-st package installed to manage the tape device.
4.  The tape you are using will be overwritten.
5.  You copy the contents of the script below into a utility like Notepad and check the contents.  Then copy from there into a script called (whatever you want).
6.  Pay close attention to the command that starts like this:  TAPECHK=$(mt  It shows up correctly in this post, but if you cut and paste the script into Notepad the lines may not match.  Edit it so it looks like it does here in the post.
7.  You make the script executable.
8.  I placed the script in /usr/local/sbin but you can put it wherever it makes sense to you.

Here is the script:

#!/bin/bash
#
# SCRIPT: Interactive_2_tape.bash
# AUTHOR: Bob
# DATE: 02/24/2010
# REV:
#
# PURPOSE: This script is used to backup files
# from $SOURCE to $TAPEDEV
#
# set -x # Uncomment to debug this script
#
# set -n # Uncomment to check the script's syntax
#        # without any execution. Do not forget to
#        # recomment this line!
#
####################
# Define Variables #
####################

# Capture the shell script file name
THIS_SCRIPT=$(basename $0)

# Define the start time of the script
STARTTIME=`date +%T`

# Define current directory to return to at end of script
CURRENTDIR=$PWD

# Ask for the source of the backup
echo "What directory or filesystem do you want to backup?"
echo "Type the directory in this format /dir1/dir2 followed by [ENTER]:"
read SOURCE
echo "Using $SOURCE as the source directory you want to backup."

# Ask for tape device
TAPEDEV="/dev/st0"
echo "I assume your tape device is $TAPEDEV"
read -p "Am I correct? yes/no: "
if [ "$REPLY" = "no" ]; then
     echo "What is your tape device? "
     read TAPEDEV
     echo "Using $TAPEDEV as your tape device."
else
     echo "Using $TAPEDEV as your tape device."
fi


################
# Main Section #
################

# Verify there is a tape in the
# drive and rewind the tape
TAPECHK=$(mt -f $TAPEDEV rewind 2>&1 1>/dev/null)
# If there is no tape tell me and exit out
#  If the mt command returns any data then there is an error

#  Check the results of the mt command
if [ "$TAPECHK" != "" ]; then
     echo $TAPECHK
     echo "Check to see if there is a tape in the drive or if the device you entered is valid."
     exit
fi

# If we made it here, there is a tape in the drive and it has been rewound.
# Change to the directory to be backed up; stop if it does not exist,
# so tar never backs up the wrong directory.
echo "Changing to the $SOURCE directory."
cd "$SOURCE" || exit 1

# Back up data
tar cvf $TAPEDEV .

# Rewind the tape again
mt -f $TAPEDEV rewind

# Change back to the directory from where you came
cd $CURRENTDIR
echo "Changing back to the directory you started from: $CURRENTDIR"
# Define the end time of this script
ENDTIME=`date +%T`

# Display the start and end time of this script
echo "$THIS_SCRIPT began at $STARTTIME and finished at $ENDTIME"

exit
#################
# End of Script #
#################
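
The TAPECHK line relies on the order of its redirections, which is easy to get wrong when retyping it.  The sketch below isolates that idiom in a hypothetical helper called stderr_of, so you can try it with any command before pointing it at a tape drive.

```shell
#!/bin/bash
# Sketch only: stderr_of is a hypothetical helper, not part of the script above.
# Usage: ERRORS=$(stderr_of COMMAND [ARGS...])
stderr_of() {
    # Redirections apply left to right: 2>&1 first points stderr at the
    # place stdout currently goes (the $(...) capture), then 1>/dev/null
    # throws normal output away. Only error text is captured.
    "$@" 2>&1 1>/dev/null
}
```

With this helper the check in the script could be written as TAPECHK=$(stderr_of mt -f "$TAPEDEV" rewind).  Swapping the two redirections would send both streams to /dev/null and TAPECHK would always be empty.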


I hope this helps others out there trying to do the same thing.

Monday, February 15, 2010

Using tar to Copy a Large File into a Tight Space

I do a lot of work with virtual machine images.  I ran into a situation where I wanted to copy a file called System.img from one server to another, and even though I knew I had enough room to do it, I would get messages stating that there was not enough space.  What in the world was going on?  The file system I wanted to copy the file to was 33G in size.  The file was just under 33G in size.  I knew it should fit because the file system the file was coming from is also 33G in size (exactly the same size).  When it was all said and done I should have had about 150M of free space, according to the source.

I tried FTP, SCP, and various other mechanisms to copy the file from the one server to the other.  No joy.
So I copied the file to the destination server but to a different and larger file system.  That obviously was successful, but I still wanted it on my 33G file system.  So I tried copying the file locally from the larger file system to the 33G file system.  No joy again... I got a message after a few minutes stating there was not enough space, and the process errored out.

I found a solution!  Now, to be honest, I do not know why it works, but it does.

Assumptions: You are in the directory where the System.img file is located.  The file resides on the same server you are copying to.  You have done the math, and according to the calculator the file will fit on the destination file system.

Run this command with the appropriate path of your destination.

Note:  The following command is all on one line.
# tar cvf - System.img | ( cd /destination file system/destination folder/;tar xvf - )

Works like a charm.  I have used this little gem a dozen times in the past few months.
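
If you use this often, the pipeline can be wrapped in a small function.  This is only a sketch: tar_copy is a hypothetical name.  The arguments are quoted so destination paths containing spaces work, and "&&" replaces ";" so that nothing is extracted into the current directory if the cd fails.

```shell
#!/bin/bash
# Sketch only: tar_copy is a hypothetical helper.
# Usage: tar_copy FILE DEST_DIR   (run from the directory that holds FILE)
tar_copy() {
    file=$1
    dest=$2
    # The first tar writes the archive to stdout; the second reads it from
    # stdin inside the destination directory. "&&" stops the extract if
    # the cd fails.
    tar cf - "$file" | ( cd "$dest" && tar xf - )
}
```

For example: tar_copy System.img "/destination file system/destination folder".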

This should be obvious but...  Remember, you cannot copy a file that is larger than the space available on the destination.  Hope this helps others out there.