Backup files to Amazon S3


After a few years of silence, today it came to my mind that I should once again start writing on my blog. During the last 12-odd months I have been involved in developing and revamping several of the websites owned by the company I currently work for.

In this article I'm going to discuss the steps I have been using to automate backing up the websites hosted on Amazon EC2 to an Amazon S3 bucket.

The strategy I adopted can be broken down into the following three steps:

  1. Create a backup copy of each database and of the website source code (all my websites were developed in PHP) on a daily basis, and compress each (database & source code) using tar.gz compression, appending a timestamp
  2. Push the backup files to an Amazon S3 bucket
  3. Set a cron job to execute the process

Step 1: Create a copy of each database and the website source code

To achieve this I created a folder called backups (/home/ubuntu/backups) in the home directory and added the necessary instructions into a shell script as follows.

#!/bin/sh

# (1) set up the required variables
DB_DUMP=<filename>_`date +"%Y_%m_%d"`.sql
SOURCE_CODE=<filename>_`date +"%Y_%m_%d"`.tar.gz
DBSERVER=<hostname>
DATABASE=<database name>
USER=<database user>
PASS=<database password>

# (2) use the following command to create a dump of the database
cd /home/ubuntu/backups/
mysqldump --opt --user=${USER} --password=${PASS} -h ${DBSERVER} ${DATABASE} > ${DB_DUMP}

# (3) compress the mysql database dump using tar.gz compression
tar -zcf ${DB_DUMP}.tar.gz ${DB_DUMP}

# (4) create a copy of the website source, compress it and move it to /home/ubuntu/backups/
cd /var/www/
tar -zcf ${SOURCE_CODE} <website source code folder>/
mv ${SOURCE_CODE} /home/ubuntu/backups/

# (5) delete older backup copies that are more than 3 days old inside /home/ubuntu/backups/
cd /home/ubuntu/backups/
find <filename>_* -mtime +3 -exec rm {} \;

Save the file as backup.sh inside /home/ubuntu/backups.
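
Before scheduling anything, it's worth running the script once by hand (assuming the placeholders above have been filled in) and confirming that both archives appear in the backups folder:

$ sh /home/ubuntu/backups/backup.sh
$ ls -lh /home/ubuntu/backups/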

Step 2: Push the backup files to an Amazon S3 bucket

To achieve this I adopted two approaches, and you'll find that the latter is easier. Initially I used the Amazon AWS SDK to move the backup files to the Amazon S3 bucket. Since I used PHP, this approach had a limitation when an individual file exceeded 4 GB in size (after the initial compression the backup was over 12 GB), even on a 64-bit Linux box (I used Ubuntu 16.04). To overcome this I sliced the final compressed output into chunks of roughly 3.6 GB:

tar czf - <website source code folder>/ | split -b 3850MB - ${SOURCE_CODE}.
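
To restore from these chunks, they just need to be concatenated back in order before extracting; a minimal sketch reusing the ${SOURCE_CODE} variable from backup.sh:

# stitch the chunks (split names them .aa, .ab, ... in lexical order) back into one archive
cat ${SOURCE_CODE}.* > ${SOURCE_CODE}
# then extract as usual
tar -zxf ${SOURCE_CODE}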

Approach 1: Using the Amazon AWS SDK

Download the appropriate Amazon AWS SDK from here. In my case I used the PHP SDK, following the instructions available here, and downloaded the PHP library using the third option (installing via zip file).

<?php
require_once('/home/ubuntu/aws/aws-autoloader.php');

use Aws\S3\S3Client;
use Aws\S3\Exception\S3Exception;

$bucket = '<bucket name>';
$pathToFile = '/home/ubuntu/backups/';
$fileNameSourceCode = ['<filename>_'.date('Y_m_d').'.tar.gz']; // names of the website source code archives; each should equal the SOURCE_CODE variable found in /home/ubuntu/backups/backup.sh
$fileNameDBDump = '<filename>_'.date('Y_m_d').'.sql.tar.gz'; // name of the database dump file; it should equal the DB_DUMP variable found in /home/ubuntu/backups/backup.sh

// IAM user credentials (Access Key ID, Secret Access Key)
$credentials = new Aws\Credentials\Credentials('<access key id>', '<secret access key>');

// Instantiate the client.
$s3 = S3Client::factory([
    'region' => 'us-east-1', // Since I have created the buckets in the US East region (N. Virginia)
    'version' => '2006-03-01', // Standard version number for the S3 service
    'credentials' => $credentials
]);

// Pushing the source code files to the Amazon S3 bucket
if (count($fileNameSourceCode) > 0) {
    foreach ($fileNameSourceCode as $file) {
        if (file_exists($pathToFile.$file)) {
            try {
                // Upload data.
                $result = $s3->putObject(array(
                    'Bucket' => $bucket,
                    'Key' => $file,
                    'SourceFile' => $pathToFile.$file,
                    'ACL' => 'public-read',
                    'Expires' => gmdate("D, d M Y H:i:s T", strtotime("+15 days")) // This parameter doesn't get applied; object expiry has to be set on the bucket from the Amazon S3 console
                ));

                // Print the URL to the object.
                echo $result['ObjectURL'] . "\n";
            } catch (S3Exception $e) {
                echo $e->getMessage() . "\n";
            }
        }
    }
}

// Pushing the database dump file to the Amazon S3 bucket
if (file_exists($pathToFile.$fileNameDBDump)) {
    try {
        // Upload data.
        $result = $s3->putObject(array(
            'Bucket' => $bucket,
            'Key' => $fileNameDBDump,
            'SourceFile' => $pathToFile.$fileNameDBDump,
            'ACL' => 'public-read',
            'Expires' => gmdate("D, d M Y H:i:s T", strtotime("+15 days")) // This parameter doesn't get applied; object expiry has to be set on the bucket from the Amazon S3 console
        ));

        // Print the URL to the object.
        echo $result['ObjectURL'] . "\n";
    } catch (S3Exception $e) {
        echo $e->getMessage() . "\n";
    }
}

Save the file as upload_to_s3bucket.php inside /home/ubuntu/backups.
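
Before wiring this into cron, the script can be run once by hand (assuming the credentials and placeholders are filled in) to confirm the object URLs get printed:

$ php /home/ubuntu/backups/upload_to_s3bucket.php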

Approach 2: Using Amazon S3 Tools (s3cmd)

Amazon S3 Tools provides very easy-to-use command-line utilities that can push very large files to an Amazon S3 bucket with minimal effort. For Linux & Mac we can use s3cmd, while for Windows there is S3Express. I found this article on TecAdmin which explains its usage comprehensively. I followed the steps below to set it up on my server.

  • Setting up s3cmd on the server

Installation

$ sudo apt-get install s3cmd

Configuration

You need to provide the Access Key ID and Secret Access Key available with your Amazon AWS account during the configuration, by executing the following command. As a best practice it is recommended to create an IAM user and provide those credentials instead of using the root account details.

$ s3cmd --configure
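
Once configured, a quick sanity check is to list the target bucket (substituting your own bucket name):

$ s3cmd ls s3://<bucket name>/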

  • Setting up the shell script to push the files to the S3 bucket

To achieve this I created another shell script inside the same /home/ubuntu/backups folder with the following instructions.


#!/bin/bash

cd /home/ubuntu/backups/

_DB_DUMP=<filename>_`date +"%Y_%m_%d"`.sql.tar.gz  # name of the compressed database dump; it should match the ${DB_DUMP}.tar.gz archive created by /home/ubuntu/backups/backup.sh
_SOURCE_CODE=<filename>_`date +"%Y_%m_%d"`.tar.gz  # name of the website source code archive; it should equal the SOURCE_CODE variable found in /home/ubuntu/backups/backup.sh

s3cmd put ${_DB_DUMP} s3://<bucket name>/
s3cmd put ${_SOURCE_CODE} s3://<bucket name>/

Save the file as upload_to_s3bucket.sh inside /home/ubuntu/backups.

Step 3: Set a cron job to execute the process

Now let's set up cron jobs to execute the two scripts daily, or at any required interval.

First, let's make the two shell scripts executable using the following commands:

$ chmod +x /home/ubuntu/backups/backup.sh
$ chmod +x /home/ubuntu/backups/upload_to_s3bucket.sh

Open up the terminal and execute the following command:

sudo crontab -e

Enter the following two lines and save.

30 01 * * * /home/ubuntu/backups/backup.sh # run the backup daily at 1:30 in the morning

# use this if you used the Amazon AWS SDK approach
00 03 * * * php /home/ubuntu/backups/upload_to_s3bucket.php # run the upload daily at 3 o'clock in the morning

# use this if you used the Amazon S3 Tools approach
00 03 * * * /home/ubuntu/backups/upload_to_s3bucket.sh # run the upload daily at 3 o'clock in the morning
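
You can double-check that the entries were saved by listing the crontab:

$ sudo crontab -l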

Run a Node.js server continuously using forever


By last Friday morning the open bug count had risen above the 150 mark, and we managed to take it down to under 25 by the end of the day, thanks to a dedicated effort by the team. Among the bugs, one was to make the Node.js server run continuously. In our application we are using Node.js to implement a near-realtime notification module, which involves pushing notifications and updates to the web application as well as to the mobile application. The Faye module was used to implement the channel patterns required to identify the different user types and the activities each user type executes.

Node.js is a powerful, event-driven, I/O-based JavaScript server that can be used to develop interactive applications. It is built on top of the V8 JavaScript engine.

Installing Node.js on Ubuntu

I came across an easy approach to install Node.js via the Ubuntu package manager and Chris Lea's PPA.

$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:chris-lea/node.js
$ sudo apt-get update
$ sudo apt-get install nodejs npm

Install Forever

The last command also installed the Node Package Manager (npm). npm provides the facility to install any Node.js module, in a manner similar to installing applications on Ubuntu via apt-get.

$ sudo npm install forever --global

Setting the --global parameter makes the module accessible globally.
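
With forever installed, keeping the server alive is a matter of starting it through forever; a minimal sketch, assuming the server entry point is a hypothetical server.js:

# start the script under forever's supervision (it gets restarted if it crashes)
$ forever start server.js

# list the scripts currently managed by forever
$ forever list

# stop the script when needed
$ forever stop server.js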

Updating Node.js & the modules to the latest stable release

n, aka the Node version manager, is a module that provides the facility to update Node.js to its latest or stable release. It can itself be installed via npm.

Install n

$ sudo npm install --global n

Updating Node.js using n

$ sudo n latest
    or
$ sudo n stable
    or
$ sudo n 0.x.x

Updating global Node.js modules using npm

Note that n only manages Node.js versions; global modules such as npm and forever are updated through npm itself:

$ sudo npm install --global npm
    or
$ sudo npm install --global forever

Connect to Amazon EC2 via PuTTY


In this post I'm going to show how to establish an SSH connection to an Amazon EC2 instance from a Windows box using PuTTY. First of all we need to download the following tools (PuTTY and PuTTYgen); they can be found under the section called For Windows on Intel x86.

Step 1: Generating the private key using PuTTYgen

In the Amazon AWS environment each instance (EC2, RDS, ElastiCache, etc.) is attached to a security group, and access to each service is provided through private key/public key authentication. Each Amazon EC2 instance holds the public key of the key pair it was launched with; to connect to a specific Amazon EC2 instance you need to use the corresponding private key (xxx.pem). PuTTYgen can be used to generate, from the private key obtained from Amazon AWS, the local private key (.ppk) required for establishing a connection to the EC2 instance via PuTTY.

Open PuTTYgen as shown in Figure 1 and click Load to select the private key obtained from Amazon AWS.


Figure 1: Selecting the Amazon AWS Private Key

Next click Save private key, then click OK on the dialog box that appears, as shown in Figures 2 and 3.


Figure 2: Saving the generated Private Key

Provide a suitable name (e.g. aws_putty_private_key.ppk) for the newly created private key, and make sure to store all these keys in a secure place.


Figure 3: Saving the generated Private Key

Step 2: Pointing the generated private key to PuTTY

Now we have finished creating the private key required by PuTTY. Open PuTTY and navigate to Connection -> SSH -> Auth in the left pane of the PuTTY window as shown in Figure 4. Select the newly created private key (aws_putty_private_key.ppk) in the Options controlling SSH authentication pane as shown in Figure 5.


Figure 4: Options controlling SSH authentication


Figure 5: Selecting the generated Private key

Step 3: Providing Amazon AWS EC2 instance information

Navigate to Session in the left pane of the PuTTY window as shown in Figure 6, provide the Host Name or the IP of the Amazon EC2 instance, and set the Connection Type to SSH.


Figure 6: Providing Amazon AWS EC2 instance information

Step 4: Connecting to Amazon AWS EC2 instance

Provide the user name of the EC2 instance (e.g. ubuntu for an Ubuntu AMI).


Upon successful authentication the connection is established. Happy hacking!
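
As an aside, on a Linux or Mac box the same connection can be made with OpenSSH using the .pem file directly, with no conversion needed; a minimal sketch assuming an Ubuntu instance and a hypothetical key path:

$ ssh -i ~/keys/aws_private_key.pem ubuntu@<EC2 hostname or IP>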


Workarounds found for Huawei E220 on Ubuntu 11.10


After upgrading from Ubuntu 11.04 to 11.10 the Huawei E220 modem failed to work, and unfortunately I switched back to Windows 7 for browsing the Internet. I didn't realize that I should have searched for this issue in the first place, and only today did I do it; voila, I managed to find some workarounds to get my good old Huawei E220 modem back to work 🙂

  • Adding the line "blacklist usb_storage" to /etc/modprobe.d/blacklist.conf
  • sudo usb_modeswitch -v 0x12d1 -p 1003 -V 0x12d1 -P 1003 -R

As usual I plugged in the modem, ran the command shown under the second point, and now I'm happily surfing the Internet.

The following message was printed on the terminal.


hayesha@gnu-user:~$ sudo usb_modeswitch -v 0x12d1 -p 1003 -V 0x12d1 -P 1003 -R
[sudo] password for hayesha:

Looking for target devices ...
Found devices in target mode or class (1)
Looking for default devices ...
Found devices in default mode, class or configuration (1)
Accessing device 003 on bus 001 ...
Getting the current device configuration ...
OK, got current device configuration (1)
Using endpoints 0x02 (out) and 0x82 (in)
Not a storage device, skipping SCSI inquiry

USB description data (for identification)
-------------------------
Manufacturer: HUAWEI Technologies
Product: HUAWEI Mobile
Serial No.: not provided
-------------------------
Warning: no switching method given.
Resetting usb device .
OK, device was reset
-> Run lsusb to note any changes. Bye.

hayesha@gnu-user:~$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 003: ID 12d1:1003 Huawei Technologies Co., Ltd. E220 HSDPA Modem / E230/E270 HSDPA/HSUPA Modem
Bus 001 Device 004: ID 03f0:231d Hewlett-Packard
Bus 002 Device 003: ID 064e:f209 Suyin Corp.

Reference:- https://bugs.launchpad.net/ubuntu/+source/modemmanager/+bug/868034

Useful GNU Linux commands


During the past week I spent most of my time and effort on installing and configuring Ubuntu Server 10.04.3 and the required applications, so the following are some of the useful commands I came across.

Manual Partitioning

As a practice, when installing GNU/Linux I partition the hard disk as follows, by selecting the manual partitioning option.

  • / – Root partition
  • /boot – Boot partition that keeps the GRUB loader; especially important when multiple operating systems are installed.
  • /home – Home partition where the user data will be stored
  • swap – Swap partition; it is recommended to allocate double the size of the RAM attached to the machine.

Once the partitioning is completed the OS installation will take a few minutes to complete. The next step is to configure the server to establish a connection to the Internet via the local network.

Network Configuration – Static IP approach

Let us see the steps involved in assigning a static IP and the DNS information.

  • Check whether the network card is working properly and the cable is properly connected as follows:
    $ sudo mii-tool eth0
    This should give the details of the network link instead of a "no link" message.
  • Back up the existing interfaces configuration file as follows:
    $ sudo cp /etc/network/interfaces /etc/network/interfaces.bk
    and open the interfaces file as follows:
    $ sudo vim /etc/network/interfaces
  • Replace the values for address, netmask, gateway, and broadcast with values specific to your desired IP address and network.
    # The loopback network interface
    auto lo
    iface lo inet loopback

    auto eth0
    iface eth0 inet static
    address 192.168.1.40
    netmask 255.255.255.0
    network 192.168.1.0
    broadcast 192.168.1.255
    gateway 192.168.1.1

  • Set the IP(s) of the DNS server(s). Back up resolv.conf as follows:
    $ sudo cp /etc/resolv.conf /etc/resolv.conf.bk
    and update the nameservers as follows:
    $ sudo vim /etc/resolv.conf
    nameserver 192.168.78.174
    nameserver 192.168.78.174
  • Next, restart the network interfaces as follows:
    $ sudo /etc/init.d/networking restart
  • Finally, test the configuration, first by pinging the gateway IP and then another external IP or a site like http://www.google.com
    $ ping 192.168.1.1
    PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
    64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.051 ms
    64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.061 ms
    64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.068 ms
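
To further confirm that the interface picked up the static address, ifconfig can be used as a quick check:

$ ifconfig eth0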

Mount/unmount a filesystem – external hard disk with an NTFS filesystem

  • Find the device name (e.g. /dev/sdb1) to mount using the following command:
    $ sudo fdisk -l
  • Create a mount point and mount the external hard disk as follows (a quick verification is shown after this list):
    $ sudo mkdir /media/external_hard
    $ sudo mount -t ntfs-3g /dev/sdb1 /media/external_hard
  • Unmount the hard disk as follows:
    $ sudo umount /media/external_hard
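
To confirm the mount succeeded before working with the disk, df can be used; a minimal check assuming the mount point above:

$ df -h /media/external_hard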
