If you had a chance to read my getting started with EC2 article, you will remember that I highlighted some of the challenges with deploying applications on the cloud. One of these challenges can now be easily overcome with a new feature recently added to EC2.
Elastic IP Addresses:
Elastic IP Addresses are static IP addresses designed for dynamic cloud computing, and now make it easy to host web sites, web services and other online applications in Amazon EC2. Elastic IP addresses are associated with your AWS account, not with your instances, and can be programmatically mapped to any of your instances. This allows you to easily recover from instance and other failures while presenting your users with a static IP address.
Availability Zones:
Availability Zones give you the ability to easily and inexpensively operate a highly available internet application. Each Amazon EC2 Availability Zone is a distinct location that is engineered to be insulated from failures in other Availability Zones. Previously, only very large companies had the scale to be able to distribute an application across multiple locations, but now it is as easy as changing a parameter in an API call. You can choose to run your application across multiple Availability Zones to be prepared for unexpected events such as power failures or network connectivity issues, or you can place instances in the same Availability Zone to take advantage of free data transfer and the lowest latency communication.
Every new addition makes EC2 more attractive. In the coming months I will be experimenting more with deploying a large-scale application to the cloud and will post some of my findings.
This article highlights the many different Flex development frameworks available as the Flex community has grown by leaps and bounds in recent years.
I was recently introduced to Amazon's new EC2 service. The idea of cloud computing really intrigued me when I first heard about it, so I decided to take the dive. There is a bit of a learning curve, but once you get going you realize the unlimited potential that cloud computing offers.
EC2 lets you deploy pre-configured, Linux-based images called AMIs. An AMI can be created from scratch or based on a prebuilt version that Amazon or other users have made public. You can quickly deploy to several different types of machines depending on your requirements. The base system has a 1.7GHz Xeon CPU, 1.75GB of RAM, 160GB of local disk, and 250Mb/s of network bandwidth. Currently this will cost you $0.10 per computing hour plus bandwidth costs. You are only charged for the time that the virtual machine is running, and you can start and stop multiple instances at will to scale as you need to. There are also beefier 64-bit machines available at a higher cost.
One limitation (depending on how you look at it) is that persistent storage is not offered on the instances. If an instance crashes at any time after you start it, you lose everything on it. There are ways to overcome this, as I will explain later, but it makes things a bit more challenging. I found that the simplest way to get started is to find a public AMI that meets your needs, make your modifications to the running instance, then save it as your own image in Amazon S3. S3 is another service that Amazon offers for storage, and S3 and EC2 work hand-in-hand with one another.
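To get a feel for what the hourly pricing mentioned above adds up to, here is a rough back-of-the-envelope sketch in Python. It only covers instance hours at the $0.10 rate; bandwidth and S3 charges are left out, and the function names and the 30-day month are my own assumptions for illustration.

```python
def instance_cost_cents(instances, hours_each, cents_per_hour=10):
    """Return the compute cost in cents for `instances` machines
    that each run for `hours_each` hours.

    cents_per_hour defaults to 10, the $0.10 rate quoted in the text.
    Bandwidth and storage charges are NOT included.
    """
    if instances < 0 or hours_each < 0:
        raise ValueError("instances and hours_each must be non-negative")
    return instances * hours_each * cents_per_hour


def format_dollars(cents):
    """Format an integer number of cents as a dollar string."""
    return "${}.{:02d}".format(cents // 100, cents % 100)


# One small instance left running for an assumed 30-day month.
one_month = instance_cost_cents(instances=1, hours_each=24 * 30)
print(format_dollars(one_month))

# Four instances that only run during an 8 hour load spike.
spike = instance_cost_cents(instances=4, hours_each=8)
print(format_dollars(spike))
```

Working in whole cents avoids floating point rounding surprises. With these numbers the always-on instance comes to $72.00 for the month and the four-instance spike to $3.20.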
To get started you will need an account with Amazon Web Services at http://aws.amazon.com. You will need to sign up for both EC2 and S3. It does not cost anything up front, but you will need a credit card for them to draw funds from once you start using the service. One thing that took me a little while to get used to was the extensive use of certificates for authentication. Beyond signing in to your AWS account, nearly everything else with the EC2 service uses certificates or private keys. You use them to start your instances, as well as to gain remote root access to an instance that you have started. It really makes things more secure. So let's get started. By the way, I recently switched from PC to Mac, so all of the instructions will be for the Mac, but they translate easily to the PC if you are familiar with Java.
- Log in to your AWS account. I am assuming you have already signed up for EC2 and S3.
- After you are signed in, click on the "Your Web Services Account" button and you will find the "AWS Access Identifiers" link.
- Select the X.509 certificates link.
- When you click on the "create new" link you will be asked to confirm. Click yes and the following two files will be generated. These are the certificates I mentioned above that are used to authenticate you when any commands are issued to EC2. There is an additional keypair that we will create later to launch your instances.
- X.509 certificate named cert-xxxxxxx.pem
- RSA private key named pk-xxxxxxx.pem
- Next you will need to download the Amazon EC2 command line tools.
- Now it is time to set up your machine to use the EC2 tools.
- Open the terminal, go to your Mac home directory, and create a new folder named .ec2 (so that its path is ~/.ec2).
- Copy the cert-xxxxxxx.pem and pk-xxxxxxx.pem files from above into your ~/.ec2 directory.
- Unzip the tools into the ~/.ec2 directory and move the bin and lib directories out of the unzipped folder so that they sit directly in ~/.ec2 as well. The directory should then contain the following:
- cert-xxxxxxx.pem file
- pk-xxxxxxx.pem file
- The bin directory
- The lib directory
- Next you will need to set a few environment variables. To make things easier you can place these changes in your ~/.bash_profile file. If this file does not exist in your home directory you can create it, then add the following:
# Amazon EC2 tools
export EC2_HOME=~/.ec2
export PATH=$PATH:$EC2_HOME/bin
export EC2_PRIVATE_KEY=`ls $EC2_HOME/pk-*.pem`
export EC2_CERT=`ls $EC2_HOME/cert-*.pem`
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Home/
- After making the changes you will need to reload your ~/.bash_profile by running the command:
source ~/.bash_profile
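Before moving on, it can save some head scratching to confirm that the variables set above actually point at real files and directories. The following is a minimal Python sketch of such a check. It is not part of the EC2 tools; the function name is made up for illustration, and the variable names are simply the ones from the ~/.bash_profile lines above.

```python
import os

# Variable names taken from the ~/.bash_profile lines above.
REQUIRED_VARS = ("EC2_HOME", "EC2_PRIVATE_KEY", "EC2_CERT", "JAVA_HOME")


def find_ec2_env_problems(env=None):
    """Return a list of human-readable problems with the EC2 tool setup.

    env is a mapping of environment variables (defaults to os.environ).
    An empty list means every required variable is set and points at
    something that exists on disk.
    """
    if env is None:
        env = os.environ
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        if not value:
            problems.append("{} is not set".format(name))
        elif not os.path.exists(os.path.expanduser(value)):
            problems.append(
                "{} points to a missing path: {}".format(name, value))
    return problems


if __name__ == "__main__":
    # Prints nothing when the setup looks fine.
    for problem in find_ec2_env_problems():
        print(problem)
```

A side benefit: if more than one pk-*.pem file ends up in ~/.ec2, the `ls` in the profile returns several names at once, and this check reports the resulting value as a missing path.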
- Now you are ready to start issuing commands to EC2 to list images and start instances. The first step is finding the image that is appropriate for your needs. You can test with the Amazon images that are available and customize them to your needs. To list all of the Amazon images, type the following command:
$ ec2-describe-images -o amazon
IMAGE ami-20b65349 ec2-public-images/fedora-core4-base.manifest.xml amazon available public
IMAGE ami-22b6534b ec2-public-images/fedora-core4-mysql.manifest.xml amazon available public
IMAGE ami-23b6534a ec2-public-images/fedora-core4-apache.manifest.xml amazon available public
IMAGE ami-25b6534c ec2-public-images/fedora-core4-apache-mysql.manifest.xml amazon available public
IMAGE ami-26b6534f ec2-public-images/developer-image.manifest.xml amazon available public
IMAGE ami-2bb65342 ec2-public-images/getting-started.manifest.xml amazon available public
IMAGE ami-36ff1a5f ec2-public-images/fedora-core6-base-x86_64.manifest.xml amazon available public
IMAGE ami-bd9d78d4 ec2-public-images/demo-paid-AMI.manifest.xml amazon available public A79EC0DB
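If you would rather not scan a listing like this by eye, the output is easy to pick apart in a script. Here is a small Python sketch that assumes the whitespace-separated column order shown above (IMAGE, AMI id, manifest path, owner, state, visibility). The function names are hypothetical, and the sample text is just a few lines copied from the listing.

```python
def parse_images(output):
    """Parse `ec2-describe-images` text into a list of dicts.

    Assumes the column order shown in the listing above. Lines that
    do not start with IMAGE are ignored, and any extra trailing
    columns are dropped.
    """
    images = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] != "IMAGE":
            continue
        images.append({
            "id": fields[1],
            "manifest": fields[2],
            "owner": fields[3],
            "state": fields[4],
            "visibility": fields[5],
        })
    return images


def find_images(images, text):
    """Return the images whose manifest path contains `text`."""
    return [image for image in images if text in image["manifest"]]


sample = """\
IMAGE ami-20b65349 ec2-public-images/fedora-core4-base.manifest.xml amazon available public
IMAGE ami-23b6534a ec2-public-images/fedora-core4-apache.manifest.xml amazon available public
IMAGE ami-25b6534c ec2-public-images/fedora-core4-apache-mysql.manifest.xml amazon available public
"""

for image in find_images(parse_images(sample), "apache.manifest"):
    print(image["id"], image["manifest"])
```

Searching for "apache.manifest" rather than just "apache" keeps the Apache plus MySQL image out of the result, so only ami-23b6534a is printed.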
- Out of this bunch you should find at least one suitable to test with. We will use the Fedora Core 4 image with Apache from above. Before doing this we need a keypair to start the instance. This keypair will be used to gain root access to the instance through SSH after it is up and running.
- To generate the keypair use the following command. This will create an RSA private key and output it to the screen. Copy the entire key, from -----BEGIN RSA PRIVATE KEY----- to -----END RSA PRIVATE KEY----- inclusive, and paste it into a new file named ec2-keypair in your ~/.ec2 directory.
$ ec2-add-keypair ec2-keypair
- This step is something that I missed at first and it frustrated me until I figured out what I was doing wrong. Before you can use this key to SSH to a running instance, you must set permissions on the file so that only your account has access to it, because SSH will refuse to use a private key file that other users can read. You can do that with the command:
$ chmod 600 ec2-keypair
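The copy, paste and chmod steps above can also be scripted. The sketch below is a hedged Python version: it assumes you have captured the text printed by ec2-add-keypair into a string, pulls out only the PEM block between the standard BEGIN and END lines, and writes it with owner-only permissions. The function names are my own, and I am assuming the tool may print extra lines around the key, which is why everything outside the markers is dropped.

```python
import os

BEGIN_MARKER = "-----BEGIN RSA PRIVATE KEY-----"
END_MARKER = "-----END RSA PRIVATE KEY-----"


def extract_private_key(tool_output):
    """Return only the PEM block from the text printed by ec2-add-keypair.

    Anything before the BEGIN line or after the END line is dropped.
    Raises ValueError if the markers are not found.
    """
    lines = [line.strip() for line in tool_output.splitlines()]
    try:
        start = lines.index(BEGIN_MARKER)
        end = lines.index(END_MARKER, start)
    except ValueError:
        raise ValueError("no RSA private key block found in the output")
    return "\n".join(lines[start:end + 1]) + "\n"


def save_private_key(pem_text, path):
    """Write the key to `path` so that only the owner can read or write it.

    Same effect as saving the file and then running `chmod 600` on it.
    """
    # O_EXCL refuses to overwrite an existing key file by accident.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(pem_text)
    # The mode passed to os.open is filtered by the umask, so set it again.
    os.chmod(path, 0o600)


# Example use (paths as in the steps above):
# pem = extract_private_key(text_printed_by_ec2_add_keypair)
# save_private_key(pem, os.path.expanduser("~/.ec2/ec2-keypair"))
```

Creating the file with restrictive permissions from the start means the key is never readable by other accounts, even for a moment.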
- Now we can boot up an EC2 instance. We have chosen the ami-23b6534a image from above. Use the following command to start the instance:
$ ec2-run-instances ami-23b6534a -k ec2-keypair
- It will take a little while for your instance to start but while you are waiting you can check on the status of the instance with the following command:
$ ec2-describe-instances
Once it is up and running you will see "running" as the status. Take note of the server addresses that this command provides, since they are the DNS addresses you will need to access your instance with a web browser or via SSH. They will be in the format of:
ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com - (Externally accessible DNS address)
domU-xx-xxx-xxx-xxx.compute-1.internal - (Internally accessible DNS address used from instance to instance)
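If you want a script to wait for the instance and grab these addresses, you can feed each INSTANCE line of the ec2-describe-instances output through something like the Python sketch below. It matches on patterns rather than column positions, since I am assuming the address columns may be missing while the instance is still starting. The list of states, the function name and the sample lines are all assumptions for illustration, not output copied from the tools.

```python
import re

# Instance states I am aware of; treat this list as an assumption.
KNOWN_STATES = ("pending", "running", "shutting-down", "terminated")

# Patterns based on the two address formats shown above.
PUBLIC_DNS = re.compile(r"\bec2-[\w.-]+\.amazonaws\.com\b")
PRIVATE_DNS = re.compile(r"\bdomU-[\w.-]+\.internal\b")


def summarize_instance(line):
    """Pull the state and DNS names out of one INSTANCE line.

    Returns None for lines that do not describe an instance.
    """
    fields = line.split()
    if len(fields) < 2 or fields[0] != "INSTANCE":
        return None
    state = next((f for f in fields if f in KNOWN_STATES), None)
    public = PUBLIC_DNS.search(line)
    private = PRIVATE_DNS.search(line)
    return {
        "id": fields[1],
        "state": state,
        "public_dns": public.group(0) if public else None,
        "private_dns": private.group(0) if private else None,
    }


# Hypothetical sample lines, made up to match the formats above.
starting = "INSTANCE i-10a64379 ami-23b6534a pending ec2-keypair 0"
ready = ("INSTANCE i-10a64379 ami-23b6534a "
         "ec2-72-44-33-55.compute-1.amazonaws.com "
         "domU-12-31-33-00-03-57.compute-1.internal running ec2-keypair 0")

print(summarize_instance(starting)["state"])
print(summarize_instance(ready)["public_dns"])
```

A polling script would simply rerun ec2-describe-instances every few seconds until the state comes back as "running" and the public address is filled in.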
- The server instances are locked down pretty tight and you will not have external network access to any of the instances by default. You do have control over opening ports, though, much like managing your own firewall. Network access is not configured for each instance individually; instead you control it by group. You can launch several instances in the same group and grant network access to that group. When you start an instance like we did above, it is started as part of the "default" group. We now need to open up network access for web traffic on port 80 and SSH on port 22 with the following commands:
ec2-authorize default -p 22
ec2-authorize default -p 80
- You can now access your instance by opening up your web browser and entering your address http://ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com
- Now you are ready to access the command line of the instance. This is where the private key that you created earlier comes in. You do not have a root password; instead you use the private key to authenticate yourself. You can access the instance via SSH with the command:
ssh -i ec2-keypair root@ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com
Now you are up and running with your instance. You can change whatever you want and add software to the Linux image. Just remember that your changes do not persist if you shut the instance down. If you do a reboot they will persist. After you have made all of the changes you want, you can repackage the instance as your own image and store it in the Amazon S3 service (LINK TO THESE STEPS)
Challenges of working with EC2
- You get a dynamic IP address each time you boot an image. There are solutions with DynDNS that are worth exploring.
- There is no persistent storage if an instance fails. There are ways to overcome this limitation. So far I have worked with PersistentFS, which allows you to mount a bucket from S3 as a directory in your image.
- You are limited by space in the image to 10GB (I think, I need to confirm). If you are going to store large files I suggest putting them somewhere in the /mnt directory since that has a lot more space. Also, if you save the image, anything in the /mnt folder is not saved as part of the image. You can put log files and other content that you don't want saved in this location.
- Databases are a challenge with limited options for persistence. Third parties are popping up offering db hosting on the cloud so you don't have to manage it yourself. I will explore these more in the future.
The future of scalable computing....
I really feel like cloud-based solutions are the future for hosted solutions. Once you work out some of the limitations you can build a very scalable solution where automated scripts launch new instances as you need to scale. In turn you can shut them down as the load decreases. There are overall architecture needs that have to be addressed to utilize an infrastructure like this, but it is all doable with a bit of ingenuity. Add in the fact that a small business does not have to invest a significant amount in hardware and software to start running on this type of solution and it is a no-brainer. The question of SLAs comes up, and I expect that to be an issue for the short term but solvable in the future.
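To make the idea of automated scaling scripts a little more concrete, here is a minimal Python sketch of the decision such a script has to make. It only does the arithmetic; actually launching or stopping instances would still go through the EC2 tools. Everything here (the function names, the per-instance capacity figure, the minimum and maximum) is a made-up assumption for illustration.

```python
import math


def desired_instance_count(current_load, capacity_per_instance,
                           minimum=1, maximum=10):
    """Return how many instances should be running for the given load.

    current_load and capacity_per_instance use the same unit, for
    example requests per second. The result is clamped between
    minimum and maximum so a spike cannot launch unlimited machines.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(current_load / capacity_per_instance)
    return max(minimum, min(maximum, needed))


def scaling_action(running, desired):
    """Describe what a scaling script would do next."""
    if desired > running:
        return "launch {} more".format(desired - running)
    if desired < running:
        return "shut down {}".format(running - desired)
    return "no change"


# Assume each instance handles about 100 requests per second.
target = desired_instance_count(current_load=450, capacity_per_instance=100)
print(target, scaling_action(running=2, desired=target))
```

The upper limit matters as much as the scaling itself: since you pay by the instance hour, a runaway script is the cloud version of leaving the lights on.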
Getting started is easier with RightScale.com
I also used RightScale when I first got started with EC2. They are a third party that puts a front end on the management of EC2 instances. It makes it a lot easier to get started and get your head around EC2. All you need is an AWS account with EC2 and S3 and you can get started with RightScale. You do not have to deal with all of the command line stuff above or the EC2 tools.
In my last blog posting I discussed the advantages of using MaxMind GeoIP to obtain a site visitor's geographical location based on their IP. In this posting I will show you how to integrate GeoIP into your site using ColdFusion. If you are a ColdFusion developer, the most challenging part is getting the Java source compiled and into a jar file so that you can leverage the GeoIP Java API in ColdFusion. I have made it easier by taking care of this process for you.
The first thing you will need to do is download the example and jar file. Since this example was written using the Lite version of GeoIP, you will also need to download the binary version of the Lite database. Go to http://www.maxmind.com/app/geolitecity, look under the section that says "Binary Format", and click on the "Download the latest GeoLite City Binary Format" link. After that go ahead and extract the .dat file from the zip. Once you have everything up and running you can download the full binary version and just replace the .dat file with it.
The zip file that you downloaded above will contain a jar file and two ColdFusion files. Follow along with the steps below and you will be up and running quickly.
Steps:
- Copy geoIP.jar to JRun4/servers/lib or ColdFusion/runtime/servers/lib (I think that is right for standalone CF; it has been a while since I used the standalone version)
- Restart ColdFusion or the JRun instance
- Create a directory for testing in your webroot and copy index.cfm and Application.cfc from the zip file above
- Open Application.cfc and modify the entry for REQUEST.GeoIPCityDB to point to the location where you extracted the GeoLiteCity.dat file from above. Make sure you use forward slashes and not backslashes.
REQUEST.GeoIPCityDB="C:/geoIP/GeoLiteCity.dat";
- You should then be able to invoke index.cfm from the example and start resolving IPs to geographic locations.
I have included an example usage on my site using both the free version and the paid version so you can get a sense of the differences between the two.
You should be able to take this example code and quickly integrate it into your own site. Overall it is pretty simple. If you are not using an Application.cfc file you will just need to add it to your Application.cfm file. Make sure you add logic so that it is only initialized once. The initial load is a bit expensive because it loads the entire database into memory. That is only about 25MB, and it is worth it for the performance gain since you can then support hundreds of thousands of queries a second. The UDF in the index.cfm file is a little bit bloated but there is some reasoning behind it. When I started using this approach I had already been using IP2Location and had a predefined query structure that I had to adapt to. You can probably simplify the UDF if you choose to. If you have any other questions feel free to post them or email me.
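The "only initialized once" advice above is a general pattern, not something specific to ColdFusion, so here is a language-neutral illustration of it as a small Python sketch. It is not the code from the zip file. The class name, the loader function and the counter are all hypothetical; in ColdFusion the equivalent would be keeping the loaded object in an application scope variable.

```python
import threading


class LazyResource:
    """Load an expensive resource on first use and reuse it afterwards."""

    def __init__(self, loader):
        self._loader = loader          # callable that does the slow load
        self._lock = threading.Lock()
        self._loaded = False
        self._value = None

    def get(self):
        # Fast path: already loaded, no locking needed.
        if self._loaded:
            return self._value
        with self._lock:
            # Check again, another thread may have finished the load
            # while this one was waiting for the lock.
            if not self._loaded:
                self._value = self._loader()
                self._loaded = True
            return self._value


calls = {"count": 0}


def load_database():
    """Stand-in for the slow step of reading the .dat file into memory."""
    calls["count"] += 1
    return {"loaded": True}


geo_db = LazyResource(load_database)
first = geo_db.get()
second = geo_db.get()
print(calls["count"], first is second)
```

The lock is there so that two requests arriving at the same moment right after a restart do not both pay for the expensive load.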
Earlier this year I wrote a blog article about using IP2Location to identify a user's location and ISP based upon their IP. You can find the previous articles here:
http://www.bpurcell.org/blog/index.cfm?mode=entry&entry=1078
http://www.bpurcell.org/viewcontent.cfm?contentID=147
The flaw with this approach is that the data was stored in a database of 4 million rows, and it was very expensive to do the lookups even with the performance optimization of splitting the data up across multiple tables. Recent research has turned up a better approach: GeoIP, a binary file based solution that will support several hundred requests per second. Yes, that is right, several hundred requests per second. The initial setup and configuration takes a bit of time to get going, but it is very simple to update and maintain.
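To show why an in-memory lookup can be so much cheaper than querying millions of database rows, here is a small Python sketch of the general idea: keep the IP ranges sorted in memory and find the right one with a binary search. This is not MaxMind's actual file format or algorithm, which I am not describing here, and the ranges and labels are made up for illustration.

```python
import bisect
import ipaddress

# Made-up ranges for illustration: (first address, last address, label),
# sorted by first address and not overlapping.
RANGES = [
    ("10.0.0.0", "10.0.0.255", "Office A"),
    ("10.0.1.0", "10.0.3.255", "Office B"),
    ("192.168.0.0", "192.168.255.255", "Home network"),
]

# Convert once at load time so every lookup compares plain integers.
STARTS = [int(ipaddress.IPv4Address(first)) for first, _, _ in RANGES]
ENDS = [int(ipaddress.IPv4Address(last)) for _, last, _ in RANGES]
LABELS = [label for _, _, label in RANGES]


def lookup(ip_text):
    """Return the label of the range containing ip_text, or None."""
    number = int(ipaddress.IPv4Address(ip_text))
    # Index of the last range whose start is <= number.
    index = bisect.bisect_right(STARTS, number) - 1
    if index >= 0 and number <= ENDS[index]:
        return LABELS[index]
    return None


print(lookup("10.0.2.17"))
print(lookup("172.16.0.1"))
```

With 4 million sorted ranges a binary search needs only about 22 comparisons per lookup, all in memory, with no query parsing and no disk access. The first lookup above lands in "Office B" and the second falls between ranges, so it returns None.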
GeoIP is a technology from MaxMind that provides developers with a non-invasive way to determine geographical and other information about their Internet visitors in real-time. When a person visits your website, GeoIP can determine which country, region, city, postal code, and area code the visitor is coming from. Furthermore, GeoIP can provide information such as longitude/latitude, connection speed, ISP, company name, domain name, and whether the IP address is an anonymous proxy or satellite provider. In my own experimentation I have found that GeoIP is also more accurate than IP2Location.
GeoIP comes as multiple binary files, with different features available depending on the price that you pay. A listing of the different versions can be found here. For my practical purposes I used both the City and ISP versions. Another great thing that MaxMind offers is a free GeoLite City version that you can use to test integration with your system. All of the APIs are exactly the same, but it is less accurate than the version you purchase.
The final selling point for me with GeoIP is the different APIs that they support. You can integrate it with almost any system (Java, C, Perl, PHP, VB.NET, MS COM, and many others). Since I was using ColdFusion I chose to go the Java route. If you are not experienced with Java you may stumble a bit here, but I plan on posting the jar that I built for integration along with sample code to make it very easy to first try the Lite version, then use the full version if it works well for you. You can find the Java source here along with the GeoLite version.
If you are experienced with Java and ColdFusion it is pretty straightforward to get up and running. You will just need to compile the Java source and build a jar, then place it in the classpath of ColdFusion. After that it is as easy as instantiating the Java object from CF and making calls to retrieve the information. I recommend encapsulating this into a function and keeping the reference to the Java object in an application scope variable that is loaded when the application starts. I did have to work around a few issues in the Java source to get it working properly with CF, but there were not too many changes needed.
I have posted a very easy to follow, step by step set of instructions in this posting, http://www.bpurcell.org/blog/index.cfm?mode=entry&entry=1100, that you can use to get GeoIP up and running on your site.