The Space Program Blog has a post with some info about EC2 best practices. There are some interesting things in there. One of the most important ones, I thought, was as follows:

Keep your database as small as possible. Store as much data as possible on Amazon Simple Storage Service (S3), NOT in your database and just store the S3 key to the data in your database. Consider putting any blob or large text fields in S3. This will make it much easier and faster to manage database backups, plus your database will perform better.

This is something I have wanted to do for a while. Take, as an example, my image hosting site (no link, since it's mainly back-end stuff). At the moment, most of the data is stored in the SQL server, mainly because of laziness. If I wanted it to work well on Amazon EC2, I would store only basic info in the DB, and make sure that info can be rebuilt from S3. How? Serialization.

With the image site, the most important thing to store is the images. Think of each image as an object: it has a URL, a name, a content type, binary data and a unique ID (I use GUIDs). My thinking is that this info should be placed in an object and then serialized whichever way you want (XML is probably the best).
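To make the idea concrete, here is a minimal sketch in Python of what such an object and its XML form could look like. The class name `ImageRecord`, the XML element names and the `data_key` field are all made up for illustration. The sketch also assumes the binary data is stored as its own S3 object and only referenced by key in the XML, rather than being embedded in it.

```python
import uuid
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical metadata object for one uploaded image."""
    url: str
    name: str
    content_type: str
    data_key: str  # assumed: S3 key under which the binary image data lives
    id: str = field(default_factory=lambda: str(uuid.uuid4()))  # GUID


def to_xml(record: ImageRecord) -> bytes:
    """Serialize the record to a small XML document (UTF-8 bytes)."""
    root = ET.Element("image", attrib={"id": record.id})
    for tag in ("url", "name", "content_type", "data_key"):
        ET.SubElement(root, tag).text = getattr(record, tag)
    return ET.tostring(root, encoding="utf-8")


def from_xml(payload: bytes) -> ImageRecord:
    """Rebuild a record from the XML produced by to_xml."""
    root = ET.fromstring(payload)
    return ImageRecord(
        id=root.attrib["id"],
        url=root.findtext("url"),
        name=root.findtext("name"),
        content_type=root.findtext("content_type"),
        data_key=root.findtext("data_key"),
    )
```

The round trip `from_xml(to_xml(record))` gives back an equal record, which is the property that matters here: everything the DB knows about an image can be recovered from the XML alone.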

When a new image is uploaded, the main image is placed in S3 storage, and the XML object file is placed there too. If your DB goes belly up, you should have a script or program that asks S3 for all the XML objects and loads them back into the DB. This allows for easy replication of the DB. It also means the DB servers don’t really have to know about one another.
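A rough sketch of such a rebuild script, again in Python, might look like the following. To keep it self-contained, a plain dict of key to bytes stands in for the S3 bucket and SQLite stands in for the SQL server; a real version would list and fetch the objects through an S3 client instead. The table layout, the `.xml` key suffix and the function name `rebuild_database` are assumptions for illustration only.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rebuild_database(bucket: dict, conn: sqlite3.Connection) -> int:
    """Reload image metadata into the DB from the XML objects in `bucket`.

    `bucket` is a stand-in for S3: a mapping of object key -> object bytes.
    Returns the number of XML objects loaded.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "id TEXT PRIMARY KEY, url TEXT, name TEXT, "
        "content_type TEXT, data_key TEXT)"
    )
    restored = 0
    for key, payload in bucket.items():
        if not key.endswith(".xml"):
            continue  # skip the binary image objects themselves
        root = ET.fromstring(payload)
        # INSERT OR REPLACE makes the script safe to run more than once.
        conn.execute(
            "INSERT OR REPLACE INTO images VALUES (?, ?, ?, ?, ?)",
            (
                root.attrib["id"],
                root.findtext("url"),
                root.findtext("name"),
                root.findtext("content_type"),
                root.findtext("data_key"),
            ),
        )
        restored += 1
    conn.commit()
    return restored
```

Because the rows are keyed on the GUID and written with `INSERT OR REPLACE`, the same script can be pointed at an empty DB on a fresh server or re-run against an existing one without creating duplicates.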

[update] The Space Program Blog also has this post about getting MySQL running on EC2 in the /mnt storage partition. Mind you, they mention that if anything goes wrong with the instance, the data on /mnt goes away, but still handy! :)

[Via AWS Blog]
