Continuing the series about Database Sharding, I’m going to talk about the software/hardware architecture. This post started from an excellent read, MySQL Database Scale-out and Replication for High Growth Businesses.


The first order of business is MySQL replication. Replication is needed to provide redundancy and to spread the load on the system even further. In a typical sharded environment, the database is split among multiple servers, with each server holding its own unique part of the data. If one of the servers goes down, all of its data becomes unavailable, and even though the system keeps working, the scenarios that depend on that data will fail. This is where replication comes into play.
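To make the idea concrete, here is a minimal sketch of a shard that keeps serving reads from a replica when its master is unavailable. Everything in it is an assumption made for illustration: the `Shard` class, the `shard_for` helper, the modulo mapping and the host names are hypothetical and are not taken from the article mentioned above or from any MySQL API.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One shard: a master for writes plus zero or more read replicas."""
    master: str
    replicas: list = field(default_factory=list)
    down: set = field(default_factory=set)  # hosts currently marked unavailable

    def host_for_write(self):
        # Writes can only go to the master in this simple model.
        if self.master in self.down:
            raise RuntimeError(f"master {self.master} is down")
        return self.master

    def host_for_read(self):
        # Prefer a live replica; fall back to the master if none is left.
        candidates = [h for h in self.replicas if h not in self.down]
        if candidates:
            return random.choice(candidates)
        if self.master not in self.down:
            return self.master
        raise RuntimeError("no host available for this shard")


def shard_for(user_id, shards):
    # Hypothetical mapping: pick the shard by user id modulo shard count.
    return shards[user_id % len(shards)]


# Hypothetical host names, for illustration only.
shards = [
    Shard("db1-master", ["db1-replica"]),
    Shard("db2-master", ["db2-replica"]),
]
shard = shard_for(7, shards)
shard.down.add("db2-master")   # simulate a master failure
print(shard.host_for_read())   # reads are still served by the replica
```

Without the replica, the same failure would make every user mapped to that shard unreadable, which is the situation described above. Writes still fail in this sketch; promoting a replica to master is a separate problem that it does not attempt to cover.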

There’s a nice tool called iostat for checking disk-related statistics, like the number of reads/writes, the amount of data transferred, etc. It’s a must-have for any good sysadmin, as it helps you identify some of the bottlenecks in the DB.

Together with vmstat, a tool that shows statistics for virtual memory usage, it forms a powerful duo, especially when your DB is running very slowly but the processors are not fully used.

Both tools are explained well enough in their man pages.

So, the only thing remaining is to start them up:

Open two terminal windows. In the first one, run something like iostat -dx 10, which displays the extended device report, refreshed every 10 seconds. You can increase or decrease this number to suit your needs, but don’t make it too small, as it’s better to have stats over a longer period. In the second one, run vmstat 10.

Last but not least, to get iostat you need to install the sysstat package (vmstat is in the procps package, installed by default). On Ubuntu, type: sudo apt-get install sysstat.
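If you would rather not watch the numbers by eye, the extended report can be scanned by a small script. The following is only a sketch: the `busy_devices` function, the 80% threshold and the sample text are made up for illustration, and the sample is shortened to three columns (a real iostat -dx report has many more, and they differ between sysstat versions, which is why the `%util` column is looked up by name).

```python
def busy_devices(report, threshold=80.0):
    """Return {device: %util} for devices at or above the threshold."""
    busy = {}
    util_index = None
    for line in report.splitlines():
        fields = line.split()
        if not fields:
            util_index = None  # a blank line ends the current report block
            continue
        # Older sysstat versions print "Device:", newer ones "Device".
        if fields[0].rstrip(":") == "Device":
            util_index = fields.index("%util") if "%util" in fields else None
            continue
        if util_index is not None and len(fields) > util_index:
            # Some locales print a decimal comma instead of a dot.
            value = float(fields[util_index].replace(",", "."))
            if value >= threshold:
                busy[fields[0]] = value
    return busy


# Shortened, made-up sample; not real measurements.
SAMPLE = """\
Device            r/s     w/s   %util
sda              1.20   85.40   93.10
sdb              0.10    2.30    4.70
"""

print(busy_devices(SAMPLE))
```

A disk that stays near 100% in `%util` while vmstat shows the CPUs mostly idle or waiting on I/O is a good hint that the DB is disk-bound rather than CPU-bound.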

Before continuing, please read the first parts of the database sharding adventure:
Database sharding unraveled - part I
Database sharding unraveled - part II

Chapter 1. The small guys

Before really diving into high scalability principles, I want to take a moment to talk about why database sharding plays an important role even in small startups or medium-sized websites (5 - 30k unique visitors/day).

It is equally important and beneficial for a smaller web business to prepare itself from the beginning to handle large numbers of users cheaply. If it’s not obvious enough, think about what happens to a web page that gets some plain old Digg attention. The server quickly collapses and the user experience immediately turns from positive to mega negative.
As I’ve explained before, the whole purpose of sharding is to be able to use an unlimited number of cheap machines topped by an open-source database. As experience has taught me, the web server will rarely die. Instead, the DB server will choke easily when it has to deal with many simultaneous connections.
The database doesn’t even have to be very big.


I have a lot of article drafts sitting unused in my WP DB and I’ve decided to release them even though I don’t have very much time to get into details.
Here’s one of them.
I always comment the code I write like:

// Bogdan -> initializing dispatcher
$this->dispatcher->init();
//-

Very often I’m required to extract the parts of the code I wrote, even though they are not full functions or classes, but just simple variables, or…

When running symfony propel-build-model with sfGuardPlugin installed, some errors might appear:
Error: Attempt to set foreign key to nonexistent table, sf_guard_user!

Solution: The solution is very simple and involves changing the name of the database in the schema.xml to propel. Don’t worry, the database name will remain the one set in propel.ini; this only makes the different XML schema files use the same name so that foreign keys between them can be processed.
The whole line should be:

<database name="propel" defaultIdMethod="native" noxsd="true"   package="lib.model">

If that doesn’t work, try changing the package from lib.model to plugins.sfGuardPlugin.lib.model (not recommended, but if it does the job, why not… ;)).
