Before continuing, please read the first parts of the database sharding adventure:
Database sharding unraveled - part I
Database sharding unraveled - part II

Chapter 1. The small guys

Before really diving into high scalability principles, I want to take a moment to talk about why database sharding has an important role even in small startups or medium sized web-sites (5 - 30k unique visitors/day).

It is equally important and benefic for a smaller web business to prepare itself from the beginning to tackle large amounts of users cheap. If it’s not obvious enough, think about what happens to a web-page that gets some plain old Digg attention. The server quickly collapses and the user experience immediately turns from positive to mega negative.
As I’ve explained before, the whole purpose of sharding is to be able to use an unlimited number of cheap machines topped by an open-source database. As experience taught me, the web server will rarely die. Instead, the DB server will choke easily when having to deal with many simultaneous connections.
The database doesn’t even have to be very big.

Read more

After understanding how to pick the correct dividing logic we continue our journey into database sharding. Many say that sharding is partitioning and they are right, but keep in mind that it’s the most complex form of all. In order to better grasp the concept, think about a field of flowers. Unpartitioned dataIn a normal situation (database), the flowers are all together.

What if you want to pick only the red flowers? Partitioned dataIn this case you would have to check every flower and see which one has the desired color, than pick it up, but that would take to long.

Instead, why not plant all the flowers based on their color. So, if you’d like to get the red ones only, it would be easy as pie.
The only problem which could appear would be if you wanted only the flowers which had 5 petals. That is why you must carefully think things over before starting to split your data.

Alright then, we’ve setup the logic, what next? It’s time to implement it.Now, the implementation is the tricky part.

Read more

This is the first post from a hopefully long series to come, about Database Sharding. dark ice(f)
The best way I can think of to define the concept is to associate it with ice fragments (build them up and you can sculpt anything, but failing to provide the right temperature collapses it all).
The idea is to split tables in a database in what are called shards, or fragments, or pieces. As your application increases in size, you need a way to scale cheap, efficiently and limitless. Furthermore, minor changes to the already existing code are required (buying more hardware is usually cheaper than re-programming).

There are several ways of dealing with database sharding, each with its pros and cons:

  • application layer
  • proxy
  • database layer

Of course, many other methods exist, but they are only implementations at some extent of the above.

Read more

Advertisements