Twitter has been down for more than 2 hours. The last tweets on my timeline reflect panic, as one would expect from the passengers of a ship going down with the waves! Well, I haven't got a Google Wave account yet, so the analogy is only partly coincidental :p
Was I surprised? Not at all! This is not the first time that the micro-blogging site has gone down. Over the past few months, outages have become a regular affair and will most likely continue for a while. No, I don't mean to say that the guys at Twitter are not smart enough; it would be foolish to say that about the people who came up with a concept as brilliant as Twitter. All one can speculate is that the issue probably lies deeper than it seems to us.
Outages are not specific to Twitter alone; most popular services, including Google, Gmail and Facebook, have had their blues too. It is an irony of fate that the very participatory nature of the Social Web is also what often takes it down. For instance, the Internet traffic generated by Michael Jackson's death was much more than the servers were prepared for. It seems that most of the people in the world decided to use the Web to find more information about MJ, and Google interpreted the unprecedented number of incoming requests as a DDoS attack and went into emergency mode. Other Social Web sites such as Facebook, Friendfeed, MySpace and, of course, Twitter are far more open; consequently, they could not balance the load and crashed.
So couldn't they have built the service with load balancing in mind? This question is complex too. The Web is evolving at such a rapid pace that it is no longer possible to follow the traditional model, where even the tiniest detail goes through analysis and validation before development actually begins. Most of today's popular services are widely used for features that had not even been foreseen when the service was first built. Adding new features to existing applications is a continuous process, conceived and implemented in response to changing paradigms. Quick adaptation and innovation-on-demand are what drove their growth and enabled them to change the way we use the Internet now.
Keeping the above scenario in mind, it is easier to understand why Web 2.0 websites often face scalability issues. Most of them started out targeting a small segment of users, with plans to scale up operations at a future date. But as the service becomes popular and the user base grows rapidly, too many requests lead to bottlenecks that make the system unavailable for a period of time. There are technical reasons of course. I have come across quite a large number of articles blaming Ruby on Rails, the framework on which Twitter was built. This cannot be dismissed outright, considering that the framework has not yet been proven in large-scale application development. Facebook has addressed scalability by migrating its back-end code, and it also limits the number of friends a user can add. But there has to be something more to it than the weakness of the development framework.
One of the reasons I love Twitter is that there are no restrictions apart from the 140-character limit per tweet and the model itself. Unlike IMs or Social Networks, Twitter delivers messages from multiple users to multiple users. This makes every user a center of information propagation in a many-to-many messaging system, yet the server remains centralized. So as the number of users increases, the work to be done grows much faster than the user count itself, since every user follows an unspecified number of users and is in turn followed by other users. I am not aware of the business logic or algorithm Twitter uses, but I can understand that it must be efficient enough to handle all those transactions most of the time. However, the frequent outages suggest that something at the fundamental layer needs a re-look.
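To get a feel for how quickly many-to-many delivery adds up, here is a rough back-of-the-envelope sketch in Python. It is only a toy model under my own assumptions (every user posts once, everyone has the same number of followers) and the function names are made up for illustration; it says nothing about how Twitter actually works.

```python
# Toy model of many-to-many delivery. All names here are hypothetical
# and purely illustrative; this is not Twitter's algorithm.

def uniform_network(num_users, followers_each):
    """Follower count per user, assuming everyone has the same number."""
    return [min(followers_each, num_users - 1)] * num_users


def deliveries_per_round(follower_counts):
    """If every user posts one tweet, each tweet has to reach all of that
    user's followers, so total deliveries = sum of all follower counts."""
    return sum(follower_counts)


# 1,000 users with 50 followers each
small = deliveries_per_round(uniform_network(1_000, 50))

# 10x the users, and follower counts that grew along with the network
large = deliveries_per_round(uniform_network(10_000, 500))

print(small, large, large // small)
```

In this model, ten times the users with ten times the followers each means a hundred times the deliveries for a single round of tweets, all landing on the same central service.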
As I said in the beginning, I do not have a Google Wave account yet, but from what I have read and seen in the Developer Conference preview video, its architecture leans towards being distributed rather than centralized. I wonder if the folks at Twitter have looked in that direction. If the servers cannot handle the load now, things will only get harder, because waves of users are just starting to migrate to Twitter. Read: Has Cattlegate Opened The Floodgate?
Also read: http://www.techcrunch.com/2008/05/22/twitter-at-scale-will-it-work/
P.S.: Inviting my friend netgenre to write a guest article on data marshaling in the above context.
Update
Recently I came across (through Twitter, of course) an excellent article at High Scalability which sets out to explain the different factors behind the scalability issues of most social media sites. I found the part on Pull-on-Demand vs Push-on-Demand approaches quite interesting. In the first approach, which is followed by Facebook (according to the same article), the service queries all your friends, fetches their updates and changes, and presents them to you in one place. So if you have 1000 friends, it makes 1000 queries to show your friends' updates. With every new user, or even every new connection, the number of queries to execute rises steeply.
In contrast, the Push-on-Demand approach deals with updates in a different manner. Rather than waiting for queries, it pushes the data to friends right when it changes! The user no longer needs to pull data, since it is already there when s/he logs in. Of course this model is not without drawbacks, but that is beyond the current scope.
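The difference between the two approaches can be sketched in a few lines of Python. This is a minimal in-memory toy, assuming one "query" per followed author on the pull side and one "write" per follower on the push side; the class and method names are my own inventions and are not taken from the High Scalability article, Facebook or Twitter.

```python
from collections import defaultdict


class PullTimeline:
    """Pull-on-Demand: the work happens when a user reads the timeline."""

    def __init__(self):
        self.following = defaultdict(set)   # user -> authors they follow
        self.posts = defaultdict(list)      # author -> [(sequence, text)]
        self.queries = 0                    # per-author lookups performed
        self._seq = 0

    def follow(self, user, author):
        self.following[user].add(author)

    def post(self, author, text):
        # Posting is cheap: store the update once, with the author.
        self._seq += 1
        self.posts[author].append((self._seq, text))

    def timeline(self, user):
        # Reading is expensive: one lookup per followed author.
        items = []
        for author in self.following[user]:
            self.queries += 1
            items.extend(self.posts[author])
        return [text for _, text in sorted(items)]


class PushTimeline:
    """Push-on-Demand: the work happens when an author posts."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> users following them
        self.inbox = defaultdict(list)      # user -> ready-made timeline
        self.writes = 0                     # inbox deliveries performed

    def follow(self, user, author):
        self.followers[author].add(user)

    def post(self, author, text):
        # Posting is expensive: copy the update to every follower's inbox.
        for user in self.followers[author]:
            self.inbox[user].append(text)
            self.writes += 1

    def timeline(self, user):
        # Reading is cheap: the data is already waiting.
        return list(self.inbox[user])


pull, push = PullTimeline(), PushTimeline()
for model in (pull, push):
    model.follow("reader", "alice")
    model.follow("reader", "bob")
    model.post("alice", "hello")
    model.post("bob", "world")

pull_view = pull.timeline("reader")
push_view = push.timeline("reader")
print(pull_view, pull.queries)
print(push_view, push.writes)
```

Both models show the reader the same two updates. The pull version pays with one lookup per followed author every time the timeline is read, while the push version pays with one write per follower every time someone posts.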
But more importantly, which model does Twitter use?
I am not sure, but most likely it uses a hybrid of both models... Just a guess right now ;)