Twitter has been down for more than two hours. The last tweets on my timeline reflect the kind of panic one would expect from the passengers of a ship going down with the waves! Well, I haven't got a Google Wave account yet, so the analogy is only partly coincidental :p
Was I surprised? Not at all! This is not the first time the micro-blogging site has gone down; over the past few months, outages have become a regular affair and will most likely continue for a while. No, I don't mean to say that the folks at Twitter are not smart enough; it would be foolish to say that about the people who came up with a concept as brilliant as Twitter. The only reasonable speculation is that the issue lies deeper than it appears.
Outages are not specific to Twitter alone; most popular services, including Google, Gmail and Facebook, have had their share of downtime too. It is an irony of fate that the very participatory nature of the Social Web is also the factor that takes it down so often. For instance, the Internet traffic generated by Michael Jackson's death was far more than the servers were prepared for; it seemed as if most of the world had decided to use the Web to find more information about MJ. Google interpreted the unprecedented flood of incoming requests as a DDoS attack and went into emergency mode. Other Social Web sites such as Facebook, FriendFeed, MySpace and, of course, Twitter are far more open; consequently they could not balance the load and crashed.
So couldn't they have built the service with load balancing in mind? This question is complex too: the Web is evolving at such a rapid pace that it is no longer possible to follow the traditional model, where even the tiniest detail goes through analysis and validation before development actually begins. Most of today's popular services are widely used for features that had not even been foreseen when the service was first built. Adding new features to existing applications is a continuous process, conceived and implemented in response to changing paradigms. Quick adaptation and innovation-on-demand is what drove their growth and enabled them to change the way we use the Internet.
Keeping this in mind, it is easier to understand why Web 2.0 websites so often face scalability issues. Most of them started out targeting a small segment of users, with plans to scale up operations at a future date. But as the service becomes popular and the user base grows exponentially, too many requests lead to bottlenecks that leave the system defunct for a period of time. There are technical reasons too, of course; I have come across quite a number of articles blaming Ruby on Rails, the framework on which Twitter was built. This cannot be refuted entirely, considering that the framework has not been proven in large-scale application development. Facebook has addressed the issue by migrating its back-end code, and it also limits the number of friends a user can add. But there has to be something more to it than the weakness of a development framework.
One of the reasons I love Twitter is that there are no restrictions apart from the 140-character limit per tweet and the model itself. Unlike IMs or social networks, Twitter delivers messages from multiple users to multiple users. This makes every user a center of information propagation in a many-to-many messaging system, yet the server remains centralized. So as the number of users grows, the work multiplies rapidly, since every user follows an arbitrary number of users and is in turn followed by others. I am not aware of the business logic or algorithms Twitter uses, but they must be efficient enough to handle all those transactions most of the time. However, the frequent outages suggest that something at the fundamental layer needs a re-look.
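To make the many-to-many point concrete, here is a minimal sketch in Python of why a centralized server's workload grows with the shape of the social graph rather than with the user count alone. The names (`MicroBlog`, `follow`, `post`) are entirely hypothetical; Twitter's actual internals are unknown to me.

```python
from collections import defaultdict

class MicroBlog:
    """Toy many-to-many delivery with all state on one central 'server'."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> set of followers
        self.inboxes = defaultdict(list)    # user -> tweets delivered to them

    def follow(self, who, whom):
        self.followers[whom].add(who)

    def post(self, author, tweet):
        # One write fans out into one delivery per follower, so the cost
        # of a single tweet is proportional to the author's follower count.
        for follower in self.followers[author]:
            self.inboxes[follower].append((author, tweet))
```

One tweet from a heavily followed account fans out into as many deliveries as that account has followers, so it is the average follower count, not just the number of users, that drives the load.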
As I said in the beginning, I do not have a Google Wave account yet, but from what I have read and seen in the Developer Conference preview video, its architecture leans towards being distributed rather than centralized. I wonder if the folks at Twitter have looked in that direction; if the server is unable to handle the load now, waves of users are only just starting to migrate to Twitter. Read: Has Cattlegate Opened The Floodgate?
Also read: http://www.techcrunch.com/2008/05/22/twitter-at-scale-will-it-work/
P.S.: Inviting my friend netgenre to write a guest article on data marshaling in the above context.
Update
Recently I came across (through Twitter, of course) an excellent article at High Scalability which explains the different factors behind the scalability issues of most social media sites. I found the part on the Pull-on-Demand vs Push-on-Demand approaches quite interesting. In the first approach, which Facebook follows (according to the same article), the service queries all your friends, fetches their updates and changes, and presents them to you in one place. So if you have 1,000 friends, it makes 1,000 queries just to show your friends' updates, and with every new user, or even every new connection, the number of queries to execute keeps climbing.
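Here is a minimal sketch of the pull model as the article describes it, with hypothetical helpers (`friends_of`, `fetch_updates`) standing in for whatever data access layer the real service uses:

```python
def pull_timeline(user, friends_of, fetch_updates):
    """Pull-on-Demand: assemble the timeline at read time.

    friends_of(user) yields friend ids; fetch_updates(friend) returns that
    friend's recent updates as (timestamp, author, text) tuples. Both are
    hypothetical stand-ins for the service's data layer.
    """
    timeline = []
    for friend in friends_of(user):            # 1,000 friends => 1,000 fetches
        timeline.extend(fetch_updates(friend))
    timeline.sort(reverse=True)                # newest first, by timestamp
    return timeline
```

Reads are the expensive part here: every page load repeats the full fan-in over the reader's friend list.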
In contrast, the Push-on-Demand approach deals with updates in a different manner. Rather than waiting for queries, it pushes the data to friends the moment it changes! The user no longer needs to pull data, since it is already there when s/he logs in. Of course this model is not without drawbacks, but those are beyond the current scope.
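The push counterpart, under the same hypothetical helpers, moves the cost from read time to write time:

```python
def push_update(author, update, friends_of, timelines):
    """Push-on-Demand: deliver at write time.

    timelines is assumed to be a dict mapping each user to a list acting
    as their pre-materialized timeline (e.g. a defaultdict(list)).
    """
    for friend in friends_of(author):    # fan-out happens on the write
        timelines[friend].append(update)

def read_timeline(user, timelines):
    # Reading becomes a single cheap lookup: the data is already there.
    return timelines[user]
```

The trade-off is visible in the two sketches: pull makes writes cheap and reads expensive, while push does the opposite. A heavily followed account is painful under push, and a user following thousands of accounts is painful under pull.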
But more importantly, which model does Twitter use?
I am not sure, but most likely it uses a hybrid of both models... just a guess right now ;)
8 comments:
I totally agree that the root cause is still unidentified. What I think could be the reason is that we see so many "Beta" versions around that we tend to adopt the mindset that this is just how it is going to work, even though it was rolled out on a test basis. I think this can be a plus as well as a negative point.
Also, I would like to add that Cloud Computing should be looked upon as an alternative to Distributed Computing. I think that should ease the pressure on the servers. What's your take?
I'm reading your post at just another epic moment where I think Twitter is down again...
I'm not enough of a tech junkie to understand the technical aspect, but from a consumer's point of view it's really disappointing to face such problems.
For me it's a sheer waste of time when I log in to tweet for a while and find the system down,
leaving me looking for updates on Facebook to verify what the problem really is..!
The company should really take strong steps to resolve these matters... so that we can enjoy uninterrupted tweeting..!!
Hmmm, very well done, mate.
Nice post, and thanks for sharing the other links too.
Very informative.
Thanks for the comments :)
@aditi - I'm in favor of perpetual beta; given the speed at which technology, and our response to it, keeps changing, following the traditional model of development might not work...
As for the Cloud, I think Twitter is ready for that; others aren't. Distributed computing, as in the server should be decentralized? Yep, IMHO :)
@prianca - Hope they come out with a fix soon, or else I'm completely wrong :p
@nadhiya - Thanks for the compliment :)
Cloud Computing is not the best option for high-volume services like Twitter, IMHO, because a considerable amount of processing power is spent on virtualization. Also, the back-end for Twitter (etc.) is going to be a fixed one, in the sense that they won't keep changing the OS/DB/other software, so setting it up on a real server rather than a virtual one makes sense.
About Distributed Computing, yes, I agree. Twitter is now gaining volume in countries other than the USA and Japan, so it is the right time for Twitter to consider setting up data centres at different geographical locations. But given the current model of Twitter, where there /must/ be a centralized database, I wonder how they'll manage concurrency across multiple locations. Maybe tweets from farther locations will take a little longer. One more point to keep in mind: GMail has just three locations for its server farms (nslookup gmail.com), and I don't need to tell you about the amount of traffic it handles. Google.com has four.
One last thing. Even Twitter has limits: on the number of people you can follow in a day, and on the number of tweets you can send in an hour from an account or an IP address. Most of the APIs have a rate limit. You can check out apiwiki.twitter.com for more info :)
@ninad Thanks for the comment :)
But isn't Twitter already part of the Cloud? REST architecture and OAuth are characteristics that define Cloud standards.
Further, multiple locations is an option that never crossed my mind, so I can't say much about the merits and demerits... but yeah, most giants, including Google, YouTube etc., use a centralized database, and yet they crash too! :p BTW, do you know what database Twitter uses? MySQL, InnoDB?
Finally, as I said in the post, I'm not sure what the problem is; I just know that there is a problem, and it's getting more complex... :)
Okay, fair point. But I seem to have been experiencing outages ever since I signed up for Twitter more than two years ago. What's more, other sites like Slandr, Tweete etc. seem to provide far better service. That's probably because the bulk of the traffic still goes to Twitter.com and not to these others.
Even so, I think the level of outages that Twitter faces takes it far beyond the excuse of new, evolving technology.
The other point is that while the brilliance of the idea can't be denied, it doesn't necessarily mean they are great at sustaining great service.
Good post, btw.
I must say... reading anything technical or knowledgeable was never this simple and clear for me! I really enjoyed reading you :)