Last year we ran a short series called Ask the Experts, in which you wrote in with your virtualization-related questions and we had them answered by experts at Intel and VMware, as well as by our own expert on all things Enterprise & Cloud Computing, Johan de Gelas.

Below you will find the first two answers to some of the questions posed in our "Ask the experts: Enterprise & Cloud computing" blog post.

Q ("Tarrant64"): With the growing use of cloud computing, what are ISPs doing to ensure adequate bandwidth for not only the provider but also the customers?

A: As a company outsources more and more services to the cloud, it becomes increasingly important to make sure that the internet connection is not a single point of failure. Quite a few cloud vendors understand this: they offer dedicated links to their data centers and provide appropriate SLAs for uptime or response time. Of course, these SLAs are pretty expensive. These kinds of "network costs" are often "forgotten" when vendors praise the cost efficiency of "going cloud".

Or the short answer: Don't rely on a simple internet connection to an ISP if you are planning to outsource vital IT services.

Q ("Tarrant64"): How do the cloud servers give results so fast? I imagine there are literally petabytes of data in Google's servers, and it would be way too expensive to run on SSDs. Is there some sort of hierarchy that masks latency? Even then, wouldn't it still take long just to communicate with the server itself?

A: We have recently been in contact with some of Facebook's engineers, as we are evaluating the servers that are part of Facebook's Open Compute Project. Facebook built its own software, servers and of course datacenters, and is sharing these technologies. As Facebook is a lot more open about its underlying architecture than Google, I will focus on Facebook.

The foundations of Facebook are no different from what most sites use: PHP and MySQL. But that is not a very scalable combination. The first trick that the Facebook engineers used was to compile the PHP code into C++ code with the help of their own "HipHop" software. C++ code can be compiled into much better performing and smaller binaries. Facebook reported that this alone reduced CPU usage by 50%. These compute-intensive binaries now run on dual Xeon "Westmere" servers.

A look inside an empty datacenter room at Facebook.

The main speedup comes from an improved version of memcached. In the early days of Facebook, it was Mark Zuckerberg himself who improved this open source software. He describes the improvements Facebook made to memcached here. Because the web objects are cached, only 5% of the requests have to make use of the database. Memcached runs on many dual Opteron "Magny-Cours" servers, as these servers are the cheapest way to house 384 GB (now even 512 GB in version 2) of RAM in one server.

The most recent cache servers are AMD Magny-Cours based. Notice the 12 DIMM slots on each CPU node
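The look-aside caching described above can be sketched in a few lines. This is a minimal illustration, not Facebook's code: the `LookAsideCache` class, the `load_from_db` callback and the TTL value are hypothetical names chosen for the example, and a plain Python dict stands in for the memcached cluster.

```python
import time


class LookAsideCache:
    """Look-aside cache in front of a slow backing store.

    All names here are hypothetical; the dict stands in for memcached.
    """

    def __init__(self, load_from_db, ttl_seconds=60.0):
        self._load_from_db = load_from_db  # called only on a cache miss
        self._ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            # Served from RAM: the database is not touched.
            self.hits += 1
            return entry[0]
        # Miss or expired entry: query the database, then fill the cache.
        self.misses += 1
        value = self._load_from_db(key)
        self._store[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a cached object, for example after the database row changes."""
        self._store.pop(key, None)


if __name__ == "__main__":
    db_queries = []

    def load_profile(user_id):
        db_queries.append(user_id)  # pretend this is an expensive SQL query
        return "profile:" + user_id

    cache = LookAsideCache(load_profile)
    for _ in range(20):
        cache.get("42")
    print("database queries:", len(db_queries), "cache hits:", cache.hits)
```

In this toy run only the first of the 20 requests reaches the database. The hit rate a real site sees depends on how often objects change and how much RAM the cache tier has, which is why cheap, RAM-dense servers matter for this role.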

The database (of the inbox, for example) is not plain MySQL either. It is based upon Cassandra, a distributed database management system. Cassandra works in a decentralized way and is able to scale horizontally, i.e. across many cheap servers. This is in great contrast with most relational databases, which only scale well vertically unless you pay huge premiums for complex solutions such as Oracle RAC.
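One common way to spread keys across many cheap servers is consistent hashing, where both servers and keys are mapped onto a ring of hash values. The sketch below is a generic illustration of that idea and not Cassandra's actual implementation: the `HashRing` class, the virtual-node count and the MD5-based token function are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring (hypothetical sketch, not Cassandra code)."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes  # virtual nodes per server, smooths the load
        self._ring = []        # sorted list of (token, node name)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _token(text):
        # Map a string to a 64-bit position on the ring.
        digest = hashlib.md5(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._token(f"{node}#{i}"), node))

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First ring entry at or after the key's token, wrapping around.
        idx = bisect.bisect_right(self._ring, (self._token(key), ""))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"inbox:{i}" for i in range(1000)]
    before = {k: ring.node_for(k) for k in keys}
    ring.add_node("node-d")  # scale out by adding one more cheap server
    moved = [k for k in keys if ring.node_for(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

The property that matters for horizontal scaling is that adding a server only relocates the keys that now belong to that server; everything else stays where it was, so the cluster can grow without reshuffling all the data.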

13 Comments

  • FunBunny2 - Thursday, July 28, 2011

    I recall that Fusion-io is said to sell a considerable portion of its output to Facebook. It would be interesting to know how they're being used: cache or primary storage.
  • Atom2 - Thursday, July 28, 2011

    For computing resources equal to hardware that can be rented on the internet for about 1,000 USD per month, cloud services charge roughly 10,000 USD or more. With all this sharing and increased use of resources, one would expect prices to be lower, not higher, if 10 people share the same server. In that sense the cloud services seem to be riding on hype, but it is hard to imagine prices dropping by 10x in order to become commercially viable in the future. Why are current cloud prices so far out of proportion?
  • alent1234 - Friday, July 29, 2011

    the economics only make sense for smaller companies

    this year we bought some new servers. dual 5650 xeons, 72GB RAM, 4 hour warranty, etc. $15000 per server. Amazon doesn't have these specs yet and if they did it would cost you a lot more money. say you buy 3TB of storage for a big database that doesn't need screaming I/O. it's $600 for a 500GB SAS drive and you buy 5 or 6 of them. so your total server cost is less than $20,000. you run one server and back it up. if it breaks you wait for HP to come and fix it.

    if this was amazon or another cloud provider they would have to have the resources for this on at least 2 servers because if one dies it has to fail over. that means they have to buy more hardware than you do. and if you buy your regular HP branded enterprise drives then the cloud guys are buying SAN storage which is a lot more expensive for faster failover.

    this is why they aren't cheaper. it only makes sense for smaller companies who don't have a lot of cash to do this since the smaller payments are easier to handle. if they grow then they will buy their own hardware.
  • Guspaz - Friday, July 29, 2011

    That's not really what EC2 is for. You tend not to buy a small number of huge nodes; EC2 nodes are far less powerful than equivalent nodes from their competition (like Linode) anyhow. Nor are you supposed to care about hardware warranties, or even the disk configuration. With EC2, the application should be designed to scale horizontally on many relatively small instances, and hardware failures should not be handled by trying to replace a machine in a certain amount of time, but by simply spinning up a new instance to replace it.
  • bobbozzo - Thursday, July 28, 2011

    accross
    fondations
  • Ryan Smith - Thursday, July 28, 2011

    Fixed. Thanks.
  • DriftEJ20 - Thursday, July 28, 2011

    The spambots are starting to get pretty bad :-[
    Something tells me the majority of Anandtech readers aren't fiending for knockoff Nike's.

    I can't even imagine how many shell casings could stay visible in an FPS with 512GB of RAM. I enjoyed seeing what the Facebook dataroom looks like, thanks :]
  • awaken688 - Friday, July 29, 2011

    I'm not even in the world of cloud computing and found this very interesting. I'm definitely looking forward to more of these.
  • seanp789 - Friday, July 29, 2011

    I think there is a misunderstanding on how MySQL and Cassandra are related. Cassandra is not a management system for MySql.

    MySql - Relational Database
    Cassandra - Distributed Key-Value Store (NoSql)

    These are 2 completely independent types of data stores with very different programming models.
  • mikhailgarber - Friday, July 29, 2011

    Another article conveniently ignoring the difference between "real" databases and NoSQL. There is a reason why relational databases are difficult to scale horizontally. This is the price to pay to store your data in a highly consistent, predictable manner, with referential integrity and transactions. Cassandra et al. are designed to scale schema-less, eventually consistent, non-durable data. Would you like your financial data to be stored as reliably as your Facebook comments? You can't compare the two systems directly; even worse is to pretend that NoSQL is somehow more advanced. It is different.
