Google: 'At scale, everything breaks' (zdnet.co.uk)
84 points by ca98am79 on June 23, 2011 | 24 comments


It is always a pleasure to read Urs Hölzle. I've only been in touch with a small part of his work[1], but I'm always impressed by it.

[1] http://research.google.com/pubs/author79.html


All the best (and most fun) scaling issues are best solved by hacking. Are there really any proper, decent methods out there that can take a company from 5 users a day to 5 billion without any problems at all? I'm aware that you could perhaps "foresee" it and get a big data center and lots of hardware -- but is that really what's called an 'out of the box' solution?

Sounds like when Google gets a problem, they create something to fix that, which is pretty damn cool in my opinion.


As a Google employee, I can say it really is "pretty damn cool".


"...most applications don't use [Google File System (GFS)] today. In fact, we're phasing out GFS in favour of the next-generation file system that is very similar, but it's not GFS anymore."

What could this be? Home-grown? An open source project?


Colossus: see a slide about halfway through this presentation: http://static.googleusercontent.com/external_content/untrust...


"We use tapes, still, in this age because they're actually a very cost-effective way as a last resort for Gmail."

... What kind of tape drives does Google use?


BFT... big freaking tapes. I wonder if it's really tape though. You can get 'tape' backup systems where the cartridge looks just like a tape, but it's actually a special hard drive. I wouldn't be surprised if that's it. Otherwise, I wonder how many miles (of tape) long my Gmail inbox is?


Urs wouldn't have said tape if he didn't mean tape.

This brings to mind an amusing anecdote. I once saw a conversation about how many miles of tape are storing Viagra ads. Someone quipped, "The only thing that I know for certain is that the trend is up."


According to one source I found[1], 60 meters of DAT tape holds 1300 MB of data. That's 21.67 MB per meter. My Gmail inbox is 679 MB, so that's about 31 meters of DAT tape.

[1] http://menehune.opt.wfu.edu/Kokua/SGI/007-2861-005/sgi_html/...
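
For anyone who wants to check the arithmetic above, here is a minimal Python sketch. It assumes data is spread evenly along the tape and uses only the figures quoted in the comment (60 m, 1300 MB, 679 MB). The function name metres_of_tape is made up for illustration.

```python
def metres_of_tape(data_mb: float, capacity_mb: float, length_m: float) -> float:
    """Metres of tape needed for data_mb, assuming uniform density along the tape."""
    density_mb_per_m = capacity_mb / length_m
    return data_mb / density_mb_per_m


# Figures quoted in the comment: a 60 m DAT tape holding 1300 MB, and a 679 MB inbox.
dat_density = 1300 / 60
inbox_m = metres_of_tape(679, 1300, 60)

print(f"DAT density: {dat_density:.2f} MB per metre")
print(f"679 MB inbox: {inbox_m:.1f} m of tape")
```

Swapping in a different capacity and length gives the same estimate for any other tape format.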


LTO5 hits about 1.8 GB per metre, or about 3.6 GB compressed.[1]

There's a reason that people say tape is still cost effective -- and it's not DAT.

[1] http://en.wikipedia.org/wiki/Linear_Tape-Open


Likely a lot less than 1 mile.

LTO5 stores approximately 3 TB with compression (1.5 TB native). The tape is 846 m long.

Let us assume you have 7597 MB of email in your Gmail inbox. That's about 0.48% of the capacity of an uncompressed tape, so a smidge over 4 m. And if Google compresses their data -- then you're potentially looking at about 2 m.
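
The same estimate as a short Python sketch, for anyone who wants to plug in their own inbox size. It uses the figures from the comment (1.5 TB native, 3 TB compressed, 846 m). The binary unit conversion is an assumption, chosen because it reproduces the 0.48% figure, and the helper name lto5_metres is hypothetical.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units, which reproduce the 0.48% figure


def lto5_metres(data_mb: float, capacity_tb: float = 1.5, length_m: float = 846.0) -> float:
    """Metres of LTO5 tape used by data_mb, assuming data is spread evenly."""
    capacity_mb = capacity_tb * MB_PER_TB
    return data_mb / capacity_mb * length_m


inbox_mb = 7597
fraction = inbox_mb / (1.5 * MB_PER_TB)                # share of one native tape
native_m = lto5_metres(inbox_mb)                       # uncompressed
compressed_m = lto5_metres(inbox_mb, capacity_tb=3.0)  # 2:1 compression, per the comment
density_gb_per_m = 1.5 * 1024 / 846                    # native density

print(f"Share of a native tape: {fraction:.2%}")
print(f"Native: {native_m:.2f} m, compressed: {compressed_m:.2f} m")
print(f"Native density: {density_gb_per_m:.2f} GB per metre")
```

With decimal units (1 TB = 1,000,000 MB) the native figure comes out slightly higher, but it is still a little over 4 m.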


Google wouldn't buy a VTL when they already have GFS. When they say tape, they must mean tape.


I love this comment: "The reason why we put it in is not physical data loss, but once in a blue moon you will have a bug that destroys all copies of the online data and your only protection is to have something that is not connected to the same software system." I think that is often overlooked when designing HA storage systems.


I'm a little surprised Google hasn't incorporated SSDs more.


For lots of applications, the workloads are either memory only or large sequential IOs. SSDs matter less in both situations.



Having installed almost 200 Intel SSDs into servers with 0 failures, I doubt his anecdotal evidence.

Also compare with Intel's reported annual failure rates: http://www.anandtech.com/show/4244/intel-ssd-320-review and those from a French retailer: http://www.anandtech.com/show/4202/the-intel-ssd-510-review/...


SSDs are actually cheaper per IOPS.

Considering that everything we knew about hard disk reliability was shown to be wrong (by Google), I wouldn't be surprised if the same holds for SSDs.


It's hard to see SSDs being a clear win on the surface. It would be very interesting, though, to see if the cost begins to change when you consider power savings (much smaller than hoped, but they do exist), figure out how much they'd save on HVAC in the datacenters, etc.


Having worked on power-aware hybrid storage for the last two years, I can say the power savings per GB are pretty much zero; there are significant W/IOPS savings, but they still don't pay for the capital cost. Performance is really the reason to use SSDs.


Per GB is an excellent point; I had not considered that.


Think commodity... have SSDs reached the commodity level yet? From a commodity standpoint, I think it would be far more likely that Google would use an array of SD cards per node. SSDs are really just an array of SD cards anyway, with a pile of hype and marketing on top. You would still get most of the benefits of an SSD. Power usage (especially at idle) might even be less than an SSD's. Google could develop their own wear-leveling algorithms, and the rest of the stuff that an SSD controller provides for the internal flash. Replacement costs could be lower over time as well.


I was at a Linux user group meeting recently where a Google sysadmin gave a talk about hard drive reliability. Someone asked him about SSDs, to which he replied that he couldn't talk about it. My take is that they are definitely trying out SSDs, but either 1) they found SSDs provide a huge competitive benefit and don't want to publicly share that knowledge, or 2) they simply don't have enough data yet. I'm leaning towards #1.


Where did you get the impression that Google hasn't incorporated SSDs?



