Author Topic: how does google not run out of space  (Read 11371 times)


Offline sony mavicaTopic starter

  • Frequent Contributor
  • **
  • Posts: 472
  • Country: nz
how does google not run out of space
« on: April 26, 2017, 01:42:06 am »
I have always wondered this, as the amount of data sent to the Google servers every couple of seconds is like 1 TB.

Plus their sites like YouTube have like 6 different encodes of the same video, and up to 9 if the original video is 8K, and for every video uploaded to YouTube they keep the original file.

How can they keep up with so much data being sent to their servers?
« Last Edit: April 26, 2017, 01:45:17 am by sony mavica »
MORE POWER!
 

Offline Ampera

  • Super Contributor
  • ***
  • Posts: 2578
  • Country: us
    • Ampera's Forums
Re: how does google not run out of space
« Reply #1 on: April 26, 2017, 01:45:57 am »
Google is unimaginably rich. They can afford that space. But it is an interesting question: what sort of drives do they use, are there any special storage devices, etc.
I forget who I am sometimes, but then I remember that it's probably not worth remembering.
EEVBlog IRC Admin - Join us on irc.austnet.org #eevblog
 

Online RoGeorge

  • Super Contributor
  • ***
  • Posts: 6146
  • Country: ro
Re: how does google not run out of space
« Reply #2 on: April 26, 2017, 01:49:04 am »
One of Google's long-term missions is to be the repository for all human-produced information.

Google is constantly increasing/upgrading their storage space, and there are many data centers around the world:


Offline tautech

  • Super Contributor
  • ***
  • Posts: 28141
  • Country: nz
  • Taupaki Technologies Ltd. Siglent Distributor NZ.
    • Taupaki Technologies Ltd.
Re: how does google not run out of space
« Reply #3 on: April 26, 2017, 01:49:22 am »
Which begs the question: what form of data storage has the highest density?
Used to be tape.  :-//
Avid Rabid Hobbyist
Siglent Youtube channel: https://www.youtube.com/@SiglentVideo/videos
 

Offline Dubbie

  • Supporter
  • ****
  • Posts: 1114
  • Country: nz
Re: how does google not run out of space
« Reply #4 on: April 26, 2017, 01:59:21 am »
Would be flash memory probably.
Not the cheapest though!
 

Offline helius

  • Super Contributor
  • ***
  • Posts: 3632
  • Country: us
Re: how does google not run out of space
« Reply #5 on: April 26, 2017, 02:02:57 am »
I think atomic force microscopy still has the record for densest information storage.
Now reading back the data, that's a bit more complex :)
 

Offline sony mavicaTopic starter

  • Frequent Contributor
  • **
  • Posts: 472
  • Country: nz
Re: how does google not run out of space
« Reply #6 on: April 26, 2017, 03:18:13 am »
Quote from: tautech
Which begs the question; what form of data storage has the highest density ?
Used to be tape.  :-//

Still is! https://phys.org/news/2015-04-tape-storage-milestone-areal-density.html

220 TB per tape.

But, like MiniDV, does the tape degrade quickly after a couple of uses?
MORE POWER!
 

Offline Red Squirrel

  • Super Contributor
  • ***
  • Posts: 2749
  • Country: ca
Re: how does google not run out of space
« Reply #7 on: April 26, 2017, 03:25:20 am »
Since they own their own data centres they can pretty much control every aspect of them. My guess is they have dedicated staff whose whole job is adding more disk space: ordering, assembling, racking and plugging in systems. They probably have custom setups where they snap everything together, put like 60 of the highest-capacity drives available in a single 4U chassis, slide it in and plug it in, and then the cloud system takes over from there. They probably have like 10+ people who do this non-stop all day. When they run out of floor space they build a new data centre. At some point they probably take down older storage servers to replace them with newer, higher-density drives, and just repeat.

But it is mind-boggling just how much data they do produce, so I do wonder how they in fact keep up. It seems that even if people are putting together storage pods day after day, it would still not be fast enough.
 

Offline digsys

  • Supporter
  • ****
  • Posts: 2209
  • Country: au
    • DIGSYS
Re: how does google not run out of space
« Reply #8 on: April 26, 2017, 03:48:28 am »
Quote from: sony mavica
... but like mini dv does the tape degrade quickly after a couple of uses
Actually, I worked with a mate who looked after a data storage centre. The tapes can last a very long time, 10-20 yrs, depending on quality, BUT to achieve that they are kept in a temperature-controlled room, flipped every few months or so to avoid print-through, and spooled and re-spooled. Orientation is very important, due to the effects of gravity.
Hello <tap> <tap> .. is this thing on?
 

Offline Red Squirrel

  • Super Contributor
  • ***
  • Posts: 2749
  • Country: ca
Re: how does google not run out of space
« Reply #9 on: April 26, 2017, 04:16:47 am »
I had looked into tapes for my home backups but felt they are not really practical due to those issues.  You really need to cycle them often to avoid bit rot, and each time you do cycle them, you reduce their life.  They are only good for a couple thousand spools.  I suppose if they are kept in a faraday cage at like -100C they might last longer. But print through might still be an issue.
 

Offline Brumby

  • Supporter
  • ****
  • Posts: 12288
  • Country: au
Re: how does google not run out of space
« Reply #10 on: April 26, 2017, 04:58:44 am »
While the hardware question is a good one - you have to admire the software design that manages it all.  This is one place where I imagine scalability is challenged on a daily basis.
 
The following users thanked this post: Kilrah

Offline digsys

  • Supporter
  • ****
  • Posts: 2209
  • Country: au
    • DIGSYS
Re: how does google not run out of space
« Reply #11 on: April 26, 2017, 05:00:30 am »
Quote from: Red Squirrel
... But print through might still be an issue.
This may actually be a good school / science experiment for the space station! They're always looking for different ideas. Send up 20 tapes, with 20 control tapes on Earth, then bring 1 tape down every year or so and test it against the reference ones. Any kids listening? :-)
Hello <tap> <tap> .. is this thing on?
 

Offline CJay

  • Super Contributor
  • ***
  • Posts: 4136
  • Country: gb
Re: how does google not run out of space
« Reply #12 on: April 26, 2017, 06:19:11 am »
I have it in my head that Google use commodity drives for their storage, the kind of thing you'd find in a desktop PC but that's based on a memory of a hardware reliability study they published a few years ago.

They also used to run their own blend of server, which amounted to little more than a commodity board on a folded metal tray. I think they'd worked out that, for the number of cores and terabytes needed, having massive redundancy in their server and disk arrays and using consumer-grade parts (albeit mid- to high-end ones) instead of dedicated server hardware would save a *lot* of money at the cost of a shorter lifecycle. (However, I do remember the reliability difference between enterprise and commodity drives being surprisingly small after early mortalities were weeded out, and if they were all treated well.)

It may of course have all changed now and they've decided that racks crammed full of flash chips are much better.

I think this is the study or a close relative:
http://matrix.lt/download/google-hdd-disk-failures-analysis.white_paper.pdf
 

Offline Red Squirrel

  • Super Contributor
  • ***
  • Posts: 2749
  • Country: ca
Re: how does google not run out of space
« Reply #13 on: April 26, 2017, 07:48:37 am »
Yeah, the slight performance difference between "enterprise" and consumer drives is probably too negligible to justify the increased cost. An individual hard drive is also seen as a very tiny piece as far as Google is concerned - heck, even for me. I don't store data on single hard drives, I store it on multiple at once, AKA RAID. I imagine Google has their own custom RAID system that can span a crazy number of drives with a crazy amount of performance and data safety tweaks.

Due to the high volume of writes, flash is probably also not viable, as flash has write limitations, so I imagine they'll always stay on spinning disks, maybe also RAM.
 

Offline Kilrah

  • Supporter
  • ****
  • Posts: 1852
  • Country: ch
Re: how does google not run out of space
« Reply #14 on: April 26, 2017, 07:54:27 am »
how does google not run out of space
They add more, and more, and more, all the time.
 

Offline rrinker

  • Super Contributor
  • ***
  • Posts: 2046
  • Country: us
Re: how does google not run out of space
« Reply #15 on: April 26, 2017, 01:41:01 pm »
 A common commercial SAN uses 3U disk trays that hold 15 drives. There are now 12TB enterprise class drives. That's 180TB for every 3U, and you get 12 disk trays plus the 4U storage controller in a rack (with a couple of U left for the switch). 2 PB per rack.

And there are other brands and models of storage systems that have a much greater density.
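
For anyone who wants to play with the numbers, here is a rough Python sketch of that per-rack estimate. The tray size, drive size, tray count and controller height are the figures quoted above; the 42U rack height and decimal units (1 PB = 1000 TB) are assumptions, and the result is raw capacity before any RAID or replication overhead.

```python
# Figures taken from the estimate above.
DRIVES_PER_TRAY = 15
TB_PER_DRIVE = 12
TRAY_HEIGHT_U = 3
TRAYS_PER_RACK = 12
CONTROLLER_HEIGHT_U = 4

# Assumption: a standard full-height 42U rack (not stated in the thread).
RACK_HEIGHT_U = 42


def rack_estimate():
    """Return raw capacity and leftover rack units for one rack."""
    tb_per_tray = DRIVES_PER_TRAY * TB_PER_DRIVE
    raw_tb_per_rack = tb_per_tray * TRAYS_PER_RACK
    used_u = TRAYS_PER_RACK * TRAY_HEIGHT_U + CONTROLLER_HEIGHT_U
    return {
        "tb_per_tray": tb_per_tray,
        # Decimal units assumed: 1 PB = 1000 TB.
        "raw_pb_per_rack": raw_tb_per_rack / 1000,
        "spare_u": RACK_HEIGHT_U - used_u,
    }


if __name__ == "__main__":
    print(rack_estimate())
```

With those inputs it works out to 180 TB per tray and about 2.16 PB raw per rack, with 2U left over, which matches the "2 PB per rack" ballpark.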

 

Offline boffin

  • Supporter
  • ****
  • Posts: 1027
  • Country: ca
Re: how does google not run out of space
« Reply #16 on: April 26, 2017, 02:12:51 pm »
If you're looking for 'do it yourself' storage solutions, these are two pretty good places to start. I've used numerous 6048Rs over the last few years, and they're pretty damn rugged and easy to deal with. The 36-drive version is pretty annoying if the rack isn't organized well enough to get those back drives out, but the 24-drive version is really nice and simple.

https://www.supermicro.com/products/system/4U/6048/SSG-6048R-E1CR36N.cfm


Backblaze sell silly cheap backup (and block storage), and they made and open-sourced their own storage pods; details are here. Problem is it's impossible to replace drives 1 by 1 when they break. Lots of history about their pods on their website, and here's the most recent.

https://www.backblaze.com/blog/open-source-data-storage-server/
 

Offline mleyden

  • Contributor
  • Posts: 20
  • Country: ie
Re: how does google not run out of space
« Reply #17 on: April 26, 2017, 04:20:03 pm »
Having spent some time working on a Google for Work rollout last year...

Google will store your data in at least 3 and up to a max of 5 of their datacenters. You will access data from the datacenter you are nearest to. I assume it is similar for all their services... So take the 1TB / sec and it becomes 3TB / sec! Very few can compete with them - perhaps Amazon and Microsoft - but no one else could afford the infrastructure.
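
As a back-of-envelope sketch of that multiplication in Python: the 1 TB/s ingest figure is only the guess used in this thread, not a measured number, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def replicated_write_rate(ingest_tb_per_s, copies):
    """Total write rate across all datacenters when every byte is kept `copies` times."""
    return ingest_tb_per_s * copies


def stored_per_day_pb(write_tb_per_s):
    """Data written per day in PB (decimal units assumed: 1 PB = 1000 TB)."""
    return write_tb_per_s * SECONDS_PER_DAY / 1000


if __name__ == "__main__":
    ingest = 1.0  # TB/s, the thread's guess, not a real Google figure
    for copies in (3, 5):  # the 3-to-5 datacenter range mentioned above
        rate = replicated_write_rate(ingest, copies)
        print(copies, "copies:", rate, "TB/s,", stored_per_day_pb(rate), "PB/day")
```

At 3 copies that guess would mean roughly 259 PB of new writes per day, which shows how quickly the replication factor dominates the total.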
 

Offline suicidaleggroll

  • Super Contributor
  • ***
  • Posts: 1453
  • Country: us
Re: how does google not run out of space
« Reply #18 on: April 26, 2017, 04:28:34 pm »
Quote from: Red Squirrel
Due to the high volume of writes flash is probably also not viable as flash has write limitations so I imagine they'll always stay on spinning disks
That would assume they actually delete things.  Flash write limitation only comes into play when you write, delete, write, delete, write, delete, over and over again, hundreds of TB per drive.  If all you're doing is archiving, that's one write cycle, and then it just sits there forever being occasionally read from.
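
A rough way to put numbers on that, as a Python sketch. The 1 TB capacity and 300 TBW endurance rating are hypothetical values picked for illustration, not the specs of any real drive, and write amplification is ignored.

```python
def endurance_fraction_used(capacity_tb, rated_tbw, full_rewrites):
    """Fraction of a drive's rated write endurance consumed.

    capacity_tb   -- drive capacity in TB
    rated_tbw     -- rated endurance in terabytes written (TBW)
    full_rewrites -- how many times the whole drive is filled
    """
    written_tb = capacity_tb * full_rewrites
    return written_tb / rated_tbw


if __name__ == "__main__":
    capacity = 1.0   # TB, hypothetical drive
    rating = 300.0   # TBW, hypothetical endurance rating

    # Archive use: fill the drive once, then only read from it.
    print("archive:", endurance_fraction_used(capacity, rating, 1))

    # Scratch use: write, delete, write, delete... 300 full rewrites.
    print("scratch:", endurance_fraction_used(capacity, rating, 300))
```

Under those assumptions a write-once archive drive uses well under 1% of its rated endurance, while the write/delete/write pattern is what eats through the rating.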
 

Offline rdl

  • Super Contributor
  • ***
  • Posts: 3665
  • Country: us
Re: how does google not run out of space
« Reply #19 on: April 26, 2017, 04:34:31 pm »
I doubt Google ever deletes anything.

However, would it actually require that much writing? Deleting is not usually the same as erasing. Isn't the data just marked as over-writable? I suppose that might depend on the system though.
 

Offline rob77

  • Super Contributor
  • ***
  • Posts: 2085
  • Country: sk
Re: how does google not run out of space
« Reply #20 on: April 26, 2017, 04:40:24 pm »
Quote from: rrinker
A common commercial SAN uses 3U disk trays that hold 15 drives. There are now 12TB enterprise class drives. That's 180TB for every 3U, and you get 12 disk trays plus the 4U storage controller in a rack (with a couple of U left for the switch). 2 PB per rack.

And there are other brands and models of storage systems that have a much greater density.

What do you mean by "common commercial SAN uses 3U ...."? ;)
SAN = storage area network. It consists of SAN switches/directors, storage arrays and hosts, all connected through Fibre Channel (SCSI commands encapsulated into the FC protocol, transported over fibre)....
The SAN directors we use have hundreds of 8Gbit/s FC ports, the storage arrays we use are a full cabinet/rack (or two) with hundreds of spindles (disks), and the hosts are connected through FC host bus adapters with at least 2 ports. A common commercial SAN has 2 fabrics (each fabric is a separate FC network) for redundancy, so you have at least 2 switches/directors, and both storage arrays and hosts are connected to both of them (half of the available FC ports per fabric)...
So a common commercial SAN is definitely not a 3U disk array.
 

Offline sokoloff

  • Super Contributor
  • ***
  • Posts: 1799
  • Country: us
Re: how does google not run out of space
« Reply #21 on: April 26, 2017, 04:46:48 pm »
Quote from: rrinker
A common commercial SAN uses 3U disk trays that hold 15 drives. There are now 12TB enterprise class drives. That's 180TB for every 3U, and you get 12 disk trays plus the 4U storage controller in a rack (with a couple of U left for the switch). 2 PB per rack.

And there are other brands and models of storage systems that have a much greater density.
Quote from: rob77
what you mean by common commercial SAN uses 3U .... ? ;)
SAN = storage area network it consists of SAN switches/directors , storage arrays, hosts and all is connected through fibre channel (scsi commands encapsulated into FC protocol transported over fiber)....
the SAN directors we use have hundreds of 8Gbit/s FC ports,  storage arrays we use are a full cabinet/rack (or two) with hundreds of spindles (disks) and the hosts are connected through FC host bus adapters with at least 2 ports. a common commercial SAN has 2  fabrics (each fabric is a separate FC network) for redundancy so you have at least 2 switches/directors and both storage arrays and hosts are connect to both of them (half of the available FC ports per fabric)...
so a common commercial SAN is definitely not a 3U disk array.
@rrinker specifically said uses 3U disk trays. IME, that's a very common (and perhaps the most common) form factor for the disk trays.

He was clearly trying to Fermi estimate the amount of storage per rack footprint.
 

Offline rob77

  • Super Contributor
  • ***
  • Posts: 2085
  • Country: sk
Re: how does google not run out of space
« Reply #22 on: April 26, 2017, 04:57:51 pm »
Quote from: rrinker
A common commercial SAN uses 3U disk trays that hold 15 drives. There are now 12TB enterprise class drives. That's 180TB for every 3U, and you get 12 disk trays plus the 4U storage controller in a rack (with a couple of U left for the switch). 2 PB per rack.

And there are other brands and models of storage systems that have a much greater density.
Quote from: rob77
what you mean by common commercial SAN uses 3U .... ? ;)
SAN = storage area network it consists of SAN switches/directors , storage arrays, hosts and all is connected through fibre channel (scsi commands encapsulated into FC protocol transported over fiber)....
the SAN directors we use have hundreds of 8Gbit/s FC ports,  storage arrays we use are a full cabinet/rack (or two) with hundreds of spindles (disks) and the hosts are connected through FC host bus adapters with at least 2 ports. a common commercial SAN has 2  fabrics (each fabric is a separate FC network) for redundancy so you have at least 2 switches/directors and both storage arrays and hosts are connect to both of them (half of the available FC ports per fabric)...
so a common commercial SAN is definitely not a 3U disk array.
Quote from: sokoloff
@rrinker specifically said uses 3U disk trays. IME, that's a very common (and perhaps the most common) form factor for the disk trays.

He was clearly trying to Fermi estimate the amount of storage per rack footprint.

My point is that a disk array is not a SAN; a SAN is much more complex than that..... So a SAN is not using disk trays. The storage array connected to a SAN is using disk trays with disks... and some of them might be using 3U trays..
 

Offline Kilrah

  • Supporter
  • ****
  • Posts: 1852
  • Country: ch
Re: how does google not run out of space
« Reply #23 on: April 26, 2017, 05:56:34 pm »
That's exactly what he said in the rest of the sentence, that the SAN would be the rack with multiple trays, controller and switches...

Quote from: rdl
Deleting is not usually the same as erasing. Isn't the data is just marked as over-writable?
Yes but the whole point of deleting is to make room to write something on there again... so it will eventually lead to a write cycle.
 

Offline bitseeker

  • Super Contributor
  • ***
  • Posts: 9057
  • Country: us
  • Lots of engineer-tweakable parts inside!
Re: how does google not run out of space
« Reply #24 on: April 26, 2017, 07:18:18 pm »
Quote from: CJay
I have it in my head that Google use commodity drives for their storage, the kind of thing you'd find in a desktop PC but that's based on a memory of a hardware reliability study they published a few years ago.

...

I think this is the study or a close relative:
http://matrix.lt/download/google-hdd-disk-failures-analysis.white_paper.pdf

Yes, that's the one that came to my mind as well, as I read the OP.
TEA is the way. | TEA Time channel
 

