EEVblog #395 – World’s Most Expensive Hard Drive Teardown
Posted on December 5th, 2012, 34 comments
What’s inside a $250,000 1980s-vintage IBM server hard drive used in banks?
1989 vintage Model 3390 mod2 1.89GB or 3.78GB
That was an awesome teardown.
I would love to see (and hear) that sucker spin.
Spin it up!
What a nice BLDC motor. Looks like there are even 3 sense wires. None of that modern sensorless control stuff.
I remember writing a driver for a similar drive in the late 70s. I took it home to do the work and remember thinking that I had about $30K of hardware in the back seat of my new car, which cost $4K.
I too had a Seagate STsomething that was 20MB in the 1980s.
And the reason the electronics are light in that drive teardown – IDE didn’t come about until about 1990.
RE tiny – remember the Microdrive? A spinning hard drive in a CF type 2 case, with discs the size of a coin.
Great teardown! It makes a lot of sense for banking to have hard drives with golden plates
More seriously, this reminds me of the guy who managed to recover a drive from a Cray 1 supercomputer.
He used a magnetic head mounted on a servomotor and an FPGA to oversample the data from the head. A few months later, another guy wrote a program to recover the track structure from the raw data and pulled the actual data out of the drive.
You could do the same thing if you can make the plates spin and have at least one working head…
/me thinks the grooves in the heads were aerodynamic “floating” systems – the incoming air would be compressed where the slot narrows, which would lift the heads up off the surface so there was zero friction or wear on the drives. However, this meant that the heads couldn’t be placed on the drive until the platters were up to speed. Hence the parking motor, which lifted the heads off the platters when the platters weren’t up to speed. I agree the spring is likely there to lift the heads up if the power fails.
It also looks like the platters and the platter spindle were part of an air pumping system, which continuously moved the air through those filters.
It was also critical that the air pressure inside the drive be constant, as that was a factor in how the heads floated. My uncle was a service tech before those days, and told an interesting story about a hard drive install at a high altitude site…
A few words on these drives.
To get throughput they have two independent LVDS drivers: the top 16 heads and the bottom 16 heads. This allows dual access to the stack.
Now, why 16 heads per block? The outermost head is a pure READ head. No writing is ever done with that head, apart from during manufacturing. The innermost head is a read/write head. They store bytes in parallel across the 4 platters, so the top surface only holds bit 0 of all data, the surface below only bit 1, and so on. The outermost head (closest to the hub) reads the sector number and looks for the servo wedge.
Here are the problems they were facing and how they solved them.
As you approach the hub of the drive, the circumference of a track becomes smaller. Given that a ‘bit’ needs to be a specific size, this means you can store fewer bits per track as you approach the hub, so ideally you want the outer tracks. (That’s why they have such a large ‘hub’: that area is useless anyway. Not so in modern drives.) This is problem 1 to solve.
To find a track you apply a voltage to the LVDS. (Dave is driving this thing with a function generator in the video.) In real life it is driven by a servo system that applies a sine wave at a frequency beyond what the mechanical inertia can follow and picks it up using a second coil. The amplitude feedback of the second coil is a measure of where you are: the slider pushes a core in and out, actually creating a variable transformer. So you can find tracks ‘coarsely’ very easily using only this servo, but you still need fine-tuning. More on that later. This is problem 2 to solve.
Problem 3 is finding the sector of the track, so you need a marker on the track that tells you where you are.
And finally, problem 4 is the exact timing, which needs to be in absolute sync with the rotation of the platters. Write too long and you destroy data in a subsequent sector. So if the drive speed were to alter just a bit: catastrophe.
So how do we tackle this? The inner tracks are pretty much useless, as they can hold very little information. What if we only store the sector numbers there? That is little data and can be spaced far apart; we actually have enough space to store both the track (actually cylinder) number and the sector number. The empty space between this data will be filled with a pilot tone (a sine wave).
I need to sidetrack here for a second. Today we talk about tracks and sectors. In the old days we talked about cylinders and sectors. A cylinder is the imaginary cylinder formed by the same track position across all 8 surfaces (remember, we are writing a byte in parallel).
Knowing this we can now solve all problems in 1 shot.
The ‘short’ inner cylinders are written with little blocks of data holding cylinder number and sector number interspersed with a pilot carrier.
Since the heads on the arms are mechanically linked, if the slider holding the heads found cylinder 9 using the front head, then the rear head is on cylinder 9 of the user data.
So now we can already find a cylinder and a sector. The sine wave pilot stored between this data is used in a PLL system to lock the motor. This guarantees constant speed, as we have perfect feedback from the platters. If the phase of the pilot goes out of whack with the motor, we know whether we are early or late, so we can pinpoint the exact start of a sector and keep the drive speed constant, with no risk of overrun.
To do the precision adjustment on the cylinder we use the amplitude of the returned pilot to seek maximum magnetic field strength. This puts us exactly in the middle of the cylinder.
In short a write operation happens as follows :
The controller sends the LVDS to where it thinks the cylinder is. It uses the front head to read the cylinder number and sector, verifies them, and corrects if necessary. Then the controller uses the front head to seek the maximum signal amplitude of the pilot, which positions it optimally on the cylinder center. The controller knows the last sector number that passed and knows how many sectors it needs to wait before it can write a block. In the meantime it phase-aligns the motor rotation using the PLL so the pilot phase lines up. When everything falls into place it writes the data using the REAR head. The front head is still reading the pilot and the sector/cylinder data. The front head signals ‘end of sector’ so the REAR head stops writing (or not, if the write spans multiple sectors).
During manufacturing the front head is used to write the cylinder/sector and pilot. Once the drive is built the front head becomes a read head only.
If you take the platter stack apart you jumble the bits and it’s game over: the drive is unrecoverable.
Now, this parallel writing stuff has another advantage. If you get a scratch on a surface, only bit n of a whole stream of bytes will be corrupt. This allows you to do some optimisation in the error correction mechanisms. At the end of a block of data some error correction information is written that allows you to recalculate the missing string.
Modern hard disks no longer use this mechanism. It is wasteful, and we can now dynamically alter the number of sectors per track. Data is interspersed with track/servo/sector information, and the servo (the pilot) is only 10 cycles of a sine wave now.
For people old enough: remember when LBA was introduced? The old drives talked about x cylinders and tracks. LBA (logical block addressing) translates this old numbering scheme to a scheme in which data is no longer stored in cylinders but in sequential tracks.
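For reference, the conventional arithmetic for turning a cylinder/head/sector address into a logical block address can be sketched as below. The geometry defaults (16 heads, 63 sectors per track) and the function names are illustrative assumptions, not figures from any drive discussed in this thread.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder=16, sectors_per_track=63):
    """Conventional CHS -> LBA mapping. Sectors are numbered from 1, heads and cylinders from 0."""
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder=16, sectors_per_track=63):
    """Inverse mapping: LBA -> (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    print(chs_to_lba(1, 0, 1))   # first sector of the second cylinder
    print(lba_to_chs(1008))
```

The host only ever sees the flat block number; how the drive maps that number onto its real platters is up to the drive.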
The servo data today is actually stored in a spiral pattern so that the drive can read it as it flies by while the seek operation is taking place.
Like Dave said: hard disks are rocket science. They are actually harder than rocket science. The amount of trickery is phenomenal. The head construction, shape and aerodynamics are extremely complicated. We artificially control the heat in the head to steer the flying height. We can flex the head to align it perpendicular to the track. We write bits vertically as opposed to one after another (the magnetic poles lie vertically in the platter as opposed to along the surface). This requires us to know what bit was written a given distance ago, as we need to close the loop across an ‘old bit’. And now we can slant the heads to tilt the tracks sideways, doubling the number of tracks while keeping the magnetic domain size constant. And to combat the magnetic hardness and the power required to flip a bit, we heat the surface using a laser prior to writing it; as it cools down, the magnetic energy is trapped. And this all needs to be done reliably, for a maximum of $20 in material and workmanship, and sold with a 5-year warranty. If you scale it up to a car, the car should cost no more than $1000 and work reliably for 5 years.
As for the head: that is a 747 at cruise speed flying 1 inch above the tarmac, for its entire life.
Oh dear, something not quite right there.
These IBM 3390 Head Disk Assemblies (HDAs) each contained 2 devices (or volumes). The 2 sets of heads are for the 2 separate devices. It’s like 2 separate HDDs in 1 unit.
For each device there are 15 data read/write heads and 1 servo head – which accounts for the 16 heads. Depending on the model there are 1113 or 2226 data cylinders (head positions across the platter) on each device which with 15 data heads gives 15 tracks per cylinder for a total of 16,695 or 33,390 tracks per device. Each data track stores 56,664 bytes of data.
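As a quick check, these figures multiply out to the 1.89GB and 3.78GB quoted for the two models. The sketch below assumes decimal gigabytes (10^9 bytes) and two devices per HDA as stated above; the function names are just for illustration.

```python
BYTES_PER_TRACK = 56_664      # data bytes per track, from the comment above
TRACKS_PER_CYLINDER = 15      # one track per data head
DEVICES_PER_HDA = 2           # two devices (volumes) per head disk assembly


def device_capacity_bytes(cylinders):
    """Raw data capacity of one device (volume)."""
    return cylinders * TRACKS_PER_CYLINDER * BYTES_PER_TRACK


def hda_capacity_bytes(cylinders):
    """Raw data capacity of a whole HDA holding two devices."""
    return DEVICES_PER_HDA * device_capacity_bytes(cylinders)


if __name__ == "__main__":
    for cylinders in (1113, 2226):
        per_device = device_capacity_bytes(cylinders)
        per_hda = hda_capacity_bytes(cylinders)
        print(f"{cylinders} cylinders: {per_device / 1e9:.3f} GB per device, "
              f"{per_hda / 1e9:.2f} GB per HDA")
```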
IBM mainframe systems that use this sort of HDD generally use variable-sized blocks on each track, so the suggestion that data is spread between heads is nonsense. While the bits of a byte could (in theory) be split across tracks if every track contained the same block structure, that is not done with these devices. I am not aware of any HDD that ever adopted that approach. Most HDDs since the IBM 3330, which appeared in 1970, have used error-correcting codes which can correct burst errors to deal with patches of the surface not reliably storing data.
Wonderful explanation, Vincent! Thanks so much. My hard drive experience started with Seagate ST4096s. I still have them! A whopping 80 MEG (NOT gig), 5 platters (I think), Modified Frequency Modulation (MFM) interface, linear actuator (like Dave’s monster) and pricey! I bought one at the swap meet for about $350 and that was a good price!
I even have the technical manual for them with schematics and detailed specs!
They are now wonderful bookends!
Could you please email me the schematic for the Seagate ST-4096 hard drive?
A transistor on my board is damaged and I cannot find its part name.
Oops, forgot one more thing. By having two heads per arm they also cut the seek time in half: the arm only needs to move half the distance. That’s another trick.
@Dave: take a good look at the rear heads. Those should have 4 wires coming out of them, the front heads only 2 (read only). The rear one is read/write, so two coils, unless this is a unicore head which does both with one coil.
Modern heads only use inductors to write. To read we use a magnetoresistive element, kind of like a Hall sensor: send a current one way, pick it off the other way, and the coupling changes depending on the externally applied magnetic field. These heads toggle fast. VERY fast, on the order of 4 to 5 GHz. The signal coming from a modern hard disk head is very high frequency.
No, when these drives were used with the large IBM mainframe servers, as the banks did, they did not split bytes across multiple platters.
The architecture of the operating system’s data access methods relied heavily on the ability of the controllers, sophisticated for the 1960s, to accept I/O programs from the CPU to search for particular data blocks. Each block had a prefixing data count and a key field to allow the hardware to assist in searching for the correct database block. Hence these 3390 and the earlier 3380, 3350 and 3330 disks were known as CKD or count-key-data devices. This would release the CPU for other work, and when the data had been transferred into memory an interrupt would be generated. There was an alternative FBA (fixed block architecture), but it also did not split bytes. At least not on the IBM 360, 370, MVS, MVS/XA, ESA, OS/390 and z/OS mainframe servers.
Another means to improve access time was to permit the data to be allocated in either blocks, tracks or cylinders. Each track could accept a number of data blocks of an arbitrary user-selected block size up to 32K. However, a track also had interblock gaps of about 600 bytes, so the optimum data capacity was for 2 blocks of about 27K on a 3390 disk. Small blocks would waste space in interblock gaps but use less memory for smaller data buffers. When this architecture was developed in the early 1960s memory was expensive, and there was a benefit to smaller blocks both in memory buffer savings and in shorter data transfer times.
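To get a feel for that trade-off, here is a toy model only: it assumes a flat 600-byte gap per block and the 56,664 usable bytes per track quoted earlier in the thread. IBM's real track-capacity calculation is more involved than this, so treat the numbers as rough.

```python
TRACK_BYTES = 56_664   # usable bytes per 3390 track, quoted earlier in the thread
GAP_BYTES = 600        # "about 600 bytes" per interblock gap (assumed flat cost per block)


def blocks_per_track(block_size, track_bytes=TRACK_BYTES, gap_bytes=GAP_BYTES):
    """How many whole blocks fit on a track if each block costs block_size + gap."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return track_bytes // (block_size + gap_bytes)


def utilisation(block_size):
    """Fraction of the track that ends up holding user data."""
    return blocks_per_track(block_size) * block_size / TRACK_BYTES


if __name__ == "__main__":
    for size in (27_000, 4_096, 800, 80):
        print(f"{size:>6}-byte blocks: {blocks_per_track(size):>3} per track, "
              f"{utilisation(size):.0%} of the track used for data")
```

Even in this crude model, two blocks of about 27K use most of the track, while very small blocks spend most of it on gaps.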
Also you could allocate datasets (files) in cylinders (still with a given blocksize), and this was done if you wanted to minimise head seeking by ensuring your data was clustered together and not fragmented as much across the surface by being interspersed with other users’ datasets. A cylinder was a set of 15 tracks accessible without needing to move the heads.
Datasets could be preallocated to reserve empty space if you further wished to avoid fragmentation as the primary extent was filled and a secondary extent was allocated wherever it could fit, not necessarily adjacent to the primary or other secondaries. If you didn’t allocate enough space or you didn’t want or couldn’t get sufficient secondary space your job failed. Hence users frequently allocated considerable additional space just in case. Space not available to other users.
User data can indeed be stored as sequential bits on a single surface and then simply hop from one surface to another within a single cylinder. It is the cylinder number/sector number that is split across the platters in a cylinder and read as parallel bytes. What you call interblock gaps are the run-out zones for writing. When writing, a drive is essentially blind: you know you are allowed to write x number of bytes and every byte takes up x linear inches. This is normally equal to a sector length. But if the drive speed is even slightly off, this length fluctuates, so they allow for a run-out zone.
The run-out zone is longer than it needs to be because, at the inner tracks, they use the same displacement (shorter due to being closer to the hub) to store the cylinder/sector data.
To visualise the layout of the platter :
Let’s look at the data track. Assume 4 sectors.
the head at the tip of the arm sees this
So, when the user data head (the rimward one) is reading or writing data, the control head (the hubward head) is reading the pilot so the servo system can keep everything in sync. When the user data head sits in ‘the gap’, the control head is actually reading a very short burst of bytes containing the sector and track number. These bytes are stacked vertically and written in parallel, so only a very short gap is needed to store the info. If you were to write it sequentially the gap would be longer, and you would not need 8 heads vertically; you could use a single head.
This is how the old disc packs worked. They had nine platters; one was the servo platter holding the synchro information. Disassemble a disc pack and put it back together and you can never retrieve your data.
That’s also how drives (multi-platter drives) are initialized today. They inject an extra head through the side of the drive chassis (look for a little metallic oval sticker on the side) and write a ‘clock’ track on the outer rim of the top platter. Once that track is verified they speed up the drive and sync the servos to this clock track (it is now being read continuously). The entire servo system locks and the servo wedges can now be written. This is a tedious procedure. Any hard disk that lands on your desk has sat in the ‘drive formatter’ for between 20 and 30 hours, simply being written with the pilots and servo wedges.
I believe that adding 20-30 hours to the drive manufacturing process before testing starts would make it impossible to produce a 2TB drive with an end-user price of about $100.
Where does this figure come from?
No, the rear heads are single coil, identical to the front heads.
Why would the front heads be read only? you’d be wasting half the disk area (ok, not half, but a lot)
Front heads are read only because they only read the sector data. The inner tracks would have too many “bits per angle” and thus would not be reliable. Basically the inner tracks are for low-density marker data and outer, longer tracks are for actual data.
Exactly. The size of a bit is fixed. As the track length changes you can store fewer bits hubwards than rimwards. The diameter of the track changes depending on where you are.
So the inner tracks are pretty much useless. There is even some waste at the outer tracks, as the bits there are written longer than they need to be. The correct bit length is attained only at the center of the writable area. That would be where the rear head is if the arm is fully extended (front head touching the hub).
These old drives have a fixed number of sectors per track and every sector holds the same number of bits. So they use the ‘short tracks’ to store data at a different speed. By tilting the data vertically and splitting the bits across the platters they can make the packet very short on the tracks and leave lots of room for the pilot tone to sync up.
Modern hard disks don’t do that anymore. The number of sectors per track changes. Each sector still holds the same number of bytes. Sectors are interspersed with the servo wedge. The servo wedge is a block of data only accessible by the drive and written during manufacturing. It holds the track/sector number and 3 to 10 periods of a pilot tone to do the sync.
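The geometry behind this can be sketched in a few lines. The bit length, sector size and radii below are made-up illustrative values, and the sketch ignores servo wedges, gaps and ECC overhead entirely; it only shows how a fixed bit size makes the sector count scale with track radius.

```python
import math


def sectors_per_track(radius_mm, bit_length_nm=50.0, bits_per_sector=4096 * 8):
    """Whole sectors that fit on a track of the given radius at a fixed bit length."""
    circumference_nm = 2 * math.pi * radius_mm * 1e6   # 1 mm = 1e6 nm
    bits_on_track = circumference_nm / bit_length_nm
    return int(bits_on_track // bits_per_sector)


if __name__ == "__main__":
    inner_mm, outer_mm = 20.0, 40.0                    # hypothetical usable radii
    fixed = sectors_per_track(inner_mm)                # old style: every track limited by the innermost one
    for radius in (inner_mm, 30.0, outer_mm):
        zoned = sectors_per_track(radius)
        print(f"r = {radius:4.1f} mm: fixed layout {fixed} sectors, variable layout {zoned} sectors")
```

With a fixed count per track, the outer tracks carry no more sectors than the innermost one, which is the waste described above.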
Your 1 terabyte hard disk actually is a 1.2 terabyte drive. There is enough room on the platters to store 1.2 terabytes, but we simply need that room to store all the control information so we can find your data! Note that this data is not accessible by the operating system. It is purely for the driveware. The driveware itself is also stored there, as well as room to replace bad sectors.
When a drive boots it spins up under the control of some firmware in ROM. Once at speed, the firmware performs a seek operation and loads the remainder of the drive firmware into RAM. These are the translation tables, the sector positions and the optimum data staggering algorithms. It also contains the location of bad sectors and the position of the run-out zones, as well as the optimized parameters for flying height, head tolerances and much more. All that stuff is deposited during manufacturing in 5 locations. The drive reads these blocks and when it finds 3 identical ones, those are considered valid. This is to ensure the drive still operates if a block gets damaged. Without that info the drive cannot function.
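The "5 copies, accept when 3 agree" idea can be sketched as a simple majority vote. This is only an illustration of the scheme as described in the comment: the function name and the byte strings are hypothetical, and nothing here is taken from actual drive firmware.

```python
from collections import Counter


def pick_valid_copy(copies, required=3):
    """Return the content shared by at least `required` copies, or None if no such agreement exists."""
    if not copies:
        return None
    content, count = Counter(copies).most_common(1)[0]
    return content if count >= required else None


if __name__ == "__main__":
    good = b"flying-height=12;zones=16"          # hypothetical parameter block
    copies = [good, b"\x00\x00garbage", good, good, b"flying-height=99"]
    print(pick_valid_copy(copies))               # three copies agree, so the block is accepted
    print(pick_valid_copy([good, good, b"a", b"b", b"c"]))   # only two agree, so nothing is trusted
```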
I’ll draw a diagram over the weekend of how data is laid out in this IBM as well as in a modern drive.
What I draw about the IBM is partially speculation, as I am not familiar with this particular drive. It is a general drawing and explanation of one technique that was used at one point in time. The IBM drive seems old enough and young enough to use what I describe, but I could be a few generations off. Like I said earlier, drive technology evolves fast and every 6 months there is something new. Drives are developed on a staggered timeline: they start a new design every 4 to 6 months, work on it for two years and then release it into the wild.
So that brand new top-of-the-line 4 terabyte drive is NOT what is possible today. It is what was possible two years ago. Today they are designing 8 and 10 terabyte drives, with the 5 and 6 terabyte models about 8 months away from introduction.
There has been so much evolution over the years, and techniques evolve VERY quickly in the drive business. We see new techniques every 5 to 6 months. Dual-stage actuators, for example: to make the tracks narrower, we can now tilt the head so it stays perpendicular to the track as opposed to skidding sideways.
I am old enough to have worked with drives of a similar design. What a fun time it was! And a good workout, too, to grab a pack of platters from the library (top shelf), drag it over to the drive (in a different room) and install it.
I don’t know if Dave actually reads these comments but: Dave, did you know your forum software does not send activation emails? I think it’s been a couple of times over the last few months that I tried to register and never got an activation email. Not in the junk/spam folder either, just never received it. Requested a re-send and that didn’t arrive either. Used the same name/email as in this comment, can you check what’s going on?
And imagine being able to take all that technology and squeeze it in a tiny hard drive that costs well under $100.
Anyhow, modern hard drive heads have two parts: one is a magnetic coil for writing data, the other is a magnetoresistive element, the GMR (Giant Magnetoresistive) head. Basically the read head uses quantum effects to cause huge changes in resistance as the magnetic field changes, which can then be picked up by the head amplifier.
Plus, advanced ECC, servo and timing electronics allow for such bit-stuffing techniques like no-id sectors (the sector has no ID prefixes anymore – it’s basically timed to find the sector).
Fascinating piece of mechanical wonder, really.
There’s some talk on fedoraforum.org that the US$240,000 cost is really for the entire HDA set and the cabinet they went inside. You just dismantled a single HDA, is that possible?
Dave, this is a modern HDD
The first one I saw at work was a fixed-head (or head-per-track) drive.
They came with 128, 256 or 512 heads with capacities of 256kb to 1Mb.
They came in a cabinet with a compressor that pushed the heads toward the disc when it was up to speed.
The disc diameter was about 20″ and the thickness was around 1/3″.
And with all the heads it was slower than any current 3.5″ disc drive.
When I saw the title of the video, I thought it might be a REALLY old ‘washing machine’ drive with removable disks.
I worked on these at Sperry Univac in the mid 70s. 17″ platters with around 60MB capacity. I don’t remember much about the technical specs but have less than fond memories of late nights replacing and aligning heads.
Where is that hard drive now? Shame to throw it away. I had a 40 MB hard drive once, had stepper motors in it. I still have the stepper motors today. Seriously, old stuff is cool. New stuff is just fodder.