They actually didn’t notice they weren’t running and didn’t check for errors. Someone just went into the cupboard in the office; took the tape out and put another one in once a day. What happened between two consecutive events wasn’t their problem.
I worked for a company once. Well, I say worked for, but I accidentally landed the chief technical monkey position because I needed to eat. Turned out they had been blindly cycling tapes for about 2 years. When I reviewed the steaming turd I was landed with, I noticed that the external VS160 DLT drive wasn't even connected to the server. The SCSI cable was hanging down the back of the rack. I think someone had moved stuff around and left it like that. Still, the tapes ejected and got inserted, so they had the illusion of a backup.
Sometimes drones are all you can get.
Quote: Sometimes drones are all you can get.
Then you'd better make sure you train them properly, or give them the right checklists. If they all dance their appointed monkey dance, it's no problem, but someone needs to oversee the bigger picture.
That's what you get when you treat or pay people like drones.
Quote: That's what you get when you treat or pay people like drones.
Nice soundbite, but I've had this when the 'drone' has been the owner of the business (and rich to boot). He wasn't dumb and wanted it to work, so I don't think this example (just one of many) fits your bon mot.
He wasn't treated or paid like a drone, so that seems to work out.
Hardly! I still have the religiously changed but still-blank tapes, which I'd class as not working out.
No, my point wasn't that this was an exception to the rule, but that perhaps the rule is arse about face. That is, the supposition is that treating someone like a drone gets you bad service, which, on the face of it, seems reasonable. But the ultimate drone has to be a computer, and that works out just fine (except when it doesn't). Typically, it fails to work because you've neglected to cover some situation in its programming, and maybe that's the problem with the drone thing. Make something explicitly part of the job and it's covered.
So maybe the fix is to treat your drones as drones, being explicit in what they should do, and realising that if you haven't 'programmed' a situation it won't be covered. A failure is then a programming issue (i.e. you've failed to think things through enough to let your drone know how to react to a bad situation). If you tell 'em to change a tape every day and they do that, but haven't checked to see if the backup has been done, that isn't their fault but yours for not making that check part of the job (see the sketch below).
Which is not to say or imply that a 'drone' here is a brainless moron.
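For what it's worth, the 'check part of the job' can literally be a script the drone runs before signing off the tape swap. A minimal sketch, assuming a hypothetical status file and success marker; adapt it to whatever your backup software actually writes:

```python
#!/usr/bin/env python3
"""Sketch: verify the backup actually ran before the tape swap is signed off.
The status file location and the 'SUCCESS' marker are assumptions, not the
format of any particular backup product."""
import sys
import time
from pathlib import Path

STATUS_FILE = Path("/var/log/backup/last_run.status")  # hypothetical location
MAX_AGE_HOURS = 26  # a daily job should have run within roughly the last day

def backup_ok() -> bool:
    if not STATUS_FILE.exists():
        print("FAIL: no status file - the backup job may never have run")
        return False
    age_hours = (time.time() - STATUS_FILE.stat().st_mtime) / 3600
    if age_hours > MAX_AGE_HOURS:
        print(f"FAIL: last status is {age_hours:.0f}h old - job did not run")
        return False
    if "SUCCESS" not in STATUS_FILE.read_text():
        print("FAIL: last run reported an error - read the log before swapping")
        return False
    print("OK: last backup completed - safe to rotate the tape")
    return True

if __name__ == "__main__":
    sys.exit(0 if backup_ok() else 1)
```

Exit code zero means swap the tape; anything else means escalate to whoever oversees the bigger picture.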
Quote: Then you'd better make sure you train them properly, or give them the right checklists. If they all dance their appointed monkey dance, it's no problem, but someone needs to oversee the bigger picture.
I'm sure I addressed the rest of your comment a few posts back.
I just had an external drive die suddenly. I am now copying it back from a backup drive.
How often do you test your backups?
I don't test the individual PC backups as there is no need; nothing of great importance could be lost, because other backups provide redundancy.
The NAS unit I am perfectly happy with, as far as ANY (relatively) large RAID 5 system goes. The RAID is scrubbed every week. Recovering a large RAID 5 array from a single drive failure is always nail-biting, no matter what machine is used. The main RAID is backed up at a folder/file level, NOT as an image. The PCs' data is backed up to the NAS at a folder/file level (which is why the user directories are on a separate drive). The only 'sector image backups' are of the individual PCs to the GFS SD units, and again, I can safely recover those from other areas. The GFS system means there are always three independent image copies with little date difference. (And again, this only applies to the PCs/laptops.)
All the data is therefore stored on a file-by-file basis with multiple backups. There is simply no need to test; it is utterly pointless. Image backups are a different animal, which is why I limit them to PCs (and VMs, actually).
All external drives are checked occasionally for errors using the manufacturers' applications (non-destructive).
There is simply no need to 'test' anything. Relying on RAID, even with drive-failure 'fault tolerance', is very, very bad karma. RAID is not failsafe; it MUST be backed up. There is a very real chance of RAID not recovering effectively, and this is directly related to the amount of data our drives can store. I am 99.9999999% happy with the reliability and robustness of my systems. Delirious, in fact. If it were a commercial enterprise I would 'consider' using a real-time off-site link for storage/backup; however, it isn't, and I don't. Off-site storage 'by hand' is usually quite sufficient even for a moderately sized commercial enterprise, depending of course on the format of the data and the number of copies stored; a GFS (grandfather-father-son) method should be used in that case, as sketched below.
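For anyone unfamiliar with GFS, the rotation rule is simple enough to sketch. This assumes one illustrative calendar rule (monthly 'grandfather' on the 1st, weekly 'father' on Sundays, daily 'son' otherwise); real schemes vary, and the label names are made up:

```python
"""Sketch of grandfather-father-son (GFS) media selection under an assumed
calendar rule. Not any particular product's scheme."""
import datetime

def gfs_label(day: datetime.date) -> str:
    if day.day == 1:                       # first of the month -> grandfather
        return f"grandfather-{day:%Y-%m}"
    if day.weekday() == 6:                 # Sunday -> father, one per week
        return f"father-week{day.isocalendar()[1]}"
    return f"son-{day:%a}".lower()         # weekdays reuse a small daily pool

if __name__ == "__main__":
    print(f"Use media: {gfs_label(datetime.date.today())}")
```

The dailies get recycled quickly; the fathers and grandfathers survive long enough to give you the longer retention.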
I have a Synology appliance set up for RAID 1 that I back up to. I use Macrium Reflect for backups. I switched from Acronis (the business one, not the personal one) and dropped it because it was an absolute piece of junk that gave me headaches all the time. Macrium has been headache-free.
I also have a SATA dock and I occasionally take an image on a drive and keep it as an offsite backup.
I used to have a backup plan that did a full backup every 2 weeks, differentials every 2 days, and incrementals every couple of hours. Now I just do a full backup every week or so. At some point I'll kick off a differential every couple of days, but I haven't gotten around to it on my new system yet.
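If you're curious what that tiered plan looks like as logic, here's a rough sketch. The anchor date is hypothetical, and tools like Macrium Reflect schedule this internally; this just shows how the tiers nest:

```python
"""Sketch of a tiered schedule: full every two weeks, differential every two
days, incrementals in between. ANCHOR is an assumed date of some past full."""
import datetime

ANCHOR = datetime.date(2024, 1, 1)  # hypothetical date of a past full backup

def backup_type(day: datetime.date) -> str:
    days = (day - ANCHOR).days
    if days % 14 == 0:
        return "full"          # complete image, baseline for everything else
    if days % 2 == 0:
        return "differential"  # everything changed since the last full
    return "incremental"       # everything changed since the last backup

if __name__ == "__main__":
    print(backup_type(datetime.date.today()))
```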
Having many copies of the original copy doesn't protect you from a lot of failure modes. You really do need to test whether you can recover relevant files from your backups on a regular basis. If you don't test, you don't know what you have. Your stacks of copies could be full of perfect files full of perfect garbage.
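A spot check doesn't have to be elaborate. A minimal sketch, assuming the backup has been restored to a scratch directory; both paths are placeholders, and files that have legitimately changed since the backup ran will of course mismatch:

```python
"""Sketch of a restore spot check: pull a random sample of files out of a
restored backup and compare SHA-256 hashes against the live originals."""
import hashlib
import random
from pathlib import Path

LIVE = Path("/data")             # hypothetical live data root
RESTORED = Path("/mnt/restore")  # hypothetical restore target
SAMPLE = 20

def sha256(p: Path) -> str:
    h = hashlib.sha256()
    with p.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def spot_check() -> None:
    files = [p for p in LIVE.rglob("*") if p.is_file()]
    for src in random.sample(files, min(SAMPLE, len(files))):
        copy = RESTORED / src.relative_to(LIVE)
        if not copy.exists():
            print(f"MISSING in backup: {src}")
        elif sha256(src) != sha256(copy):
            # note: files modified since the backup ran will mismatch here
            print(f"MISMATCH: {src}")
    print("Spot check complete.")

if __name__ == "__main__":
    spot_check()
```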
Not at all.
It is not the gun, it's the gunner...
Static files are subject to hashing when written; as they are static by nature, any corruption will be flagged by a hash mismatch. A total non-issue.
Dynamic files are also subject to hashing, but the hash is of no use as a direct integrity check; the parent application should maintain that integrity check. Self-corruption is therefore a non-issue, provided multiple date/state copies exist. A total non-issue.
Dynamic files and the issue of 'user' corruption, invalid or incorrect data entry or deletion, or 'insert other PBCAK here', are managed by standard dynamic file methodology and also a far, FAR longer TBO regime (time before overwrite, if you are unfamiliar).
Dynamic files that are not self-managed for data integrity are also subject to extended TBO. Depending on needs, pockets, and regulations, TBO can be 5 years or more (7 in a lot of cases in the UK).
Again, with my system and my regime I am 99.999999% happy and secure. It is a total NON-issue. Simply unnecessary.
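For the record, hashing static files when written amounts to something like this. A minimal sketch with assumed paths and an assumed manifest file; build the manifest once when the files land, then re-verify on whatever schedule suits:

```python
"""Sketch of a hash manifest for static files: build once, verify later so
silent corruption gets flagged. Root and manifest paths are assumptions."""
import hashlib
import json
import sys
from pathlib import Path

ROOT = Path("/archive")                  # hypothetical static-file root
MANIFEST = Path("/archive/.hashes.json") # hypothetical manifest location

def sha256(p: Path) -> str:
    h = hashlib.sha256()
    with p.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build() -> None:
    manifest = {str(p.relative_to(ROOT)): sha256(p)
                for p in ROOT.rglob("*") if p.is_file() and p != MANIFEST}
    MANIFEST.write_text(json.dumps(manifest, indent=2))

def verify() -> bool:
    ok = True
    for rel, digest in json.loads(MANIFEST.read_text()).items():
        p = ROOT / rel
        if not p.exists() or sha256(p) != digest:
            print(f"CORRUPT or missing: {rel}")
            ok = False
    return ok

if __name__ == "__main__":
    if "--build" in sys.argv:
        build()
    else:
        sys.exit(0 if verify() else 1)
```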
What happens when your RAM goes bad and slowly corrupts data over time, only showing itself once the errors build up to critical mass? All the garbage is copied perfectly down the line, again and again, and your last clean backups will be months or even years old. Not testing anything is setting yourself up for failure. You're not the first and won't be the last. In the end, there has to be a monkey checking to see if the recovery process output lines up with the input.
Besides, anyone not nervous about his backups is complacent and will fall eventually.
Ahh, the NHS, they pay *really* well when they lose critical patient data because their backups failed to restore sensible data.
I think I went to Cornwall for two weeks in 5 star on one of those or was that the private cosmetic surgery place, I forget...