This server houses our small business database, so I don’t want to do anything that might corrupt the data on it. The other IT staff member (a programmer) backs up the files to external hard drives. I’m new to the IT field, so I could use some advice. I did not set up the server, and of course it is a few months out of warranty at this point.
The LED display panel on the front of the server shows this error – “E1810 HDD ## Fault”. I have no way to manage this server, since everything on it was set up as virtual machines under VMware ESXi 4. Right now the programmer doesn’t want to install Dell OpenManage, because the server would first need updated firmware, etc. When I look at the physical server itself, I can clearly see which hard drive has the problem, because its lights are blinking amber instead of green. This is the most detailed information I can get from the VMware console:
Alert! >>Storage - Drive 5 in enclosure 32 on controller 0 Fw:HS09 – UNCONFIGURED BAD Warning! RAID 5 Logical Volume 0 on controller 0, Drives (0e32,1e32,2e32,3e32,4e32,?) DEGRADED Warning!
Disk Drive Bay 1 Drive 5: Drive Fault – Assert Alert!
Disk Drive Bay 1 Drive 4: In Critical Array – Assert Alert!
Disk Drive Bay 1 Drive 3: In Critical Array – Assert Alert!
Disk Drive Bay 1 Drive 2: In Critical Array – Assert Alert!
Disk Drive Bay 1 Drive 1: In Critical Array – Assert Alert!
Disk Drive Bay 1 Drive 0: In Critical Array – Assert Alert!
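In case it helps anyone reading along, here is a minimal Python sketch of how the per-drive alert lines above could be tallied instead of read by eye. It assumes the text is pasted exactly as the console shows it; the names `summarize_alerts`, `faulted_drives`, and the regular expression are my own illustration, not anything VMware or Dell provides.

```python
import re

# Per-drive alert lines copied verbatim from the console output above.
ALERTS = """\
Disk Drive Bay 1 Drive 5: Drive Fault – Assert Alert!
Disk Drive Bay 1 Drive 4: In Critical Array – Assert Alert!
Disk Drive Bay 1 Drive 3: In Critical Array – Assert Alert!
Disk Drive Bay 1 Drive 2: In Critical Array – Assert Alert!
Disk Drive Bay 1 Drive 1: In Critical Array – Assert Alert!
Disk Drive Bay 1 Drive 0: In Critical Array – Assert Alert!
"""

# Assumed line shape: "Disk Drive Bay <bay> Drive <n>: <state> – Assert"
# (accepts either a plain hyphen or the en dash shown in the console text).
LINE_RE = re.compile(r"Disk Drive Bay (\d+) Drive (\d+): (.+?) [-–] Assert")


def summarize_alerts(text):
    """Return {drive_number: state} for every per-drive alert found in text."""
    return {int(m.group(2)): m.group(3) for m in LINE_RE.finditer(text)}


def faulted_drives(summary):
    """Return the sorted drive numbers whose state is 'Drive Fault'."""
    return sorted(d for d, state in summary.items() if state == "Drive Fault")


if __name__ == "__main__":
    summary = summarize_alerts(ALERTS)
    print("Faulted drives:", faulted_drives(summary))
    print("Drives still in the critical array:",
          sorted(d for d, s in summary.items() if s == "In Critical Array"))
```

Read this way, the alerts say one drive (5) has faulted and the other five are members of an array that is running without it, which matches the "DEGRADED" RAID 5 message.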
Since I’m clueless as to what I’m dealing with here, can anyone answer the following questions for me?
I assume this is a RAID 5 configuration, correct?
I obtained this info from the system configuration page on dell.com:
1) Backplane, Dell2950, 3.5X6SAS
2) Controller DRAC5, PE, Mid-Life Kicker
3) FG0271Card, Backplane, Key, TOE, 2 PORT Enterprise Systems Group
4) MC3601Assembly, Cable SASX4-PERC5X4-X6BKPLN
5) Controller PERC6IINT, Serial Attached ScsiShort Lead
1) What exactly is a backplane? I could look this up on Wikipedia, but I can probably get a better working explanation from a true network person! I assume SAS stands for Serial Attached SCSI, since the hard drive description says SCSI.
2) Is DRAC5 the brand of the controller? If not, what does DRAC5 mean? Mid-Life Kicker??
3) TOE?
4) Not sure what this is saying…
5) What is a PERC6IINT controller, and why does item 4 show PERC5?
I have been reading the Dell troubleshooting documentation and it has some terms I don’t know how to address. Based on the above information, this server has an SAS RAID controller, correct? How do I know if I have an SAS RAID controller daughter card (that term is in the troubleshooting steps)?
Finally, we purchased a Dell replacement hard drive. It’s hot swappable, and we were just going to exchange it with the failed drive. Will I need to configure anything on the server or drive, or is it as simple as just swapping out the bad drive with the new one? Will it rebuild without any intervention on my part? If there seem to be problems, it looks like I will at least be able to monitor something by going into the RAID BIOS, correct?
With my lack of server knowledge & experience, what would you suggest? Should we just go for it and try hot swapping the drive after hours? Or do you think we should “bite the bullet” on this one and pay Dell for a service call to do this? I may be overly cautious, but I don’t want to be responsible for taking down a critical server when I’m still new at this. Any input would be most appreciated!