
April 27, 2007

Dead raid. Again.

I think this was totally my fault. Last time was too, but last time it was an even stupider mistake...

Details in the extended entry.

I'm not really in the mood to purposely make this entertaining, so I'll just give the play-by-play. If you think it's funny, keep it to yourself. *smirk* If you think it's entertaining, THAT you can share.

So, my server has been crashing recently. I THINK it had said "disk errors on /dev/hda" but now I'm not even sure about that, it might have been /dev/sda.

/dev/hd* are the raid. /dev/sda is the 80gig disk which has / and /home partitions (and swap).

The server was running Mandrakelinux release 10.2 (Limited Edition 2005) for i586.

Booting the server has taken something insane like 10-15 minutes since I built the thing. I'm not sure why, but I'm starting to believe it has something to do with /dev/hda. :-) It pauses at LIL and then again at the end of the kernel load process...

So. Machine has problems, right? So I knew I had to upgrade the server because 10.2 isn't supported anymore. I did some "research" (looked around the public areas at work to see which was the most current distro I'd heard of that was easily available) and came up with Fedora Core 6. Burned a copy and brought it home.

Opened up the server and noticed "Hrm. There's a drive here that isn't being used. I remember now, that's the spare disk that I didn't have power or an IDE channel for when I built the damn thing. Why don't I just pop it in in place of /dev/hda which has problems, and let the raid rebuild." Makes sense...

Except that PCs find their operating system on the "Master Boot Record" of /dev/hda. OOPS.

So I say "No big deal", back up the root partition onto the home partition, boot up FC6 and have it reformat the root partition. YES, I was safe, the home partition was (and is) still fine.....

HOWEVER. Because it's never that easy. Rebuilding the RAID didn't seem to work quite right. Possibly (probably) because the way Fedora Core 6 treats the drives in a RAID 5 is not the same as how Mandrakelinux did... Where Mandrake had them in some random order, they are now in 'alphabetical' order. I don't know if it matters, but it isn't happy-ing.


At some point, the disks went in this order: /dev/hdg1 /dev/hde1 /dev/hdi1 /dev/hdk1 /dev/hda1 /dev/hdspare1

and now they're in /dev/hda1 /dev/hde1 /dev/hdg1 /dev/hdi1 /dev/hdk1


So I have two thoughts/options... One, which is the current plan, is to wait until the 1TB external drive I "just happened" to have ordered LAST FRIDAY, and which is due Monday, shows up; use that to back up the /dev/md0 array (anyone know how long it takes to back up a terabyte over USB on Linux?); and then run reiserfsck --rebuild-tree and keep my fingers crossed...
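
A rough back-of-the-envelope sketch for the "how long does a terabyte over USB take" question. The throughput figures are assumptions, not measurements from this server: USB 2.0 tends to sustain something in the neighborhood of 20-30 MB/s in practice, and the function name `backup_hours` is made up for this example.

```python
# Rough estimate of how long copying ~1 TB over USB might take.
# All throughput figures below are assumptions, not measurements.

def backup_hours(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Return estimated copy time in hours at a constant sustained rate."""
    return size_bytes / throughput_bytes_per_s / 3600

TERABYTE = 10**12  # decimal terabyte, the way drive vendors count it
MB = 10**6

# Assumed sustained rates for USB 2.0; an older machine may do worse.
for rate_mb in (20, 30):
    hours = backup_hours(TERABYTE, rate_mb * MB)
    print(f"{rate_mb} MB/s -> about {hours:.1f} hours")
```

Under those assumptions the copy lands somewhere between roughly 9 and 14 hours, so "start it and go to bed" is probably the right plan. Read errors on a flaky disk would stretch that further.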

The other - which might be easier - would be to figure out which disk is which, and reshuffle them physically until the current disk 0 is the old disk 0, etc. etc. etc. down the line. The only major problem with that is that I'll have to reinstall lilo again. No huge big deal.
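
The reshuffle idea can be sketched as a slot-by-slot comparison of the two orders listed earlier. This is only an illustration under a big assumption: that each device name still points at the same physical disk it did under Mandrake. That is known to be false for hda, since the spare was swapped in there. The "hdspare1" entry is left out because it is a placeholder name for the spare, which is assumed not to have been one of the five active members. The helper name `slot_moves` is hypothetical.

```python
# Orders copied from the post. "hdspare1" is the post's placeholder for the
# spare disk and is omitted here (assumption: it was not an active member).
old_order = ["hdg1", "hde1", "hdi1", "hdk1", "hda1"]
new_order = ["hda1", "hde1", "hdg1", "hdi1", "hdk1"]

def slot_moves(old, new):
    """Return (device, old_slot, new_slot) for every device in the old order."""
    new_slot = {dev: i for i, dev in enumerate(new)}
    return [(dev, i, new_slot[dev]) for i, dev in enumerate(old)]

for dev, was, now in slot_moves(old_order, new_order):
    note = "unchanged" if was == now else "moved"
    print(f"{dev}: slot {was} -> slot {now} ({note})")
```

Only hde1 keeps its slot; the other four all shift. Before physically moving anything, the md superblock on each disk (for example via mdadm --examine) would be a more trustworthy record of the original roles than device names.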

Here's a sad sad excerpt from dmesg.

RAID5 conf printout:
--- rd:5 wd:5 fd:0
disk 0, o:1, dev:hda1
disk 1, o:1, dev:hde1
disk 2, o:1, dev:hdg1
disk 3, o:1, dev:hdi1
disk 4, o:1, dev:hdk1
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda6, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
ReiserFS: md0: found reiserfs format "3.6" with standard journal
ReiserFS: md0: using ordered data mode
ReiserFS: md0: journal params: device md0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: md0: checking transaction log (md0)
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one -1
ReiserFS: md0: warning: vs-5150: search_by_key: invalid format found in block 0. Fsck?
ReiserFS: md0: warning: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1 2 0x0 SD]
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one -1
ReiserFS: md0: warning: vs-5150: search_by_key: invalid format found in block 0. Fsck?
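
For anyone squinting at the RAID5 conf printout above, here is a small sketch that pulls the slot assignments out of it. The field meanings are my reading of the md driver's output and should be treated as an assumption: rd as configured disks, wd as working disks, fd as failed disks, and o:1 as "operational". The function name `parse_conf` is made up for this example.

```python
import re

# Lines copied from the dmesg excerpt above.
DMESG = """\
RAID5 conf printout:
--- rd:5 wd:5 fd:0
disk 0, o:1, dev:hda1
disk 1, o:1, dev:hde1
disk 2, o:1, dev:hdg1
disk 3, o:1, dev:hdi1
disk 4, o:1, dev:hdk1
"""

SUMMARY = re.compile(r"rd:(\d+) wd:(\d+) fd:(\d+)")
DISK = re.compile(r"disk (\d+), o:(\d+), dev:(\S+)")

def parse_conf(text):
    """Return ((rd, wd, fd), [(slot, operational, device), ...])."""
    summary = None
    disks = []
    for line in text.splitlines():
        m = SUMMARY.search(line)
        if m:
            summary = tuple(int(x) for x in m.groups())
            continue
        m = DISK.search(line)
        if m:
            disks.append((int(m.group(1)), m.group(2) == "1", m.group(3)))
    return summary, disks

summary, disks = parse_conf(DMESG)
print("rd, wd, fd:", summary)
for slot, ok, dev in disks:
    print(f"slot {slot}: {dev} ({'operational' if ok else 'NOT operational'})")
```

If that reading is right, the kernel considers all five members healthy, which would fit the theory that the array assembled cleanly but with the disks in the wrong slots, leaving ReiserFS to read garbage.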

And now for the other entertaining part. At some point I said "Maybe the problem is the failed disk at the same time as upgrading the operating system. Let me put back the old OS from the backup and see if that helps." And it worked. *phew* Linux rocks. I'm still not happy with it, entirely, and won't be until I get my damn RAID file system back. But to the extent that I care, it's pretty cool how much HAS worked.


Seth suggests converting to an OSX (mac mini) server. OK sure, fine. But Seth, please explain to me how I can put 5x250gig disks into a Mac Mini? :-P I have like 8-10 of these disks, between the server I have and "spares" I bought when Staples had a huge, huge discount on them (I think I paid $80-90 each) a year or three ago... So tossing them in the closet and buying something expensive isn't a good option. If I could find a chassis that I could toss 5+ IDE disks into and have it do RAID5, THAT I'd spend $200-300 on. But I doubt that exists at my price point.

OK I'm off to bed. I hope someone was entertained by this. Or has some useful suggestions for me.....

Posted by aland at 11:18 PM | Comments (2) | TrackBack

April 23, 2007

Say it isn't so!

According to http://forum.newsarama.com/showthread.php?t=109544, Marvel is working on Spider-Man the Musical.

Be afraid. Be very afraid.

Not that I won't go see it. And enjoy it, if it's even remotely possible....

Posted by aland at 11:56 AM | Comments (0) | TrackBack

April 2, 2007

Social Anxiety

Call Me Fishmeal.: TED2007, Days 1 and 2

His last paragraph struck me... I don't feel it the way he does but I've had a few conversations about social anxiety and the like over the past few weeks and I thought this was worth keeping around:

Remember this, if you ever meet me. I'm afraid of you. I'm afraid I won't live up to your expectations, I'm afraid you'll think I'm ugly, I'm afraid I'll look like a nerd or do something inappropriate and you'll disapprove of me. If I act aloof it's because I don't have the inner strength to risk being rejected that day, not because I don't care about you as a human being.

Interesting, huh?

Posted by aland at 10:21 AM | Comments (0) | TrackBack