I’ve worked with Martin for almost 8 years now (gee, I’m not that old), and when we shared office space and worked on projects together, he’d occasionally slip into frustration with a growl that went something like “Arggh!!”, just like a pirate. So anyway, for lack of other material to post, and in honor of his style of getting things done under frustration at times, I ran his narrative blog post this week through the pirate talk translator: http://www.syddware.com/cgi-bin/pirate.pl
Read on to see how it sounds on paper… I can’t help but laugh 😀
Clist Sail Tide
So thar I be… cold sweat runnin’ down me port cheek, tinglin’ sensation in me lips, clammy hands, an’ freezin’ fingertips. A dim screen in fore o’ me displayed details o’ th’ ser’er startup procedures. Thar be many green “ARRR” messages scrollin’ up almost too smartly t’ read as th’ programs started up. Suddenly a red “FAILED” slid past, then another, an’ another, an’ then th’ words that would make any sysadmin forget about all th’ sand in his shoes… kernel panic.
Ser’er maintenance needs t’ be an integral part o’ any system administrator’s life. Takin’ down th’ services fer as brief a time as possible t’ apply fixes, updates, changes, an’ miscellaneous tasks. I do this much less often than I ortin’ ta, but soon … that’s goin’ t’ change.
I’m Martin Lehner an’ I do systems programmin’ an’ system administration fer th’ Center fer Teachin’ an’ Learnin’ at MCC. Ye might be havin’ seen me visage on obscure web pages or bulletin boards along side such fluff as th’ marshmallow ads or classified posts fer mid 80s computer hardware, but then again, maybe nay. I primarily write java code an’ try me hardest t’ keep th’ online services in th’ CTL up an’ runnin’ so nobody knows I exist. If swabbies know who I be, then somethin’ be probably horribly wrong.
Nay, really ;]
Fridee March 2nd
Paul Hickey (me mostly-windows sysadmin buddy) an’ I be settin’ up t’ begin our maintenance on th’ CTL servers. I grabbed a pad o’ paper an’ a pen t’ write down fun scribbles an’ tookst note o’ th’ tasks ahead o’ me fer th’ next (hopefully pleasant) six hours or so…
- patches fer th’ production ser’er (apps.mc.maricopa.edu, AKA: ctl.mc, keeptoo.mc, an’ dltutorials.mc, Master War chief, Raider o’ th’ Se’en Dimensions, etc…).
- update secure shell (ssh) an’ secure sockets layer (ssl) on ALL *nix (linux/unix/solaris) machines.
- rewire th’ power cables t’ balance th’ servers between two aftup batteries.
- rewire all th’ network cables fer clister organization (pictures later)
- rewire kvm cables fer clister organization.
- apply patches t’ all windows servers.
- test as much as possible after maintenance t’ make sure things work.
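A minimal sketch o’ that last task in Python, assumin’ “works” jus’ means a service accepts a TCP connection. Th’ names `check_port` an’ `check_all` be hypothetical helpers made up fer illustration, nay th’ tools actually used on th’ CTL servers:

```python
import socket


def check_port(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_all(services, timeout=3.0):
    """Map each (host, port) pair to True (reachable) or False."""
    return {
        (host, port): check_port(host, port, timeout)
        for host, port in services
    }
```

Callin’ `check_all([("apps.mc.maricopa.edu", 22)])` would report whether ssh answers on apps; th’ port number be an assumption. An open port only proves somethin’ be listenin’, nay that th’ application behind ‘t be healthy.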
I spent th’ first hour or two installin’ patches on apps an’ upgradin’ ssh/ssl on th’ *nix machines. Apps be th’ toughest since I be havin’ t’ be extremely careful wi’ ‘t. Changin’ wee things can affect a lot. First I ran a program which told me which updates be available. Then I went through an’ applied each one, tryin’ t’ make sure they don’t conflict wi’ anythin’. After each set I be havin’ t’ check configuration files which may be havin’ changed t’ be sure nothin’ be lost durin’ updates. That tookst about an hour, wi’ some dead time in thar fer lettin’ th’ packages compile. Ye be seein’… apps runs a distribution o’ linux called gentoo. It’s extremely configurable an’ fast, but that’s sometimes a wannion in that ‘t takes much more careful plannin’ t’ keep ‘t stable. When ‘t works tho, it’s a champion o’ speed an’ stability like ye wouldn’t b’lieve. Upgradin’ ssh an’ ssl be quick on all machines ‘ceptin’ prana (streamin’ media), which required me t’ upgrade some older libraries first. Installin’ ssh/ssl means downloadin’ th’ packages, configurin’ them fer me OS an’ hardware, compilin’ them (like compilin’ a computer program on any computer), an’ installin’ them via install scripts. At this point everythin’ looked good, so I sailed on t’ th’ next thin’.
Lost track o’ time by that point. I reckon somethin’ about cables? Oh aye, shut down everythin’ (nay printin’, nay file shares, nay website, nay in-house tools, nay machine logins, swabbies be pretty much ou’ o’ luck on accessin’ anythin’ interestin’ that we host on our servers) which tookst twenty minutes or so by itself. Paul had applied th’ windows ser’er patches before we did this. Next we unplugged ALL th’ cables from ALL th’ servers an’ untangled them all (be seein’ th’ picture below o’ th’ mess).
After unpluggin’ everythin’ an’ untanglin’ we started by carefully redoin’ th’ power cable connections. Thar be some debate among us at first about whether t’ account fer rack slidin’ but we decided against ‘t. Rack mount servers be attached t’ th’ rack via rails which allow th’ ser’er t’ be pulled ou’ t’ be serviced without removin’ ‘t completely from th’ rack. This requires some extra plannin’ on cables tho on accoun’ o’ th’ rack needs slack on th’ cables attached. We decided that since all repairs be done in our office wi’ th’ rack removed that ‘t be more important t’ stick t’ clist cables. After power cables be kvm cables an’ then network cables. That tookst us t’ nearly 6pm, an’ apart from me knees hurtin’ I think ‘t looked much better. We decided also t’ completely replace th’ network cables next tides as they be WAAAAAY too long fer bein’ used on a rack mount like this. We ordered some new cable t’ make wee 2 or 3ft network cables. That ortin’ ta improve th’ setup on network cables in th’ room significantly.
I began turnin’ servers aft on. This be th’ moment o’ truth fer me. Me hands get clammy an’ I cross me fingers that we don’t be havin’ any hardware failures or bizarre OS failures when th’ servers start comin’ up. As Paul finished th’ network cables I slowly turned th’ machines aft on. We did be havin’ a wee ‘ere th’ network cable be plugged into th’ wrong port (totally me fault), but we smartly fixed them. Tally at this point:
- Me: 4 Horrible-failures: 0
Then, as I be watchin’ apps start up jboss (th’ application ser’er which runs our registration system, softsense, an’ some other java tools), somethin’ caught me eye… ‘t couldn’t resolve “apps.mc.maricopa.edu”. I listed in o’er th’ console on th’ rack an’ rested me head against its cold metal frame t’ let ou’ a long sigh. Here we go…
Luckily this one be a quick fix, th’ hosts file had some settings from aft when I set th’ ser’er up that hadn’t been completely removed. After fixin’ this I got ‘t workin’. Th’ hosts file identifies what name th’ machine has, so in this case, fer some reason th’ file had “sorcerer” in ‘t instead o’ “apps”.
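Fer th’ curious, a hosts file be plain text: each line holds an IP address followed by one or more names, wi’ `#` startin’ a comment. Here be a rough Python sketch o’ how a leftover name like “sorcerer” could be spotted. Th’ function names an’ th’ sample address `10.0.0.5` be made up fer illustration; th’ real file’s contents be nay shown in this post:

```python
# Sample hosts-format text. The 10.0.0.5 address is invented for this sketch.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
10.0.0.5    sorcerer   # left over from the original setup
"""


def names_for_address(hosts_text, address):
    """Return every name that hosts-format text assigns to one IP address."""
    names = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue  # skip blank and comment-only lines
        fields = line.split()
        if fields[0] == address:
            names.extend(fields[1:])
    return names


def has_stale_name(hosts_text, address, expected):
    """True if the address has entries but none of them is the expected name."""
    names = names_for_address(hosts_text, address)
    return bool(names) and expected not in names
```

Wi’ th’ sample above, `has_stale_name(SAMPLE_HOSTS, "10.0.0.5", "apps")` flags th’ entry, since th’ only name on that line be “sorcerer”.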
Mondee, March 5th
Ser’er problems don’t tend t’ expel the’r cold viscera until ye least expect ‘t. This be th’ case today. A wee problems be noticed…
- registration nay working
- drake file shares nay accessible
- breeze failed t’ start
- ctl blogs failed t’ work
- keeptool failed t’ work
Jboss figured ‘t would be fun t’ shut down some time after I port on th’ 2nd. Don’t know why, nay errors in th’ log or anythin’. Fluke? Turned on now an’ ‘t works fine. Th’ drake file shares be totally me fault again, forgot t’ put th’ windows file sharin’ service in automatic startup. Don’t know what’s goin’ on wi’ breeze, I don’t directly administer that machine so I can’t comment on ‘t much. Ask Jeff Anderson if ye need t’ gripe about that one ;p. Blogs be a mystery t’ us so we jus’ reinstalled th’ files an’ changed th’ database connection t’ “localhost” instead o’ “apps”. Could be havin’ somethin’ t’ do wi’ th’ weird hostname startup issue that affected jboss earlier. Won’t ereknow fer sure I’m guessin’, at least nay until I restart ‘t again in six moons after forgettin’ all this. Keeptool be a maze o’ chaos in itself an’ will probably be down until th’ next version reaches me, which incidentally ortin’ ta be a tides or two.
So, thar ye be havin’ ‘t.. a letter opener. Seriously tho, ser’er maintenance this time be pretty tame. I’ve had MUCH worse days ‘ere I’ve had 10 swabbies starin’ at me askin’ when this or that be aft online an’ th’ only thin’ I’ve got fer them be a *shrug* an’ me assurance that I’m workin’ on ‘t.
<insert witty endin’ here>
I can hear th’ ocean in th’ distance,