Monthly Archives: July 2016


He’s Dead, Jim

Well, we’ve had our first redshirt moment:

Redshirt_characters_from_Star_Trek
It was the reallocated sectors that did him in, Captain.

Despite passing the initial week-long HDD stress test, one of our 3TB Toshiba NAS drives began to throw SMART errors, even though it had been left idle after deployment. So what did all this mean? Warranty claims, but of course, and also a little bit of mdadm hot-swap RAID fun.

First, we’ve got to tell our NAS box its disk has failed. With a one-liner, the disk is marked as failed and removed from the RAID array. With a flick of a handle, out slides the drive from the new, convenient hot-swap trays. Take that, old and inexplicably locked-up Apple Xserve RAID!
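The post doesn't show the one-liner itself, so here is only a sketch of what such a command could look like. The array name /dev/md0 and the member partition /dev/sdc1 are assumptions, not the real device names on this NAS, and the script prints the command rather than running it.

```python
import shlex

def fail_and_remove_cmd(array, member):
    """Build an mdadm command that marks `member` as failed, then removes it from `array`."""
    return ["mdadm", "--manage", array, "--fail", member, "--remove", member]

# Hypothetical device names; the real ones would come from /proc/mdstat.
cmd = fail_and_remove_cmd("/dev/md0", "/dev/sdc1")
print(shlex.join(cmd))
```

Printing first and pasting by hand is deliberate: it leaves a chance to double-check that the right disk is being failed.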

IMG_20160728_165727
Hotswap = life made easier when it comes to repairs

Luckily, NCIX is a little better with warranty claims than Canada Computers, which is one of the reasons they were chosen as our HDD supply source. Within 10 minutes, they had my RMA approved with Toshiba, and I was given a new disk and was on my way — no waiting around for 2 weeks for a mail-in replacement.

Within a few minutes of being back at the data closet, the new drive is slid into its tray and into the NAS, and another two commands partition the drive, then re-add it to the RAID array. Ah, life’s good.
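The two commands aren't spelled out in the post. One plausible pair, sketched below, copies the GPT partition table from a healthy member with sgdisk and then adds the new partition with mdadm. The tool choice and every device name here are assumptions, and the script only prints the commands.

```python
import shlex

def replace_member_cmds(array, healthy_disk, new_disk, new_member):
    """Return two commands: copy the partition table, then add the new member."""
    return [
        # Replicate healthy_disk's GPT onto new_disk (the target is the --replicate argument).
        ["sgdisk", f"--replicate={new_disk}", healthy_disk],
        # Add the new partition to the array; md then rebuilds onto it.
        ["mdadm", "--manage", array, "--add", new_member],
    ]

# Hypothetical device names, for illustration only.
cmds = replace_member_cmds("/dev/md0", "/dev/sdb", "/dev/sdc", "/dev/sdc1")
for cmd in cmds:
    print(shlex.join(cmd))
```

A replicated table also copies the disk and partition GUIDs, so in practice one would likely follow up with sgdisk's --randomize-guids on the new disk.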

What’s this say about Toshiba 3TB drives, though? Well, not much. One out of 6 disks failing might seem like a lot, but this is too small a sample size to draw any conclusions. Even Backblaze’s blog, which gives a breakdown for the same DT01ACA300 disks EngSoc uses, notes that even 58 disks is too small a sample to draw conclusions. One thing is certain though: as of June 2016, Toshiba drives undercut every other supplier on cost per GB of storage!
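To put a rough number on "too small a sample", here is a sketch that computes a Wilson score interval for 1 failure out of 6 disks. This is a standard textbook method picked for illustration, not something the post or Backblaze uses, and it treats each disk as an independent trial.

```python
import math

def wilson_interval(failures, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = failures / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

low, high = wilson_interval(1, 6)
print(f"plausible failure rate: {low:.0%} to {high:.0%}")
```

The interval runs from a few percent to more than half, which is another way of saying that six disks tell us almost nothing about the model's real failure rate.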

 

 

The Servers are Coming!

The servers are coming, the servers are coming!

Yes, we’ve done it. We’ve gone and deployed a new, shiny, starburst-advertized server cluster – and a rack-mounted one at that, too. While there is still provisioning and service cut-over left to do (from the old systems to the new ones), the cluster is all up and running.

What we’ve got (from top of rack to bottom):

  • Lenovo SFF PC – acting as gateway router
  • Keyboard+Display & KVM switch
  • 2 x LXC host nodes
  • 12TB RAID server
  • DLink DGS1100-24 switch (on rear — not visible)
  • APC UPS1000
  • Old Xserve RAID, now defunct… but too heavy to move!

Some before and after shots:

before_closet before_server after_closet after_server

New RAID Storage for EngSoc

EngSoc is getting new servers, and what better way to start off than by replacing the crufty, decade-and-a-half-old Apple Xserve RAID that had been the backing storage for all of the Club sites? Drawing 250W of power and providing 1.2TB of space over 14 disks, the Xserve RAID is definitely due for replacement.

xserve_raid_slot
Xserve was Apple’s one-time venture into commercial hardware. Unsurprisingly, it flopped.

Our replacement: a hand-spun server, providing 12TB of RAID storage over only 6 disks. The power footprint is also under 100W. 10 years makes a large difference, doesn’t it? While the Xserve RAID cost $5999 new in 2003, EngSoc’s solution cost only $1900 all told, thanks to using off-the-shelf parts.
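The gap is easier to see per terabyte. The sketch below uses only the figures quoted above. It takes 100W as the new server's power draw even though that is stated as an upper bound, and it compares 2003 dollars to 2016 dollars without adjusting for inflation.

```python
# Figures quoted in the post; 100 W is the stated upper bound for the new server.
old = {"watts": 250, "tb": 1.2, "cost": 5999}
new = {"watts": 100, "tb": 12, "cost": 1900}

def per_tb(box):
    """Return (watts per TB, dollars per TB) for one storage box."""
    return box["watts"] / box["tb"], box["cost"] / box["tb"]

for name, box in (("Xserve RAID", old), ("new server", new)):
    watts, dollars = per_tb(box)
    print(f"{name}: {watts:.1f} W/TB, ${dollars:.2f}/TB")
```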

5742_10_rosewill_rsv_l4411_rackmount_server_case_review
The new RAID — still hotswap, and still rack-mountable, but with commodity hardware backing it 

We’re using good old Gigabit Ethernet for all our datanet backbones, and this server has 4 network cards. That’s a lot of bandwidth: 500MB/s in fact (4 x 1Gbit/s), which is more than enough to saturate the 400MB/s write speeds of this new server. Going with commodity Gigabit Ethernet is great, because it’s cheap, well-established, and interoperable with any computer that can connect to an Ethernet switch. This gives us much more flexibility in our network design; previously, the Fibre Channel network for the Xserve RAID meant it could only operate as a ‘slave’ to another box with a Fibre Channel card.
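The bandwidth figure is simple arithmetic, sketched here. It assumes all four Gigabit links carry traffic in parallel, uses decimal units, and ignores protocol overhead, so real-world throughput would come in somewhat lower.

```python
def aggregate_mb_per_s(links, gbit_per_link=1.0):
    """Raw aggregate line rate in MB/s (decimal units, no protocol overhead)."""
    return links * gbit_per_link * 1000 / 8

network = aggregate_mb_per_s(4)  # four Gigabit links
disk_write = 400                 # MB/s, the write speed quoted in the post
print(network, "MB/s network vs", disk_write, "MB/s disk")
print("disks are the bottleneck:", network > disk_write)
```

Note that a single client on a single Gigabit link still tops out at 125MB/s raw; the full aggregate only shows up with several clients or bonded links.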

All of this means that EngSoc has lots of storage now for proper daily backups, and a proper CIFS/Samba share server with some redundancy. This will go a long way toward giving me (the sysadmin), the webmaster, and the EngSoc Officers a staging area for files, temporary backups, and shares.