Wednesday, May 14, 2008

Hoo boy...

We're midway through our VMware implementation project. It's been a very good learning experience for everyone. A couple of minor hiccups with an application here or there, but nothing unmanageable.

Part of this effort will involve migrating services currently running on NetWare 6.5 onto Novell's OES 1 SP 2 platform running on SLES...we've already replaced every other NetWare server in our environment with an OES server.

As you may recall from previous posts, we've been very disappointed with Novell's handling of patching for this "flagship", "enterprise class", "best-of-breed" operating system. In a word, it's a joke. Since we all value our jobs, we've decided it would be in our best interest NOT to adopt an automated or scheduled patching process for any of our OES servers...way too little quality control, way too much instability, and an utter disregard for the concept of a "stable" code base.

Last night, that decision proved to be remarkably prescient.

We needed to see what happens when we migrate an NSS volume connected via iSCSI from a NetWare host to an OES/SLES host. So, off to the lab to build out a couple of simple servers. The NetWare box was built on a simple workstation PC in about an hour or so. The OES server took over 6 hours, and over 4 of that was waiting for the RUG update to finish (our standard process is to patch the server once, prior to productionalization, and never do it again once in the field).

We continued following our standard process for build-out after the first RUG update, which included kernels and all other manner of calamity. Once the second RUG update process completed, another reboot was in order.

Then it happened. TTY hadn't started. Absolutely nothing loaded that required a file system.

Upon further investigation, we found that the SATA hard disks in our IBM x206 server, which SLES had designated as SDA1 and SDA2 at install time, were now appearing as HDA1 and HDA2.

That's right. Running a patch process changed the device names of the hard disks in a working server, leaving the system unbootable.
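For anyone wanting to check their own exposure to this sort of rename, here is a minimal sketch, not something we ran on the lab server. It assumes the mount table refers to disks by kernel-assigned names such as /dev/sda1, which is my guess at why nothing needing a file system would load. The sample fstab contents and the function name `fragile_mounts` are hypothetical, made up purely for illustration. It does not look at the boot loader configuration, which can carry the same kind of reference.

```python
import re

# Kernel-assigned names like /dev/sda1 or /dev/hda2 depend on which driver
# claims the disk, so they can change when a kernel update swaps drivers.
KERNEL_NAME = re.compile(r"^/dev/(sd|hd)[a-z]+\d*$")


def fragile_mounts(fstab_text):
    """Return (device, mount_point) pairs that rely on kernel-assigned names."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        if KERNEL_NAME.match(device):
            flagged.append((device, mount_point))
    return flagged


# Hypothetical fstab contents, NOT taken from the affected server.
SAMPLE_FSTAB = """\
# illustrative entries only
/dev/sda1   swap   swap      defaults        0 0
/dev/sda2   /      reiserfs  acl,user_xattr  1 1
/dev/disk/by-id/scsi-EXAMPLE-part3  /data  ext3  defaults  1 2
LABEL=home  /home  ext3      defaults        1 2
proc        /proc  proc      defaults        0 0
"""

if __name__ == "__main__":
    for device, mount_point in fragile_mounts(SAMPLE_FSTAB):
        print(f"{device} mounted at {mount_point} depends on kernel device naming")
```

Entries mounted by label or by a /dev/disk/by-id path are left alone, since those references do not depend on whether the kernel presents the disk as sd or hd.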

Now, you might imagine how difficult it would be to address this issue - remotely - on 50 servers in the field. Even with the RSA cards we have and the boot-to-ISO capability, we'd be looking at untold hours wasted, recovering from an issue caused by Novell's complete and utter failure to act as responsible custodians for their product.

I'm now more convinced than ever that Novell lacks the leadership, and indeed the brainpower it once had, required to engineer and support an enterprise network operating system. The incredible legacy Drew Major created at Novell has been squandered, irrevocably. This is a company that has announced they won't have the first support pack to fix long-acknowledged, significant issues in the latest version of their flagship operating system product until nearly 18 months after its release.

What more reason do customers need to abandon Novell, with haste? All of the arguments of superiority Novell enjoyed at Microsoft's expense have been eradicated - not by Microsoft's improved product quality, but by the decline of Novell's product quality.

Certainly this will seem overly pessimistic, possibly bombastic, and over-reacting, to some of the more ardent red-blooded IT folks (and the indifferent). I am guilty of being passionate about the quality of service I deliver to my customers, and about holding others to the same expectations I have of myself - especially the vendors we've selected, especially when their performance goes into decline. This is the case far too often in our industry. It's inexcusable, and I cannot help but think a combination of Wall Street pressures, greed, and ignorance have all conspired to chip away at the foundation of our industry.

The cost and complexity of operating an enterprise IT infrastructure are growing exponentially, and in too many cases, completely without reason or justification. Nobody is concerned with making efficient, quality software any longer. Promise the world, ship tons of install DVDs, and if it doesn't work, shrug your shoulders and passively blame the customer for having either deficient requirements or deficient skill sets.

If our industry continues at this rate, the entire nation will have a very difficult time competing abroad.

1 comment:

gcballard said...

Product quality. Yeah, right. I beta tested ZCM10. I was a fevered ZENworks supporter until after I had implemented a bunch of ZCM servers. and they broke. and I couldn't fix them. and I had to uninstall. and the uninstall deleted ALL my images. I now use Microsoft.