Monday, August 11, 2008

Novell Client for Vista, the ecosystem

I just reported a bug in the beta that surprised me. I can't talk details about it, but it strikes me as the kind of bug that should have been at least reported shortly after the client was released. Perhaps it was just so overall buggy that it got lost in the forest, but still. The Vista client has been out for some time now.

Having said the following rant several times over the past few days, I figure it's time to post it ;).

The problem we're running into is that the users of the Vista Client are a small, small subset of the overall users of the Novell Client, who are by now a minority of overall users of Novell NCP file-servers. Novell spent years hyping 'clientless' approaches to file-serving, through the CIFS stack on NetWare. A lot of places bought into that. Because of this, the percentage of NCP-client Vista users among the overall Novell File-Server market is a rather small one.

And small means you don't get a lot of testing done by people-who-are-not-us, which leaves seemingly obvious bugs showing up in the beta SP1 builds. I don't have any Vista workstations, so I've done exactly zero testing of the Vista Client; this particular bug was reported and troubleshot by someone who is not me (I just filed it). Even though we have beta builds of the Vista client as part of this beta, I'm not testing it. All things considered, I probably should.

Since we're wedded hard to the Novell Client, it's probably time for us to start devoting resources to the ecosystem in order to keep it alive.



Friday, July 25, 2008

Handling eDirectory core-files on linux

If you've been getting core files generated by ndsd on your Linux servers, and want to call Novell Support about it, there are a few things you can do to maximize what Novell will get out of the files themselves. You may not get much, but these will help the people with the debug symbols figure out what's going on.

Packaging the Core


First and foremost, the tool for packaging core files for delivery to Novell is already on your system. TID3078409 describes the details of how to use 'novell-getcore.sh'. It is included on 8.7.3.x installations as well as 8.8.x installations.

Running it looks like this:
edirsrv1:~ # novell-getcore -b /var/opt/novell/eDirectory/data/dib/core.31448 /opt/novell/eDirectory/sbin/ndsd
Novell GetCore Utility 1.1.34 [Linux]
Copyright (C) 2007 Novell, Inc. All rights reserved.


[*] User specified binary that generated core: /opt/novell/eDirectory/sbin/ndsd
[*] Processing '/var/opt/novell/eDirectory/data/dib/core.31448' with GDB...
[*] PreProcessing GDB output...
[*] Parsing GDB output...
[*] Core file /var/opt/novell/eDirectory/data/dib/core.31448 is a valid Linux core
[*] Core generated by: /opt/novell/eDirectory/sbin/ndsd
[*] Obtaining names of shared libraries listed in core...
[*] Counting number of shared libraries listed in core...
[*] Total number of shared libraries listed in core: 72
[*] Corefile bundle: core_20080725_092227_linux_ndsd_edirsrv1
[*] Generating GDBINIT commands to open core remotely...
[*] Generating ./opencore.sh...
[*] Gathering package info...
[*] Creating core_20080725_092227_linux_ndsd_edirsrv1.tar...
[*] GZipping ./core_20080725_092227_linux_ndsd_edirsrv1.tar...
[*] Done. Corefile bundle is ./core_20080725_092227_linux_ndsd_edirsrv1.tar.gz


Once you have the packaged core, you can upload it to ftp.novell.com/incoming as part of your service-request.
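If several cores have piled up, the packaging step above can be looped. This is only a minimal sketch: the function names list_cores and package_cores are mine, and the only novell-getcore invocation used is the `-b <core> <binary>` form shown in the transcript above. It assumes the cores sit directly in one directory and are named core.<pid>.

```sh
#!/bin/sh
# list_cores DIR
# Print every regular file named core.* found directly in DIR, one per line.
# (Hypothetical helper, not part of the Novell tooling.)
list_cores() {
    for f in "$1"/core.*; do
        # When nothing matches, the glob stays literal and is skipped here.
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# package_cores DIR BINARY
# Run novell-getcore once per core file, using the same "-b core binary"
# form as the transcript above. Bundles land in the current directory.
package_cores() {
    list_cores "$1" | while IFS= read -r core; do
        novell-getcore -b "$core" "$2"
    done
}
```

With the paths from the transcript, that would be run as: package_cores /var/opt/novell/eDirectory/data/dib /opt/novell/eDirectory/sbin/ndsd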

Including More Data


If you're lucky enough to be able to cause the core file to drop on demand, or it just plain happens often enough that repetition isn't a problem, there is one more thing you can do to include better data in the core you ship to Novell. TID3113982 describes a setting you can add to the ndsd launch script (/etc/init.d/ndsd) that'll include more data. The TID describes what is being done pretty well. In essence, you're using an alternate malloc call that fails with better information than the normal one. You don't want to run with this set for very long, especially in busy environments, as it impacts performance. But if you have a repeatable core, the information it can provide is better than a 'naked' core. Setting MALLOC_CHECK_=2 is my recommendation.

Be sure to unset this once you're done troubleshooting. As I said, it can impact performance of your eDirectory server.
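For reference, MALLOC_CHECK_ is a glibc environment variable, and a value of 2 makes the process abort as soon as heap corruption is detected. I'm not reproducing the TID's exact edit to /etc/init.d/ndsd here. The following is a sketch of the same idea, with a helper name (run_with_malloc_check) that I made up, which sets the variable for a single command so nothing is left behind to unset.

```sh
#!/bin/sh
# run_with_malloc_check LEVEL COMMAND [ARGS...]
# Run COMMAND with MALLOC_CHECK_ set to LEVEL for that one process only.
# The calling shell's environment is untouched, so there is nothing to
# clean up once troubleshooting is over.
# (Hypothetical helper; the TID edits the ndsd init script instead.)
run_with_malloc_check() {
    level=$1
    shift
    env MALLOC_CHECK_="$level" "$@"
}
```

Something like "run_with_malloc_check 2 some-daemon" scopes the setting to that one launch. If you export it from the init script instead, as the TID describes, remember to take the line back out.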



Wednesday, July 16, 2008

Patching SLES

Last night I attempted to patch one of our OES2 servers. This particular server is an elderly beast, a P3 1GHz machine. So I wasn't expecting anything like fastness out of it. Especially with rug.

But still, it was painful!
normandy: ~#: rug lu
Waking up ZMD...
[8 minutes later]
[list of one update, libzypp]
normandy: ~#: rug update
Resolving Dependencies....
[8 minutes later]
Install this update? (y/N)
y
[12 minutes later]
Restarting ZMD...
[8 minutes later]
normandy: ~#: rug lu
[list of updates. No need to wait 8 minutes this time.]
normandy: ~#: rug update
Resolving Dependencies...
[8 minutes later]
Dependency resolution failed for bind-util and bind-libs. libdns-whatzihoozit required by bind-util is provided by bind-libs. Please fix you hoser.
[insert swearing here]
normandy: ~#: rug in bind-util bind-libs
Resolving Dependencies....
[8 minutes later]
Install these updates? (y/N)
y
[12 minutes later]
normandy: ~#: exit

As this had taken far longer than even I was expecting, I stopped. I'll finish up tonight. As this is an OES2 server, this means SLES10-SP1. I can attest that SLES10-SP2 on identical hardware is MUCH faster. I can't wait until OES2-SP1 comes out and this dinosaur can get faster patching.



Monday, June 30, 2008

Novell Client for Linux, packaged for OpenSUSE

It has been mentioned many places, and I've done some of the mentioning, that since openSUSE is the foundation for SLED, it makes sense for Novell to distribute an NCL for openSUSE. It turns out they're working on just that. And here is the Novell beta page. I'm soooo going to try this out, since I'm running openSUSE 10.3 on my work desktop and won't be moving to openSUSE 11 until I can run the client on it (oh, wait, I can).

It should also be mentioned that Ubuntu is a very frequently requested target for another NCL, but I have reason to believe that'll never happen. First of all, any Novell Client involves closed source 3rd party licensed code, which makes it hard to port to Linux in the first place (a relic of being based on code from the days when open-source was just an ethical standpoint rather than a tangible market force). Second, Novell has proven to be rather light in developer resources in certain areas, and linux integration with non-SUSE linux distros is very minimal.



Tuesday, June 24, 2008

Backing up NSS, note for the future

According to this documentation, the storing of NSS/NetWare metadata in xattrs is turned off by default. You turn it on for OES2 servers through the "nss /ListXattrNWMetadata" command. This allows Linux-level utilities (e.g. cp, tar) to access and copy the NSS metadata. This also allows backup software that isn't SMS enabled for OES2 to back up the NSS information.

This is handy, as HP DataProtector doesn't support NSS backup on Linux. I need to remember this.



Monday, June 16, 2008

A good article on trustees

Over on the Novell Cool Solutions site, Marcel Cox just posted an article about how Trustees are handled on the Novell Filesystems (TFS and NSS). If you wanted to know the fundamentals of how ACLs are done on NSS volumes and how they relate to eDirectory, this is a good start.



Thursday, May 29, 2008

OES2 and SLES10-SP2

Per Novell:

Updating OES2

OES2 systems should NOT be updated to SLES10 SP2 at this time!

Very true. And most especially true if you're running virtualized NetWare! The paravirtualization components in NW65SP7 are designed around the version of Xen that's in SLES10-SP1, and SP2 contains a much newer version of Xen (trying to play catch-up to VMware means a fast dev cycle, after all). So, expect problems if you do it.

Also, the OES2 install does contain some kernel packages, such as those relating to NSS.

OES2 systems need to wait until either Novell gives the all clear for SP2 deployments on OES2-fcs, or OES2-SP1 ships. OES2-SP1 is built around SLES10-SP2.



Friday, May 23, 2008

Problem with SLES10-SP2

Just this morning Novell posted a new TID:

Updates catalogs missing after updating libzypp

I've heard on the grapevine that this particular libzypp update was put into the SLES10-SP1 channel in order to prepare for SP2's release. Those fine folk out there that have turned on Auto Updating on their SLE[S|D] boxes have very probably already been bitten by it. I hope Novell gets this one fixed, and posts recovery steps, soon.



Wednesday, May 21, 2008

SLES10 SP2 shipped

According to Novell, SLES10 SP2 has shipped.

This means that the ongoing OES2 SP1 beta I'm a part of will be done on released code for the SLES side of it. So any bugs we find there may end up as patches on the SP2 channel.

One nice thing in the new code?

"rug refresh --clean"

This will do what I posted about a few days ago. It'll nuke the zmd database and rebuild it fresh! Niiiice! Unfortunately, a truly better version of rug won't come until "Code 11".



Wednesday, May 14, 2008

NetWare and Xen

Here is something I didn't really know about in virtualized NetWare:

Guidelines for using NSS in a virtual environment

Towards the bottom of this document, you get this:

Configuring Write Barrier Behavior for NetWare in a Guest Environment

Write barriers are needed for controlling I/O behavior when writing to SATA and ATA/IDE devices and disk images via the Xen I/O drivers from a guest NetWare server. This is not an issue when NetWare is handling the I/O directly on a physical server.

The XenBlk Barriers parameter for the SET command controls the behavior of XenBlk Disk I/O when NetWare is running in a virtual environment. The setting appears in the Disk category when you issue the SET command in the NetWare server console.

Valid settings for the XenBlk Barriers parameter are integer values from 0 (turn off write barriers) to 255, with a default value of 16. A non-zero value specifies the depth of the driver queue, and also controls how often a write barrier is inserted into the I/O stream. A value of 0 turns off XenBlk Barriers.

A value of 0 (no barriers) is the best setting to use when the virtual disks assigned to the guest server’s virtual machine are based on physical SCSI, Fibre Channel, or iSCSI disks (or partitions on those physical disk types) on the host server. In this configuration, disk I/O is handled so that data is not exposed to corruption in the event of power failure or host crash, so the XenBlk Barriers are not needed. If the write barriers are set to zero, disk I/O performance is noticeably improved.

Other disk types such as SATA and ATA/IDE can leave disk I/O exposed to corruption in the event of power failure or a host crash, and should use a non-zero setting for the XenBlk Barriers parameter. Non-zero settings should also be used for XenBlk Barriers when writing to Xen LVM-backed disk images and Xen file-backed disk images, regardless of the physical disk type used to store the disk images.

Nice stuff there! The "xenblk barriers" can also have an impact on the performance of your virtualized NetWare server. If your I/O stream runs the server out of cache, performance can really suffer if barriers are non-zero. If it fits in cache, the server can reorder the I/O stream to the disks to the point that you don't notice the performance hit.

So, keep in mind where your disk files are! If you're using one huge XFS partition and hosting all the disks for your VM-NW systems on that, then you'll need barriers. If you're presenting a SAN LUN directly to the VM, then you'll need to "SET XENBLK BARRIERS = 0", as they're set to 16 by default. This'll give you better performance.
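The rule from the quoted documentation boils down to a small lookup. Here it is as a sketch. The backing-type keywords (scsi, fc, iscsi, sata, ide, lvm-image, file-image) and both function names are labels I invented for illustration. Only the 0 and 16 values and the SET command text come from the material above, and 16 is simply the documented default, not a tuned value.

```sh
#!/bin/sh
# xenblk_barriers_for TYPE
# Map how a guest's virtual disk is backed on the host to a XenBlk
# Barriers value, following the quoted Novell guidance:
#   physical SCSI / Fibre Channel / iSCSI disks or partitions -> 0
#   SATA, ATA/IDE, LVM-backed or file-backed disk images      -> non-zero
# The TYPE keywords are hypothetical; 16 is the documented default.
xenblk_barriers_for() {
    case "$1" in
        scsi|fc|iscsi)                 printf '0\n' ;;
        sata|ide|lvm-image|file-image) printf '16\n' ;;
        *)                             return 1 ;;
    esac
}

# xenblk_set_command TYPE
# Print the SET line to type at the NetWare server console.
xenblk_set_command() {
    value=$(xenblk_barriers_for "$1") || return 1
    printf 'SET XENBLK BARRIERS = %s\n' "$value"
}
```

So a SAN LUN presented straight to the VM (fc or iscsi here) comes out as 0, while disk files sitting on one big host partition (file-image) keep the barriers on.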



Tuesday, May 06, 2008

Being annoyed by rug?

Rug/zmd in SLES10-SP1 is still a headache maker. Novell knows this, but I strongly suspect that we'll have to wait until SLES11 before we get anything improved. OpenSUSE now has zypper, which works pretty well, and I think you can use it on SLES if you want, but I haven't tried.

One of the chief annoyances of rug is that the zmd.db file kept in /var/lib/zmd/zmd.db gets corrupted far too easily. And when that happens, rug can take HOURS to return anything. If it returns anything at all.

The fix for it is easy: stop zmd, delete the zmd.db file, restart zmd. Since I'm doing this fairly often, I've whipped up a bash script to do it for me.

nukezmd
#!/bin/bash
#
# For killing ZMD when it is clearly hung. An all too often occurrence.
#

# First get the PID of ZMD

printf "Getting PID... "
PIDZMD=$(rczmd showpid)
printf '%s\n' "$PIDZMD"

# Then unconditionally kill it (skipped if no PID came back)

printf "Killing zmd hard... \n"
[ -n "$PIDZMD" ] && kill -9 "$PIDZMD"

# Remove the old, inconsistent database

printf "Nuking old database... \n"
rm -f /var/lib/zmd/zmd.db

# Restart ZMD, which will build a new, consistent database

printf "Restarting ZMD\n"
rczmd start

Simple, to the point. Works.



Monday, May 05, 2008

Linux @ Home

My laptop at home dual-boots between openSUSE and WinXP. There are a few reasons why I don't boot the Linux side very often, some of them work related. And, what the heck, here are the two reasons.

1: Wireless driver problems
I have an Intel 3945 WLAN card. It works just fine in Linux, well supported. What throws it for a loop, however, are sleep and hibernate states. It can go one, two, four, maybe five cycles through sleep before it will require a reboot in order to find the home wireless again. If it doesn't lock the laptop up hard. Since my usage patterns are heavily dependent upon Sleep mode, this is a major, major disincentive to keep the Linux side booted.

I understand the 2.6.25 kernel is a lot better about this particular driver. Thus, I await with eager anticipation the release of openSUSE 11.0. This driver is currently the ipw3945 driver, and will eventually turn into the iwl3945 driver once it comes down the pipe. What little I've read about it suggests that the iwl driver is more stable through power states.

2: NetWare remote console
I use rconip for remote console to NetWare. Back when Novell first created the IP-based rconsole, they also released rconj alongside ConsoleOne to provide it. As this was written in Java, it was mind bogglingly slow. The little rconip .exe file was vastly faster, and I've come to use it extensively. Unless I get Wine working, this tool will have to stay on my Windows XP partition. It works great, and I haven't found a good Linux-based replacement yet.

Time has moved on. Hardware has gotten faster, and the 'java penalty' has reduced markedly. RconJ is actually usable, but I still don't use it. Plus, it would require me to install ConsoleOne onto my laptop. It's 32-bit, so that's actually possible, but I really don't want to do that.

The Remote Console through the Novell Remote Monitor (that service out on :8009) has a nice remote-console utility, but it also requires Java. I'm still biased against java, and java-on-linux still seems fairly unstable to me. I don't trust it yet. It also doesn't scale well. When I'm service-packing, it is a LOT nicer looking to have 6 rconip windows up than 6 browser-based NRM java-consoles open. Plus, rconip will allow me access to the server console if DS is locked, something that NRM can't do and is invaluable in an emergency.

Once the wireless driver problems are fixed, I'll boot the linux side much more often. Remote-X over SSH actually makes some of my remote management a touch easier than it is in WinXP. And if I really really need to use Windows, my work XP VM is accessible over RDesktop. There are a few other non-work reasons why I don't boot Linux very often, but I'll not go into those here.

So, oddly, NetWare is partly responsible for keeping me in Windows at home. But only partly.



Thursday, April 17, 2008

NetWare and Novell, changing a company

A couple days ago Richard Bliss had a long blog entry about, "Novell's Cash Cow - How NetWare almost killed the company". It had some very interesting points. Some we knew:
We are all familiar with NetWare, the dominate Network Operating system of the 1980s and 1990s. We are all familiar with Microsoft's tactics of penetrating the NOS market with Windows NT by focusing on using Windows as an application platform.
Apparently Richard worked for Novell around 2001. I find that interesting since my first BrainShare was 2001, and that was when they announced the release of NetWare 6.0. While there he saw what seemed to be an outright denial that NetWare had been passed up by Windows and something new needed to be done.

In 2001 I knew that Windows had for all intents and purposes won. The only place you ever really saw NetWare servers were as file-servers, or running GroupWise or the small handful of apps that used NetWare as an application server. The stalwart loyalists among us saw this as annoying, but not a major problem.

It was also good for Novell's bottom line. NetWare still accounted for a large percentage of their revenues. Even though the writing was on the wall, they were still making real money on it so didn't see a need to change. This is why NetWare 6.0 introduced the AMP stack to NetWare, as a way to better make NetWare an application server and to slow the loss of customers. At BrainShare 2001 there was open speculation about "NetWare 7.0" and what it would look like.

And there still was until 2005 when Novell announced what the next version of NetWare would be. This being after the SUSE and Ximian purchases, it would be based on Linux. This move had been rumored, and alternately derided and lauded, for some time. There was a great wailing and gnashing of teeth on the part of the stalwart NetWare loyalists. It also started an exodus of customers, as Novell's financial reports at the time point out.

Fortunately for the company, they started actively promoting (for certain values of 'active' that are higher than they were previously, but still in the theme of Novell Stealth Marketing) and developing their other products, like GroupWise, Novell Identity Management, ZenWorks, and most especially their Linux business. It took them until last quarter to turn in a quarter in the black, and NetWare revenues are under 20% of total now. So, they've turned the corner and are no longer dependent on the NetWare cash cow. They have a couple of them in the field now, which is a MUCH healthier place to be.

It's a funny thing, but one of the reasons why NetWare is such a kick-butt file-server compared to everything else is why it's a challenging environment to develop in. Had Novell seen the light earlier and bought SUSE (or rolled their own Linux distro) in... 1999 instead, right after the NW5.1 release, they still would have run into the fundamental architectural problems in 32-bit linux that make it an inferior file-serving platform for large environments. By 2008 their server could have been a LOT more mature, and perfectly poised to take advantage of 64-bit Linux.

Novell in the 1990s was not an example of a 'nimble' company. It is trying to get there now through diversification. Not many companies (especially tech companies) have survived the loss of their prime money earner; Apple has done it through OSX, which required a fanatically loyal fan base to survive the dark years. This is the prime reason people kept predicting the imminent demise or buyout of Novell. Now that they're earning profits again, and have diversified away from just the OS sector, they're not going to be going out of business any time soon.

Now if only they had better SMB packages and programs. I hear repeatedly from peers who support SMBs that Novell's packages and programs in that space are lacking or exploitative. Significant revenue, and more importantly mindshare, are in the SMB market. Plus, today's SMB is tomorrow's large or global enterprise.



Tuesday, April 01, 2008

Slow blogging

I found out at BrainShare that WWU has been accepted as a Novell Authorized Beta site for OES2 SP1. And that's what I've been doing for the better part of the past week. Due to the NDA required, I can't talk about it. So, not much bloggable stuff to bring forward.

We requested entry into the program in part because of what I learned at BrainShare 2007. Specifically, Novell doesn't test at our scale of users. Therefore, it is in our best interest to make sure that organizations like us are in the beta. We have the hardware to make a go of it right now (all those new ESX boxes are liberating some still-useful 3-5 year old servers), and I have the time. Unfortunately, the only 64-bit testing we'll be doing will be in VMware, so the newest of the new code will have to be really tested by other people.

That's why I've been quiet.



Tuesday, March 25, 2008

IPv6 vs IPX

In a session last week came the following comment from a presenter (paraphrased):
How many of you in the room have been at this long enough to do IPX? Ok, great. Now how many of you have done anything with IPv6? Doesn't that look JUST like IPX?
And he's right, to a point. IPX addresses are of the form network-number:node-number, such as:

00008021:0002a540d0e1

Where 'node number' is the MAC address of the network card in question. It's up to the routers to figure out where network-numbers live, and advertised services issue full-network broadcasts to advertise said service, which is the primary reason that IPX just doesn't scale if WAN links are in the mix. But that's by the by.

IPv6 addresses work similarly:

2001:0db8:85a3:08d3:1319:8a2e:0370:7334

The last 64 bits are the interface identifier, which stateless autoconfiguration derives from the interface's 48-bit MAC address (the modified EUI-64 format), and the bits ahead of it constitute the network number. Except... the IPv6 designers knew about the failings of IPX and worked around them. The interface identifier doesn't have to be derived from the MAC address, though as I understand it every interface still has to carry at least a link-local address. Unlike IPX, IPv6 has the ability to have 'secondary' addresses. The lack of this ability was the main reason that Novell Cluster Services only worked on IP networks, which caused its own wave of grief when clustering was introduced in the NetWare 5.1 era. Secondary IPv6 numbers don't have to follow the MAC format, which in my opinion is a good thing!
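To make the comparison concrete: an IPX address splits at the colon into network and node, and the usual way an IPv6 interface identifier is built from a MAC address is the modified EUI-64 scheme in RFC 4291 (insert ff:fe in the middle of the MAC and flip the 0x02 bit of the first octet). A minimal sketch of both follows. The function names are mine, and it only handles well-formed input.

```sh
#!/bin/sh
# ipx_network / ipx_node ADDRESS
# Split an IPX address of the form network:node at the colon.
ipx_network() { printf '%s\n' "${1%%:*}"; }
ipx_node()    { printf '%s\n' "${1#*:}"; }

# mac_to_interface_id MAC
# Build the 64-bit modified EUI-64 interface identifier from a 48-bit MAC.
# Accepts 0002a540d0e1, 00:02:a5:40:d0:e1 or 00-02-a5-40-d0-e1.
# Returns 1 if the input is not exactly 12 hex digits.
mac_to_interface_id() {
    mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    case "$mac" in
        *[!0-9a-f]*) return 1 ;;
    esac
    [ "${#mac}" -eq 12 ] || return 1
    o1=$(printf '%s' "$mac" | cut -c1-2)
    o2=$(printf '%s' "$mac" | cut -c3-4)
    o3=$(printf '%s' "$mac" | cut -c5-6)
    o4=$(printf '%s' "$mac" | cut -c7-8)
    rest=$(printf '%s' "$mac" | cut -c9-12)
    # Flip the universal/local bit (0x02) in the first octet.
    o1=$(printf '%02x' $(( 0x$o1 ^ 0x02 )))
    # Insert ff:fe between the two halves of the MAC.
    printf '%s%s:%sff:fe%s:%s\n' "$o1" "$o2" "$o3" "$o4" "$rest"
}
```

Feeding the node number from the IPX example above through it, mac_to_interface_id "$(ipx_node 00008021:0002a540d0e1)" gives 0202:a5ff:fe40:d0e1, which would form the low 64 bits of an autoconfigured address.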

Yes, when I first read about IPv6 addressing I had that same, "wow, this is just like IPX," moment the BrainShare presenter had. Only, more scalable, and more flexible.



Thursday, March 20, 2008

BrainShare Thursday

Not a good day. My first course, "Advanced BASH," could more accurately be described as, "BASH scripting tips & tricks". I then proceeded to skip the other three sessions I had signed up for.
  • Novell Open Enterprise Server 2 Interoperability with Windows and AD. All about Domain Services for Windows and Samba. Neither of which we'll ever use. No idea why I wanted to be in this session.
  • Rapid Deployment of ZENworks Configuration Management. Other people around here have suggested that if we haven't moved yet, wait until at least SP3 before moving. If then. So, demotivated. Plus I was rather tired.
  • Configuring Samba on OES2. CIFS will do what we need, I don't need Samba. Don't need this one. Skipped.
DL236: Advanced BASH Course
BASH tips and tricks. I got a lot out of it, but the developers around me were quietly derisive.

ZEN Overview and Features
Not so much with the futures, but it did explain Novell's overall ZEN strategy. It isn't a coincidence that most of Novell's recent purchases have been for ZEN products.

TUT303: OES2 Clusters, from beginning to extremes
This was great. They had a full demo rig, and they showed quite a bit in it. Including using Novell Cluster Services to migrate Xen VM's around. They STRONGLY recommended using AutoYast to set up your cluster nodes to ensure they are simply identical except for the bits you explicitly want different (hostname, IP). And also something else I've heard before, you want one LUN for each NSS Pool. Really. Plus, the presenters were rather funny. A nice cap for the day.

And tonight, Meet the Experts!



BrainShare Wednesday

The Wednesday keynote was, indeed, a bunch of demos. It was also mostly pointless as far as the technology I'm concerned with. Lots of GroupWise (don't care), lots and lots of PlateSpin (can't afford it), lots of Zen (not the bits I'd use).

That said, the new GroupWise WebAccess is gorgeous. I wish Exchange had their non-ActiveX pages look that good.

TUT175: RBAC: Avoiding the horror, getting past the hype
Mostly about IDM as it turned out. Only minimally interesting from an abstract viewpoint about roles in general.

TUT 277: Advanced eDirectory Configuration, new features, and tuning for performance
I learned a few things I didn't know, such as the fact that each object has an "AncestorList" attribute listing who their parent objects are. This apparently greatly speeds up searching. SP3, coming out this Summer, will have faster LDAP binds for a couple of reasons. Right now Novell is recommending 2 million objects as a reasonable maximum size for a partition for performance reasons.

And also they reiterated something I've heard before...
You know how back in the NetWare 4 days, we said to design your tree by geography at the first level, and then get to departments? Um, sorry about that. It was great back then, but for LDAP or IDM it really, really slows things down.
Yep. I took my first class for my CNA when 'Green River' was just coming out, or was just out. So I remember that.

TUT221: iPrint on Linux, what Novell Support wants you to know
A nice session from a mainline support guy about the ways people don't do iPrint on linux correctly. We're not going there until pcounter can run in linux, so this is still somewhat abstract. But, nice to know.
  • Some iPrint jobs render differently than direct-print jobs because of how Windows is designed. Direct-print jobs render with the 'local print provider', and iPrint jobs render with the 'network print provider'. This is a Microsoft thing, not an iPrint thing. You can duplicate it by setting up a Microsoft IPP printer (assuming you're not mandating SSL like we are) and printing to the same printer with the same driver.
  • The Manager on Linux doesn't use a Broker, it uses a 'driver store'.
  • The Manager on NetWare doesn't always bind to the same broker. I didn't know that.
  • It is recommended to have only one Broker, or one driver store per tree.
  • Novell recommends using DNS rather than IP for your printer-agents, check your manager load scripts.



Tuesday, March 18, 2008

BrainShare Tuesday

Today started off with a bit of panic, as I hadn't set my alarm. Me being a west-coaster, 7:20 (when I woke up) is an entirely reasonable time to get up as far as my body is concerned. Only, I needed to get dressed and breakfasted before my first session at 8:30. Aie! I had to eat quick, but I got there. Didn't get a chance to check work email, though.

ATT326: Advanced Linux Troubleshooting
An ATT, therefore hard to summarize. But I learned about a few new commands I didn't know about before. Like strace. And vimdiff.

TUT130: Challenges in Storage I/O in Virtualization
Another nice one, but an emergency at work (printing down in a dorm, during finals week) distracted me heavily during the first half of it. Which resulted in the following note in my notes:
NPIV looks really nifty. Look into it.
NPIV being how you can use fibre-channel zoning to zone off VMs, rather than HBAs. Highly useful. I also learned about a neat new thing called Virtual Fabrics. Virtual Fabrics work kind of like VLANs for fabrics. You can segregate your fabrics into fabrics that share hardware but nothing else. Handy if your, say, Solaris admins don't want you mucking about with their zoning, while saving money through consolidated hardware.

TUT216: OES2 SP1 Architectural Overview
There is a LOT of new stuff in SP1.
  • It will include eDir 8.8.4 (8.8.3 will ship this summer sometime)
  • NCP and eDir will be fully 64-bit
  • OES2 SP1 will be based on SLES10 SP2, which will be releasing about the same time
  • AFP Support
    • AFP 3.1
    • Uses Diffie-Hellman 1 for password exchange, meaning the 8-character password problem is solved.
    • Fully SMP-safe
    • Has cross-protocol locking with NCP. CIFS doesn't have cross-protocol locking yet, but IIRC, Samba does
    • Does not need LUM enabled users
  • CIFS Support
    • NTLMv1, but v2 is a possibility if enough people ask, so file those enhancement requests!!
    • CIFS is separate from Samba, therefore can not be used in conjunction with Domain Services for Windows
    • As with AFP, fully SMP safe
  • EDir 8.8.4
    • LDAP auditing enhanced
    • "newer auth protocols", but they didn't say what.
I should also mention that they're still deploying Novell Integrated Samba, which is what you'll have to use to get Domain Services For Windows. Samba still doesn't scale as far as I'd like ('only' 700-800 concurrent users), so that may be an issue for higher ed types who want high concurrency CIFS and also DSFW on the same box.

TUT211: Enhanced Protocol Support in OES2 SP1
This is the session where they went into detail about the AFP and CIFS support. They said that netatalk, the existing AFP stack on Linux, gets really slow once you go over 20 concurrent users. Whoa! I can soooo understand why Novell felt the need to make a new one.
  • The 8 character password limit has been fixed! They now support DH1 for passing passwords.
  • The 'afptcp' daemon can use one password protocol at a time, so you can only use DH1, or one of the other three I can't remember.
  • Support for OSX 10.1 and 10.2 is scanty, and 10.5 is limited but users may not notice anyway.
  • Passwords will be case sensitive.
  • Kerberos will be in a future release
  • Performance is faster than NetWare, partly due to the ability to multi-thread
  • Can register services by way of SLP
  • Only supports NSS for the time being, the other Linux file-systems will be a future feature.
  • Can support 500 concurrent users, and 1000+ in the future. This fits our current AFP loads.
  • We can configure more about how it works than we could on NetWare, such as how many worker threads to spawn.
  • Has meaningful debug logs!
  • Has a new command, 'afpstat' that works like 'netstat' for giving a snapshot of afp connections.
And then some CIFS stuff. We can't use it for political reasons so I didn't pay attention. Sorry.

Tonight was the night formerly known as 'Sponsor Night,' but has a new name now that everyone who gets a booth is no longer a 'sponsor'. Some are sponsors, some are exhibitors. I can't keep track. Anyway, today was their party. "World of Novellcraft!" Homage to vid-gaming.

Lots of Wii, lots of Rock Band, some Halo, lots of women dressed in Renaissance Festival gear getting their pictures taken by the 90%+ male audience. I've blogged before about my ambivalence about Sponsor Night. I lasted until about 7, when I came back to the hotel.

Tomorrow I have an actual LUNCH BREAK in my schedule! Ooo! And Soul Asylum Soul Coughing Collective Soul plays the concert! I've been listening to two of their CD's for the past two months so I think I may even know a few songs by now.



Monday, March 17, 2008

Today at Brainshare

Monday. Opening day. I had trouble getting to sleep last night due to a poor choice of bed-time reading (don't read action, don't read action, don't read action). And had to get up at 6am body time in order to get breakfast before the morning keynote. There be zombies.

Breakfast was uninspired. As per usual, the hashbrowns had cooled to a gelid mass before I found everything and got a seat.

The Monday keynotes are always the CxO talks about strategy and where we're going. Today a mess of press releases from Novell give a good idea what the talks were about. Hovsepian was first, of course, and was actually funny. He gave some interesting tid-bits of knowledge.
  • Novell's group of partners is growing, adding a couple hundred new ones since last year. This shows the Novell 'ecosystem' is strong.
  • 8700 new customers last year
  • Novell press mentions are now only 5% negative.
Jeff Jaffe came on to give the big wow-wow speech about Novell's "Fossa" project, which I'm too lazy to link to right now. The big concern is agility. He also identified several "megatrends" in the industry:
  • High Capacity Computing
  • Policy Engines
  • Orchestration
  • Convergence
  • Mobility
I'm not sure what 'Convergence' is, but the others I can take a stab at. Note the lack of 'virtualization' in this list. That's soooo 2007. The big problem is now managing the virtualization, thus Orchestration. And Policy Engines.

Another thing he mentioned several times in association with Fossa and agility, is mergers and acquisitions. This is not something us Higher Ed types ever have to deal with, but it is an area in .COM land that requires a certain amount of IT agility to accommodate successfully. He mentioned this several times, which suggests that this strategy is aimed squarely at for-profit industry.

Also, SAP has apparently selected SLES as their primary platform for the SMB products.

Pat Hume from SAP also spoke. But as we're on Banner, and it'll take a sub-megaton nuclear strike to get us off of it, I didn't pay attention and used the time to send some emails.

Oh, and Honeywell? They're here because they have hardware that works with IDM. That way the same ID you use for your desktop login can be tied to the RFID card in your pocket that gets you into the datacenter. Spiffy.

ATT375 Advanced Tips & Tricks for Troubleshooting eDir 8.8
A nice session. Hard to summarize. That said, they needed more time as the Laptops with VMWare weren't fast enough for us to get through many of the exercises. They also showed us some nifty iMonitor tricks. And where the high-yield shoot-your-foot-off weapons are kept.

BUS202 Migrating a NetWare Cluster to OES2
Not a good session. The presenter had a short slide deck, and didn't really present anything new to me other than areas where other people have made major mistakes. And to PLAN on having one of the linux migrations go all lost-data on you. He recommended SAN snapshots. It shortly digressed into "Migrating a NetWare Cluster to Linux HA", which is a different session altogether. So I left.

TUT215 Integrating Macintosh with Novell
A very good session. The CIO of Novell Canada was presenting it, and he is a skilled speaker. Apparently Novell has written a new AFP stack from scratch for OES2 Sp1, since NETATALK is comparatively dog slow. And, it seems, the AFP stack is currently outperforming the NCP stack on OES2 SP1. Whoa! Also, the Banzai GroupWise client for Mac is apparently gorgeous. He also spent quite a long time (18 minutes) on the Kanaka client from Condrey Consulting. The guy who wrote that client was in the back of the room and answered some questions.



Thursday, March 13, 2008

Brainshare Sponsors

In order to keep costs to us walking sales leads down, Novell solicits sponsors for BrainShare to help subsidize the whole event. There is nothing wrong with that, it means a lot of potential freebies for the people who are good at saying No politely ;).

So I'm offering this list of companies who have booths at BrainShare, what Novell product they're primarily interested in, and how it relates to me. The PDF I'm sucking this off of is this one of the Sponsor Hall.

  • SAP. The 'Cornerstone Sponsor'. I think everyone who reads my blog knows what they do. At a guess, their primary interest is in Identity Manager. SCT Banner is the ERP for the .EDU space, so we don't use 'em.
  • IBM. From last year, it's clear this is their Hardware division. So their primary interest is in SLES. We're on a different hardware platform, but... it's hardware. I'll still drop by to look at the pretty.
  • GWAVA. They make message filtering software for GroupWise. If you need anti-spam/virus for your GW installation, you're probably running GWAVA. We don't use GroupWise, so they have nothing I need.
  • GroupLink HelpDesk. A Helpdesk product that appears to be cross-platform. Their product is probably Linux, but it wouldn't surprise me to learn that they still have a lot of NetWare hiding back there. We use Magic Helpdesk for that function.
  • Microsoft. You know who they are. Officially their product is SLES but... who knows what they'll bring. We use a LOT of them around here, what with being an Exchange deployment and owning 96% of the desktops.
  • Messaging Architects. They are a more general email security and archiving provider. Their product is GroupWise, but they also sell some appliances that I could theoretically use in front of our Exchange servers. We've settled on a product from a much bigger vendor for that function, but still.
  • Novacoast IT. A consulting firm specializing in Novell. Their products are a wide gamut of Novell stuff, SLES, ZEN, IDM, and GroupWise. We're a poor .EDU, and can't afford consultants.
  • Honeywell. Honeywell is kind of like GE and IBM, they do a little of everything. I don't know what their Novell tie-in is.
  • Syncsort. They were one of the first backup products to fully support OES1. They are arguably the backup software that supports Novell stuff the best. Their products are SLES, OES, and NetWare. We looked at them when we were looking for a new backup vendor, but they didn't quite measure up for various reasons. I just might drop by.
  • Omni. Another consulting firm that specializes in Novell products, but they also have some discrete products. Their web-site says they do SLES, OES, NetWare, GroupWise, and NetMail (now a Messaging Architects product). We're a poor .EDU, and can't afford consultants.
  • HP. They do hardware. Their booth isn't as big as it was last year, so there will be less pretty to look at. Their product is SLES/OES. They're our hardware vendor, so I'll be talking real good with these folks.
  • Condrey Corporation. Another consulting company specializing in Novell products. They do IDM, Novell Storage Manager, NetWare, and probably OES/SLES. Poor .edu, can't afford 'em. yadda yadda. Also, we built our own IDM stuff so don't need no steeenkin other stuff.
And a bunch more vendors in smaller booths. Some big names (Blackberry), some not so big (idEngines).

There are exceedingly few (two, really) vendors there that can expect to see any of WWU's money any time soon. Nor is that at all likely to change. Our user head-count (21,000+) and FTE count (13,000+) combine to mean that anything that charges per-user is going to be out of our price-range pretty quickly, or will be subjected to a bidding process. We build our own solutions to problems a lot of the time because of this.

Which means that I'm a very poor sales lead.

It also means I feel a bit guilty trading my contact info for Shiny! during Vendor Night since those vendors are sooo going to strike out when they call me in April.



Tuesday, March 11, 2008

New novell.com web site

Novell just updated their web-site.

As in, updated in the last 12 hours or so, so expect some broken links for a while.

Another thing I noticed is a very slight rendering difference between Linux and Windows.

Top left of Novell.com, from Linux
The page as rendered in SeaMonkey from Linux

Top left of Novell.com, from Opera
The page as rendered in Opera from Linux

Top left of Novell.com, from WinXP
The page as rendered in SeaMonkey from WinXP


It's a very simple lay-out thing, but it does indent the page that much. I kinda like it.

What I don't like is that the front page is very flash-heavy. I've had issues with flash on x86-64 machines, so I'm a bit burned by it. That said, I do realize that flash is about as prevalent as the ability to render .PNG files so it's a valid web technology.



Thursday, February 28, 2008

Novell posted 1st Q financials

Normally I don't cover this, but there is a significant thing in there.
Income from continuing operations in the first fiscal quarter 2008 was $15 million, or $0.04 per share. This compares to a loss from continuing operations of $12 million, or $0.04 loss per share, for the first fiscal quarter 2007.
What this means is that Novell just posted a quarter in the black. If I'm remembering right, this is the first one in a couple of years. My pure guess is that this represents a couple of factors:
  • The loss of NetWare customers has slowed down significantly. The large majority of those who are going to jump ship already have, and the remaining ship-jumpers represent a small part of the overall Novell picture. They will still cause the NetWare unit to lose money, but the loss is balanced in other areas now.
  • The SLES business has increased significantly, making up for the loss in NetWare customers. Novell has made many press releases about how well SLES/SLED is doing in the market, and point to the Microsoft deal as a key part of that.
  • Identity Manager, the central software that is in the Gartner Magic Quadrant, continues to do rather well.
Which is good, since it means that Novell will be around for quite some time. It just won't be "that NetWare company" any more. I just hope the OES services continue to see development. We won't know how large the OES segment is until Novell files their SEC paperwork.



Monday, February 25, 2008

First OES2

This weekend I upgraded the one replica server running OES1-Linux to OES2-Linux. It already was at eDir 8.8.2 so the only real changes were to the base OS. It went rather well. The upgrade documentation provided by Novell was just fine. Really, a simple upgrade.

It being done on a Pentium III 1.2GHz machine meant it took a while. But very little in the way of complication. The one hitch was that it changed the certificate the NLDAP server loads to the default, which I didn't catch until a certain service we wrote failed. But that was a very easy fix.



Thursday, February 14, 2008

OES2-SP1 soon to be in closed beta

Novell just announced that OES2 SP1 is going into closed beta.

"What is in this release of Open Enterprise Server

Novell Open Enterprise Server 2 Support Pack 1 refreshes the SUSE Linux Enterprise Server 10 distribution with SLES10 SP2, fixes defects found since the release of OES2 and also adds in the following functionality:

  • Novell engineered CIFS and AFP protocols
  • New version of iFolder (3.7)
  • Updated iPrint with an accounting API
  • 64-bit version of eDirectory
  • Enhanced migration tools and migration GUI
  • Improved performance of the XEN hypervisor
  • Domain Services for Windows
  • NetWare 6.5 Support Pack 8

Note that although Domain Services for Windows is part of OES2 SP1, a separate beta program will be run in order to collate DSfW feedback."

Novell engineered CIFS? I soooo want to know what that is. Is it a completely new CIFS stack, or is it Samba with Novell extensions whacked on? I want to know! The other important bit of information:

The beta test program is currently scheduled to begin mid March and run through October.
Which means there won't be product for my 2008 upgrade window. Fie. Well, at least we'll have ample time to prototype and test for the 2009 upgrade window.




Tuesday, February 05, 2008

Exchange vs Groupwise

A post on CoolSolutions today quoted another blog about why GroupWise makes sense over Exchange. This is some of the same stuff I've seen over the years. A faaaaavorite theme is to point to mass mailer worms taking out Exchange, leaving everyone else fat and running.

On 1/7/07 I wrote about just this sort of thing. A quote:
The days of viruses and other crud scaring people off of Exchange are long gone. Now the fight has to be taken up on, unfortunately, features and mind-share. In the absence of a scare like Melissa provided, migrations from Exchange to something else will be driven by migration events. Microsoft may be providing just that threshold in the future, as they've said that they will be integrating Exchange in with SharePoint to create the End All Be All of groupware applications. Companies that aren't comfortable with that, or haven't deployed SharePoint for whatever reason may see that as an excuse to jump the Microsoft ship for something else. Unfortunately, it'll be executives looking for an excuse rather than executives seeing much better features in, say, GroupWise.
Which, 13 months later, is still mostly true. Mass mailer worms are no longer the scourge they used to be, and are well handled by commercial AV packages. Mass mailer worms even look different these days, preferring to infest and send mail independent of the mail client directly to the internet, thus neatly bypassing the poor meltable Exchange servers. The fear of mass mailers is FUD leftovers from years ago, not a current threat or reason to get off of the dominant platform.

The other thing I mentioned 13 months ago was 'migration events'. We're coming up on one, in the form of Exchange 2007. As the other blog mentioned, the hardware requirements for Exchange 2007 are a bit higher than for 2003. Speaking as an administrator with a sizable Exchange deployment, the requirement of 64-bit OS is something of a non issue since I'd be using one anyway. For a small office with only 200 users, though, forking out for Windows Server 2003 64 would be expensive.

Another point mentioned is that GroupWise can run on anything, and Exchange (especially Exch2007) won't. Again, as a mail admin for a largish Exchange system that doesn't matter to me since I'll be using newer servers to keep up with the load anyway. Again, for small offices who upgrade their servers whenever the old one completely bakes off, this is a bigger concern.

The other migration point is the Public Folders that Microsoft dropped in Exchange 2007. Or rather, made a lot harder to manage. Their users roasted their account managers hotly enough that Exchange 2007 SP1 reintroduces Public Folder management. We make some use of Public Folders, but I can see an office that makes extensive use of them looking at Exchange 2007 as not the simple plonk-in upgrade that Exchange 2003 was from Exch 2000. GroupWise doesn't have a similar concept to Public Folders (Resources might be, but only sort of), so this doesn't help GW much, but it is the sort of event that makes an organization really think about what they're moving to.

As for productivity, we haven't had problems. Our Exchange has about 4300 accounts in it right now. This is supported by three administrators and a lot of automation. That said, during summer vacation season when I'm the only one of us three here I can go whole days without touching anything Exchange. It just works. This is a claim I frequently hear from GroupWise shops, so... Microsoft can do it too eh?

Another thing on CoolSolutions lately has been a few pieces on marketing GroupWise. In short, it makes more sense for Novell to pitch GroupWise as the #2 player than to pitch it as fundamentally better than Exchange. This has some good points. There are some markets where GroupWise is a better fit than Exchange, and the small, infrequently upgraded office is one of them. As are organizations looking really closely at Linux. GroupWise can very well be the #1 mail product in the Linux space, so long as Novell can convince people that paying for email services in Linux is a good idea.

I close out my previous post 13 months ago with a paragraph that still stands:
So, Exchange will be with us a long time. What'll start making the throne wobble is if non-Windows desktops start showing up in great numbers in the workplace. THEN we could see some non-MS groupware application threaten Exchange the way that Mac (and Linux) are threatening the desktop.



Monday, February 04, 2008

Today's 18 year olds...

Over the time I've been here there has occasionally been a list posted in the break-room. This list is the, "Incoming freshmen today...." list of things they know, experience, or haven't experienced. It contains things like:
  • Were born in 1990
  • ...have never known life without cable or satellite TV.
  • ...probably have never seen a rotary dial phone.
  • ...have had internet access for most of their school life.
And other such things. Ostensibly this is to help foster an understanding of where incoming freshmen are coming from, but generally these lists just cause faculty and staff to feel a bit old. In tech circles this sparks conversations about the first computers we used.

Which got me thinking about a few things. One of the items that is frequently put forth about Kids These Days (tm) is that they don't KNOW anything, they just know how to FIND things. There is some debate about this, but it is a common sentiment. I believe that kids these days (KTD) have figured out keyword based searching, and the search engines have gotten good enough at mind-reading that arcane search incantations aren't needed nearly as often as they were in the past.

Before Google, there was AltaVista. This was an era of the internet where boolean search incantations were needed to really narrow down to what you wanted. I didn't switch to Google for a long time because Google didn't have the NEAR search term, which I used on AltaVista as a way to narrow results to be more relevant. I didn't know at the time that Google effectively threw that term in on every search.

Those of us who lived through that era of the internet built up searching skills. I remember some searches I did back then that were pretty complex. I can't remember the exact terms used, but they looked like this:

bootes AND (antaries OR proxima) AND (fulcrum NEAR pinnacle)

I had a logic class in college, so these sorts of parenthetical statements made sense to me. Still do, I just don't end up needing to uncork the boolean logic to find what I need anymore as the search engines have gotten good enough that I don't NEED to do it. I know Google allows much of the above, but I haven't had to do it so I don't know the syntax for it.
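For anyone who never had to write one of these, here is a rough Python sketch of what an engine has to do with that example query. This is not how AltaVista actually implemented it; the function names are made up for illustration, and the 10-word window for NEAR is an assumption on my part.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into a list of words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def has(words, term):
    """True if the term appears anywhere in the document."""
    return term in words

def near(words, a, b, window=10):
    """True if a and b occur within `window` words of each other.
    The default window of 10 is an assumption for this sketch."""
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

def matches(text, window=10):
    """bootes AND (antaries OR proxima) AND (fulcrum NEAR pinnacle)"""
    words = tokenize(text)
    return (has(words, "bootes")
            and (has(words, "antaries") or has(words, "proxima"))
            and near(words, "fulcrum", "pinnacle", window))
```

A page that mentions all the right words but has 'fulcrum' and 'pinnacle' a couple of paragraphs apart fails the NEAR clause, which is exactly why it was useful for narrowing results.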

So I posit that yes, KTD don't know anything, but neither are their search skills robust.

Which brings me to Novell. I got to thinking what a NetWare administrator in 1990 had to know to do their job, and how I could fit into such a hypothetical time.

Right now if I don't know the answer to a problem I have a few methods to figure it out.
  1. Hit the online Novell Knowledge Base over at novell.com/support
  2. Hit the peer-support forums over at forums.novell.com (or nntp://forums.novell.com/ if you prefer old-school)
  3. Pay for a support incident
  4. Ask around the office
In 1990 the options were similar, but a key player was missing:
  1. Hit the peer-support forums over on CompuServe, which required a modem and a CompuServe account.
  2. See if the problem is mentioned in the book-shelf of manuals, which was a big investment to own.
  3. Pay for a support incident.
  4. Ask around the office.
When I first started this Novell Administrator gig in 1997 most of the admins I knew had CompuServe accounts, even though the support forums had officially moved to NNTP. There was still plenty of traffic on the CS servers, though those died out fairly quickly. The office I started in had a subscription to a monthly publication from Novell of their support knowledge base, which I made extensive use of. Somewhere in there Novell made the archives web-searchable and I stopped using the CD's.

As I see it, a NetWare admin of 1990 was on average more knowledgeable about their product than the NetWare admin of 2008. Such administrators avoided the cost of paying for support incidents by having the manuals in hard-copy form, and plonking down real money for CompuServe accounts. If I have a weird problem I'll hit up the Novell KB to see if there is a TID on it, then check the support forums to see if it is mentioned there, before I'll expend an incident on the thing. In time I've managed to teach myself how NetWare works in some very basic ways, simply by troubleshooting oddball problems. This is why I typically end up talking to backline support when I call in, unless the problem is a known issue in the private KB. My skills are probably on par with what was normal 'back in the day'.

I think this holds true for a lot of the tech field. Back then there was a lot of stuff you just had to KNOW. Or failing that, have spent the money to get the backup resources in place (manuals, support contracts). These days a base understanding of how things work is the key to phrasing the right search queries in the online knowledge bases, and less rote memorization (training) can be effective in solving a greater list of problems.

Prosthetic memory! Prosthetic training! The tools of geeks everywhere.



Wednesday, January 30, 2008

I don't think it works that way

We've been having some abend issues on the cluster lately, something with the network services rather than file serving. It seems to be triggered more by iPrint/NDPS than by MyFiles, but both are associated with it. The abend itself is in WS2_32.NLM, so it's in the network stack. I have a call open with Novell.

I finally, finally managed to get a meaningful packet-capture after it fails, and I found some traffic that... doesn't look right. Take a look:

-> NCP Connection Destroy
<- R OK
<- FIN, PSH, ACK
<- RST, ACK
-> ACK (to R OK)
<- RST

Note the three packets in the middle. The responding server is tearing down the connection twice for some reason. Compare this with a 'normal' tear-down:

-> NCP Connection Destroy
<- R OK
-> FIN, PSH, ACK
<- ACK
-> RST, ACK

The first example I gave is the last traffic on the wire before the server abends, so is of course highly suspicious. The pattern that leaps right out is that the responding server is issuing the FIN,PSH,ACK and RST,ACK pair, rather than the sending server, and doing so before the sending server can say "I got it" to the connection close packet.
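To make the comparison mechanical for the next capture, something like the following Python sketch would do. It only encodes the two sequences above by hand; the function names are hypothetical and it does not read real capture files.

```python
# Direction markers follow the traces above: "->" is the side that sent the
# NCP Connection Destroy, "<-" is the responding server.
ABEND_TRACE = [
    ("->", "NCP Connection Destroy"),
    ("<-", "R OK"),
    ("<-", "FIN, PSH, ACK"),
    ("<-", "RST, ACK"),
    ("->", "ACK"),
    ("<-", "RST"),
]

NORMAL_TRACE = [
    ("->", "NCP Connection Destroy"),
    ("<-", "R OK"),
    ("->", "FIN, PSH, ACK"),
    ("<-", "ACK"),
    ("->", "RST, ACK"),
]

def tcp_flags(summary):
    """Split a packet summary such as 'FIN, PSH, ACK' into a set of names."""
    return {part.strip() for part in summary.split(",")}

def first_fin_sender(trace):
    """Return the direction of the first packet carrying FIN, or None."""
    for direction, summary in trace:
        if "FIN" in tcp_flags(summary):
            return direction
    return None

def reset_count(trace, direction):
    """Count packets carrying RST that were sent from the given direction."""
    return sum(1 for d, s in trace
               if d == direction and "RST" in tcp_flags(s))

def responder_closes_first(trace):
    """True when the responding server, not the sender, starts the teardown."""
    return first_fin_sender(trace) == "<-"
```

Run against the two traces, `responder_closes_first` is true only for the pre-abend capture, and `reset_count(ABEND_TRACE, "<-")` shows the responder resetting twice where the normal trace has it resetting not at all.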

Now I need to catch it in the act again to prove this theory.



Friday, January 25, 2008

BrainShare social networking

I am going to BrainShare this year!

It has been interesting to watch the social networking thingy related to BrainShare over the years.

Two years ago, and for many years before that, the primary social group for BrainShare was novell.community.brainshare. This was an NNTP (you remember Usenet?) group hosted on the same servers that host the Novell Support Forums. BrainShare 2006 saw an increase in a certain kind of anti-Novell traffic that was already fairly common in the lead up to BrainShare 2005. The denizens of the group tend to be old time Novell hands, and as you can imagine they were pretty upset about Novell's plans for NetWare. A few very vocal people managed to raise enough of a stink that there wasn't a lot going on in the group for 2006. Unsurprisingly, novell.community.brainshare was removed from the NNTP servers around May 2006 (though the google-groups version of it is still around, see the link).

Last year Novell came up with BrainShare Connect as the social networking thingy. It had forums, blogs, and various other things to try and get attendees hooked up with each other and interacting. It got a reasonable amount of traffic, but many folks who had been regulars of the NNTP group were not there. I checked in every few days to see if anything new was up. For 2006 and 2005 I had checked the NNTP group daily, since there really was that much going on.

This year BrainShare Connect is back, but... they didn't do it right. The same outsourced firm is handling it, but even though it has Web 2.0 stamped all over it the interface is markedly worse than last year. There are no blogs. There are no polls. The interest finders are... weak and obfuscated. The forums are implemented on PhpBB, but done wrong. As an example of the wrong, take a look at this screen shot of me Replying to a thread:

Reply pop-over obscuring everything

What am I replying to? I can't tell. That window can't be moved or resized. I better hope my memory is good. I don't know if this is a new PhpBB feature, a new version came out a while ago, or some customized mod from WingateWeb. Whatever it is, it isn't a good thing. The ability to see what you're replying to greatly eases the flow of conversation.

And the logout screen is particularly interesting, too.

The logout window with weird buttons

What ever happened to "Cancel/OK"? Hasn't that been a de facto standard since, like, the Mac Classic came out 24 years ago? Proceed? I think that's the first time I've ever seen that particular word in that particular spot in an application developed by professionals.

The NNTP group had plenty going for it, but it was spoiled by a few vociferous critics. In the last few months Novell has released a brand new HTTP interface for the support forums that is worlds better than what was there before. Novell could bring this function back in-house if they really wanted to, and I'd support that decision. That said, I do understand why they need/want WingateWeb to handle that function. I just wish they did it better.



A needed patch.

Novell has released a patch for the "ConsoleOne sorting problem."

The sorting problem happens when you have eDir 8.8 installed. Suddenly C1 starts sorting things by creation date rather than as you've ever seen it before. This is... confusing. ConsoleOne 1.3h helped some of it for us, but not all. And now, we have a patch!

Let ConsoleOne Sort Correctly!



Wednesday, January 16, 2008

NetWare library patches

Novell recently split the libc and clib patches for NetWare. For a long time patches like "nwlib6a" included both. Now, they're split.

This just caused me a problem. It turns out that if you have libcsp6b (the LibC patch) applied and not nwlib6k (the CLib patch), there is an abend possibility. It happened yesterday. It turns out that in that case, a badly formed network broadcast can cause an abend. This caused three of my six cluster nodes to fall on their butts at the same time. That was fun. Strange (but good) thing is, I had already applied both patches to these three servers but hadn't gotten around to rebooting them yet. So, by killing themselves they actually fixed the problem.

The abend, key details:

EIP in SERVER.NLM at code start +0015FD27h

Heh heh heh. Oops.

And now a bit of history. Long time NetWare admins can ignore this part.

Q: Why are there two C libraries?

CLIB is the library NetWare started with. It began life in the dark and misty past, probably in the late 1980's. It is the deepest, darkest bowels of NetWare from the era when Novell was it when it came to office networking. Being so old, its APIs are very mature. Applications developed against CLIB generally speaking just plain work.

CLIB is also deprecated since it is highly proprietary, and doesn't play well with others. "Just plain works" in this instance means an assumption of 8.3 names, with kludging to support long file names if at all possible. CLIB applications have a tendency to have IPX dependencies for no good reason.

LIBC was created, IIRC, around the release of NetWare 5.0 when it became possible for NetWare to operate in a "pure IP" environment. LIBC was designed with the concept of POSIX semantics in mind, which CLIB was not. LIBC was created from scratch with long file name support. By now, as of NetWare 6.5 SP7, most of the NetWare kernel is written against LIBC rather than CLIB.

As an example of LIBC vs CLIB, take the 'MyWeb' service this blog is served by. When I did this the first time, it was on NetWare 6.0, using Apache 1.3. Apache 1.3 was linked against CLIB and was very stable. The service notes for the Apache Modules I needed to run to make it work made it clear that supporting long file-names on remote servers was something that only recently started working.

When the migration to NetWare 6.5 came around, it meant I had to migrate MyWeb to Apache 2.0. Apache 2.0 is linked against LIBC and used a different apache module to make things work. I had troubles. The LibC functions were not nearly as mature as their CLIB counterparts, and it showed. 3.5 years later things are now a lot more stable than back then.



Monday, January 07, 2008

I/O starvation on NetWare, another update

I've spoken before about my latency problems on the MSA1500cs. Since my last update I've spoken with Novell at length. Their own back-line HP people were thinking firmware issues too, and recommended I open another case with HP support. And if HP again tries to lay the blame on NetWare, to point their techs at the NetWare backline tech. Who will then have a talk about why exactly it is that NetWare isn't the problem in this case.

This time when I opened the case I mentioned that we see performance problems on the backup-to-disk server, which is Windows. Which is true, when the problem occurs B2D speeds drop through the floor; last Friday a 525GB backup that normally completes in 6 hours took about 50 hours. Since I'm seeing problems on more than one operating system, clearly this is a problem with the storage device.
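To put "through the floor" into numbers, here is the back-of-the-envelope arithmetic as a small Python sketch. It treats 1 GB as 1000 MB and takes the 6 and 50 hour figures at face value, both of which are simplifications.

```python
def throughput_mb_per_s(size_gb, hours):
    """Average throughput in MB/s, treating 1 GB as 1000 MB (an assumption)."""
    return size_gb * 1000 / (hours * 3600)

# The same 525GB backup job, on a good night and on a bad one.
normal = throughput_mb_per_s(525, 6)
degraded = throughput_mb_per_s(525, 50)

# How many times slower the degraded run was.
slowdown = normal / degraded

print(f"normal:   {normal:.1f} MB/s")
print(f"degraded: {degraded:.1f} MB/s")
print(f"slowdown: {slowdown:.1f}x")
```

That works out to roughly 24 MB/s on a normal run versus a bit under 3 MB/s when the problem hits, a slowdown of a little over 8x.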

The first line tech agreed, and escalated. The 2nd line tech said (paraphrased):
I'm seeing a lot of parity RAID LUNs out there. This sort of RAID uses CPU on the MSA1000 controllers, so the results you're seeing are normal for this storage system.
Which, if true, puts the onus of putting up with a badly behaved I/O system onto NetWare again. The tech went on to recommend RAID1 for the LUNs that need high performance when doing array operations that disable the internal cache. Which, as far as I can figure, would work. We're not bottlenecking on I/O to the physical disks, the bottleneck is CPU on the MSA1000 controller that's active. Going RAID1 on the LUNs would keep speeds very fast even when doing array operations.

That may be where we have to go with this. Unfortunately, I don't think we have 16TB of disk-drives available to fully mirror the cluster. That'll be a significant expense. So, I think we have some rethinking to do regarding what we use this device for.
