Thursday, April 17, 2008

And a gripe

2.5 hours is too freakin' long for "rug lu" to tell me which patches need application to this particular OES2 server. This needs fixing. I hope it's fixed in SLES10 SP2.



Tuesday, April 01, 2008

Slow blogging

I found out at BrainShare that WWU has been accepted as a Novell Authorized Beta site for OES2 SP1. And that's what I've been doing for the better part of the past week. Due to the NDA required, I can't talk about it. So, not much bloggable stuff to bring forward.

We requested entry into the program in part because of what I learned at BrainShare 2007. Specifically, Novell doesn't test at our scale of users. Therefore, it is in our best interest to make sure that organizations like us are in the beta. We have the hardware to make a go of it right now (all those new ESX boxes are liberating some still-useful 3-5 year old servers), and I have the time. Unfortunately, the only 64-bit testing we'll be doing will be in VMware, so the newest of the new code will have to be really tested by other people.

That's why I've been quiet.



Thursday, March 20, 2008

BrainShare Thursday

Not a good day. My first course, "Advanced BASH," could more accurately be described as, "BASH scripting tips & tricks". I then proceeded to skip the other three sessions I had signed up for.
  • Novell Open Enterprise Server 2 Interoperability with Windows and AD. All about Domain Services for Windows and Samba. Neither of which we'll ever use. No idea why I wanted to be in this session.
  • Rapid Deployment of ZENworks Configuration Management. Other people around here have suggested that if we haven't moved yet, wait until at least SP3 before moving. If then. So, demotivated. Plus I was rather tired.
  • Configuring Samba on OES2. CIFS will do what we need, I don't need Samba. Don't need this one. Skipped.
DL236: Advanced BASH Course
BASH tips and tricks. I got a lot out of it, but the developers around me were quietly derisive.

ZEN Overview and Features
Not so much with the futures, but it did explain Novell's overall ZEN strategy. It isn't a coincidence that most of Novell's recent purchases have been for ZEN products.

TUT303: OES2 Clusters, from beginning to extremes
This was great. They had a full demo rig, and they showed quite a bit in it. Including using Novell Cluster Services to migrate Xen VM's around. They STRONGLY recommended using AutoYaST to set up your cluster nodes to ensure they are identical except for the bits you explicitly want different (hostname, IP). They also repeated something I've heard before: you want one LUN for each NSS Pool. Really. Plus, the presenters were rather funny. A nice cap for the day.

And tonight, Meet the Experts!



BrainShare Wednesday

The Wednesday keynote was, indeed, a bunch of demos. It was also mostly pointless as far as the technology I'm concerned with. Lots of GroupWise (don't care), lots and lots of PlateSpin (can't afford it), lots of Zen (not the bits I'd use).

That said, the new GroupWise WebAccess is gorgeous. I wish Exchange had their non-ActiveX pages look that good.

TUT175: RBAC: Avoiding the horror, getting past the hype
Mostly about IDM as it turned out. Only minimally interesting from an abstract viewpoint about roles in general.

TUT 277: Advanced eDirectory Configuration, new features, and tuning for performance
I learned a few things I didn't know, such as the fact that each object has an "AncestorList" attribute listing who its parent objects are. This apparently greatly speeds up searching. SP3, coming out this Summer, will have faster LDAP binds for a couple of reasons. Right now Novell is recommending 2 million objects as a reasonable maximum size for a partition for performance reasons.

And also they reiterated something I've heard before...
You know how back in the NetWare 4 days, we said to design your tree by geography at the first level, and then get to departments? Um, sorry about that. It was great back then, but for LDAP or IDM it really, really slows things down.
Yep. I took my first class for my CNA when 'Green River' was just coming out, or was just out. So I remember that.

TUT221: iPrint on Linux, what Novell Support wants you to know
A nice session from a mainline support guy about the ways people don't do iPrint on Linux correctly. We're not going there until pcounter can run on Linux, so this is still somewhat abstract. But, nice to know.
  • Some iPrint jobs render differently than direct-print jobs because of how Windows is designed. Direct-print jobs render with the 'local print provider', and iPrint jobs render with the 'network print provider'. This is a Microsoft thing, not an iPrint thing. You can duplicate it by setting up a Microsoft IPP printer (assuming you're not mandating SSL like we are) and printing to the same printer with the same driver.
  • The Manager on Linux doesn't use a Broker, it uses a 'driver store'.
  • The Manager on NetWare doesn't always bind to the same broker. I didn't know that.
  • It is recommended to have only one Broker, or one driver store per tree.
  • Novell recommends using DNS rather than IP for your printer-agents, check your manager load scripts.



Tuesday, March 18, 2008

BrainShare Tuesday

Today started off with a bit of panic, as I hadn't set my alarm. Me being a west-coaster, 7:20 (when I woke up) is an entirely reasonable time to get up as far as my body is concerned. Only, I needed to get dressed and breakfasted before my first session at 8:30. Aie! I had to eat quick, but I got there. Didn't get a chance to check work email, though.

ATT326: Advanced Linux Troubleshooting
An ATT, therefore hard to summarize. But I learned about a few new commands I didn't know about before. Like strace. And vimdiff.

TUT130: Challenges in Storage I/O in Virtualization
Another nice one, but an emergency at work (printing down in a dorm, during finals week) distracted me heavily during the first half of it. Which resulted in the following note in my notes:
NPIV looks really nifty. Look into it.
NPIV being how you can use fibre-channel zoning to zone off VM's, rather than HBA's. Highly useful. I also learned about a neat new thing called Virtual Fabrics. Virtual Fabrics work kind of like VLANs for fabrics. You can split one physical fabric into virtual fabrics that share hardware but nothing else. Handy if your, say, Solaris admins don't want you mucking about with their zoning, while saving money through consolidated hardware.

TUT216: OES2 SP1 Architectural Overview
There is a LOT of new stuff in SP1.
  • It will include eDir 8.8.4 (8.8.3 will ship this summer sometime)
  • NCP and eDir will be fully 64-bit
  • OES2 SP1 will be based on SLES10 SP2, which will be releasing about the same time
  • AFP Support
    • AFP 3.1
    • Uses Diffie-Hellman 1 for password exchange, meaning the 8-character password problem is solved.
    • Fully SMP-safe
    • Has cross-protocol locking with NCP. CIFS doesn't have cross-protocol locking yet, but IIRC, Samba does
    • Does not need LUM enabled users
  • CIFS Support
    • NTLMv1, but v2 is a possibility if enough people ask, so file those enhancement requests!!
    • CIFS is separate from Samba, therefore can not be used in conjunction with Domain Services for Windows
    • As with AFP, fully SMP safe
  • EDir 8.8.4
    • LDAP auditing enhanced
    • "newer auth protocols", but they didn't say what.
I should also mention that they're still deploying Novell Integrated Samba, which is what you'll have to use to get Domain Services For Windows. Samba still doesn't scale as far as I'd like ('only' 700-800 concurrent users), so that may be an issue for higher ed types who want high concurrency CIFS and also DSFW on the same box.

TUT211: Enhanced Protocol Support in OES2 SP1
This is the session where they went into detail about the AFP and CIFS support. They said that netatalk, the existing AFP stack on Linux, gets really slow once you go over 20 concurrent users. Whoa! I can soooo understand why Novell felt the need to make a new one.
  • The 8 character password limit has been fixed! They now support DH1 for passing passwords.
  • The 'afptcp' daemon can use one password protocol at a time, so you can only use DH1, or one of the other three I can't remember.
  • Support for OSX 10.1 and 10.2 is scanty, and 10.5 is limited but users may not notice anyway.
  • Passwords will be case sensitive.
  • Kerberos will be in a future release
  • Performance is faster than NetWare, partly due to the ability to multi-thread
  • Can register services by way of SLP
  • Only supports NSS for the time being, the other Linux file-systems will be a future feature.
  • Can support 500 concurrent users, and 1000+ in the future. This fits our current AFP loads.
  • We can configure more about how it works than we could on NetWare, such as how many worker threads to spawn.
  • Has meaningful debug logs!
  • Has a new command, 'afpstat' that works like 'netstat' for giving a snapshot of afp connections.
And then some CIFS stuff. We can't use it for political reasons so I didn't pay attention. Sorry.

Tonight was the night formerly known as 'Sponsor Night,' but has a new name now that everyone who gets a booth is no longer a 'sponsor'. Some are sponsors, some are exhibitors. I can't keep track. Anyway, today was their party. "World of Novellcraft!" Homage to vid-gaming.

Lots of Wii, lots of Rock Band, some Halo, lots of women dressed in Renaissance Festival gear getting their pictures taken by the 90%+ male audience. I've blogged before about my ambivalence about Sponsor Night. I lasted until about 7, when I came back to the hotel.

Tomorrow I have an actual LUNCH BREAK in my schedule! Ooo! And Soul Asylum Soul Coughing Collective Soul plays the concert! I've been listening to two of their CD's for the past two months so I think I may even know a few songs by now.



Monday, March 17, 2008

Today at Brainshare

Monday. Opening day. I had trouble getting to sleep last night due to a poor choice of bed-time reading (don't read action, don't read action, don't read action). And had to get up at 6am body time in order to get breakfast before the morning keynote. There be zombies.

Breakfast was uninspired. As per usual, the hashbrowns had cooled to a gelid mass before I found everything and got a seat.

The Monday keynotes are always the CxO talks about strategy and where we're going. Today a mess of press releases from Novell gives a good idea what the talks were about. Hovsepian was first, of course, and was actually funny. He gave some interesting tidbits of knowledge.
  • Novell's group of partners is growing, adding a couple hundred new ones since last year. This shows the Novell 'ecosystem' is strong.
  • 8700 new customers last year
  • Novell press mentions are now only 5% negative.
Jeff Jaffe came on to give the big wow-wow speech about Novell's "Fossa" project, which I'm too lazy to link to right now. The big concern is agility. He also identified several "megatrends" in the industry:
  • High Capacity Computing
  • Policy Engines
  • Orchestration
  • Convergence
  • Mobility
I'm not sure what 'Convergence' is, but the others I can take a stab at. Note the lack of 'virtualization' in this list. That's soooo 2007. The big problem is now managing the virtualization, thus Orchestration. And Policy Engines.

Another thing he mentioned several times in association with Fossa and agility, is mergers and acquisitions. This is not something us Higher Ed types ever have to deal with, but it is an area in .COM land that requires a certain amount of IT agility to accommodate successfully. He mentioned this several times, which suggests that this strategy is aimed squarely at for-profit industry.

Also, SAP has apparently selected SLES as their primary platform for the SMB products.

Pat Hume from SAP also spoke. But as we're on Banner, and it'll take a sub-megaton nuclear strike to get us off of it, I didn't pay attention and used the time to send some emails.

Oh, and Honeywell? They're here because they have hardware that works with IDM. That way the same ID you use for your desktop login can be tied to the RFID card in your pocket that gets you into the datacenter. Spiffy.

ATT375 Advanced Tips & Tricks for Troubleshooting eDir 8.8
A nice session. Hard to summarize. That said, they needed more time as the Laptops with VMWare weren't fast enough for us to get through many of the exercises. They also showed us some nifty iMonitor tricks. And where the high-yield shoot-your-foot-off weapons are kept.

BUS202 Migrating a NetWare Cluster to OES2
Not a good session. The presenter had a short slide deck, and didn't really present anything new to me other than areas where other people have made major mistakes. And to PLAN on having one of the Linux migrations go all lost-data on you. He recommended SAN snapshots. It shortly digressed into "Migrating a NetWare Cluster to Linux HA", which is a different session altogether. So I left.

TUT215 Integrating Macintosh with Novell
A very good session. The CIO of Novell Canada was presenting it, and he is a skilled speaker. Apparently Novell has written a new AFP stack from scratch for OES2 SP1, since NETATALK is comparatively dog slow. And, it seems, the AFP stack is currently outperforming the NCP stack on OES2 SP1. Whoa! Also, the Banzai GroupWise client for Mac is apparently gorgeous. He also spent quite a long time (18 minutes) on the Kanaka client from Condrey Consulting. The guy who wrote that client was in the back of the room and answered some questions.



Monday, February 25, 2008

First OES2

This weekend I upgraded the one replica server running OES1-Linux to OES2-Linux. It already was at eDir 8.8.2 so the only real changes were to the base OS. It went rather well. The upgrade documentation provided by Novell was just fine. Really, a simple upgrade.

It being done on a Pentium III 1.2GHz machine meant it took a while. But very little in the way of complication. The one hitch was that it changed the certificate the NLDAP server loads to the default, which I didn't catch until a certain service we wrote failed. But that was a very easy fix.



Thursday, February 14, 2008

OES2-SP1 soon to be in closed beta

Novell just announced that OES2 SP1 is going into closed beta.

"What is in this release of Open Enterprise Server

Novell Open Enterprise Server 2 Support Pack 1 refreshes the SUSE Linux Enterprise Server 10 distribution with SLES10 SP2, fixes defects found since the release of OES2 and also adds in the following functionality:

  • Novell engineered CIFS and AFP protocols
  • New version of iFolder (3.7)
  • Updated iPrint with an accounting API
  • 64-bit version of eDirectory
  • Enhanced migration tools and migration GUI
  • Improved performance of the XEN hypervisor
  • Domain Services for Windows
  • NetWare 6.5 Support Pack 8

Note that although Domain Services for Windows is part of OES2 SP1, a separate beta program will be run in order to collate DSfW feedback."

Novell engineered CIFS? I soooo want to know what that is. Is it a completely new CIFS stack, or is it Samba with Novell extensions whacked on? I want to know! The other important bit of information:

The beta test program is currently scheduled to begin mid March and run through October.
Which means there won't be product for my 2008 upgrade window. Fie. Well, at least we'll have ample time to prototype and test for the 2009 upgrade window.




Friday, January 25, 2008

A needed patch.

Novell has released a patch for the "ConsoleOne sorting problem."

The sorting problem happens when you have eDir 8.8 installed. Suddenly C1 starts sorting things by creation date rather than in any order you've seen before. This is... confusing. ConsoleOne 1.3h helped some of it for us, but not all. And now, we have a patch!

Let ConsoleOne Sort Correctly!



Wednesday, January 02, 2008

Where NetWare Fits

NetWare 6.5 still holds top honors in one server niche. Even though it is a 32-bit operating system. That niche is the "large file-server" segment. I define "large" as, "lots of data, way-lots of concurrent users". Yeah, that's highly scientific. But "way-lots" means "over 1000 concurrent" to my thinking.

We regularly run between 1200-6000 concurrent connections on our cluster nodes. This is a density that just doesn't happen all that often in the market. If you have 6000 users close enough together to all talk to the same file-server at LAN speeds using a protocol designed for file-serving (such as NCP, SMB/CIFS, or AFP), you're a big organization. 6000 is a large corporate campus, a large governmental entity of some kind, or a larger .EDU like us. Nationally, the number of 'large' file-servers like that is peanuts compared to the number of 'workgroup' (i.e. under 300 concurrent users) servers out there.

It is therefore no surprise to me that Novell is not devoting a lot of engineering to supporting the top end of this market. While it may pay well, there just isn't enough revenue coming from these customers to try and handle the hardest-to-test use-case: very high concurrency. I find it disappointing because I AM one of those customers (a larger .EDU), but I understand the business drivers supporting the decision.

For the moment, NetWare 6.5 (32bit) is the top-dog performance wise for our environment. That isn't going to stay true for much longer. It would not surprise me to find out that a Windows Enterprise Server (x86_64) with 16GB of RAM can out-perform a NetWare 6.5 (32bit) server with 4GB of RAM, simply due to the added room for a file-cache. What I don't know is how CPU-bound file-serving I/O is on a Windows Enterprise Server, that's the one area that could keep NetWare 6.5 (32bit) on top. I already know that OES2-Linux out-performs NetWare for NCP traffic, so long as you stay within CPU bounds.

For high-concurrency applications, as far as I know NetWare still wins.



Wednesday, December 19, 2007

eDir 8.8 is in

And as far as upgrades go, it was pretty much a non-event.

Whenever you do upgrades like this you always wonder if those balls you're juggling are tennis-balls or grenades. It took about a half hour per server and didn't have any significant hitches. The one problem that did surface is that the OES1-linux server's LDAP server had its certificate change from the one it was using to SSL CertificateDNS. This was not good, as that certificate doesn't have the subject-name we need and this caused some S/LDAP binds to fail due to SSL validation problems. That was an easy fix. The LDAP servers on the NetWare boxes didn't change.

This was a tennis-ball upgrade. So far.

We haven't turned on case-sensitive LDAP binds yet, but soon. Soon.

One unexpected side-effect of getting all three eDir servers upgraded to 8.8 like this, is that the Change Cache is now cleaned of those permanent residents we've had for ages. Woo!



Monday, December 17, 2007

Not dead.

Wow, last post was the 30th? Jeez. I was on vacation all last week, which accounts for some of it. And it's looking like I'll be out sick for at least a pair of days with a crud I got while wandering about. Not sharing that with work, nosir.

On my list of things to do during the winter inter-session is to get eDir 8.8 deployed in the production tree. I just need to have ALL the servers in the tree (all, not just replica holders due to backlink updates) up and talking when I do the first one, and that could take some scheduling. This is the first step to OES2, which will be deployed on the eDir servers first.

As soon as I get some new hardware, since they're getting old.



Friday, November 30, 2007

OES2 SP1 timing

Novell just posted the third draft of their OES2 Best Practices guide. Which you can locate here. In that guide is this text:
Domain Services for Windows, which is scheduled to ship with OES 2 SP1 (currently scheduled for late 2008), will also offer some clear advantages.
"Late 2008" means they WILL NOT have SP1 out by August of 2008. This means that the upgrade of our 6 node cluster to OES will have to wait until 2009. Grrarrr!

Another 21 months of a 32-bit operating system on the single biggest storage consumer on campus. We'll have at least one hardware refresh before then for some of the nodes, and... boy I hope they have NetWare drivers for that. The very limited testing I did with NetWare-in-Xen was not encouraging from a performance stand-point. If it looks like I'll have to deploy that way for the next servers we get in the cluster, I'll have to do more real testing to characterize the performance hit (if any). The idea of a 64-bit memory space for file-caching makes me drool. Not getting it for 21 months is painful.

That said, if Novell releases the eDirectory enabled AFP server for OES2-Linux outside of the service-pack I could still make the 2008 window. That's our only dependency on SP1.



Wednesday, November 28, 2007

I/O starvation on NetWare, HP update

Last week I talked about a problem we're having with the HP MSA1500cs and our NetWare cluster. The problem is still there, of course. I've opened cases with both HP and Novell to handle this one. HP because I really think that such command latencies are a defect, and Novell since they're having starvation issues with clusters.

This morning I got a voice-mail from HP, an update for our case. Greatly summarized:
The MSA team has determined that your device is working perfectly, and can find no defects. They've referred the case to the NetWare software team.
Or...
Working as designed. Fix your software. Talk to Novell.
Which I'm doing. Now to see if I can light a fire on the back-channels, or if we've just made HP admit that these sorts of command latencies are part of the design and need to be engineered around in software. Highly frustrating.

Especially since I don't think I've made back-line on the Novell case yet. They're involved, but I haven't been referred to a new support engineer yet.



Wednesday, November 21, 2007

I/O starvation on NetWare

The MSA1500cs we've had for a while has shown a bad habit. It is visible when you connect a serial cable to the management port on the MSA1000 controller and do a "show perf" after starting performance tracking. The line in question is "Avg Command Latency:", which is a measure of how long it takes to execute an I/O operation. Under normal circumstances this metric stays between 5-30ms. When things go bad, I've seen it as high as 270ms.

This is a problem with our cluster nodes. Our cluster nodes can see LUNs on both the MSA1500cs and the EVA3000. The EVA is where the cluster has been housed since it started, and the MSA has taken up two low-I/O-volume volumes to make space on the EVA.

IF the MSA is in the high Avg Command Latency state, and
IF a cluster node is doing a large Write to the MSA (such as a DVD ISO image, or B2D operation),
THEN "Concurrent Disk Requests" in Monitor go north of 1000

This is a dangerous state. If this particular cluster node is housing some higher trafficked volumes, such as FacShare:, the laggy I/O is competing with regular (fast) I/O to the EVA. If this sort of mostly-Read I/O is concurrent with the above heavy Write situation it can cause the cluster node to not write to the Cluster Partition on time and trigger a poison-pill from the Split Brain Detector. In short, the storage heart-beat to the EVA (where the Cluster Partition lives) gets starved out in the face of all the writes to the laggy MSA.
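
The IF/IF/THEN rule above can be written down as a small predicate, which is handy if you are logging these numbers by hand and want to flag the bad samples. This is only a sketch: the 30 ms ceiling and the 1000-request threshold are the figures quoted in this post, while the `NodeSample` structure and the function names are made up for illustration and do not correspond to any NetWare or HP interface.

```python
from dataclasses import dataclass

NORMAL_LATENCY_MAX_MS = 30    # top of the "normal" range seen in "show perf"
DANGER_DISK_REQUESTS = 1000   # "Concurrent Disk Requests" level seen in MONITOR


@dataclass
class NodeSample:
    """One hand-collected observation of a cluster node (hypothetical structure)."""
    msa_avg_command_latency_ms: float
    large_write_to_msa: bool
    concurrent_disk_requests: int


def msa_is_laggy(sample: NodeSample) -> bool:
    """True when the MSA's average command latency is above the normal range."""
    return sample.msa_avg_command_latency_ms > NORMAL_LATENCY_MAX_MS


def at_risk_of_poison_pill(sample: NodeSample) -> bool:
    """True when both IF conditions hold and the disk request queue has backed up."""
    return (
        msa_is_laggy(sample)
        and sample.large_write_to_msa
        and sample.concurrent_disk_requests > DANGER_DISK_REQUESTS
    )
```

A sample such as `NodeSample(270, True, 1500)` is flagged, while the same write load against an MSA at 20 ms is not.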

Users definitely noticed when the cluster node was in such a heavy usage state. Writes and Reads took a loooong time on the LUNs hosted on the fast EVA. Our help-desk recorded several "unable to map drive" calls when the nodes were in that state, simply because a drive-mapping involves I/O and the server was too busy to do it in the scant seconds it normally does.

This is sub-optimal. This also doesn't seem to happen on Windows, but I'm not sure of that.

This is something that a very new feature in the Linux kernel could help with, and that's the concept of 'priority I/O' in the storage stack. I/O with a high priority, such as cluster heart-beats, gets serviced faster than I/O of regular priority. That could prevent SBD abends. Unfortunately, as the NetWare kernel is no longer under development and just under Maintenance, this is not likely to be ported to NetWare.
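
To make the 'priority I/O' idea concrete, here is a toy model of a request queue where heart-beat writes jump ahead of bulk writes. It is not how the Linux kernel's I/O scheduler is actually implemented, and every name in it (`PriorityIOQueue`, `HEARTBEAT`, `REGULAR`) is invented for the illustration. It only shows why a heart-beat submitted behind a pile of laggy writes would still get serviced first.

```python
import heapq
import itertools

HEARTBEAT = 0   # lower number = serviced first
REGULAR = 1


class PriorityIOQueue:
    """Toy request queue: highest priority first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties so equal-priority requests stay in order.
        self._counter = itertools.count()

    def submit(self, priority, description):
        heapq.heappush(self._heap, (priority, next(self._counter), description))

    def service_next(self):
        _priority, _seq, description = heapq.heappop(self._heap)
        return description

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Service everything in the queue and return the order it was serviced in."""
    order = []
    while len(queue):
        order.append(queue.service_next())
    return order
```

Submitting two `REGULAR` ISO writes and then one `HEARTBEAT` write to the Cluster Partition, `drain()` services the heart-beat first and the two bulk writes afterwards in their original order.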

I/O starvation. This shouldn't happen, but neither should 270ms I/O command latencies.



Monday, October 15, 2007

Peer-to-peer sharing

One feature that has shown up in some applications and widgets lately has gained some traction internally. That is the concept of peer to peer sharing of disk space without going through all the pain of getting things approved and formally set up. The general idea is this one.

I want to share U:\SharedStuff\ApacheGroup\ to five other users. U: is my home directory, which is actually map-rooted so I don't see the top level directory. So I go to a web page and tell it I want to share this directory, to these people, for this long. Go.

It struck me that this sort of thing can be engineered with NetWare and OES. The key components are eDirectory, NSS, and NetStorage.

The web server takes the request and translates $Path into a real path by referencing the HomeDirectory attribute of the user who requested the share. Then, using LDAP it creates two objects:

A Group Object
  • Created and named dynamically
  • [AuxClass] Attribute with user-defined name
  • [AuxClass] Attribute with the creator
  • [AuxClass] Attribute with the expiry date
  • Since this is eDirectory, group memberships apply immediately rather than taking a logout/login cycle to refresh the access token like in MS networks.
A Storage Location Object
  • Created & named dynamically
  • Associated to the created group
  • Assigned to the specified users
  • This allows the share to show up in NetStorage
The web server sends a request to a file daemon that handles the actual trustee assignment.
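
The path translation and the group bookkeeping described above might look roughly like this. It is a sketch under assumptions: the `USERS:\home\jdoe` home directory in the usage note, the `share-` naming scheme, and all function and field names are hypothetical, and the actual LDAP writes and trustee assignment are deliberately left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta
import uuid


def resolve_share_path(home_directory: str, mapped_path: str, drive: str = "U:") -> str:
    """Turn a path on the map-rooted drive into a path under the real home directory.

    home_directory stands in for the value read from the user's HomeDirectory
    attribute; the string format used here is an assumption.
    """
    if not mapped_path.upper().startswith(drive.upper()):
        raise ValueError(f"{mapped_path!r} is not on {drive}")
    relative = mapped_path[len(drive):].strip("\\")
    base = home_directory.rstrip("\\")
    return f"{base}\\{relative}" if relative else base


@dataclass
class ShareGroup:
    """The data the dynamically created Group object would carry."""
    name: str           # generated object name
    display_name: str   # user-defined name (AuxClass attribute)
    creator: str        # who asked for the share (AuxClass attribute)
    expires: date       # expiry date (AuxClass attribute)
    members: list       # users the directory is shared to


def build_share_group(creator, display_name, members, days, today=None):
    """Assemble the group record for a share that lasts the given number of days."""
    today = today or date.today()
    return ShareGroup(
        name=f"share-{uuid.uuid4().hex[:8]}",
        display_name=display_name,
        creator=creator,
        expires=today + timedelta(days=days),
        members=list(members),
    )
```

For example, `resolve_share_path("USERS:\\home\\jdoe", "U:\\SharedStuff\\ApacheGroup\\")` gives `USERS:\home\jdoe\SharedStuff\ApacheGroup`, which is the path the file daemon would put the trustee assignment on.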

There is a small constellation of maintenance tasks that also need to be created, such as a janitor process to deal with expirations, a helpdesk view to track who has what shares, a historic view to see what shares got deleted recently that suddenly need to be back RIGHT NOW, something to interface this with whatever disk or directory quota systems are in use.
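
The janitor and the "historic view" are the simplest of those maintenance pieces to picture. A minimal sketch, assuming shares are tracked as a plain mapping of group name to expiry date (a made-up in-memory stand-in for whatever eDirectory query would really be used):

```python
from datetime import date


def expired_shares(active, today):
    """Return the names of shares whose expiry date has passed, sorted."""
    return sorted(name for name, expires in active.items() if expires < today)


def run_janitor(active, history, today):
    """Move expired shares from the active map into the history map.

    Keeping the expired entries in history is what would let the helpdesk
    restore a share that suddenly needs to be back RIGHT NOW.
    """
    removed = expired_shares(active, today)
    for name in removed:
        history[name] = active.pop(name)
    return removed
```

A real janitor would also have to pull the trustee assignment and delete the Group and Storage Location objects; this only shows the expiry bookkeeping.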

The use of NetStorage allows WebDAV to be used as an access method, which allows the shares to be seen. The really brave may be able to leverage DFS to create actual directory structures reflecting the shares in the actual directories so drive mappings can be used; unfortunately I have no idea if a DFS database that large is a good idea.

Users would love this. No need to go through management to get a directory set up on the shared space. You just set up and go. Great for adhoc groups, or small private gatherings.

Unfortunately, this sort of share model is one that a lot of sys-admins are familiar with. If you've ever had a chance to examine the network of a small business with under 15 users, all of whom call themselves 'not that good with computers', you know what I'm talking about. This model of sharing is the one that Windows for Workgroups was designed for, and is still the default mode for plain old WinXP. Excessive use of peer to peer sharing like that can lead to one unholy mess, especially if a key person leaves (or in the case of the Windows example, one hard drive crashes hard).

If left unchecked, you can get whole business processes designed with the assumption that [username] will never retire. That already happens to an alarming extent, but this would make the dependency more invisible to those of us charged with making it all work again when it breaks. You can have shared spaces that are business critical to the company living 100% inside a user's self-managed space, and vulnerable to deletion on termination of that employee.

This is all part of the balance we as system administrators have to keep between end user functionality, and data protection. Desktop techs fight a constant battle to get users to save data on the server where it is backed up, and Novell puts out things like iFolder to help that whole thing become more invisible. We created shared directories to draw a big line between 'my stuff' and 'us stuff'.

That said, data-access habits are changing all the time. My own boss prefers to email a 150KB Excel spreadsheet to all of us, even though all of us have ready access to a shared directory set up just for that. SharePoint integrates with Office to make the web-server look like a file-server. We still have to adapt with the times.

User-directed sharing is something I can see as highly desirable among the student population and faculty as well. Among staff, I'm less sure it's a good idea outside of the 'trivial' personal use we're allowed.



Wednesday, September 26, 2007

OES2 release date

Just got out of the WebCast they had. First, the important stuff:

OES2 will be released on October 5th.
OES2-SP1 is targeted for mid-April, 2008.
AFP integration will be in SP1.

I sooooooooo hope they don't push SP1 past July. If that happens, my main migration of our cluster will have to be pushed to 2009. Ick. We're already running out of effective file-cache in 32-bit memory space. I need 64-bit to really give good performance. Hope hope hope.

A few other minor points:
  • Around the release of SP1, Prosoft and Condrey Consulting (Kanaka) will release an NCP client for Mac.
  • The clearing of throats next to a mic is a sign of someone who doesn't do a lot of work in front of mics.
  • OES2 is fully 64-bit optimized (on Linux)
  • They claim EVEN BETTER NSS performance on OES2. I hope to try that out, as soon as I can figure out how to get SLES10/OES2-beta5 to talk to my SAN LUNs. It hates me.



Tuesday, September 25, 2007

OES2 Web-chat tomorrow

This isn't exactly widely spread, but here it is:

Open Enterprise Server 2 Live Webcast

Tomorrow, September 26th at 11AM PDT.

They'll be talking about all the spiffy that's in OES2, and some new info about code releases. I think this is the 'event' they mentioned a while back.



Tuesday, September 18, 2007

OES2: clustering

I made a cluster inside Xen! Two NetWare VM's inside a Xen container. I had to use a SAN LUN as the shared device since I couldn't make it work doing it just to a single file. Not sure what's up with that. But, it's a cluster, the volume moves between the two just fine.

Another thing about speeds, now that I have some data to play with. I copied a bunch of user directory data over to the shared LUN. It's a piddly 10GB LUN so it filled quick. That's OK, it should give me some ideas of transfer times. Doing a TSATEST backup from one cluster-node to the other (i.e. inside the Xen bridge) gave me speeds on the order of 1000MB/Min. Doing a TSATEST backup from a server in our production tree to the cluster node (i.e. over the LAN) gave me speeds of about 350MB/Min. Not so good.

For comparison, doing a TSATEST backup from the same host only drawing data from one of the USER volumes on the EVA (highly fragmented, but much faster, storage) gives a rate of 550 MB/Min.
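
To put those three TSATEST rates side by side, here is a quick conversion to MB/s and to the time it would take to stream the whole 10GB LUN. The rates are the ones measured above; the helper names and the 1024 MB-per-GB convention are my own assumptions for the arithmetic.

```python
# TSATEST rates from this post, in MB per minute.
RATES_MB_PER_MIN = {
    "node to node over the Xen bridge": 1000,
    "production server to cluster node over the LAN": 350,
    "same host from a USER volume on the EVA": 550,
}


def mb_per_second(mb_per_min):
    """Convert a TSATEST-style MB/min figure to MB/s."""
    return mb_per_min / 60


def minutes_to_copy(size_gb, mb_per_min, mb_per_gb=1024):
    """Minutes needed to stream size_gb of data at the given rate."""
    return size_gb * mb_per_gb / mb_per_min


if __name__ == "__main__":
    for label, rate in RATES_MB_PER_MIN.items():
        print(f"{label}: {mb_per_second(rate):.1f} MB/s, "
              f"{minutes_to_copy(10, rate):.1f} min for 10GB")
```

The LAN path works out to a bit under 6 MB/s, so a full 10GB LUN would take roughly half an hour that way versus about ten minutes across the Xen bridge.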

I also discovered the VAST DIFFERENCES between our production eDirectory tree, which has been in existence since 1995 if the creation timestamp on the tree object is to be believed, and the brand new eDir 8.8 tree the OES2 cluster is living in. We have a heckova lot more attributes and classes in the prod tree than in this new one. Whoa. It made for some interesting challenges when importing users into it.



OES2-beta progress

As mentioned before, I have the OES2 beta. Right now I have two NetWare servers parked in Xen VM's on SLES10SP1. This is how it is supposed to work!

I haven't gotten very far in my testing, but a few things are showing. I managed to do a TSATEST-based throughput run of a backup of SYS. That's about a gig of data. Throughput for just one stream to one of the servers was around 500 MB/min, which is passable and within the realm of real performance for slower hardware. The downside of that is that the CPU reported by "xm top" was around 45%, where the CPU reported in MONITOR was closer to 25%. That's way higher than I expected, but could be related to all the disk I/O ops. This I/O was to a file in the file-system, not a physical device like a LUN on the SAN (that comes later).

Now I'm trying to get Novell Cluster Services installed. I want to get a weensy 2-node cluster set up to prove that it can be done. I suspect it can, but actually seeing it will be very nice.

Labels: , , ,


Thursday, September 13, 2007

OES2: virtualization

I have the beta up and running. I have a pair of OES2-NW servers running in Xen on SLES10SP1. And it loads just spiffy. Haven't done any performance testing on it, kind of hard to really interpret results at this point anyway.

What I HAVE been spending time on is seeing if it is possible to get a cluster set up. Clusters, of course, rely on shared storage. And if it works the way I need it to work, I need multiple Xen machines talking to the same LUNs. It may be doable, but I'm having a hard time figuring it out. The documentation on Xen isn't what you'd call complete. Novell has some in the SLES10SP1 documentation, but the stuff in the OES2 documentation is... decidedly overview-oriented. This is the most annoying thing, as I can't just put my nose to a manual and find it.

So, looking for Xen manual. It has to be around somewhere. Google-fu failed me today.

Labels: , , ,


Monday, September 10, 2007

OES2 public beta is out

Jason Williams said so.

This looks to be Beta5. They released both the Linux and NetWare parts of it. The NW65SP7 overlay iso is 1.1GB in size. I sooooooooooooooooooooo gotta get DVD drives into my servers.

Rumor has it release is now mid-October. So who knows what's going on with the 'launch' on the 26th.

Labels: , ,


Friday, September 07, 2007

The mystery of the OES2 release date

Various sources have pointed at evidence that Novell will be launching OES2 on the 26th. As has been pointed out, "Launch" and "Release" are different things. And yet, at the same time, rumor has it to "watch for events this Monday".

I don't know what to make of that.

It COULD be that the open beta will be out Monday. I have doubts about that, as that leaves very little time for reports to come back from the field for incorporation into OES2-release, presumably on or about the 26th.

It COULD be that it'll be released Monday, and the major PR push for launch will be two and a half weeks later. I have my doubts about that, Novell will be scooped by the likes of me as we put the new product to the test, but it could happen.

It COULD be that Monday is a red herring and Novell will announce a ship date on the 26th, and the opening of the beta. I put more stock into this possibility. The likes of me will swoop up the beta code, run it through its paces and send feedback about what we manage to break, for a presumed ship of OES in November or so.

Or it could be none of these. I guess we'll find out Monday or something.

Labels: , ,


Friday, August 31, 2007

Here's an interesting thing

Novell is putting together a Best Practices guide for migrating to OES2 from NetWare. Obviously this is OES2-Linux, as there is not much that needs migrating when going from OES-NW to OES2-NW. They're soliciting community input for the guides, and will be offering Cool Solutions reward points for contributions.

This is interesting. I know that the Novell Support Forum Sysops tend to build up their own micro guides based on problems people report in the forums, and this is a way to better formalize that. Some of the sysops have taken to using the Cool Solutions Wiki as a place to park boiler-plate answers and forward questioners to those pages. This is an interesting concept.

More interesting as OES2 isn't out yet, even in an open-beta form. Where are we going to get our experience from, eh? This implies that shortly we'll have at least an open beta to try out. I hope so.

I can't contribute much to this document because my main migration is contingent on AFP being eDir integrated, and they've said that'll not happen until probably SP1. If I do anything it'll be the eDir servers, and those are relatively easy migrations. DFS is the only sticking point for that.

Labels: , ,


Wednesday, August 29, 2007

Dynamic Storage Technology, more data

Two days ago Novell posted an AppNote on Dynamic Storage Technology, formerly known as 'shadow volumes'.

Setting up Dynamic Storage Technology with Open Enterprise Server 2

One thing I noticed right at the top of the article is a little blurb that reads:
This article was written for Novell Open Enterprise Server 2. Sign up here to be notified when the Novell Open Enterprise Server 2 open beta becomes available.
Which tells me that the public beta is probably pretty near, and that OES2 release will probably not be "end of Q3" like Jason Williams indicated a while back. I could be wrong, of course. As soon as I get the public beta code there is some serious testing I need to do.

Anyway, back to the article. This is a click-by-click guide for setting up DST. This includes screenshots, which are of the new iManager 2.7. Unsurprisingly, Novell re-themed the iManager interface. There is a gotcha on step 17, where you have to edit a local config file on the OES server to get it going, that would probably trip up most people trying to set up DST by going solely on looking at the UI.

This is a very good article describing it all. I recommend it!

Labels: , , ,


Friday, July 27, 2007

Novell news

Two Cool Blogs posts in the past few days have held some nice tidbits.

Jason Williams says that the Novell Client for Vista is due out mid-August, so long as a key defect registered with Microsoft gets fixed.

Jaimon Jose says that eDir 8.8 SP2 is also due out real soon. SP2 apparently involves some serious performance enhancements.

Both of these are technologies associated with the elusive OES2. We need the Client for Vista as soon as they can get it to us, so I'm not surprised they're considering releasing that independently of OES2. SP2 for eDir 8.8 is one thing I figure will be included in OES2 by default. As that's an independent product as well, having it release independently is nice. This means that two technologies that could be blockers for OES2 are finally being kicked into the real world.

In news unrelated to WWU at all, Bonsai, the next GroupWise version, seems to be getting closer to deployment. They're nearing 'code complete' and will soon start the Authorized Beta phase.

Labels: , ,


Wednesday, July 18, 2007

The OES2 push, what it means to me.

With the release of OES2 pushed to Christmas, or possibly BrainShare 2008, I'm in a hard spot. The magnitude of this migration means that I have one period a year I can pull that off, and that is the last week in August and the first two weeks of September. If I don't have code in that period, I can't migrate. Period.

As I learned at BrainShare this year, the Apple Filing Protocol stack on OES2-Linux is not eDirectory integrated. This is a project stopper for us, so we need that to be in place before we migrate. They quoted us, "Possibly SP1 timeframe, definitely not first-customer-ship, but don't hold us to it." They learned of the AFP problem at BrainShare and said they'd get right on it to try and get that in. That told me that summer 2008 would be the earliest I could expect to have the eDir integrated AFP stack.

Since I don't think Novell is planning on pushing OES2 ship to summer 2008, I suspect the AFP stack will be in with SP1. I consider it likely that OES2 SP1 will ship about the same time as SLE10 SP2. Which means I have real strong doubts that I'll be doing an OES2-Linux migration during next year's intersession. So we'll probably end up staying on NetWare for file-serving at least until 2009. In 2009 those NetWare servers may very well be in either an ESX or Xen virtual container, but it'll still be the 32-bit NetWare code doing the serving. That said, the web and print services (MyFiles, MyWeb, iPrint) may move earlier, as they do not have the same AFP dependency.

Our storage needs on the WUF cluster are already pushing the boundaries of the 32-bit memory space. I'd be a lot happier if I could throw another 2 gigs of RAM at the file-servers in order to keep their cache-levels at a good spot. Can't do that on 32-bit NetWare, at least not while expecting improved performance. In 2009 we'll be managing anywhere from 12 to 18 terabytes of data on WUF, with a good chunk of it active. That is a situation that screams for 64-bit limits to memory space in order to provide zippy performance.

Thus, I am worried. Please, Novell. Ship at Christmas. It'll make my schedules look a LOT less grim.

Labels: , ,


Monday, July 09, 2007

More fun OES2 tricks

I had an idea while I was googling around a bit ago. This may not work the way I expect as I'm not 100% on the technologies involved. But it sounds feasible.

Let's say you want to create a cluster mirror of a 2-node cluster for disaster recovery purposes. This will need at least four servers to set up. You have shared storage for both cluster pairs. So far so good.

Create the four servers as OES2-Linux servers. Set up the shared storage as needed so everything can see what it should in each site. Use DRBD to create new block-devices that'll be mirrored between the cluster pairs. Then set up NetWare-in-VM on each server, using the DRBD block-devices as the Cluster disk devices. You could even do SYS: on the DRBD block-devices if you want a true cluster-clone. That way when disk I/O happens on the clustered resources it gets replicated asynchronously to the DR site. Unlike software RAID1, which only considers a write committed when all mirrored LUNs report the commit, here the I/O is considered committed as soon as it hits local storage.
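The commit difference can be shown with a toy model. This is only a sketch of the two acknowledgement rules described above, not DRBD or RAID code; the function names and all of the timing numbers are hypothetical.

```python
def sync_ack_ms(local_commit_ms, remote_commit_ms, link_round_trip_ms):
    """Mirror-style rule: the write is acknowledged only after every
    copy has committed, so the slowest path sets the latency."""
    return max(local_commit_ms, link_round_trip_ms + remote_commit_ms)


def async_ack_ms(local_commit_ms):
    """Asynchronous rule: the write is acknowledged once local storage
    has it; replication to the DR site catches up afterwards."""
    return local_commit_ms


def unreplicated_mb(write_rate_mb_s, replication_lag_s):
    """Data that exists only at the primary site at a given moment."""
    return write_rate_mb_s * replication_lag_s


# Hypothetical figures for a link between two sites.
LOCAL_MS, REMOTE_MS, ROUND_TRIP_MS = 5.0, 5.0, 20.0

print("synchronous ack: ", sync_ack_ms(LOCAL_MS, REMOTE_MS, ROUND_TRIP_MS), "ms")
print("asynchronous ack:", async_ack_ms(LOCAL_MS), "ms")
print("at risk with 10 MB/s of writes and 2 s of lag:",
      unreplicated_mb(10.0, 2.0), "MB")
```

The trade is visible in the last line: the asynchronous setup keeps write latency local, at the cost of a window of writes that the DR site has not seen yet if the primary site dies.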

Then, if the primary site ever dies, you can bring up an exact replica of the primary cluster, only on the secondary cluster pair. Key details like how to get the same network in both locations I leave as an exercise for the Cisco engineers. But still, an interesting idea.

Labels: , , , ,


Friday, July 06, 2007

Getting creative with Blackboard

I had me an idea yesterday. One of those ideas that I'm not sure is a good one, but wow does it make a certain kind of sense.

We, like all too many schools, run Blackboard as the groupware product supporting our classrooms. There is an open source product out there that can also do this, but we're not running it. That's not what this post is about.

First a wee bit of architecture. Roughly speaking, Blackboard is separated into three bits: the web server, the content server, and the database. The web server is the classic application server that students and teachers interface with. It talks with both the content server and the database server. The content server is the ultimate home of things like turned-in homework. The database server glues this all together.

Due to policies, we have to keep courses in Blackboard for a certain number of quarters just in case a student challenges a grade. They may not be available to everyone, but those courses are still in the system. And so is all of the homework and assorted files associated with that class. Because of this, it is not unusual for us to have 2 years (6-7 quarters) of classes living on the content server, of which all but one quarter is essentially dead storage.

One of the problems we've had is that when it comes time to actually delete a course, it doesn't always clean up the Content associated with that course. Quite annoying.

This is a case where Dynamic Storage Technology would be great. Right now our Blackboard Content servers are a pair of Windows servers in a Windows Cluster. It struck me yesterday that this function could be fulfilled by a pair of OES2 servers in a Novell Clustering Services setup (or Heartbeat, but I don't know how to set THAT up), using Samba and DST to manage the storage. That way stuff accessed in the past, oh, 3 months would be on the fast EVA storage, and stuff older than 3 months would be exiled to the slow MSA storage. As the file-serving is done by way of web-servers rather than direct access, the performance hit from using Samba won't be noticeable, as the concurrency is well below the limit where that becomes a problem. Additionally, since all the files are owned by the same user I could use a non-NSS filesystem for even faster performance.
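Before committing to a split like that, it would help to know how much of the content store would land on each tier. The sketch below is not DST and does not move anything; it only walks a directory tree and totals file sizes on either side of a last-access cutoff. The function name, the 90-day default, and the reliance on access times (which are not trustworthy on volumes that don't record them) are all assumptions on my part.

```python
import os
import time

SECONDS_PER_DAY = 86400


def tier_sizes(root, max_age_days=90, now=None):
    """Return (recent_bytes, stale_bytes) for the files under root.

    A file counts as recent if its last-access time falls within
    max_age_days of now; everything else counts as stale.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    recent = stale = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if info.st_atime >= cutoff:
                recent += info.st_size
            else:
                stale += info.st_size
    return recent, stale
```

Calling `tier_sizes` on a copy of the content share would give a first guess at how big the fast and slow volumes need to be, with the recent total standing in for the EVA side and the stale total for the MSA side.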

Hmmmm......

The problem here is that OES2 isn't out yet. Such a fantastical idea may be doable in the 2008 intersession window, but we may have other upgrades to handle there. But still, it IS an interesting idea.

Labels: , , , ,


Dynamic Storage Technology

Novell Connection Magazine has an article up right now that describes DST, formerly known as Shadow Volumes. I've talked about them before, both last year around this time (6/15/07, and 6/26/07) and back at BrainShare (TUT205). So, I've been following this.

As said previously, this'll not work for NetWare, just OES-Linux. From what I understand you can host migration volumes on NetWare, but the server presenting the unified view of the storage has to be OES-Linux.

Anyway, on with the article.

Labels: , , ,


Tuesday, July 03, 2007

OES2: pushed several months

A new post up on Cool Blogs shows where OES2 is sitting:

http://www.novell.com/coolblogs/?p=921

To quote from one of the comments by the author:
There will be a public beta. It might take couple of months more for a public beta.
This blows my schedule. From the sounds of it, they're looking at a Christmas or possibly BrainShare 2008 release. We'll have to put NetWare inside ESX server instead of a Xen paravirtualization. Due to this delay, and the presumed SP1 schedule, chances are now much worse for Novell to make the summer intersession 2008 migration window.

Crap.

Labels: , , ,


Thursday, June 28, 2007

Novell Client for Vista, in public beta

Announced in Cool Blogs.

On the Beta Page.

Downloads.

Documentation.

Still no word on when OES2 is coming out. This is somewhat disheartening, as I had heard at BrainShare that the OES2 release would be simultaneous with the Novell Client for Vista release. At this point, it is looking like an August release for OES2, which soooo blows my schedule.

Labels: , , ,


Monday, June 18, 2007

New Novell releases

Looks like Novell pushed several products out the door late Friday:
No OES2. No Client for Vista. But SP1 gets me closer to where I need to be.

NCL 2.0 is interesting since the current version is v1.2. The full rev of the version suggests that they made marked improvements to it. I have noticed that they offer both 32-bit and 64-bit versions of the client, which I don't think 1.2 had.

Labels: , , ,


Wednesday, June 13, 2007

Still waiting

Any day now OES2 will come out.

Any day now.

Any day now I'll get a paravirtualizable NetWare and will be able to run it through its paces.

Any day now I'll get to try and figure out how Xen virtualization of NetWare interacts with an HP MSA1500cs.

But not today.

Labels: , , , ,


Monday, April 09, 2007

OES2, not until 2008

The revelation about AFP in OES2 (how did I miss that?) is the last nail. OES2 will not be rolled out to the WUF cluster until August/September 2008 at the earliest. We'll be staying on NetWare until then. We have a couple of Mac labs and at least one class track that depends on AFP support. CIFS is not an option for many reasons.

So we will be waiting until Novell catches up. In the meantime our 'utility' servers could possibly move, but there aren't many of them. The other two NDS servers, and the server that ATUS hosts their Ghost images on. We're already running OES on one of the NDS servers. The other two are the SLPDAs for our environment, and also house the DFS databases.

Labels: , ,


Friday, April 06, 2007

OES2 and AFP

If you're an institution of education like us, chances are real good you have PowerBooks and other Mac hardware desiring access to your NetWare/OES servers. It turns out I missed something while at BrainShare. OES2-Linux does NOT have an eDir integrated AFP stack like NetWare does. Whoa.

Details here: http://www.novell.com/coolblogs/?p=836

That's Jason Williams posting, and he is the Project Manager for OES. I spoke with him for a while during Meet the Experts regarding the concurrency concerns we have with OES in general. He has been on Novell Open Audio several times, so I know his voice. He was run downright ragged during BrainShare, which is not at all surprising given his level of oversight of a major product.

He's asking for people who need AFP to talk to them about it. The details of what he's looking for are in the posting I linked above. I've sent in my own impressions, and I've forwarded it to internal people who are Very Concerned about how Mac interacts with our NetWare servers.

Labels: , , ,


Monday, April 02, 2007

Concurrency, again

I performed another test on Friday for concurrency. I had 9 workstations performing an iozone throughput test. Each machine ran 20 threads, each processing against its own 15MB file, for a total working set size of 2.7GB, which fits into the server's RAM. The results from the workstations were pretty consistent. The workstations had all of 384MB of RAM in them, and the number of IOZone threads running caused significant page-faulting to occur. Which has the side effect of minimizing client-side caching. The workstations were connected to the core by way of 100Mb Ethernet, so maximum theoretical speeds are 12.5MB/s.

Some typical results, units are in KB/s

Test              KB/s
Initial write     11058.47
Rewrite           11457.83
Read               5896.23
Re-read            5844.52
Reverse Read       6395.33
Stride read        5988.33
Random read        6761.84
Mixed workload     8713.86
Random write       7279.35

Consistently, write performance is better than read performance. On the tests that benefit greatly from caching, reverse read and stride read, performance was quite acceptable. All nine machines wrote at near flank speed for 100Mb Ethernet, which means that the 1Gb link the server was plugged in to was doing quite a bit of work during the Initial Write stage.
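The arithmetic behind "near flank speed" can be checked quickly. This is a back-of-the-envelope sketch using the figures quoted in this post; it assumes a KB is 1000 bytes, ignores protocol overhead, and assumes all nine workstations hit the typical Initial Write figure at the same moment. The variable and function names are mine.

```python
# Figures quoted in the post.
WORKSTATIONS = 9
THREADS_PER_WORKSTATION = 20
FILE_MB = 15
INITIAL_WRITE_KB_S = 11058.47  # typical per-workstation result


def line_rate_kb_s(megabits_per_sec):
    """Theoretical maximum rate of a link in KB/s (1 KB = 1000 bytes)."""
    return megabits_per_sec * 1000 / 8


working_set_mb = WORKSTATIONS * THREADS_PER_WORKSTATION * FILE_MB

# Share of each workstation's 100Mb link used during Initial Write.
client_utilization = INITIAL_WRITE_KB_S / line_rate_kb_s(100)

# Combined load arriving at the server's 1Gb link.
aggregate_kb_s = WORKSTATIONS * INITIAL_WRITE_KB_S
server_utilization = aggregate_kb_s / line_rate_kb_s(1000)

print(f"working set: {working_set_mb / 1000:.1f} GB")
print(f"per-workstation link use: {client_utilization:.0%}")
print(f"aggregate into server: {aggregate_kb_s / 1000:.1f} MB/s")
print(f"server link use: {server_utilization:.0%}")
```

Under those assumptions each workstation is using roughly 88% of its link, and the server's gigabit port is carrying close to 100 MB/s, or about 80% of its theoretical ceiling.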

What is perhaps the most encouraging is that CPU loading on the server itself stayed below the saturation level. Having spoken with some of the engineers who write this stuff, this is not surprising. They've spent a lot of effort in making sure that incoming requests can be fulfilled from cache and not go to disk. Going to disk is more expensive in Linux than in NetWare due to architectural reasons. Had the working set been 4GB or larger I strongly suspect that CPU loading would have been significantly higher. Unfortunately, as school is back in session I can't 'borrow' that lab right now as the tests themselves consume 100% of the resources on the workstations. Students would notice that.

The next step for me is to see if I can figure out how large the 'working set' of open files on FacShare is. If it's much bigger than, say, 3.2GB we're going to need new hardware to make OES work for us. This won't be easy. Most of the open-file footprint is Outlook archives (.PST files) for Facilities Management. PST files are low performance critters, so I don't care if they're slow. I do care about things like Access databases, though, so figuring out what my 'active set' actually is will take some figuring.

Long story short: With OES2 and 64 bit hardware, I bet I could actually use a machine with 18GB of RAM!

Labels: , , ,


Thursday, March 29, 2007

Why cache is good

One of my post-BrainShare tasks is to rebenchmark some OES performance. I did a benchmark series back in September and the results there weren't terribly encouraging. I learned at BrainShare that a mid-December NCPSERV patch fixed a lot of performance issues, and I should rerun my tests. Okay, I can do that.

One test I did underlines the need to tune your cache correctly. Using the same iozone tool I've used in the past, I ran the throughput test with multiple threads. Three tests:

20 threads processing against a separate 100MB file (2GB working set)
40 threads processing against a separate 100MB file (4GB working set)
20 threads processing against a separate 200MB file (4GB working set)

The server I'm playing with is the same one I used in September. It is running OES SP2, patched as of a few days ago. 4GB of RAM, and 2x 2.8GHz P4 CPUs. The data volume is on the EVA 3000 on a RAID0 partition. I'm testing OES throughput, not the parity performance of my array. Due to PCI memory, effective memory is 3.2GB. Anyway, the very good table (units are KB/s):
                  20x100M        40x100M        20x200M
Initial write     12727.29193    12282.03964    12348.50116
Rewrite           11469.85657    10892.61572    11036.0224
Read              17299.73822    11653.8652     12590.91534
Re-read           15487.54584    13218.80331    11825.04736
Reverse Read      17340.01892    2226.158993    1603.999649
Stride read       16405.58679    1200.556759    1507.770897
Random read       17039.8241     1671.739376    1749.024651
Mixed workload    10984.80847    6207.907829    6852.934509
Random write      7289.342926    6792.321884    6894.767334
The 2GB dataset fit inside of memory. You can see the performance boost that provides on each of the Read tests. It is especially significant on the tests designed to bust read-ahead optimization such as Reverse Read, Stride Read, and Random Read. The Mixed Workload test showed it as well.

One thing that has me scratching my head is why Stride Read is so horrible with the 4GB data-sets. By my measure about 2.8GB of RAM should be available for caching, so most of the dataset should fit into cache and therefore turn in the fast numbers. Clearly, something else is happening.
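To make the puzzle concrete, here is a small sketch that lines up how much of each working set could fit in cache against how far Reverse Read actually fell. The Reverse Read figures come from the table above and the 2.8GB cache size is my estimate from the paragraph above; the function names are made up for illustration, and the sketch makes no attempt to explain the gap.

```python
CACHE_MB = 2800  # roughly 2.8GB estimated to be available for caching

# Test name -> (threads, MB per file)
TESTS = {
    "20x100M": (20, 100),
    "40x100M": (40, 100),
    "20x200M": (20, 200),
}

# Reverse Read throughput from the table, in KB/s.
REVERSE_READ_KB_S = {
    "20x100M": 17340.01892,
    "40x100M": 2226.158993,
    "20x200M": 1603.999649,
}


def working_set_mb(threads, file_mb):
    """Total data touched when each thread works on its own file."""
    return threads * file_mb


def cacheable_fraction(set_mb, cache_mb):
    """Largest share of the working set that could sit in cache."""
    return min(1.0, cache_mb / set_mb)


baseline = REVERSE_READ_KB_S["20x100M"]
for name, (threads, file_mb) in TESTS.items():
    size = working_set_mb(threads, file_mb)
    fits = cacheable_fraction(size, CACHE_MB)
    kept = REVERSE_READ_KB_S[name] / baseline
    print(f"{name}: {size} MB working set, up to {fits:.0%} cacheable, "
          f"Reverse Read at {kept:.0%} of the in-cache rate")
```

With those numbers, up to 70% of a 4GB set could sit in cache, yet the read-ahead-busting tests come in at around a tenth of the in-cache rate, which is the gap I can't account for.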

Anyway, that is why you want to have a high cache-hit percentage on your NSS cache. This is also why 64-bit memory will help you if you have very large working sets of data that your users are playing on, and we're getting to the level where 64-bit will help. And will help even though OES NCP doesn't scale quite as far as we'd like it to. That's the overall question I'm trying to answer here.

Labels: , , , ,