Wednesday, September 03, 2008

EVA4400 + FATA

Some edited excerpts of internal reports I've generated over the last (looks at watch) week:
Key points I've learned:
  • The I/O controllers in the 4400 are able to efficiently handle more data than a single host can throw at it.
  • The FATA drives introduce enough I/O bottlenecks that multiple disk-groups yield greater gains than a single big disk-group.
  • Restripe operations do not cause anywhere near the problems they did on the MSA1500.
  • The 4400 should not block-on-write the way the MSA did, so the NetWare cluster can have clustered volumes on it.
The "Same LUN" test showed that Write speeds are about half those of the single-threaded test, which gives about equal total throughput to disk. The Read speeds are roughly comparable, giving a small net increase in total throughput from disk. Again, not sure why. The Random Read tests continue to perform very poorly, though total throughput in parallel is better than the single-threaded test.

The "Different LUN, same disk-group" test showed similar results to the "Same LUN" test in that Write speeds were about half of single-threaded, yielding a total Write throughput that closely matches single-threaded. Read speeds saw a difference, with significant increases in Read throughput (about 25%). The Random Read test also saw a significant increase in throughput, about 37%, but is still uncomfortably small at a net throughput of 11 MB/s.

The "Different LUN, different disk-group" test did show some I/O contention. For Write speeds, the two writers showed speeds that were 67% and 75% of the single-threaded speeds, yet showed a total throughput to disk of 174 MB/s. Compare that with the fastest single-threaded Write speed of 130 MB/s. Read performance was similar, with the two readers showing speeds that were 90% and 115% of the single-threaded performance. This gave an aggregate throughput of 133 MB/s, which is significantly faster than the 113 MB/s turned in by the fastest Reader test.

Adding disks to a disk-group does not appear to significantly impact Write speeds, but does significantly impact Read speeds. The Read speed dropped from 28 MB/s to 15 MB/s. Again, a backup-to-disk operation wouldn't notice this sort of activity. The Random Read test showed a similar reduction in performance. As Write speeds were not affected by restripe, the sort of cluster hard-locks we saw with the MSA1500 on the NetWare cluster will not occur with the EVA4400.

And finally, a word about controller CPU usage. In all of my testing I've yet to saturate a controller, even during restripe operations. It was the restripe ops that killed the MSA, and the EVA doesn't seem to block nearly as hard. Yes, read performance is dinged, but not nearly to the levels seen on the MSA. This is because the EVA keeps its cache enabled during restripe-ops, unlike the MSA.

One thing I alluded to in the above is that Random Read performance is rather bad. And yes, it is. Unfortunately, I don't yet know if this is an artifact of my testing methodology or what, but it is worrisome enough that I'm figuring it into planning. The fastest random-read speed turned in for a 10GB file, 64KB nibbles, came to around 11 MB/s. This was on a 32-disk disk-group on a Raid5 vdisk. Random Read is the test that most closely approximates file-server or database loads, so it is important.
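
To put that 11 MB/s in perspective, it helps to turn it into reads per second. This is only a back-of-the-envelope sketch: it assumes 1 MB = 1024 KB and that every read is one full 64KB record, and the function name is made up for illustration rather than taken from any benchmark tool.

```shell
# Convert a throughput figure into approximate reads per second.
# Assumption: 1 MB = 1024 KB, and every read is one full record.
# Usage: iops_from_throughput <MB per second> <record size in KB>
iops_from_throughput() {
    echo $(( $1 * 1024 / $2 ))
}

iops_from_throughput 11 64   # 11 MB/s in 64KB reads -> 176
```

Under those assumptions the whole disk-group is serving somewhere around 176 random reads per second.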

HP has done an excellent job tuning the caches for the EVA4400, which makes Write performance exceed Read performance in most cases. Unfortunately, you can't do the same reordering optimization tricks for Read access that you can for Writes, so Random Read is something of a worst-case scenario for these sorts of disks. HP's own documentation says that FATA drives should not be used for 'online' access such as file-servers or transactional databases. And it turns out they really meant that!

That said, these drives' sequential write performance is excellent, making them very good candidates for Backup-to-Disk loads so long as fragmentation is constrained. The EVA4400 is what we really wanted two years ago, instead of the MSA1500.

Still no word on whether we're upgrading the EVA3000 to an EVA6100 this weekend, or next weekend. We should know by end-of-business today.



A history of browsing

With Google Chrome now out and about, it got me thinking about my own browsing habits.

I first heard about this new 'www' thing back in college. I had been on the Internet for a couple years by that point, but http-free. Telnet and FTP were my friends. And so was Gopher. The very first browser I ever used was NCSA Mosaic, which was running on a DEC graphical station in one of the computer labs. It was a whonking big piece of software, so I didn't use it much. But use it I did.

Then someone installed a Netscape Navigator version on one of the CompSci machines and that somehow changed things. If I'm remembering right, it was version 0.97, and had the "pulsing throbbing N" in the upper right corner (in a readme.txt: "And remember, it may be spelled 'Netscape', but it's pronounced 'Mozilla.'"). After the 1.0 version it changed to the 'comet over the earth' logo that would stay with Netscape for the next many years.

The first browser I installed on a machine I owned was a package that I've forgotten the name of. It was a rather cunning use of telnet and the lynx text browser to emulate a graphical browser. This was before my university allowed SLIP or PPP dialups, of course. Once they did that, I could get my own version of Netscape. And did so.

And there I sat for a long number of years. I flirted with Opera a few times. Tried out IE. But, in the end, I stuck with Netscape. Opera just didn't feel right, or render things the way I expected. IE was the evil Microsoft, and I avoided it where possible.

And then... Netscape Communicator got stale. It hung in version 4.7something for aaaaaages. IE started eating Netscape's lunch. And things just got too annoying. At work I started using Opera, version 4.0 IIRC. It worked for the most part, but was still unsatisfactory.

I stuck with Opera for maybe 3 months before moving over to... sigh.... IE. IE5.5 was at the time significantly better than the alternatives. I moved to it exclusively at home, finally ditching Communicator. I believe the reason I moved had to do with IE's 'security zone' architecture, which made a lot of sense for me. Certain sites I wanted to grant 'trusted' status to, and it would just work with no popups. For the rest of the internet I could set different settings. It worked great.

And then I heard about an open-source version of Netscape called Mozilla. I kept an eye on it for a while, waiting for it to become more stable. I installed it on the home Linux machine since obviously IE wouldn't work there. In time, Mozilla matured to the point where it was stable enough for me, and I figured out how to make its security features do what I wanted them to do. I think I formally moved everything to Mozilla shortly after the 1.0 release.

And there I stayed, right up to the point where Mozilla killed Mozilla in favor of Firefox (formerly Firebird). I dutifully switched. And over the course of the next year I got steadily more annoyed with Firefox. Some of which I complained about here. I don't remember how I found out about it, but I learned of the SeaMonkey project that was recreating the Mozilla experience in a fully community-supported way. And that's where I am now.

Unfortunately, SeaMonkey is beginning to look as dated as Communicator was. Firefox 3 may be less annoying than earlier Firefox versions, so I probably have to try it out. Opera is pretty good, and I've spent time using 9.5 already, but the plugin community is weak and it doesn't do exactly what I want when it comes to privacy settings. IE8 is out of the picture since it's Windows-only. And so is Chrome, though I hear that'll be changing.



Tuesday, September 02, 2008

EVA4400 testing

Right before I left Friday I started a test on the EVA4400 with a 100GB file. This is the same file-size I configured DataProtector to use for the backup-to-disk files, so it's a good test size.

Sequential Write speed: 79,065 KB/s
Sequential Read speed: 52,107 KB/s

That's a VERY good number. The Write speed above is about the same speed as I got on the MSA1500 when running against a Raid0 volume, and this is a Raid5 volume on the 4400. During the 10GB file-size test I did before this one, I also watched the EVA performance on the monitoring server, and controller CPU during that time was 15-20% max. Also, it really used both controllers (thanks to MPIO).

Random Write speed: 46,427 KB/s
Random Read speed: 3,721 KB/s

Now we see why HP strongly recommends against using FATA drives for random I/O. For a file server that's 80% read I/O, it would be a very poor choice. This particular random-read test is worst-case, since a 100GB file can't be cached in RAM so this represents pure array performance. File-level caching on the server itself would greatly improve performance. The same test with a 512MB file turns in a random read number of 1,633,538 KB/s which represents serving the whole test in cache-RAM on the testing station itself.

This does suggest a few other tests:
  • As above, but two 100MB files at the same time on the same LUN
  • As above, but two 100MB files at the same time on different LUNs in the same Disk Group
  • As above, but two 100MB files at the same time on different LUNs in different Disk Groups
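
The numbers above are in KB/s, while my other write-ups quote MB/s, so here is a quick conversion for comparing them. This is a sketch that assumes 1 MB = 1024 KB; the helper name is mine, not something the test tool provides.

```shell
# Convert KB/s to MB/s with one decimal place.
# Assumption: 1 MB = 1024 KB.
kb_to_mb() {
    LC_ALL=C awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1024 }'
}

kb_to_mb 79065   # Sequential Write -> 77.2
kb_to_mb 52107   # Sequential Read  -> 50.9
kb_to_mb 46427   # Random Write     -> 45.3
kb_to_mb 3721    # Random Read      -> 3.6
```

Seen that way, the 100GB random-read result is only about 3.6 MB/s.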



Friday, August 29, 2008

Storage update

Both the EVA4400 and the EVA6100 parts arrived late Wednesday. Wednesday I got the EVA4400 partially unboxed, and finished it up Thursday. Got CommandView upgraded so we could manage the EVA4400, and thus lost licensing for the EVA3000. The 10/26 expiry date for that license is no problem, as the EVA3000 will become an EVA6100 well before then. Next weekend if the stars align right.

And today we schlepped the whole EVA4400 to the Bond Hall datacenter.

And now I'm pounding the crap out of it to make sure it won't melt under the load we intend to put on it. These are FATA disks, which we've never used, so we need to figure them out. We're not as concerned with the 6100 since that's FC disks, and they've been serving us just fine for years.

Also on the testing list: making sure MPIO works the way we expect it to.



Wednesday, August 27, 2008

Woot!

The EVAs are scheduled to deliver today! This means that we are very probably going to be taking almost every IT system we have down starting late Friday 9/5 and going until we're done. We have a meeting in a few minutes to talk strategy.

There was some fear that the gear wouldn't get here in time for the 9/5 window. During the 9/12 window, one of the key, key people needed to handle the migration will be in Las Vegas for VMWorld, and he won't be back until 9/21, which also screws with the 9/19 window. The 9/19 window is our last choice, since that weekend is move-in weekend and the outage will be vastly more noticeable with students around. Being able to make the 9/5 window is great! We need these so badly that if we didn't get the gear in time, we'd have probably done it 9/12 even without said key player.

The one hitch is if HP can't do 9/5-6 for some reason. Fret. Fret.



Monday, August 25, 2008

Dynamic partitions in Server 2008 and Cluster

It would seem, and I've yet to trace down definitive proof of this, that Windows Server 2008 Clustering still has the Basic Partitioning dependency in it. This limits Windows LUNs to 2TB, among other annoyances. Such as the fact that resizing one of those puppies requires a full copy onto a larger LUN rather than extending the one you already have. How... 1999.
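
For what it's worth, the 2TB figure falls out of simple arithmetic if you assume (and this is my assumption, not something I've confirmed in the cluster docs) that the limit comes from an MBR-style partition table with 32-bit sector numbers and 512-byte sectors:

```shell
# Largest disk addressable with 32-bit sector numbers and 512-byte sectors.
# Assumption: the 2TB limit is the MBR partition table limit.
sectors=$(( 1 << 32 ))        # 2^32 addressable sectors
sector_bytes=512
max_bytes=$(( sectors * sector_bytes ))

echo "$max_bytes bytes"                   # 2199023255552 bytes
echo "$(( max_bytes / (1 << 40) )) TiB"   # 2 TiB
```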



Email sizes

The question has been raised internally that perhaps we need to reassess what we've set for email message-size limits. When we set our current limit, we set it to the apparent de facto standard for mail size limits, which is about 10 meg.

This, perhaps, is not what it should be for an institution of higher-ed where research is performed. We have certain researchers on campus that routinely play with datasets larger than 10MB, sometimes significantly larger. And these researchers would like to electronically distribute these datasets to other researchers, and the easiest means of doing that by far is email. The primary reason we have mail-servers serving the (for example) chem.wwu.edu domain is to have these folk with much larger message size limits. Otherwise, these folk would have their primary email in Exchange.

The routine answer I've heard for handling really large file sizes is to use "alternate means" to send the file. We don't have an FTP server for staff use, since we have a policy that forbids the use of unencrypted protocols for transmitting passwords and things. We could do something like Novell does with ftp.novell.com/incoming and create a drop-box that anyone with a WWU account can read, but that's sort of a blunt-force solution and by definition half of a half-duplex method. Our researchers would like a full duplex method, and email represents that.

So what are you all using for email size limits? Do you have any 'out of band' methods (other than snail mail) for handling larger data sizes?



Tuesday, August 19, 2008

IPv6 uptake

Not too long ago I asked the question about what our plans were about IPv6. While the telecom guys didn't actually laugh at me, it was clear the question was considered a bit silly. After all, we are the proud owners of a full out class B (140.160.0.0/16) so IPv4 address exhaustion is not something we're likely to run into very soon. Certainly not by 2014 when we should be 'out' of IPv4 address space on the internet. Will IANA repossess our 'unused' spaces? Don't know, probably not.
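
For a sense of scale, here is how many addresses a /16 actually holds. It's a trivial sketch, and the function name is just for illustration:

```shell
# Number of IPv4 addresses covered by a given prefix length.
# Usage: addresses_in_prefix <prefix length>
addresses_in_prefix() {
    echo $(( 1 << (32 - $1) ))
}

addresses_in_prefix 16   # 140.160.0.0/16 -> 65536
```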

That said, us moving to IPv6 will require a few things, none of them internal processes:
  • A bill by the State Legislature mandating IPv6 uptake by all State agencies. We're not subject to the already existing Federal rule.
  • Enough of the general internet is routing IPv6 that the IPv4-over-IPv6 tunneling causes enough headaches we need to move due to user revolts.
  • Some new widget, be it server tech or some kind of net-attached device, only supports IPv6 and we need to get it running.
Of course, if the powers that be here decided that it must be done, and our telecom people fail to talk them out of it, it could still happen.



Monday, August 18, 2008

My favorite compiz plugin

Screenshot

I love that plugin. Ad-hoc screen captures. I use it all the freaking time. It has managed to engrain itself into my expectations about how computers should work to the extent that I started swearing at XP the other day since I couldn't do that. Such a simple thing, but my how effective it is at capturing a quick snippet of what I want. No more do I have to...
  1. PrtScrn the application I want
  2. Open Paint
  3. Paste the screencap
  4. Copy the section I need
  5. Create new canvas
  6. Paste the copied section
  7. Save to file
That's a lot of work. This widget? Much faster and easier.



Enabling autokey auth in NTP on SLES10

The NTP protocol permits the use of crypto to authenticate clients and servers to each other, as well as between time servers. By default, SLES10 is set up to allow the v3 method of using symmetric keys, but not the v4 method that uses public/private keys. If you want to use the v4 method, this is the tip for you.

Background

By default SLES runs NTP inside a chroot jail, which is a more secure way of running it. This can be changed from the YaST NTP config screen if you wish. The chroot jail's root is at /var/lib/ntp/.

Additionally, ntp runs with an AppArmor profile loaded against it for added security.

Getting NTPv4 auth to work

There are 4 steps to get this to work.

  1. Copy the .rnd file to the chroot jail
  2. Run ntp-keygen
  3. Modify the AppArmor profile for /usr/sbin/ntpd to allow read access to the new files
  4. Modify the /etc/ntp.conf file to enable v4 auth.

Copy the .rnd file to the chroot jail

By default, there should be a .rnd file at /root/.rnd. If so, copy this to /var/lib/ntp/etc/.rnd. If there is no file there, one can be generated through use of openssl.

timehost:~ # openssl rand -out /var/lib/ntp/etc/.rnd 1

Run ntp-keygen

Change-directory to /var/lib/ntp/etc, and execute the following command:

timehost:/var/lib/ntp/etc # ntp-keygen -T

This will drop a pair of files in the directory you run it from, so running it while in /var/lib/ntp/etc saves you the step of copying them to this directory.

Modify the AppArmor profile

This is done through YaST:

  1. Launch YaST
  2. Go to the "Novell AppArmor" section, and enter the "Edit Profile" tool.
  3. Select "/usr/sbin/ntpd" and click Next.
  4. Click the "Add Entry" button and select File.
  5. Browse to /var/lib/ntp/etc/.rnd and click the "Read" permissions check-box, and click OK
  6. Repeat the previous two steps to add the two files created by ntp-keygen, named "ntpkey_cert_[hostname]" and "ntpkey_host_[hostname]".
    1. Note: AppArmor behavior changes between SP1 and SP2. In SP1 you can use the link files, in SP2 you need to specify the link targets.
  7. Click Done on the main Profile Dialog
  8. Agree to reload the AppArmor profile

Modify /etc/ntp.conf

The YaST tool for NTP doesn't allow for v4 configurations, so this has to be done on the command line. Open the /etc/ntp.conf file with your editor of choice, and insert the following lines before your "server" lines:

keysdir /var/lib/ntp/etc/
crypto randfile /var/lib/ntp/etc/.rnd

Then append the word "autokey" to the server and peer lines of your choice. At this point, you should be able to restart ntpd, and it will use authentication. This is a very basic NTPv4 configuration, but it should lay the groundwork for more complex configs.
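
Before restarting ntpd, a quick sanity check of the config file can save a round of head-scratching. The helper below is a hypothetical sketch of my own, not part of SLES or the NTP distribution. It only greps for the three pieces described above, and the hostname in the sample file (ntp1.example.com) is a placeholder.

```shell
# Hypothetical helper: report whether an ntp.conf contains the pieces
# described above (keysdir, crypto, and at least one autokey server/peer).
# Usage: check_autokey_conf <path to ntp.conf>
check_autokey_conf() {
    grep -Eq '^[[:space:]]*keysdir[[:space:]]+' "$1" \
        || { echo "missing keysdir line"; return 1; }
    grep -Eq '^[[:space:]]*crypto([[:space:]]|$)' "$1" \
        || { echo "missing crypto line"; return 1; }
    grep -Eq '^[[:space:]]*(server|peer)[[:space:]].*[[:space:]]autokey([[:space:]]|$)' "$1" \
        || { echo "no server or peer line with autokey"; return 1; }
    echo "autokey config looks complete"
}

# Demonstrate against a throwaway sample file rather than the live config.
sample=$(mktemp)
cat > "$sample" <<'EOF'
keysdir /var/lib/ntp/etc/
crypto randfile /var/lib/ntp/etc/.rnd
server ntp1.example.com autokey
EOF
check_autokey_conf "$sample"
rm -f "$sample"
```

On a real box you would point it at /etc/ntp.conf instead. It does not verify that the key files exist or that AppArmor lets ntpd read them.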


