StarWind iSCSI vs Microsoft NFS

DavidMcKnight
Posts: 39
Joined: Mon Sep 06, 2010 2:59 pm

Tue Apr 12, 2011 9:25 pm

Well, I spoke to StarWind support and was told the problem is my Areca card: StarWind has a problem talking to Areca. The tech already knew about the issue and said it was on StarWind's side. As soon as he saw my Areca card he had me downgrade to StarWind 5.3.5. That picked up my speed, but I think it's causing some other issues. Since I'm in the process of assembling a new datastore, I removed the Areca-1880 today and replaced it with an LSI-9265. I'm seeing a dramatic improvement with StarWind 5.6 and the LSI over anything I had before. I just wish someone at StarWind had posted this info; it would have saved me some time and money.

Before I put this new datastore into production I'll start up the NFS services again and see how it performs.
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Tue Apr 12, 2011 9:53 pm

Sorry, I had missed the "Areca" keyword in your very first post... So it's actually TWO concurrent things:

1) Lack of caching in StarWind by default (meanwhile NFS *is* cached by the Windows Cache Manager at the file level, like any ordinary network redirector).

2) Areca slow writes and slow non-aligned access in general.

Please keep us updated with your new numbers. Thank you!
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

DavidMcKnight
Posts: 39
Joined: Mon Sep 06, 2010 2:59 pm

Wed Apr 13, 2011 4:48 pm

So I've got Server 2008 R2 SP1 running on bare metal, and its only job is to run StarWind. I've got Intel's 16.1 drivers installed on that Server 2008 R2 SP1 box with no issues... that I'm aware of. Again, this physical server is not a VM. Here are the settings on my iSCSI server compared to what Aitor posted.
Aitor_Ibarra wrote: Advanced: Flow Control : Rx & TX Enabled
Advanced: Interrupt Moderation : Enabled
Advanced: Interrupt Moderation Rate : Adaptive
Advanced: IPv4 Checksum Offload: Rx & TX Enabled
Advanced: Jumbo Packet: 9014 Bytes
Advanced: Large Send Offload (IPv4) : Enabled
Advanced: Large Send Offload (IPv6) : Enabled
Advanced: Maximum number of RSS Processors : 16
How do you calculate the optimal value for this setting?
Advanced: Preferred NUMA node : System Default
Advanced: Priority & VLAN : Priority & VLAN enabled
Advanced: Receive Side Scaling : Enabled
Advanced: Receive Side Scaling Queues : 8 Queues
I have 2; I guess that was the default. How do you calculate the optimal value for this setting?
Advanced: Starting RSS CPU : 0
Advanced: TCP Checksum Offload(IPv4) : Rx & TX Enabled
Advanced: TCP Checksum Offload(IPv6) : Rx & TX Enabled
Advanced: Transmit Buffers: 512 (maybe I should increase)
How do you calculate the optimal value for this setting?
Advanced: UDP Checksum Offload(IPv4):Rx & TX Enabled
Advanced: UDP Checksum Offload(IPv6):Rx & TX Enabled
Advanced: Virtual Machine Queues: Disabled (I will be enabling them though)
This being an iSCSI server, I don't think I need this enabled, do I?
My iSCSI clients are vSphere 4.1 ESXi servers. Since these are proprietary to VMware, I don't know how to check, let alone set, any of these advanced settings for the Intel NIC. Yes, I can (and have) set the jumbo packets. So if anyone has experience configuring Intel's advanced settings inside vSphere, I'd appreciate some direction.
Aitor_Ibarra
Posts: 163
Joined: Wed Nov 05, 2008 1:22 pm
Location: London

Thu Apr 14, 2011 11:58 am

Hi David,

Thanks for the tip about 16.1 - I will have to try again.

I don't know what Intel's drivers are like for ESX. Perhaps they lock down the settings, or expose them only in config files or something. A lot of them are probably Windows-specific anyway.

So you've gone to the new LSI 9265 - probably the world's fastest RAID card! Although I'm surprised at the bad results with the Areca 1880, given that its ROC is an LSI 2108 (the same as in my LSI 9280-4i4e's). The older Arecas I had used an Intel/ARM ROC. Obviously, firmware nuances have a lot to do with performance, not just the ROC.

OK, I don't really know the definitive answers to your questions about the Intel settings...

Max number of RSS processors - I assume this is the max number of CPU cores in your system. Set it to a lower number if you want some cores reserved for other duties? If this is a dedicated server and StarWind itself isn't using all available CPU, I'd let RSS run across every core.

Max RSS queues: I went with 8 because I have 4 cores and 2 threads per core (hyper-threading) - it was an educated guess. It did seem to improve performance over when I had it set to 4. Maybe it could go higher. The notes in the driver just warn of higher CPU usage, but as this server is dedicated to iSCSI and I'm not seeing a CPU bottleneck, a higher number should be worth experimenting with.
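If you want a programmatic starting point for that guess, a trivial sketch like this just counts the logical processors Windows reports and uses that as the initial queue count - it's my rule of thumb, not anything from Intel's documentation:

```c
/* Sketch: derive a starting RSS queue count from the number of logical
 * processors Windows reports (mirrors the 4 cores x 2 threads = 8 guess).
 * This is a rule of thumb, not an Intel-documented formula - benchmark! */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);   /* dwNumberOfProcessors = logical processor count */

    printf("Logical processors: %lu\n", si.dwNumberOfProcessors);
    printf("Starting RSS queue count to try: %lu\n", si.dwNumberOfProcessors);
    return 0;
}
```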

VMQ: leave it off; there is no point in turning it on if you are running StarWind natively.

cheers,

Aitor
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Fri Apr 15, 2011 8:57 am

An on-board I/O processor makes a HUGE difference if the production software is running on top of an OS on bare metal. If you build a SAN with StarWind using your server as the backbone, the only thing you need from your mainboard is a proper DMA engine for SATA access. Your CPU is AGES ahead of any dedicated I/O RISC silicon.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

DavidMcKnight
Posts: 39
Joined: Mon Sep 06, 2010 2:59 pm

Sun May 01, 2011 5:34 am

Ok, I'm pretty much past this whole iSCSI vs. NFS debate. But I have some other questions arising from what I discovered during this process...

There's a problem with StarWind and Areca, fine, but is the problem because I'm going StarWind -> DiskBridge -> Areca? If I had used an ImageFile instead, would I have had this problem at all? To get around it I have to recreate all my datastores, and I'm looking at my options. Is it easier on StarWind to serve an ImageFile target than a DiskBridge target? The way it looks in version 4.6, if I want to use any of the advanced features (HA or DDS) I have to use an ImageFile. To me, the people who want those features are pretty hardcore, and speed would be very important to them. Isn't there a performance hit using an ImageFile? It just seems strange to me that HA still requires an ImageFile.
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Sun May 01, 2011 8:59 pm

Google a bit for "Areca problem" and "Areca performance" to find out who else is having issues with this hardware.

There's no performance degradation when using image files (of any format; upcoming versions will support not only RAW but also VMDK and VHD as native formats). With formatted media it's much easier to align content properly. With RAWFS access (DiskBridge) it's not that easy, so you can hit a boundary penalty (a single I/O touching two stripe blocks).
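To make that boundary penalty concrete, here is a minimal sketch with made-up numbers (a 64 KB stripe block, one 64 KB write, and the classic 63-sector partition offset) showing how the same I/O touches one stripe block when aligned and two when it is not:

```c
/* Sketch: shows how a misaligned partition start makes a single I/O span
 * two RAID stripe blocks (the "boundary penalty" described above).
 * Stripe size and offsets are hypothetical example values. */
#include <stdio.h>
#include <stdint.h>

static int stripes_touched(uint64_t offset, uint64_t length, uint64_t stripe)
{
    uint64_t first = offset / stripe;
    uint64_t last  = (offset + length - 1) / stripe;
    return (int)(last - first + 1);
}

int main(void)
{
    const uint64_t stripe = 64 * 1024;   /* 64 KB stripe block */
    const uint64_t io     = 64 * 1024;   /* one 64 KB write    */

    /* Aligned case: partition starts on a stripe boundary. */
    printf("aligned   : %d stripe block(s)\n", stripes_touched(0, io, stripe));

    /* Misaligned case: classic 63-sector (31.5 KB) partition offset. */
    printf("misaligned: %d stripe block(s)\n", stripes_touched(63 * 512, io, stripe));
    return 0;
}
```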
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

DavidMcKnight
Posts: 39
Joined: Mon Sep 06, 2010 2:59 pm

Mon May 02, 2011 4:54 pm

I still have a hard time understanding how read/write requests that have to go through an ImageFile, which in my mind looks like this: StarWind <-> Windows OS <-> RAID driver <-> RAID card <-> hard drive, could be as fast as RAW, which in my mind looks like: StarWind <-> RAID driver <-> RAID card <-> hard drive.

Anyway, tell me this... I've been under the impression that VMware has some limitations with iSCSI - something about 16 queues per iSCSI LUN and no more than 4 queues per VM on a LUN. So I've been trying to put no more than 4 VMs per iSCSI target, which gives me many smallish iSCSI targets. What is the best practice for implementing ImageFiles in my new datastore design? To use an ImageFile, it seems I have to create a RAID volume, format it NTFS with the same cluster size as the RAID stripe size, then give it a drive letter in Windows. Should I create one large RAID volume and put many ImageFiles on the same drive letter? If, for best performance, I should still put only one ImageFile per drive letter, I'm afraid I'll run out of drive letters on my StarWind iSCSI server.
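Just to spell out the arithmetic I'm working from (the 16-per-LUN and 4-per-VM figures are as I understand them - I haven't verified them against VMware's documentation):

```c
/* Sketch of my sizing logic: with ~16 outstanding commands per iSCSI LUN
 * and ~4 per VM, a LUN saturates at about 4 busy VMs.
 * Both limits are assumptions, not verified VMware figures. */
#include <stdio.h>

int main(void)
{
    const int queue_depth_per_lun = 16;  /* assumed per-LUN limit */
    const int queues_per_vm       = 4;   /* assumed per-VM share  */

    printf("VMs per iSCSI target before queuing contention: %d\n",
           queue_depth_per_lun / queues_per_vm);
    return 0;
}
```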
Aitor_Ibarra
Posts: 163
Joined: Wed Nov 05, 2008 1:22 pm
Location: London

Tue May 03, 2011 2:43 pm

Hi David,

Re the drive letter issue: in Windows, you can mount a volume as a folder on another drive, so running out of drive letters isn't an issue. E.g. you could have three drives mounted as folders under D:\ ...

D:\drive1\image1.img
D:\drive2\image2.img
D:\drive3\image3.img

This does create a dependency on the parent volume, so in the above example, if the disk holding D: goes down, you can't access the other volumes. However, you can then mount them as folders on another volume, and change that volume's drive letter to D:...

It's also possible to skip drive letters entirely and use Windows' internal volume IDs instead, but I'm not sure whether StarWind can store paths to the images via volume IDs.
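If you ever want to script that folder-mount step, a bare-bones Win32 sketch would look something like this - run it elevated, and note that the folder path and volume GUID are placeholders (mountvol lists the real volume names):

```c
/* Sketch: mount a volume into an empty NTFS folder instead of giving it a
 * drive letter, so image files can live at D:\drive1\, D:\drive2\, etc.
 * The volume GUID below is a placeholder; requires administrator rights. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Mount point must be an existing empty directory with a trailing slash. */
    LPCWSTR mountPoint = L"D:\\drive1\\";
    LPCWSTR volumeName = L"\\\\?\\Volume{00000000-0000-0000-0000-000000000000}\\";

    if (!CreateDirectoryW(L"D:\\drive1", NULL) &&
        GetLastError() != ERROR_ALREADY_EXISTS)
    {
        printf("CreateDirectory failed: %lu\n", GetLastError());
        return 1;
    }

    if (!SetVolumeMountPointW(mountPoint, volumeName))
    {
        printf("SetVolumeMountPoint failed: %lu\n", GetLastError());
        return 1;
    }

    printf("Volume mounted at D:\\drive1\\\n");
    return 0;
}
```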

cheers,

Aitor
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Thu May 05, 2011 8:47 pm

No, David, it does not work the way you think. A user-land app issues an I/O request through its I/O subsystem (in our case StarWindService.exe -> kernel32.dll -> ntdll.dll). The request is then routed through the file system driver (NTFS in most cases, where I/O gets cached by the Windows Cache and Memory Manager). Then comes the disk class driver (providing RAW access to the disk volume). Then StorPort.sys/ScsiPort.sys plus a RAID miniport driver, *or* a monolithic port driver like atapi.sys doing backdoor service for ATA devices. Then the RAID controller hardware and RAID card cache, and only then do we reach the disk drive silicon and the disk drive cache. That's the route for image file-based access. For direct volume access you can remove the NTFS driver and the Windows Cache and Memory Manager completely. But since we have our OWN cache implemented, we don't involve the Windows Cache and Memory Manager anyway - all file access is done in write-through, unbuffered mode. So the NTFS impact is close to nothing.
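To illustrate the write-through, unbuffered part, here is a generic Win32 sketch of how any user-mode service can keep the Windows cache out of the path when opening an image file - it is not our actual code, and the image path is a placeholder:

```c
/* Sketch: open an image file so the Windows Cache Manager stays out of the
 * I/O path (unbuffered + write-through). Generic Win32 usage only, not
 * StarWind's implementation; the path below is a placeholder. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE h = CreateFileW(
        L"D:\\drive1\\image1.img",       /* placeholder image path       */
        GENERIC_READ | GENERIC_WRITE,
        0,                               /* no sharing                   */
        NULL,
        OPEN_EXISTING,
        FILE_FLAG_NO_BUFFERING |         /* skip the Windows cache       */
        FILE_FLAG_WRITE_THROUGH,         /* push writes to the device    */
        NULL);

    if (h == INVALID_HANDLE_VALUE)
    {
        printf("CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    /* Reads/writes on this handle must use sector-aligned, sector-sized buffers. */

    CloseHandle(h);
    return 0;
}
```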

I don't think it makes much sense... I'd personally stick with a large thin-provisioned volume per VM instead of having to split all the data into multiple chunks scattered among volumes.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Thu May 05, 2011 8:48 pm

Absolutely true, Aitor!

Yes, we can provide access to GUID-enumerated disk volumes, but they do look ugly, and I've personally never heard of anybody running out of drive letters or UNIX-style mount points :)
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software

DavidMcKnight
Posts: 39
Joined: Mon Sep 06, 2010 2:59 pm

Fri May 06, 2011 1:23 pm

I've been brainstorming this. I think what I've decided to do is create one massive volume on my RAID. Since I'm putting all my VMs on RAID6, I'm just going to put many smaller ImageFiles on what Windows will see as one NTFS drive. I don't see any advantage in creating lots of volume sets on my RAID card and then putting one ImageFile per RAID volume.
anton (staff)
Site Admin
Posts: 4008
Joined: Fri Jun 18, 2004 12:03 am
Location: British Virgin Islands
Contact:

Fri May 06, 2011 9:08 pm

Your approach is exactly the way to go. Just make sure the possible RAID set rebuild time is reasonable. RAID6 is less vulnerable during a rebuild than RAID5, but it still suffers noticeable performance degradation. That could affect StarWind performance and effectively prevent the hypervisor from, say, supporting the VM migration features that are critical in a production environment.
Regards,
Anton Kolomyeytsev

Chief Technology Officer & Chief Architect, StarWind Software
