I think first we should concentrate on getting a clean shutdown and power-up working first. So what *exactly* would you like to see in terms of automatic recovery for an absolutely broken shutdown?
robnicholson wrote: LSLF storage re-created and same test carried out. StarWind v8 does not recover from a planned and controlled power down and power up. You have to manually mark one of the nodes as synchronised and then it jumps back into action.
Cheers, Rob.
anton (staff) wrote: OK, so this is how it works now. After all cluster nodes were down and now are up StarWind is ...
Hmm, sort of, but the devil is in the detail here. It depends what you mean by "all cluster nodes were down". I'm not talking about stopping & starting StarWindService alone here; I'm talking about taking the cluster nodes down by shutting down the Windows server via the normal mechanism.

anton (staff) wrote: ... trying to see whether any of the nodes have recent data (a log with transactions kept on every node) *AND* that data is in an integral state (caches flushed properly so there are no partial transactions).
In my lab, yes, this is the case because there are *no* iSCSI initiators connected to the targets and the system has been left for many minutes before shutting down. Therefore nothing has been written, or could have been written, for 5+ minutes. I therefore assume any residual sync & caches have been flushed?

anton (staff) wrote: If YES then StarWind does automatic sync and powers up virtual LUNs so they can serve customers.
Sorry, that does not happen. I will try and generate a video from the lab.
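Just so we're talking about the same thing, here is how I read the decision logic described above, as a rough sketch only - the names and structure are made up for illustration, not StarWind's actual code:

```python
# Rough sketch of the power-up decision logic as described above.
# All names here (NodeState, pick_sync_source, ...) are made up for
# illustration - this is NOT StarWind's actual code or API.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class NodeState:
    name: str
    last_txn_id: int       # position in the per-node transaction log
    caches_flushed: bool   # True only if shutdown was clean (no partial transactions)

def pick_sync_source(nodes: List[NodeState]) -> Optional[NodeState]:
    """Return the node that should serve as the sync source, or None if no
    node has data that is both recent and in an integral state."""
    candidates = [n for n in nodes if n.caches_flushed]
    if not candidates:
        return None  # this is where a manual 'mark as synchronised' would be needed
    # Of the integral nodes, the one with the most recent transaction wins.
    return max(candidates, key=lambda n: n.last_txn_id)

# Example matching my lab: both nodes shut down cleanly with identical data,
# so either can be the source and the LUNs should come up without manual steps.
nodes = [NodeState("UKMAC-SAN90", 100, True), NodeState("UKMAC-SAN91", 100, True)]
source = pick_sync_source(nodes)
print(source.name if source else "manual intervention needed")
```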
Apologies for the block quotes, but I'd missed Simon's response and it's worthy of further consideration. My suspicion here is that the difference between one node going down & up (in which case auto-sync appears to work) and both nodes going down and up (in the case of a clean shutdown/power-up by the UPS when power fails) is that when the SANs come back up, the one that starts first is unable to see the second node and so assumes an unknown state.

anton (staff) wrote: I've discussed the situation with developers and it looks like there's a major confusion as my information was a bit outdated:

arinci wrote: I'd like to propose my idea, just my 2 cents, on how to manage this kind of situation. You wrote that each node sharing the storage is able to log the time of the last update. In this way each node knows, after having received the relevant info from the other participants, which node keeps the last update of the shared storage... this is good... but what happens if all the nodes are switched off at different moments and then restarted? I'm thinking of two different scenarios:
1 - all the nodes are switched ON at the same time: in this way they are able to talk to each other and rebuild the shared storage properly.
2 - the nodes are switched ON with different timing: in this case, when the StarWind service starts it doesn't know whether the local image holds the last update or not... so the only sensible thing to do is wait for the other node(s) before making the HA storage available via the iSCSI interface. What to do if a server refuses to start? Some manual intervention is needed to mark one of the available participants of the shared storage as synchronised.
My suggestion is the following, in case a fully automatic restore is desired in every case:
1) When the StarWind service starts, it waits for all the participants to be ready, for a defined amount of time (user configurable, let's say 5 minutes).
2) If all the nodes become available in time, then the shared storage can be properly synced and started. If not all the nodes become available in time, then the available nodes select the node that holds the last update...
I'm aware that this kind of automatic logic may be dangerous for certain uses, so I suggest having an option to enable/disable this feature.
What do you think? Is it just a dream, or do you think there is a chance this dream comes true?
Simon
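For what it's worth, Simon's wait-then-elect suggestion boils down to something like the sketch below. It is only an illustration: the timeout, the enable/disable flag and the function names are placeholders I've invented, not anything StarWind exposes today.

```python
import time

# A minimal sketch of Simon's wait-then-elect suggestion. The timeout, the
# enable/disable flag and every function name are placeholders, not real
# StarWind settings.
WAIT_TIMEOUT_S = 5 * 60          # user configurable, e.g. 5 minutes
AUTO_RECOVERY_ENABLED = True     # option to switch the automatic path off

def discover_partners(expected_nodes):
    """Placeholder: return the partner nodes that have responded so far."""
    return set()                 # real code would probe the sync/heartbeat links

def decide_startup(local_node, expected_nodes, last_update_times):
    """last_update_times maps node name -> timestamp of that node's last write."""
    deadline = time.monotonic() + WAIT_TIMEOUT_S
    available = {local_node}
    # Step 1: wait (up to the configured timeout) for all participants.
    while time.monotonic() < deadline and available != set(expected_nodes):
        available |= discover_partners(expected_nodes)
        time.sleep(5)

    # Step 2: everyone showed up in time, so a normal sync is safe.
    if available == set(expected_nodes):
        return "all nodes present: sync normally and bring the HA targets online"
    if not AUTO_RECOVERY_ENABLED:
        return "timeout: wait for a manual 'mark as synchronised'"
    # Otherwise: elect the available node with the newest recorded update.
    newest = max(available, key=lambda n: last_update_times[n])
    return f"timeout: treat {newest} as synchronised and sync the rest from it"
```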
robnicholson wrote: I've produced a video to demonstrate this issue as I'm not sure you believe me!
https://dl.dropboxusercontent.com/u/366 ... Wind01.wmv
Timeline:
- 00:00 Two node StarWind cluster already exists with no storage
- 00:04 Create Storage1 as 5GB of thick HA storage on both nodes
- 00:36 Wait for 2nd node to synchronise - after about 3 minutes, sync finished (I have a separate query about the sync time). I'm watching it read storage1.img
- 02:28 Synchronised according to the display - watch resource meter until read of storage1.img has finished
- 03:05 Showing that there are no targets connected except those used by StarWind itself for sync and heartbeat
- 03:20 Switch to 2nd node and check everything looks okay
- 03:23 Clean shutdown of UKMAC-SAN91 (2nd node)
- 03:48 Node #1 reports it's lost connection to node #2
- 04:00 Clean shutdown of UKMAC-SAN90 (1st node)
- 04:27 Restart both nodes (recording paused here to wait for power-up)
- 04:55 Logon to node #1 - errors about StarWind connections but expected as service is on delayed start-up
- 06:40 Wait for the services on both nodes to start
- 07:40 Connect to both nodes
- 07:50 Both nodes offline, not accepting connections (not that anyone has ever connected in this demo)
- 08:20 One has to manually mark node #1 as synchronised (the manual step!)
- 09:00 Waiting for full synchronisation to occur - WHY full? I'm guessing because I had to do a manual mark-as-synchronised. You saw - nothing was ever written to this storage and it was in sync at 02:28. It took roughly another 3 minutes to sync (I paused the recording), and that was for just 5GB. Consider how long a 5TB device would have taken...
Not riveting viewing, but hopefully this shows the problem. Having to carry out the manual step at 08:20 is the crux of the problem. The above sequence could have easily been triggered by a power outage with a controlled shutdown and power-up via the UPS system.

Also, it demonstrates pretty slow initial replication and re-replication after manually marking as synchronised. My lab isn't very fast, but the disks & network can transfer faster than 5GB in 3 minutes.
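To put a very rough number on that, here is a naive linear extrapolation from the ~5GB in ~3 minutes seen in the video (real throughput would obviously vary):

```python
# Naive linear extrapolation from the ~5GB / ~3 minutes seen in the video.
observed_gb, observed_min = 5, 3
rate_mb_s = observed_gb * 1024 / (observed_min * 60)              # ~28 MB/s
hours_for_5tb = (5 * 1024 / observed_gb) * observed_min / 60      # ~51 hours
print(f"~{rate_mb_s:.0f} MB/s; a full resync of 5TB would take ~{hours_for_5tb:.0f} hours")
```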
So I hope this shows that the cluster was in a clean state with empty caches before shutdown (nothing was ever written to the storage, to be accurate) and that automatic resync DID NOT OCCUR.
Cheers, Rob.
robnicholson wrote: PS. Apologies for teaching granny to suck eggs, but I just want this to work! Also, I appreciate split-brain is a risk, but I think the above circumvents this.
Hello everyone, I'm reopening this discussion to understand if there is anything new on this topic: I'd like to know a bit more from StarWind regarding the next steps and, possibly, a release date for the new feature we talked about.

anton (staff) wrote:
That's fine. Let us do some homework here and I hope we'll keep everybody happy.