Consider the following sequence of events. Servers 1, 2, and 3 are up; server 3 crashes; servers 1 and 2 form a new group; server 2 crashes. Because we want to tolerate network partitions correctly, we force server 1 to fail. However, this is too strict. If server 1 stays alive and server 3 is restarted, servers 1 and 3 can form a new group, because server 1 must have performed all the updates that server 2 could have performed. The rule in general is that two servers can recover if the server that did not fail has a higher sequence number, as in this case it is certain that the new member has not formed a group with the (now) unavailable member in the meantime. We will incorporate this improvement in our directory service in the near future.
A nontrivial corner case: the system forces server 1 to fail, even though it could wait for server 3 and recover normally. The solution is very elegant: compare sequence numbers. If the server that did not fail has a higher sequence number, it is guaranteed to have seen all the updates and can safely rebuild the group.
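The relaxed recovery rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the directory service's code: the `Server` record, its `seqno` and `stayed_up` fields, the function `can_form_group`, and the concrete sequence numbers are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    # All fields are hypothetical; the text only speaks of "sequence numbers".
    name: str
    seqno: int        # sequence number of the last update this server performed
    stayed_up: bool   # True if the server has not failed since its last group


def can_form_group(survivor: Server, rejoiner: Server) -> bool:
    """Relaxed rule: the pair may recover if the survivor never failed and
    has a strictly higher sequence number than the restarted server."""
    return (
        survivor.stayed_up
        and not rejoiner.stayed_up
        and survivor.seqno > rejoiner.seqno
    )


# Scenario from the text, with made-up sequence numbers: server 3 crashed
# at update 10, servers 1 and 2 continued to update 15, then server 2 crashed.
server1 = Server("server 1", seqno=15, stayed_up=True)
server3 = Server("server 3", seqno=10, stayed_up=False)

print(can_form_group(server1, server3))
```

The comparison is strict because the text asks for a higher sequence number; with equal numbers the sketch conservatively refuses to recover.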