Snapshotting the full DB and transferring it from the primary DC to the secondary DC every time the DR agent or a replica fails seems like overkill. Is there any enhancement on the way to avoid the full DB transfer in case of a failure of the replica cluster?
The current WAN replication implementation can't resume if the WAN agent, or the node the WAN agent is connected to, fails. We plan to fix that outright, so that any node in either cluster can fail without breaking replication as long as both clusters stay up.
For an ACID database with distributed transactions it comes down to log shipping. If the slave goes down we can buffer logs and then ship/replay them indefinitely, but then the question is at what point it becomes faster to just ship a snapshot plus a new log.
It probably makes sense to buffer on disk the same amount of data as a snapshot, or some low multiple of it; beyond that, shipping a snapshot wins. Right now the value isn't tunable, although I personally would like it to be.
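The buffer-versus-snapshot decision above could be sketched as a simple threshold check. This is a minimal illustration, not VoltDB's actual logic; the function name, the knob, and the 2x default are all hypothetical:

```python
# Hypothetical knob: buffer at most this multiple of a snapshot's size before
# giving up on log replay and shipping a fresh snapshot instead.
SNAPSHOT_SIZE_MULTIPLE = 2.0

def should_ship_snapshot(buffered_log_bytes: int, snapshot_bytes: int,
                         multiple: float = SNAPSHOT_SIZE_MULTIPLE) -> bool:
    """Once the buffered log exceeds some low multiple of a full snapshot,
    transferring and loading the snapshot is likely cheaper than replaying
    the accumulated log."""
    return buffered_log_bytes > multiple * snapshot_bytes

# 20 GiB of buffered log vs. an 8 GiB snapshot: snapshot wins.
print(should_ship_snapshot(20 * 2**30, 8 * 2**30))  # True
# 10 GiB of buffered log vs. an 8 GiB snapshot: keep buffering.
print(should_ship_snapshot(10 * 2**30, 8 * 2**30))  # False
```

The interesting part is making `multiple` tunable, since the right crossover point depends on link bandwidth and how expensive log replay is relative to snapshot load.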
I think many of the WAN replication use cases will have largish amounts of cold data that doesn't need to be shipped, so shipping partial data sets is another possible enhancement.
I don't know if that has been explored in the context of ACID relational databases. Cassandra uses a tree of hashes (a Merkle tree) to diff large data sets and then syncs the portions that don't match, but ACID (and distributed ACID) makes that a little more involved. Cassandra is up front about associating a timestamp with everything, all the way down to the storage layer. VoltDB isn't doing that with transaction IDs, although it's something I think will be useful in the long run.
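The hash-tree diff Cassandra uses can be shown in a few lines. This is a toy sketch of the general technique, not Cassandra's or VoltDB's implementation; it recomputes subtree hashes on each comparison for brevity, where a real system would build and cache the tree once per range:

```python
import hashlib

def leaf_hash(value: bytes) -> bytes:
    return hashlib.sha256(value).digest()

def subtree_hash(hashes, lo, hi):
    """Hash covering leaves [lo, hi); interior hash = hash of child hashes."""
    if hi - lo == 1:
        return hashes[lo]
    mid = (lo + hi) // 2
    return hashlib.sha256(subtree_hash(hashes, lo, mid) +
                          subtree_hash(hashes, mid, hi)).digest()

def diff_leaves(h_a, h_b, lo=0, hi=None):
    """Indexes of leaves that differ; matching subtrees are pruned without
    descending into them, which is the whole point of the tree."""
    if hi is None:
        hi = len(h_a)
    if subtree_hash(h_a, lo, hi) == subtree_hash(h_b, lo, hi):
        return []
    if hi - lo == 1:
        return [lo]
    mid = (lo + hi) // 2
    return diff_leaves(h_a, h_b, lo, mid) + diff_leaves(h_a, h_b, mid, hi)

# Two eight-row "tables" that diverge at rows 2 and 5.
rows_a = [b"row%d" % i for i in range(8)]
rows_b = list(rows_a)
rows_b[2] = b"changed"
rows_b[5] = b"changed-too"
ha = [leaf_hash(r) for r in rows_a]
hb = [leaf_hash(r) for r in rows_b]
print(diff_leaves(ha, hb))  # [2, 5]
```

Only the two divergent rows need to be shipped. The complication for an ACID system is that the hashes have to correspond to some consistent point in the transaction order on both sides, which is where tagging storage with transaction IDs would come in.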