Error Restoring a Subscriber DB in Merge Replication
According to Microsoft’s BOL restoring a subscriber database involved in merge replication is no problem as long as the database you are restoring is within the retention period of the publication. The documentation is located at http://msdn.microsoft.com/en-us/library/ms152497.aspx
All sounds great and I took that into account when designing a Disaster Recovery plan for a client who heavily relies on Merge Replication to keep remote sites and their central corporate database in sync.
When I went to perform a test case against this strategy, I was met with errors when I restored a 1 day old backup of the subscriber database. The 2 errors were
1. The merge process failed because it detected a mismatch between the replication metadata of the two replicas, such that some changes could be lost leading to non-convergence. This could be due to the subscriber not having synchronized within the retention period, or because of one of the replicas being restored to a backup older than retention period, or because of the publisher performing more aggressive cleanup on articles of type download-only and articles with partition_options = 3. (Source: MSSQL_REPL, Error number: MSSQL_REPL-2147199363)
2. The common generation watermark is invalid at this replica since it does not exist or metadata for changes not yet propagated may have been cleaned up. (Source: MSSQLServer, Error number: 21800)
The architecture of their publication is a mixture of download only and bidirectional articles as well as dynamic filters (partition_options=3) using override of HOST_NAME(). This is a very reasonable publication architecture as the download only articles contains data which can only be changed at the corporate application, such as pricing for their services and tax rates. The partition_options=3 is a replication optimization when using filters that can only go to a single subscriber, such as a SiteID in a table that can belong only to one site. Seems very reasonable to me
We opened a case with Microsoft and I was met with the following response
“If you have download only articles or you are using partition_options=3 you cannot restore to a previous backup if any data has changed since that backup was taken”
That is not documented anywhere in the above BOL link for restoring subscriber database involved in Merge Replication. Their only work around was a re-initialization which is what we are trying to avoid at all costs by coming up with a DR plan in the first place.
Not sure what the end result will be as I have a call with an escalation engineer today as this is HUGE for recoverability. Stay posted for the final decision but I wanted to make everyone aware of this behavior.
Lesson to all readers, take the time to test out your architecture and design decisions regardless of what the official documentation says is possible!!
Great Post Chad! Looking forward to finding out what the resolution is!