Get rid of those Lingering Objects
What are Lingering Objects?
Objects that exist on 1 or more DCs but not on others (how bizarre, the DCs are supposed to replicate all objects aren’t they?)
How does it happen?
Well, when you delete an object from AD, it is removed from general visibility and is marked with a Tombstone Flag. This flag is replicated to all DCs and sure enough, the object is removed from visibility on all DCs after full propagation.
After a period of time (Tombstone Lifetime [TSL] – 60 days default for forests that started with W2K. 180 days default if the first DC in a forest is W2K3 SP1), the garbage collection process hard deletes these objects.
Ok, scenario: what happens if a DC is unavailable for longer than the tombstone lifetime and only then comes back online? Remember, it won’t receive the tombstone, because the other DCs have already garbage-collected it. Dadaaahhhhhh, it has an object that no other DC has. The object seems to be lingering about when it was supposed to be deleted.
Other ways this can happen are a System State restore from a backup older than the TSL, promoting a DC using Install From Media (IFM) with media older than the TSL, and significant time changes.
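To make the timing concrete, here is a toy simulation of the scenario above. It is only a sketch: the DC class, the replicate() and lingering() helpers and the object names are made up for illustration, and real AD replication is far more involved.

```python
from dataclasses import dataclass, field

TSL_DAYS = 60  # tombstone lifetime; 60 is the W2K-era default mentioned above


@dataclass
class DC:
    name: str
    live: set = field(default_factory=set)          # visible objects
    tombstones: dict = field(default_factory=dict)  # object -> day it was deleted

    def delete(self, obj, day):
        self.live.discard(obj)
        self.tombstones[obj] = day

    def garbage_collect(self, today):
        # Hard-delete tombstones that are older than the TSL.
        self.tombstones = {o: d for o, d in self.tombstones.items()
                           if today - d <= TSL_DAYS}


def replicate(source, dest):
    # Toy replication: dest learns every tombstone the source still holds.
    for obj, day in source.tombstones.items():
        dest.live.discard(obj)
        dest.tombstones[obj] = day


def lingering(reference, suspect):
    # Live on the suspect DC, but neither live nor tombstoned on the reference DC.
    return {o for o in suspect.live
            if o not in reference.live and o not in reference.tombstones}


dc1 = DC("DC1", live={"alice", "bob"})
dc2 = DC("DC2", live={"alice", "bob"})

dc1.delete("bob", day=0)        # DC2 goes offline before it hears about this
dc1.garbage_collect(today=61)   # the tombstone is hard-deleted on DC1
replicate(dc1, dc2)             # DC2 returns on day 61: no tombstone left to receive
print(lingering(dc1, dc2))      # {'bob'}
```

Had DC2 replicated at any point within the 60 days, it would have received the tombstone and lingering() would come back empty.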
How do I prevent it?
Getting a new job isn’t a bad idea. If this isn’t possible, then enable Strict Replication Consistency.
Suppose there is a lingering object on a DC that has been offline for longer than the TSL, and you bring that DC online. We know it’s going to have a few lingering objects. These objects are only replicated when there is a change. Suppose you change any attribute on one of these objects. That change replicates out to all DCs! With loose replication consistency, the object is then “reanimated” on the other DCs.
With Strict Replication Consistency enabled, a DC won’t accept an attribute change to an object that doesn’t exist in its naming context.
NOTE: Before enabling strict replication, ensure that all lingering objects have been cleaned up from the forest or you may have some significant replication issues.
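The difference between loose and strict behaviour can be sketched in a few lines. This is a toy model under stated assumptions, not how the replication engine is implemented: the DC class, receive_update() and the "cn=bob" object are all hypothetical.

```python
class DC:
    def __init__(self, name, objects, strict):
        self.name = name
        self.objects = {dn: dict(attrs) for dn, attrs in objects.items()}
        self.strict = strict  # True = Strict Replication Consistency enabled

    def receive_update(self, dn, attrs, full_object):
        """Apply an inbound attribute change and report what happened."""
        if dn in self.objects:
            self.objects[dn].update(attrs)
            return "updated"
        if self.strict:
            # Strict: refuse a change to an object we do not hold.
            return "rejected"
        # Loose: take the whole object from the source, re-creating it locally.
        self.objects[dn] = dict(full_object)
        return "reanimated"


# A DC that was offline too long still holds cn=bob, and someone edits it.
stale = DC("stale", {"cn=bob": {"title": "old"}}, strict=False)
change = {"title": "new"}
stale.objects["cn=bob"].update(change)

loose_dc = DC("loose", {}, strict=False)
strict_dc = DC("strict", {}, strict=True)

print(loose_dc.receive_update("cn=bob", change, stale.objects["cn=bob"]))   # reanimated
print(strict_dc.receive_update("cn=bob", change, stale.objects["cn=bob"]))  # rejected
```

The "rejected" branch is also why the note above matters: with lingering objects still around, a strict DC keeps refusing changes from the DC that holds them.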
The setting for replication consistency is stored in the registry, in the Strict Replication Consistency entry under
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters. It should be a REG_DWORD with a value of 1 to enable it.
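If you would rather script the change than click through regedit, one option is to wrap the standard reg add command. The sketch below only builds and prints the command line. The function name is made up, and you would run the result on the DC itself from an elevated prompt.

```python
import subprocess

KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters"
VALUE_NAME = "Strict Replication Consistency"


def strict_replication_command():
    # reg add <key> /v <value name> /t <type> /d <data> /f (overwrite without prompting)
    return ["reg", "add", KEY, "/v", VALUE_NAME, "/t", "REG_DWORD", "/d", "1", "/f"]


# Print the command line for review instead of executing it.
print(subprocess.list2cmdline(strict_replication_command()))
```

Printing first and running later is deliberate here: see the note above about cleaning up lingering objects before you switch this on.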
Another consideration: are you using virtualization snapshot software? A snapshot is NOT A BACKUP of Active Directory, because restoring one does not reset the InvocationID, so USN Rollback can happen.
What is USN Rollback?
This is somewhat related to plastic surgery which allows someone with too much money to roll back time. Except, that someone is a DC in the hands of an incompetent IT Admin.
- InvocationID – ID of the directory database on a DC (a proper AD restore changes it)
- USN – Update Sequence Number, a per-DC counter that goes up with every change made on that DC
So, DC1 makes a change to User 1 and registers USN (change number) 5001. DC2 replicates this change. DC2 will now only replicate changes above DC1:5001 because it is already up-to-date to 5001.
Suppose that you “restore” DC1 from a snapshot that was made when its USN was at 4500. Question: why does replication not take place when you create a new object on DC1?
You’ve got it! The new object gets USN 4501, and DC2 believes it already has everything from DC1 up to 5001, so it never asks for it. That’s USN Rollback.
Now, you have a new object on the restored DC not replicating to the other DCs. It’s a Lingering object.
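The DC1/DC2 example can be replayed as a small simulation. Treat it as a sketch under assumptions: the DC class and its methods are invented for illustration, and the "proper restore" case models only the InvocationID reset, nothing else a real restore does.

```python
import uuid


class DC:
    def __init__(self, name, usn=0):
        self.name = name
        self.invocation_id = uuid.uuid4()  # identifies this copy of the database
        self.usn = usn                     # highest USN assigned on this DC
        self.changes = {}                  # usn -> change that originated here
        self.high_water = {}               # partner InvocationID -> highest USN pulled

    def originate(self, description):
        self.usn += 1
        self.changes[self.usn] = description
        return self.usn

    def pull_from(self, partner):
        # Ask only for changes above what we already hold for this InvocationID.
        seen = self.high_water.get(partner.invocation_id, 0)
        new = [d for u, d in sorted(partner.changes.items()) if u > seen]
        self.high_water[partner.invocation_id] = max(seen, partner.usn)
        return new

    def snapshot(self):
        return self.usn, dict(self.changes)

    def restore(self, snap, reset_invocation_id):
        self.usn, self.changes = snap[0], dict(snap[1])
        if reset_invocation_id:
            # What a supported restore does and a VM snapshot revert does not.
            self.invocation_id = uuid.uuid4()


def scenario(reset_invocation_id):
    dc1, dc2 = DC("DC1", usn=4500), DC("DC2")
    snap = dc1.snapshot()               # snapshot taken at USN 4500
    dc1.usn = 5000
    dc1.originate("change User 1")      # registers USN 5001
    dc2.pull_from(dc1)                  # DC2 is now up to date to DC1:5001
    dc1.restore(snap, reset_invocation_id)
    dc1.originate("create User 2")      # registers USN 4501
    return dc2.pull_from(dc1)


print(scenario(reset_invocation_id=False))  # []
print(scenario(reset_invocation_id=True))   # ['create User 2']
```

With the old InvocationID, DC2 compares 4501 against its stored 5001 and asks for nothing. With a fresh InvocationID, DC2 has no stored value for that source and starts from scratch.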
So, when your VMware admin says he’s found a new way to quickly back up AD, you’ll need to educate him. Snapshots are NOT backups.
Ok, how do I find Lingering Objects?
Use repadmin for this, with /removelingeringobjects and the /Advisory_Mode switch, to find these objects.
You need a known-good source DC to start with. Basically, we check whether a different DC holds objects that our source DC does not have. An event 1946 is logged for each lingering object identified.
And to remove them?
Basically, the same command as above, except don’t use the /Advisory_Mode switch. Event 1945 will be created for each object removed.
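For reference, repadmin expects the destination DC, the DSA object GUID of the source DC and the naming context, in that order. A small helper that assembles both variants might look like the sketch below. The helper name, the DC name, the GUID and the naming context are placeholders, not values from a real forest.

```python
def remove_lingering_objects_cmd(dest_dc, source_dc_guid, naming_context, advisory=True):
    # repadmin /removelingeringobjects <dest DC> <source DC GUID> <NC> [/advisory_mode]
    cmd = ["repadmin", "/removelingeringobjects", dest_dc, source_dc_guid, naming_context]
    if advisory:
        cmd.append("/advisory_mode")  # report only (event 1946), remove nothing
    return cmd


# Placeholder values for illustration only.
DEST = "dc2.contoso.com"
SOURCE_GUID = "00000000-0000-0000-0000-000000000000"
NC = "DC=contoso,DC=com"

print(" ".join(remove_lingering_objects_cmd(DEST, SOURCE_GUID, NC)))
print(" ".join(remove_lingering_objects_cmd(DEST, SOURCE_GUID, NC, advisory=False)))
```

Defaulting advisory to True means a forgotten argument produces a report rather than a deletion.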
Using repadmin /removelingeringobjects can be a nightmare in large organizations with many DCs. You need to run it from a source DC against each other writable DC individually, and then start again from a new source until each DC has been used as a source against all other DCs, i.e. N(2(N-1)) runs. So, if you have 500 DCs, this is, erm, one moment, 500 x 2 x 499... can it really be 499,000 commands?
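You can check that number by enumerating the source/destination pairs. The post does not say where the factor of 2 comes from; the sketch below assumes it is an advisory pass followed by a removal pass for every pair, and it counts a single naming context only.

```python
from itertools import permutations


def count_runs(dcs, passes=2):
    # Every DC acts as source against every other DC: ordered pairs, N * (N - 1).
    pairs = sum(1 for _ in permutations(dcs, 2))
    # Assumption: one advisory pass plus one removal pass per pair.
    return pairs * passes


print(count_runs(range(500)))  # 499000
```

Either way the count grows with the square of the number of DCs, which is the real problem.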
Isn’t there an easier way?
Yes, have a look at repldiag.exe. This can automate the process of removing lingering objects, but it requires connectivity to all DCs. It also supports a Test First, Run Later methodology, and you only need to run it once and it’s done. You can find repldiag on CodePlex.
Simply [disclaimer goes here] run it in test mode first:
repldiag /removelingeringobjects /advisorymode
Then run it for real by dropping the /advisorymode switch:
repldiag /removelingeringobjects
I hope this demonstrates how important it is to keep an eye on replication, and possibly to review your backup strategy. If you can’t keep replication to remote branch offices working consistently, it is better not to put a DC there.