Global Catalog Occupancy

I had some fun in the lab the other day. We were testing a procedure that required a number of Domain Controllers to be separated from the rest of the network for a while.

We created two new Domain Controllers and let them replicate and settle for about 24 hours. Then we shut them down, moved them into the separated test network, and booted them up. To our surprise, we were not able to log on to the DCs with our Domain Admins credentials (“Logon server unavailable..”).

Looking at the Domain Controller we had with them, the DCs had registered a number of SRV records all right, but it became apparent that the “GC” records weren’t there. It looked like there was an issue with the Global Catalog. Since all Domain Controllers in our environment were Global Catalogs by default, building the GC should have happened overnight, during our 24-hour wait.
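For reference, here is a small sketch of which SRV record names to look for when checking whether a DC advertises itself as a GC. The name patterns are the ones Netlogon normally registers for a Global Catalog; the forest name "lab.example" and the site name are placeholders, not values from our lab, and the helper function is made up for illustration.

```python
def gc_srv_names(forest, site=None):
    """Return the DNS names under which a Global Catalog registers SRV records."""
    names = [
        f"_gc._tcp.{forest}",
        f"_ldap._tcp.gc._msdcs.{forest}",
    ]
    if site:
        # Site-specific variants of the same two records
        names += [
            f"_gc._tcp.{site}._sites.{forest}",
            f"_ldap._tcp.{site}._sites.gc._msdcs.{forest}",
        ]
    return names


# Placeholder forest and site names; substitute your own.
for name in gc_srv_names("lab.example", "Default-First-Site-Name"):
    print(f"nslookup -type=SRV {name}")
```

If none of the printed lookups returns the DC in question, the DC is not advertising as a GC, which is what we saw.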

Recent events logged in Event Viewer confirmed our hypothesis (we had to reboot the DCs and plug them back into the lab environment with the other DCs):

Event Type: Information
Event Source: NTDS Replication
Event Category: Global Catalog
Event ID: 1578
Date: Date
Time: Time
User: NT AUTHORITY\ANONYMOUS LOGON
Computer: Server Name
Description: Promotion of the local domain controller to a global catalog has been delayed because the directory partition occupancy requirements have not been met. The occupancy requirement level and current domain controller level are as follows.

Occupancy requirement level: 6
Domain controller level: 2

Aha – remembering that the Global Catalog is built off other Domain Controllers, we needed to find out what held our DC back from replicating all Directory Partitions necessary to become a GC. Apparently we were on Level 2, whereas the required level for a DC to advertise itself as a GC is 6.

The article http://technet.microsoft.com/en-us/library/how-global-catalog-servers-work(v=WS.10).aspx has a very good overview of the levels. Here’s a version with shortened descriptions:

Level Description
0 No requirement – just advertise without listing or replicating.
1 At least one read-only directory partition in the site has been added by the KCC (for replication).
2 At least one read-only directory partition in the site has been fully synchronized (synchronized = replicated).
3 All read-only directory partitions in the site have been added by the KCC (at least one has been fully synchronized).
4 All read-only directory partitions in the site have been fully synchronized.
5 All read-only directory partitions in the forest have been added by the KCC (at least one has been fully synchronized).
6 All read-only directory partitions in the forest have been fully synchronized. (Default level for a DC to advertise itself as a GC)

According to the event message, we’re stuck at level 2. That is, we have one or more NCs (naming contexts, i.e. directory partitions) in our replication list, and one or more NCs have replicated successfully (in our own site). BUT there are one or more NCs missing from our replication list, and one or more that still have to replicate in. Hummm… now what? We’ve waited 24 hours for replication to settle, and our lab isn’t too complex after all.
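To make the level logic a bit more tangible, here is a toy model of it. This is only a sketch of the table above, not how the directory service actually computes the level. The Partition class, its fields, and the partition names are all invented for illustration, and we assume the dead child domain's partition still counts as an in-site partition that the KCC could not add.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    in_site: bool  # assumed: a replication source exists in the DC's own site
    added: bool    # the KCC has added the read-only replica
    synced: bool   # the replica has been fully synchronized


def occupancy_level(partitions):
    """Return the highest level (0-6) whose requirement, and all below it, is met."""
    site = [p for p in partitions if p.in_site]
    any_synced = any(p.synced for p in site)
    checks = [
        any(p.added for p in site),                  # level 1
        any_synced,                                  # level 2
        any_synced and all(p.added for p in site),   # level 3
        all(p.synced for p in site),                 # level 4
        all(p.added for p in partitions),            # level 5
        all(p.synced for p in partitions),           # level 6
    ]
    level = 0
    for met in checks:
        if not met:
            break
        level += 1
    return level


# Hypothetical lab: child1 replicated fine, child2 is the dead domain.
lab = [
    Partition("DC=child1,DC=lab,DC=example", in_site=True, added=True, synced=True),
    Partition("DC=child2,DC=lab,DC=example", in_site=True, added=False, synced=False),
]
print(occupancy_level(lab))      # dead partition present
print(occupancy_level(lab[:1]))  # dead partition removed from the forest
```

In this model, a single partition that can never be added is enough to hold the DC at level 2, no matter how long you wait.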

We could have messed with repadmin to see what’s going on and verify what NCs the forest has – but we took the easy way out: talk to the lab manager. According to them, one of the child domains in our lab forest was non-functional and all DCs were tombstoned by now. D’oh! Now that’s a good reason why we’re still waiting on an NC to be added to our list and replicated in. And we could have waited a lot longer…

The ultimate solution to this problem is to fire up NTDSUtil and remove the no-longer-existing child domain from Active Directory. The KCC would pick up on it and understand that there’s nothing else to find here. Eventually, the DC would progress through the levels and start to advertise itself as a GC. But since it was not our lab and we were just guests, we figured we would tell the DCs to advertise themselves as GCs anyway.

As mentioned above, the default occupancy level required for advertising is 6. The article outlines a registry value that changes this default: “Global Catalog Partition Occupancy” in HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\NTDS\Parameters. It normally doesn’t exist yet, so we created it as a DWORD. As its value, we specified the level at which we wanted the DC to start advertising. Lazy as we are, we picked the level we were already at: 2. We changed this on both DCs that we wanted to separate.
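We made the change by hand, but for the record, the same change could be scripted roughly like this. It is a sketch using Python's standard winreg module. The function names are our own, it assumes it runs with administrative rights on the DC itself, and the validation range of 0 to 6 simply mirrors the level table above.

```python
try:
    import winreg  # standard library, but only available on Windows
except ImportError:
    winreg = None

NTDS_PARAMETERS = r"System\CurrentControlSet\Services\NTDS\Parameters"
VALUE_NAME = "Global Catalog Partition Occupancy"


def validate_level(level):
    """Return level unchanged if it is a valid occupancy level (0 to 6)."""
    if not isinstance(level, int) or not 0 <= level <= 6:
        raise ValueError(f"occupancy level must be an integer from 0 to 6, got {level!r}")
    return level


def set_occupancy_level(level):
    """Create or overwrite the occupancy requirement as a DWORD value."""
    level = validate_level(level)
    if winreg is None:
        raise RuntimeError("the registry is only reachable on Windows")
    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, NTDS_PARAMETERS, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, level)
```

Calling set_occupancy_level(2) on each of the two DCs would correspond to what we did manually in the registry editor.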

After rebooting the Domain Controllers for good measure, the DCs came up again and advertised GC SRV records in DNS. Once they were moved back into the separated network, we could even log on!

What’s the moral of our story?

  • Never trust a lab you didn’t set up yourself
  • You can influence when a DC considers itself “ready” as a GC
  • Not all levels are helpful in all situations. Level 0 might be interesting if you want to stage a GC beforehand; the levels below 5 are interesting if you have GCs that do not or cannot talk to DCs outside their own site.
  • Those GCs are still keen on replicating the missing NCs in – and given the chance, they will. This isn’t disabled by the Occupancy registry key.
  • Don’t take the easy way out in production. The right thing to do is to remove the orphaned/tombstoned domains from the forest.
