Greetings all, it’s Eric here again, wait nope…this is my first time posting to CB5! I’m a guest blogger on the Crandall brothers’ site today, and I wanted to talk about an interesting case I had where the ADWS service wouldn’t start on most of the DCs at one of my customer’s hub sites. I want to cover what kind of impact that can have, so I’ll talk about what ADWS does, the requirements of ADWS, how to troubleshoot the issue, and some workarounds to fix it.
First of all, I’ll give you a high-level overview of ADWS. ADWS is Active Directory Web Services, a new component that shipped with Windows Server 2008 R2 and is installed by default when you add the AD DS or AD LDS roles. It’s a web service interface for managing your Active Directory domains, AD LDS instances, and Active Directory Database Mounting Tool instances. One great thing is that all management is done over a single (non-configurable) port, TCP 9389. Also, just as a side note, this service runs independently of IIS.
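If you want a quick sanity check that something is listening on TCP 9389 on a given DC, a plain TCP connect is enough. Here’s a minimal Python sketch; the function name is made up for illustration, and a successful connect only tells you the port is open, not that ADWS itself is healthy.

```python
import socket

ADWS_PORT = 9389  # the single ADWS port mentioned in this post


def adws_port_reachable(host: str, port: int = ADWS_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts, and name resolution failures
        return False
```

For example, adws_port_reachable("dc01.contoso.com") (a hypothetical host name) should come back False on a DC where the service failed to start.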
Because the Active Directory Module for PowerShell requires ADWS to function, a lot of things are rendered useless if the service doesn’t start. Examples include the AD Recycle Bin (though you can still undelete objects via LDP), the Active Directory Administrative Center, PowerGUI, certain client applications, and anything else that uses PowerShell on the back end to manage the directory.
So, as far as requirements for ADWS, they are as follows:
< 2008 R2
- Windows Server 2003 or 2008 DCs with the Active Directory Management Gateway Service (ADMGS)
- .NET Framework 3.5 SP1 (needed to be able to install ADMGS)
- Depending on the OS that ADMGS is being installed on, the hotfixes from the following Knowledge Base articles:
- KB969166
- KB969429
- KB967574
>= 2008 R2
- 2008 R2 OS
- Event Log Service. It’s a hidden dependency, but ADWS will not start without the “Windows Event Log” service being started
Both
- If certificates are involved, then ADWS needs a Server Authentication certificate that it can use for encryption in order to start. The key here is encryption, as you’ll see in the update at the bottom of the page.
In my customer’s case, they had all of the basic requirements met, including a “valid” Server Authentication certificate, but their ADWS service still didn’t start. They got the following errors:
- When attempting to start the service they got “Error 1067: The process terminated unexpectedly.”
- When rebooting, they saw ADWS Event ID 1002 in the ADWS Logs that said: “Active Directory Web Services could not initialize its endpoints. A networking error could have occurred.”
OK, now what?? Now we turn on diagnostics logging for ADWS and see what’s going on.
To do this we need to modify the Microsoft.ActiveDirectory.WebServices.exe.config file found in the %Windir%\ADWS directory. You’ll need to add the following lines to the <appSettings> section. Be sure that they’re between the <appSettings> and </appSettings> section boundaries…
<add key="DebugLevel" value="Info" />
<add key="DebugLogFile" value="c:\windows\debug\adws.log" />
Valid values for the DebugLevel value are:
0 – None (no logging)
1 – Error (this logs critical errors only)
2 – Warn (this logs warning events as well as error events) – Recommended value to use unless you need full tracing
3 – Info (verbose)
Use strings rather than numbers; just to be clear, type “Info” between the quotes instead of “3”, for example. Once this is done, you’ll see some new events in the ADWS event log, and then you’ll see adws.log start to populate with diagnostics info.
Note: So you don’t gather too much data, just turn logging on, reproduce your issue by trying to start the service, wait for it to fail, and then turn logging back off.
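If you have to flip logging on and off across several DCs, hand-editing the config gets old fast. The following is a rough Python sketch of the same edit, not an official tool: the function name is hypothetical, the level string “None” for no logging is my assumption for level 0, and ElementTree will drop any comments in the file, so work on a backup copy.

```python
import xml.etree.ElementTree as ET

# Level names as strings; "None" is an assumed name for level 0 (no logging)
VALID_LEVELS = ("None", "Error", "Warn", "Info")


def enable_adws_debug(config_xml: str, level: str = "Info",
                      log_file: str = r"c:\windows\debug\adws.log") -> str:
    """Return config_xml with DebugLevel/DebugLogFile set inside <appSettings>."""
    if level not in VALID_LEVELS:
        raise ValueError("DebugLevel must be one of %s" % (VALID_LEVELS,))
    root = ET.fromstring(config_xml)
    app_settings = root.find("appSettings")
    if app_settings is None:
        raise ValueError("no <appSettings> section found")
    for key, value in (("DebugLevel", level), ("DebugLogFile", log_file)):
        # Reuse an existing <add key="..."> entry if present, else append one
        node = next((n for n in app_settings.findall("add")
                     if n.get("key") == key), None)
        if node is None:
            node = ET.SubElement(app_settings, "add", {"key": key})
        node.set("value", value)
    return ET.tostring(root, encoding="unicode")
```

You would read the config file’s text, pass it through this function, and write the result back; calling it again with level="None" (or removing the two keys by hand) turns logging back off. The service still has to be restarted to pick up the change.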
When I did that, the log showed the problem (it’s the same error that is quoted in full in the update at the bottom of the page):
Well, again, the DC had a Server Authentication certificate, so it should have worked. I compared the certificate stores of the DCs where ADWS was failing to the one where it was working. The difference was the OCSP Signing cert: the three that were failing had it, and the one that was working did not. The domain controllers that were unable to start the ADWS service also had the OCSP Responder role installed on them (this isn’t recommended, BTW). So I did a quick test by deleting the OCSP Signing cert and then trying to start the service…voila! It worked…until the server was rebooted, or someone manually re-enrolled to pick up a new OCSP Signing cert and then tried to restart the service. As a temporary workaround, I deleted the OCSP Signing cert, started the service, and then re-enrolled, and I told the customer to repeat those steps after any reboot so everything would be functional again.
Later on, I sent the customer two more permanent workarounds that were better, but still not great. They are as follows:
- Move OCSP Responders off of the DCs, so the OCSP Signing Certs go away. This is the best case scenario as it is not recommended to collocate the directory service and OCSP role.
- Modify the “OCSP Response Signing” certificate template to add encryption to the certificate’s purpose. This can be done by opening the template and changing the Request Handling settings.
In the end, the service is supposed to use the first “valid” Server Authentication certificate that it finds on startup, but I’ve found that in at least my one customer’s case, this isn’t true. This appears to be unintended behavior in the certificate selection process for the ADWS service, and there’s no way that I know of to manually select which certificate it uses.
Note: In my customer’s case, they copied and modified the default template. Using the default template, I’m not able to reproduce this issue. Once I find the discrepancy in the modified template, I intend to update this blog entry.
UPDATE (4/1/2011):
I went back to visit one of my customers (the same one that had the issue I wrote this blog about), and while we were catching up, they mentioned that a while back, the Active Directory Web Service wasn’t starting on any of their DCs. This occurred after a major PKI environmental change. Essentially, they cut over from managing their own PKI infrastructure to being moved under a larger entity. In doing this, they had to get new certificates for all of their DCs. I never got a call about this, though, because this customer has a secret weapon, “SuperG”. SuperG figured out the root cause after finding this blog, which pointed him down the right path.
When turning on Debug logging, he saw the exact same error:
StartService: couldn’t start WCF (invalid argument), System.ArgumentException: The certificate ‘CN=DomainController1.Contoso.com’ must have a private key that is capable of key exchange. The process must have access rights for the private key.
However, this time there was no OCSP Signing cert, only a DC cert, so that pretty much pointed straight to the problem. He just needed to figure out what was wrong with the cert and how to correct it for future certificate requests. After looking into it, he found that it had a “type 2” key, which is a “signature” key type, and not a “type 1” key, which is an “exchange” key type. He then requested a new cert with the exchange key type selected, and everything was good after that.
If you aren’t sure what kind of key type your DC Cert has, you can check using the certutil command. The first thing that you’ll want to run is:
certutil -verifystore my > c:\CertutilVerifyStoreMy.txt
From there you’ll need to review the output and find your current DC certificate. In my customer’s case, there was only one certificate on the DC, so it should have been “Certificate 0”. However, they had 20 other archived certificates that showed up in the dump, and the current (and only) certificate in the personal store ended up being Certificate 4. Now that I knew which certificate it was, I could gather some more verbose data about that certificate to try to figure out what was wrong with it. To do that, you’d run the following command, changing the 4 to whatever the number is for the certificate that you’re interested in:
certutil -v -store my 4 > c:\CertutilVerboseStoreMy4.txt
That will give you detailed output for that certificate. What I was interested in was the key type that I mentioned earlier. That information is in the output file; however, it’s not called key type, so you’ll need to search for KeySpec. When I searched for it, I found the following, which, as noted above, is not what we want:
KeySpec = 2 -- AT_SIGNATURE
It should be:
KeySpec = 1 -- AT_KEYEXCHANGE
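If you’d rather not eyeball the dump, the KeySpec check is easy to script. This is just a small Python sketch that searches text you’ve already captured from certutil; the function names are hypothetical, and it only relies on the “KeySpec = N” pattern shown above rather than on any other part of the certutil output.

```python
import re
from typing import List

# Matches lines such as "KeySpec = 2 -- AT_SIGNATURE" and captures the number
KEYSPEC_LINE = re.compile(r"KeySpec\s*=\s*(\d+)")

AT_KEYEXCHANGE = 1  # the "exchange" key type that ADWS needs
AT_SIGNATURE = 2    # the "signature" key type that caused the failure


def find_keyspecs(certutil_output: str) -> List[int]:
    """Return every KeySpec value found in the text, in order of appearance."""
    return [int(m.group(1)) for m in KEYSPEC_LINE.finditer(certutil_output)]


def supports_key_exchange(certutil_output: str) -> bool:
    """True only if a KeySpec was found and every value is AT_KEYEXCHANGE."""
    specs = find_keyspecs(certutil_output)
    return bool(specs) and all(s == AT_KEYEXCHANGE for s in specs)
```

Feed it the contents of the verbose output file for a single certificate; a False result means either the cert has a signature-only key or no KeySpec line was found at all, so it’s worth looking at the file yourself in that case.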