Eric here again. Recently I ran into an interesting issue with one of my customers, which I caught by accident while looking into a different problem. While combing through the event logs, I found the following error:
After the other issue was fixed, I started to look into my new finding. In this scenario domain.com is a relatively newly built domain that trusts the domain that MemberServer resides in; however, the domain that MemberServer lives in doesn’t trust domain.com, so it’s a one-way incoming trust.
After seeing this, the first thing that I wanted to do was to see how many other boxes had this error, so I downloaded EventComb (it’s part of the Account Lockout Tools). From there, I added all newly built servers within the last few months that used the customer’s new server template, checked System, checked Error, and then input the Source and the actual Event ID that I was interested in as seen below:
From there I clicked search and found 9 servers that were getting this error ever since the trust was established with domain.com.
Now that I had identified the additional machines, I wanted to run PSGetSID (it’s part of the PSTools) against them.
| Command | Result |
| --- | --- |
| PSGetSID Memberserver | S-1-5-21-578956882-3542531243-2375499820 |
| PSGetSID Memberserver1 | S-1-5-21-578956882-3542531243-2375499820 |
| PSGetSID Memberserver2 | S-1-5-21-578956882-3542531243-2375499820 |
| PSGetSID Memberserver3 | S-1-5-21-578956882-3542531243-2375499820 |
| PSGetSID Memberserver4 | S-1-5-21-578956882-3542531243-2375499820 |
| PSGetSID Memberserver5 | S-1-5-21-578956882-3542531243-2375499820 |
| PSGetSID Memberserver6 | S-1-5-21-578956882-3542531243-2375499820 |
NOTE: In this case I chose to use PSGetSID Memberserver to get the local SID, rather than PSGetSID Domain\Memberserver$, which returns the domain SID.
After running PSGetSID against all of the boxes, I saw that they all had the same LOCAL SID (the SID that local user account security IDs are built from). Now if you were to run the other PSGetSID command mentioned in the note above, or if you were to open up LDP on the domain and find those same servers, the SID would be different from what we see in the chart above; that’s because what we see with the other command and what we see in LDP is the DOMAIN SID. Every machine on the domain has a local SID and a domain SID, even DCs, but in the DCs’ case it’s by design that the local SID is the same for all DCs, whereas for member servers the local SIDs should all be different.
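With more than a handful of servers, eyeballing the PSGetSID output gets tedious, so the duplicate check can be scripted. The following is only a minimal sketch: it assumes the host names and SIDs have already been collected into a dictionary (here filled with the values from the results above), and the function name `find_duplicate_sids` is made up for illustration.

```python
from collections import defaultdict

# Local SIDs as reported by PSGetSID for each member server (values from the results above).
SHARED_SID = "S-1-5-21-578956882-3542531243-2375499820"
local_sids = {
    "Memberserver": SHARED_SID,
    "Memberserver1": SHARED_SID,
    "Memberserver2": SHARED_SID,
    "Memberserver3": SHARED_SID,
    "Memberserver4": SHARED_SID,
    "Memberserver5": SHARED_SID,
    "Memberserver6": SHARED_SID,
}


def find_duplicate_sids(sids_by_host):
    """Return {sid: [hosts]} for every SID that is reported by more than one host."""
    hosts_by_sid = defaultdict(list)
    for host, sid in sids_by_host.items():
        # Normalize whitespace and case so copy/paste differences don't hide a match.
        hosts_by_sid[sid.strip().upper()].append(host)
    return {
        sid: sorted(hosts)
        for sid, hosts in hosts_by_sid.items()
        if len(hosts) > 1
    }


duplicates = find_duplicate_sids(local_sids)
for sid, hosts in duplicates.items():
    print(f"{sid} is shared by {len(hosts)} machines: {', '.join(hosts)}")
```

Any SID that shows up in `duplicates` points at machines that were most likely cloned from the same image without being generalized.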
Anyhow, so the next thing that I did was get onto a DC in the Domain.com domain and look at the ObjectSID of that DC (which I probably didn’t need to even do). The local SID of the DC was S-1-5-21-578956882-3542531243-2375499820 and the ObjectSID was S-1-5-21-578956882-3542531243-2375499820-1000.
So right there we see the correlation and why the error occurred. The local SID of the 9 or 10 machines on the trusted domain is the same as the domain SID of the trusting domain, and the error was generated because of the potential security implications. There are actually numerous scenarios that you could contrive where this could cause an issue, one of which Dean Wells describes in the following TechEd session from the 17:45 mark to the 30:00 mark: http://www.msteched.com/2010/Europe/SIA320. His scenario is very similar to mine, but instead of it being a child domain, in my case it was a separate forest where they put a one-way forest trust in place. Also, in my case there wasn’t actually any impact to the customer, though obviously there could be in any number of scenarios.
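The correlation boils down to a prefix comparison: an account's ObjectSID is the issuing domain's SID plus a relative ID (RID) on the end, so dropping the RID from the DC's ObjectSID gives the domain SID, which can then be compared to a member server's local SID. Here is a rough sketch of that comparison using the two values from this case; the helper names `domain_part` and `collides` are hypothetical, and the parsing assumes the usual S-1-5-21-x-y-z-RID layout.

```python
def domain_part(object_sid):
    """Drop the trailing RID from an account SID, leaving the issuing domain SID."""
    parts = object_sid.strip().split("-")
    # Expected layout: S, 1, 5, 21, three sub-authorities, then the RID (8 parts in total).
    if len(parts) != 8 or parts[:4] != ["S", "1", "5", "21"]:
        raise ValueError(f"not an S-1-5-21 account SID: {object_sid!r}")
    return "-".join(parts[:-1])


def collides(machine_sid, object_sid):
    """True when a machine's local SID equals the domain SID behind object_sid."""
    return machine_sid.strip() == domain_part(object_sid)


# Values from this case: the member servers' shared local SID and the DC's ObjectSID.
member_local_sid = "S-1-5-21-578956882-3542531243-2375499820"
dc_object_sid = "S-1-5-21-578956882-3542531243-2375499820-1000"

if collides(member_local_sid, dc_object_sid):
    print("Local SID matches the trusting domain's SID")
```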
I originally saw this presentation at TechReady10, but this is the first time I’ve had an actual case that relates somewhat to the scenario depicted in Dean’s session. So the next question is: how do we fix it? Well, there are a few different ways I suppose, but option 1 (depending on the functions/roles of the server) or option 2 is likely the best bet, though they very well might be OK with option 5.
1. Remove the machines with the duplicate SIDs from the domain, SYSPREP them to generate a new SID and re-join the domain (if you can).
   - Unfortunately these may all be production boxes that would have a big impact if the SYSPREP even worked (it depends on what functions/roles are installed on the servers). Here’s a list of supported/unsupported sysprep scenarios when using the /generalize switch: http://technet.microsoft.com/en-us/library/cc722158(WS.10).aspx
2. Build new boxes that have been SYSPREPed and migrate services.
3. Rebuild the trusting domain (domain.com).
   - Usually this would not be an option, but if the environment is small or newly built, it’s still an option. Also this option is flawed because it still leaves those 9 or 10 servers on the trusted domain still having duplicate local SIDs, still leaving security holes and access among those machines when using local user accounts.
4. Break the trust.
   - This option brings no value at all, and it’s somewhat pointless, but it would get rid of the errors in the member servers’ event logs. 😉
5. Leave it and live with any possible security risks or issues that might occur.
In the end, it’s a very interesting problem and fortunately it didn’t cause them any sort of issues, but the fact of the matter is that it *could* cause issues, and I personally would try to mitigate that risk.