AD Issues & Solutions
Active Directory
Troubleshooting:
Problems, Methods
and Solutions
Gary L. Olsen
WTEC
Global Services Engineering
Hewlett-Packard
© 2004 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice
Books
http://www.phptr.com/title/0131467581
http://WindowsOnProLiant.com
Authors: Gary Olsen, Bruce Howard
Publisher: Prentice Hall
ISBN: 0131467581
Publishing Date: October, 2004
Define the Problem
Work the Problem
Collect Data
Action Plan
Define the Problem
Is there one?
Define the Problem
Events are NOT the problem!
What exactly is failing?
Define the Scope
One or multiple Machines
One or more users
Single or multiple sites?
Single or multiple DCs? (check the LOGONSERVER environment variable)
Members of same or multiple groups?
Group Policy applied (event 1704 in app log)
Time of day
Work the Problem
Is there one?
Impact to the business
Urgency
Resource allocation
After Defining the Problem, are there events in the event log
related to the failure and time of failure?
When did you notice it? What conditions?
Tie the times to the events, other log entries
Can the problem be replicated?
Start narrowing the variables
Identify a savvy user with the problem who can help
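Tying the reported failure time to log entries can be sketched as a simple time-window filter. This is a minimal Python illustration, not a tool from this presentation: the sample events, the 10-minute window, and the function name are assumptions made for the example.

```python
from datetime import datetime, timedelta

def events_near(events, failure_time, window_minutes=10):
    """Return events logged within +/- window_minutes of the failure time."""
    window = timedelta(minutes=window_minutes)
    return [e for e in events if abs(e[0] - failure_time) <= window]

# Hypothetical sample data: (timestamp, log name, event ID)
events = [
    (datetime(2004, 10, 1, 8, 55), "Directory Service", 1311),
    (datetime(2004, 10, 1, 9, 2), "Application", 1704),
    (datetime(2004, 10, 1, 14, 30), "File Replication Service", 13568),
]
reported = datetime(2004, 10, 1, 9, 0)  # when the user noticed the failure

for stamp, log, event_id in events_near(events, reported):
    print(stamp.isoformat(), log, event_id)
```

Widening or narrowing the window is one quick way to start narrowing the variables.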
Collect Data
MPSReports
Free download (see slide notes)
All Event Logs in .txt, evt format
Netdiag, DCdiag, Net Accounts, Net Share,
Repadmin
DCpromo Logs
GPOtool, GPresult
Run it on all affected machines
Other
Verbose Logging
Get status report from Replication Monitor
The Action Plan
Define the Problem
Talk to all admins involved
Who is affected? (computers, users)
When? Is it reproducible?
Area
Replication
Security
Name Resolution
Group Policy
FRS/DFS
What data needs to be collected?
Analyze the Data
Errors, warnings, etc
Solution
Google
www.eventid.net
Microsoft KB
Test Solutions
Action Plan Example
Overview: Determine cause of hang of ATL-DC1
Summary:
analyzing perfmon logs
implemented contingency plan
identified support path
DNS Resolver Configuration
Workstations, Servers, DCs point to NS for
their domain
No reason to point to other name servers like ISP,
other internal NS, as additional DNS servers
Std primary zone name server points to self
for DNS
ADI Zone
Only one NS points to self for DNS
Other NS point to single primary
DNS Server Configuration
Server Properties
Forwarding
Zone Transfers
Restricted Servers
Enable Scavenging
Delegation
Correct server, IP address?
Resolver (Tcp/IP Properties)
ADI Server Configuration
Best Practice: Select single ADI DNS Server as
the Primary.
Fixed in Windows Server 2003
Don't put Std Secondary zones on DCs!
Qtest DNS Configuration
Delegations
CPQCorp.net
Qtest.CPQCorp.net
Qtest DNS Configuration
Forwarders
CPQCorp.net
Qtest.CPQCorp.net
Quick Checks
Use Monitor tab in DNS snap-in
Test Recursive, simple queries
Ping
Domain name
Server Name, address
NSLookup
nslookup gc._msdcs.qtest.cpqcorp.net
Delete bad records, restart Netlogon svc
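The names worth checking with NSLookup can be generated from the domain name. The sketch below builds the usual DC locator names; the function names are illustrative, and it assumes the forest root equals the domain unless one is passed in. Python's standard library cannot query SRV records, so only the host (A) lookup helper is included; SRV names still need nslookup -type=SRV.

```python
import socket

def locator_names(domain, forest_root=None):
    """Build the DNS names a client uses to locate DCs and GCs."""
    forest_root = forest_root or domain  # assumption: single-domain forest
    return {
        "dc_srv": f"_ldap._tcp.dc._msdcs.{domain}",
        "pdc_srv": f"_ldap._tcp.pdc._msdcs.{domain}",
        "kerberos_srv": f"_kerberos._tcp.{domain}",
        "gc_srv": f"_ldap._tcp.gc._msdcs.{forest_root}",
        "gc_host": f"gc._msdcs.{forest_root}",
    }

def resolve_hosts(name):
    """Return the IPv4 addresses registered for a host name, or [] on failure."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []

for label, name in locator_names("qtest.cpqcorp.net").items():
    print(label, name)
```

Calling resolve_hosts on the gc_host name from a domain member gives roughly the same answer as the nslookup example above.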
Common Problem: Missing sub zones for
SRV records
First DC creates subzones for SRV records
_msdcs, _sites, _tcp, _udp
If they aren't there
Check Tcp/ip properties for DNS server
Dynamic Updates on
Physical connectivity to DNS server
Bonus Question: What if you delete these zones?
Problem: Promotion of 2nd DC fails:
Unable to contact domain
Just promoted a new DC to create a new forest,
company.com.
Promoting 2nd DC in that domain yields an error
saying it can't contact the domain.
How do you troubleshoot this?
Problem: Replication Broken in child domain,
DNS errors
w2k.net
corp.com
= Delegation
= Forwarder
NA.w2k.net EU.w2k.net
Troubleshooting
AD Issues
Tools
MPSReports
Tips
Account Lockout
Problem Solving
Windows XP as a Tool!
Adminpak for Win2K
Adminpak for Windows 2003
GPresult.exe
RSOP, ACL Filters, Policy Priority List,
Group Policy Management Console
Save GPO settings, User application
Repadmin (new features)
Remote Desktop
NTDSUtil (Windows 2003)
Authoritative restore
Roll AD back to previous date
Entire AD, tree or object
Improved in Windows Server 2003 (with LVR)
DSRM Mode
Domain management
Create Application Partitions
Pre-create domains
Metadata cleanup
Remove Server, domain, site objects
Roles
FSMO Management: See, change all roles
Semantic database analysis
Can repair checksum, inconsistency errors
DSRM mode
Set DSRM Password or account password
ADSIedit.exe Demo
GUI much like Users & Computers
snap-in/Advanced features.
Graphical view of AD.
Like LDP.exe but:
Easier to browse.
Can modify attribute values
Shows ALL attributes
Don't confuse with Users & Computers!
LDP.exe Demo
Takes time to set up:
Connect
Bind
View Tree
Enter DN to start (blank for default)
Exposes attributes quickly, easy to see.
Only lists DEFINED attributes
Faster than ADSIedit: no GUI to traverse.
LDAP searches.
Can delete and modify, but not as easy as ADSIedit.
Can execute remotely.
MPS Reports
Demo/Exercise Using MPS Reports for AD
Troubleshooting, Health Check
Tip: Tracking Down a GUID
Problem: GUID referenced in event log. What is it?
Solution: (Q216359)
LDP search for the GUID
Search: <guid=5d718c23-253b-4310-94f0-9d6c62bea3ad>
Search.vbs in Support tools
Orphaned Object (will kill replication)
Turn up NTDS diagnostic logging
Internal processing
Replication
Find object (GUID) in event logs
Delete it via LDP
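A GUID copied from an event log can be turned into search strings programmatically. This sketch produces the <GUID=...> form used in the LDP search above and, as an alternative, an escaped objectGUID filter for tools that only accept LDAP filters. The function name is illustrative; the GUID is the one from the slide.

```python
import uuid

def guid_search_forms(guid_text):
    """Return two ways to look up an object by its objectGUID."""
    guid = uuid.UUID(guid_text)
    # Form used as the base DN in the LDP search.
    guid_dn = f"<GUID={guid}>"
    # objectGUID is stored as 16 bytes with the first three fields
    # little-endian, so a filter needs each stored byte escaped as \xx.
    escaped = "".join(f"\\{b:02x}" for b in guid.bytes_le)
    return guid_dn, f"(objectGUID={escaped})"

dn, ldap_filter = guid_search_forms("5d718c23-253b-4310-94f0-9d6c62bea3ad")
print(dn)
print(ldap_filter)
```

The byte reordering in the filter form is the usual reason a pasted GUID finds nothing in a filter-based search.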
Account Lockout
Problem: User changes password, account gets
locked out
Causes:
User logged in to multiple machines with old credentials
(VPN from home, etc.)
User has mapped drives with old creds
Watch those lab machines!
Tool: Acctinfo.dll
Produces new User property tab
Downloads
Account Lockout Best Practices Whitepaper at
http://www.microsoft.com/downloads/details.aspx?FamilyID=8c8e0d90-a13b-4977-a4fc-3e2b67e3748e&DisplayLang=en.
The AcctInfo.dll and LockoutStatus.exe tools can be downloaded from
http://www.microsoft.com/downloads/details.aspx?FamilyID=7af2e69c-91f3-4e63-8629-b999adde0b9e&DisplayLang=en.
Active Directory Problem Diagnosis
(Class exercise)
Troubleshooting
AD Replication
Golden Rule
Top 10 Things that Break Replication
Quick Checks
Tools
Common Problems
Problem Solving Exercises
Top 10 Things
That Break Replication
10. Failure by System Architect to Design the topology
properly
This isn't rocket science!
9. Failure by Administrator to understand Replication
8. Failure by Administrator to monitor AD
7. DNS problems
Duplicate connection objects
Bad Cname record
SRV Records not registered
6. KCC doesn't clean up (by design in Windows 2000)
Top 10 Things
That Break Replication
Quick Checks
Who isn't replicating with whom?
MPS Reports (DS)
Repadmin
/showreps (Win2K)
/replsum /bydest /bysrc /sort:delta
Event Logs
Map out topology (HP OpenView)
Tools: HP OpenView for Windows
Demo
Quick Checks
Force Replication (snap-in)
Returns different error
Create user, site on broken DC
See if Inbound/outbound replication working
ReplMon Status Report
Not included in MPS Reports
Check Cname DNS Records
In the root _msdcs zone (only), an alias (CNAME) record maps each
DC's FQDN to its server GUID.
Only one record per server.
Delete duplicates.
Match GUID in alias record to GUID reported by
Repadmin /showreps.
If in doubt, delete the DC's alias record(s) and restart
Netlogon on the broken DC to re-register.
Ping <guid>._msdcs.domain.com
OR Use DNSLINT
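The two alias checks above (one record per server, GUID matches Repadmin) can be sketched as a small comparison. This is an illustration only: the input shapes, function name, and sample GUIDs are assumptions, and the data would have to be copied by hand from the DNS snap-in and from Repadmin /showreps.

```python
from collections import defaultdict

def check_aliases(alias_records, repadmin_guids):
    """alias_records: (guid_label, target_fqdn) pairs from the _msdcs zone.
    repadmin_guids: {dc_fqdn: guid} as reported by Repadmin /showreps.
    Returns (duplicates, mismatches), both keyed by DC FQDN."""
    by_target = defaultdict(list)
    for guid, target in alias_records:
        by_target[target.lower().rstrip(".")].append(guid.lower())
    expected = {k.lower().rstrip("."): v.lower() for k, v in repadmin_guids.items()}
    duplicates = {dc: g for dc, g in by_target.items() if len(g) > 1}
    mismatches = {dc: g for dc, g in by_target.items()
                  if dc in expected and expected[dc] not in g}
    return duplicates, mismatches

# Hypothetical sample data
records = [
    ("5d718c23-253b-4310-94f0-9d6c62bea3ad", "dc1.domain.com."),
    ("11111111-2222-3333-4444-555555555555", "dc1.domain.com."),
    ("aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "dc2.domain.com."),
]
reported = {
    "dc1.domain.com": "5d718c23-253b-4310-94f0-9d6c62bea3ad",
    "dc2.domain.com": "99999999-0000-0000-0000-000000000000",
}
duplicates, mismatches = check_aliases(records, reported)
print("duplicates:", duplicates)
print("mismatches:", mismatches)
```

Any DC that shows up in either result is a candidate for the delete and re-register step.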
Replication Monitor
Status report (replication health report)
List of all GCs, BHS, Trusts
List of all replication errors on all DCs in domain
Changes not replicated
Replication partners
Force push/pull replication
Group Policy Object status
FSMO validation
Common Replication Problems
Event 1311
Physical
GC or DC can't be contacted (see event 1722)
Network Failure
Improper routing
Changes in routing, addressing, etc.
Logical
Sites w/o site links
Site Link Bridges covering dial up networks
Site Links not Interconnected
Site link A-B and C-D (no common site)
Preferred BHS offline
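The "site links not interconnected" case (A-B and C-D with no common site) can be checked on paper or with a few lines of code. The sketch below groups sites that are reachable through shared site links; the function name and the dictionary input are assumptions for illustration, and it ignores schedules, costs, and bridging settings.

```python
def connected_groups(site_links):
    """Group sites that can reach each other through shared site links.

    site_links: {link_name: set of site names}. More than one group means
    at least two sets of sites have no site link path between them.
    """
    groups = []
    for sites in site_links.values():
        merged = set(sites)
        remaining = []
        for group in groups:
            if group & merged:
                merged |= group          # link shares a site with this group
            else:
                remaining.append(group)
        groups = remaining + [merged]
    return groups

links = {"A-B": {"A", "B"}, "C-D": {"C", "D"}}
print(connected_groups(links))           # two separate groups: no common site
links["B-C"] = {"B", "C"}
print(connected_groups(links))           # a single group once B-C is added
```

Sites that belong to no site link at all simply never appear in the result, which covers the "sites w/o site links" case when compared against the full site list.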
Common Replication Problems
Logical (cont'd)
BHS swamped
Undersized
Too many satellite sites to single BHS (fixed in W2k3)
Site Link Schedule
DNS Lookup Failure
KCC didn't clean up properly (Windows 2000)
1311 Repair
Look at the topology (HP OpenView)
Review the design and implementation
Poor design = lots of 1311s!
Are 1311s forest wide, domain wide, or site specific?
Repadmin /istg
SLB only in fully routed networks
Preferred BHS: Just say NO! (or upgrade to w2k3)
Common Replication Problems
1722 RPC Server is unavailable.
Physical connectivity.
DNS.
DefaultIPSiteLink
Failure to treat this as a normal site link after topology is implemented
All Sites in here + in other Links (forgot)
Treat it as any other link
Rename, don't delete, just in case
Causes Replication to break, poor performance
Test later
Time Skew (must be within 5 minutes)
W32tm sync (Windows 2000)
W32tm /config /syncfromflags:DOMHIER (Windows 2003)
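The 5-minute rule is easy to check once the two clock readings are in hand. A minimal sketch, assuming the times have already been collected from both machines (the function name and sample values are illustrative):

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=5)  # limit stated on the slide

def skew_ok(local_time, dc_time, max_skew=MAX_SKEW):
    """True if the two clocks differ by no more than max_skew."""
    return abs(local_time - dc_time) <= max_skew

dc_clock = datetime(2004, 10, 1, 9, 0, 0)
print(skew_ok(datetime(2004, 10, 1, 9, 4, 30), dc_clock))   # within the limit
print(skew_ok(datetime(2004, 10, 1, 8, 53, 0), dc_clock))   # 7 minutes behind
```

Both readings must be in the same time zone (or both UTC) for the comparison to mean anything.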
Lingering Object Problem
The problem
Replication broken or DC/GC offline for longer than tombstoneLifetime (TSL)
Loose behavior (Windows 2000 pre-sp3)
Allows old object to be propagated back to the AD
Security Problem (possibly)
Kills replication: chokes on orphaned objects
GC: propagates read-only objects (can't delete)
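Whether a DC is at risk can be estimated by comparing its last successful replication time with the tombstone lifetime. This is a rough sketch: the function name is illustrative, and the 60-day value in the example is an assumption, so read the actual tombstoneLifetime from your own forest.

```python
from datetime import datetime, timedelta

def exceeds_tombstone_lifetime(last_success, now, tsl_days):
    """True if the DC has gone longer than the TSL without replicating."""
    return now - last_success > timedelta(days=tsl_days)

now = datetime(2004, 10, 1)
tsl_days = 60  # assumed value for the example only
print(exceeds_tombstone_lifetime(datetime(2004, 9, 1), now, tsl_days))   # 30 days
print(exceeds_tombstone_lifetime(datetime(2004, 7, 1), now, tsl_days))   # 92 days
```

A DC that fails this check should not simply be brought back online and allowed to replicate.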
Lingering Object Fix
The Fix:
Tight behavior (default in Windows Server 2003 clean install)
Stops replication until the object is deleted.
Q317097
Cleanup:
Repadmin /removelingeringobjects
See Repadmin /experthelp
If all else fails, try demoting
Normal or Manual Demotion of a DC then
repromote to clean up problems
Microsoft loves this!
Only if problem is isolated to one DC.
If replication isn't working, demotion won't work.
Can manually demote a DC in Win2K SP3 and
Windows 2003.
DCPromo /forceremoval, then clean up the AD
KB 332199
Replication Problem Diagnosis
(Class exercise)
[Diagram: sites Boston, Miami, Denver, and Atlanta joined by site links with costs of 100 and 200]
Replication Problem Diagnosis
(Class exercise)
[Diagram: Sites Linking Level 1 and Level 2 Hubs, Desired Replication Path. Labels: LAX, NYC, LAXLINK, NYCLINK, ABLLink, SEALink, 2nd Tier Hubs. Other sites shown: SLC, OMH, SEA, ABL, DEN, AUB, FTC, ORM, ATL, EUG, PIT, COL, TOP, MON, TUS, WDC, PRO, SPT, CLE, NOR, SBD, RCH, BOI, OPL, LIT, CIN, LAS, RAL]
Replication Problem Diagnosis
(Class exercise)
Where do we start?
DCPromo
Troubleshooting
Basics
Quick Checks
Tools
Common Problems
Problem Solving Exercises
DCPromo Basics
Able to contact a functional existing DC.
DNS must be working
Dcpromo /replicationsourceDC=
DCDiag /test:DcPromo (tests DNS)
Creates/moves Machine acct (DC1$)
UserAccountControl Attribute set
4096 (1000 hex) = Workstation/Server
532480 (82000 hex) = DC
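The two userAccountControl values above are bit flags, so they can be decoded rather than memorized. A small sketch covering only the three flags involved here (the helper names are illustrative; the full flag list is longer):

```python
UAC_FLAGS = {
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x2000: "SERVER_TRUST_ACCOUNT",
    0x80000: "TRUSTED_FOR_DELEGATION",
}

def decode_uac(value):
    """Return the names of the known flags set in a userAccountControl value."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]

def is_domain_controller(value):
    """A DC machine account carries the SERVER_TRUST_ACCOUNT bit."""
    return bool(value & 0x2000)

for value in (4096, 532480):
    print(value, hex(value), decode_uac(value), is_domain_controller(value))
```

532480 is 0x80000 + 0x2000, which is why a healthy DC account shows that exact value.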
Quick Checks
DNS Set up properly?
TCP/IP properties set to correct DNS
_ zones (_msdcs, _sites, _tcp, _udp) exist
Proper Credentials?
Is the DC a DC?
Inbound/Outbound Replication
SYSVOL and NetLogon shares
If no, then no Outbound Replication
UserAccountControl = 532480 (82000 hex)
DCPromo tools
%windir%\debug
DCpromo.log (appended)
DCpromoui.log (renamed)
Set verbosity on dcpromoui.log
Netdiag /v
DCDiag /v
Directory Service Event Log
Common Problems
Missing Sysvol and NetLogon shares
KB 257338 good but
Create Manual connection object
Force Replication
Works well for any connection failure
Force KCC to Check Replication Topology
Repadmin /add and /sync
Adds a low level link and syncs across it
Works very reliably
See my article on the CD
Common Problems
Errors accessing the machine account (DC1$)
Q250804
If server is in a workgroup, join the domain, then DCpromo
(cuts the troubleshooting in half)
Account is moved.
Error: DC1$ not found, access denied, etc.
Credentials of account running Dcpromo
Source must have security policy applied to itself.
Q250874
Dcdiag /test:MachineAccount
/test:FixMachineAccount
/test:RecreateMachineAccount
Poor WAN Performance
Install From Media (W2k3)
Source Replica AD from Media in DCPromo
GCs or DCs (Replica only).
No initial replication from a DC.
After initial load, replicates changes.
Unattended Answer File Support:
ReplicateFromMedia
ReplicationSourcePath
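The two answer file keys above can be written out with a short script. This is a sketch under assumptions: the [DCInstall] section name is the usual dcpromo unattend section rather than something stated on the slide, the path is a placeholder, and a real answer file needs more keys than these two.

```python
import configparser
import io

def build_ifm_fragment(source_path):
    """Return an answer file fragment for Install From Media."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the mixed-case key names as written
    parser["DCInstall"] = {   # assumed section name
        "ReplicateFromMedia": "Yes",
        "ReplicationSourcePath": source_path,
    }
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()

fragment = build_ifm_fragment(r"C:\NTDSRestore")  # hypothetical restore path
print(fragment)
```

The path should point at the restored system state backup on the server being promoted.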
DCPromo Problem Diagnosis
(Class Exercise)
File Replication
Service (FRS)
Top FRS Issues
Morphed Directories
Upgrade to SP3 + FRS hotfix
Don't use Authoritative Restore
Excessive replication
Get FRS-friendly AV software
Upgrade to SP3 + FRS hotfix
Top FRS Issues
Improper design & implementation
Topology
Staging area space
USN Journal size
Insufficient hardware
Missing FRS objects in Active Directory
Usually easiest to demote/promote
#1 FRS Issue
Accidental Bulk Delete of Sysvol or DFS trees
Oops!
Solutions:
Authoritative Restore/ Disaster Recovery
Watch who you give Admin rights to!
FRS Troubleshooting Tools
Sonar
Ultrasound
FRSDiag
Ultrasound Help File
File Replication Service
Journal Wrap errors + Journal Sizing
Q292438 Troubleshooting Journal_Wrap Errors on SYSVOL and DFS Replica Sets
Q315070 Event 13568 Is Logged in the File Replication Service Event Log
Staging Directory Sizing & Placement
Q329491 Configuring Correct Staging Area Space for Replica Sets
Q265085 Moving FRSStagingPath Requires Non-Authoritative Restoration (RTM, SP1, SP2)
Q291823 How to Reset the File Replication Service Staging Folder to a Different (SP3)
Staging Directory accumulation
Q307777 Possible Causes of a Full File Replication Service Staging Area
Repairing AD Objects used by FRS
Q296183 Overview of Active Directory Objects That Are Used by FRS
Q312862 Recovering Missing FRS Objects and FRS Attributes in Active Directory
File and Folder filters in FRS
Q229928 Design Decisions, Defaults and Behavior for FRS File and Folder Filters
File Replication Service
FRS entries in the registry
Q221111 Description of FRS Entries in the Registry
FRS Debug Log Configuration
Q221112 NT File Replication Service Log File Size and Verbosity
Morphed or Conflicted Directories
Q328492 Folder Name Is Changed to "FolderName_NTFRS_<xxxxxxxx>"
QuadZero Service Assertion
Q328800 superseded by Q811217
Excessive Replication of FRS content
Q279156 The Effects of Setting the File System Policy on a Disk Drive or Folder
Q284947 Antivirus Programs May Modify Security Descriptors and Cause Excessive
Q282791 Disk Defragmentation Causes Excessive FRS Replication Traffic
Q315045 FRS Event 13567 Is Recorded in the File Replication Service Event Log
Sharing violations
KiXtart, real-time antivirus programs, interactive logon scripts, dynamic data
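Morphed directories (the "FolderName_NTFRS_<xxxxxxxx>" names from Q328492 above) can be spotted by pattern. The sketch below assumes the suffix is eight hexadecimal characters, which the slide does not state; the function name and sample folder names are illustrative.

```python
import re

# Assumption: the <xxxxxxxx> suffix is eight hex digits.
MORPHED = re.compile(r"^(?P<original>.+)_NTFRS_(?P<suffix>[0-9A-Fa-f]{8})$")

def find_morphed(names):
    """Map each morphed folder name to the original name it collided with."""
    found = {}
    for name in names:
        match = MORPHED.match(name)
        if match:
            found[name] = match.group("original")
    return found

# Hypothetical folder listing from a replica tree
sample = ["Policies", "Policies_NTFRS_0a1b2c3d", "scripts", "Apps_NTFRS_notes"]
print(find_morphed(sample))
```

Feeding it the directory names from os.walk over SYSVOL or a DFS replica root gives a quick inventory before deciding which copy to keep.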
FRS Problem Diagnosis
(Class Exercise)
Group Policy
Troubleshooting
Basics
Quick Checks
Tools
Common Problems
Problem Solving Exercises
Resources
GPO Basics
Removing the GPO will modify the dynamic
registry settings, return to pre-GPO state
Watch for
Inheritance Blocking
No Override
ACL filtering
Loopback
GPO Basics
GPO only applies to domain/OU where user or
computer account reside
Won't apply to OU with just groups
Multiple GPOs cost in logon performance
Test them before implementing
Quick Checks
Policy isn't getting applied
Computer, user in domain or OU policy is defined for?
ICMP disabled or blocked in the network
Filtered, Overridden, Blocked, disabled?
Not refreshed yet?
GPUpdate (replaces secedit /refreshpolicy)
FRS or Replication Problem
Look for event 1704
New! Gpresult.exe
Use the XP/2003 version
Run on XP client in the domain
Built-in
Gpresult /V (verbose)
Returns:
Filtered GPOs (and reason)
Security Details
Account policies
User Rights
Remember
Policy is cached: reboot / log in to clear
Note which server authenticated the logon
Environment variable: LOGONSERVER
New! GPMC
Group Policy Management Console
Free Download
Manage all Policies in domain
See all options: No Override, blocking, etc
Applied GPOs / Denied GPOs (and why)
Save GPO settings, User Applied Settings
Modeling (what if scenario)
New! GPResult
GPresult
Built in to Windows 2003 Server, XP
RSOP
Security settings displayed
Account settings
User Right Settings
Audit Settings
ACL Filters applied
Filtered GPOs (and reason)
Admin Template (registry) settings
Works in a Windows 2000 domain but only from an XP
or Windows Server 2003 client.
Group Policy Management Console
Manage all policies for all Domains, all OUs in forest
with one GUI based tool
Run GPResult to see policy apply for any user on
any machine without bothering the user or the
machine.
Save GPO and GPResults in HTML format
Off-line analysis
Send to Support Engineers for analysis
Troubleshooting
GPOs applied, not applied and why
Easily see all settings applied without wading thru gpedit
Backup/Restore policies
Userenv.log
Located: %systemroot%\debug\usermode
User environment info:
Group policy (registry)
Client side extensions
Increase verbose logging (Q221833)
(enhanced in Windows 2003)
Take time to read and study it, and you may be
surprised at what you can find!
Sample Userenv.log
Additional User Mode Logs
Client-side extensions
Registry (demo) Q216357
HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\GPExtensions
Errors created in %windir%\debug\usermode
Named after the CSE .dll
Q245422
Produced automatically on error (except winlogon.log)
Invaluable in debugging. Use them!
Sample gptext.dll
Common Problem
Need to restore Default Domain, Default Domain
controllers policies
Best Practice: Don't mess with these 2 policies
If you do
DCGPOFix
Replaces Default Domain Policy
Replaces Default Domain Controllers Policy
One or both
Wipes out old settings like EFS
Group Policy Problem Diagnosis
(Class exercise)
Security Problem Diagnosis
(Class exercise)
Group Policy Resources
All Group Policy Resources: http://www.microsoft.com/gp
Server 2003 Group Policy Infrastructure
http://www.microsoft.com/downloads/details.aspx?FamilyId=D26E88BC-D445-4E8F-AA4E-B9C27061F7CA&displaylang=en