Skip to main content
MSSQL DBA Interview Questions
MSSQL DBA Interview Questions
SECTION 1 — SQL SERVER ARCHITECTURE & INTERNALS (30 Questions)
- Explain SQL Server’s architecture: SQLOS, storage engine, relational engine, query processor.
- What happens internally when SQL Server receives a query?
- Describe how worker threads, schedulers, and tasks work together.
- What is the role of SOS_SCHEDULER and non-preemptive scheduling in SQL Server?
- Explain buffer pool internals and lazy writer behavior.
- What are memory clerks, and how do they help in memory troubleshooting?
- Explain the difference between latch and lock.
- What are spinlocks? When do they become bottlenecks?
- What is the role of Resource Monitor in SQL Server?
- Explain NUMA architecture optimization in SQL Server.
- What is a checkpoint and what triggers it?
- Explain how the transaction log ensures durability.
- What is a log block and log sequence number (LSN)?
- Where is the plan cache stored and how is it cleaned up?
- Explain Query Optimizer phases.
- What is the “probe residual” in execution plans?
- How do memory grants work internally?
- What is Columnstore batch mode execution?
- How does SQL Server handle hash spills?
- What does the cardinality estimator do?
- What changed in the new CE vs legacy CE?
- What is parameter sniffing and why does it occur?
- Explain auto soft-NUMA and when it helps.
- What is DBCC MEMORYSTATUS used for?
- Explain the SQL Server page structure (8 KB pages, IAM, GAM, PFS).
- What is the purpose of the differential bitmap?
- What are ghost records and how are they cleaned?
- What is the Virtual Address Space (VAS) reservation?
- Explain timetable for lock escalation.
- What is the difference between synchronous & asynchronous I/O in SQL Server?
🟦 SECTION 2 — STORAGE ENGINE, FILES, LOG, RECOVERY MODELS (25 Questions)
- What causes VLF fragmentation? How do you fix it?
- How does SQL Server manage log reuse (log_reuse_wait_desc)?
- Why does the log grow even if the database is in SIMPLE recovery?
- What is the difference between physical and logical log?
- How is proportional fill algorithm used across data files?
- What are the best practices for designing filegroups?
- How does instant file initialization work?
- What is Accelerated Database Recovery (ADR)?
- What is the crash recovery cycle?
- How do you diagnose a database stuck in "In Recovery"?
- What conditions cause a file to go “offline”?
- What is differential backup base mismatch?
- How do torn page protections work?
- What is the difference between row-overflow and LOB storage?
- How do FILESTREAM and FileTable differ?
- Explain TempDB’s role in version store.
- How much log is generated for different operations (minimal logging)?
- When can minimal logging occur under BULK_LOGGED mode?
- What is a page restore?
- What happens during transaction rollback in SQL Server?
- What causes the log to be in a “recovery pending” state?
- How do you detect and fix corrupted page header?
- What is log truncation and how does it differ from shrinking?
- How are extents allocated?
- What is the IAM chain?
🟦 SECTION 3 — INDEXING, STATISTICS & COLUMNSTORE (35 Questions)
- What is the difference between clustered and nonclustered indexes physically?
- When should a heap be used instead of a clustered index?
- What are forwarded records and when do they occur?
- Explain the impact of page split and how to prevent it.
- What is fill factor and how do you choose the value?
- What is index fragmentation? What fragmentation really matters?
- Why may SQL Server ignore an index?
- How do statistics impact execution plans?
- What causes stale statistics?
- How does auto-update statistics work for large tables?
- When should AUTO_CREATE_STATS / AUTO_UPDATE_STATS be OFF?
- What is synchronous statistics update?
- How do filtered indexes work?
- How do filtered statistics differ from filtered indexes?
- What is the importance of key column order?
- What is the bookmark lookup?
- How does covering index reduce lookups?
- What causes index spool operations?
- What are missing index recommendations? Why are they risky?
- Explain hash indexes in memory-optimized tables.
- What is the difference between columnstore rowgroup & delta rowgroup?
- What is the significance of rowgroup trim threshold?
- Why do columnstore indexes switch to row mode execution?
- How does columnstore compression work?
- What causes dictionary corruption in columnstore?
- How do segment elimination and batch mode work?
- Explain online index rebuild internals.
- When is REORGANIZE better than REBUILD?
- Why can rebuilding indexes hurt performance sometimes?
- How do you detect unused indexes?
- How do you detect duplicate indexes?
- When should you use computed column indexes?
🟦 SECTION 4 — QUERY OPTIMIZATION & PERFORMANCE TUNING (35 Questions)
- What are the most important DMVs for performance troubleshooting?
- Explain SOS_SCHEDULER_YIELD waits.
- What causes THREADPOOL exhaustion?
- What causes ASYNC_NETWORK_IO waits?
- What is the difference between CPU-bound vs IO-bound workload?
- How to detect memory spills using execution plans?
- What is the significance of query_hash and query_plan_hash?
- How do you detect plans causing parameter sniffing?
- What is DBCC FREEPROCCACHE? When is it dangerous?
- How do you find top resource-consuming queries?
- How does Query Store capture regressions?
- What are key Query Store views for performance analysis?
- Compare forced parameterization vs optimize for ad hoc workloads.
- What is Cardinality Estimator feedback?
- What is a plan regression?
- How do multi-statement TVFs affect performance?
- How do scalar UDFs affect performance pre-2019 and post-2019?
- What is adaptive joins?
- What is memory grant feedback?
- How does row mode vs batch mode differ?
- How do you identify missing indexes with actual execution plans?
- Why does SQL Server choose nested loop join even when table is huge?
- What causes HASH MATCH (spills)?
- How does table variable behavior differ from temp tables?
- What is the impact of implicit conversions?
- Why do data type mismatches break index seeks?
- How do OPTION(RECOMPILE) and OPTION(OPTIMIZE FOR …) differ?
- What are XEvent sessions useful for?
- How to detect memory pressure using Ring Buffers?
- Explain physical join types and when each is used.
- What is feedback-driven optimization in SQL 2022?
- How do window functions affect performance?
- How do TOP and OFFSET-FETCH influence plans?
🟦 SECTION 5 — LOCKING, BLOCKING, DEADLOCKS & ISOLATION (25 Questions)
- Compare READ COMMITTED vs RCSI vs SNAPSHOT.
- What is TempDB version store?
- How do you identify the longest blocking chain?
- What is lock escalation and how to prevent it?
- What is a schema modification lock?
- What is a latch wait? How to diagnose?
- How do you fix PAGELATCH_UP contention?
- What is the difference between XACT_STATE and @@TRANCOUNT?
- How do you capture deadlocks via Extended Events?
- Why do deadlocks increase after adding indexes?
- How do you fix deadlocks caused by range locks?
- What causes SERIALIZABLE isolation deadlocks?
- What is an update conflict in snapshot isolation?
- What is the difference between LCK_M_S, LCK_M_X, LCK_M_IX?
- What is trace flag 1222 used for?
- What causes key-range locks?
- What is lock compatibility matrix?
- What is the impact of NOLOCK?
- Why do NOLOCK reads still block on schema locks?
- What is a retention lock?
- What causes HADR_SYNC_COMMIT waits?
- How does long open transactions impact log truncation?
- Why tempdb metadata can lock under heavy load?
- Detecting latch vs lock contention differences.
🟦 SECTION 6 — BACKUP / RESTORE / PITR / HIGH AVAILABILITY (30 Questions)
- What breaks the backup chain?
- When do differential backups stop being useful?
- How do striped backups improve performance?
- Explain tail-log backup and when it's needed.
- Restore sequence for full + diff + log backups.
- What is a piecemeal restore?
- What is TDE and why is certificate backup mandatory?
- How to restore a Page Restore scenario?
- How do you recover when log backups are missing?
- How to restore msdb without losing SQL Agent jobs?
- Explain Always On AG architecture.
- When does automatic seeding fail?
- What is synchronous commit vs synchronous-commit with automatic failover?
- What is primary replica vs secondary vs read-intent replica?
- Why does AG cause longer transactions under sync commit?
- What causes redo queue to grow?
- What are distributed AGs and when needed?
- How to diagnose AG failover issues?
- Difference between AG vs Log Shipping.
- Difference between AG vs Database Mirroring.
- What is quorum and node weighting?
- What causes AG “Not Synchronizing” errors?
- Explain automatic failover vs manual failover scenarios.
- How do you test AG failover without downtime?
- Log shipping: what causes secondary to lag?
- How to perform DR drill for 10TB database?
🟦 SECTION 7 — CORRUPTION, DBCC CHECKDB, REPAIR (25 Questions)
- Types of corruption SQL Server can detect.
- DBCC CHECKDB phases.
- How does DBCC CHECKDB detect physical corruption?
- What causes page checksum errors?
- When does torn page occur?
- What do you do when CHECKDB shows “repair_allow_data_loss”?
- What is DBCC CHECKFILEGROUP?
- How to recover a corrupt nonclustered index?
- How to interpret error 824 vs 825?
- What causes 824 errors?
- How do you isolate corruption to specific table/index?
- How do you restore only one table after corruption?
- How do you detect corruption early via alerts?
- How much overhead does CHECKDB introduce?
- Why does corruption often appear after power failure?
- What are “suspect pages”?
- How to view suspect pages from msdb?
- What is torn page detection?
- Why CHECKDB runs slower on large CLR or LOB objects?
- How do you detect corruption caused by storage subsystem?
🟦 SECTION 8 — TEMPDB, MEMORY, STORAGE, I/O (20 Questions)
- How to design optimal tempdb layout?
- What causes tempdb PAGELATCH contention?
- What is allocation bottleneck?
- How do you diagnose disk bottlenecks using DMVs?
- How to detect excessive checkpoint I/O storms?
- How to diagnose WRITELOG waits?
- What is indirect checkpoint?
- What is Buffer Pool Extension?
- How do you configure SQL Server for SSD/NVMe?
- How do you find which queries spill to tempdb?
🟦 SECTION 9 — SQL AGENT, AUTOMATION, MONITORING (15 Questions)
- What causes SQL Agent jobs to hang?
- How do you secure SQL Agent proxies?
- How to detect failed SQL Agent subsystems?
- How does SQL Agent behave in AG failover scenarios?
- Why do SQL jobs disappear after reinstalling SSMS?
- How to capture job execution time metrics?
- How do you audit job history automatically?
- How to implement retry logic for failed jobs?
- How do you handle SSIS package failures at scale?
- How do you monitor SQL Server using Extended Events?
🟦 SECTION 10 — SQL SERVER ON LINUX, DOCKER, KUBERNETES (15 Questions)
- Differences between SQL on Linux vs Windows.
- How does Active Directory authentication work on Linux?
- How do you configure persistent storage for SQL containers?
- What is mssql-conf?
- What are the limitations of SQL on Linux?
- How do you configure HADR on Linux-only clusters?
- How do you monitor SQL on containers?
- Why do permissions often fail on SQL Linux file paths?
- How do you diagnose I/O latency on Linux SQL Server?
- How do you secure SQL Server in Kubernetes?
🟦 SECTION 11 — CLOUD SQL SERVER (AZURE SQL / AWS RDS / MI) — 20 Questions
- Differences between SQL Server on VM vs Azure SQL DB.
- Why does Azure SQL throttle CPU even if low load?
- What is DTU vs vCore?
- How do you troubleshoot slow RDS failover?
- How do you secure SQL in Azure with Key Vault?
- How do long running transactions behave in cloud SQL?
- What causes log IO throttling in RDS SQL?
- What is the difference between Geo-replication vs Failover Groups?
- What limitations Azure SQL DB has vs SQL Server on VM?
- How do you migrate on-prem SQL to Azure SQL MI with near zero downtime?
🟦 SECTION 12 — REAL-WORLD SCENARIOS ROUND (MUST-KNOW) (70 Questions)
- SQL Server CPU 95% during peak — step-by-step diagnosis?
- Query slow only during business hours — where do you start?
- A log file is growing continuously — how do you stop it safely?
- Always On secondary lagging by hours — diagnosis steps?
- AG failover occurred but clients cannot reconnect — why?
- SQL Server taking 20 minutes to restart — what's happening internally?
- Index rebuild job increased blocking — what went wrong?
- Query uses index scan instead of seek — find the cause?
- Large ETL causing tempdb to explode — how to fix?
- SQL Server reports deadlocks every minute — root cause analysis?
- A table has millions of forwarded records — approach to fix?
- Statistics update caused plan regression — what's next?
- A query’s estimated rows = 1 but actual = millions — cause?
- Heavy waits on WRITELOG — how to tune?
- Always On automatic seeding keeps failing — diagnosis?
- AG replica stuck in “Not Synchronizing” — how to fix?
- Blocking chain has 120 sessions — approach?
- Tempdb 100% full — what immediate actions do you take?
- SQL Agent jobs stopped executing — where to check?
- Columnstore index performance degraded — why?
- High memory utilization but low workload — cause?
- SQL Server using only 50% CPU even under load — investigate?
- Query performs differently on primary vs secondary — why?
- Slow cross-database queries — approach?
- Database in “Recovery Pending” — how do you fix?
- Sudden plan regression after Cumulative Update — what to check?
- LOG SHIPPING COPY job failing — next steps?
- DBCC CHECKDB takes 6 hours — optimize?
- Log chain broken due to accidental full backup — handle?
- AG not failing over during outage — why?
- Frequent threadpool waits — diagnosis?
- Blocking caused by ghost cleanup — why?
- Backup performance degraded — how to fix?
- Deadlocks increased after schema changes — why?
- High PFS latch waits — what to do?
- Slow deletes due to page splits — fix?
- Slow merge replication — diagnosis?
- SQL Server memory dump generated — next steps?
- Columnstore delta stores huge — cause?
- Slow metadata queries on sys tables — why?
- SQL Server freezing during autoshrink — impact?
- Stale statistics on partitioned tables — fix?
- Foreign key cascading causing slow writes — how to redesign?
- Server fails to register Kerberos SPN — fix?
- SSIS package memory leak — troubleshooting?
- SQL Server login storms from app pool recycle — detect?
- Server shows “non-yielding scheduler” — analysis?
- Long waits on HADR_SYNC_COMMIT — root cause?
- Very slow linked server queries — fix?
- TempDB frequent autogrowth events — reason?
- Rebuild index blocking RCSI readers — why?
- Stack dump due to spinlock — diagnosis?
- Bulk insert extremely slow — analyze bottleneck?
- Weekly patching causes slow restart — root cause analysis?
- High latch waits on allocation maps — fix?
- Rewriting query fixes plan regression — why?
- SQL Server high memory but low workload — cause?
- High latency on data file reads — detect and fix?
- Wrap-around of identity column — safe repair?
- Credential corruption — resolve?
- SQL Agent crashes after Windows patch — root cause?
- Always On listener not registering DNS — troubleshoot?
- Temp table contention under parallel workloads — solve?
- Application using NOLOCK causing phantom reads — explain?
- Auto stats update kills plan stability — fix?
- Non-yielding resource monitor — diagnosis?
- Slow execution after cardinality estimator change — fix?
- Corrupted nonclustered index — recover?
- Identity jumped by 1000 — reason?
- Stored procedure slow first time, fast second time — why?
Comments
Post a Comment