Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Clustered Index Scan

Clustered Index Scan scans the entire clustered index (meaning the entire table, since clustered index stores all data). This is the most expensive operator in the execution plan (taking 97% of the cost).

What is Clustered Index Scan?

When SQL Server performs a Clustered Index Scan, it reads through every row in the clustered index. Since the clustered index contains the entire table data (the data rows are stored in the leaf nodes of the index), a scan means reading the entire table.

This operator is typically the most expensive because it reads all rows, not just a subset.

Common Use Cases

  • No suitable index for the WHERE clause
  • No WHERE clause at all (SELECT * FROM table)
  • Non-sargable WHERE clause (e.g., WHERE YEAR(date) = 2024)
  • Fetching too many rows (SQL Server estimates scan is faster than seek + bookmark lookup)

Parameters Explained

1. Storage

Storage: RowStore
  • Data is stored in row format (horizontal) - traditional storage
  • RowStore vs ColumnStore: RowStore is good for OLTP (many inserts/updates), ColumnStore is good for OLAP (aggregates on many rows)

2. Number of Rows - VERY IMPORTANT

Number of Rows Read: 19820
Actual Number of Rows: 19820
Estimated Number of Rows: 19820
Estimated Number of Rows to be Read: 19820

✅ GOOD POINT:

  • Actual = Estimated = 19,820 rows
  • Statistics are accurate, no cardinality estimation issues

Meaning:

  • SQL Server correctly estimates the number of rows to process
  • Query optimizer selects a plan appropriate for actual data

3. Cost - THE MAIN ISSUE

Estimated Operator Cost: 0.113973 (97%)
Estimated I/O Cost: 0.0920139
Estimated CPU Cost: 0.021959

⚠️ KEY POINTS:

MetricValuePercentageAssessment
Operator Cost0.11397397%❌ Very high
I/O Cost0.092013981% of operator❌ Reading from disk
CPU Cost0.02195919% of operator⚠️ Significant

Analysis:

  • This operator takes 97% of the total query cost
  • 81% of cost is I/O → reading data from disk, not from cache
  • THIS IS THE MAIN BOTTLENECK of the query

4. Execution Parameters

Number of Executions: 1
Actual Rebinds: 0
Actual Rewinds: 0
Ordered: False
ParameterValueMeaning
Executions1Runs only once (not in a loop) ✅
Rebinds0Not re-initializing parameters ❌
Rewinds0No loop join rewind ✅
OrderedFalseResult is not sorted by clustered key

5. Row Size

Estimated Row Size: 47 B
  • Each row ~47 bytes
  • Total data: 47 × 19,820 ≈ 931 KB (less than 1 MB)

⚠️ Paradox:

  • Data is only ~1 MB, but still scanning everything?
  • The table may have more than 19,820 rows but the filter only selects 19,820 rows without a suitable index

Comparison with Compute Scalar

OperatorCostI/O CostCPU CostRows
Clustered Index Scan0.113973 (97%)0.0920140.02195919,820
Compute Scalar0.001982 (2%)00.00198219,820

Observation:

  • Clustered Index Scan is 57 times more expensive than Compute Scalar
  • If the query is slow, the issue is here, not in Compute Scalar

How to Optimize

Method 1: Check if there’s a WHERE clause

-- Current query (assuming)
SELECT * FROM Orders
WHERE CustomerId = 12345  -- If there's no index on CustomerId

Solution: Create a non-clustered index

CREATE INDEX IX_Orders_CustomerId ON Orders(CustomerId)

After that, the plan will become:

Index Seek (NonClustered) → Key Lookup (Clustered) → Compute Scalar

Method 2: If query has no WHERE clause (SELECT all)

SELECT * FROM Orders  -- Fetching all 19,820 rows

Assessment:

  • Scanning 20,000 rows is acceptable (I/O cost 0.09 is small)
  • No optimization needed if data is small

Method 3: Check WHERE clause sargability

-- BAD (non-sargable)
WHERE YEAR(OrderDate) = 2024

-- GOOD (sargable)
WHERE OrderDate >= '2024-01-01' AND OrderDate < '2025-01-01'

Method 4: Use covering index to avoid Key Lookup

If the query only needs a few columns:

-- Instead of SELECT *
CREATE INDEX IX_Orders_CustomerId_Include 
ON Orders(CustomerId) 
INCLUDE (OrderDate, TotalAmount)

The plan will become:

Index Seek (NonClustered) → Compute Scalar
(No Key Lookup needed)

Overall Assessment

✅ Good Points:

MetricValueAssessment
Actual vs Estimated Rows19,820 = 19,820Statistics accurate
Executions1No loop
Row Size47 BSmall data

❌ Points to Improve:

MetricValueIssue
Operator Cost97%Takes almost entire cost
I/O Cost0.092Reading disk, not cache
StorageRowStoreAppropriate but scanning all

Questions to Determine If Optimization is Needed

  1. Does the query have a WHERE clause?

    • If yes: Need to create an index for the column in WHERE
    • If no: Scanning 20k rows is acceptable
  2. What is the total number of rows in the table?

    • If ≈ 20,000: Scan is OK
    • If >> 20,000 (e.g., 1 million): Scan is a serious issue
  3. How long does the query take?

    • If < 100ms: No optimization needed
    • If > 1s: Need to create an index

Summary

This Clustered Index Scan is the main bottleneck (97% cost).

Next steps:

  1. View query text (F4 or hover over operator) to see WHERE clause
  2. Check total rows in the table
  3. If there’s a valid WHERE clause → create non-clustered index
  4. If query actually runs slow (> 500ms) → optimize now
  5. If query is fast (< 100ms) and data is small → can leave as is

Principle: Clustered Index Scan on 20,000 rows is not always bad. What’s important is execution frequency and actual time.