Our database has a master/detail pair of tables:
VisitTarget has one row for each sales transaction, and
VisitSale has one row for each product in each transaction.
The tables look like this (I've simplified them to only the columns relevant to this question):
-- "master" table
CREATE TABLE VisitTarget (
    VisitTargetId int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    SaleDate date NOT NULL,
    StoreId int NOT NULL
    -- other columns omitted for clarity
);

-- covering index for date-scoped queries
CREATE NONCLUSTERED INDEX IX_VisitTarget_SaleDate
    ON VisitTarget (SaleDate)
    INCLUDE (StoreId /*, ...more columns */);

-- "detail" table
CREATE TABLE VisitSale (
    VisitSaleId int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    VisitTargetId int NOT NULL,
    SaleDate date NOT NULL,   -- denormalized; copied from VisitTarget
    StoreId int NOT NULL,     -- denormalized; copied from VisitTarget
    ItemId int NOT NULL,
    SaleQty int NOT NULL,
    SalePrice decimal(9,2) NOT NULL
    -- other columns omitted for clarity
);

-- covering index for date-scoped queries
CREATE NONCLUSTERED INDEX IX_VisitSale_SaleDate
    ON VisitSale (SaleDate)
    INCLUDE (VisitTargetId, StoreId, ItemId, SaleQty, SalePrice /*, ...more columns */);

ALTER TABLE VisitSale WITH CHECK
    ADD CONSTRAINT FK_VisitSale_VisitTargetId
    FOREIGN KEY (VisitTargetId) REFERENCES VisitTarget (VisitTargetId);
ALTER TABLE VisitSale CHECK CONSTRAINT FK_VisitSale_VisitTargetId;
For performance reasons, we've partially denormalized by copying the most common filtering columns (e.g. SaleDate) from the master table into each detail row, and we added covering indexes on both tables to better support date-filtered queries. This works great for reducing I/O when running date-filtered queries, but I think this approach is causing cardinality estimation problems when joining the master and detail tables together.
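To illustrate the payoff, the denormalized SaleDate/StoreId columns let a date-scoped report run against the detail table alone, served entirely by the covering index, with no join back to the master table. A hypothetical query of that shape (the grouping column is just illustrative):

```sql
-- Hypothetical date-scoped report served entirely by IX_VisitSale_SaleDate;
-- no join to VisitTarget needed because SaleDate and StoreId are denormalized
SELECT vs.StoreId,
       Sales = SUM(vs.SalePrice * vs.SaleQty)
FROM VisitSale vs
WHERE vs.SaleDate BETWEEN '20170101' AND '20171231'
GROUP BY vs.StoreId;
```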
When we join these two tables, queries look like this:
SELECT vt.StoreId, vt.SomeOtherColumn,
       Sales = SUM(vs.SalePrice * vs.SaleQty)
FROM VisitTarget vt
JOIN VisitSale vs ON vt.VisitTargetId = vs.VisitTargetId
WHERE vs.SaleDate BETWEEN '20170101' AND '20171231'
  AND vt.SaleDate BETWEEN '20170101' AND '20171231'
-- more filtering goes here, e.g. by store, by product, etc.
GROUP BY vt.StoreId, vt.SomeOtherColumn
The plan for these kinds of queries looks like this:
As you can see, the cardinality estimate for the join is more than 4x too low, which causes performance issues (e.g. spilling to tempdb), especially when this query is used as a subquery within a more complex query.
But when I checked the row-count estimates for each branch of the join, they were close to (and a little bit higher than) the actual row counts. Hash bucket distribution also looks good. These observations suggest to me that statistics are OK for each table.
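In case it helps, this is roughly how I inspected the statistics (the index names are from the DDL above; `sys.dm_db_stats_properties` is available on 2014):

```sql
-- Histogram and density vector for the SaleDate statistics on both tables
DBCC SHOW_STATISTICS ('dbo.VisitTarget', 'IX_VisitTarget_SaleDate');
DBCC SHOW_STATISTICS ('dbo.VisitSale',   'IX_VisitSale_SaleDate');

-- When each statistics object was last updated, and over how many rows
SELECT OBJECT_NAME(s.object_id) AS table_name, s.name AS stats_name,
       sp.last_updated, sp.rows, sp.rows_sampled, sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id IN (OBJECT_ID('dbo.VisitTarget'), OBJECT_ID('dbo.VisitSale'));
```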
At first I thought the problem was that SQL Server assumes the SaleDate columns in the two tables are independent, when in fact they are identical. So I tried adding an equality comparison on SaleDate to the join condition or the WHERE clause, e.g.
ON vt.VisitTargetId = vs.VisitTargetId and vt.SaleDate = vs.SaleDate
WHERE vt.SaleDate = vs.SaleDate
This didn't work. It even made the cardinality estimates worse! So either SQL Server isn't making use of that equality predicate, or something else is the root cause of the problem.
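One experiment I can run per query is switching cardinality estimators, since 2014 at compatibility level 120 uses the new CE by default and its correlation assumptions differ from the legacy model. If I understand the documented trace flags correctly, 9481 forces the legacy (pre-2014) estimator and 2312 forces the new one; comparing the two plans should show whether the CE version matters here:

```sql
-- Same join, but forcing the legacy (pre-2014) cardinality estimator,
-- to see whether the estimate at the join operator changes
SELECT vt.StoreId, vt.SomeOtherColumn,
       Sales = SUM(vs.SalePrice * vs.SaleQty)
FROM VisitTarget vt
JOIN VisitSale vs ON vt.VisitTargetId = vs.VisitTargetId
WHERE vs.SaleDate BETWEEN '20170101' AND '20171231'
  AND vt.SaleDate BETWEEN '20170101' AND '20171231'
GROUP BY vt.StoreId, vt.SomeOtherColumn
OPTION (QUERYTRACEON 9481);  -- 2312 would force the new CE instead
```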
Got any ideas for how to troubleshoot and hopefully fix this cardinality estimation issue?
If it matters, we're running SQL Server 2014 Enterprise SP2 CU8 build 12.0.5557.0 on Windows Server. There are no trace flags enabled. We see the same behavior on multiple different SQL Servers, so it seems unlikely to be a server-specific problem.