Question :
I have a query as follows:
SELECT @tenor_from =CONVERT(DATETIME,MIN(spc.maturity_date),103)
FROM source_price_curve spc
INNER JOIN #source_price_curve_list spcl
ON spc.source_curve_def_id = spcl.price_curve_id
WHERE spc.as_of_date >= @as_of_date_from
It takes almost 12 seconds to run.Removing the temp table #source_price_curve_list from join gives the result in less than 1 second.
The source_price_curve
table has 130 million records. The temp table #source_price_curve_list
has one record in output(as given below).It may contain more data in future
select * from #source_price_curve_list
rowID price_curve_id
1 1
Why does the inner join to a one record temp table make the query take so much longer time? I need to reduce the query run time to less than 1 sec.
The execution plan of the query with the join is provided in the below link:
https://www.brentozar.com/pastetheplan/?id=BksaaWjSb
The query without the join to temp table runs in less than 1 sec.The execution plan link is as follows:
https://www.brentozar.com/pastetheplan/?id=BJ2_3boSZ
The table source_price_curve is created as follows:
CREATE TABLE [dbo].[source_price_curve](
[source_curve_def_id] [int] NOT NULL,
[as_of_date] [datetime] NOT NULL,
[Assessment_curve_type_value_id] [int] NOT NULL,
[curve_source_value_id] [int] NOT NULL,
[maturity_date] [datetime] NOT NULL,
[curve_value] [float] NOT NULL,
[create_user] [varchar](50) NULL,
[create_ts] [datetime] NULL,
[update_user] [varchar](50) NULL,
[update_ts] [datetime] NULL,
[bid_value] [float] NULL,
[ask_value] [float] NULL,
[is_dst] [int] NOT NULL,
CONSTRAINT [IX_unique_source_curve_def_id_index] UNIQUE NONCLUSTERED
(
[as_of_date] ASC,
[source_curve_def_id] ASC,
[maturity_date] ASC,
[is_dst] ASC,
[curve_source_value_id] ASC,
[Assessment_curve_type_value_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[source_price_curve] WITH NOCHECK ADD CONSTRAINT [FK_source_curve_def_id] FOREIGN KEY([source_curve_def_id])
REFERENCES [dbo].[source_price_curve_def] ([source_curve_def_id])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[source_price_curve] CHECK CONSTRAINT [FK_source_curve_def_id]
GO
The following indexes available in the table source_price_curve
(many were created long back and may not be used)
CREATE CLUSTERED INDEX [source_curve_def_id_index] ON [dbo].[source_price_curve]
(
[as_of_date] ASC,
[source_curve_def_id] ASC,
[maturity_date] ASC,
[is_dst] ASC,
[curve_source_value_id] ASC
)
CREATE NONCLUSTERED INDEX [source_price_curve_123] ON [dbo].[source_price_curve]
(
[source_curve_def_id] ASC,
[as_of_date] ASC
)
INCLUDE ( [maturity_date])
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_as_of_date_curve_source_value_id]
ON [dbo].[source_price_curve]
(
[as_of_date] ASC,
[curve_source_value_id] ASC
)
INCLUDE ( [curve_value],
[maturity_date],
[source_curve_def_id])
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_as_of_date_curve_source_value_id_Assessment_curve_type_value_id]
ON [dbo].[source_price_curve]
(
[as_of_date] ASC,
[curve_source_value_id] ASC,
[Assessment_curve_type_value_id] ASC
)
INCLUDE ( [curve_value],
[maturity_date],
[source_curve_def_id])
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_maturity_date] ON [dbo].[source_price_curve]
(
[maturity_date] ASC
)
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_source_curve_def_id_curve_source_value_id]
ON [dbo].[source_price_curve]
(
[source_curve_def_id] ASC,
[curve_source_value_id] ASC
)
INCLUDE ( [as_of_date],
[Assessment_curve_type_value_id],
[curve_value],
[is_dst],
[maturity_date])
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_source_curve_def_id111]
ON [dbo].[source_price_curve]
(
[source_curve_def_id] ASC,
[Assessment_curve_type_value_id] ASC,
[curve_source_value_id] ASC,
[maturity_date] ASC,
[as_of_date] ASC
)
I hope all these explanations will help to analyze! Thanks in advance.
Answer :
Why does the inner join to a one record temp table make the query take so much longer time?
Without the join, the optimizer is smart enough to work out that it can find the minimum value by reading one row from the end of the index.
Unfortunately, it is not currently equipped to apply the same sort of logic when the query is more complicated (with a join or grouping clause, for example). To work around this limitation, you can rewrite the query to compute local minimums per row in the temporary table, then find the global minimum.
Perhaps the easiest way to express this in T-SQL is to use the APPLY
operator:
SELECT
-- Global minimum
@tenor_from = MIN(MinMaturityPerCurveID.maturity_date)
FROM #source_price_curve_list AS SPCL
CROSS APPLY
(
-- Minimum maturity_date per price_curve_id
SELECT TOP (1)
SPC.maturity_date
FROM dbo.source_price_curve AS SPC
WHERE
SPC.source_curve_def_id = SPCL.price_curve_id
and as_of_date >= @as_of_date_from
ORDER BY
SPC.maturity_date ASC
) AS MinMaturityPerCurveID;
Good performance relies on there being many rows per price_curve_id
. You may need an index of the form:
CREATE NONCLUSTERED INDEX
[IX dbo.source_price_curve source_curve_def_id, maturity_date, as_of_date]
ON dbo.source_price_curve
(
source_curve_def_id,
maturity_date,
as_of_date
);
Your second query runs within a second because it does not have to go through each record in (130 million record table) and compare whether it matches the record from a temp table.
And there is not much you can do when you are using a temp table with a single record within. One solution would be to save it within a variable and use it in where condition without joining it, but you said temp table will contain more records.
Note that ,it does not necessarily mean that more records will increase your execution time. With more rows in temp table, query optimizer will use Hash Join which could possibly give you even better results.
However you could optimize your query like this:
SELECT CONVERT(DATETIME,MIN(spc.maturity_date),103) as [MinDate]
FROM source_price_curve spc
WHERE EXISTS(select 1 from #source_price_curve_list as spcl where
spcl.price_curve_id = spc.source_curve_def_id AND
spc.as_of_date >= @as_of_date_from)