Currently, I have a table named `base_voter` containing roughly 100 million rows of dummy data. I have a stored procedure as follows:
```sql
CREATE PROCEDURE [dbo].[spTestingBaseVoter]
    @SortColumn    NVARCHAR(128) = N'name_voter',
    @SortDirection VARCHAR(4)    = 'asc',
    @offset        INT,
    @limit         INT
AS
BEGIN
    SET NOCOUNT ON;

    -- reject any invalid sort directions:
    IF LOWER(@SortDirection) NOT IN ('asc', 'desc')
    BEGIN
        RAISERROR('Invalid parameter for @SortDirection: %s', 11, 1, @SortDirection);
        RETURN -1;
    END

    -- reject any unexpected column names:
    IF LOWER(@SortColumn) NOT IN (N'name_voter', N'home_street_address_1', N'home_address_city')
    BEGIN
        RAISERROR('Invalid parameter for @SortColumn: %s', 11, 1, @SortColumn);
        RETURN -1;
    END

    --SET @SortColumn = QUOTENAME(@SortColumn);

    DECLARE @sql NVARCHAR(MAX);

    SET @sql = N'SELECT id, name_voter, home_street_address_1, home_address_city
                 FROM dbo.base_voter WITH (NOLOCK)
                 WHERE deleted_at IS NULL';

    SET @sql = @sql + N' ORDER BY ' + @SortColumn + ' ' + @SortDirection
                    + N' OFFSET @OF ROWS FETCH NEXT @LIM ROWS ONLY';

    EXEC sp_executesql @sql, N'@OF int, @LIM int', @OF = @offset, @LIM = @limit;
END
```
To make the query faster, I have also created an index as:
```sql
CREATE NONCLUSTERED INDEX FIBaseVoterWithDeletedAt
ON dbo.base_voter (name_voter ASC, home_street_address_1, home_address_city)
WHERE deleted_at IS NULL;
```
By creating this non-clustered index I have reduced the query time drastically. However, the improvement does not carry over to queries with a higher offset.
For example with:
```sql
EXECUTE spTestingBaseVoter 'name_voter', 'asc', 9999950, 50;
```
Is there something I am doing wrong that is causing this performance issue? Or is it necessary to create another index in descending order?

Let me know if there is a better way to tackle this situation that might decrease the query time drastically.
Estimated Execution Plan
Answer based on comments originally left by Dan Guzman:
OFFSET isn't magic: SQL Server still has to read and discard every skipped row, so queries get progressively slower as the offset grows. Also, you need a separate index for each column to be ordered by, but SQL Server can scan each index either forward or backward, so you don't need additional permutations for descending order.
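To illustrate the forward/backward point, the single ascending index from the question can serve both sort directions; a sketch, assuming the table and filtered index defined above:

```sql
-- Both queries can be satisfied by FIBaseVoterWithDeletedAt:
-- the ASC version uses a forward index scan, the DESC version a backward scan.
SELECT id, name_voter
FROM dbo.base_voter
WHERE deleted_at IS NULL
ORDER BY name_voter ASC;

SELECT id, name_voter
FROM dbo.base_voter
WHERE deleted_at IS NULL
ORDER BY name_voter DESC;
```

One caveat worth knowing: backward range scans in SQL Server cannot use parallelism, which may matter for very large scans.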
Instead of row-number pagination, you could use keyset (key-based) pagination: pass in the last retrieved values of the ordering column and the primary key, and have the query specify `SELECT TOP (n) … WHERE` with a filter that seeks past that anchor. That would allow users to scroll forward (and backward, with similar logic). If you must do row-number pagination, do it in a caching layer.
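A minimal sketch of a forward keyset page, assuming sorting by `name_voter` with `id` as the uniquifying tiebreaker (the variable names and data types are illustrative; match them to your actual schema):

```sql
-- Anchor values taken from the last row of the previous page
-- (types are assumptions; use your real column types).
DECLARE @PageSize INT           = 50;
DECLARE @LastName NVARCHAR(128) = N'Smith, John';
DECLARE @LastId   INT           = 123456;

SELECT TOP (@PageSize)
       id, name_voter, home_street_address_1, home_address_city
FROM dbo.base_voter
WHERE deleted_at IS NULL
  AND (   name_voter > @LastName
       OR (name_voter = @LastName AND id > @LastId))
ORDER BY name_voter ASC, id ASC;
```

With an index leading on `name_voter` (the clustered key `id` is carried in the nonclustered index automatically, assuming `id` is the clustered primary key), this seeks directly to the anchor instead of reading and discarding 9,999,950 rows, so page 1 and page 200,000 cost roughly the same.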
See Optimising Server-Side Paging – Part I and T-SQL Querying: TOP and OFFSET-FETCH (sample chapter) by Itzik Ben-Gan.
An anchor filter is required to avoid a progressively larger scan, which is what makes returning a page deep into a large result set like yours so costly. I've used this method on multi-billion-row tables with sub-second response times, albeit back in the SQL 2000 days, and I don't think users ever scrolled to the end.
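For completeness, the backward ("previous page") variant reverses both the anchor comparison and the sort, then re-sorts the page for display; a sketch under the same illustrative assumptions as the forward query:

```sql
-- Anchor values taken from the FIRST row currently shown
-- (names and types are assumptions, not the questioner's code).
DECLARE @PageSize  INT           = 50;
DECLARE @FirstName NVARCHAR(128) = N'Smith, John';
DECLARE @FirstId   INT           = 123456;

SELECT id, name_voter, home_street_address_1, home_address_city
FROM (
    SELECT TOP (@PageSize)
           id, name_voter, home_street_address_1, home_address_city
    FROM dbo.base_voter
    WHERE deleted_at IS NULL
      AND (   name_voter < @FirstName
           OR (name_voter = @FirstName AND id < @FirstId))
    ORDER BY name_voter DESC, id DESC   -- walk backward from the anchor
) AS prev_page
ORDER BY name_voter ASC, id ASC;        -- restore display order
```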