I am not a pro in the database sector so pardon me if i am asking a basic question.
I have a table in which all the recorded are stored chronologically.
Number of rows of that table is 1,445,248 (1.4 Million).
I want to optimize my query which looks like following: –
Select * from SalesInvoice where S_Date Between 20190401 And 20190501
- Yes, i need all the columns in the table and instead of typing out all the columns, i just used the all symbol (astrix).
- S_Date is stored as numeric instead of DateTime (because it was developed a long time ago in this format)
- The total time taken to execute this query is only 4 seconds but because of some calculations performed it is looping through ~80-90 time.
I want to know, is there any way through which i can start the searching of rows from Nth position.
Edit – Currently i am using SQL Server 2008 R2 but i will switch to MySQL that is why the query is written in SQL Server 2008 R2 format.
Step 1: Get first index of data by following query: –
Select Top 1 Id from SalesInvoice where S_Date Between 20190401 And
20190501 order by S_Date ASC
Step 2: Start Searching from the returned Id lets say 500,000
My Assumption if this technique exists
According to me, this technique might save table scanning time by starting the search from that ID which removes scanning of 500,000 rows.
I don’t know how to start the lookup from a particular ID.
If anyone can provide me with a code snippet that will be very helpful
As for “all the recorded are stored chronologically”, the only control you have over this is the
PRIMARY KEY in InnoDB.
A simple rearrangement will provide the optimal access:
PRIMARY KEY(S_Date, id), -- to order the data in the order desired INDEX(id) -- to keep AUTO_INCREMENT happy
TOP 1 is not a MySQL construct; see
LIMIT. But it is not needed with the above change.)
(The technique above is even better than INDEX(S_date).)
In MySQL, the data rows are ordered by the
PRIMARY KEY. Said another way: The
PRIMARY KEY is “clustered” with the data. This implies that you you fetch the rows in PK order, you will be fetching the data in the most optimal order.
With the change I proposed, the data will be ordered by
S_Date (with dups ordered by
id). This is exactly the order you want.
To change the current table definition, do
ALTER TABLE SalesInvoice DROP PRIMARY KEY, ADD PRIMARY KEY(S_date, id), ADD index(id);
The auto_increment column must occur first in some index. This allows the ‘next’ value can be efficiently found after a restart.
start the searching of rows from Nth position.
This performs such, but not efficiently:
ORDER BY ... LIMIT 1000 OFFSET 20000
will skip (that is, read and discard) 20K rows, then deliver 1000.
It is more efficient to “remember where you left off”. With
PRIMARY KEY(id), it is straight-forward and efficient:
WHERE id > $leftoff ORDER BY id
With a composite index it gets messier, but still possible. More discussion: http://mysql.rjweb.org/doc.php/pagination
I have tried this in my local and it is working fine, hope it will work for you.
SELECT Id FROM SalesInvoice WHERE S_Date BETWEEN '2019-04-01 00:00:00' AND '2019-05-01 00:00:00' ORDER BY S_Date ASC, Id ASC LIMIT 1;
Result: 500000 (Assumed)
EXPLAIN -- Remove this when you actually want result. SELECT * FROM ( SELECT ROW_NUMBER() OVER(ORDER BY Id) AS RoNum , Id -- Add any fields needed here (or replace ID by *) FROM SalesInvoice WHERE Id > 500000 ) AS tbl WHERE S_Date BETWEEN '2019-04-01 00:00:00' AND '2019-05-01 00:00:00' ORDER BY tbl.S_Date ASC, tbl.Id ASC;