Best solution for performance – to extract 400 over fields with varchar (300) for one single BI Data set

Posted on

Question :

I have this requirement to extract about 450 fields with most of it’s fields length with varchar(300) from multiple tables for one single BI report.

For better performance, I tried to create one single table to store all the required fields. However, it exceed the SQL server max row size and table was created with errors such as cannot insert or update the table since it exceeded max row size limit. As such, It seems like we have to split the tables.

Then I am afraid for the extract performance. During extract time, we also need to apply data level security. Do you have any idea for better design to get better performance?

Answer :

450 fields for a single BI report – that seems a lot, really cumbersome report

Extracting so much text from SQL Server will always be “a bit” slow when it comes to transferring those rows over the network (you will see big ASYNC_NETWORK_IO waits)

Best solution for performance would be to check with end users of the BI report – do they really need so many columns in a report ? Having a lot of data to chose from sometimes can be good, but I doubt end users will use all of them

Check with them what they really need, and it might happen that significant amount of fields really can be removed from the report, which will improve the performance


update: if you go by 1 single table route, to resolve the max row size limit issue, you either have to:

a) reduce the varchar columns from 300 to 10-15 symbols. Of course you can do it only if your actual data allows it (if the text values in each column are small). This way you actually can have 450 columns in a row

or

b) if your text data in varchar fields is actually bigger, then, it is actually pushed from row into off-row data and row can be bigger than 8060 bytes. BUT – here is also limitation. You can’t have more than ~335 such non-null varchar columns in a row (actual number will be less than 335). Read here for more details why you can’t have 450 bigger varchar columns in a row, even if data pushed off-row, here: Cannot create a row of size XXXX which is greater than the allowable maximum row size of 8060

Then, if you manage to reduce all varchar columns to 10-15 symbols, OR reduce number of columns to less than 335, you can create clustered columnstore index (CCI) on that single table.

CCI can improve reading speed from table significantly. But beware that depends on table usage patterns. There are a lot of gotchas with CCIs, and your situation is a tip of iceberg where a lot of things is not known to provide the precise solution

Leave a Reply

Your email address will not be published. Required fields are marked *