Question :
It is widely recommended NOT to use SELECT *
. I understand this, as it is not a good idea to bring the entire row into memory, when only a few cells are needed.
-
Is it still bad to use
SELECT *
when we want to read the entire row, or even 90% of it? -
Isn’t it benefitial when we read the whole row?
In normal condition, the server needs to look for each cells inSELECT col1, col2 FROM
in order of theSELECT
. But inSELECT *
, it simply reads in order of columns? The later should be easier and faster!
Answer :
In general SELECT * is bad, but like most rules that are exceptions.
-
Not really, there’ll be a minor performance hit but the simplified logic and compatibility more than make up for this.
-
I believe the SELECT * gets replaced with a list of all fields in the table which the DB engine has to look up before it can run the SQL, this is where the performance hit comes from. However, I’ve never seen an noticeable performance decrease and in my opinion the simplified maintenance and compatibility make up for this.
There are a couple of things to consider though:
-
Always use a WHERE clause (unless you REALLY want the entire table), this is a different issue to the SELECT * issue.
-
If you don’t control the schema, listing columns will keep network traffic as low and predictable as possible if someone adds a varbinary(max) field to the table for storing images/PDF file or whatever.
This seems to be one of those “rules” that have come about to stop people reading an entire 400+ column tables into memory.
In my experience, as long as the table is fairly small (in terms of columns) and you really do need all the data, let the highly-optimized DB take the hit while you concentrate on maintainability and compatibility and other benefits to yourself and your users.
The problem with this is maintainability. If you explicitly name your columns, and then alter the table to add a NULLABLE column to the end of the table, then none of your existing applications will be broken, i.e. you will be able to maintain backwards compatibility.
Another issue, as you have noted, is that if you change the order of the columns, again, you will not be able to maintain backwards maintainability.
Performance isn’t really an issue here, and you can quite easily use tools to generate the select statement for you so that you don’t have to spend time manually adding the column names.