In my previous Blog I have updated about poor coding practices which drag the performance of system, in the same series I am adding the article specifically on SQL mistakes which drag the system performance

8) Improper filter: It is very common mistake where developer feels entire database is at their mercy. In the traditional transactional database developer puts a check that same transaction should not process twice when other systems are retrying. For this here are queries written by developer:

Select * from transaction_table where transaction_id=?123453?;

Now let?s analyze above query critically here are some basic flaws:

  1. User should not give the Select * which is very costly
  2. While checking the transaction developer should use the date filter within close range because if it searches entire table for the transaction it will use high I/O and query will be very slow and impact the transactional system performance.
  3. If it is partitioned table use the query to select specific partition(s) to fetch the data for faster query response.

Solution: Select only the required columns which are necessary to fetch.

Select column1,column2 from transaction_table where transaction_id=?123453?; -- Only required column should be selected

9) Self Join: This is very frequent problem that developer make self-join on transaction table to show different columns in same row.

Select a.coulmn1, a.column2, b.column1 and b.column2 from table1 a, table1 b and; and a.type='ABC'and b.type='BCD'

Now let?s analyze above query critically here are some basic flaws:

  1. Once you go for the two tables join it is all the columns are self-joined.
  2. Cartesian product is created for the same table and more over it requires high IOPS.
  3. The cost of query is very high and it penalize the data fetch and performance of system.

Generally this is done by developer to use when there are multiple rows of same transaction and have different value and they want to show this in same row. Instead of doing this developer should write separate query and then manipulate these data in programming language.

10) ?Having Clause?: This is very frequent problem that developer use having clause in the query:

Select t1.column1, t1.column2, sum (t1.column3) from table1 t1 where t1.column1>?XYZ? group by t1.column1, t1.column2 having sum (t1.column3)>1000;

Let?s see what happens when we write such queries.

  1. Once you group large table it sort the table based on the columns. So index should be created on these columns.
  2. For applying the having clause entire result set is loaded in memory.
  3. Filter from this result set based on having clause.

Solution: It is better if we take the entire data using cursor and filter in and transactional code in iterative of the program. Like in PLSQL. use the cursor and define the variables to filter out the data.

11) ? Should not use where clause on ?Null? column?: Developer/DBA should not write a query on those columns where null value is possible. When comparison starts on column and find the value as null it slows down the performance on query. Also from the rule of normalization it is not recommended to query search on null column. These columns are only used to store and display information not for the search.

The expression column=null will evaluate to UNKNOWN and according to the three valued logic NOT UNKNOWN also evaluate UNKNOWN and query can return empty set.

Solution: In such cases either make this column not nullable and define the default value of the column. Then this will have better plan and will always return the proper result set.

12) Using ?IN? clause as sub query: Often developers write query like this.

Select * from table1 as t1 where t1.empcode in ( select t2.empcode from employee where salary>100);

This query can behave in any manner let?s analyze the same:

  • If the no of employees who have salary >100 are few means 10-15 this query may work well that means in test environment it may work.
  • But in actual it will have high number of records so sub query will return large number of employeecodes therefore the IN clause will be very big and takes time to finish the query.
  • Once any value is used to compare with IN clause then it compare the column value with each of the in clause and this may kill entire system performance.

Solution: write a program with cursor and use the iterative cursor is one option but better option is use the exits clause in such cases wherever is possible.

13) Function on Index columns: Often developers use the functions on indexes columns. Here are few examples:

Select customer_name from customer where Left(customer_name,1)=?A?;

Select * from employee where trunc(joining_date,?Month?)=?JANUARY?;

In the above queries index will not be used because function has been used therefore this will create plan with full table scan.


SQL Mistakes which Drag System Performance Java, JDBC, Hibernate Part1

2 Thoughts on “SQL Mistakes which Drag System Performance Java, JDBC, Hibernate Part2

  1. Maninder on November 28, 2014 at 1:16 pm said:

    Check for new feature of DBIM in Oracle R12c

  2. Pingback: Designing the Scalable System

Post Navigation