Home » Database Basics » 10 - Databases - A brief look at Security, Optimization, Monitoring, and Troubleshooting
10

Database Optimization

Make the database function as efficiently and effectively as possible

Optimizing a database means making it function as fast as possible while still ensuring accurate results. Basically, the database sits on a computer with a CPU that can only take a given load. When too many commands are executed on the database at once, if some commands are not designed well, or if commands take too long to execute or search through a table, the CPU becomes overloaded and the system becomes slow. This causes timeouts and error pages on the screens of user systems and should therefore be avoided at all cost. Optimization has to do with all aspects of a DBMS including query design, backups, table design, and even field size. As the number of requests per minute gets higher, in-depth optimization becomes inevitable. There are several good practices that result in optimal performance. Some of the key practices are outlined below:

  • All SQL statements that are sent to the engine should be examined and columns that are most often used to identify information (i.e., that appear after WHERE statements - normally IDs, names and so on) should be indexed.
  • Although normalization is a good thing, large tables that are frequently combined using queries should be de-normalized - JOINS are very expensive commands that take up time and CPU.
  • Using Triggers and Procedures stored in the DBMS itself in server side scripts instead of SQL statements will cause less of a load on the DBMS; sequences of SQL statements take more time to execute than triggers and only the name of the Procedure and its parameters are passed between the web and Database server rather than entire commands - the latter causes an increase in network traffic. Also, procedures may provide an extra layer of security by hiding underlying data objects and names.
  • Applications should allow asynchronous execution of SELECT queries and unrelated INSERTS and UPDATES. This means that two such queries should be allowed to run at the same time. This can improve performance because one query will not wait for the next to finish.
  • Client side data caching means putting a 'cache' of information that the user will request over and over again (such as a particular section in a catalog) over the course of a session on the client computer. Applications should be designed to do client side caching if most users have computers with a large CPU. Thus, the load on the data server will decrease - the client system will be used for data access.
  • WHERE clauses should be used where ever possible so that only relevant rows are returned to the client. Also, only relevant columns, rather than all columns using the '*' keyword, should be returned to the client. Some DBMS allow a TOP clause in SQL - that is, only the top ten or twenty records (this number is set in the TOP clause) are returned to the client. This is useful in making a query fast in the case of search engines, shopping catalogues and so on.
  • Queries should not be allowed to run indefinitely - a time out should be set for every query using the DBMS. If this time is exceeded, the query should stop running and an error message should be returned to the user.
  • Backing up a database is usually resource effective, still, backups of large databases should be performed on local drives first and sent to tape drives later. Doing the former is faster and will make the backup easier on the DBMS. Backups should be performed often to minimize the impact of data loss after restoring and at a time when little or no data access takes place.
  • Data typing and constraints, rather than rules or triggers, should validate and restrict information entered. Constraints are more reliable as they are built in rather than programmed.
  • The smallest data type possible should be used for each column while designing a table. For instance, some DBMS offer a 'bit' data type that is either 0 or 1. This, rather than integer, should be used for a column that contains gender or current/archived. A field that will only contain alphanumeric characters should be set to varchar rather than the more complex nvarchar type that allows unicode.
Screenshot 10b: Optimization helper in MS Access: User Input
Screenshot 10b: Optimization helper in MS Access: User Input

Screenshot 10c: Optimization helper in MS 

Access: MS Access Output
Screenshot 10c: Optimization helper in MS Access: MS Access Output