Suddenly the system started getting crash

Sometime back I blog about a project that I was engaged at the office. This system was initially started as a simple e-commerce site to manage the daily activities of an bathroom equipment retailer. So later based on the owners requirements the functionality got expanded and grew it into a CRM.

Currently the system consists of quote and order processing with a comprehensive refund process, stock , user task , supplier, manufacturer, category and product management, report generation and a well organized dashboard with summarized results of overall system activities.

Everything went smoothly until last week where the system came to a total stand still in few minutes. With some testing I found that once the user login and carry outs the regular activities, the MySQL server’s usage goes to 99.9% (checked it with command top). At that time I was totally lost because the system was on live run. ;-( . To understand the root cause I started by going through each and every functionality stating from the dashboard section and further dig into all the SQL statements, which mounted up to more than 300. It was a very painful task, where most the queries were SELECT statements that returned SELECT * resultset. So I had to select the columns that is only needed to display leaving out the rest. Also had to be very careful that this modification will not further degrade the existing functionality and some of the queries were used more than once within the system.

Next I make use of a profiler to measure the time for each page to load and it helped to identify the pages that took the longest time to give the results. The profiler used for this purpose was called the PQP. It had a simple interface with a comprehensive set of functionality and it was very easy to integrate into any web page. With that for every page I measured the

  • Page Execution Time
  • Memory Usage
  • Runaway Includes
  • Individual SQL query execution time

With that I was able to list down the pages and the SQL queries that contributed for most of the traffic. This series of testing concluded that four functionalities on the dashboard, couple more on the order details page and the main functionality on the reporting section was responsible for the overall performance issue.

Next I focused much of my effort and attention on the identified sections. Since I experienced a frequent crashing of the Firefox browser I decided to switched on to Safari browser. In Safari I found a tool that really suite my requirement at this point of time. It was called Web Inspect (Developer > Web Inspect ). It was a tools somewhat similar to Firebug but it also had profiler and resource analyzer inbuilt. So with that I managed to further investigate into each of the identified pages and there respective functionality. Along with few modification to existing PHP logic (by removing redundant database requests), introducing INDEXING to reduce execution time in database tables and finally decided to introduce archival mechanism to store historical data as a long term performance improvement. I must thank Ashoka Ekanayaka for his effort and guidance.