The growth of organizations' data collections, driven by the development of new materials and advanced computing devices, puts database (Db) technology at the forefront. Owing to this popularity, many database management systems (DBMS) are available on the market to store one of an organization's most important assets: its data. Among the factors that may influence the choice of a DBMS (advanced features, interoperability, etc.), two stand out: the cost-benefit ratio produced by fierce competition between different software philosophies – proprietary (Oracle, MS SQL Server, IBM DB2, etc.) and public domain (PostgreSQL, MySQL, Firebird, etc.) – and the system's performance in critical computing environments.
Performance measurements in the computer systems area (processors, operating systems, compilers, database systems, etc.) are conducted through benchmarks (a standardized problem or test used as a basis for evaluation or comparison) widely recognized by the industry. Several benchmark consortia exist, each with specific evaluation criteria, metrics, pricing rules and result-reporting requirements; among them stand out the Open Source Database Benchmark (OSDB) (http://osdb.sourceforge.net), the System Performance Evaluation Cooperative (SPEC) (http://www.spec.org) and the Transaction Processing Performance Council (TPC) (http://www.tpc.org).
In the academic scope, the TPC benchmarks are widely recognized [1; 2; 3; 4; 5; 6; 7] due to their rigor in defining test implementations, price measurements and results reporting. The TPC began from two
In this chapter we present a study on Db performance by means of statistical techniques for the design and analysis of experiments (DoE), applied in the computing scope. The objective of this study is to use two DoE techniques (2^k full factorial and 2^(k-p) fractional factorial designs) to investigate the influence of different Db parameters on performance.
The chapter is structured as follows: Section 2 introduces benchmark technology, with emphasis on the TPC-H standard. The DoE methodology is introduced in Section 3, with an overview of full factorial and fractional factorial designs. Section 4 is devoted to a case study that investigates the influence of different Db parameters (from the PostgreSQL DBMS) on performance through the 2^k and 2^(k-p) DoE designs. Section 5 reviews related work, and Section 6 presents the final considerations.
2. Benchmark overview
Performance tests in the computing scope are a valuable tool to assist decision makers in choosing hardware and/or software settings. Such tests, usually called benchmarks, can help ensure that software does not present problems or unavailability due to insufficient resources (e.g., memory, processor, disk).
According to the Merriam-Webster dictionary (http://www.merriam-webster.com), a benchmark is “a standardized problem or test that serves as a basis for evaluation or comparison”.
Some generic benchmarks have become widely recognized (e.g., those of the TPC), such that vendors advertise the performance of new products based on their results. Benchmarks are important tools for evaluating computer system performance and price/performance. However, to be really useful, some key criteria should be considered when choosing and/or using a benchmark:
Relevance (must measure the peak performance and price/performance of systems when performing typical operations within that problem domain);
Portability (should be easy to implement on many different systems and architectures);
Scalability (should apply to small and large computer systems); and
Simplicity (must be understandable).
Benchmark software usually simulates scenarios from the real world (e.g., TPC-C, TPC-DS, TPC-E, TPC-H and TPC-VMS), where systematic procedures are performed to test, collect and analyze system performance. Its use is strongly recommended, because the cost of implementing and measuring specific applications on different systems/platforms is usually prohibitive.
2.1. TPC-H Standard
The TPC-H benchmark represents a generic decision support benchmark, whose main features are the capacity to manage very large amounts of data, the power to analyze them with a high degree of complexity and the flexibility to answer critical business questions.
Figure 1 illustrates the logical schema of the TPC-H specification and shows the business and application environment described in the TPC official documentation.
In this schema (Figure 1), the Db consists of eight tables simulating a realistic application involving customers, parts, line items, suppliers and orders. The prefix of each table's columns is shown in parentheses, the relationships between tables are represented by arrows, and the number/formula below each table name is the cardinality (number of rows) of the table, factored by the scale factor (SF). The SF determines the size of the raw data outside the Db (e.g., SF = 100 means that the sum of all base tables equals 100 GB).
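The scaling rules above can be sketched in a few lines. The row counts below follow the TPC-H specification's per-table scaling at SF = 1 (NATION and REGION are fixed-size and do not scale); treat the sketch as illustrative rather than a substitute for the official specification.

```python
# Sketch of TPC-H base-table cardinalities as a function of the scale factor (SF).
BASE_ROWS = {            # rows at SF = 1 (per the TPC-H specification)
    "supplier":    10_000,
    "part":       200_000,
    "partsupp":   800_000,
    "customer":   150_000,
    "orders":   1_500_000,
    "lineitem": 6_000_000,  # approximate: the exact count varies slightly per order
}
FIXED_ROWS = {"nation": 25, "region": 5}  # fixed-size tables

def cardinalities(sf: int) -> dict:
    """Expected row count of each TPC-H table for a given scale factor."""
    rows = {table: n * sf for table, n in BASE_ROWS.items()}
    rows.update(FIXED_ROWS)
    return rows

print(cardinalities(100)["orders"])  # ORDERS holds SF x 1,500,000 rows
```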
To be compliant with the TPC-H benchmark, the specification recommends that the Db be implemented using a commercially available DBMS, with support for queries and refresh functions against all tables 7 days a week, 24 hours a day (7×24).
The performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH), and it reflects multiple aspects of the system's capability to process queries. The TPC-H benchmark is composed of 22 ad hoc business queries and 2 refresh functions.
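The composite metric combines the benchmark's Power and Throughput components as a geometric mean (QphH@Size = sqrt(Power@Size × Throughput@Size)). A minimal sketch, with made-up component values for illustration only:

```python
from math import sqrt

def qphh(power: float, throughput: float) -> float:
    """TPC-H composite metric: the geometric mean of the Power and
    Throughput metrics measured at the same scale factor."""
    return sqrt(power * throughput)

# Hypothetical component values (not from any published result):
print(round(qphh(12_000.0, 9_000.0), 1))
```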
3. Design of experiments overview
In science, researchers' interest is focused on the investigation of systems (processes) in which, usually, there are several variables to analyze. Often, the investigations are centered on the individual changes produced by each variable, as well as on their interactions (Figure 2).
Traditionally, in an investigation, experiments are planned to study the effects of a single variable (factor) on a process. However, the combined study of multiple factors makes it possible to determine the main effects, as well as the interaction effects, among the factors underlying the process. DoE is a framework of statistical techniques designed so that the results produce valid and objective conclusions.
The DoE methodology has been used very successfully in the verification and improvement of processes and in the reduction of their variability, with impacts on costs and development time [9; 10]. A relevant class of DoE techniques is the factorial design, whose goal is to study and analyze the results (effects) produced by the multiple variables of a process.
A factorial design begins with the careful selection of a fixed number of levels for each factor. The experiments are then performed with all factor/level combinations. For example, if there are k factors, each at two levels, the full factorial design requires 2^k runs.
The default schema for designs with two levels uses the notation "–" (negative) and "+" (positive) to denote the low and high levels of each factor, respectively [9; 10; 11]. For example, a 2^2 design with factors A and B consists of the four runs (–,–), (+,–), (–,+) and (+,+).
3.1. Full factorial design
When all combinations of the factor levels are run the same number of times, the experiment is classified as a 2^k full factorial design, where k is the number of factors studied at two levels each.
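Enumerating the runs of a two-level full factorial design is straightforward; a minimal sketch using the −1/+1 coding described above:

```python
from itertools import product

def full_factorial(k: int):
    """All 2**k runs of a two-level full factorial design,
    coded -1 (low) / +1 (high) for each of the k factors."""
    return list(product((-1, +1), repeat=k))

runs = full_factorial(5)   # the five-factor design used later in the case study
print(len(runs))           # 32 treatment combinations
```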
3.2. Fractional factorial design
Fractional factorial designs represent a way in which only a fraction of the combinations required for the 2^k full factorial design is performed; such designs are denoted 2^(k-p), where 1/2^p is the fraction of the full design that is run.
A fractional design does not give full resolution; that is, certain interaction effects are confounded with (identical to) other effects. Fractional designs require a smaller number of runs than the full factorial design, but they implicitly assume that higher-order interactions do not matter. Therefore, interactions of order greater than two-way, in particular, could escape detection.
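A half fraction can be built by running a full factorial in all but one factor and deriving the last factor's column from a generator. The generator E = ABCD below is the textbook choice for a 2^(5−1) design (the source does not state which fraction was used in the case study, so this is an illustrative assumption):

```python
from itertools import product

def fractional_2_5_1():
    """Half fraction of a 2^5 design: run a full 2^4 in factors A-D and
    set E from the generator E = A*B*C*D (defining relation I = ABCDE).
    With this resolution-V fraction, main effects are confounded only
    with four-factor interactions."""
    runs = []
    for a, b, c, d in product((-1, +1), repeat=4):
        e = a * b * c * d          # generator column
        runs.append((a, b, c, d, e))
    return runs

runs = fractional_2_5_1()
print(len(runs))                   # 16 of the 32 full-factorial runs
```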
4. Case study
To illustrate the effectiveness of the DoE methodology, we apply it in the Db scope through a case study of the influence of parameters on Db performance. The case study is divided into two phases: the first comprehends a study with a 2^k full factorial design, and the second a study with a 2^(k-p) fractional factorial design.
Thus, this case study uses the PostgreSQL DBMS (version 8.4.11), with a 1 GB database populated with TPC-H data, and the following four queries:
Q10 – identifies customers who might be having problems with the parts that are shipped to them;
Q13 – seeks relationships between customers and the size of their orders;
Q18 – ranks customers based on their having placed a large quantity order; and
Q22 – identifies geographies where there are customers who may be likely to make a purchase.
In this study, the selected parameters (Table 5) intuitively seem significant to Db performance given the characteristics of the queries.
The PostgreSQL parameters (the experimental factors, Table 5) were set according to the suggestions in the PostgreSQL official documentation (http://www.postgres.org): the low-level values are the installation defaults, while the high-level values were customized according to the characteristics of the computing environment (a virtual machine running on an Intel Core™ i5 360 3.20 GHz CPU, with GNU/Linux i386 openSUSE 11.3, 2 GB RAM and a 50 GB hard disk).
4.1. Phase I – 2^k full factorial design
The experiments performed in this phase were structured with five factors at two levels (Table 5), resulting in a 2^5 full factorial design.
Table 7 presents the main effects of the factors, where each main effect is estimated as the difference between the average response (execution time) at the factor's high (+) level and at its low (–) level.
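The main-effect estimate just described can be sketched as follows (the factor column and execution times are toy values, not data from Table 7):

```python
def main_effect(levels, responses):
    """Main effect of one factor in a two-level design: the average
    response at the high (+1) level minus the average at the low (-1)
    level. `levels` is the factor's coded column; `responses` are the
    measured execution times, one entry per run."""
    hi = [y for l, y in zip(levels, responses) if l == +1]
    lo = [y for l, y in zip(levels, responses) if l == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

# Toy data: one factor column and execution times for four runs
col_a = [-1, +1, -1, +1]
times = [10.0, 6.0, 11.0, 7.0]
print(main_effect(col_a, times))   # -4.0: the high level cuts ~4 s off the mean
```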
We used analysis of variance (ANOVA) to assess the influence of the factors on Db performance. According to the ANOVA (Table 8), the effect of factor A is statistically significant.
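For a balanced two-level design, the sum of squares attributed to a factor in the ANOVA can be computed directly from its contrast; a minimal sketch (toy values, not the Table 8 data):

```python
def factor_ss(levels, responses):
    """Sum of squares for one factor in a balanced two-level design:
    SS = contrast**2 / N, where the contrast is the sum of the
    level-coded responses and N the total number of observations."""
    contrast = sum(l * y for l, y in zip(levels, responses))
    return contrast ** 2 / len(responses)

# Same toy data as the main-effect example: effect = -4.0, so SS = N*(effect/2)^2
col_a = [-1, +1, -1, +1]
times = [10.0, 6.0, 11.0, 7.0]
print(factor_ss(col_a, times))     # 16.0
```

In a full ANOVA this SS would be compared, via an F-test, against the residual mean square from the replicated runs.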
(Table 8 excerpt – total sums of squares per query, each with 95 degrees of freedom: 0.768, 4617.089, 0.143 and 278.100.)
After estimating the effects of each factor and analyzing them through the ANOVA, we applied a sensitivity analysis of the factors (Table 9), whose goal is to create a ranking of the factors. This methodology consists of normalizing the factors' effects and ranking the factors by the resulting sensitivity values.
For example, the sensitivity effects of factors A and B for query Q10 were estimated in this way, and all other sensitivity effects were estimated likewise (Table 9).
Once the sensitivity effects of the factors were estimated, the factors can be rated with respect to their range of influence (Figure 3), based on the number of factors studied (here, 5; Table 5). With this range, each factor spans 0.2 units of influence, and factors whose normalized effects fall in the same interval are assigned the same rank. For example, factors with sensitivity effects of 0.2 (factor E for query Q10) and 0.3 (factor A for query Q13) receive the same ranking.
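The normalize-then-bin scheme just described can be sketched as follows (the effect values are hypothetical, not taken from Table 9):

```python
def sensitivity(effects):
    """Normalize absolute effects to [0, 1] by the largest one
    (the 'sensitivity effect' of each factor)."""
    top = max(abs(e) for e in effects.values())
    return {f: abs(e) / top for f, e in effects.items()}

def rank(sens, k=5):
    """Bin sensitivities into k equal ranges (0.2 units wide for k = 5
    factors); factors in the same bin share a rank, and a higher rank
    means more influence."""
    width = 1.0 / k
    return {f: min(int(s / width) + 1, k) for f, s in sens.items()}

# Hypothetical effects for five factors:
effects = {"A": 0.50, "B": 0.05, "C": 0.12, "D": 0.02, "E": 0.15}
print(rank(sensitivity(effects)))  # factor A lands in the top bin (rank 5)
```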
The ranking of the factors' sensitivity effects is presented in Table 10. These results corroborate the ANOVA analysis (Table 8) and reveal that the statistically significant factors (e.g., factor A for queries Q10, Q18 and Q22) are the most sensitive (close to 1.0), while the marginally significant effects (factors C and E for query Q10) have low influence (close to 0.0). These factors also do not seem influential for query Q18.
Such observations are highlighted by the graphs of effects versus factor levels (Figure 4). They confirm the hypothesis that factor A is the most significant for queries Q10, Q18 and Q22. Visual inspection shows that the factors classified with low influence lie very close to the average (e.g., factors B, C, D and E for query Q18). The graphs also confirm that factors C and E are significant for queries Q13 and Q22.
4.2. Phase II – 2^(k-p) fractional factorial design
The second phase of this case study uses a 2^(5-1) fractional factorial design with the same five factors at two levels (Table 5).
As mentioned before, the experiments comprise two replicates performed in the same computing environment used in Phase I (Section 4.1). A sample of the experimental matrix with execution times (in seconds) – the average time of the query answers – is presented in Table 11.
Table 12 presents the effects of the factors for each query (Q10, Q13, Q18 and Q22). These effects were calculated with the same methodology used in Phase I (Section 4.1), but here the sum of the products of the levels (– and +) and the execution times is taken over the fractional design's runs only.
The influence of the factors was studied with ANOVA (Table 13). It is noteworthy that factor A is statistically significant for queries Q10, Q18 and Q22, but it does not seem influential for query Q13. On the other hand, factors C and E are statistically significant for query Q13. We also note that, with the fractional factorial experiments, there are no marginally significant factors for query Q10, so this query, like query Q18, has only one significant factor (factor A), while for query Q22 factors A and C are statistically significant.
(Table 13 excerpt – total sums of squares per query, each with 47 degrees of freedom: 0.354, 2464.116, 0.066 and 148.413.)
With the ANOVA results known, we also employed the sensitivity analysis to rank the factors (Table 14) according to the range of influence (Figure 3). The results reveal that the factors classified as statistically significant by the ANOVA (Table 13) (e.g., factor A for queries Q10, Q18 and Q22) are the most sensitive. It stands out that A is the only factor that seems influential for queries Q10 and Q18.
The graphs of effects versus factor levels (Figure 5) show that factor A is the most significant for queries Q10, Q18 and Q22. Visual inspection also shows that the factors classified with low influence lie very close to the average (e.g., factors B, C, D and E for queries Q10 and Q18). The graphs also highlight that factors C and E are significant for queries Q13 and Q22.
4.3. Analysis of results
In the present study, several experiments were performed using two DoE techniques (2^k full factorial and 2^(k-p) fractional factorial designs) to investigate the influence of Db parameters on performance.
Phase I (Section 4.1) comprehends a study with a 2^5 full factorial design.
In Phase II (Section 4.2) the same investigation was conducted, but using a 2^(k-p) fractional factorial design.
Comparing the results of both techniques, it is noteworthy that, even though the results are similar, the full design is more accurate. For example, factors classified as marginally significant by the ANOVA of the full design (Table 8) (i.e., C and E for query Q10) do not appear in the fractional design. However, despite its accuracy, the full design is more laborious and therefore requires more resources (depending on the number of factors). In any case, we can state that, in this study, both techniques proved effective for identifying and classifying the factors (parameters) that influence Db performance.
In this study, we assume that all queries have the same importance. Thus, intuitively, factor A seems to be the most influential parameter for Db performance.
5. Related work
A quick search of the contemporary literature reveals some works addressing the use of DoE in several scopes. In the computer science area, the use of DoE has been explored through a comprehensive study of techniques for software engineering experimentation. Other works [13; 14] are devoted to algorithmic optimization by means of DoE.
There are also many works approaching the subject of Db performance. For example, one work approaches the optimization of Db systems through the statement of a new problem, that of Web-based interactive applications. Another reports a performance study with different Db architectures and provides useful information for enterprise architects and database administrators in determining the appropriate Db architecture. Techniques to automate the setting of tuning parameters in specific software applications have also been proposed. The importance of best practices and of the database administrator's knowledge for autonomic Db tuning has likewise been pointed out. One work introduces an algorithm to select a small subset of Db statistics such that it improves the benefits over maintaining base-table statistics. Another argues that making Db systems truly self-tuning is a tall task, due to the different abstractions for workloads and the different constraints on the desired solution (e.g., the complexity of the internal components of the Db architecture). Yet another discusses a way to avoid the trial-and-error process of SQL tuning by choosing a small set of experiments in order to reach a satisfactory plan for executing queries.
The use of DoE techniques has also been formally explored in the Db scope. In [5; 6], database performance evaluation was studied through a statistical approach; the authors define a statistical methodology that may be useful to minimize the effort related to database tuning activities. Following this line, another study describes a software application whose purpose is to automate the configuration of database parameters by means of DoE.
In summary, in the Db performance area much effort is devoted to running tests and comparing results, but only a small portion of the studies uses the DoE methodology for the planning and analysis of the experiments.
6. Final considerations
This chapter presented a study investigating the influence of Db parameters on Db performance. Two DoE techniques (full factorial and fractional factorial designs) were applied in a case study with the PostgreSQL DBMS and the TPC-H benchmark, to assist in investigating how each parameter may influence Db performance.
For the case study, five parameters closely related to Db tuning were selected, and different experiments were conducted with the chosen DoE techniques. In Phase I, we studied the influence of the parameters through a 2^k full factorial design; in Phase II, through a 2^(k-p) fractional factorial design.
It should be noted that Db technology was used in this study as a vehicle to demonstrate how the DoE methodology can help in the design and analysis of experiments, and how it can serve as a tool in several scopes, such as the computing science field. We also emphasize that this study did not aim to close the subject of the use of DoE in the computing scope; instead, we sought to show the effectiveness of this methodology applied in this context. Thus, we conclude that the DoE methodology is a promising approach to assist in quantitative analyses, for example in the investigation of the parameters that influence Db performance.