Area optimization by MSA for 2D packing
The optimization techniques for integrated circuit (IC) layout design are important.Generally speaking, the basic process of modern hardware engineering includes designing, manufacturing and testing. IC layoutis an inevitable stage of designing before manufacturing. There are many applications which are directly related with layout optimization in practice, such asfloor planfor very-large-scale integration (VLSI) design, placementfor printed circuit board (PCB) design,packing for logistics management, and so on. In this research, we mainly focus on the optimization for three layout problems, which are 2D packing, 3D packing and 2D placement. The 2D/3D packing is to position different modules into a fixed shape, normally rectangular one, with area or volume minimization. The placement can be regarded as the packing problem with interconnect optimization. Since a general placement problem is NP-hard, there are no practical exact algorithms so far to be sure to find optimal solutions. As an alternative to get the optima, heuristics [1-6] are typically used to find near optimal solutionswithin a given runtime.
As product size keeps shrinking, product lifecycle keeps shortening and product complexity goes up, more electronic components will be integrated into a smaller IC chip or PCB with higher density and shorter time to market. At the same time, multi-objective optimization is common for IC/PCB layout in real product design, so another difficulty is the trade-off between conflicting objectives, such as low power and high performance. Pareto improvement for multiple objectives is one of the biggest challenges we have to face nowadays. The layout problem becomes much harder to find near-optimal or even acceptable solutions with high requirements. In orderto improve the best cases and mitigate the worst cases of IC/PCBlayout, it becomes increasingly criticaland urgent to improve the quality of solution and reduce runtime.
Simulated annealing based algorithm with a good representationfor2D/3D packing is one of the most popular ways to improve the quality of solution.On the one hand, many researches explored different representations [7-12], such as bounded-slice-line grid, sequence pair, FAST sequence pair, Q-sequence, selectedsequence pair, etc. In order to code and decode 3D-packing problem, sequence pair for 2D packing is extended to sequence triple and sequence quintuple, and it has been proved that sequence triple could represent the topology of the tractable 3D packing and there are at least one sequence quintuple which can be decoded to a topology as an optimal packing for volume minimization. But the effectiveness to improve solution quality and reduce runtime is quite limiteddue to huge solution space and complex solution distribution, even if a very good representation is used. The experimental results within a short runtime are still far from near-optimal solutions in real implementation to solve the packing problem.So it is the right time to explore new algorithms in order to solve 2D/3D problem more effectively.
There are many significant shortcomings of traditional heuristics for IC layout optimization. Let us take simulated annealing (SA) and genetic algorithm (GA) as an example. For SA, firstly, some slight modifications of solution are repeated to get a good convergence. Therefore, the global search is inefficient in general. It is disadvantageous to solve the problem with huge solution space, such as VLSI design. Secondly, SA does not use the past experience, including past good solutions and past moves, and it is a big informational waste. To speed up SA, some researchers  proposed two-stage SA for VLSI design. But the search speed is still quite slow, and it is not seriously considered to avoid or reduce informational waste. For GA, it evaluates too many candidates in order to get next generations. The evaluation takes too much runtime. Besides GA selects the next generation according to a ranking function, which is not always necessary but takes much time. So it is possible to improve the solution quality or reduce runtime if we can overcome the mentioned shortcomings.
In this research, a simulated annealing based approach [13-14], named mixed simulated annealing (MSA), is proposed to improve solution quality and reduce runtimeby overcomingthe shortcomings of inefficient global search and informational waste. Inmixed simulated annealing, a special crossover operator is designed to use a part of information from past good solutions and get higher improving efficiency, and the solutions gotten by the crossover are much better than random solutions.To evaluate the effectiveness and the reliability of the proposedmixed simulated annealing, we apply it to three mentioned optimization problems, i.e. 2D packing, 2D placement and 3D packing, and get considerable improvement for all three problems. The experimental results show the runtime, the packing ratio of area for 2D packing, the packing ratio of volume for 3D packing and two more objectives (low power and short maximal delay) for 2D placement are improved considerably by using MCNC, ami49_X and ami98_3D benchmarks. For example, the runtime of mixed simulated annealing with sequence quintuple representation is up to 4 times faster than that of 2-stage SA with the same representation, and the packing ratio of volume is improved by up to 12% within 100s runtime.
2. Simulated annealing for integrated circuit layout
Based onthe theory of statistical mechanics and the analogy between solid annealing and optimization problem, S. Kirkpatrick, C. D. Gelatt and M. P. Vecchi  proposed simulated annealing algorithm in 1983. The annealing is to heat up a solid with a very high temperature and then to cool it down slowly until it reaches or approaches its minimum energy state. Each state of solid represents a feasible solution of problem. The energy of the state is the value of cost function to evaluate the solution. The state with the lowest energy corresponds to the optimal solution with the best value of cost function. SA is a stochastic algorithm with iterative improvement. Each iterative step consists of changing current solution to a new solution, named a move to neighbourhood. The acceptance probability of new solutions depends on the current temperature, which is scheduled from the highest temperature to the lowest temperature. An important point we have to mention here is that, if the physical process is to cool the solid down very quickly, it is known as quenching, instead of annealing. The difference between normal simulated annealing and simulated quenching is the parameter setting of temperature scheduling.
In detail, let
In real implementation with a given finite runtime, we are using a fast geometric simulated quenching scheduling (
As shown in Figure 1, it is a typical flow chart of SA, which is used for layout optimization in this research. The initial solution is randomly produced or simply follows past layout designs. The temperature scheduling is used to change the current temperature (
For the solution representation, 2D/3D topology for IC layout is defined by the orthogonal coordinate system, as shown in Figure 2, and is representedby sequence pair for 2D general cases, sequence triple for 3D simple cases or sequence quintuple for 3D general cases. Each layout in 3D general cases is regarded as a set of the relations of relative location between boxes, i.e. “Top-Bottom”, “North-South” and “West-East” (TB-, NS- and WE-) relations. The coding and the decoding are based on TB-, NS- and WE- relation corresponding to the order of modules. In Figure 2, box
For the moving methods, let us take 2D placement as example. Three basic moving methods with small changes are designed to change the current solution by using the sequence-pair representation. The “rotation” changes the orientation of a module. The “exchange” exchanges the order of two modules in all sequences. The “move” changes the order of a module in one of sequences.The detail of each moving method will be discussed in section 6 and section 7.
For the cost function in the case of 2D placement, the total value (
For the temperature scheduling, the starting temperature
3. How to improve traditional simulated annealing
There are at least two shortcomings which impact the search speed of traditional SA. (1) Inefficient global search: In order to assure a final convergence effectively, the moving methods with relatively small changes should be used, so the global search within a short runtime is quite limited. Even using higher temperature, it is still slow to explore the huge search space of layout problem. (2) Informational waste: It does not use the information of past experience, including past solutions and past moves. It is quite possible to get a very good configuration of past solutions at the beginning but to lose it at last.
First of all, to overcome the shortcoming of inefficient global search, a two-stage algorithm is considered as follows. The first stage is named rough search, and the second stage is named focusing search. The rough search tends to big changes, such as crossover from genetic algorithm, to improve global search ability, while the focusing search tends to small changes, such as exchange, move and rotation, to get final convergence and reach better near-optimal solution.
Secondly, to overcome the shortcoming ofinformational waste, a special crossover operator from genetic algorithm, which reuses the information of past solutions, is considered. Comparing with random operator, the crossover operator has a search direction, which is based on the configuration from past good solutions, by use a part of configuration of the current best to reduce the informational waste. Besides, a guide with the probabilities to select running method adaptively according to the short-term improving speed is also considered.
In real implementation of 2-stage SA, the temperature scheduling of the second stage is same withthat of the first stageusing a geometric scheduling (
4. Mixed simulated annealing
By overcoming the mentioned shortcoming, we proposed a mixed simulated annealing (MSA) to speed up traditional simulated annealing (SA) and 2-stage SA. The basic idea is to improve the global search ability and to speed up the search process by a special crossover operator, which uses the information of past solutions. Just like SA and 2-stage SA, MSA is an iterative improvement method and a stochastic algorithm. The main difference between 2-stage SA and MSA is the special crossover operator to use a part of configuration of the current best and to reduce the informational waste. Although we can get a rough solution by producing solutions randomly,it is with low improving efficiency. The proposed crossover operator has a search direction, which is based on the configuration from past good solutions, to get high improving efficiency.
The intuitive comparison shows several intuitive advantage of MSA comparing with traditional SA and 2-stage SA. For global search ability, 2-stage SA is better than traditional SA due to big changes in the first stage, and MSA is even better than 2-stage SA due to the crossover operator and even bigger changes in the first stage. Traditional SA and 2-stage SA do not use past experience, while MSA is using past good solutions by using the crossover operator.
5. Application to integrated circuit layout optimization
The detailed IC design process includes system specification, architectural design, functional design, logic design, circuit design, layout design and verification. The layout is near the last stage of IC design, and it is a critical stage of electronic product development. As one of the key steps of IC layout, the placement has big impact on the overall quality of IC chip.
5.1. Problem definition
Let us start with the formulation of 3D placement. Let
In short, The input is a set of modules
The 3D-packing problem is a special case of 3D placement with no consideration of power and delay. The 2D placement problem is regarded as a special case of 3D placement with z=0, as shown in Figure 3. The 2D-packing problem is a special case of 3D packing with z=0. All of the mentioned three problems are formulated so far.
5.2. Problem representation
The original packing or placement is with infinite solution space. The coding and decoding method is generally needed to connect the problem and its representation. The solution space of a good representation should be finite. And the solutions after a good representation should be feasible and be better to include at least one optimal solution. Sequence quintuple can be used to represent a general 3D-packing topology, but the solution space of sequence quintuple is quite large. Furthermore, sequence quintuple representation is simplified to sequence triple representation, which can be decoded to a relatively simple 3D topology. In the case of 2D-packing topology, sequence pair, which can be simplified fromsequence triple, is used to represent a general 2D packing in this research.As an example, the coding from Figure 3 to Figure 6 can be gotten by using the coding-decodingtransition methodin Figure 4 and Figure 5, which are based onNorth-South and West-East relation corresponding to the order of modules in sequence pair as follows.
In order to get a positive sequence Γ+, each West-South corner of module connects to the West-South corner of the whole layout, and each East-North corner of module connects to the East-North corner of the whole layout without any intersection as shown in Figure 4. We can get a sequence (
Generally, a 3D-packing topology is defined by the orthogonal coordinate system (
We define the notation ofsequence pair, sequence triple and sequence quintuple as follows. Let (
In case ofsequence pair, the two sequences generate a finite solution space which includes at least one optimal solution of 2D packing for area optimization by decoding. Sequence pair defines (
It defines (
For a given packing with
In case ofsequence triple, it consists of three independent sequences
It defines (
It defines (
For a given packing with
However, sequence triple does not cover all kinds of topology of 3D packing. As shown in Figure 7, (
In case of sequence quintuple, it consists of five sequences
It defines (
It defines (
F5(mi) < F5(mj)
6. Moving methods for SA, 2-stage SA and MSA
According to section 5.2, we design the following moving methods.First of all, three basic moving methods,which are named by rotation, exchange and move, are used in the focusing search. Based on the basic methods, the group rotation and the group exchange are also used as two of moving methods in the rough search. They are repeatedly used the rotation and exchange operator with a given number. The groups are randomly selected modules for rotation or pairs of modules for exchange.
In detail, the rotation changes the orientation of a module. When a rotation is applied to module
The exchange moving method exchanges the order of two modules in
The move changes the order of a module in
7. Crossover operator for MSA
Besides, a special crossover operator is designed to generate a new solution from the current solution and the best solution so far in the rough search based on the representation in section 5.2. The margin and centre of the new solution (child) inherit the margin of the current solution (father) and the reversed centre of the best solution (mother), respectively. The reason to reverse the best solution is to get a different solution even two given solutions are the same solution.
For the detail of crossover, two sequences
To make it clearer,let us take an example to explain the crossover operator. As shown in Figure 11, the left layout is represented by
8. Objectives and cost function
To solve multi-objective problem, we are using the total cost function, which includes area of bounding rectangle for 2D case, volume of bounding box for 3D case, interconnect power and maximal delay. Especially, the interconnect power and themaximal delay are two typical conflicting objectives, which need to experiment carefully to satisfy the requirements in real product design.
For the multi-objective optimization of 2D placement in this research, three different objectives are defined by one formula as follow.
For power estimation, the dynamic power of a net
For performance estimation, the maximal delay among all nets is used. The delay is defined by the wire length of nets. To get the wire length estimation for each net, the half perimeter wire length (
For the objective of 2D packing, the area estimation is the minimum bounding rectangle including all modules, which is the total height
For the objective of 3D packing, instead of 2D case, the volume estimation is given by the minimum bounding rectangular parallelepiped including all modules, which is the total height
9. Experiment and comparison
To evaluate the effectiveness and reliability of the proposed MSA in practice, a set of experiments was implemented, comparing with traditional SA and 2-stage SA. In the case of 2D packing and placement, we are using ami49_X and MCNC benchmarks. The ami49_X is produced by duplicating ami49 circuit X times. In the case of 3D packing, ami98_3D benchmark is produced by inheriting the height and width of 2D ami49_2 benchmark and randomly getting the length between the given minimum and maximum dimensions. The implementation for 3D packing is to compare MSA with the mentioned 2-stage SA. MSA is implemented in Python environment on 2.16GHz PC with 3.00GB memory. For a fair comparison, SA and 2-stage SA is also implemented at the same machine. The maximum runtime is within 14,400s (4 hours) each time.
For area optimization of 2D packing, let γ be 1 and α+β be 0 in the cost function. As shown in Table 1, the best, average and worst cases among 50 trials are gotten. The comparison of solution quality and runtime between SA and MSA is gotten. MSA reduced near 21% runtime with better solution quality. As shown in Table 2, a near log-linear trend of average improvement rates from SA to MSA is gotten. That means MSA should be more suitable for the placement with a larger number of modules.
For interconnect optimization of 2D placement, let γ be 0 and α+β be 1 in the cost function. The experiment is using ami49_X benchmarks. To get the figures, α is randomly produced from 0.1 to 0.9. 240 solutions are tested for comparison. For all tested ami49_X with X from 1 to 12, block number from 49 to 588, and net number from 408 to 4896, the improved results are gotten. Figure 13 shows that MSA obtains at least 13% Pareto improvement with the constraint of less than 108.2% maximal delay. To get the worst cases, we tested more 120 solutions with α given by 0.1, 0.3, 0.5, 0.7, and 0.9. As shown in Figure 14, MSA got near 6% worst-case mitigation on average for the interconnect power with no degradation of maximal delay.
For volume optimization of 3D packing, we compare the computational performance of volume ratio using 2-stage SA and MSA. The results show considerable improvement from 2-stage SA to MSA. The improvement of packing ratio is between 2% and 7% for sequence triple representation. The improvement for runtime is up to 3 times, as shown in Figure 15. With regard to sequence quintuple representation, the experiment also shows the improvement from 2-stage SA to MSA. The improvement of packing ratio is between 3% and 12%. The improvement for runtime is up to 4 times with the sequence quintuple representation, as shown in Figure 16. The packing ratio of volume is improved by near 7% with less than 100s runtime, if we select MSA with sequence triple representation, instead of 2-stage SA with the same representation. The packing ratio of volume is improved by near 12% with less than 100s runtime, if we select MSA with sequence quintuple representation, instead of 2-stage SA with the same representation. In short, the overall solution quality and the runtime of MSA algorithm are better than these of 2-stage SA algorithm.
10. Conclusion and future work
In summary, the optimization techniques for integrated circuit (IC) layout design with large solution space are facing big challenges to get better solution quality with less runtime. In this research, a new simulated annealing based approach, named mixed simulated annealing (MSA), is proposed to solve three typical layout design problems, which are 2D packing, 2D placement and 3D packing, by using sequence pair, sequence triple and sequence quintuple representations. A new crossover operator is designed to reuse the information of past solutions and get high improving efficiency. Based on experiment, MSA improved both the best and the worst cases of 2D placement for interconnect power and maximal delay. For area minimization, MSA reduced computational runtime with the better solution quality, and a near log-linear trend of average improvement rates from SA to MSA is gotten for both solution quality and runtime. The overall quality of packing by MSA is normally better than the published results. For the volume minimization of 3D packing, MSA improved the solution quality (up to 12% better) and the computational time (up to 4 times faster). For the future work, the proposed MSA has potential to be extended to more general problems, such as 2D/3D packing or placement with rectilinear boxes.
The authors would like to thank Prof. Kunihiro Fujiyoshi at Tokyo University of Agriculture and Technology, to thankProf.Shuichi Ueno, Dr. Tayu, Mr. Shinoda, Mr. Inoue and Mr. Zhao at Tokyo Institute of Technology, and to thank Mr. Yamada, Mr. Ando and Mr. Ukon at Osaka University for their discussion, comment and support.