Faster SoC Design Cycles with Celaro, LSF and Sun
The problem
To enable rapid prototyping in semiconductor development, the industry increasingly relies on hardware emulation of system-on-chip (SoC) designs during the development process.
Celaro from Mentor Graphics is one of the best-known emulation systems. Before a design can be run on the emulator, all relevant information (ASIC net list, the included memories, configuration files) must first be compiled into an emulator-readable format.
This compilation runs on Sun Blade workstations. The compile process can be distributed across several CPUs in a multiprocessor system, across several hosts in a network, or across a combination of both.
In a current project, a very large net list of a telecommunications IC with approx. 8.5 million gates had to be verified. Compiling this net list took 18 hours on a Sun Blade 1000 uniprocessor workstation with a 750 MHz CPU and 2 GB RAM. This long run-time was not acceptable to the developers: after every modification of the net list, a new compile run would have taken more than one and a half working days before debugging of the design could continue.
The solution
Together with the Tübingen-based IT specialist science + computing, an optimized IT infrastructure for SoC design compilation was developed to speed up job processing. For many years, science + computing has distributed, supported and customized the LSF software from Platform Computing (headquarters: Toronto, Canada) for the automotive and EDA industries. LSF manages, for example, the utilization of clusters by automatically distributing individual computing jobs to those systems in the network that are best suited for the task. In addition, it ensures that each workstation operates at full capacity.
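The idea of sending each job to the best-suited system can be illustrated with a toy placement rule. The following Python sketch is not LSF's actual scheduling algorithm; the host names, load values and memory figures are invented for illustration, and the rule (among hosts with enough free memory, pick the least loaded one) is only an assumption about what "best suited" could mean.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str           # hypothetical host name
    cpu_load: float     # 0.0 (idle) .. 1.0 (fully busy)
    free_mem_gb: float  # main memory currently available


def pick_host(hosts: List[Host], mem_needed_gb: float) -> Optional[Host]:
    """Return the least-loaded host with enough free memory, or None."""
    eligible = [h for h in hosts if h.free_mem_gb >= mem_needed_gb]
    return min(eligible, key=lambda h: h.cpu_load) if eligible else None


# Invented example cluster: two small workstations and one large-memory host.
hosts = [
    Host("blade1", cpu_load=0.9, free_mem_gb=0.5),
    Host("blade2", cpu_load=0.1, free_mem_gb=0.8),
    Host("bigmem", cpu_load=0.4, free_mem_gb=3.5),
]

small_job = pick_host(hosts, mem_needed_gb=0.5)
large_job = pick_host(hosts, mem_needed_gb=3.0)
print(small_job.name)  # blade2
print(large_job.name)  # bigmem
```

In this toy setup, a small job lands on the idle workstation, while a memory-hungry job can only be placed on the large-memory host, whatever its current load.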
Dirk Hansen, application engineer at Mentor Graphics and responsible for Celaro, visited science + computing in Tübingen. Within two days he learned how to configure LSF optimally and, together with the Tübingen specialists, installed and configured LSF for his requirements. Thanks to the extensive experience of science + computing, only one more day of consulting at Mentor in Munich was required.
The results
Result 1: An LSF-administered cluster of six Sun Blade 1000 workstations (six 750 MHz processors, a total of 5 GB RAM distributed across the six workstations) now compiled the database in 10 hours. The jobs that could be distributed with LSF ran considerably faster thanks to distributed processing.
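The gap between six processors and a speed-up of only 1.8 (18 hours down to 10) can be read through Amdahl's law, T(n) = T(1) · (s + (1 − s)/n), where s is the share of the run-time that does not parallelize. The sketch below solves this for s using the figures reported above. It is only a rough estimate: it assumes that the run-time depends on the processor count alone and ignores the different memory configurations of the two setups.

```python
def serial_fraction(t_single: float, t_parallel: float, n: int) -> float:
    """Solve Amdahl's law T(n) = T(1) * (s + (1 - s) / n) for s."""
    ratio = t_parallel / t_single
    return (ratio - 1.0 / n) / (1.0 - 1.0 / n)


# Figures from the text: 18 h on one workstation, 10 h on six.
s = serial_fraction(t_single=18.0, t_parallel=10.0, n=6)

# Under this model the run-time approaches T(1) * s as n grows.
floor_hours = 18.0 * s

print(f"estimated serial share: {s:.0%}")
print(f"run-time floor: {floor_hours:.1f} h")
```

Under these assumptions, a little under half of the original run-time would be serial, and adding processors alone could not bring the compile below about 8.4 hours.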
Result 2: The compile process on a single Sun Blade 1000 uniprocessor workstation with 4 GB RAM still needed 11 hours. Evidently, the process contained serial jobs that could not be parallelized. The only way to accelerate these jobs was a very large main memory. Normally, it is not financially justifiable to install 4 GB or more of main memory on every developer's desktop, above all because this investment would remain unused most of the time. It therefore makes sense to install this resource in the backend and to make it available to the developers centrally over the intranet, together with other resources, via middleware such as LSF. For a final test, Dirk Hansen networked the six Sun Blades of his LSF cluster with the workstation that had 4 GB of main memory.
Result 3: The run-time of the job dropped drastically to 5.5 hours. It is now possible to compile again on the same day after correcting a mistake in the design. Debugging can also continue without interruption: a compile run started in the evening reliably delivers a new net list in time for work to continue the following day.
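The reported run-times can be put side by side as speed-ups relative to the 18-hour baseline. The short Python sketch below uses only the figures given in this text; the configuration labels are shorthand introduced here.

```python
BASELINE_HOURS = 18.0  # single Sun Blade 1000 with 2 GB RAM

# Run-times in hours as reported in Results 1 to 3.
run_times = {
    "LSF cluster, 6 workstations, 5 GB total": 10.0,
    "single workstation, 4 GB": 11.0,
    "LSF cluster plus 4 GB workstation": 5.5,
}

# Speed-up = baseline run-time divided by the new run-time.
speedups = {label: BASELINE_HOURS / hours for label, hours in run_times.items()}

for label, factor in speedups.items():
    print(f"{label}: {run_times[label]:g} h, speed-up {factor:.2f}x")
```

The combined setup comes out roughly 3.3 times faster than the original workstation, compared with 1.8 and about 1.6 for the two intermediate configurations.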
Dirk Hansen also found that a single Sun Blade with 4 GB main memory is almost as fast as a workstation cluster consisting of six Sun Blades with a total of 5 GB RAM (11 hours versus 10 hours for the cluster). This is worth considering, and not only for financial reasons. It shows impressively that concentrating computing power and main memory in the backend, as a central resource for all developers, is a valuable addition to the LSF-controlled utilization of all systems in the network.
Dirk Hansen concludes: "The final results convincingly demonstrate that LSF optimally utilizes existing resources. The time needed for administering LSF until a first productive use is more than compensated by a reduction of computing and waiting time, not to mention the money that would have been spent for hardware expansions."


