2006. Nr. 4(68) #### ELEKTRONIKA IR ELEKTROTECHNIKA SYSTEM ENGINEERING, COMPUTER TECHNOLOGY T 120 SISTEMŲ INŽINERIJA, KOMPIUTERINĖS TECHNOLOGIJOS # **Energy Consumption Optimization in Hard Real-Time System CMOS Processors** #### A. Baums Intitute of Electronics and Computer Science, 14 Dzerbenes st. LV-1006 Riga, Latvia; tel. +371 7558134, fax: +371 755337; e-mail: baum@edi.lv #### Introduction Portable system energy consumption reduction is of primary importance in mobile embedded systems and SoC design [1]. Dynamic voltage scaling (DVS) for CMOS chips at this time is the most effective energy reduction technology [2–11]. Usually there are no or few problems for DVS using it for soft real-time system design or task scheduling, where deadlines can be missed if the Quality of Service is kept [9, 10]. In hard real-time systems it is necessary to provide that the task never miss the deadline and the energy saving is maximized. The problems arise in on-line systems when at the ordinary task starting moment the task execution time is impossible to know and the best energy reduction regime can not be selected. In majority investigations and publications periodic task executions are analysed there with off-line determined execution times and determined priorities. Then the popular hard real-time task scheduling methods are earliest-deadlines first (EDF) or rate-monotonic scheduling (RMS) for DVS based energy consumption reduction is used [5, 6, 7, 8]. Few investigations are made in the field of event drive single stochastic hard real-time task energy consumption reduction. In [12] there is a method of dividing the task into multiple smaller sub-tasks with checkpoints for dynamic power management or voltage scaling. By using this method the energy consumption is reduced, but there are not guarantee that for the last sub-tasks execution step the deadline will not be reached. Method for the hard task energy consumption on-line minimisation which is named "mini-max" is proposed from author in article [11]. By using it the deadline is guaranteed for the task execution at the minimal power till the critical time $t_{cr}$ , after which the execution is finished at maximal clock frequency. #### Dynamic voltage scaling DVS Power consumption of CMOS has two main components: $P_{sw}$ – dynamic power consumption by charging and discharging the capacitive load and $P_{st}$ – static power consumed, which is determined by leakage current $I_{leak}$ and for CMOS technology below 0.1 $\mu$ m can not be taken into account. Dynamic or switching power can be estimated by: $P_{sw} \sim C_{ef} V^2 f_{sw} = C_{ef} V^2 / \tau_c$ , where: V – supply voltage, $C_{ef}$ – capacitive load, $f_{sw}$ – switching frequency and $\tau_c$ – cycle time. The time and energy consumption optimisation by using dynamic voltage-time scaling DVS is based on power $\Delta P_i$ consumption minimization for task step i, duration the *slack time* $s_i$ . In cases when the real task execution time $t_i$ is smaller then the worst case execution time WCET<sub>i</sub>: $$s_i = WCET_i - t_i. (1)$$ For hard real-time system: $WCET_i \le D$ (D deadline) **Fig. 1.** Time connections for the task step i For voltage or (and) frequency scaling different program controlled CMOS chips are designed and can be used: *Intel XScale Technology* (XA250, 80200) [14], *Transmeta Co. Crusoe* (TM5400÷TM5800) [15], XILINX Co. *Virtex-Pro* [16], Altera Co. *NoisII* [17]. # Mini-max method for hard real-time task energy consumption on-line minimization The hard real-time single event driven stochastic task energy consumption on-line minimization method is developed by consideration that CMOS processor has a wide supply voltage and clock frequency borders for voltage and frequency scaling. For the stochastic task it is necessary to assess the clock cycle $\tau_c$ number $N_{max}$ at the WCET and at the best case execution time (BCET) $\tau_c$ number $N_{min}$ . It can be made by simulation or executing the task at some nominal frequency $f_0$ , clock cycle $\tau_{co}$ and nominal voltage $V_0.$ By using the power cycle time characteristic $P_c$ $(\tau_c)$ the cycle time can be adjusted in region $\tau_{cmin} \div \tau_{cmax}$ and the cycle power consumption in region $P_{cmax} \div P_{cmin}$ (Fig. 2.), where $P_{co}$ – cycle time power consumption at nominal frequency $f_0$ and nominal voltage $V_0.$ Fig. 2. Cycle time power consumption Execution time characteristics at nominal voltage V and frequency f: $$t_{\text{max}} = \tau_{\text{co}} N_{\text{max}} = \text{WCET}, \qquad (2)$$ $$t_{\min} = \tau_{co} N_{\min} = BCET, \tag{3}$$ $$s_{\text{max}} = D - \tau_{\text{co}} N_{\text{min}}.$$ (4) By using mini-max method in the worst case, when $t_i = WCET_i = D$ and $s_i = 0$ , the task execution is started at minimal power $P_{cmin}$ and continued till instant $t_{cr}$ , after which in time interval $T_D = D - t_{cr}$ the execution is finished at maximal frequency and min clock time $\tau_{cmin}$ (Fig. 3). Fig. 3. The presentation of task execution critical time $t_{cr}$ In Fig. 3 by $N^I$ the number of instructions executed at maximal long clock cycles is denoted. This number and time intervals $t_{cr}$ and $T_D$ can be calculated from equation (5): $$\tau_{\text{cmax}}. N^{\text{I}} + \tau_{\text{cmin}} (N_{\text{max}} - N^{\text{I}}) = D,$$ (5) $$N^{I} = (D - \tau_{cmin} N_{max})/(\tau_{cmax} - \tau_{cmin}), \qquad (6)$$ $$t_{cr} = \tau_{cmax} N^{I}, \qquad (7)$$ $$T_D = D - \tau_{cmax}. N^I.$$ (8) From here the energy consumption at WCET: $$E_{\text{max}} = \tau_{\text{cmax}} P_{\text{cmin}} N^{\text{I}} + P_{\text{cmax}} T_{\text{D}}.$$ (9) The task energy consumption $\boldsymbol{E}_{ci}$ at the dynamic (online) scheduling when $\tau_{cmax}\,N_i$ < D - $T_D$ : $$E_{ci} = \tau_{cmax} P_{cmin} N_{\dot{i}}, \qquad (10)$$ $$E_{ci} = \tau_{cmax} P_{cmin} N^{I} + \tau_{cmin} P_{cmax} (N_{i} - N^{I}), \qquad (11)$$ or $$\begin{split} E_{ci} &= \tau_{cmin} \, P_{cmax} \, N^I + (\tau_{cmax} \, P_{cmin} \, - \\ &- \tau_{cmin} P_{cmax}) \, N_{max} \, (\tau_{cmax} - \tau_{cmin}) / (\tau_{co} - \tau_{cmin}). \end{split} \tag{12}$$ To estimate the benefits of the mini-max method it is necessary to compare this energy consumption with the energy consumption at task static (off-line) scheduling, i.e. that at every instant of task start it execution time $t_i$ is known (for example calculate at task compilation). For different hard deadline situations the execution at different $t_i$ can be estimated by using simple expressions represented in Fig. 4. Fig. 4. Energy consumption at the task static scheduling Uniform random distributed task execution time flow is proposed forthe mini-max method investigation. For simplicity uniform distributed by constant step $\Delta_N$ clock cycle discrete number $N_i$ and corresponding constant step $\Delta_S$ slot time $s_{i,co}$ values are used. The single task energy consumption $E_{\text{m-m}}$ includes two components $E_1$ and $E_2$ for the task execution times $t_i$ .= $t_{cr} \le \tau_{c \max} \ N^I$ : $$\mathbf{E}_1 = \sum_{i}^{k} \mathbf{P}_{c \min} \, \tau_{c \max} \,, \tag{13}$$ else $$\mathbf{E_2} = (\text{n-k})P_{\text{c}\min}\tau_{\text{c}\max} + P_{\text{c}\max}\tau_{\text{c}\min}\Sigma_{k+1}^{\text{n}}(N_{\text{max}} - N^{\text{I}} - (\text{n-k-1})\Delta_N)$$ (14) Mini-max method was proposed in [13] and quantitative estimation of an <code>Example</code> was made. Experimental obtained $P(\tau_c)$ characteristic was used and work point $\tau_{co}$ = 1 $\mu s$ ( $f_{co}$ = 1 MHz ), $P_{co}$ = 6 nJ; $\tau_{cmax}$ = 3 $\mu s$ , $P_{cmin}$ = 0.5 nJ; $\tau_{cmin}$ = 0.4 $\mu s$ , $P_{cmax}$ = 14 nJ, $N_{max}$ = 1000, $N_{min}$ = 100 and deadline D =1000 $\mu s$ are selected. When WCET = D and BCET $t_{imin}$ = 100 $\mu s$ , then by using (6), (7) and (8): $N^I$ = 230, $t_{cr}$ = 690 $\mu s$ and $T_D$ = 310 $\mu s$ . The energy consumption is calculated for $\Delta_N$ = 100 and for all corresponding slot times $s_{ico}$ = {0,100, 200, ..., 800, 900 $\mu$ s}and the results of estimation as $E_{i \text{ din}}$ mini-max are represented in Fig. 5. For comparison there are represented curved lines calculated using the same workload: for task execution at nominal clock frequency $f_{co}$ ( $E_{i \text{ co}}$ at $\tau_{co}$ ) and for task execution at off-line voltage scaling ( $E_{i \text{ st}}$ off-line DVS). **Fig. 5.** The graphical presentation of the task energy consumption at different scheduling The joint energy consumptions for task cycle including 10 tasks with different slot times $s_{ico} = \{0, 100, 200, ...800, 900 \, \mu s\}$ was calculated using (13) and (14): $E_{m-m} = E_1 + E_2 = 22000 \,$ nJ. For identical tasks and workloads the energy consumptions for the task execution at nominal clock frequency $E_{co} = 26000 \,$ nJ and task execution at off-line voltage scaling, $E_{st} = 33000 \,$ nJ. The gain from the mini-max method using for energy consumptions in % can be estimated by using two characteristics $$\delta_{st} = (E_{st} - E_{m-m})100/E_{st}$$ and $\delta_{co} = (E_{co} - E_{m-m})100 / E_{co}$ . For the above example: $\delta_{st} = 15\%$ and $\delta_{co} = 33\%$ . #### The mini-max method benefit boundaries estimation At *first* the influence of $\tau_{cmax}$ on $E_{m-m}$ was investigated. The value of $\tau_{cmax}$ was adjusted from 1.5 $\mu s$ to 5 $\mu s$ and the energy economy results in % were compared with energy consumptions $E_{st}$ and $E_{co.}$ The results of estimation are presented in Table 1 and Fig. 6. Table 1. Results of estimation | No. | $\tau_{cmax}$ , | $\delta_{st}$ , | δ, | |-----|-----------------|-----------------|------| | | μs | % | % | | 1 | 1.5 | 3.5 | 13 | | 2 | 2.0 | 8.8 | 22.5 | | 3 | 2.5 | 12.5 | 29 | | 4 | 2.8 | 16.5 | 34 | | 5 | 3.0 | 15.5 | 33 | | 6 | 3.5 | 12.5 | 31 | | 7 | 4.0 | 6.0 | 25 | | 8 | 4.5 | -1.5 | 20 | | 9 | 5.0 | -19.0 | 6.0 | The value of $\tau_{cmax}=3~\mu s$ which was used in *Example* is close to the optimal value $\tau_{cmax}=2.8~\mu s$ At *second* the influence of $\tau_{cmin}$ on $E_{m-m}$ was investigated. The value of $\tau_{cmin}$ was adjusted from 0.2 to 0.65 $\mu s$ and the energy economy results in % compared with energy consumptions $E_{st}$ and $E_{co.}$ The results of estimation are presented in Table 2. and Fig. 7. Table 2. Results of estimation | No. | τ <sub>cmin</sub> , | $\delta_{st}$ , % | δ <sub>co.</sub> , | |-----|---------------------|-------------------|--------------------| | | μs | % | % | | 1 | 0.2 | 31 | 75 | | 2 | 0.3 | 24.5 | 40 | | 3 | 0.4 | 15.5 | 33 | | 4 | 0.45 | 13 | 31 | | 5 | 0.5 | 10 | 28 | | 6 | 0.6 | -0.1 | 21 | | 7 | 0.7 | -4.0 | 20 | Fig. 7. Results of estimation: $\delta_{st}(\tau_{cmin})$ % $\square$ $\square$ $\delta_{st}(\tau_{cmin})$ % $\square$ $\square$ $\square$ It follows from Fig. 7, that by using mini-max method the energy economy can be a raised at task execution at small $\tau_{cmin}$ or by increasing the maximal clock frequency $f_{cmax}$ . #### Conclusion To solve the *on-line* energy saving problem in the hard deadline task real-time systems *mini-max* method was proposed. By using this method at uniform random distributed task execution time workload, the energy saving exceed more then 15% compared to off-line DVS scheduling. The gain from *mini-max* using it for energy saving is estimated by using functions of $\tau_{cmax}$ and $\tau_{cmin}$ . The maximum energy saving can be obtained by selecting the optimal value of minimal frequency. #### References - 1. **Mudge T.** Power: a first-class architectural design constraint // Computer. 2001. 34(4). P. 52–58. - Yun H. S., Kim J. On energy-optimal voltage scheduling for fixed-priority hard real-time systems - ACM Transactions on Embedded Computing Systems. – 2003. – V. 2. – P. 393–429. - Ishihara T., Yasuura H Voltage scheduling problem for dynamically variable voltage processors // Proc. International symposium on Low power electronics and design. – Monterey, 1998. – P. 197–202. - Chandrasena L. H., Chandrasena P., Liebelt M. J. An energy efficient rate selection algorithm for voltage quantized dynamic voltage scaling // Proc. International<sup>14th</sup> symposium on Systems synthesis. – Montréal, 2001. – P. 124–129. - Jejurikal R., Gupta R. Energy Aware Task Schduling with Task Synchronization for Embedded Real Time Systems // Proceedings of the intrnational conference on Computers, architecture, and syntesis for embedded system. – Grenoble, 2002. – P.164–169. - Kuo J. L., Chen T. F. Dynamic voltige sheduling for realtime on low-power variable speed processors // Proc. - International conference on Computers, architecture, and synthesis for embedded systems.— Grenoble, 2002. P. 147–154. - Choi K., Soma R., Pedram M. Dynamic voltage and frequency scaling based on workload Decomposition // Proc. International symposium on Low power electronics and design. – California, 2004. – P. 174–179. - Gruian F. Hard real-time scheduling for low-energy using stochastic data and DVS processors // Proceedings of the 2001 international symposium on Low power electronics and design, Huntington, 2001. – P. 46–51. - Hua. S., Qu G. Energy-efficient dual valtage soft real-time system with (m,k) – firm deadline guarantee // Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems. – Washington, 2004. – P. 116–123. - 10. **Mosse D., Aydin H., Childers B., Melhem R.** Compiler-assisted dynamic power-aware scheduling for real-time application // Workshop on Compilers and Operating Systems for Low Power. 2000.– P. 11–22 - 11. Rotenberg E. Using Variable MHz Microprocessors to Efficiently Handle Uncertainity in Real-Time Systems // Proceedings of the annual ACM/IEEE international symposium on Microarchitecture. – Austin, 2000. – P. 28–39. - Baums A. The Real-Time System Energy Consumption Estimation for the Task Dynamic and Statistic Realization Scheduling // Scientific proceedings of Riga Technical University, Computer Science, 5 series, 2005. – Vol. 20. – P. 36–41. Presented for publication 2006 02 10 ### A. Baums. Energy Consumption Optimisation in Hard Real-Time System CMOS Processors // Electronics and Electrical Engineering. – Kaunas: Tehnologija, 2006. – No. 4(68). – P. 19–22. Energy consumption reduction is of primary importance in mobile embedded systems and SoC design. Dynamic voltage scaling (DVS) for CMOS chips is the most effective energy reduction technology. For hard real–time system optimisation DVS based task online scheduling method "mini-max" was developed. By it using the deadline is guaranteed at the task execution at minimal power till critical time, after which task is finished at maximal clock frequency and speed. In paper the benefits of the mini-max method using for energy saving is estimated and the optimal frequency selection for hard deadline task execution is analysed. Ill. 7, bibl. 16 (in English; summaries in English, Russian and Lithuanian). # А. Баумс. Оптимизация энергопотребления CMOS процессоров в системах реального времени с жесткими временными ограничениями // Электроника и электротехника. – Каунас: Технология, 2006. – № 4(68). – С. 19–22. Минимизация энергопотребления является первоочередной задачей при создании мобильных и однокристальных СМОS систем. Наиболее эффективной технологией для этого является DVS (динамическое масштабирование напряжением). На ее основе создан метод оптимизации энергопотребления для СМОS процессоров "mini-max". Согласно с этим методом при выполнении задач можно оперативно гарантировать жесткие временные ограничения и оптимальное потребление энергии. Это достигается реализацией задачи с минимальной частотой и энергопотреблением до определенного критического момента, после которого выполнение завершается на максимальной частоте. В работе представляются результаты анализа возможностей метода mini-max и принципы определения ее оптимального частотного режима. Ил. 7, библ. 16 (на английском языке; рефераты на английском, русском и литовском яз.). ### A. Baums. Energijos suvartojimo optimizavimas realaus laiko sistemų KMOP procesoriuose su griežtais laiko apribojimais // Elektronika ir elektrotechnika. – Kaunas: Technologija, 2006. – Nr. 4(68). – P. 19–22. Energijos sąnaudų minimizavimas yra ypač svarbus veiksnys projektuojant mobilios paskirties įterptines sistemas ir vienkristales KMOP sistemas. Pati efektyviausia šiam tikslui technologija yra DVS (dinaminis įtampos keitimas). Jos pagrindu sukurtas CMOS procesorių energijos sąnaudų optimizavimo minimumų ir maksimumų metodas. Atliekant šiuo metodu užduotis galima operatyviai garantuoti griežtus laiko apribojimus ir optimalų energijos vartojimą. Tai pasiekiama atliekant užduotį esant minimaliam dažniui ir energijos suvartojimui iki nustatyto kritinio laiko, po kurio vykdymas užbaigiamas maksimaliu dažniu. Darbe pateikti metodo "minimas" galimybių analizės rezultatai ir jo optimalaus dažnio režimo nustatymo principai. II. 7, bibl. 16 (anglų kalba; santraukos anglų, rusų ir lietuvių k.).