Energy-efficient Hybrid Optical-Electronic Network-on-Chip for Future Many-core Processors

The increasing number of cores in future chip multiprocessors places a heightened emphasis on the energy-efficiency of the on-chip communication infrastructure. A traditional packet-switched electronic network-on-chip (NoC) has low energy-efficiency when transmitting long-distance packets. By providing high-bandwidth, low-latency chip-wide communication, optical interconnects are a promising candidate for future NoC architectures. However, the laser source and optical-electronic modems in an optical network-on-chip (ONoC) introduce considerable static energy, most of which is wasted if the traffic load is low. In this paper, we propose a novel energy-efficient Hybrid Optical-Electronic NoC named HEON, which consists of an electronic buffer-bypassed mesh subnetwork and several bus-based optical channels. The optical channels, which carry long-distance traffic to improve performance, are managed through a novel priority token scheme, in which the optical components can be dynamically turned on and off according to the traffic load to save energy. Experimental results show that HEON improves energy-efficiency by 47.4 % and 38.1 % compared with the electronic NoC and a state-of-the-art bus-based ONoC, respectively. DOI: http://dx.doi.org/10.5755/j01.eee.20.3.6682


I. INTRODUCTION
The trend of integrating an increasing number of processing cores into a single die raises the importance of designing an efficient on-chip communication infrastructure. The packet-switched network-on-chip (NoC) is becoming an alternative to existing dedicated interconnects and shared buses. The traditional electronic NoC is not energy-efficient, especially under long-distance traffic patterns, since each packet must be buffered and switched at every hop along its route, which consumes a great amount of energy.
Thanks to recent progress in optical device integration, the nanophotonic interconnect, which features high bandwidth, low transmission latency and low power, has emerged as a promising candidate for the future on-chip communication fabric. Recently proposed optical networks-on-chip (ONoCs) have demonstrated exceptional throughput with low runtime power dissipation [1]-[8]. However, these bus-based ONoCs suffer from high static power consumption, with the main contributors being the laser source and the optical-electronic modems [4]-[8]. The laser source must continuously provide power to illuminate a bundle of long waveguides and overcome their attenuation. The optical-electronic modems, which are built from nanophotonic micro-rings, must be trimmed (i.e., heated) to work at accurate wavelengths [4], which consumes energy continuously. When the traffic load in an ONoC is low, most of the static energy is wasted, leading to significant degradation in energy-efficiency.
Manuscript received June 14, 2013; accepted October 22, 2013. This paper is supported by the National Natural Science Foundation of China under Grant No. 61379035, the National Natural Science Foundation of Zhejiang Province No. LQ12F02017, Open Fund of Mobile Network Application Technology Key Laboratory of Zhejiang Province, Innovation Group of New Generation of Mobile Internet Software and Services of Zhejiang Province.
Considering the pros and cons of both electronic and optical NoCs, in this paper we propose a novel energy-efficient Hybrid Optical-Electronic NoC named HEON, which consists of an electronic buffer-bypassed mesh subnetwork and several bus-based optical channels. The role of the optical channels is to carry long-distance traffic, which lightens the load of the electronic sub-network and reduces the buffering energy induced by conflicts. To further save energy in the optical sub-network, we introduce mechanisms to power gate underused optical components. In order to maximize energy-efficiency, channels and micro-rings can be adaptively turned on and off according to the transient traffic load. Several novel mechanisms are used, which constitute the fundamental contribution of our work:
Regional optical gateways: The whole network is partitioned into several regions, each with a single regional optical gateway located at its centre. Equipped with an optical-electronic interface, the gateway routers handle long-distance optical transmissions.
Priority token-based optical arbitration: We use a light-weight central optical channel manager to allocate traffic resources to each gateway. A novel priority token-based optical arbitration scheme is proposed to manage the optical power gating and access admission.

A. Motivation
The traditional 2D mesh electronic NoC is the most popular interconnect fabric for many-core processors, thanks to its high scalability and simple footprint. Buffer-bypassing [9], [10] is one of the most energy-efficient enhancements to the NoC: packets may be switched without buffering if no conflicts happen, which considerably reduces latency and power consumption. However, if the traffic load is high, the probability of conflicts increases significantly, which results in a low buffer bypass rate and an energy-efficiency close to that of the baseline NoC. The case for the ONoC is quite different. ONoCs show substantial performance improvement under high traffic load due to their ultra-high bandwidth and throughput. Under low load, however, the advantage of light-speed transmission is not significant. Given this difference in sensitivity to traffic load between electronic and optical NoCs, we consider deploying optical channels to lighten the load in the electronic network and increase the buffer bypass rate, so that energy-efficiency can be improved. Besides traffic load, traffic pattern is another important determinant of the energy-efficiency of both NoCs. In an optical NoC, neither latency nor power is sensitive to transmission distance, so bus-based optical channels are better suited to transmitting long-distance packets than a packet-switched electronic NoC.
When running real-world applications, different cores and memory banks deviate greatly in the number of packets they send or receive. Nodes that have very few packets to transmit suffer from low channel utilization and high static energy consumption. Besides, the traffic load distribution may fluctuate because applications execute in stages. Given these spatial and temporal features of on-chip traffic, sharing optical channels and dynamically tuning the number of shared channels may guarantee performance at much lower power consumption.
Furthermore, the number of nodes attached to each optical channel (gateways) has a strong impact on the optical power, since optical gateways are usually complicated to implement and quite power hungry. In each optical gateway, the insertion loss of the micro-ring resonators places great limitations on the laser source power budget, and the optical-electronic interfaces consume much energy in buffering and switching data. In this paper we therefore propose to use regional optical gateways to reduce this number. Each gateway is shared by the nodes in its region via the electronic NoC and merges their traffic for fast, high-throughput optical transmission.

B. HEON Architecture
Motivated by the above observations, in this paper we propose a Hybrid Optical-Electronic NoC named HEON. The architecture, shown in Fig. 1, is based on an 8×8 2D mesh electronic NoC. Buffer bypassing techniques are applied to each on-chip router: the buffer-write (BW) pipeline stage can be bypassed according to look-ahead signals for switch allocation, provided that no conflict exists.
The mesh network is partitioned into eight regions, each shown in a distinct colour in Fig. 1. Inside each region there is a single router containing an optical-electronic interface, which is called the optical gateway router (OGR). OGRs can communicate with each other directly via optical buses, while the other routers, called non-gateway routers (NGRs), must first transport packets to the OGR of their region to use the optical network. We select the central node in each region as the OGR to effectively collect and distribute the traffic flows that use optical channels. The eight OGRs are interconnected by several ring-shaped nanophotonic waveguides, forming the bus-based optical sub-network. The waveguides are shared by all the OGRs in a manner similar to [8]; their number can vary between configurations and is independent of the network scale. Dense wavelength division multiplexing (DWDM) is used in each waveguide to increase the available bandwidth, which allows multiple flits to be transmitted within a single cycle. The laser source is located next to the edge of the chip and emits light into the waveguides. During the round trip of the light inside the waveguides, each discrete wavelength within its spectrum can be modulated by one OGR and detected by another, realizing end-to-end light-speed transmission.
As shown in the figure, a module called the central optical channel manager (COCM) is placed near the entrance of the waveguides. The COCM takes charge of optical channel arbitration and power management, which are detailed in the following sections.

C. Optical Power Management Mechanisms
As mentioned before, the optical channels in HEON can be turned on to improve network performance if the traffic load is high, or turned off when some waveguides are underused. In this section we first discuss the low-power mechanisms for the optical channels and micro-rings.
Low power mechanism for laser source and waveguides: Laser light is produced by an off-chip source and injected into the on-chip optical power waveguides. Optical splitters couple optical power into each waveguide; this power must be intense enough to overcome the waveguide attenuation and the insertion loss of all the micro-rings, and to remain above the sensitivity of the optical receivers. In HEON, the splitters are controlled by the COCM module to illuminate or extinguish each individual waveguide. The off-chip laser source can automatically adjust the total optical power according to how many waveguides are illuminated. Thus we can make the best use of laser power by tuning the optical splitters.
Low power mechanism for micro-ring modems: Micro-ring resonators are used to modulate data and detect optical signals, and must be trimmed to work at the accurate wavelength. Trimming requires continuous heating, which consumes considerable static energy. In our bus-based optical waveguides, only one modulator-detector pair is working in each cycle, and the other micro-rings are idle. Besides, for nodes that have few packets to send or receive via optical channels, the optical modems are seldom used. In our design, therefore, each set of micro-rings in the optical gateways can be power gated. For optical channels that are not illuminated, and for modems that are seldom used, the trimming of the ring resonators can be suspended to save energy.

D. Priority Token-based Optical Arbitration and Routing
We propose a novel priority token-based optical arbitration mechanism to manage optical channels efficiently. By releasing priority tokens in an additional arbitration waveguide, the COCM can determine which OGR is allowed to modulate on a particular optical channel preferentially, and thus avoid bus access conflicts. Here a token is a small optical signal which travels around the arbitration waveguide and can be destructively absorbed or modified by the OGRs it passes. An OGR that absorbs a token is allowed to modulate 1 flit of data on the optical channel. Besides, the COCM module recycles the tokens when they return, in order to observe the traffic load and make better arbitration decisions. The mechanism is detailed next.
Priority of optical gateways: Each OGR has a priority for the optical sub-network, which indicates how urgently the OGR requests the optical channels. In this paper, the priority is decided according to the number of flits buffered in an OGR that request optical channels, since an OGR with a heavy traffic load is prone to congestion and needs its pending packets cleared from the buffer quickly. We use four levels of priority and four thresholds of traffic load to decide an OGR's priority. Upon updating its priority, each OGR turns on the token detectors whose priority number is the same as or lower than its own. As shown in Fig. 2, the priority of OGR-1 is 2, so it turns on its token detectors labelled with priority numbers 1 and 2.
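The priority assignment above can be sketched as follows. This is a minimal illustration only: the threshold values and the names `THRESHOLDS`, `ogr_priority` and `active_token_detectors` are hypothetical, and priority 0 is an assumed encoding for an OGR whose buffered traffic is below every threshold.

```python
from bisect import bisect_right

# Hypothetical thresholds on the number of buffered flits that request
# optical channels. The text states that four thresholds are used but
# does not give their values.
THRESHOLDS = [1, 4, 8, 16]


def ogr_priority(buffered_optical_flits: int) -> int:
    """Return 0 (assumed: no request) or a priority level from 1 to 4."""
    # The priority is the number of thresholds the buffer occupancy has reached.
    return bisect_right(THRESHOLDS, buffered_optical_flits)


def active_token_detectors(priority: int) -> list[int]:
    """Detectors turned on: the OGR's own priority number and all lower ones."""
    return list(range(1, priority + 1))


# With the assumed thresholds, 5 buffered flits give priority 2, so the
# detectors labelled 1 and 2 are on, as for OGR-1 in Fig. 2.
print(ogr_priority(5), active_token_detectors(ogr_priority(5)))
```

Turning on only the detectors at or below the OGR's own level means a lightly loaded gateway cannot see, and therefore cannot grab, a token coded for more urgent traffic.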
Release and absorption of tokens: The COCM releases m tokens, where m is the number of optical channels that are illuminated. The tokens are first coded with the highest priority and released to the network. Then, only OGRs with the same or a higher priority are able to absorb a token and modulate data on the channel. For example, in Fig. 2, a token with priority 3 cannot be grabbed by OGR-1 or OGR-2, but can be absorbed by OGR-3, since the priority of OGR-3 is 3. OGR-3 is thus prioritized to use the optical channel. Since a granted OGR automatically lowers its priority as its traffic load is cleared, the mechanism does not starve other OGRs.
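A single trip of one token around the arbitration waveguide might be modelled as below. This is a sketch under stated assumptions: the function name `circulate_token` is hypothetical, the ring order is taken to be the list order, and the priority of OGR-2 is assumed to be 1 (the text only implies that it is below 3).

```python
from typing import Optional, Sequence


def circulate_token(token_priority: int,
                    ogr_priorities: Sequence[int]) -> Optional[int]:
    """Return the ring position of the OGR that absorbs the token, or None.

    The token visits the OGRs in list order. An OGR can absorb it only if
    its own priority is the same as or higher than the token's priority.
    """
    for position, priority in enumerate(ogr_priorities):
        if priority >= token_priority:
            return position  # absorbed: this OGR may modulate one flit
    return None  # the token returns to the COCM unabsorbed


# Fig. 2 example: OGR-1 has priority 2, OGR-3 has priority 3.
# OGR-2's priority of 1 is an assumption made for illustration.
ring = [2, 1, 3]
print(circulate_token(3, ring))  # position 2, i.e. OGR-3
```

A token that completes the trip unabsorbed returns `None` here, which corresponds to the underused-channel case that the COCM reacts to.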
Load feedback and channel management: The COCM observes the global traffic load by recycling the coded tokens. When an OGR with a high priority number finds that a passing low-priority token is already occupied, it writes to the token to indicate starvation. The COCM recycles the tokens and counts the starvation signals. If the count exceeds a threshold, which implies that the supply of optical channels falls short of demand, the COCM illuminates a new, previously unused channel. Conversely, if a token travels around the network without being absorbed by any OGR, which means that the channel is underused, the COCM extinguishes the channel to save energy.
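The feedback loop described above could look roughly like the following sketch. The class name `ChannelManager`, the one-decision-per-recycling-round policy, the threshold value and the rule that at least one channel stays illuminated are all assumptions made for illustration; the text does not specify them.

```python
class ChannelManager:
    """Toy model of the COCM decision on how many channels to illuminate."""

    def __init__(self, total_channels: int, starvation_threshold: int,
                 active_channels: int = 1, min_active: int = 1) -> None:
        self.total = total_channels
        self.starvation_threshold = starvation_threshold
        self.active = active_channels
        self.min_active = min_active  # assumed floor, not stated in the text

    def on_tokens_recycled(self, starvation_marks: int,
                           unabsorbed_tokens: int) -> int:
        """Update and return the number of illuminated channels."""
        if (starvation_marks > self.starvation_threshold
                and self.active < self.total):
            self.active += 1  # demand exceeds supply: illuminate a channel
        elif unabsorbed_tokens > 0 and self.active > self.min_active:
            self.active -= 1  # a token came back unused: extinguish a channel
        return self.active


# Eight waveguides as in the evaluated configuration; threshold is hypothetical.
cocm = ChannelManager(total_channels=8, starvation_threshold=2)
print(cocm.on_tokens_recycled(starvation_marks=3, unabsorbed_tokens=0))
```

Checking starvation before underuse is a design choice of this sketch: it favours performance over energy when both signals appear in the same round.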
Routing: In HEON, each packet has to choose whether or not to use the optical channels on the way to its destination. If it takes an optical route, it must go through the local and destination OGRs. However, not all packets benefit from using optical channels. For example, a packet from router-3 in Fig. 1 that is destined for router-19 needs only 2 hops in the electronic sub-network. If the optical route is taken, it must go through gateway router-10 and router-20, which involves 4 hops. Thus, a packet should use the optical channels only if doing so results in fewer hops to its destination than routing along the electronic network. Besides, the routing of packets in NGRs should take into account the status of the OGR in their region. An OGR with many optical channels and modulators turned on and a low traffic load can attract the NGRs to use the optical gateway. We use low-cost wires to implement this information exchange between an OGR and the NGRs in its region, as shown in Fig. 2.
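The hop-count comparison in the routing rule can be sketched as follows. The sketch assumes row-major router numbering starting from 0 on the 8×8 mesh, minimal (Manhattan-distance) electronic routes, and an optical traversal counted as a single hop; these assumptions reproduce the 2-hop and 4-hop figures of the example but are not stated in the text. All function names are hypothetical.

```python
MESH_WIDTH = 8  # 8x8 mesh; row-major numbering from 0 is an assumption


def coords(router_id: int) -> tuple[int, int]:
    """Map a router number to (row, column)."""
    return divmod(router_id, MESH_WIDTH)


def mesh_hops(a: int, b: int) -> int:
    """Hop count of a minimal electronic route (Manhattan distance)."""
    (ra, ca), (rb, cb) = coords(a), coords(b)
    return abs(ra - rb) + abs(ca - cb)


def optical_route_hops(src: int, dst: int, src_ogr: int, dst_ogr: int,
                       optical_hops: int = 1) -> int:
    """Hops via the optical sub-network: to the local OGR, across, then on."""
    return mesh_hops(src, src_ogr) + optical_hops + mesh_hops(dst_ogr, dst)


def use_optical(src: int, dst: int, src_ogr: int, dst_ogr: int) -> bool:
    """Take the optical route only if it needs fewer hops."""
    return optical_route_hops(src, dst, src_ogr, dst_ogr) < mesh_hops(src, dst)


# Example from the text: router-3 to router-19 via gateways 10 and 20.
print(mesh_hops(3, 19), optical_route_hops(3, 19, 10, 20))
print(use_optical(3, 19, 10, 20))
```

For a corner-to-corner packet such as router 0 to router 63, the same test favours the optical route for any reasonably placed pair of gateways, since the electronic route alone takes 14 hops.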

III. EVALUATION
In this section we evaluate the performance and power of HEON, and compare it with a buffer-bypassed electronic mesh NoC (EMesh) and the state-of-the-art multi-write-single-read optical NoC Corona [4]. Application traces are collected by running the PARSEC [11] benchmarks on the GEM5 simulator [12] and are then simulated on a cycle-accurate C++ simulator. The electronic sub-network in HEON shares the same configuration as EMesh, and its optical sub-network consists of eight waveguides, each with the same bandwidth as the electronic links (128 bits).
We present the average packet latency for each application in Fig. 3. HEON shows much better performance than EMesh for all the benchmarks, with a 32.1 % latency reduction. Compared with Corona, HEON achieves more than 90 % of its performance on average, and for some applications such as dedup and facesim, HEON performs even better. This indicates that Corona suffers a performance penalty for some particular traffic, whereas HEON adapts better to various traffic patterns and loads.
Figure 4 illustrates the average power breakdowns of the above three NoCs running all the applications in PARSEC, in which HEON+PM refers to HEON with our proposed low-power mechanisms enabled. HEON+PM saves 22.6 % and 43.1 % of total energy compared with EMesh and Corona, respectively. Since HEON uses additional optical buses to transmit long-distance packets and hence greatly reduces the traffic load of the electronic sub-network, it has a higher buffer bypass rate than EMesh and saves buffer refreshing and read/write energy. Corona, however, has very low utilization of its optical components when running most of the benchmarks, which results in significantly high static optical power dissipation. Besides, we have evaluated HEON without the power gating mechanisms, and the result shows that our power management scheme saves 26.8 % of energy. This strongly suggests that the optical components are often underused during the execution of applications and consume much less energy if adaptively turned off, without impacting performance.
Finally, we use the energy delay product (EDP), which is the product of average latency and total power dissipation, as a comprehensive index of energy-efficiency. The results are normalized to EMesh and are shown in Fig. 5.
HEON has a much lower EDP than EMesh and Corona, by 47.4 % and 38.1 % respectively. We conclude that HEON achieves exceptional performance at much lower energy consumption under various traffic loads and distributions.
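As a consistency check, the reported EDP reduction relative to EMesh can be recomputed from the latency and energy figures quoted in this section. The sketch below assumes that the 32.1 % latency reduction and the 22.6 % energy saving refer to the same HEON configuration; the function and variable names are illustrative only.

```python
def edp_reduction(latency_reduction: float, energy_saving: float) -> float:
    """Relative EDP reduction implied by independent latency and energy savings."""
    # EDP is a product, so the normalized EDP is the product of the two ratios.
    normalized_edp = (1.0 - latency_reduction) * (1.0 - energy_saving)
    return 1.0 - normalized_edp


# Versus EMesh: 32.1 % lower latency and 22.6 % lower energy.
vs_emesh = edp_reduction(0.321, 0.226)
print(f"EDP reduction vs EMesh: {vs_emesh * 100:.1f} %")

# Versus Corona: a 43.1 % energy saving and a 38.1 % EDP reduction together
# imply the HEON-to-Corona latency ratio below (a derived value, not a
# figure reported in the text).
implied_latency_ratio = (1.0 - 0.381) / (1.0 - 0.431)
print(f"Implied HEON/Corona latency ratio: {implied_latency_ratio:.3f}")
```

Under these assumptions the first value reproduces the reported 47.4 %, and the implied latency ratio of roughly 1.09 is consistent with HEON reaching more than 90 % of Corona's performance.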

IV. CONCLUSIONS
We proposed a hybrid optical-electronic network-on-chip, which comprises an electronic mesh sub-network and several bus-based optical channels. Regional optical gateways transmit long-distance packets on the optical channels to lighten the load on the electronic network. The optical channels are managed through a novel priority token scheme, in which the components can be adaptively power gated to save energy when the traffic load is not heavy. For future work, we would like to study the fairness and QoS issues of the proposed NoC architecture.