NOC Based Router Architecture Design Through Decoupled Resource Sharing Using CABHR Algorithm

Received Apr 4, 2017 Revised Apr 21, 2017 Accepted May 7, 2017

Network-on-Chip (NoC) is a rapidly emerging and promising on-chip interconnect alternative designed to support many-core Systems-on-Chip (SoCs). Despite this, developing a high-performance, low-latency NoC with low area overhead remains a challenge. NoCs based on mesh and torus interconnection topologies have become widely used because they are easy to construct. A torus is nearly the same as a mesh but has a smaller diameter. In this regard, we propose an effective router design with decoupled resource sharing in a torus topology, based on Clustering Algorithm Based Hierarchical Routing (CABHR), to improve the efficiency of the NoC. We show that our approach provides improved latency, energy consumption and overall performance compared with the most prominent existing routing techniques.


INTRODUCTION
SoCs were established to provide a high-performance solution that fulfills the expanding communication requirements of demanding very-large-scale integration (VLSI) circuits. A System on Chip achieves high efficiency by reusing predefined Intellectual Property (IP) cores. SoCs have traditionally linked IPs using buses; however, shared-channel buses can limit throughput. As SoC complexity rises, limits on power dissipation, chip scalability and operating frequency become the most important problems. Large SoCs cause a considerable rise in interconnection needs, leading to more energy consumption as well as delay. The Network on Chip (NoC) is an alternative interconnect technology for SoCs, proposed as a solution to the interconnection problem of large-scale SoCs. A NoC consists of processing elements (PEs) interconnected through network interfaces (NIs), communication channels and switches. NoCs achieve higher scalability and better performance from the on-chip interconnection wires. In a NoC, conventional interconnections such as point-to-point wires and buses from source to destination IPs are replaced by the network. The NoC has become an efficient alternative to the conventional bus-based design for inter-core communication.
In [1], a NoC is realized using a torus structure. The authors proposed a routing algorithm and a router design, and addressed the challenge posed by the long wire connections in a torus by pipelining both the long and short wire connections and enlarging the input buffers connected to the long wires. This matters because gate delays scale down with technology, whereas delays on long wires usually rise steeply, or linearly when repeaters are inserted [3]. The wire delay may exceed the limit of one clock cycle, or often several clock cycles, even with repeater insertion. For ultra-deep-submicron processes, 75% or more of the delay on critical paths may be due to interconnects [4]. Nowadays, NoCs are considered a scalable alternative for on-chip communication. In the past few decades, NoC has emerged as a developing and essential research field. Network topology, router architecture and routing algorithm are the key factors in the performance of a NoC. Before determining the routing algorithm, we define the network topology. Placement is an intermediate stage between topology selection and routing algorithm definition. We propose a router architecture for a torus topology based on Clustering Algorithm Based Hierarchical Routing (CABHR). We show that the proposed method improves the performance of the routing algorithm and provides better latency, lower power consumption and improved throughput.

NOC ROUTER ARCHITECTURE
A torus network can be seen as an improved version of the basic mesh network. A torus is essentially a mesh in which the head of each column is linked to the tail of that column and the left end of each row is linked to the right end of that row. Path selection in a torus network is better than in a mesh network, and a packet passes through fewer routers. In a mesh network, hop count and latency increase roughly linearly with on-chip distance. To some extent this also holds for torus networks, although, owing to the cyclic nature of the routing structure, devices located adjacently on chip may be several hops apart in the network. This is undesirable, but it is traded off against reduced hop counts for medium-distance communication.
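The hop-count argument above can be illustrated with a small sketch. The function names are hypothetical and the model assumes an ideal (unfolded) n x n mesh and torus with minimal routing; it is not taken from the simulated design.

```python
def mesh_hops(src, dst):
    """Minimal hop count between two (x, y) nodes in a 2D mesh."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def torus_hops(src, dst, n):
    """Minimal hop count in an n x n torus, where every dimension wraps around."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    # In each dimension the packet may go directly or through the wraparound link.
    return min(dx, n - dx) + min(dy, n - dy)


def diameter(hops, n):
    """Largest minimal hop count over all node pairs of an n x n network."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    return max(hops(s, d) for s in nodes for d in nodes)


N = 4  # network size used in this paper
mesh_diameter = diameter(mesh_hops, N)
torus_diameter = diameter(lambda s, d: torus_hops(s, d, N), N)
```

For N = 4 the sketch gives a diameter of 6 hops for the mesh and 4 hops for the torus, which is the reduction the wraparound links are meant to provide.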
An N x N torus Network on Chip contains N² nodes arranged in a 2D structure. Each node is addressed by coordinates that give its position along the horizontal and vertical dimensions. Every node has a neighboring node in the increasing and decreasing directions of each dimension. The first node and the last node in every dimension are connected by a wraparound link in the torus NoC, whereas such wraparound links do not occur in the mesh NoC. Figure 1 shows a 4 x 4 torus NoC structure. Every node in the network consists of two parts: a router and an IP (intellectual property) core [6,7]. The router architecture consists of five input ports and five output ports. A node uses four input and four output ports to connect to its neighboring nodes, two for each dimension, one in each direction. The remaining ports are used by the IP to inject information into the network or eject it from the network, respectively. Information produced by the IP is inserted into the network through the injection port. Information that reaches its destination node is transferred to the local IP through the ejection port. The bandwidth of any port can be shared among a number of virtual channels (VCs). The router comprises several units: an Address Extractor that checks and processes the packet headers, and a number of buffers per incoming virtual channel. It should be pointed out that the more virtual channels there are, the more complex the node becomes. There is a Multiplexer and De-Multiplexer unit that handles the virtual channel process, a Selector switch unit that applies the virtual channel selection procedure, and a Crossbar that can simultaneously link several input ports to several output ports provided there is no contention on the output ports.
A Reservator unit controls the crossbar and the other connected sub-modules. If a particular structure such as a torus or mesh is to be built from these elements, a top-level wrapper module is used that links a number of such nodes to one another according to the selected topology. The router design shares resources between two neighboring inputs in the form of decoupled resource sharing. Specifically, all virtual channels, multiplexers and de-multiplexers of two neighboring inputs are shared, in addition to the reserve sharing control modules.
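To make the role of the top-level wrapper concrete, the following sketch computes the four neighbors of every node in an n x n torus, including the wraparound links. The direction names and the convention that "north" increases y are assumptions made for illustration, not details of the VHDL wrapper.

```python
def torus_neighbors(x, y, n):
    """Return the four neighbors of node (x, y) in an n x n torus.

    The modulo operation implements the wraparound link between the
    first and the last node of each dimension.
    """
    return {
        "east": ((x + 1) % n, y),
        "west": ((x - 1) % n, y),
        "north": (x, (y + 1) % n),
        "south": (x, (y - 1) % n),
    }


def build_links(n):
    """Map every (node, direction) output to the neighbor it feeds.

    This plays the role of a wrapper-level connection list: one entry
    per inter-router output port in the whole network.
    """
    links = {}
    for y in range(n):
        for x in range(n):
            for direction, neighbor in torus_neighbors(x, y, n).items():
                links[((x, y), direction)] = neighbor
    return links


links = build_links(4)
```

A 4 x 4 torus yields 16 nodes with 4 inter-router outputs each, so `links` holds 64 entries; the fifth port of every router is the local IP port and is not part of this list.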
The router structure developed in this paper is shown in Figure 2. The router inputs form two groups for reserve sharing: south & west, and north & east. The reserve sharing control units within the same group are responsible for establishing links between the input channels and the corresponding input ports. Indeed, the basis of the router design is the function of the reserve sharing units. If no fault is present in the input ports of a group, the reserve sharing units operate so that each input channel uses its default input port: the south input channel uses the south input port, and so on. The additional links make it possible to tolerate faults. Every transmission port connected to a reserve sharing unit consists of data lines (DATA), a control signal (Conl), a clock control signal (Clock-Conl), credit signals and a VC-id.
• Conl - control signal that indicates the presence of input data at the router.
• Clock-Conl - signal used to synchronize data transmission.
• VC-id - the virtual channel (VC) identifier.
• Credit - signal that permits data to be transmitted when there is empty space in the input buffers of the receiver.
Each reserve sharing unit is driven by a fault control module that uses the error state of the input channel and of the modules within the input port, which is recorded in a register. According to this data and the error conditions of the neighboring port and channel, the fault control module selects which of the two input channels of the reserve sharing unit is connected to the input ports.
The next step, after the network topology has been determined, is selecting an appropriate routing algorithm, which is responsible for determining the route of a packet from source to destination. The choice of routing algorithm depends on a number of metrics, such as reducing the power consumed by routing, minimizing logic and routing tables to achieve a small area, and improving overall performance by decreasing delay and increasing the traffic utilization of the network. Many routing algorithms can be used in a network on chip. The purpose of a routing algorithm is to ensure that all data packets properly reach their destination, regardless of which algorithm is chosen. Routing algorithms may be categorized into several types, such as static or dynamic routing, minimal or non-minimal routing, and source routing.
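Returning briefly to the reserve sharing units described earlier in this section, the port selection made by the fault control module can be sketched as follows. The paper does not give the exact decision rule, so this is only one plausible reading: each input channel uses its default port unless that port is marked faulty, in which case it falls back to the port of its group partner. The names `GROUP_PARTNER` and `select_input_port` are hypothetical.

```python
# Assumed pairing of inputs for reserve sharing: south & west, north & east.
GROUP_PARTNER = {
    "south": "west",
    "west": "south",
    "north": "east",
    "east": "north",
}


def select_input_port(channel, faulty_ports):
    """Choose the input port that serves an input channel.

    channel      -- one of "north", "east", "south", "west"
    faulty_ports -- set of input ports currently recorded as faulty

    Returns the default port when it is healthy, otherwise the partner
    port of the same group, or None when both ports are faulty.
    """
    if channel not in faulty_ports:
        return channel  # fault-free case: the default port is used
    partner = GROUP_PARTNER[channel]
    if partner not in faulty_ports:
        return partner  # share the neighboring input's resources
    return None  # neither port of the group is usable
```

For example, with no faults the south channel keeps the south port, while a fault recorded on the south port redirects the south channel to the west port.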

ROUTING ALGORITHMS
a. Deterministic Algorithm
In deterministic routing algorithms, the route is established as a function of the destination address; thus the result is always the same route for the same pair of nodes. To be more precise, deterministic routing is distinct from oblivious routing. In oblivious routing, routing tables containing a number of output channels are built, and the packet selects among these predefined paths. The available route is selected according to one of several selection algorithms. With a deterministic routing algorithm, a single fixed path is followed by the packet at every step. Therefore, we conclude that deterministic algorithms are oblivious, while the converse is not always true, and deterministic routing algorithms usually follow the shortest path. The most commonly used kind of deterministic routing is XY routing, a variant of dimension-order routing.
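A minimal sketch of XY routing on an n x n torus is given below. It assumes that the X offset is resolved completely before the Y offset, that the shorter direction around each ring is taken, and that ties are broken toward east and north; the port names and tie-breaking rule are illustrative assumptions.

```python
def xy_next_port(cur, dst, n):
    """Return the output port for the next hop under XY routing.

    The X dimension is corrected first, then Y; "local" means the
    packet has arrived and is ejected to the IP.
    """
    cx, cy = cur
    dx, dy = dst
    if cx != dx:
        forward = (dx - cx) % n  # hops needed when travelling east
        return "east" if forward <= n - forward else "west"
    if cy != dy:
        forward = (dy - cy) % n  # hops needed when travelling north
        return "north" if forward <= n - forward else "south"
    return "local"


def xy_path(src, dst, n):
    """List every node visited from src to dst, both included."""
    step = {"east": (1, 0), "west": (-1, 0), "north": (0, 1), "south": (0, -1)}
    path = [src]
    cur = src
    while True:
        port = xy_next_port(cur, dst, n)
        if port == "local":
            return path
        sx, sy = step[port]
        cur = ((cur[0] + sx) % n, (cur[1] + sy) % n)
        path.append(cur)
```

Because the decision depends only on the current and destination addresses, calling `xy_path` twice with the same pair of nodes always returns the same route, which is the defining property of a deterministic algorithm.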

b. Adaptive Routing Algorithm
Another routing type is the adaptive routing algorithm. The main difference between deterministic and adaptive routing is that in adaptive routing a message has multiple paths by which it can travel toward the destination node, which may lead to a path that is not the shortest. The traffic state at each possible next node is considered when deciding the link over which the message is forwarded at each intermediate node. Adaptive routing outperforms deterministic routing under irregular traffic [10].
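The selection step of an adaptive router can be sketched as below. This is a simplified, minimal-adaptive variant: it only considers ports that bring the packet closer to the destination and picks the least congested one. The congestion metric (for instance, occupied buffer slots per output port) is a hypothetical input, and the sketch does not address deadlock avoidance.

```python
def minimal_ports(cur, dst, n):
    """Return all output ports that reduce the distance to dst on an n x n torus."""
    ports = []
    fx = (dst[0] - cur[0]) % n  # hops needed when travelling east
    fy = (dst[1] - cur[1]) % n  # hops needed when travelling north
    if fx:
        if fx <= n - fx:
            ports.append("east")
        if fx >= n - fx:
            ports.append("west")
    if fy:
        if fy <= n - fy:
            ports.append("north")
        if fy >= n - fy:
            ports.append("south")
    return ports


def adaptive_next_port(cur, dst, n, congestion):
    """Pick the least congested productive port.

    congestion -- dict mapping a port name to a load estimate;
                  ports missing from the dict count as idle (0).
    """
    candidates = minimal_ports(cur, dst, n)
    if not candidates:
        return "local"  # already at the destination
    return min(candidates, key=lambda port: congestion.get(port, 0))
```

Unlike the deterministic case, two packets with the same source and destination may leave through different ports if the congestion values change between the two decisions.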

c. Clustering Algorithm Based Hierarchical Routing (CABHR)
Deadlock freedom is one of the most important properties of a network, and various routing algorithms have been proposed to achieve it. The hierarchical routing method was suggested by R. Holsmark et al. In hierarchical routing, each subnet performs its routing function with an internal routing algorithm, and the subnets are interconnected by a global routing algorithm. In this paper, clustering algorithm based hierarchical routing logic is introduced. The whole network is logically divided into a number of clusters, whose size can vary depending on the network size, as shown in Figure 3. The clusters are separate networks, and the routers in one cluster need not be concerned with the other clusters [9]. In this work, the network is a 4 × 4 architecture divided into four clusters, each having four routers. In the clustering algorithm, the cluster head finds the nearest active node in the neighboring cluster and forwards its data to it. From the cluster heads the data reaches the sink not directly but through a self-organized, efficient routing algorithm. The purpose of a clustering algorithm is to generate and maintain a connected set of clusters, where connectivity is defined as the probability that a node is reachable from any other node. The clustering algorithm includes two stages: set-up and maintenance. Algorithms differ in their criteria for selecting cluster heads in the set-up stage. Selecting cluster heads optimally is an NP-hard problem. Any node can be a cluster head if it has the required capabilities, such as processing and transmitting power; cluster heads operate in a dual-power mode, using the higher power mode for inter-cluster transmission.
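The logical partitioning of the 4 × 4 network into four clusters of four routers can be sketched as follows. The sketch assumes square 2 × 2 clusters numbered row by row; the actual cluster shape and numbering of Figure 3 may differ, and the function names are hypothetical.

```python
def cluster_id(x, y, n=4, cluster_dim=2):
    """Map router (x, y) to its cluster index.

    Clusters are cluster_dim x cluster_dim blocks, numbered row by row
    starting from the block that contains router (0, 0).
    """
    clusters_per_row = n // cluster_dim
    return (y // cluster_dim) * clusters_per_row + (x // cluster_dim)


def build_clusters(n=4, cluster_dim=2):
    """Group all routers of an n x n network by cluster index."""
    clusters = {}
    for y in range(n):
        for x in range(n):
            cid = cluster_id(x, y, n, cluster_dim)
            clusters.setdefault(cid, []).append((x, y))
    return clusters


clusters = build_clusters()
```

With the default arguments, `clusters` contains four entries of four routers each, and cluster 0 holds routers (0, 0), (1, 0), (0, 1) and (1, 1).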

Routing Function
The Clustering Algorithm Based Hierarchical Routing function first reads the header flit of the packet. The header flit carries the destination router address and the cluster-id when the destination is in the same cluster, and the cluster-id, boundary router-id and destination router-id when the destination router is in a different cluster.
a. Case 1: if the cluster-id and the router address in the header flit both match those of the current router, the output port is set to the local IP or PE. If the cluster-ids match but the router addresses differ, internal routing is invoked with the destination router address.
b. Case 2: if the cluster-ids are different, external routing is invoked with the cluster-id and the boundary router-id.
The boundary router applies this logic to support both the internal and the external routing algorithms. The router supporting Clustering Algorithm Based Hierarchical Routing is designed with the following additional elements:
• A comparator that compares the destination address and the current address together with the cluster-ids.
• A multiplexer that selects whether the internal or the global routing function is performed.
The data packet format is shown in Figure 4.
Figure 4. Data packet format (a) when the destination is within the same cluster (b) when the destination is within a different cluster
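The two cases above can be summarized in a short sketch. The header layout, the field names and the two callback routing functions are assumptions made for illustration; the comparisons correspond to the comparator and the final choice between internal and external routing corresponds to the multiplexer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class HeaderFlit:
    """Hypothetical header fields of a CABHR packet."""
    cluster_id: int                        # cluster of the destination
    dest_router: int                       # destination router address
    boundary_router: Optional[int] = None  # set only for inter-cluster packets


def cabhr_route(
    header: HeaderFlit,
    cur_cluster: int,
    cur_router: int,
    internal_route: Callable[[int], str],
    external_route: Callable[[int, Optional[int]], str],
) -> str:
    """Select the routing function for a packet at the current router."""
    if header.cluster_id == cur_cluster:
        # Case 1: destination lies in the current cluster.
        if header.dest_router == cur_router:
            return "local"  # deliver to the local IP or PE
        return internal_route(header.dest_router)
    # Case 2: destination lies in another cluster.
    return external_route(header.cluster_id, header.boundary_router)


def demo_internal(dest_router: int) -> str:
    """Placeholder for the intra-cluster routing algorithm."""
    return f"internal->{dest_router}"


def demo_external(cluster_id: int, boundary_router: Optional[int]) -> str:
    """Placeholder for the inter-cluster (global) routing algorithm."""
    return f"external->{cluster_id}/{boundary_router}"
```

For instance, `cabhr_route(HeaderFlit(2, 0, boundary_router=1), 1, 2, demo_internal, demo_external)` takes the external branch because the destination cluster (2) differs from the current cluster (1).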

SIMULATION RESULT
The effect of the proposed router architecture on performance is investigated for 4×4 NoCs. The routers are simulated using a VHDL-based NoC model. In the router structure, every input port has four virtual channels, each four flits deep, and the packet length is fixed at 16 flits [11,12]. To assess the efficiency of the algorithms on NoCs, throughput and power consumption are taken as the evaluation metrics. The performance of the algorithms is compared for torus NoC architectures with a fixed packet size. The power consumption for data transmission from source to destination using the deterministic, adaptive and CABHR algorithms, as well as the modified router architecture using CABHR, is listed in Table 1 for a packet size of 512.

CONCLUSION
The NoC router architecture proposed in this paper applies resource sharing at the inputs together with the crossbar, using the CABHR algorithm, to achieve higher network performance. It is therefore suitable for networks with demanding power consumption and performance requirements. Increasing the number of IPs or PEs increases power consumption and the risk of deadlock, and the shared router connections in the NoC are likely to help provide low-cost fault tolerance. The method also provides a better solution for preventing deadlock conditions. The Clustering Algorithm Based Hierarchical Routing (CABHR) approach could be analyzed further, and its power and performance evaluated, for the whole network. Thus, the results indicate that our proposed router design shows better performance in terms of power consumption.