SlideShare una empresa de Scribd logo
1 de 6
Descargar para leer sin conexión
Traffic Characterization for Multicasting in NoC
                  V.Laxmi1 , Roopesh Chuggani2 , M.S.Gaur3 , Pankaj Khandelwal4 , Prateek Bansal5
                               Department of Computer Engineering
                                 National Institute of Technology
                                               Jaipur
{vlaxmi |gaurms }@mnit.ac.in,{roopesh.chuggani2 |pankaj1394 |prateekbansal.895 }@gmail.com
       1       3




   Abstract—NoC (Network on Chip) is an emerging paradigm           one core to another. Traffic modelling has been proposed as an
for design of VLSI/ULSI circuits to overcome communication          open area of research in recent papers [7]. Most evaluations
bottleneck of traditional bus based systems. NoC communica-         and analysis of NoC design parameters are still based on basic
tion framework consists of regularly placed routers, which are
connected to processing cores. NoC performance is determined        synthetic traffic patterns such as CBR (Constant Bit Rate),
by latency and throughput for communication requirements.           bursty, bit-complement, transpose, etc. These traffic patterns
NoC communication traffic modelling plays an important role          do not capture real-world scenario as each of these patterns
in design of NoC simulators and/or prototypes. This paper           comprise of only point-to-point communications, i.e. for each
presents a framework for modelling source traffic for multipoint     source there is only one destination. Traffic modelling of
communication from one source to different destinations as is
required for multicasting. Such a traffic model captures real-       multicast communication for NoC is still in infancy.
world scenarios such as multicasting, execution of concurrent          In multimedia applications such as NoC design for modules
multiple tasks on a single core (each task requiring commu-         of MPEG encoder/decoder, point-to-multipoint communica-
nication with different destinations). The model proposes how       tion patterns are also needed as experienced by authors while
concurrent traffic streams from a single core to different desti-    extending capability of an NoC simulator. This requires gen-
nations can be mathematically characterized as a single stream
at source end. The model is derived from statistical behaviour      eration of multiple traffic streams originating from the same
of probabilistically demultiplexing of a single traffic stream. In   source but destined for different cores. A similar traffic pattern
its nascent stage, the method is proposed for a scenario of one     is observed when a core is running concurrent tasks; each task
source concurrently communicating with two destinations as shall    requiring communicating with different destination.
be required for mapping two concurrent tasks to same core or           In this paper, we propose how multicast communication,
simultaneous broadcast to two destinations.
   Index Terms—Network on Chip, Multicasting, Bursty Traffic,
                                                                    i.e. multiple traffic streams originating at the source, can be
Probabilistic Demultiplexing, Exponential Distribution              viewed as a single traffic stream without any adverse impact
                                                                    on statistical characteristics of destination traffic streams. The
                      I. I NTRODUCTION                              model is derived from observations of statistical behaviour of
                                                                    received streams at destinations in a single source multiple
   VLSI designs are increasingly becoming more complex with         destinations scenario. Till now, to the best of our knowledge,
increase in scale of integration resulting in more components       no traffic model has been proposed to accurately characterize
being fabricated on the same chip. With resultant increase in       this scenario. In this initial work, we present model for
the number of processing cores (CPU, DSP, memory, etc.),            two destinations. This can be used as basis for n(n > 2)
increased inter-core communication requirement cannot be            destinations.
satisfied by the traditional bus based communication archi-             The model is based on the observation that probabilistic
tecture [1], [2]. Network on Chip (NoC) has been proposed           division of a bursty traffic stream into two separate streams
as an alternative [3]. NoC provides a communication layer           results in both streams being bursty. Burst parameter of each
of regularly placed, interconnected routers. Inter-core com-        stream is related to the that of the original stream. The
munication takes place through these routers. Decoupling of         proposed traffic model has been implemented and tested on
communication and computation simplifies IC design process.          an open source NoC simulator NIRGAM [8].
Regularity in NoC structure results in better scalability and          This paper is organized as follows: In Section II, we present
fault tolerance [2], [4]. Because of its modular structure, many    the background survey in this field. In Section III, we present
components can be reused from previous designs resulting in         objectives of the presented work and motivation for proposed
reduced time to market for new NoC designs.                         traffic model. In Section IV, we derive how statistical charac-
   NoC design parameters include topology selection, router         teristics of traffic streams received at destinations are related to
design and choice of routing function. A NoC simulator can          those of the source traffic. These relationships are derived from
assist the designer in evaluation of different NoC designs.         observations of experiments conducted. Section V describes
One important aspect of simulator design is characterization of     NoC simulator NIRGAM, on which the proposed model is
inter-core traffic. Traffic modelling of the cores is an important    implemented, in brief. In Section VI, implementation of the
step in NoC design [5], [6]. Traffic models are mathematical         proposed model on NIRGAM is described. Experimental result
characterization of statistical properties of data flowing from      are presented in Section VII followed by conclusions and


978-1-4244-8971-8/10$26.00 c 2010 IEEE
pointers for further extension in Section VIII.                                     0
                                                                                                1           2             3

                     II. R ELATED W ORK
                                                                                                                      7
   Applications needs to be mapped to the underlying NoC                                4       5           6

architecture by dividing their functionality of the application
into smaller tasks. Each task is mapped onto one NoC core.
                                                                                        8       9           10         11
Many algorithms for mapping these tasks on to IP core have
been proposed [9]–[11]. In each of previous work, a single
task is mapped onto one IP core. Most of the past work has                          12          13          14         15
been done to map a single application onto the underlying
network. In [9], the tasks of a process control platform are                            0
mapped on to NoC cores in one to one manner. In [11], Hu et                       IP Core            Task          Data Flow
al propose an energy constrained mapping of communication
task graph to a NoC. This work considers single task per core.           Fig. 1: NoC Architecture with Multiple Task per core
   NoC evaluation is based on the assumption of mapping sin-
gle task per core and point-to-point traditional traffic patterns
like bit complement, transpose [3]. This type of communi-            statistical characteristics of traffic received at the destinations.
cation is limited to only few applications, because rarely a         Following are the assumptions for our model.
node communicates with just a single node or with all the               1) There is one source and two destinations. This can
other nodes in the network. For modelling a multicast (point                 happen when at most two traffic streams are emanating
to multipoint) scenario, uniform random traffic is used by                    on a single core.
selecting a random destination for each packet; probability of          2) Each stream (task) is generating Bursty traffic; average
each destination being selected is same. In [12], a new traffic               OFF time of this traffic is modelled using exponential
pattern is proposed to create the scenario where tasks with                  distribution.
higher intertask communicating tasks are mapped to cores in             3) Traffic model is independent of burst size (Number
adjacent regions. In this traffic pattern, communication is point             of packets in a particular burst). Experimental results
to point but, traffic is distributed to multiple destinations.                suggest that traffic statistics appears to be independent
   These traffic patterns cannot model the point to multipoint                of burst size. Details are discussed in Section VII.
traffic generated by multiple tasks executing on a single
core. This is because when we map multiple tasks on single           We define following parameters for our traffic model :
core, traffic of the core is composed of the individual traffic           1) mc : Average (Mean) OFF time of the traffic generated
generated by each tasks. Each individual traffic stream can                   by the core node.
have different statistical properties and destination pattern. But      2) p1 : Probability that packet is destined for first destina-
traditional traffic generators do not provide functionality for               tion.
such a communication.                                                   3) p2 : Probability that packet is destined for second
                                                                             destination
                       III. M OTIVATION                                 4) mt1 : Average (Mean) OFF time of the traffic received
   In this paper, we try to model point-to-multipoint source                 by first destination.
traffic pattern given the statistical behaviour of traffic received       5) mt2 : Average (Mean) OFF time of the traffic received
at the destinations. This will result in multiple traffic streams             by second destination.
emerging from same core. Each traffic stream may have a                  Our model is based on the observation that when a bursty
different destination and is likely to have different statistical    traffic generated using exponential distribution with average
properties.                                                          OFF time as mc is demultiplexed probabilistically into two
   Figure 1 shows one such scenario in an NoC of size 4 × 4          traffic streams, demultiplexed traffic streams still follow expo-
wherein cores are numbered 0 to 15. Core 0 is multicasting           nential distribution. Average OFF time of each stream/task is
to cores 9 and 10 respectively. Core 7 is multicasting to cores      mt1 and mt2 respectively. Probabilistic demultiplexing means
10 and 12 respectively. There is one unicast communication           that each packet is assigned to one of the streams/tasks as per
from core 15 to core 13.                                             probabilities (p1 , p2 ). A random number is generated and if
                                                                     it is less than p1 this burst of packets belongs to first stream,
                    IV. P ROPOSED MODEL                              otherwise to second one.
   The main objective of the work presented here is to deter-           We investigate dependence of mt1 and mt2 on mc , p1 , p2 .
mine how a point-to-multipoint traffic pattern can be modelled
at source end. We need to derive statistical characteristic of the   A. Bursty Traffic Model
traffic at source given traffic characteristics at the destination.       Bursty traffic is modelled using exponential distribution [8].
For such a derivation, we first consider the inverse of the           Both inter packet interval and packet size follow exponential
objective. Given source traffic characteristics, what are the         distribution. We are concerned only with inter packet intervals.
Exponential distribution is parametrized by average value of                               To verify this observation, we generated and demultiplexed
the distribution denoted by m. The probability density function                         traffic for multiple values of mc . One such instance is shown in
(PDF) of an exponential distribution is                                                 Figure 2. Here, Figure 2(a) shows the probability distribution
                                                     x
                                                  1 −m                                  of original trace with m = 30 while Figure 2(b) shows PDF
                                                  me   ,      x≥0                       of one of the demultiplexed trace with probability 0.6. As can
                            f (x; m) =                                            (1)
                                                    0,        x<0                       be seen, both approximate to exponential distribution.
m is also known as expected value of the distribution. Fol-                             C. Deriving the relation
lowing variables are required in the traffic model
                                                                                           To seek relationship between mc , mt1 and mc , mt2 , we
B. Observation of Demultiplexed Trace                                                   generated and demultiplexed traces for various values of mc
   We generated a traffic trace with a random average OFF                                and calculated the values of mt1 and mt2 . It was found
time mc . This traffic trace was divided into two different                              that average OFF time of traffic generated by each stream
traces using probabilities (p1 , p2 ). The PDF of the original                          is directly proportional to average core OFF time.
trace was exponential as expected. PDFs of each demultiplexed                                                                                           mt1 ∝ mc                                   (2a)
trace was observed to follow similar exponential distribution.
This observation was significant because it meant that we can                                                                                            mt2 ∝ mc                                   (2b)
generate two different exponential distributions from a single
distribution by probabilistically demultiplexing.
                                                                                                                           100

                                                                                                                                         Offtime of task 1 (mt1) with probability 0.4
                120                                                                                                         90           Offtime of task 2 (mt2) with probability 0.6
                                                                                              Average Off time for tasks
                                                                                                                            80
                100
                                                                                                                            70


                   80                                                                                                       60
    Frequency




                                                                                                                            50
                   60
                                                                                                                            40


                   40                                                                                                       30


                                                                                                                            20
                   20

                                                                                                                            10
                                                                                                                                 0      5         10         15          20         25   30   35

                    0
                                                                                                                                              Average Off time at Core
                        0               50                  100            150
                                 Value of inter packet time
                                                                                                                                     Fig. 3: mt1 v/s mc and mt2 v/s mc
                                             (a) Original

                                                                                           Figure 3 shows the plot of average OFF time of core and
                   70
                                                                                        of demultiplexed traffic streams. On X axis is the average
                   60
                                                                                        OFF time of core (mc ), while on Y axis is the OFF time
                                                                                        of both streams. As can be seen, the curve comes out to be
                   50                                                                   approximately linear, hence showing direct proportionality.
                                                                                           Next, we deduce the relationship between the mt1 , mt2 and
       Frequency




                   40
                                                                                        p1 , p2 . To achieve this we kept the mc constant and probability
                                                                                        of generation was varied from 0.1 to 0.95 (p1 + p2 = 1). It
                   30
                                                                                        was found that average OFF time of traffic generated by each
                   20
                                                                                        stream is inversely proportional to respectiveprobability.
                                                                                                                                                                       1
                   10                                                                                                                                    mt1 ∝                                     (3a)
                                                                                                                                                                       p1
                    0
                        0   20     40        60      80     100   120   140 150                                                                                        1
                                 Value of inter packet time                                                                                              mt2 ∝                                     (3b)
                                                                                                                                                                       p2
                                        (b) Demultiplexed                                  The Figure 4 shows the plot of mt1 versus the
                                                                                        probability(p1 ) for mc = 50. Probability is on the X-axis
Fig. 2: (a) PDF for Original Trace, (b) PDF for a demultiplexed                         and average OFF time is on the Y-axis. As can be seen
trace (probability= 0.6)                                                                from the plot, curve precisely shows the inverse relationship.
400                                                                                                450


                                                                                                                                  400                            Actual offtime for source offtime 15
                               350
                                                                                                                                                                 Analytical offtime for source offtime 15
                                                                                                                                                                 Actual offtime for source offtime 25
                                                                                                                                  350
      Average Off time (mt1)




                               300                                                                                                                               Analytical offtime for source offtime 25
                                                                                                                                                                 Actual offtime for source offtime 35




                                                                                                               Average Off Time
                                                                                                                                  300                            Analytical offtime for source offtime 35
                               250

                                                                                                                                  250
                               200
                                                                                                                                  200

                               150
                                                                                                                                  150

                               100
                                                                                                                                  100

                                50                                                                                                 50


                                 0                                                                                                  0
                                 0.1   0.2    0.3   0.4   0.5        0.6   0.7     0.8   0.9   1                                    0.1     0.2   0.3   0.4      0.5      0.6     0.7      0.8     0.9      1
                                                     Probability (p1)                                                                                         Probability


                                       Fig. 4: Variation of mt1 w.r.t p1                                 Fig. 5: Analytical v/s actual OFF time of Task 1 for different
                                                                                                         values of mc

As the probability approaches unity the case reduces from
                                                                                                                                  900
point-multipoint scenario to point-point scenario and mt1                                                                                               Actual Off time for source off time 35
                                                                                                                                                        Analytical off time for source offtme 35
approaches mc . While for other destination, it attains a very                                                                    800
                                                                                                                                                        Actual Off time for source off time 25
high value. Using Equations (2a), (2b), (3a) and (3b) with                                                                        700
                                                                                                                                                        Analytical off time for source offtme 25
                                                                                                                                                        Actual Off time for source off time 15
curve fitting of both the curves, empirical relationship between                                                                                         Analytical off time for source offtme 15
                                                                                                               Average Off time




                                                                                                                                  600
average OFF time for each was derived as:
                                                                                                                                  500
                                                                1
                                                     mc +       p2    + c1
                                             mt1                                 + c2              (4)                            400
                                                                p1
                                                                                                                                  300
                                                                1
                                                     mc +       p1    + c3                                                        200
                                             mt2                                 + c4              (5)
                                                                p2
                                                                                                                                  100

   c1 , c2 , c3 , c4 are constants. In our case, when curve fitting                                                                  0
                                                                                                                                        0   0.1   0.2   0.3      0.4      0.5     0.6      0.7     0.8      0.9
was applied following values were obtained c1 = c3 = 6 and                                                                                                    Probability
c2 = c4 = −6.
   Verification of the Equations (4), (5) is performed in two                                             Fig. 6: Analytical v/s actual OFF time of Task 2 for different
steps. We calculate average OFF time of traffic generated by                                              values of mc
each stream in two ways :
  1) The values of mt1 and mt2 is calculated from the demul-
     tiplexed traces obtained with different values of p1 , p2                                                                                          V. NIRGAM
     and mc . These values are referred to as ‘calculated’ or                                               Network-on-chip Interconnect Routing and Application
     actual OFF time from trace.                                                                         Modelling (NIRGAM) [8] is a discrete event, cycle accu-
  2) For all the corresponding values of p1 , p2 and mc , values                                         rate simulator targeted at Network on Chip (NoC) research.
     of mt1 and mt2 is calculated using Equations (4) and (5).                                           NIRGAM is written in SystemC, which is a dynamic library
     These values are referred to as ‘analytical’ OFF time.                                              for hardware modelling built on top of C++. NIRGAM allows
   Analytical and actual values are plotted on same figure to                                             users to change various options in terms of NoC simulation
verify the derived Equations (4) and (5). Figures 5 and 6 show                                           at every stage such as routing algorithm, topologies, virtual
the result of verification. The results have been shown for                                               channels, buffers etc. Simulation framework allows analysing
different values of mc to verify our model for a range of                                                results in terms of various performance metric such as latency,
core OFF time values. On X- axis is the probability of traffic                                            throughput etc. Orion [13] has been integrated into NIRGAM
generation and transmission for each stream and on Y axis is                                             and allows users to creating and analysing power estimation
the OFF time of the traffic generated for that stream. As can be                                          graphs. NIRGAM provides support for fault tolerance [14] and
seen from the Figures 5 and 6, values from analytical formula                                            QoS [15].
very accurately estimates the actual OFF time calculate from                                                NIRGAM supports 2D mesh and 2D torus topologies. Rout-
demultiplexed trace.                                                                                     ing in NIRGAM is done using flits. These are the units that
flow between routers. NIRGAM support wormhole switching               of mt1 and mt2 while last two columns represent values
mechanism. Presently it supports a number of routing algo-           calculated from traces generated by our traffic model. It can
rithm such as XY, OE, DyaD, source, Q-routing, MaXY and              be observed that calculated values and input values are nearly
PROM. A large number of options are available when it comes          equal.
to traffic modelling in NIRGAM as it supports various type
of traffic patterns such as Hotspot NED [12] as well as traffic
injection models.                                                             TABLE I: Calculated vs Input mean OFF time
   Other user configurable parameters in NIRGAM are virtual                 Input OFF           Calculated       mc            Calculated
                                                                           time                Probability                    OFF time
channels i.e. number of virtual channels per physical channels,            Task1    Task2      p1      p2                     Task1     Task2
buffer size of an input channel, clock frequency. All these                16       25         0.60    0.40     4             15.4      22.2
parameters can be specified in the configuration file of the                  20       40         0.66    0.34     8             21.3      43.0
                                                                           16       16         0.50    0.50     3             17.1      18.0
NIRGAM before starting the simulation.                                     15       20         0.56    0.44     3             15.3      20.6
                                                                           10       20         0.65    0.35     1             12.8      22.7
       VI. I MPLEMENTATION OF P ROPOSED M ODEL                             30       10         0.26    0.74     1             32.7      10.9
   As discussed in Section IV, given the values of mc , p1 , p2
we can calculate mt1 and mt2 using Equations (4) and (5).
                                                                        We ran simulation for different values of the flit interval.
Though for implementing the proposed traffic model as a
                                                                     Simulation was done for three values of flit interval – 2, 4 and
traffic generator in any simulator it is desired that mt1 and
                                                                     8 clock cycles. Results are shown in Table II. It is observed that
mt2 should be the input parameters. Different values of these
                                                                     mean OFF time calculated from generated trace is independent
average OFF time will represent different classes of streams.
                                                                     of the flit interval. Hence, proposed traffic model can be used
To derive values of mc , p1 , p2 for given values of mt1
                                                                     with different flit intervals.
and mt2 , we use Equations (4) and (5) and the fact that
p1 + p2 = 1 along with the derived values of c1 , c2 , c3 and
c4 . A generalized version of the equation needed to solve for       TABLE II: Calculated vs Input mean OFF time for different
p1 is shown below in Equation (6).                                   Flit Intervals
                                                                      Input Off time                        Calculated OFF time
                                                                                        Flit Interval = 2     Flit Interval = 4 Flit Interval = 8
                                                                      Task1 Task2       Task1 Task2          Task1 Task2        Task1 Task2
   p3 (mt1 + mt2 + 12) − p2 (mt1 + 2 ∗ mt2 + 18) +
    1                     1
                                                                      15       20       15.8      20.0        16.2     19.0      15.6     20.2
                     p1 (mt2 + 8) − 1 = 0                     (6)     11       30       11.0      31.4        11.2     29.7      11.4     29.1
                                                                      8        11       8.7       11.4        8.7      11.5      8.6      11.6
   Equation (6) has three possible roots, the one between 0           18       18       17.8      18.5        18.5     18.4      18.0     18.2
and 1 is selected as probability values are in range [0 · · · 1].
Computed root is assigned to p1 and p2 is computed as 1−p1 .
mc can be calculated using Equation (4).
   When implementing the traffic model in NIRGAM values               TABLE III: Calculated vs Input mean OFF time for different
of mt1 and mt2 are read from a configuration file. Using these         Burst Length
values Equation (6) is solved for p1 using bisection method           Input Off time                        Calculated OFF time
[16]. Once mc , p1 , p2 are known mc is used to generate                                Burst size = 4        Burst size = 8    Burst size = 12
                                                                      Task1 Task2       Task1 Task2          Task1 Task2        Task1 Task2
bursty traffic. Each time a new burst starts a random number
is generated in range [0 · · · 1]. If the generated number is less    15       20       14.8      20.2        14.6     19.1      14.5     18.9
than p1 , first stream is allowed to transmit i.e. destination is      11       30       11.4      31.6        11.3     28.6      11.0     30.6
                                                                      8        11       8.6       11.6        8.3      11.9      8.0      12.0
chosen according to first stream for the current burst, otherwise      18       18       18.3      17.2        17.2     18.4      18.4     18.4
destination is chosen according to second stream.

               VII. E XPERIMENTAL R ESULTS
                                                                        Simulation was run with different values of the burst size.
   We ran NIRGAM simulator for different values of mt1 and           We have used three values of burst size – 4, 8 and 12 packets.
mt2 on 4 × 4 mesh topology. Traffic model was attached to             Results obtained are shown in Table III. Calculated mean
core 0 and two destinations were cores 7 & 10 respectively.          OFF time from trace is independent of the burst size of the
Traffic was generated for 5000 clock cycle and simulation was         traffic. This observation allows use of different burst sizes for
run for 8000 clock cycles. Number of virtual channels were           modelling different streams/tasks.
eight.
   To verify the traffic model, input values of mt1 and mt2                                     VIII. C ONCLUSION
(values read from configuration file as specified by the user)             This paper presented a traffic model for multicast communi-
are compared with values calculated from demultiplexed trace.        cation in NoC. This also models traffic scenario of concurrent
These values along with calculated values of mc , p1 and p2          tasks mapped to same core; each task requiring communication
are shown in Table I. Columns 1 and 2 show the input values          with different destination. Mapping multiple tasks on a single
NoC core will reduce the size of NoC chip and the cost                         [15] K. K. Paliwal, J. S. George, N. Rameshan, V. Laxmi, M. S. Gaur,
and shall provide more optimal use of network resources. To                         V. Janyani, and R. Narasimhan, “Implementation of Q O S aware Q-
                                                                                    routing algorithm for network-on-chip,” in Communications in Computer
further analyse this concept of the multicasting/multitasking,                      and Information Science, 2009.
we provide a traffic model under the assumption that each task                  [16] A. Eiger, K. Sikorski, and F. Stenger, “A bisection method for systems
generates bursty traffic. For point-multipoint communication,                        of nonlinear equations,” ACM Trans. Math. Softw., vol. 10, no. 4, pp.
                                                                                    367–377, December 1984.
the core can be viewed as generating a single stream with a
fixed average OFF time. This burst is probabilistically demul-
tiplexed into two streams. The probabilities for demultiplexing
are calculated based on specified average OFF time of traffic
generated by each communication stream. Traffic model is
implemented and verified on an open source NoC simulator
NIRGAM. Multicast traffic model is independent of inter-flit
interval and burst size. In this paper, we have presented a
novel model for simultaneous broadcast to two destinations
but the model can be extended to n(n > 2) destinations. In
latter case, the solution will require numerical method. Further
analysis of the performance of the various routing algorithms,
topologies under other traffic distributions shall be part of our
future work.

                             R EFERENCES
 [1] L. Carloni, P. Pande, and Y. Xie, “Networks-on-chip in emerging
     interconnect paradigms: Advantages and challenges,” in Networks-on-
     Chip, 2009. NoCS 2009, may 2009, pp. 93 –102.
 [2] L. Benini and G. D. Micheli, “Networks on chips: A new soc paradigm,”
     Computer, vol. 35, pp. 70–78, 2002.
 [3] W. J. Dally and B. Towles, “Route packets, not wires: on-chip intecon-
     nection networks,” in DAC ’01: Proceedings of the 38th annual Design
     Automation Conference, 2001, pp. 684–689.
 [4] J. Duato, S. Yalamanchili, and N. Lionel, Interconnection Networks: An
     Engineering Approach. San Francisco, CA, USA: Morgan Kaufmann
     Publishers Inc., 2002.
 [5] M. Ali, M. Welzl, and S. Hellebrand, “A dynamic routing mechanism
     for network on chip,” in NORCHIP Conference, 2005. 23rd, 21-22 2005,
     pp. 70 – 73.
 [6] L. Tedesco, A. Mello, L. Giacomet, N. Calazans, and F. Moraes, “Ap-
     plication driven traffic modeling for nocs,” in SBCCI ’06: Proceedings
     of the 19th annual symposium on Integrated circuits and systems design,
     2006, pp. 62–67.
 [7] R. Marculescu and P. Bogdan, “The chip is the network: Toward a sci-
     ence of network-on-chip design,” Foundations and Trends in Electronic
     Design Automation, vol. 2, no. 4, pp. 371–461, 2009.
 [8] “NIRGAM,” 2009. [Online]. Available: http://cse-trac.mnit.ac.in
 [9] T. Ahonen, D. A. Sig¨ enza-Tortosa, H. Bin, and J. Nurmi, “Topology
                            u
     optimization for application-specific networks-on-chip,” in SLIP ’04:
     Proceedings of the 2004 international workshop on System level in-
     terconnect prediction. New York, NY, USA: ACM, 2004, pp. 53–60.
[10] W. H. Ho and T. M. Pinkston, “A methodology for designing efficient
     on-chip interconnects on well-behaved communication patterns,” in
     HPCA ’03: Proceedings of the 9th International Symposium on High-
     Performance Computer Architecture. Washington, DC, USA: IEEE
     Computer Society, 2003, p. 377.
[11] J. Hu and R. Marculescu, “Energy-aware mapping for tile-based noc
     architectures under performance constraints,” in ASP-DAC ’03: Proceed-
     ings of the 2003 Asia and South Pacific Design Automation Conference.
     New York, NY, USA: ACM, 2003, pp. 233–239.
[12] A.-M. Rahmani, I. Kamali, P. Lotfi-Kamran, A. Afzali-Kusha,
     and S. Safari, “Negative exponential distribution traffic pattern for
     power/performance analysis of network on chips,” in VLSI Design, 2009
     22nd International Conference on, 5-9 2009, pp. 157 –162.
[13] A. B. Kahng, B. Li, L.-S. Peh, and K. Samadi, “Orion 2.0: A fast
     and accurate noc power and area model for early-stage design space
     exploration,” in DATE’09, 2009, pp. 423–428.
[14] C. Grecu, L. Anghel, P. P. Pande, A. Ivanov, and R. Saleh, “Essential
     fault-tolerance metrics for noc infrastructures,” in IOLTS ’07: Pro-
     ceedings of the 13th IEEE International On-Line Testing Symposium.
     Washington, DC, USA: IEEE Computer Society, 2007, pp. 37–42.

Más contenido relacionado

La actualidad más candente

Producer consumer-problems
Producer consumer-problemsProducer consumer-problems
Producer consumer-problems
Richard Ashworth
 
A simulation model of ieee 802.15.4 in om ne t++
A simulation model of ieee 802.15.4 in om ne t++A simulation model of ieee 802.15.4 in om ne t++
A simulation model of ieee 802.15.4 in om ne t++
wissem hammouda
 

La actualidad más candente (19)

PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural NetworksPR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
 
Stochastic analysis of random ad hoc networks with maximum entropy deployments
Stochastic analysis of random ad hoc networks with maximum entropy deploymentsStochastic analysis of random ad hoc networks with maximum entropy deployments
Stochastic analysis of random ad hoc networks with maximum entropy deployments
 
AREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIP
AREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIPAREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIP
AREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIP
 
PyData Los Angeles 2020 (Abhilash Majumder)
PyData Los Angeles 2020 (Abhilash Majumder)PyData Los Angeles 2020 (Abhilash Majumder)
PyData Los Angeles 2020 (Abhilash Majumder)
 
PR-155: Exploring Randomly Wired Neural Networks for Image Recognition
PR-155: Exploring Randomly Wired Neural Networks for Image RecognitionPR-155: Exploring Randomly Wired Neural Networks for Image Recognition
PR-155: Exploring Randomly Wired Neural Networks for Image Recognition
 
DIA-TORUS:A NOVEL TOPOLOGY FOR NETWORK ON CHIP DESIGN
DIA-TORUS:A NOVEL TOPOLOGY FOR NETWORK ON CHIP DESIGNDIA-TORUS:A NOVEL TOPOLOGY FOR NETWORK ON CHIP DESIGN
DIA-TORUS:A NOVEL TOPOLOGY FOR NETWORK ON CHIP DESIGN
 
Efficient Neural Architecture Search via Parameter Sharing
Efficient Neural Architecture Search via Parameter SharingEfficient Neural Architecture Search via Parameter Sharing
Efficient Neural Architecture Search via Parameter Sharing
 
On the modeling of
On the modeling ofOn the modeling of
On the modeling of
 
Producer consumer-problems
Producer consumer-problemsProducer consumer-problems
Producer consumer-problems
 
Exploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image RecognitionExploring Randomly Wired Neural Networks for Image Recognition
Exploring Randomly Wired Neural Networks for Image Recognition
 
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...Stochastic Computing Correlation Utilization in Convolutional Neural Network ...
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...
 
82
8282
82
 
87
8787
87
 
83
8383
83
 
Partially connected 3D NoC - Access Noxim.
Partially connected 3D NoC - Access Noxim. Partially connected 3D NoC - Access Noxim.
Partially connected 3D NoC - Access Noxim.
 
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
 
A simulation model of ieee 802.15.4 in om ne t++
A simulation model of ieee 802.15.4 in om ne t++A simulation model of ieee 802.15.4 in om ne t++
A simulation model of ieee 802.15.4 in om ne t++
 
PR-183: MixNet: Mixed Depthwise Convolutional Kernels
PR-183: MixNet: Mixed Depthwise Convolutional KernelsPR-183: MixNet: Mixed Depthwise Convolutional Kernels
PR-183: MixNet: Mixed Depthwise Convolutional Kernels
 
PR-366: A ConvNet for 2020s
PR-366: A ConvNet for 2020sPR-366: A ConvNet for 2020s
PR-366: A ConvNet for 2020s
 

Destacado (8)

52
5252
52
 
61
6161
61
 
53
5353
53
 
41
4141
41
 
49
4949
49
 
55
5555
55
 
94
9494
94
 
My profile
My profileMy profile
My profile
 

Similar a 62

Performance Comparison and Analysis of Mobile Ad Hoc Routing Protocols
Performance Comparison and Analysis of Mobile Ad Hoc Routing ProtocolsPerformance Comparison and Analysis of Mobile Ad Hoc Routing Protocols
Performance Comparison and Analysis of Mobile Ad Hoc Routing Protocols
CSEIJJournal
 
Ieee 2015 project list_vlsi
Ieee 2015 project list_vlsiIeee 2015 project list_vlsi
Ieee 2015 project list_vlsi
igeeks1234
 
Ieee 2015 project list_vlsi
Ieee 2015 project list_vlsiIeee 2015 project list_vlsi
Ieee 2015 project list_vlsi
igeeks1234
 

Similar a 62 (20)

Design of fault tolerant algorithm for network on chip router using field pr...
Design of fault tolerant algorithm for network on chip router  using field pr...Design of fault tolerant algorithm for network on chip router  using field pr...
Design of fault tolerant algorithm for network on chip router using field pr...
 
A New Programming Model to Simulate Wireless Sensor Networks : Finding The Be...
A New Programming Model to Simulate Wireless Sensor Networks : Finding The Be...A New Programming Model to Simulate Wireless Sensor Networks : Finding The Be...
A New Programming Model to Simulate Wireless Sensor Networks : Finding The Be...
 
APPLYING GENETIC ALGORITHM TO SOLVE PARTITIONING AND MAPPING PROBLEM FOR MESH...
APPLYING GENETIC ALGORITHM TO SOLVE PARTITIONING AND MAPPING PROBLEM FOR MESH...APPLYING GENETIC ALGORITHM TO SOLVE PARTITIONING AND MAPPING PROBLEM FOR MESH...
APPLYING GENETIC ALGORITHM TO SOLVE PARTITIONING AND MAPPING PROBLEM FOR MESH...
 
Applying Genetic Algorithm to Solve Partitioning and Mapping Problem for Mesh...
Applying Genetic Algorithm to Solve Partitioning and Mapping Problem for Mesh...Applying Genetic Algorithm to Solve Partitioning and Mapping Problem for Mesh...
Applying Genetic Algorithm to Solve Partitioning and Mapping Problem for Mesh...
 
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
A Flexible Software/Hardware Adaptive Network for Embedded Distributed Archit...
 
A FLEXIBLE SOFTWARE/HARDWARE ADAPTIVE NETWORK FOR EMBEDDED DISTRIBUTED ARCHIT...
A FLEXIBLE SOFTWARE/HARDWARE ADAPTIVE NETWORK FOR EMBEDDED DISTRIBUTED ARCHIT...A FLEXIBLE SOFTWARE/HARDWARE ADAPTIVE NETWORK FOR EMBEDDED DISTRIBUTED ARCHIT...
A FLEXIBLE SOFTWARE/HARDWARE ADAPTIVE NETWORK FOR EMBEDDED DISTRIBUTED ARCHIT...
 
Performance Comparison and Analysis of Mobile Ad Hoc Routing Protocols
Performance Comparison and Analysis of Mobile Ad Hoc Routing ProtocolsPerformance Comparison and Analysis of Mobile Ad Hoc Routing Protocols
Performance Comparison and Analysis of Mobile Ad Hoc Routing Protocols
 
Estimation of Optimized Energy and Latency Constraint for Task Allocation in ...
Estimation of Optimized Energy and Latency Constraint for Task Allocation in ...Estimation of Optimized Energy and Latency Constraint for Task Allocation in ...
Estimation of Optimized Energy and Latency Constraint for Task Allocation in ...
 
FLEXIBLE VIRTUAL ROUTING FUNCTION DEPLOYMENT IN NFV-BASED NETWORK WITH MINIMU...
FLEXIBLE VIRTUAL ROUTING FUNCTION DEPLOYMENT IN NFV-BASED NETWORK WITH MINIMU...FLEXIBLE VIRTUAL ROUTING FUNCTION DEPLOYMENT IN NFV-BASED NETWORK WITH MINIMU...
FLEXIBLE VIRTUAL ROUTING FUNCTION DEPLOYMENT IN NFV-BASED NETWORK WITH MINIMU...
 
Simulator for Energy Efficient Clustering in Mobile Ad Hoc Networks
Simulator for Energy Efficient Clustering in Mobile Ad Hoc NetworksSimulator for Energy Efficient Clustering in Mobile Ad Hoc Networks
Simulator for Energy Efficient Clustering in Mobile Ad Hoc Networks
 
Evaluation aodv
Evaluation aodvEvaluation aodv
Evaluation aodv
 
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONSRIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
 
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONSRIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
 
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONSRIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
RIVERBED-BASED NETWORK MODELING FOR MULTI-BEAM CONCURRENT TRANSMISSIONS
 
Area-Efficient Design of Scheduler for Routing Node of Network-On-Chip
Area-Efficient Design of Scheduler for Routing Node of Network-On-ChipArea-Efficient Design of Scheduler for Routing Node of Network-On-Chip
Area-Efficient Design of Scheduler for Routing Node of Network-On-Chip
 
Performance Analysis of Mesh-based NoC’s on Routing Algorithms
Performance Analysis of Mesh-based NoC’s on Routing Algorithms Performance Analysis of Mesh-based NoC’s on Routing Algorithms
Performance Analysis of Mesh-based NoC’s on Routing Algorithms
 
Evaluating feasibility of using wireless sensor networks in a coffee crop thr...
Evaluating feasibility of using wireless sensor networks in a coffee crop thr...Evaluating feasibility of using wireless sensor networks in a coffee crop thr...
Evaluating feasibility of using wireless sensor networks in a coffee crop thr...
 
Ieee 2015 project list_vlsi
Ieee 2015 project list_vlsiIeee 2015 project list_vlsi
Ieee 2015 project list_vlsi
 
Me,be ieee 2015 project list_vlsi
Me,be ieee 2015 project list_vlsiMe,be ieee 2015 project list_vlsi
Me,be ieee 2015 project list_vlsi
 
Ieee 2015 project list_vlsi
Ieee 2015 project list_vlsiIeee 2015 project list_vlsi
Ieee 2015 project list_vlsi
 

Más de srimoorthi (19)

84
8484
84
 
75
7575
75
 
73
7373
73
 
72
7272
72
 
70
7070
70
 
69
6969
69
 
68
6868
68
 
63
6363
63
 
60
6060
60
 
57
5757
57
 
56
5656
56
 
50
5050
50
 
51
5151
51
 
45
4545
45
 
44
4444
44
 
43
4343
43
 
42
4242
42
 
39
3939
39
 
38
3838
38
 

62

  • 1. Traffic Characterization for Multicasting in NoC V.Laxmi1 , Roopesh Chuggani2 , M.S.Gaur3 , Pankaj Khandelwal4 , Prateek Bansal5 Department of Computer Engineering National Institute of Technology Jaipur {vlaxmi |gaurms }@mnit.ac.in,{roopesh.chuggani2 |pankaj1394 |prateekbansal.895 }@gmail.com 1 3 Abstract—NoC (Network on Chip) is an emerging paradigm one core to another. Traffic modelling has been proposed as an for design of VLSI/ULSI circuits to overcome communication open area of research in recent papers [7]. Most evaluations bottleneck of traditional bus based systems. NoC communica- and analysis of NoC design parameters are still based on basic tion framework consists of regularly placed routers, which are connected to processing cores. NoC performance is determined synthetic traffic patterns such as CBR (Constant Bit Rate), by latency and throughput for communication requirements. bursty, bit-complement, transpose, etc. These traffic patterns NoC communication traffic modelling plays an important role do not capture real-world scenario as each of these patterns in design of NoC simulators and/or prototypes. This paper comprise of only point-to-point communications, i.e. for each presents a framework for modelling source traffic for multipoint source there is only one destination. Traffic modelling of communication from one source to different destinations as is required for multicasting. Such a traffic model captures real- multicast communication for NoC is still in infancy. world scenarios such as multicasting, execution of concurrent In multimedia applications such as NoC design for modules multiple tasks on a single core (each task requiring commu- of MPEG encoder/decoder, point-to-multipoint communica- nication with different destinations). The model proposes how tion patterns are also needed as experienced by authors while concurrent traffic streams from a single core to different desti- extending capability of an NoC simulator. This requires gen- nations can be mathematically characterized as a single stream at source end. The model is derived from statistical behaviour eration of multiple traffic streams originating from the same of probabilistically demultiplexing of a single traffic stream. In source but destined for different cores. A similar traffic pattern its nascent stage, the method is proposed for a scenario of one is observed when a core is running concurrent tasks; each task source concurrently communicating with two destinations as shall requiring communicating with different destination. be required for mapping two concurrent tasks to same core or In this paper, we propose how multicast communication, simultaneous broadcast to two destinations. Index Terms—Network on Chip, Multicasting, Bursty Traffic, i.e. multiple traffic streams originating at the source, can be Probabilistic Demultiplexing, Exponential Distribution viewed as a single traffic stream without any adverse impact on statistical characteristics of destination traffic streams. The I. I NTRODUCTION model is derived from observations of statistical behaviour of received streams at destinations in a single source multiple VLSI designs are increasingly becoming more complex with destinations scenario. Till now, to the best of our knowledge, increase in scale of integration resulting in more components no traffic model has been proposed to accurately characterize being fabricated on the same chip. With resultant increase in this scenario. In this initial work, we present model for the number of processing cores (CPU, DSP, memory, etc.), two destinations. This can be used as basis for n(n > 2) increased inter-core communication requirement cannot be destinations. satisfied by the traditional bus based communication archi- The model is based on the observation that probabilistic tecture [1], [2]. Network on Chip (NoC) has been proposed division of a bursty traffic stream into two separate streams as an alternative [3]. NoC provides a communication layer results in both streams being bursty. Burst parameter of each of regularly placed, interconnected routers. Inter-core com- stream is related to the that of the original stream. The munication takes place through these routers. Decoupling of proposed traffic model has been implemented and tested on communication and computation simplifies IC design process. an open source NoC simulator NIRGAM [8]. Regularity in NoC structure results in better scalability and This paper is organized as follows: In Section II, we present fault tolerance [2], [4]. Because of its modular structure, many the background survey in this field. In Section III, we present components can be reused from previous designs resulting in objectives of the presented work and motivation for proposed reduced time to market for new NoC designs. traffic model. In Section IV, we derive how statistical charac- NoC design parameters include topology selection, router teristics of traffic streams received at destinations are related to design and choice of routing function. A NoC simulator can those of the source traffic. These relationships are derived from assist the designer in evaluation of different NoC designs. observations of experiments conducted. Section V describes One important aspect of simulator design is characterization of NoC simulator NIRGAM, on which the proposed model is inter-core traffic. Traffic modelling of the cores is an important implemented, in brief. In Section VI, implementation of the step in NoC design [5], [6]. Traffic models are mathematical proposed model on NIRGAM is described. Experimental result characterization of statistical properties of data flowing from are presented in Section VII followed by conclusions and 978-1-4244-8971-8/10$26.00 c 2010 IEEE
  • 2. pointers for further extension in Section VIII. 0 1 2 3 II. R ELATED W ORK 7 Applications needs to be mapped to the underlying NoC 4 5 6 architecture by dividing their functionality of the application into smaller tasks. Each task is mapped onto one NoC core. 8 9 10 11 Many algorithms for mapping these tasks on to IP core have been proposed [9]–[11]. In each of previous work, a single task is mapped onto one IP core. Most of the past work has 12 13 14 15 been done to map a single application onto the underlying network. In [9], the tasks of a process control platform are 0 mapped on to NoC cores in one to one manner. In [11], Hu et IP Core Task Data Flow al propose an energy constrained mapping of communication task graph to a NoC. This work considers single task per core. Fig. 1: NoC Architecture with Multiple Task per core NoC evaluation is based on the assumption of mapping sin- gle task per core and point-to-point traditional traffic patterns like bit complement, transpose [3]. This type of communi- statistical characteristics of traffic received at the destinations. cation is limited to only few applications, because rarely a Following are the assumptions for our model. node communicates with just a single node or with all the 1) There is one source and two destinations. This can other nodes in the network. For modelling a multicast (point happen when at most two traffic streams are emanating to multipoint) scenario, uniform random traffic is used by on a single core. selecting a random destination for each packet; probability of 2) Each stream (task) is generating Bursty traffic; average each destination being selected is same. In [12], a new traffic OFF time of this traffic is modelled using exponential pattern is proposed to create the scenario where tasks with distribution. higher intertask communicating tasks are mapped to cores in 3) Traffic model is independent of burst size (Number adjacent regions. In this traffic pattern, communication is point of packets in a particular burst). Experimental results to point but, traffic is distributed to multiple destinations. suggest that traffic statistics appears to be independent These traffic patterns cannot model the point to multipoint of burst size. Details are discussed in Section VII. traffic generated by multiple tasks executing on a single core. This is because when we map multiple tasks on single We define following parameters for our traffic model : core, traffic of the core is composed of the individual traffic 1) mc : Average (Mean) OFF time of the traffic generated generated by each tasks. Each individual traffic stream can by the core node. have different statistical properties and destination pattern. But 2) p1 : Probability that packet is destined for first destina- traditional traffic generators do not provide functionality for tion. such a communication. 3) p2 : Probability that packet is destined for second destination III. M OTIVATION 4) mt1 : Average (Mean) OFF time of the traffic received In this paper, we try to model point-to-multipoint source by first destination. traffic pattern given the statistical behaviour of traffic received 5) mt2 : Average (Mean) OFF time of the traffic received at the destinations. This will result in multiple traffic streams by second destination. emerging from same core. Each traffic stream may have a Our model is based on the observation that when a bursty different destination and is likely to have different statistical traffic generated using exponential distribution with average properties. OFF time as mc is demultiplexed probabilistically into two Figure 1 shows one such scenario in an NoC of size 4 × 4 traffic streams, demultiplexed traffic streams still follow expo- wherein cores are numbered 0 to 15. Core 0 is multicasting nential distribution. Average OFF time of each stream/task is to cores 9 and 10 respectively. Core 7 is multicasting to cores mt1 and mt2 respectively. Probabilistic demultiplexing means 10 and 12 respectively. There is one unicast communication that each packet is assigned to one of the streams/tasks as per from core 15 to core 13. probabilities (p1 , p2 ). A random number is generated and if it is less than p1 this burst of packets belongs to first stream, IV. P ROPOSED MODEL otherwise to second one. The main objective of the work presented here is to deter- We investigate dependence of mt1 and mt2 on mc , p1 , p2 . mine how a point-to-multipoint traffic pattern can be modelled at source end. We need to derive statistical characteristic of the A. Bursty Traffic Model traffic at source given traffic characteristics at the destination. Bursty traffic is modelled using exponential distribution [8]. For such a derivation, we first consider the inverse of the Both inter packet interval and packet size follow exponential objective. Given source traffic characteristics, what are the distribution. We are concerned only with inter packet intervals.
  • 3. Exponential distribution is parametrized by average value of To verify this observation, we generated and demultiplexed the distribution denoted by m. The probability density function traffic for multiple values of mc . One such instance is shown in (PDF) of an exponential distribution is Figure 2. Here, Figure 2(a) shows the probability distribution x 1 −m of original trace with m = 30 while Figure 2(b) shows PDF me , x≥0 of one of the demultiplexed trace with probability 0.6. As can f (x; m) = (1) 0, x<0 be seen, both approximate to exponential distribution. m is also known as expected value of the distribution. Fol- C. Deriving the relation lowing variables are required in the traffic model To seek relationship between mc , mt1 and mc , mt2 , we B. Observation of Demultiplexed Trace generated and demultiplexed traces for various values of mc We generated a traffic trace with a random average OFF and calculated the values of mt1 and mt2 . It was found time mc . This traffic trace was divided into two different that average OFF time of traffic generated by each stream traces using probabilities (p1 , p2 ). The PDF of the original is directly proportional to average core OFF time. trace was exponential as expected. PDFs of each demultiplexed mt1 ∝ mc (2a) trace was observed to follow similar exponential distribution. This observation was significant because it meant that we can mt2 ∝ mc (2b) generate two different exponential distributions from a single distribution by probabilistically demultiplexing. 100 Offtime of task 1 (mt1) with probability 0.4 120 90 Offtime of task 2 (mt2) with probability 0.6 Average Off time for tasks 80 100 70 80 60 Frequency 50 60 40 40 30 20 20 10 0 5 10 15 20 25 30 35 0 Average Off time at Core 0 50 100 150 Value of inter packet time Fig. 3: mt1 v/s mc and mt2 v/s mc (a) Original Figure 3 shows the plot of average OFF time of core and 70 of demultiplexed traffic streams. On X axis is the average 60 OFF time of core (mc ), while on Y axis is the OFF time of both streams. As can be seen, the curve comes out to be 50 approximately linear, hence showing direct proportionality. Next, we deduce the relationship between the mt1 , mt2 and Frequency 40 p1 , p2 . To achieve this we kept the mc constant and probability of generation was varied from 0.1 to 0.95 (p1 + p2 = 1). It 30 was found that average OFF time of traffic generated by each 20 stream is inversely proportional to respectiveprobability. 1 10 mt1 ∝ (3a) p1 0 0 20 40 60 80 100 120 140 150 1 Value of inter packet time mt2 ∝ (3b) p2 (b) Demultiplexed The Figure 4 shows the plot of mt1 versus the probability(p1 ) for mc = 50. Probability is on the X-axis Fig. 2: (a) PDF for Original Trace, (b) PDF for a demultiplexed and average OFF time is on the Y-axis. As can be seen trace (probability= 0.6) from the plot, curve precisely shows the inverse relationship.
  • 4. 400 450 400 Actual offtime for source offtime 15 350 Analytical offtime for source offtime 15 Actual offtime for source offtime 25 350 Average Off time (mt1) 300 Analytical offtime for source offtime 25 Actual offtime for source offtime 35 Average Off Time 300 Analytical offtime for source offtime 35 250 250 200 200 150 150 100 100 50 50 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Probability (p1) Probability Fig. 4: Variation of mt1 w.r.t p1 Fig. 5: Analytical v/s actual OFF time of Task 1 for different values of mc As the probability approaches unity the case reduces from 900 point-multipoint scenario to point-point scenario and mt1 Actual Off time for source off time 35 Analytical off time for source offtme 35 approaches mc . While for other destination, it attains a very 800 Actual Off time for source off time 25 high value. Using Equations (2a), (2b), (3a) and (3b) with 700 Analytical off time for source offtme 25 Actual Off time for source off time 15 curve fitting of both the curves, empirical relationship between Analytical off time for source offtme 15 Average Off time 600 average OFF time for each was derived as: 500 1 mc + p2 + c1 mt1 + c2 (4) 400 p1 300 1 mc + p1 + c3 200 mt2 + c4 (5) p2 100 c1 , c2 , c3 , c4 are constants. In our case, when curve fitting 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 was applied following values were obtained c1 = c3 = 6 and Probability c2 = c4 = −6. Verification of the Equations (4), (5) is performed in two Fig. 6: Analytical v/s actual OFF time of Task 2 for different steps. We calculate average OFF time of traffic generated by values of mc each stream in two ways : 1) The values of mt1 and mt2 is calculated from the demul- tiplexed traces obtained with different values of p1 , p2 V. NIRGAM and mc . These values are referred to as ‘calculated’ or Network-on-chip Interconnect Routing and Application actual OFF time from trace. Modelling (NIRGAM) [8] is a discrete event, cycle accu- 2) For all the corresponding values of p1 , p2 and mc , values rate simulator targeted at Network on Chip (NoC) research. of mt1 and mt2 is calculated using Equations (4) and (5). NIRGAM is written in SystemC, which is a dynamic library These values are referred to as ‘analytical’ OFF time. for hardware modelling built on top of C++. NIRGAM allows Analytical and actual values are plotted on same figure to users to change various options in terms of NoC simulation verify the derived Equations (4) and (5). Figures 5 and 6 show at every stage such as routing algorithm, topologies, virtual the result of verification. The results have been shown for channels, buffers etc. Simulation framework allows analysing different values of mc to verify our model for a range of results in terms of various performance metric such as latency, core OFF time values. On X- axis is the probability of traffic throughput etc. Orion [13] has been integrated into NIRGAM generation and transmission for each stream and on Y axis is and allows users to creating and analysing power estimation the OFF time of the traffic generated for that stream. As can be graphs. NIRGAM provides support for fault tolerance [14] and seen from the Figures 5 and 6, values from analytical formula QoS [15]. very accurately estimates the actual OFF time calculate from NIRGAM supports 2D mesh and 2D torus topologies. Rout- demultiplexed trace. ing in NIRGAM is done using flits. These are the units that
  • 5. flow between routers. NIRGAM support wormhole switching of mt1 and mt2 while last two columns represent values mechanism. Presently it supports a number of routing algo- calculated from traces generated by our traffic model. It can rithm such as XY, OE, DyaD, source, Q-routing, MaXY and be observed that calculated values and input values are nearly PROM. A large number of options are available when it comes equal. to traffic modelling in NIRGAM as it supports various type of traffic patterns such as Hotspot NED [12] as well as traffic injection models. TABLE I: Calculated vs Input mean OFF time Other user configurable parameters in NIRGAM are virtual Input OFF Calculated mc Calculated time Probability OFF time channels i.e. number of virtual channels per physical channels, Task1 Task2 p1 p2 Task1 Task2 buffer size of an input channel, clock frequency. All these 16 25 0.60 0.40 4 15.4 22.2 parameters can be specified in the configuration file of the 20 40 0.66 0.34 8 21.3 43.0 16 16 0.50 0.50 3 17.1 18.0 NIRGAM before starting the simulation. 15 20 0.56 0.44 3 15.3 20.6 10 20 0.65 0.35 1 12.8 22.7 VI. I MPLEMENTATION OF P ROPOSED M ODEL 30 10 0.26 0.74 1 32.7 10.9 As discussed in Section IV, given the values of mc , p1 , p2 we can calculate mt1 and mt2 using Equations (4) and (5). We ran simulation for different values of the flit interval. Though for implementing the proposed traffic model as a Simulation was done for three values of flit interval – 2, 4 and traffic generator in any simulator it is desired that mt1 and 8 clock cycles. Results are shown in Table II. It is observed that mt2 should be the input parameters. Different values of these mean OFF time calculated from generated trace is independent average OFF time will represent different classes of streams. of the flit interval. Hence, proposed traffic model can be used To derive values of mc , p1 , p2 for given values of mt1 with different flit intervals. and mt2 , we use Equations (4) and (5) and the fact that p1 + p2 = 1 along with the derived values of c1 , c2 , c3 and c4 . A generalized version of the equation needed to solve for TABLE II: Calculated vs Input mean OFF time for different p1 is shown below in Equation (6). Flit Intervals Input Off time Calculated OFF time Flit Interval = 2 Flit Interval = 4 Flit Interval = 8 Task1 Task2 Task1 Task2 Task1 Task2 Task1 Task2 p3 (mt1 + mt2 + 12) − p2 (mt1 + 2 ∗ mt2 + 18) + 1 1 15 20 15.8 20.0 16.2 19.0 15.6 20.2 p1 (mt2 + 8) − 1 = 0 (6) 11 30 11.0 31.4 11.2 29.7 11.4 29.1 8 11 8.7 11.4 8.7 11.5 8.6 11.6 Equation (6) has three possible roots, the one between 0 18 18 17.8 18.5 18.5 18.4 18.0 18.2 and 1 is selected as probability values are in range [0 · · · 1]. Computed root is assigned to p1 and p2 is computed as 1−p1 . mc can be calculated using Equation (4). When implementing the traffic model in NIRGAM values TABLE III: Calculated vs Input mean OFF time for different of mt1 and mt2 are read from a configuration file. Using these Burst Length values Equation (6) is solved for p1 using bisection method Input Off time Calculated OFF time [16]. Once mc , p1 , p2 are known mc is used to generate Burst size = 4 Burst size = 8 Burst size = 12 Task1 Task2 Task1 Task2 Task1 Task2 Task1 Task2 bursty traffic. Each time a new burst starts a random number is generated in range [0 · · · 1]. If the generated number is less 15 20 14.8 20.2 14.6 19.1 14.5 18.9 than p1 , first stream is allowed to transmit i.e. destination is 11 30 11.4 31.6 11.3 28.6 11.0 30.6 8 11 8.6 11.6 8.3 11.9 8.0 12.0 chosen according to first stream for the current burst, otherwise 18 18 18.3 17.2 17.2 18.4 18.4 18.4 destination is chosen according to second stream. VII. E XPERIMENTAL R ESULTS Simulation was run with different values of the burst size. We ran NIRGAM simulator for different values of mt1 and We have used three values of burst size – 4, 8 and 12 packets. mt2 on 4 × 4 mesh topology. Traffic model was attached to Results obtained are shown in Table III. Calculated mean core 0 and two destinations were cores 7 & 10 respectively. OFF time from trace is independent of the burst size of the Traffic was generated for 5000 clock cycle and simulation was traffic. This observation allows use of different burst sizes for run for 8000 clock cycles. Number of virtual channels were modelling different streams/tasks. eight. To verify the traffic model, input values of mt1 and mt2 VIII. C ONCLUSION (values read from configuration file as specified by the user) This paper presented a traffic model for multicast communi- are compared with values calculated from demultiplexed trace. cation in NoC. This also models traffic scenario of concurrent These values along with calculated values of mc , p1 and p2 tasks mapped to same core; each task requiring communication are shown in Table I. Columns 1 and 2 show the input values with different destination. Mapping multiple tasks on a single
  • 6. NoC core will reduce the size of NoC chip and the cost [15] K. K. Paliwal, J. S. George, N. Rameshan, V. Laxmi, M. S. Gaur, and shall provide more optimal use of network resources. To V. Janyani, and R. Narasimhan, “Implementation of Q O S aware Q- routing algorithm for network-on-chip,” in Communications in Computer further analyse this concept of the multicasting/multitasking, and Information Science, 2009. we provide a traffic model under the assumption that each task [16] A. Eiger, K. Sikorski, and F. Stenger, “A bisection method for systems generates bursty traffic. For point-multipoint communication, of nonlinear equations,” ACM Trans. Math. Softw., vol. 10, no. 4, pp. 367–377, December 1984. the core can be viewed as generating a single stream with a fixed average OFF time. This burst is probabilistically demul- tiplexed into two streams. The probabilities for demultiplexing are calculated based on specified average OFF time of traffic generated by each communication stream. Traffic model is implemented and verified on an open source NoC simulator NIRGAM. Multicast traffic model is independent of inter-flit interval and burst size. In this paper, we have presented a novel model for simultaneous broadcast to two destinations but the model can be extended to n(n > 2) destinations. In latter case, the solution will require numerical method. Further analysis of the performance of the various routing algorithms, topologies under other traffic distributions shall be part of our future work. R EFERENCES [1] L. Carloni, P. Pande, and Y. Xie, “Networks-on-chip in emerging interconnect paradigms: Advantages and challenges,” in Networks-on- Chip, 2009. NoCS 2009, may 2009, pp. 93 –102. [2] L. Benini and G. D. Micheli, “Networks on chips: A new soc paradigm,” Computer, vol. 35, pp. 70–78, 2002. [3] W. J. Dally and B. Towles, “Route packets, not wires: on-chip intecon- nection networks,” in DAC ’01: Proceedings of the 38th annual Design Automation Conference, 2001, pp. 684–689. [4] J. Duato, S. Yalamanchili, and N. Lionel, Interconnection Networks: An Engineering Approach. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2002. [5] M. Ali, M. Welzl, and S. Hellebrand, “A dynamic routing mechanism for network on chip,” in NORCHIP Conference, 2005. 23rd, 21-22 2005, pp. 70 – 73. [6] L. Tedesco, A. Mello, L. Giacomet, N. Calazans, and F. Moraes, “Ap- plication driven traffic modeling for nocs,” in SBCCI ’06: Proceedings of the 19th annual symposium on Integrated circuits and systems design, 2006, pp. 62–67. [7] R. Marculescu and P. Bogdan, “The chip is the network: Toward a sci- ence of network-on-chip design,” Foundations and Trends in Electronic Design Automation, vol. 2, no. 4, pp. 371–461, 2009. [8] “NIRGAM,” 2009. [Online]. Available: http://cse-trac.mnit.ac.in [9] T. Ahonen, D. A. Sig¨ enza-Tortosa, H. Bin, and J. Nurmi, “Topology u optimization for application-specific networks-on-chip,” in SLIP ’04: Proceedings of the 2004 international workshop on System level in- terconnect prediction. New York, NY, USA: ACM, 2004, pp. 53–60. [10] W. H. Ho and T. M. Pinkston, “A methodology for designing efficient on-chip interconnects on well-behaved communication patterns,” in HPCA ’03: Proceedings of the 9th International Symposium on High- Performance Computer Architecture. Washington, DC, USA: IEEE Computer Society, 2003, p. 377. [11] J. Hu and R. Marculescu, “Energy-aware mapping for tile-based noc architectures under performance constraints,” in ASP-DAC ’03: Proceed- ings of the 2003 Asia and South Pacific Design Automation Conference. New York, NY, USA: ACM, 2003, pp. 233–239. [12] A.-M. Rahmani, I. Kamali, P. Lotfi-Kamran, A. Afzali-Kusha, and S. Safari, “Negative exponential distribution traffic pattern for power/performance analysis of network on chips,” in VLSI Design, 2009 22nd International Conference on, 5-9 2009, pp. 157 –162. [13] A. B. Kahng, B. Li, L.-S. Peh, and K. Samadi, “Orion 2.0: A fast and accurate noc power and area model for early-stage design space exploration,” in DATE’09, 2009, pp. 423–428. [14] C. Grecu, L. Anghel, P. P. Pande, A. Ivanov, and R. Saleh, “Essential fault-tolerance metrics for noc infrastructures,” in IOLTS ’07: Pro- ceedings of the 13th IEEE International On-Line Testing Symposium. Washington, DC, USA: IEEE Computer Society, 2007, pp. 37–42.