A.C.S. Beck et al.
these conflicting design constraints in a sustainable fashion, while still allowing huge
fabrication volumes. Each challenge is developed in detail throughout the next
chapters, which provide an extensive literature review and set out a promising
research agenda for adaptability.
1.1 Performance Gap
The possibility of increasing the number of transistors inside an integrated circuit
year after year, according to Moore's Law, has sustained performance growth for
decades. However, this law, as known today, will no longer hold in the near future.
The reason is very simple: the physical limits of silicon [11, 19]. Because of that,
new technologies that will completely or partially replace silicon are arising.
However, according to the ITRS roadmap [10], these technologies either achieve
higher density levels but are slower than traditional scaled CMOS, or entirely
the opposite: the new devices can achieve higher speeds, but with a huge area and
power overhead, even when compared against future CMOS technologies.
Additionally, high-performance architectures such as the widespread superscalar
machines are reaching their limits. According to what is discussed in [3, 7], and [17],
there are no novel research results in such systems regarding performance improvements.
The advances in ILP (Instruction-Level Parallelism) exploitation are stagnating:
considering Intel's family of processors, the overall efficiency (a comparison of
processor performance at the same clock frequency) has not significantly
increased since the Pentium Pro in 1995. The newest Intel architectures follow the
same trend: the Core2 microarchitecture has not presented a significant increase in
its IPC (Instructions per Cycle) rate, as demonstrated in [15].
Performance stagnation occurs because these architectures are approaching some
well-known limits of ILP [21]. Therefore, even small increases in ILP have
become extremely costly. One of the techniques used to increase ILP is the careful
choice of the dispatch width. However, the dispatch width has a serious impact on
the overall circuit area. For example, the register bank area grows cubically with
the dispatch width, considering a typical superscalar processor such as the MIPS
R10000 [5].
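A small illustrative model (not taken from the chapter; the port counts and register counts are assumptions) shows where the cubic growth comes from: a width-w dispatch needs roughly w-proportional port counts, the bit-cell area grows with the square of the port count, and wider dispatch also requires proportionally more physical registers.

```python
# Illustrative model of register-file area vs. dispatch width.
# Assumptions (not from the text): 3 ports per dispatch slot
# (2 read + 1 write), bit-cell area ~ ports^2 (wordlines x bitlines),
# and the number of registers scaling linearly with the width.

def register_file_area(width, base_regs=32):
    """Register-file area relative to a single-issue baseline."""
    ports = 3 * width             # ports scale with dispatch width
    regs = base_regs * width      # more in-flight state to hold
    baseline = (3 ** 2) * base_regs
    return (ports ** 2) * regs / baseline

for w in (1, 2, 4, 8):
    print(w, register_file_area(w))   # grows as w^3: 1, 8, 64, 512
```

Doubling the dispatch width multiplies the register-file area by eight in this model, which is why aggressive superscalar widths quickly stop paying off.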
In [1], the so-called "Mobile Supercomputers" are discussed: embedded devices
that will need to perform several computationally intensive tasks, such as real-time
speech recognition, cryptography, and augmented reality, besides conventional
ones like word processing and e-mail. Even considering desktop computer
processors, new architectures may not meet the requirements of future, more
computationally demanding embedded systems, giving rise to a performance gap.
1 Adaptability: The Key for Future Embedded Systems
1.2 Power and Energy Constraints
In addition to performance, one should take into account that potentially the
largest problem in embedded systems design is excessive power consumption.
Future embedded systems are expected not to exceed 75 mW, since batteries do not
have an equivalent of Moore's law [1]. Furthermore, leakage power is becoming more
important and, while a system is in standby mode, leakage will be the dominant
source of power consumption. Nowadays, in general-purpose microprocessors, the
leakage power dissipation is between 20 and 30 W (considering a total power budget
of 100 W) [18].
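A first-order power model (all numbers here are assumptions for illustration, scaled to the milliwatt range of embedded devices rather than the 100 W figure above) makes the standby argument concrete: dynamic power vanishes with activity and frequency, while leakage stays as a constant floor.

```python
# First-order CMOS power model (assumed figures, illustration only):
# dynamic power = activity * C * Vdd^2 * f, leakage is a fixed floor.
def power_mw(freq_mhz, activity, cap_nf=1.0, vdd=1.0, leak_mw=25.0):
    """Total power in mW; nF * V^2 * MHz conveniently yields mW."""
    dynamic = activity * cap_nf * vdd ** 2 * freq_mhz
    return dynamic + leak_mw

active = power_mw(freq_mhz=100.0, activity=0.5)   # 50 mW dynamic + 25 mW leak
standby = power_mw(freq_mhz=0.0, activity=0.0)    # leakage only
print(active, standby)   # 75.0 25.0
```

In standby the clock is gated (f = 0), so in this model every remaining milliwatt is leakage, which is exactly why low-leakage techniques matter for battery-powered devices.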
One can observe that, in order to meet these power constraints, companies are
migrating to chip multiprocessors to take advantage of the extra available area, even
though there is still huge potential to speed up single-threaded software. In
essence, the stagnation of clock frequency increases, excessive power consumption,
and the higher hardware cost of ILP exploitation, together with the foreseen slower
technologies, are new architectural challenges that must be dealt with.
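A standard Amdahl's-law calculation (not from the text; the 90% parallel fraction is an assumed example) illustrates why multiprocessors do not remove the potential of single-threaded speedups: the serial part of a program caps the multicore gain.

```python
# Amdahl's law: speedup of a program with a given parallel fraction
# when run on n cores (assumed fraction, for illustration).
def speedup(parallel_fraction, n_cores):
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

for n in (2, 8, 64):
    print(n, round(speedup(0.9, n), 2))
```

Even with 90% of the work parallelized, 64 cores give well under a 10x speedup, so faster single-thread execution remains valuable.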
1.3 Reuse of Existing Binary Code
Among the thousands of products launched by consumer electronics companies, one
can observe those which become a great success and those which completely fail.
The explanation perhaps lies not just in their quality, but also in their
standardization in the industry and in the final user's concern about how long the
product being acquired will keep receiving updates.
The x86 architecture is one of the major examples. By today's standards, the
x86 ISA (Instruction Set Architecture) itself does not follow the latest trends
in processor architecture. It was developed at a time when memory was considered
very expensive and developers competed over who would implement more, and more
diverse, instructions in their architectures. The x86 ISA is a typical example of
a traditional CISC machine. Nowadays, the newest x86-compatible architectures
spend extra pipeline stages plus a considerable area in control logic and
microprogrammable ROM just to decode these CISC instructions into RISC-like ones.
This way, it is possible to implement deep pipelining and all the other
high-performance RISC techniques while maintaining the x86 instruction set and,
consequently, backward software compatibility.
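A toy sketch of this decoding step (the mnemonics and micro-op names are invented; real x86 decoders are vastly more complex) shows the idea: a single memory-to-register CISC instruction is cracked into RISC-like micro-ops that touch memory only through explicit loads.

```python
# Toy CISC-to-micro-op decoder (invented instruction format, for
# illustration only).  A memory-operand ADD becomes a LOAD into a
# temporary register followed by a register-register ADD.
def decode(cisc):
    op, dst, src = cisc
    if op == "ADD" and src.startswith("["):   # ADD reg, [mem]
        addr = src.strip("[]")
        return [("LOAD", "tmp0", addr),       # tmp0 <- mem[addr]
                ("ADD", dst, "tmp0")]         # dst  <- dst + tmp0
    return [cisc]                             # simple ops pass through

print(decode(("ADD", "eax", "[ebx]")))
# [('LOAD', 'tmp0', 'ebx'), ('ADD', 'eax', 'tmp0')]
```

The micro-op stream on the right-hand side is what the deep RISC-style pipeline actually executes, while the programmer-visible ISA stays unchanged.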
Although new instructions have been included in the original x86 instruction
set, such as the MMX and SSE SIMD extensions [6] targeting multimedia applications,
there is still support for the original instructions implemented in the very first x86
processor. This means that any software written for any x86 in the past, even for
processors launched at the end of the 1970s, can be executed on the latest Intel
processor. This is one of the keys to the success of this family: the possibility of
reusing the existing binary code, without any kind of modification. This was one of
the main reasons
why this product became the leader in its market segment. Intel could guarantee to
its consumers that their programs would not become obsolete for a long period of
time and that, even when changing to a faster system, they would still be able to
reuse and execute the same software.
Therefore, companies such as Intel and AMD keep implementing ever more power-
consuming superscalar techniques, pushing the increase of their operating frequency
to the extreme. Branch predictors with higher accuracy, more advanced algorithms
for parallelism detection, and the use of Simultaneous Multithreading (SMT)
architectures, like Intel Hyperthreading [12], are some of the known strategies.
However, the basic principle used in high-performance architectures is still the
same: superscalarity. As embedded products rely more and more on a huge amount
of software development, the cost of sustaining legacy code will most likely have
to be taken into consideration when new platforms come to the market.
1.4 Yield and Manufacturing Costs
In [16], a discussion is made about the future of fabrication processes using
new technologies. According to the authors, standard cells as they exist today will
disappear. As the manufacturing interface changes, regular fabrics will soon become
a necessity. How much regularity versus how much configurability is necessary
(as well as the granularity of these regular circuits) is still an open question.
Regularity can be understood as the replication of equal parts, or blocks, to
compose a whole. These blocks can be composed of gates, standard cells, or standard
blocks, to name a few. What is almost a consensus is that the freedom of the
designers, represented by the irregularity of the project, will be more expensive in
the future. By using regular circuits, the design company can decrease costs as
well as the likelihood of manufacturing defects, since the reliability of printing the
geometries employed today at 65 nm and below is a big issue. In [4] it is claimed
that the main focus for researchers developing a new system may become reliability
instead of performance.
Nowadays, the amount of resources needed to create an ASIC design of moderately
high volume, complexity, and low power is considered very high. Some design
companies can still succeed in doing it because they have experienced designers,
infrastructure, and expertise. However, for the very same reasons, there are
companies that just cannot afford it. For these companies, a more regular fabric
seems the best compromise for using an advanced process. As an example, in 1997
there were 11,000 ASIC design starts. This number dropped to 1,400 in 2003 [20].
The mask cost seems to be the primary problem. For example, mask costs for a
typical system-on-chip have gone from $800,000 at 65 nm to $2.8 million at
28 nm [8]. To maintain the same number of ASIC designs, their costs would need
to return to tens of thousands of dollars.
The costs of the lithography tool chain used to fabricate CMOS transistors
are also a major source of expense. According to [18], the cost of lithography
steppers increased from $10 to $35 million within a decade. As a result, the cost
of a modern factory varies between $2 and $3 billion. On the other hand, the cost
per transistor decreases: even though it is more expensive to build a circuit
nowadays, more transistors are integrated onto one die.
Moreover, design and verification costs are very likely growing in the same
proportion, impacting the final cost even more. For the 0.8 μm technology, the
non-recurring engineering (NRE) costs were only about $40,000. With each advance
in IC technology, the NRE costs have dramatically increased: NRE costs for a
0.18 μm design are around $350,000, and at 0.13 μm they exceed $1 million [20].
This trend is expected to continue at each subsequent technology node, making it
ever more difficult for designers to justify producing an ASIC in current
technologies.
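A back-of-the-envelope amortization (the $1 million NRE figure comes from the text; the $5 marginal production cost is an assumption) shows why only high-volume designs justify an advanced node: the per-chip share of the fixed NRE cost shrinks only at large volumes.

```python
# NRE amortization sketch.  Assumed marginal cost per chip; the NRE
# figure for 0.13 um is the one quoted in the text.
def unit_cost(nre, volume, marginal=5.0):
    """Total cost per chip: amortized NRE plus marginal production cost."""
    return nre / volume + marginal

nre_130nm = 1_000_000
for volume in (10_000, 100_000, 1_000_000):
    print(volume, unit_cost(nre_130nm, volume))   # 105.0, 15.0, 6.0 dollars
```

At ten thousand units the NRE dominates the chip cost twenty-fold over the marginal cost; at a million units it nearly disappears, which is the crossover-volume argument made above.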
The time it takes for a design to be manufactured at a fabrication facility and
returned to the designers in the form of an initial IC (the turnaround time) is also
increasing. Longer turnaround times lead to higher design costs and may imply a
loss of revenue if the design is late to the market.
Because of all the issues discussed above, there is a limit to the number of
situations that can justify producing designs in the latest IC technology. Already
in 2003, fewer than 1,000 out of every 10,000 ASIC designs had volumes high
enough to justify fabrication at 0.13 μm [20]. Therefore, if the costs and time of
producing a high-end IC keep growing, just a few designs will justify their
production in the future. The problems of increasing design costs and long
turnaround times become even more noticeable due to increasing market pressure:
the time available for a company to introduce a product into the market is
shrinking, so the design of new ICs is increasingly driven by time-to-market
concerns.
Nevertheless, there will be a crossover point where, if a company needs a more
customized silicon implementation, it will have to afford the mask and production
costs. However, economics is clearly pushing designers toward more regular
structures that can be manufactured in larger quantities. A regular fabric would
solve the mask cost problem and many other issues, such as printability, extraction,
power integrity, testing, and yield. Customization of a product, however, cannot
rely solely on software programming, mostly for energy efficiency reasons. This
way, some form of hardware adaptability must be present to ensure that low-cost,
mass-produced devices can still be tuned for different application needs, without
redesign and fabrication costs.
1.5 Memory
Memories have been a concern since the early years of computing systems. Whether
due to size, manufacturing cost, bandwidth, reliability or energy consumption,
special care has always been taken when designing the memory structure of a
system. The historical and ever growing gap between the access time of memories
and the throughput of processors has also driven the development of very advanced
and large cache memories, with complex allocation and replacement schemes.
Moreover, the growing integration capacity of manufacturing processes has further
fueled the use of large on-chip caches, which occupy a significant fraction of the
silicon area in most current IC designs. Thus, memories nowadays represent a
significant component of the overall cost, performance, and power consumption of
most systems, creating the need for careful design and dimensioning of the
memory-related subsystems.
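The processor-memory gap mentioned above is usually quantified with the standard average memory access time (AMAT) formula; the latencies and miss rates below are assumed example values, not figures from the text.

```python
# Average memory access time: hit time plus the miss rate weighted
# by the miss penalty (all latencies in cycles, assumed values).
def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

# A small, fast cache with many misses vs. a larger, slower one:
small = amat(hit_time=1.0, miss_rate=0.10, miss_penalty=100.0)  # 11.0
large = amat(hit_time=3.0, miss_rate=0.02, miss_penalty=100.0)  # 5.0
print(small, large)
```

Even though the larger cache triples the hit latency, its lower miss rate more than halves the average access time, which is the kind of trade-off that drives the complex cache hierarchies described here.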
The development of memories for current embedded systems is supported mainly
by the scaling of transistors. Thus, the same basic SRAM, DRAM and Flash
cells have been used generation after generation with smaller transistors. While
this approach improves latency and density, it also brings several new challenges.
As leakage current does not decrease at the same pace as density increases, static
power dissipation is already a major concern for memory architectures, leading to
joint efforts at all design levels. While research at the device level tries to provide
low-leakage cells [23], research at the architecture level tries to power off memory
banks whenever possible [13, 24]. Moreover, the reduced critical charge increases
soft error rates and places greater pressure on efficient error correction techniques,
especially for safety-critical applications. The reduced feature sizes also increase
process variability, leading to increased losses in yield. Thus, extensive research
is required to maintain the performance and energy consumption improvements
expected from the next generations of embedded systems, while not jeopardizing
yield and reliability.
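A minimal sketch, in the spirit of the bank power-off research cited above (all thresholds and energy figures are assumptions), shows the basic trade-off such schemes manage: switching an idle bank off saves its leakage but costs a wake-up penalty.

```python
# Bank-level power-gating sketch (assumed parameters).  A bank idle
# longer than the threshold is switched off: it leaks only up to the
# threshold, then pays a fixed wake-up energy when accessed again.
def static_energy(idle_cycles_per_bank, threshold=100,
                  leak_per_cycle=1.0, wakeup_cost=50.0):
    total = 0.0
    for idle in idle_cycles_per_bank:
        if idle > threshold:
            total += threshold * leak_per_cycle + wakeup_cost  # gated off
        else:
            total += idle * leak_per_cycle                     # left on
    return total

banks = [10, 500, 2000, 50]        # observed idle cycles per bank
print(static_energy(banks))        # 360.0, vs. 2560.0 with no gating
```

The threshold encodes the break-even point: gating a briefly idle bank would cost more in wake-up energy than the leakage it saves, which is why such policies benefit from run-time information.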
Another great challenge arises from the growing difficulties found in CMOS
scaling. New memory technologies are expected to replace both the volatile and
the non-volatile fabrics used nowadays. These technologies should provide low
power consumption, low access latency, high reliability, high density and, most
importantly, ultra-low cost per bit [10]. As combining all the required features in a
new technology is a highly demanding task, several contenders arise as possible
solutions, such as ferroelectric, nanoelectromechanical, and organic cells [10]. Each
memory type has specific tasks within an MPSoC. Since memory is a large part of
any system nowadays, bringing obvious cost and energy dissipation problems, the
challenge is to make its usage as efficient as possible, possibly using run-time or
application-based information not available at design time.
1.6 Communication
With the increasing limitations in power consumption and the growing complexity
of improving the current levels of ILP exploitation, the trend towards embedding
multiple processing cores in a single chip has become a reality. While the use
of multiple processors provides more manageable resources, which can be turned
off independently to save power, for instance [9], it is crucial that they are able to
communicate among themselves in an efficient manner, in order to allow actual ac-
celeration with thread level parallelism. From the communication infrastructure one
expects high bandwidth, low latency, low power consumption, low manufacturing
costs, and high reliability, with more or less relevance to each feature depending on
the application. Even though this may be a simple task for a small set of processors,
it becomes increasingly complex for a larger set of processors. Furthermore, aside
from processors, embedded SoCs include heterogeneous components, such as
dedicated accelerators and off-chip communication interfaces, which must also be
interconnected. The number of processing components integrated within a single
SoC is expected to grow quickly in the next years, exceeding 1,000 components
in 2019 [10]. Thus, the need for highly scalable communication systems is one of
the most prominent challenges found when creating a multiprocessor
system-on-chip (MPSoC).
As classical approaches such as busses or shared multi-port memories have poor
scalability, new communication techniques and topologies are required to meet the
demands of the new MPSoCs, with their many cores and stringent area and power
limitations. Among such techniques, networks-on-chip (NoCs) have received
extensive attention over the past years, since they bring high scalability and high
bandwidth as significant assets [2]. With the rise of NoCs as a promising
interconnection for MPSoCs, several related issues have to be addressed, such as
the optimum memory organization, the routing mechanism, thread scheduling and
placement, and so on. Additionally, as all these design choices are highly
application-dependent, there is great room for adaptability in the communication
infrastructure as well, not only for NoCs but for any scheme covering the
communication fabric.
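A crude scaling comparison (link bandwidths normalized to 1, both formulas being simplifications) illustrates the bus-versus-NoC argument: a shared bus offers fixed aggregate bandwidth regardless of how many cores contend for it, while a 2D-mesh NoC's bisection bandwidth grows with the square root of the core count.

```python
import math

# Per-core bandwidth, normalized to one link (simplified model):
# a bus is a single shared medium; a 2D mesh has sqrt(n) bisection links.
def per_core_bw(n_cores, link_bw=1.0):
    bus = link_bw / n_cores
    mesh = link_bw * math.sqrt(n_cores) / n_cores
    return bus, mesh

for n in (4, 64, 1024):
    bus, mesh = per_core_bw(n)
    print(n, round(bus, 3), round(mesh, 3))
```

Both curves fall as cores are added, but the mesh degrades as 1/sqrt(n) rather than 1/n, which is the scalability asset that makes NoCs attractive for thousand-component SoCs.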
1.7 Fault Tolerance
Fault Tolerance has gained more attention in the past years due to the intrinsic
vulnerability that deep-submicron technologies have imposed. As one gets closer
to the physical limits of current CMOS technology, the impact of physical effects
on system reliability is magnified. This is a consequence of the susceptibility of a
very fragile circuit when exposed to many different types of extreme conditions,
such as elevated temperatures and voltages, radioactive particles coming from outer
space, or impurities present in the materials used for packaging or manufacturing
the circuit. Independently of the agent that causes the fault, the predictions about
future nanoscale circuits indicate a major need for fault tolerance solutions to cope
with the expected high fault rates [22].
Fault-tolerant solutions have existed since the 1950s, at first for the purpose of
working in the hostile and remote environments of military and space missions,
and later to meet the demand of highly reliable mission-critical systems, such as
banking, car braking, airplanes, telecommunications, etc. [14]. The main problem
with these solutions is that they are designed to prevent a fault from affecting the
system at any cost, since any problem could have catastrophic consequences.
For this reason, in many cases, there is no concern with the area/power/performance
overhead that the fault-tolerant solution may add to the system.
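Triple modular redundancy (TMR), a classic example of this at-any-cost style (it is not named in the text; it is sketched here to make the overhead concrete), runs three copies of a computation and masks a single faulty result by majority vote, at roughly three times the area and power.

```python
# Classic TMR majority voter over three replica outputs.
def tmr(replica_outputs):
    a, b, c = replica_outputs
    # If a agrees with either other replica it is the majority;
    # otherwise a is the odd one out and b == c.
    return a if a in (b, c) else b

print(tmr([42, 42, 42]))   # fault-free run: 42
print(tmr([42, 7, 42]))    # one faulty replica is masked: 42
```

The voter guarantees correctness under any single faulty replica, but the triplication overhead is exactly the kind of cost that low-power embedded systems cannot pay blindly, motivating the adaptive approaches discussed next.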
In this sense, the main challenge is to allow the development of high-performance
embedded systems, considering all the aspects mentioned before, such as power
and energy consumption, applications with heterogeneous behavior, memory, etc.,
while still providing a highly reliable system that can cope with a large assortment
of faults. Therefore, this ever-increasing need for fault-tolerant, high-performance,
low-cost, low-energy systems leads to an essential question: what is the best
fault-tolerant approach for embedded systems that is robust enough to handle high
fault rates while causing a low impact on all the other aspects of embedded system
design? The answer varies across applications, types of task, and underlying
hardware platforms. Once again, the key to solving this problem at its different
instances relies on adaptive techniques to reduce cost and sustain performance.
1.8 Software Engineering and Development for Adaptive
Platforms
Adaptive hardware imposes real challenges on software engineering, from
requirement elicitation to the software development phases. The difficulties are
created by the high flexibility and large design space of adaptive hardware
platforms. Besides the main behavior that the software implements, i.e. the
functional requirements, an adaptive hardware platform unveils a wide range of
non-functional requirements that must be met by the software under execution and
supported by the software engineering process. Non-functional requirements are a
burden to software development even nowadays. While it is reasonably well known
how to control some of the classical ones, such as performance or latency, the
proper handling of those specifically important to the embedded domain, such as
energy and power, is still an open research problem.
Embedded software has changed radically within just a few years. Once highly
specialized to perform just a few tasks, one at a time (such as decoding voice or
organizing a simple phone book, in the case of mobile phones), the software we
find today in any mainstream smartphone contains several interconnected APIs
and frameworks working together to deliver a completely different experience to
the user. Embedded software is now multitasking and runs in parallel, since even
mobile devices contain a distinct set of microprocessors, each one dedicated to a
certain task, such as speech processing or graphics. These distinct architectures
exist and are necessary to save energy: wasting computational and energy resources
is a luxury that resource-constrained devices cannot afford.
However, this intricate and heterogeneous hardware, which supports more than
one instruction set architecture (ISA), was designed to be resource-efficient, not
to ease software design and production. In addition, since there are potentially
many computing nodes, parallel software designed to efficiently occupy the
heterogeneous hardware is also mandatory to save energy. Needless to say how
difficult parallel software design is. If the software is not well designed to take
advantage of and efficiently use all the available ISAs, the software designer will
probably miss the optimal point of resource utilization, yielding energy-hungry
applications. One can easily imagine several such applications running
concurrently, coming from unknown and distinct software publishers and
implementing unforeseen functionalities, to get the whole picture of how
challenging software design and development for these devices can be.
If adaptive hardware platforms are meant to be programmable commodity
devices in the near future, their software engineering must transparently handle
their intrinsic complexity, removing this burden from the code. In the adaptive
embedded systems arena, software will continue to be the actual source of
differentiation between competing products and of innovation for consumer
electronics companies. A whole new environment of programming languages,
software development tools, and compilers may be necessary to support the
development of adaptive software or, at least, a deep rethinking of the existing
technologies. Industry uses a myriad of programming and modeling languages,
versioning systems, and software design and development tools, to name just a few
of the key technologies, to keep delivering innovation in its software products. The
big question is how to make those technologies scale in terms of productivity,
reliability, and complexity in the new and exciting software engineering scenario
created by adaptive systems.
1.9 This Book
Industry faces a great number of challenges, at different levels, when designing
embedded systems: it needs to boost performance while keeping energy
consumption as low as possible, it must be able to reuse existing software code
and, at the same time, it needs to take advantage of the extra logic available on
the chip, represented by multiple processors working together. In this book we
present and discuss several strategies to achieve such conflicting and interrelated
goals through the use of adaptability. We start by discussing the main challenges
designers must handle today and in the future. Then, we present different hardware
solutions that can cope with some of the aforementioned problems: reconfigurable
systems; dynamic optimization techniques, such as binary translation and trace
reuse; new memory architectures; homogeneous and heterogeneous multiprocessor
systems and MPSoCs; communication issues and NoCs; fault tolerance against
fabrication defects and soft errors; and, finally, how to employ specialized software
to improve this new scenario for embedded systems design, and how this new kind
of software must be designed and programmed.
In Chap. 2 we show, with the help of examples, how heterogeneous the behavior
of even a single-threaded execution is, and how difficult it is to distribute
heterogeneous tasks among the components in a SoC environment, reinforcing the
need for adaptability.
Chapter 3 gives an overview of adaptive and reconfigurable systems and their
basic functioning. It starts with a classification of reconfigurable architectures,
including coupling, granularity, etc. Then, several reconfigurable systems are
presented and, for the most widely used ones, the chapter discusses their advantages
and drawbacks.
Chapter 4 discusses the importance of memory hierarchies in modern embedded
systems. The importance of carefully dimensioning the size or associativity of
cache memories is presented by means of its impact on access latency and energy
consumption. Moreover, simple benchmark applications show that the optimum
memory architecture greatly varies according to software behavior. Hence, there
is no universal memory hierarchy that will present maximum performance with
minimum energy consumption for every application. This property creates room
for adaptable memory architectures that aim at getting as close as possible to
this optimum configuration for the application at hand. The final part of Chap. 4
discusses relevant works that propose such architectures.
In Chap. 5, networks-on-chip are presented, and several adaptive techniques that
can be applied to them are discussed. Chapter 6 shows how dynamic techniques,
such as binary translation and trace reuse, work to sustain adaptability while still
maintaining binary compatibility. We also discuss architectures that present some
level of dynamic adaptability, the price to pay for such adaptability, and the kinds
of applications for which it is well suited.
Chapter 7, about fault tolerance, starts with a brief review of some of the most
commonly used concepts in this subject, such as reliability, maintainability, and
dependability, and discusses their impact on yield rates and manufacturing costs.
Then, several techniques that employ fault tolerance at some level are
demonstrated, with a critical analysis.
In Chap. 8 we discuss how important the communication infrastructure is for
future embedded systems, which will execute increasingly heterogeneous
applications, and how the communication pattern might change drastically from
application to application, even with the same set of heterogeneous cores.
Chapter 9 puts adaptive embedded systems at the center of the software
engineering process, making them programmable devices. This chapter presents
techniques from software inception, passing through functional and non-functional
requirements elicitation, programming language paradigms, and automatic design
space exploration. Adaptive embedded systems impose harsh burdens on software
design and development, requiring us to devise novel techniques and
methodologies for software engineering. At the end of the chapter, a proposed
software design flow is presented, which helps to connect the techniques and
methods discussed in the previous chapters and to put a research agenda for
adaptive embedded software and systems on technological grounds.
References
1. Austin, T., Blaauw, D., Mahlke, S., Mudge, T., Chakrabarti, C., Wolf, W.: Mobile supercom-
puters. Computer 37(5), 81–83 (2004). doi:http://dx.doi.org/10.1109/MC.2004.1297253
2. Bjerregaard, T., Mahadevan, S.: A survey of research and practices of network-on-chip. ACM
Comput. Surv. 38(1) (2006). doi:http://doi.acm.org/10.1145/1132952.1132953.
3. Borkar, S., Chien, A.A.: The future of microprocessors. Commun. ACM 54(5), 67–77 (2011).
doi:10.1145/1941487.1941507. http://doi.acm.org/10.1145/1941487.1941507
4. Burger, D., Goodman, J.R.: Billion-transistor architectures: there and back again. Computer
37(3), 22–28 (2004). doi:http://dx.doi.org/10.1109/MC.2004.1273999
5. Burns, J., Gaudiot, J.L.: SMT layout overhead and scalability. IEEE Trans. Parallel Distrib. Syst.
13(2), 142–155 (2002). doi:http://dx.doi.org/10.1109/71.983942
6. Conte, G., Tommesani, S., Zanichelli, F.: The long and winding road to high-performance
image processing with MMX/SSE. In: CAMP ’00: Proceedings of the Fifth IEEE International
Workshop on Computer Architectures for Machine Perception (CAMP’00), p. 302. IEEE
Computer Society, Washington, DC (2000)
7. Flynn, M.J., Hung, P.: Microprocessor design issues: Thoughts on the road ahead. IEEE Micro.
25(3), 16–31 (2005). doi:http://dx.doi.org/10.1109/MM.2005.56
8. Fujimura, A.: All lithography roads ahead lead to more e-beam innovation. In: Future Fab. Int.
(37), http://www.future-fab.com (2011)
9. Isci, C., Buyuktosunoglu, A., Cher, C., Bose, P., Martonosi, M.: An analysis of efficient
multi-core global power management policies: maximizing performance for a given power
budget. In: Proceedings of the 39th annual IEEE/ACM International Symposium on Mi-
croarchitecture, MICRO 39, pp. 347–358. IEEE Computer Society, Washington, DC (2006).
doi:10.1109/MICRO.2006.8
10. ITRS: ITRS 2011 Roadmap. Tech. rep., International Technology Roadmap for Semiconduc-
tors (2011)
11. Kim, N.S., Austin, T., Blaauw, D., Mudge, T., Flautner, K., Hu, J.S., Irwin, M.J., Kandemir,
M., Narayanan, V.: Leakage current: Moore’s law meets static power. Computer 36(12), 68–75
(2003). doi:http://dx.doi.org/10.1109/MC.2003.1250885
12. Koufaty, D., Marr, D.T.: Hyperthreading technology in the NetBurst microarchitecture. IEEE
Micro. 23(2), 56–65 (2003)
13. Powell, M., Yang, S.H., Falsafi, B., Roy, K., Vijaykumar, T.N.: Gated-Vdd: a circuit technique
to reduce leakage in deep-submicron cache memories. In: Proceedings of the 2000 Interna-
tional Symposium on Low Power Electronics and Design, ISLPED ’00, pp. 90–95. ACM,
New York (2000). doi:10.1145/344166.344526. http://doi.acm.org/10.1145/344166.344526
14. Pradhan, D.K.: Fault-Tolerant Computer System Design. Prentice Hall, Upper Saddle River
(1996)
15. Prakash, T.K., Peng, L.: Performance characterization of SPEC CPU2006 benchmarks on Intel
Core 2 Duo processor. ISAST Trans. Comput. Softw. Eng. 2(1), 36–41 (2008)
16. Rutenbar, R.A., Baron, M., Daniel, T., Jayaraman, R., Or-Bach, Z., Rose, J., Sechen, C.:
(When) will FPGAs kill ASICs? (panel session). In: DAC ’01: Proceedings of the 38th Annual
Design Automation Conference, pp. 321–322. ACM, New York (2001). doi:http://doi.acm.
org/10.1145/378239.378499
17. Sima, D.: Decisive aspects in the evolution of microprocessors. Proc. IEEE 92(12), 1896–1926
(2004)
18. Thompson, S., Parthasarathy, S.: Moore’s law: the future of Si microelectronics. Mater. Today
9(6), 20–25 (2006)
19. Thompson, S.E., Chau, R.S., Ghani, T., Mistry, K., Tyagi, S., Bohr, M.T.: In search of “forever,”
continued transistor scaling one new material at a time. IEEE Trans. Semicond. Manuf. 18(1),
26–36 (2005). doi:10.1109/TSM.2004.841816. http://dx.doi.org/10.1109/TSM.2004.841816
20. Vahid, F., Lysecky, R.L., Zhang, C., Stitt, G.: Highly configurable platforms for embedded
computing systems. Microelectron. J. 34(11), 1025–1029 (2003)
21. Wall, D.W.: Limits of instruction-level parallelism. In: ASPLOS-IV: Proceedings of the
Fourth International Conference on Architectural Support for Programming Languages and
Operating Systems, pp. 176–188. ACM, New York (1991). doi:http://doi.acm.org/10.1145/
106972.106991
22. White, M., Chen, Y.: Scaled CMOS technology reliability users guide. Tech. rep., Jet Propulsion
Laboratory, National Aeronautics and Space Administration (2008)
23. Yang, S., et al.: 28 nm metal-gate high-k CMOS SoC technology for high-performance mobile
applications. In: Custom Integrated Circuits Conference (CICC), 2011 IEEE, pp. 1–5 (2011).
doi:10.1109/CICC.2011.6055355
24. Zhang, C., Vahid, F., Najjar, W.: A highly configurable cache architecture for embedded
systems. In: Proceedings of the 30th Annual International Symposium on Computer Archi-
tecture, ISCA ’03, pp. 136–146. ACM, New York (2003). doi:10.1145/859618.859635. http://
doi.acm.org/10.1145/859618.859635