Video and audio conferencing infrastructure technology has traditionally been hardware based.
Pexip Infinity is a virtualized software based platform that aims to deliver better and more flexible performance from standard off-the-shelf servers.
This white paper discusses the performance that such a software based platform achieves on standard servers.
Software based video, audio, web conferencing - can standard servers deliver?
Software Based Conferencing – Can Standard Servers Deliver?
A white paper by Håkon Dahle, CTO, Pexip.
14 November, 2013
Contact Pexip:
w: www.pexip.com
e: info@pexip.com
t: @PexipInc
Executive summary
Multipoint video conferencing has always required a significant amount of processing power in
order to deliver a decent customer experience. Over the last twenty years, we have seen a
tremendous development in performance, scalability and user experience. The first multipoint
video conferencing systems were custom architectures with application specific processors, which
were complex and difficult to program. Over time, we saw how the industry moved to less esoteric
CPU architectures which were lower cost and easier to program, yet still required custom hardware
designs. Only lately has it become possible to deliver the required performance and scale using
off-the-shelf servers with standard Intel processors. With the relentless performance improvements
available in standard server designs, we expect custom hardware based solutions to disappear over
the next few years.
[Figure 1 timeline, 1990 to 2015: first generation (application specific processors), second generation (standard DSPs, custom chassis), third generation (off-the-shelf servers).]
Figure 1: The third generation MCU is simply software running on standard off-the-shelf servers.
First Generation MCUs:
The 1990s and application specific processors
Standards based conferencing really started in the early 1990s, with the H.320 and H.261
protocols. As endpoints gained acceptance among early adopters, a need arose for
a way to arrange large multipoint meetings. Companies that saw success delivering Multipoint
Conferencing Units (MCUs) in the 1990s were VideoServer and Accord. Both of these vendors
used custom hardware based on application specific processors from the US processor
manufacturer IIT (later renamed 8x8).
Second Generation MCUs:
Custom hardware using standard DSPs
In the late 1990s and early 2000s, VLIW (Very Long Instruction Word) processors such as
Philips TriMedia and Equator BSP were widely adopted in endpoints. However, these
processors were not adopted as widely in MCU products.
The early 2000s did, however, see the rapid success of Texas Instruments (TI) in this space, where
their C6000-series and DaVinci-series DSPs (Digital Signal Processors) went on to replace the
VLIW processors. By the end of that decade, most vendors delivered MCU products based on TI
DSPs: Codian (later acquired by Tandberg, which was in turn acquired by Cisco), Radvision and
Polycom. Common to all these products were custom hardware architectures, either in pizza-box
form factors or as large chassis-based systems, which carried big price tags and typically
faced obsolescence after 4-6 years. Complicating matters further, some of these systems
used DSPs with dedicated hardware accelerators for H.264. This means that some of these
products could not easily be programmed to support new video codecs such as VP8, VP9 and
H.265. For the end user this will eventually mean that these expensive hardware platforms cannot
be upgraded to support recent developments in video conferencing technology: a forklift upgrade
will be required.
Third Generation MCUs:
Performance and scale, software on industry standard servers
While custom hardware architectures can provide excellent performance, they are expensive to
develop and development cycles are long. Using software on standard servers seems like a
reasonable idea. In fact, with the latest processors from Intel, and in particular with Intel’s “Sandy
Bridge” processors which started shipping in 2011, industry standard servers are now suitable
platforms for media intensive applications. Instruction set extensions such as SSE (Streaming
SIMD Extensions) and AVX (Advanced Vector eXtensions) together with hyper-threading and an
ever increasing number of processor cores on a single die, allow for performance even better than
custom hardware architectures using ASICs, FPGAs and DSPs.
As an example, using a standard 1RU type server from any major vendor (HP, Dell, IBM, Cisco
etc) with dual Intel E5-2600 series CPUs, each with 8 cores at 2.7GHz, Pexip can deliver 32 ports
of high definition conferencing. In terms of “HD ports per rack unit” this compares extremely well
with the traditional custom hardware designs.
For even higher density, using blade servers it is now possible to get more than 1000 ports of true
HD conferencing in a mere 10RU of rack space.
Furthermore, with the recent introduction of Intel’s E5-2600v2 series of processors, we see yet
another improvement in performance. A standard 1RU server configured with dual Intel E5-2600v2
processors, each with 12 cores at 2.7GHz, can deliver 48 ports of 720p30 high definition
conferencing, where a port can be any video codec – H.263, H.264, H.264SVC or VP8.
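The capacity figures quoted above reduce to simple density arithmetic. The following is a minimal sketch in Python; the class and attribute names are illustrative assumptions, and only the socket, core and port counts are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    """One server configuration quoted in the text (names are illustrative)."""
    label: str
    sockets: int
    cores_per_socket: int
    hd_ports: int
    rack_units: int = 1

    @property
    def total_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def ports_per_core(self) -> float:
        return self.hd_ports / self.total_cores

    @property
    def ports_per_rack_unit(self) -> float:
        return self.hd_ports / self.rack_units


# Figures taken from the text: dual-socket 1RU servers at 2.7GHz.
E5_2600 = ServerConfig("dual E5-2600, 8 cores each", 2, 8, 32)
E5_2600_V2 = ServerConfig("dual E5-2600v2, 12 cores each", 2, 12, 48)

# The blade figure is only a lower bound ("more than 1000 ports" in 10RU).
BLADE_MIN_PORTS_PER_RU = 1000 / 10

if __name__ == "__main__":
    for cfg in (E5_2600, E5_2600_V2):
        print(f"{cfg.label}: {cfg.total_cores} cores, "
              f"{cfg.ports_per_core:.1f} ports/core, "
              f"{cfg.ports_per_rack_unit:.0f} ports/RU")
    print(f"blade servers: more than {BLADE_MIN_PORTS_PER_RU:.0f} ports/RU")
```

Both 1RU configurations work out to 2 ports per core, so in these two examples the gain from 32 to 48 ports tracks the increase in core count at the same clock speed.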
While this is the latest offering from Intel, the next generation architecture (codename “Haswell”)
is already available in desktop and laptop computers. When this architecture becomes available in a
dual or quad socket server CPU design we should expect yet another step-change in performance.
As an example, the new AVX2 instruction set extensions widen integer SIMD operations from 128 to
256 bits, which can double the integer performance of many core video processing algorithms.
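The doubling attributed to AVX2 follows from register width: integer SIMD instructions before AVX2 operate on 128-bit registers, while AVX2 extends them to 256 bits. A minimal sketch of the lane arithmetic follows; the function name is hypothetical, and the choice of 8-bit and 16-bit elements is an assumption about typical pixel and intermediate data, not a figure from this paper.

```python
def simd_lanes(register_bits: int, element_bits: int) -> int:
    """Number of integer elements processed by one SIMD instruction."""
    if element_bits <= 0 or register_bits % element_bits != 0:
        raise ValueError("register width must be a positive multiple of element width")
    return register_bits // element_bits


PRE_AVX2_INTEGER_BITS = 128  # integer SIMD width before AVX2
AVX2_INTEGER_BITS = 256      # integer SIMD width with AVX2

if __name__ == "__main__":
    # Assumed element sizes: 8-bit pixel samples, 16-bit intermediate values.
    for element_bits in (8, 16):
        before = simd_lanes(PRE_AVX2_INTEGER_BITS, element_bits)
        after = simd_lanes(AVX2_INTEGER_BITS, element_bits)
        print(f"{element_bits}-bit elements: {before} -> {after} lanes "
              f"({after // before}x per instruction)")
```

Twice the lanes per instruction is an upper bound on the speed-up. The gain actually realized depends on factors such as memory bandwidth and how much of an algorithm can be vectorized.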
[Figure 2 chart: transistor count (logarithmic axis, 2,300 to 2,600,000,000) plotted against year of introduction (1971 to 2011) for popular microprocessors, from the 4004 to multi-core server processors.]
The impact of Moore’s Law on the future of conferencing
Intel’s progression from Sandy Bridge to Ivy Bridge and now Haswell is a good example of
Moore’s Law. Moore’s Law states that the number of transistors on a chip doubles every 18-24
months. While the end of Moore’s Law has been predicted several times, the observation has
held true for the last 40 years.
The implications for Pexip customers are important: By using the latest server designs, they can
expect to see a doubling of port capacity every two years. We have already shown progress from
Intel Sandy Bridge (32 ports per RU) to Ivy Bridge (48 ports per RU). Get ready for another
increase in performance and capacity as Haswell becomes available some time in 2014.
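The expectation of a doubling every two years can be written as a simple exponential projection. The sketch below is illustrative only: the function name is hypothetical, and using 48 ports in 2013 as the baseline is an assumption based on the Ivy Bridge figure and the date of this paper. The projected values are extrapolations, not measurements.

```python
def projected_ports(base_ports: float, base_year: int, target_year: int,
                    doubling_period_years: float = 2.0) -> float:
    """Extrapolate port capacity, assuming it doubles every fixed period."""
    elapsed_years = target_year - base_year
    return base_ports * 2 ** (elapsed_years / doubling_period_years)


# Figures from the text: ports per rack unit for each processor generation.
SANDY_BRIDGE_PORTS = 32
IVY_BRIDGE_PORTS = 48

if __name__ == "__main__":
    ratio = IVY_BRIDGE_PORTS / SANDY_BRIDGE_PORTS
    print(f"Sandy Bridge to Ivy Bridge: {ratio:.1f}x ports per RU")
    # Assumed baseline: 48 ports per RU in 2013.
    for year in (2015, 2017):
        ports = projected_ports(IVY_BRIDGE_PORTS, 2013, year)
        print(f"{year}: about {ports:.0f} ports per RU if the doubling holds")
```

The observed step from 32 to 48 ports is a factor of 1.5 within a single processor generation, so a doubling arrives only over more than one such step.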
Figure 2: Moore's Law shows that processor transistor count doubles every 18-24 months. Curve shows
transistor count for popular microprocessors and their time of introduction. Illustration courtesy of Wikipedia.
Other benefits of software based conferencing
However, there is more to conferencing than just processor performance. There are a number of
other benefits to this approach, which will be covered in separate white papers. We will just
mention two important aspects here:
1. As video conferencing becomes a mission-critical collaboration tool, availability and
reliability become more critical. By leveraging software and virtualization, the cost of
having a standby server is dramatically reduced compared with the cost of having a
second custom-hardware MCU chassis on standby. Furthermore, VMware tools such
as vMotion and High Availability enable yet another level of resilience, which has
never been the focus of custom hardware architectures.
2. Enterprises are adopting virtualization as a key part of their data center strategies in order
to reduce costs, consolidate resources, and streamline management and deployment. For
these customers, the ability to run conferencing as just another data center workload is
extremely attractive: conferencing can now be deployed, managed and monitored across
the globe.
Conclusions
Software based conferencing today delivers performance and scale equal to or better than custom
hardware architectures. Moore’s Law indicates that the increase in performance and density will
continue. In addition, software based conferencing allows IT professionals to view video
conferencing as yet another data center workload, and reap all the benefits that standard data
center and virtualization tools allow in terms of reduced cost of ownership, ease of deployment,
ease of management, increased reliability and optimal usage of resources.