Monday, 1 August 2016

FPGA for Internet of Things

P. Sampath is an active member of IEEE and the Institution of Engineers India. He is a regular contributor to national newspapers and the IEEE-MAS section, and has published international papers on VLSI and networks. In this article he shares his insight about FPGAs and what revolves around them in the IoT zone.

Billions of devices are expected to be connected wirelessly by 2020. In this emerging era of connected devices, machines need to be secure not just overall but at the device, design and system levels. The Internet of Things (IoT) requires diverse technologies and specialised skills: specialised hardware and sensor development, sophisticated real-time embedded firmware, cloud applications, Big Data analytics to turn massive streams of real-time data into usable information, and the delivery of that data to human-scale, human-usable platforms, particularly sophisticated smartphone apps.


The IoT is revealing an important need in technology: programmable hardware and I/O.

FPGA at a glance

IoT will soon be driven by field-programmable gate array (FPGA)-like devices, because these devices can interface with the outside world very easily while providing the lowest power, lowest latency and best determinism. IoT nodes interface with temperature, pressure, position and acceleration sensors, analogue-to-digital converters (ADCs), digital-to-analogue converters (DACs), and current and voltage measurement, among others. Arduino and Raspberry Pi boards could also be used for such interfacing.
An FPGA can be considered a programmable special-purpose processor as it can handle signals at its input pins, process these and drive signals on its output pins.
The above system is very deterministic. An FPGA can interact with memory and storage devices through serialiser/deserialiser interfaces (SERDES), which also allow for Ethernet, serial or Bluetooth communication. An FPGA can, for example, take an HTTP request packet received from a wireless Ethernet component, decode its request, fetch information from memory and return the requested result back through the Ethernet device.
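As a rough software sketch of that decode/fetch/reply path (the function and packet handling here are purely illustrative, not a real FPGA design):

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative model of the request path described above: decode an
// HTTP GET, look the resource up in "memory", and build the reply that
// would be pushed back out through the Ethernet interface.
std::string handle_request(const std::string& packet,
                           const std::map<std::string, std::string>& memory) {
    // Expect a request line of the form "GET /path HTTP/1.1".
    if (packet.compare(0, 4, "GET ") != 0) return "HTTP/1.1 400 Bad Request";
    std::size_t end = packet.find(' ', 4);
    std::string path = packet.substr(4, end - 4);
    auto it = memory.find(path);
    if (it == memory.end()) return "HTTP/1.1 404 Not Found";
    return "HTTP/1.1 200 OK\r\n\r\n" + it->second;
}
```

On an FPGA each of these steps would be a pipeline stage, which is exactly what makes the latency so deterministic.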
An FPGA could be coupled with an ARM processor to leverage higher-level software functions such as Web servers or security packages, if a higher level of processing is required. The key consideration is the programmable aspect of an FPGA. In a typical development cycle, a supplier development kit is employed to configure the FPGA, while a printed circuit board (PCB) is developed with the specific sensor, communication and display components required.
The IoT comprises at least three layers, each with its own medium and protocols.

FPGA challenges

IoT challenges include security, privacy, unauthorised access, malicious control and denial of service. A hardware-first approach with respect to security and implementation of necessary functionality on the systems on chip (SoC) level is vital for fully securing devices and platforms such as FPGAs, wearables, smartphones, tablets and other intelligent appliances.
In practice, the hardware-based platform offers a single user interface (UI) across factory locations, real-time visibility into operations and remote, cloud-based feature activation. IoT devices also have long life spans, yet manufacturers are likely to stop developing and rolling out patches for a product once it reaches obsolescence. For these reasons, IoT devices should leverage hardware-based security and isolation mechanisms that offer robust protection against various forms of attack.
The outside layer of this network comprises physical devices that touch, or almost touch, the real world, such as sensors (optical, thermal, mechanical and others) that measure the physical states of houses, machines or people.
There are also some complete control systems, such as thermostats, smart appliances or drone helicopters. So we encounter the IoT both in the form of individual sensors and actuators and in the form of complete systems.
Consider the thermostat at home. When we add an interface to it so that a mobile app can read the temperature, check for failures and change the set-point, the thermostat itself still runs automatically. A second approach moves control onto the Internet wherever possible, ideally onto a computing cloud, and scatters tiny, inexpensive sensors everywhere. Here, we eliminate the thermostat altogether and, instead, put temperature sensors around the house, inside and out. And while we are at it, we pull the controller boards out of the furnace and air-conditioner and connect their inputs and outputs to the Internet as well, so that a cloud application can directly read their states and control their sub-systems.
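A minimal sketch of what the cloud-side control step might look like under this scattered-sensor scheme (all names and thresholds are hypothetical):

```cpp
#include <cassert>

// Hypothetical cloud-side control step: average the house sensors and
// drive the furnace/AC sub-systems directly, replacing the thermostat.
struct Command { bool furnace_on; bool ac_on; };

Command control_step(const double* temps, int n, double set_point,
                     double hysteresis = 0.5) {
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += temps[i];
    double avg = sum / n;
    Command cmd{false, false};
    if (avg < set_point - hysteresis) cmd.furnace_on = true;   // too cold
    else if (avg > set_point + hysteresis) cmd.ac_on = true;   // too warm
    return cmd;                                                // in band: idle
}
```

The hysteresis band keeps the furnace and air-conditioner from chattering when the average temperature hovers near the set-point.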
In general, these wireless interfaces share characteristics such as low power, the ability to sleep at very low quiescent current, long periods of sleep and short bursts of activity. But the interfaces bring baggage, too: they are mutually incompatible, have short range and use simplified, non-Internet Protocol (IP) packet formats. These characteristics necessitate a new kind of device to mediate between this capillary network and the next layer of the IoT, that is, a local IoT concentrator.
The concentrator serves as a hub for short-range radio frequency (RF) links in its immediate vicinity, manages the link interfaces and exchanges data with them. Because these concentrators are unlikely to have any direct connection to an Internet-access router, they will generally use Wi-Fi or Long Term Evolution (LTE) as a backhaul network, which then becomes the second layer of the IoT. It is then the job of the hub to perform the routine work of a network bridge as well as packing and unpacking payloads, shaping traffic and translating between the headers used in short-range RF packets and those required by the backhaul network.
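The hub's packing and header-translation job can be sketched as follows; the field layouts are invented for illustration, since real capillary protocols vary:

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the hub's bridging job: wrap a short-range RF payload in an
// IP-style header for the Wi-Fi/LTE backhaul. Field layout is invented.
struct RfFrame       { uint8_t node_id; uint8_t len; const uint8_t* payload; };
struct BackhaulPacket { uint32_t src_addr; uint16_t len; uint8_t payload[64]; };

BackhaulPacket bridge(const RfFrame& f, uint32_t hub_subnet) {
    BackhaulPacket p{};
    // Map the 8-bit capillary node ID into the hub's address space.
    p.src_addr = hub_subnet | f.node_id;
    p.len = f.len;
    for (uint8_t i = 0; i < f.len && i < 64; ++i) p.payload[i] = f.payload[i];
    return p;
}
```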

Two different concepts of IoT: connect to existing intelligent controllers (left) or connect directly to individual sensors and actuators (right) (Source: Altera)

Future trends

In future, we can expect vehicles with an increasing amount of autonomous capability to navigate roads and highways, and to interact with each other, their owners and the IoT. Intelligent cars and smartgrids are just the beginning of a changing ecosystem, where devices, systems and platforms that were previously disconnected will come online.
Ultimately, the integration of various IoT devices and platforms will lead to the proliferation of smartcities across the globe, riding on a new digital infrastructure enabled by ubiquitous connectivity and ever-increasing bandwidth. It is important to realise that just because a system is embedded does not mean it is secure, or that it will remain so indefinitely.
Security must therefore be designed into hardware rather than added as software patches, with chip makers routinely forced to contend with a wide range of potentially serious threats, including data breaches, counterfeit components and intellectual property (IP) theft. Apart from ensuring fundamental chip security during manufacturing, embedding the right security IP core into an SoC can help manufacturers design devices, platforms and systems that remain secure throughout their lifecycles.

Hardware-enabled examples include device provisioning, subscription management, secure payments, authorisation and return merchandise authorisation (RMA)/test support. Embedded SoC security can provide a critical root of trust, managing sensitive keys for secure boot, service authentication and key management. The SoC security core can regulate debug modes to thwart reverse engineering, while providing chip authentication to prevent counterfeiting. SoC-based security can also manage one-time programming of on-chip resources.

Sunday, 3 January 2016

Vivado Vs ISE (Vivado Features)


The Vivado Design Suite has been released by Xilinx after four years of development and a year of beta testing. It is a highly integrated design environment with a completely new generation of system-to-IC-level tools, all built on the backbone of a shared scalable data model and a common debug environment.
The suite significantly reduces runtimes, and incorporates support for industry standards such as C/C++/SystemC, the AMBA AXI4 interconnect, IP-XACT IP packaging metadata, the Tool Command Language (Tcl), SystemVerilog and Synopsys Design Constraints (SDC).
Xilinx architected Vivado to enable the combination of all types of programmable technologies and to scale up to 100 million ASIC equivalent-gate designs.
‘All programmable’
Vivado reflects Xilinx's concept of 'All Programmable' devices. These enable customers to achieve higher levels of programmable systems integration, increase system performance, lower BOM cost and total system power consumption, and accelerate design productivity.
The concept goes beyond the traditional FPGA heartland of programmable logic and I/O into areas such as software-programmable ARM subsystems, 3D ICs and analog mixed signal. To achieve this, Vivado directly addresses nagging integration and implementation design-productivity bottlenecks.

Examples of the integration bottlenecks include:
·         integrating algorithmic C and register-transfer level (RTL) IP;
·         mixing the DSP, embedded, connectivity and logic domains;
·         verifying blocks and ‘systems’; and
·         reusing designs and IP.

Examples of implementation bottlenecks include:
·         hierarchical chip planning and partitioning;
·         multi-domain and multi-die physical optimization;
·         multi-variant ‘design’ vs. ‘timing’ closure; and
·         late-stage engineering change orders (ECOs) and the rippling effects of design changes.

More specific detail on how Vivado addresses these issues follows but, as a taster, new features include comprehensive cross-probing of many reports and design views, state-of-the-art graphics-based IP integration and the first fully supported commercial deployment of high-level synthesis (C++ to HDL) by an FPGA vendor (read more on the Vivado HLS/AutoESL technology underpinning this in this sister article's case study with Agilent).
Xilinx introduced its hitherto flagship ISE Design Suite in 1997. It featured a then innovative timing-driven place-and-route engine. Over the subsequent decade and a half, the company added technologies such as multi-language synthesis and simulation, IP integration and a host of editing and test utilities. ISE grew to match FPGAs’ own increasing capability to address ever more complex functions.
For Vivado, Xilinx has drawn upon lessons learned from ISE. It has taken across key technologies while also leveraging modern EDA algorithms, tools and techniques. Vivado has been designed with the incoming 20nm node in mind, and so that it can also scale into the foreseeable future.
Meanwhile, Xilinx will develop and support ISE indefinitely for customers targeting 7 series and older technologies. Going forward, however, Vivado now becomes the flagship design environment, supporting 7 series and future devices.

Deterministic design closure with Vivado

At the heart of any FPGA vendor’s design suite is the physical-implementation flow: synthesis, floorplanning, placement, routing, power and timing analysis, optimization and ECO.
To reduce iterations and design time and improve productivity, Xilinx has built Vivado’s implementation flow using a single, shared, scalable data model. This framework is already found in advanced ASIC design environments.
The model allows all steps in the flow to operate on an in-memory data model that enables debug and analysis at every step. This gives much earlier visibility into key metrics such as timing, power, resource utilization and routing congestion. These estimates become progressively more accurate as the design progresses.
The unified data model allows for tight links between a multidimensional, analytical place-and-route engine and both the suite’s RTL synthesis engine and new multiple-language simulation engines. The same links extend as well to individual suite tools such as Vivado’s IP Integrator, Pin Editor, Floor Planner and Device Editor.
Customers can use a comprehensive cross-probing function to follow a given problem from schematics, timing reports or logic cells to any other view and all the way back to HDL code. This provides analysis at and connects every step of the design process.
Vivado also provides analysis for timing, power, noise and resource utilization at every stage after synthesis. So, if the user learns early on that timing or power is far off specification, he can make short iterations to address the issue proactively rather than run long iterations after place-and-route.
The tight integration afforded by the scalable data model enhances the effectiveness of pushbutton flows for users who want maximum automation, where their tools do the vast majority of the work. At the same time, it gives users who require more advanced controls better analysis and command of every design move.
Hierarchical chip planning, fast synthesis
Vivado lets users partition a design for processing by synthesis, implementation and verification. It promotes a divide-and-conquer team approach to big projects. A new design-preservation feature provides repeatable timing results and the ability to perform partial reconfiguration.
The suite also includes an entirely new synthesis engine that is designed to handle millions of logic cells. Key to this is enhanced support for SystemVerilog.
The new synthesis engine is three times faster than XST, the Xilinx Synthesis Technology in the ISE suite, supports the synthesizable subset of SystemVerilog and includes a ‘quick’ option that lets designers rapidly get a feel for area and size. They can then debug 15 times faster than before using an RTL or gate-level schematic.
With more ASIC designers moving to programmable platforms, Vivado also leverages Synopsys Design Constraints throughout the flow. The use of this and other standards opens up new levels of automation where customers can now access the latest EDA tools for tasks such as constraint generation, cross-domain clock checking, formal verification and static timing analysis.

Vivado's multidimensional analytical placer

Older FPGA vendor design suites use one-dimensional timing-driven place-and-route engines powered by simulated annealing algorithms that determine randomly where the tool should place logic cells.
With these engines, users enter timing constraints; the simulated-annealing algorithm then pseudorandomly places features, seeking a ‘best as it can’ match for the timing requirements. This made sense when designs were much smaller and logic cells were the main cause of delays. Today, interconnect and design congestion contribute far more.
Place-and-route engines based on simulated annealing do an adequate job for FPGAs below one million gates. But they underperform as designs grow beyond that: they struggle with congestion and the results become increasingly unpredictable.


Xilinx has developed a multidimensional analytic placement engine for Vivado on a par with those found in million-dollar ASIC place-and-route tools. It analytically finds a solution that primarily minimizes three dimensions of a design: timing, congestion and wire length. The engine does this while taking the entire design into account instead of taking the local-move approach of simulated annealing. As a result, the tool can place and route 10 million gates quickly, deterministically and with consistently strong results (see Figure 1). Also, because it is solving for three factors simultaneously, it requires fewer iterations.



              FIGURE 1: Vivado's placement engine claims to match expensive P&R tools

Xilinx ran the raw RTL for its Zynq-7000 EPP emulation platform, a very large and complex design, in the pushbutton modes of both ISE and Vivado. Each tool was instructed to target Xilinx’s largest FPGA device, the SSI-enabled Virtex-7 2000T FPGA.


FIGURE 2: Vivado benchmarked on the Zynq emulation platform 

Vivado’s place-and-route engine took five hours to place the 1.2 million logic cells; ISE v13.4 took 13 hours (Figure 2). The Vivado Design Suite also implemented the design with less congestion (as seen in the gray and yellow portions) and in a smaller area, reflecting the total wire-length reduction.
Further, the Vivado implementation was more memory-efficient, taking only 9 Gbytes to implement the design against ISE's 16 Gbytes. Finally, Vivado utilized only three-quarters of the device to implement its version of the design.

Power optimization and analysis:

Vivado incorporates up-to-date power-optimization strategies such as advanced clock gating, which can, for example, analyze design logic and remove unnecessary switching activity. Specifically, it focuses on the switching-activity factor ‘alpha’. The technique can achieve up to a 30 per cent reduction in dynamic power.
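A toy software model makes it clear what clock gating buys (the functions and the enable pattern below are invented purely for illustration):

```cpp
#include <cassert>

// Toy model of clock gating: count register "clock toggles" with and
// without gating over a window where the enable is mostly idle. Fewer
// toggles means lower switching activity (the 'alpha' factor) and
// hence lower dynamic power.
int toggles_without_gating(int cycles) {
    return cycles;                   // the free-running clock toggles every cycle
}

int toggles_with_gating(const bool* enable, int cycles) {
    int t = 0;
    for (int i = 0; i < cycles; ++i)
        if (enable[i]) ++t;          // clock reaches the register only when needed
    return t;
}
```

With an enable that is active 2 cycles out of 10, the gated register sees 80 per cent fewer clock edges; dynamic power scales with that activity.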
Also, the new shared scalable data model means that users can get power estimates at every stage, enabling up-front analysis so that problem areas can be addressed early on.
Simplifying ECOs with Vivado
Incremental flows allow users to quickly process minor design changes by reimplementing only a small part of the design, making iterations faster. They also preserve performance after each incremental change, thus reducing the number of iterations overall.
Vivado includes an extension to the popular ISE FPGA Editor tool, the Vivado Device Editor. Using the editor on a placed-and-routed design gives designers the power to make late-stage ECOs (e.g., move instances, reroute nets, tap a register to a primary output for debug with a scope, change the parameters on a digital clock manager or a lookup table) without going back through synthesis and implementation.

Automation, not dictation

The Vivado tool team adopted a philosophy of automating, not dictating the way people design. However the user starts (C, C++, SystemC, VHDL, Verilog, SystemVerilog, MATLAB or Simulink) and whoever’s IP they use (Xilinx or third-party), Vivado offers a way to automate those flows and boost productivity.
The top priority was to give the suite specialized IP features that facilitate the creation, integration and archiving of intellectual property. There are new IP capabilities in Vivado: IP Packager, IP Integrator and the Extensible IP Catalog. More than 20 vendors already offer IP supporting the suite and these features.
IP Packager allows Xilinx customers, IP developers and ecosystem partners to turn any part of a design — or indeed, the entire design — into a reusable core at any level of the design flow: RTL, netlist, placed netlist and placed-and-routed netlist. The tool creates an IP-XACT description of the IP for easier integration into future designs. The IP Packager specifies the data for each piece of IP in an XML file. Once the IP is packaged, IP Integrator can stitch it into the rest of a design.
IP Integrator allows customers to integrate IP into their designs at the interconnect level rather than at the pin level. The user can drag and drop pieces of IP onto a design and the tool will check upfront that the respective interfaces are compatible. If they are, the user draws one line between the cores and the integrator automatically writes the detailed RTL to connect the pins.
The output of that process can then be run back through the IP Packager. The result becomes a piece of IP that other people can reuse and is, as noted above, available in multiple formats.
The Extensible IP Catalog allows users to build their own standard repositories from IP they have created or licensed from Xilinx and third-parties. The catalog conforms to the IP-XACT standard, and this allows design teams and even enterprises to better organize their IP and share it across an organization.
Both the Xilinx System Generator and IP Integrator are part of the Vivado Extensible IP Catalog so that users can easily access catalogued IP and integrate it. Instead of third-party vendors delivering IP in a zip file with assorted deliverables, they can now deliver it in a unified format that is instantly accessible and compatible with the Vivado suite.

Mainstream high-level synthesis for FPGAs

Perhaps the most forward-looking technology in the new suite is Vivado HLS (high-level synthesis), which Xilinx gained through its acquisition of AutoESL in 2011.
Vivado HLS provides comprehensive coverage of C, C++ and SystemC, and handles floating-point as well as arbitrary-precision fixed-point arithmetic. This means that you can work with the tool in an algorithm-development environment rather than a typical hardware environment.
A key advantage of doing this is that algorithms developed at that level can be verified orders of magnitude faster than at the RTL. That provides not only simulation acceleration but also the ability to explore the feasibility of algorithms and then make trade-offs, at the architectural level, in terms of throughput, latency and power.
Designers can use Vivado HLS in many ways to perform a wide range of functions. Consider this demonstration in the form of a common flow for developing IP and integrating it into designs. 

1. Create a C, C++ or SystemC representation of the design and a C testbench that describes its desired behavior.
2. Verify the system behavior of the design using a GNU Compiler Collection/G++ or Visual C++ simulator.
3. Get the behavioral design functioning satisfactorily and settle the accompanying testbench.
4. Run the design through Vivado HLS synthesis to generate RTL (Verilog or VHDL).
5. Use the RTL to perform Verilog or VHDL simulation of the design, or have the tool create a SystemC version using the C-wrapper technology.
6. Perform a SystemC architectural-level simulation and further verify the architectural behavior and functionality of the design against the previously created C testbench.
7. Once the design has been solidified, either (a) put it through the Vivado Design Suite’s physical-implementation flow to program it into a device and run it in hardware; or (b) use the IP Packager to turn the design into a reusable piece of IP, then stitch the IP into a design using IP Integrator or run it in System Generator.
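The first few steps of this flow can be illustrated with ordinary, tool-independent C++: a hypothetical 4-tap moving-average filter and a self-checking testbench. In a real project this same design/testbench pair would then go through Vivado HLS synthesis.

```cpp
#include <cassert>

// Step 1: a synthesizable-style C description of the design, here a
// 4-tap moving-average filter written with plain arrays and loops.
void moving_avg4(const int in[8], int out[8]) {
    int window[4] = {0, 0, 0, 0};
    for (int i = 0; i < 8; ++i) {
        window[i % 4] = in[i];
        int sum = 0;
        for (int j = 0; j < 4; ++j) sum += window[j];
        out[i] = sum / 4;
    }
}

// Steps 2-3: a C testbench that pins down the desired behavior before
// any RTL exists; HLS reuses it to verify the generated hardware.
bool testbench() {
    int in[8] = {4, 4, 4, 4, 8, 8, 8, 8};
    int out[8];
    moving_avg4(in, out);
    return out[3] == 4 && out[7] == 8;   // steady-state averages
}
```

Because the testbench runs as native compiled code, it executes orders of magnitude faster than the equivalent RTL simulation, which is the whole point of verifying at this level first.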


Figure 3 offers another perspective on a Vivado flow.

This is merely one way to use the tool. You can see how Agilent’s Nathan Jachimiec and Xilinx’s Fernando Martinez Vallina used the Vivado HLS technology (called AutoESL technology in the ISE Design Suite flow) to develop a UDP packet engine for Agilent in this article.

Xilinx has also created Vivado Simulator, a new mixed-language simulator for the suite that supports Verilog and VHDL. With a single click of a mouse, users can launch behavioral simulations and view results in an integrated waveform viewer. Simulations are accelerated at the behavioral level using a new performance-optimized simulation kernel that executes up to three times faster than the ISE simulator. Gate-level simulations can also run up to 100 times faster using hardware co-simulation.

Vivado availability

Where Xilinx offered the ISE Design Suite in four editions aimed at different types of designers (Logic, Embedded, DSP and System), the company will offer the Vivado Design Suite in two editions.

The base Design Edition includes the new IP tools in addition to Vivado’s synthesis-to-bitstream flow.
Meanwhile, the System Edition includes all the tools of the Design Edition plus System Generator and Xilinx’s new Vivado HLS.

The Vivado Design Suite version 2012.3 is available now and will be followed by WebPACK availability later this year. ISE Design Suite Edition customers with current support receive the new Vivado Design Suite Editions in addition to ISE at no additional cost.







VIVADO DESIGN SUITE


IP AND SYSTEM-CENTRIC TOOL SUITE ACCELERATING PROGRAMMABLE SYSTEMS INTEGRATION AND IMPLEMENTATION BY UP TO 4X:
Programmable devices are at the heart of most systems today, enabling not only programmable logic design but programmable systems integration. Xilinx has transformed from an FPGA company into an ‘All Programmable’ company, offering technology from logic and I/O to software-programmable ARM® processing systems and beyond. With the next decade of programmable platforms comes the next-generation design environment that meets the aggressive pace and the need for enhanced productivity.
Xilinx introduces the Vivado™ Design Suite, an IP and system-centric design environment built from the ground up to accelerate productivity for the next generation of ‘All Programmable’ devices. The new Vivado Design Suite is already proven to accelerate integration and implementation by 4x over traditional design flows, reducing cost by simplifying design and automating, not dictating, a flexible design environment.
Vivado Design Suite provides a highly integrated design environment with a completely new generation of system-to-IC level tools, all built on the backbone of a shared scalable data model and a common debug environment. It is also an open environment based on industry standards such as AMBA® AXI4 interconnect, IP-XACT IP packaging metadata, the Tool Command Language (Tcl), Synopsys® Design Constraints (SDC) and others that facilitates customized design flows. Vivado was architected to enable the combination of all types of programmable technologies and scale up to 100M ASIC equivalent gate designs.

Accelerating integration and implementation

To eliminate bottlenecks in integration, the Vivado Design Suite includes electronic system level (ESL) design for rapidly synthesizing and verifying C/C++/SystemC-based algorithmic IP, standards-based packaging of both algorithmic and RTL IP for reuse, standards-based IP stitching and systems integration of all types of IP, and verification of blocks and systems with 3x faster simulation, while hardware co-simulation provides 100x performance.
The Vivado Design Suite accelerates the implementation process by enabling more turns per day and, in some cases, helping to eliminate iterations altogether. The new Vivado data model improves run times by up to 4x compared to competing solutions. Vivado includes a hierarchical chip planner; a 3-15x faster logic synthesis tool with industry-leading support for SystemVerilog; and a 4x faster, more deterministic place-and-route engine that uses analytics to minimize a ‘cost’ function of multiple variables such as timing, wire length and routing congestion. In addition, incremental flows allow ECO (engineering change order)-induced changes to be quickly processed by re-implementing only a small part of the design, while preserving performance. Finally, leveraging the new shared scalable data model, power, timing and area estimates are provided at every stage of the design flow, enabling up-front analysis and then optimization with integrated capabilities such as automated clock gating.
Xilinx Solution Highlights
• Next generation of system-to-IC level tools, built on the backbone of a shared scalable data model and a common debug environment
• 4x productivity advantage that drives beyond programmable logic to programmable systems integration
• ‘All Programmable’ device support, including 3D stacked silicon interconnect technology, ARM processing systems and Analog Mixed Signal (AMS)

For More Info:
Name    : Vinay Kumar 
Email Id: vinaykmrgarg@gmail.com





Saturday, 18 July 2015

Feature Size of Transistors (Most Important Question) !!


The feature size of any semiconductor technology is defined as the minimum length of the MOS transistor channel between the drain and the source. The technology node has been scaling year by year: since the early 2000s it has shrunk from 180nm to the 16nm designs of today (2015).

You would have probably noticed that the technology scaling has followed:

180nm -->> 130nm -->> 90nm -->> 65nm -->> 40nm -->> 28nm --> 22nm--->> 16nm

Ever wondered who decides these numbers? Are they arbitrary, or is there some inherent logic behind them? Let's see.

In the early 1970s, Gordon Moore of Intel Corp. predicted that the number of transistors on an integrated circuit would double approximately every 18-24 months. This prediction has proven accurate, as scaling of technology has continued unabated even after 40 years! That's mainly because Moore's law set out a challenge and a roadmap for designers to keep the scaling going!

You might ask yourself, why scaling? Here's why:
  • If double the number of transistors can be incorporated in the same area, we get (roughly) double the functionality for the same cost!
  • Alternatively, with scaling of technology, the same functionality becomes available at roughly half the cost!
  • Moreover, the smaller the channel length, the faster the transient response of the transistors, which translates into better performance!

The goal of every design company now is to double the number of transistors on their integrated circuits with each technology. As you will notice, the numbers above, from 180nm to 130nm to 90nm, scale down by roughly a factor of 0.7 (≈ 1/√2). What's so special about 0.7?

If the feature size of the transistor is scaled by 0.7, the area scales by a factor of 0.7² = 0.49 ≈ 0.5. That means that if we scale our feature sizes by a factor of roughly 0.7, we can pack twice the number of transistors into the same area as the previous technology!
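A quick numeric check of that claim (pure arithmetic, no device physics):

```cpp
#include <cassert>
#include <cmath>

// The 0.7x rule worked out numerically: a linear shrink of 1/sqrt(2)
// halves the area, so each node packs ~2x the transistors per unit area.
double area_scale(double linear_scale) {
    return linear_scale * linear_scale;      // area goes as length squared
}

double density_gain(double linear_scale) {
    return 1.0 / area_scale(linear_scale);   // transistors per unit area
}
```

An exact 1/√2 shrink gives an area factor of 0.5; the rounded 0.7 gives 0.49, so the density almost exactly doubles per node.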

Latch-Up in CMOS


The CMOS device is often portrayed as an impeccable device, especially in textbooks. But there are some innate problems in CMOS, and one of them is latch-up. We're going to talk about it in detail.

Consider the cross-section of a CMOS inverter. Please note that I have skipped drawing some metal layers and contacts for the sake of simplicity. My focus is on explaining the problem of latch-up and not the layout design rules!

Figure 1: CMOS with parasitic BJTs


In the above cross-section, note that 1-2-3 form a parasitic pnp-type bipolar junction transistor, while 4-3-2 form a parasitic npn bipolar junction transistor. These parasitic transistors are present in every CMOS device! A simplified figure depicting them is given below:

Figure 2: Simplistic Figure depicting parasitic BJTs

Here, the npn and the pnp transistors are depicted with 2 and 3 being common between the two. Also note that the n-well layer and the p-substrate are lightly doped, and hence offer greater resistance than the n+ and p+ drain and source regions. The n-well resistance of the PMOS is depicted by the resistor R2 and the resistance of the p-substrate by the resistor R1.

Let's say, we've got a spike at the output of the CMOS inverter.

  1. This is a negative spike (or a bump), which decreases the potential of VOUT below ground potential by 0.7V.
  2. As a result, the npn transistor 4-3-2 gets turned ON, and the emitter (n+, 4) starts emitting electrons, which eventually get collected at the collector (n-well, 2) and go into VDD. 
  3. The current hence flows in the reverse direction from the n+ body towards the n-well region.
  4. We will have a voltage drop in the direction of current flow; as a result, the potential at the n-well can reach 0.7V below VDD. This turns the pnp transistor ON, because the p+ emitter region is at VDD!
  5. The current is collected at the collector (p-substrate, 3) and flows into the ground through the p+ body region.
  6. Going opposite to the direction of current, the potential difference increases, and the voltage at the p-substrate (just below the n+ of the NMOS) might reach 0.7V, thereby injecting more current!
Figure 3: Sequence of events leading to latch-up
As evident from the above steps, just one spike at the output initiates a chain reaction, resulting in current flowing through the device incessantly; the device thus wears out in a very short span of time! This is latch-up!
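The trigger condition in step 4 can be made concrete with a back-of-the-envelope check (the resistor and current values below are invented for illustration):

```cpp
#include <cassert>

// Numeric illustration of the trigger in step 4: the current pulled
// through the n-well resistance (R2) must drop the well potential
// about 0.7V below VDD to turn the parasitic pnp ON.
bool pnp_turns_on(double current_A, double r_well_ohm, double vbe_on = 0.7) {
    return current_A * r_well_ohm >= vbe_on;   // Ohm's law vs. Vbe threshold
}
```

So 1 mA through a 1 kΩ well resistance drops a full volt and fires the pnp, while the same current through a well-contacted (low-resistance) well does not; that is exactly why the mitigation below attacks the well and substrate resistance.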

How to mitigate the problem? 
Well, as we just noticed, the main cause of latch-up is the resistance of the n-well and the p-substrate layers. Hence, the most logical solution would be to increase their doping concentration by ion implantation. But that would deteriorate the transistor operation! Remember, we need to keep the doping of the wells and the substrate low, and that of the source and drain high, to ensure good transistor operation.

What do we do now?
Well, we shall stick with ion implantation, but instead of doing it near the silicon surface (and hence near the drain and source), we implant a deep n-well with a high doping concentration, and similarly implant a heavily doped deep p-well inside the p-substrate. This reduces the parasitic resistances and hence kills the parasitic BJT action!


References:
  • Latch-up in CMOS, NPTEL lectures by Dr. Nandita Dasgupta.

Sunday, 12 July 2015

Clock Jitter !!

Jitter

Jitter is the short-term variation of a signal with respect to its ideal position in time. For a clock, it is the variation of the clock period from edge to edge. 

Clock jitter refers to the temporal variation of the clock period at a given point; that is, the clock period can shrink or expand on a cycle-by-cycle basis. It is strictly a temporal uncertainty measure and is often specified at a given point on the chip. 

From cycle to cycle, the period and duty cycle can change slightly due to the clock generation circuitry. Jitter can also be generated by a PLL, in which case it is known as PLL jitter. Possible jitter values should be considered for a proper PLL design. 

Jitter can be modeled by adding uncertainty regions around the rising and falling edges of the clock waveform.

Sources of Jitter

Common sources of jitter include:
  • Internal circuitry of the phase-locked loop (PLL)
  • Random thermal noise from a crystal
  • Other resonating devices
  • Random mechanical noise from crystal vibration
  • Signal transmitters
  • Traces and cables
  • Connectors
  • Receivers

Impact of Jitter on sequential system

Jitter directly impacts the performance of a sequential system. Ideally, the clock period starts at edge 2 and ends at edge 5, with a nominal clock period of TCLK. However, as a result of jitter, the worst case occurs when the leading edge of the current clock period is delayed (edge 3) and the leading edge of the next clock period arrives early (edge 4). As a result, the total time available to complete the operation is reduced by twice tjitter in the worst case.
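The worst-case timing budget described above can be sketched numerically (the period and jitter values below are hypothetical, chosen only for illustration):

```python
# Worst case: the launching edge arrives t_jitter late and the capturing
# edge arrives t_jitter early, so both edges eat into the budget.

def effective_period(t_clk: float, t_jitter: float) -> float:
    """Usable period left for logic delay plus setup time, in ns."""
    return t_clk - 2.0 * t_jitter

t_clk = 10.0      # nominal clock period, ns (100 MHz) -- hypothetical
t_jitter = 0.25   # peak jitter per edge, ns           -- hypothetical
print(effective_period(t_clk, t_jitter))  # 9.5 ns left for the logic
```

In other words, jitter must be subtracted twice from the nominal period when budgeting the critical path.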

Clock Jitter

Clock Skew !!!

Clock skew

The operation of most digital circuit systems, such as computer systems, is synchronized by a "clock" that dictates the sequence and pacing of the devices on the circuit. Ideally, the input to each element has reached its final value before the next clock movement occurs so that the behaviour of the whole circuit can be predicted exactly. The maximum speed at which a system can run must account for the variance that occurs between the various elements of a circuit due to differences in physical composition, temperature, and path length.

In circuit designs, clock skew (sometimes timing skew) is a phenomenon in synchronous circuits in which the clock signal (sent from the clock circuit) arrives at different components at different times. Clock skew can be positive or negative. If the clock signals are in complete synchronicity, then the clock skew observed at the registers is zero.

Reasons for clock skew:

This can be caused by many different things, such as wire-interconnect length, temperature variations, variation in intermediate devices, capacitive coupling, material imperfections, and differences in input capacitance on the clock inputs of devices using the clock. As the clock rate of a circuit increases, timing becomes more critical and less variation can be tolerated if the circuit is to function properly. 

Two types of skews are defined: Local skew and Global skew.

Local skew

Local skew is the difference in the arrival times of the clock signal at the clock pins of related flops.

Global skew

Global skew is the difference in the arrival times of the clock signal at the clock pins of non-related flops. It is also defined as the difference between the shortest and the longest clock path delays reaching two sequential elements.
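With a set of hypothetical clock insertion delays (the flop names and values below are made up for illustration), global skew as defined above is simply:

```python
# Global skew = longest clock-path delay minus shortest clock-path delay,
# taken over all the sequential elements. Delays in ns (hypothetical).

clock_arrival = {"FF1": 1.20, "FF2": 1.35, "FF3": 1.05, "FF4": 1.40}

global_skew = max(clock_arrival.values()) - min(clock_arrival.values())
print(round(global_skew, 2))  # 1.40 - 1.05 = 0.35 ns
```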

Why clock skew is a problem?

Two types of violation can be caused by clock skew. One problem is caused when the clock travels slower than the path from one register to another - allowing data to penetrate two registers in the same clock tick, or maybe destroying the integrity of the latched data. This is called a hold violation because the previous data is not held long enough at the destination flip-flop to be properly clocked through.
Another problem is caused if the destination flip-flop receives the clock tick earlier than the source flip-flop - the data signal has that much less time to reach the destination flip-flop before the next clock tick. If it fails to do so, a setup violation occurs, because the new data was not set up and stable before the next clock tick arrived. 
A hold violation is more serious than a setup violation because it cannot be fixed by increasing the clock period.

Positive Skew
When the source flop is clocked before the destination flop, the clock skew is called positive skew.
Positive skew diagram

From the waveform below, we can see that the hold slack reduces when there is a positive skew. Hence, we can infer that positive skew increases the chances of a hold violation.

Positive skew waveform

Negative Skew
When the destination is clocked before the source, the clock skew is called negative skew.

From the waveform below, we can see that the setup slack decreases in the case of negative skew. So, negative skew increases the chances of a setup violation.
Negative skew
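Both effects can be checked with simple slack arithmetic. Below, skew is taken as (capture clock arrival minus launch clock arrival), and all the delay numbers are hypothetical, chosen only to illustrate the two cases:

```python
# Setup and hold slack between a launch flop and a capture flop.
# All delays in ns; the values used below are hypothetical.

def setup_slack(t_clk, skew, t_clk2q, t_comb, t_setup):
    # Positive skew delays the capturing edge, giving the data more time.
    return (t_clk + skew) - (t_clk2q + t_comb + t_setup)

def hold_slack(skew, t_clk2q, t_comb, t_hold):
    # Positive skew delays the capturing edge, eating into hold margin.
    return (t_clk2q + t_comb) - (t_hold + skew)

# Positive skew (+0.5 ns): setup slack grows, hold slack shrinks.
print(round(setup_slack(10.0, +0.5, 0.2, 8.0, 0.1), 2))  # 2.2
print(round(hold_slack(+0.5, 0.2, 0.3, 0.1), 2))         # -0.1: violation

# Negative skew (-0.5 ns): setup slack shrinks instead.
print(round(setup_slack(10.0, -0.5, 0.2, 8.0, 0.1), 2))  # 1.2
```

The signs fall out exactly as the waveforms suggest: positive skew eats hold margin, negative skew eats setup margin.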

Uncertainty
Clock uncertainty is the time difference between the arrivals of clock signals at registers in one clock domain or between domains.