[Master Thesis] A Framework for FPGA-Based Acceleration of Neural Network Inference with Limited Numerical Precision via High-Level Synthesis with Streaming Functionality

The goal of the research is to investigate, design, develop, and evaluate customized hardware implementations for the inference of deep neural networks, with the aim of achieving considerably better speed and energy efficiency than can be realized with standard processors or GPUs.
Broadly speaking, there are two ways of implementing computations: in hardware or in software. The latter approach is the most common, as it is more straightforward and software development skills are widely available. The software runs on a standard processor, a generic platform with fixed datapath widths and high overhead associated with fetching and decoding instructions, accessing memory, and so on. The hardware approach instead involves the custom design of a circuit dedicated to the particular application. The functions performed by the circuit, the degree of parallelism, and the datapath widths can all be tailored precisely to the application requirements, reducing overhead considerably and improving speed and power consumption relative to a processor. Indeed, such a customized computing approach can provide orders of magnitude improvement in speed and energy over processors.
Although specialized hardware has the potential to provide huge acceleration at a fraction of a processor's energy, its main drawback is the design effort. On one hand, describing hardware components in a hardware description language (HDL) such as VHDL or Verilog allows a designer to use existing tools for RTL and logic synthesis into the target technology. On the other hand, this requires the designer to specify functionality at a low level of abstraction, where cycle-by-cycle behaviour is completely specified. Such languages demand advanced hardware expertise and are cumbersome to develop in, leading to longer development times that can critically impact the time-to-market.

http://www.mediafire.com/file/c90m11zlzcfmou7/%5BA_Framework_for_FPGA-Based_Acceleration_of_Neural_Network_Inference_with_Limited_Numerical_Precision_via_High-Level_Synthesis.pdf
