DepFiN: A 12-nm Depth-First, High-Resolution CNN Processor for IO-Efficient Inference

Abstract:

Applying convolutional neural networks (CNNs) to high-resolution images produces very large intermediate feature maps (FMs), which dominate the memory traffic. Processing in the classical layer-by-layer order requires storing the complete FMs when moving from one layer to the next. Since FMs of this size realistically fit only in off-chip memory, this leads to high off-chip bandwidth, which comes at a great energy cost. The DepFiN processor chip presented in this article overcomes this cost by running CNNs in a deep layer-fusion mode, dubbed depth-first execution, made possible by a control flow that supports frequent switching between layers. To tackle the computational cost as well, the computationally efficient depthwise + pointwise (DW + PW) layer pairs are explicitly supported in DepFiN by a novel accelerator core that can dynamically change its configuration to manage the low computational intensity of the depthwise layers. Benchmarking measurements show the 12-nm DepFiN chip reaching up to 20 TOPS/W peak and 8.2 TOPS/W on the MC-CNN-fast stereo-matching network excluding input-output (IO) power (at 8 bit, 0.6-V Vdd), and, crucially, 3.95 TOPS/W on the same network with IO power included, an up to 18× improvement realized by supporting depth-first execution (MC-CNN-fast at 8 bit, 0.65-V Vdd).
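To illustrate why depth-first execution cuts the off-chip FM traffic, the following minimal Python sketch compares the peak intermediate-FM storage of classical layer-by-layer execution with a stripe-based depth-first schedule. The layer shapes, stripe size, and halo rows are invented for illustration and do not reflect the actual DepFiN control flow or mapping.

```python
# Minimal, hypothetical sketch (not the DepFiN schedule) contrasting the
# intermediate feature-map (FM) footprint of layer-by-layer execution with a
# depth-first (layer-fused) schedule. All shapes below are illustrative.

def fm_bytes(h, w, c, bytes_per_elem=1):
    """Size of one feature map in bytes (8-bit activations assumed)."""
    return h * w * c * bytes_per_elem

# Toy chain of 3x3 conv layers on a high-resolution input; entries are the
# channel counts of the input and of each layer's output.
H, W = 1080, 1920
channels = [3, 32, 32, 32, 1]

# Layer-by-layer: the full output FM of every layer must be materialized
# (realistically in off-chip DRAM) before the next layer starts.
layer_by_layer_peak = max(fm_bytes(H, W, c_out) for c_out in channels[1:])

# Depth-first: a small horizontal stripe of rows is pushed through all layers
# before the next stripe is fetched, so only the stripe plus the receptive-
# field overlap of each intermediate FM needs to stay on chip.
stripe_rows = 8   # rows processed per pass (illustrative)
halo_rows = 2     # extra rows kept for the 3x3 kernel overlap (illustrative)
depth_first_peak = sum(
    fm_bytes(stripe_rows + halo_rows, W, c_out) for c_out in channels[1:]
)

print(f"layer-by-layer peak FM storage: {layer_by_layer_peak / 1e6:7.1f} MB")
print(f"depth-first    peak FM storage: {depth_first_peak / 1e6:7.1f} MB")
print(f"reduction: {layer_by_layer_peak / depth_first_peak:.1f}x")
```

Running the sketch with these toy shapes shows the intermediate-FM footprint dropping from tens of megabytes to a few megabytes, which is the mechanism by which depth-first execution keeps activations on chip and avoids the IO energy cost quantified in the abstract.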