Screening Deep Learning Inference Accelerators at the Production Lines

Artificial Intelligence (AI) accelerators fall into two main categories: those used for training and those used for inference over trained models. The computation results of AI inference chipsets are expected to be deterministic for a given input. An inference chip contains several compute engines that accelerate arithmetic operations. Inference outputs are compared with a golden reference output to measure accuracy. Many errors can occur during inference execution. Some of these errors are caused by faulty hardware units, which should be thoroughly screened on the assembly line before customers deploy them in the data centre. This paper describes a generic inference application developed to execute inferences over multiple inputs for various real inference models and to stress all the compute engines of the inference chip. Inference outputs from a specific inference unit are stored, assumed to be golden, and then confirmed as golden statistically. Once the golden reference outputs are established, the inference application is deployed in the pre- and post-production environments to screen out defective units whose actual outputs do not match the reference. This strategy of comparing the chip's outputs against references produced by the same kind of unit, applied at mass scale, achieved the Defects Per Million target for the customers.
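The screening flow described in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how golden outputs are confirmed "statistically", so the sketch assumes a simple agreement threshold over repeated runs. It also assumes outputs are compared bit-exactly through SHA-256 digests. The names `digest`, `establish_golden`, `screen_unit` and `min_agreement` are hypothetical.

```python
import hashlib
from collections import Counter


def digest(output: bytes) -> str:
    """Fingerprint a raw inference output for bit-exact comparison."""
    return hashlib.sha256(output).hexdigest()


def establish_golden(candidate_outputs, min_agreement=0.9):
    """Pick the golden digest for one (model, input) pair.

    Assumption: "confirmed statistically" is modelled here as requiring
    that at least `min_agreement` of the candidate runs agree bit-for-bit.
    """
    if not candidate_outputs:
        raise ValueError("no candidate outputs supplied")
    counts = Counter(digest(o) for o in candidate_outputs)
    top_digest, top_count = counts.most_common(1)[0]
    if top_count / len(candidate_outputs) < min_agreement:
        raise ValueError("candidate outputs disagree too often to be golden")
    return top_digest


def screen_unit(unit_outputs, golden):
    """Return the (model, input) keys on which a unit fails screening.

    unit_outputs: dict mapping key -> raw output bytes from the unit under test
    golden:       dict mapping key -> golden digest
    A missing output counts as a failure.
    """
    return [
        key
        for key, golden_digest in golden.items()
        if key not in unit_outputs or digest(unit_outputs[key]) != golden_digest
    ]
```

A unit would pass screening only when `screen_unit` returns an empty list for every model and input in the test set. Comparing digests rather than full tensors keeps the stored reference small, which relies on the determinism the abstract assumes.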

Academy and Industry Research Collaboration Center (AIRCC)

Ashish Sharma, Puneesh Khanna, Jaimin Maniyar

Computer Science & Technology Trends

2022
