Deploying and scaling distributed parallel deep neural networks on the Tianhe-3 prototype system
Abstract: Due to the increase in computing power, it is possible to improve the feature extraction and data-fitting capabilities of deep neural networks (DNNs) by increasing their depth and model complexity. However, big data and complex models greatly increase the training overhead of DNNs, so accelerating the training process becomes a key task. Tianhe-3 is designed with an exascale peak speed, and its huge computing power provides a potential opportunity for DNN training. We implement and extend LeNet, AlexNet, VGG, and ResNet model training on single MT-2000+ and FT-2000+ compute nodes, as well as on extended multi-node clusters, and propose a Dynamic Allreduce communication optimization strategy for the gradient synchronization process based on the ARM architecture features of the Tianhe-3 prototype, providing experimental data and a theoretical basis for further improving the performance of the Tianhe-3 prototype in large-scale distributed training of neural networks.
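The gradient synchronization the abstract refers to is the Allreduce collective: after each training step, every worker's local gradients are summed across all workers and averaged, so all replicas apply the same update. The sketch below simulates the classic ring-allreduce variant of this collective in a single process using NumPy; it is an illustrative sketch of the general technique, not the paper's Dynamic Allreduce strategy, and the worker/chunk layout is a hypothetical stand-in for MPI ranks on MT-2000+/FT-2000+ nodes.

```python
import numpy as np

def ring_allreduce(grads):
    """Average one gradient vector per worker via simulated ring-allreduce.

    grads: list of P equal-length NumPy arrays, one per simulated worker.
    Returns a list of P arrays, each equal to the element-wise mean.
    """
    p = len(grads)
    # Each worker splits its gradient into p chunks (uneven splits are fine).
    chunks = [np.array_split(g.astype(float), p) for g in grads]

    # Phase 1: reduce-scatter. At each step, worker i receives one chunk's
    # partial sum from its ring predecessor and accumulates it. After p-1
    # steps, worker i holds the complete sum of chunk (i+1) % p.
    for step in range(p - 1):
        snapshot = [[c.copy() for c in w] for w in chunks]
        for i in range(p):
            src = (i - 1) % p
            c = (src - step) % p   # chunk travelling along the ring this step
            chunks[i][c] += snapshot[src][c]

    # Phase 2: allgather. Completed chunks circulate around the ring until
    # every worker holds every fully reduced chunk.
    for step in range(p - 1):
        snapshot = [[c.copy() for c in w] for w in chunks]
        for i in range(p):
            src = (i - 1) % p
            c = (src + 1 - step) % p
            chunks[i][c] = snapshot[src][c]

    # Divide by p to turn the global sum into the average gradient.
    return [np.concatenate(w) / p for w in chunks]

# Four workers, each holding a constant gradient equal to its rank.
grads = [np.full(6, float(r)) for r in range(4)]
avg = ring_allreduce(grads)
# Every worker now holds the mean gradient (1.5 everywhere).
```

Ring-allreduce moves each gradient element around the ring exactly twice regardless of worker count, which is why bandwidth-bound variants of it are the usual starting point for communication optimization on large clusters; on a real system the inner loops would be MPI or NCCL send/receive calls rather than array copies.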
Springer Science and Business Media LLC

