Deploying and scaling distributed parallel deep neural networks on the Tianhe-3 prototype system

Abstract: Growing computing power makes it possible to improve the feature extraction and data fitting capabilities of deep neural networks (DNNs) by increasing their depth and model complexity. However, large datasets and complex models greatly increase DNN training overhead, so accelerating the training process becomes a key task. The Tianhe-3 supercomputer is designed to reach E-class (exascale) peak performance, and this computing power offers a potential opportunity for DNN training. We implement and extend the training of LeNet, AlexNet, VGG, and ResNet models on single MT-2000+ and FT-2000+ compute nodes as well as on multi-node clusters, and we propose an improved Dynamic Allreduce communication optimization strategy for the gradient synchronization process, based on the ARM architecture features of the Tianhe-3 prototype. The results provide experimental data and a theoretical basis for further improving the performance of the Tianhe-3 prototype in large-scale distributed training of neural networks.
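To make the gradient synchronization step mentioned in the abstract concrete, the following is a minimal sketch of a generic ring allreduce that averages per-worker gradients. It is not the paper's Dynamic Allreduce strategy, whose details are not given here; it simulates the workers as plain Python lists rather than using MPI, and the function name `ring_allreduce_mean` is hypothetical.

```python
def ring_allreduce_mean(grads):
    """Simulate a ring allreduce that averages per-worker gradient vectors.

    grads: list of equal-length lists of floats, one per simulated worker.
    Returns one averaged vector per worker (all identical after the allreduce).
    Assumption: this is a textbook ring allreduce, not the paper's method.
    """
    n = len(grads)
    length = len(grads[0])
    # Chunk c covers indices bounds[c] .. bounds[c + 1] - 1.
    bounds = [(i * length) // n for i in range(n + 1)]
    bufs = [list(g) for g in grads]

    # Phase 1, reduce-scatter: in step s, worker r sends chunk (r - s) mod n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        # Snapshot outgoing chunks first so all sends look simultaneous.
        sent = []
        for r in range(n):
            c = (r - s) % n
            sent.append((c, bufs[r][bounds[c]:bounds[c + 1]]))
        for r in range(n):
            dst = (r + 1) % n
            c, data = sent[r]
            for k, v in enumerate(data):
                bufs[dst][bounds[c] + k] += v

    # Worker r now holds the fully summed chunk (r + 1) mod n.
    # Phase 2, allgather: circulate the finished chunks around the ring.
    for s in range(n - 1):
        sent = []
        for r in range(n):
            c = (r + 1 - s) % n
            sent.append((c, bufs[r][bounds[c]:bounds[c + 1]]))
        for r in range(n):
            dst = (r + 1) % n
            c, data = sent[r]
            bufs[dst][bounds[c]:bounds[c + 1]] = data

    # Divide by the worker count to turn the sums into averages.
    return [[v / n for v in b] for b in bufs]


if __name__ == "__main__":
    worker_grads = [
        [1.0, 2.0, 3.0, 4.0],
        [3.0, 4.0, 5.0, 6.0],
        [5.0, 6.0, 7.0, 8.0],
    ]
    for rank, g in enumerate(ring_allreduce_mean(worker_grads)):
        print(rank, g)
```

In this scheme each worker sends roughly 2(n-1)/n times the gradient size in total, largely independent of the number of workers, which is why allreduce variants are a common choice for data-parallel training.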

Springer Science and Business Media LLC

Jia Wei, Xingjun Zhang, Zeyu Ji, Jingbo Li, Zheng Wei

Scientific Reports

2021
