A heterogeneous many-core platform for experiments on scalable custom interconnects and management of fault and critical events, applied to many-process applications: Vol. II, 2012 technical report
Roberto Ammendola, Andrea Biagioni, Ottorino Frezza, Werner Geurts, Gert Goossens, Francesca Lo Cicero, Alessandro Lonardo, Pier Stanislao Paolucci, Davide Rossetti, Francesco Simula, Laura Tosoratto, Piero Vicini

TL;DR
This paper presents a comprehensive report on a heterogeneous many-core platform, focusing on fault management, custom interconnects, and the development of a distributed network processor for scalable many-process applications.
Contribution
It introduces the LO|FA|MO fault awareness system, details the integration of the QUoNG hardware platform, and describes the design of a software-programmable distributed network processor.
Findings
Development of LO|FA|MO for fault awareness
Successful integration of QUoNG platform
Design of a new DNP architecture on FPGA
Abstract
This is the second of a planned collection of four yearly volumes describing the deployment of a heterogeneous many-core platform for experiments on scalable custom interconnects and management of fault and critical events, applied to many-process applications. This volume covers several topics, including: (1) a system for awareness of faults and critical events (named LO|FA|MO) on experimental heterogeneous many-core hardware platforms; (2) the integration and testing of the experimental heterogeneous many-core hardware platform QUoNG, based on the APEnet+ custom interconnect; (3) the design of a software-programmable Distributed Network Processor (DNP) architecture using ASIP technology; (4) the initial design stages of a new DNP generation on a 28 nm FPGA. These developments were carried out within the framework of the EURETILE European Project under Grant Agreement no. 247846.
Taxonomy
Topics: Interconnection Networks and Systems · Distributed systems and fault tolerance · Parallel Computing and Optimization Techniques
