Effect of Thread Level Parallelism on the Performance of Optimum Architecture for Embedded Applications
Mehdi Alipour and Hojjat Taghdisi

TL;DR
This paper explores how thread-level parallelism (TLP) affects the performance of an optimized embedded processor architecture, focusing on maximizing TLP within area and power constraints.
Contribution
It presents a comprehensive design space exploration for an optimal uni-thread embedded processor and evaluates maximum TLP under performance, power, and area limitations.
Findings
Maximum TLP improves performance within power and area budgets.
The optimum architecture balances thread-level parallelism against resource constraints.
Design guidelines for multi-threaded embedded processors are proposed.
Abstract
Owing to the increasing complexity of network applications and internet traffic, network processors, a subset of embedded processors, have to process more computation-intensive tasks. With shrinking feature sizes and the emergence of chip multiprocessors (CMPs), which are usually multi-thread processors, these performance requirements can to some extent be met. Because multi-thread processors descend from uni-thread processors and there is no general design flow for a multi-thread embedded processor, in this paper we perform a comprehensive design space exploration for an optimum uni-thread embedded processor under limited area and power budgets. Finally, we run multiple threads on this architecture to find the maximum thread-level parallelism (TLP) based on the performance per power and area of the optimum uni-thread architecture.
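The two-step flow described in the abstract (explore the uni-thread design space under area and power budgets, then sweep the thread count) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the budgets, the parameters (`cache_kb`, `issue_width`), the cost formulas, and the contention factors are hypothetical toy models, not the authors' simulator, metrics, or results.

```python
from itertools import product

# Hypothetical budgets in arbitrary units; not taken from the paper.
AREA_BUDGET = 10.0
POWER_BUDGET = 5.0


def estimate(cache_kb, issue_width):
    """Toy analytical model: return (performance, power, area) for one configuration."""
    area = 0.05 * cache_kb + 1.5 * issue_width
    power = 0.02 * cache_kb + 0.8 * issue_width
    performance = (issue_width * (1.0 + cache_kb / 64.0)) ** 0.5
    return performance, power, area


def explore(cache_sizes, issue_widths):
    """Exhaustive search: best feasible configuration by performance / (power * area)."""
    best, best_score = None, float("-inf")
    for cache_kb, width in product(cache_sizes, issue_widths):
        perf, power, area = estimate(cache_kb, width)
        if area > AREA_BUDGET or power > POWER_BUDGET:
            continue  # configuration violates a budget
        score = perf / (power * area)
        if score > best_score:
            best, best_score = (cache_kb, width), score
    return best, best_score


def best_thread_count(candidates, base_perf, base_power, base_area):
    """Sweep thread counts on a fixed core and keep the one with the best score.

    The scaling factors below are assumed: throughput saturates with
    contention, while power and area grow linearly with per-thread state.
    """
    best_n, best_score = None, float("-inf")
    for n in candidates:
        perf = base_perf * n / (1.0 + 0.3 * (n - 1))
        power = base_power * (1.0 + 0.25 * (n - 1))
        area = base_area * (1.0 + 0.2 * (n - 1))
        if area > AREA_BUDGET or power > POWER_BUDGET:
            continue
        score = perf / (power * area)
        if score > best_score:
            best_n, best_score = n, score
    return best_n


if __name__ == "__main__":
    config, score = explore([8, 16, 32, 64], [1, 2, 4])
    print("best uni-thread configuration:", config)
    perf, power, area = estimate(*config)
    print("best thread count:", best_thread_count(range(1, 9), perf, power, area))
```

In a real study the `estimate` function would be replaced by cycle-accurate simulation plus power and area estimation tools; the search structure stays the same.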
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Interconnection Networks and Systems
