Application driven evaluation of network on chip architectures for parallel signal processing
Abstract. Today’s signal processing applications exhibit steadily increasing throughput requirements, which can be met by parallel architectures. However, efficient communication is mandatory to fully exploit their parallelism. Turbo-Codes, as an instance of highly efficient forward error correction codes, are a very good application to demonstrate the communication complexity in parallel architectures. We present a network-on-chip approach to derive an optimal communication architecture for a parallel Turbo-Decoder system. The performance of such a system depends significantly on the efficiency of the underlying interleaver network, which distributes data among the parallel units. We focus on the strictly orthogonal n-dimensional mesh, torus and k-ary n-cube networks, comparing deterministic dimension-order routing with the partially adaptive negative-first and planar-adaptive routing algorithms. For each network topology and routing algorithm, input- and output-queued packet switching schemes are compared on the architectural level. The evaluation of candidate network architectures is based on performance measures and implementation cost to allow a fair trade-off.
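To make the deterministic baseline concrete, the following is a minimal sketch of dimension-order routing on an n-dimensional mesh and on a torus with k nodes per dimension. It is an illustration only, not the implementation evaluated in this work: the function name `dimension_order_route` and its parameters are hypothetical, and the sketch ignores queueing, flow control and contention between packets.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def dimension_order_route(src: Coord, dst: Coord, k: int,
                          wraparound: bool = False) -> List[Coord]:
    """Return the nodes visited from src to dst (both inclusive).

    Hypothetical helper: the offset in dimension 0 is resolved completely,
    then dimension 1, and so on. With wraparound=True the network is
    treated as a torus and the shorter way around each ring is taken.
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same number of dimensions")
    if any(not 0 <= c < k for c in src + dst):
        raise ValueError("coordinates must lie in the range 0..k-1")

    path = [tuple(src)]
    cur = list(src)
    for dim in range(len(src)):
        delta = dst[dim] - cur[dim]
        if wraparound:
            forward = delta % k       # hops in the +1 direction
            backward = (-delta) % k   # hops in the -1 direction
            step = 1 if forward <= backward else -1
            hops = min(forward, backward)
        else:
            step = 1 if delta > 0 else -1
            hops = abs(delta)
        for _ in range(hops):
            cur[dim] += step
            if wraparound:
                cur[dim] %= k         # wrap around the ring
            path.append(tuple(cur))
    return path


if __name__ == "__main__":
    # 4x4 mesh versus 4x4 torus for the same source and destination
    print(dimension_order_route((0, 0), (3, 1), k=4))
    print(dimension_order_route((0, 0), (3, 1), k=4, wraparound=True))
```

Because every packet between a given source and destination follows the same path, the route depends only on the two addresses. The partially adaptive schemes named above relax exactly this property by permitting a restricted choice among several output ports.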