In order to expand the computation capability of digital signal processing on a General Purpose Processor (GPP), we propose a fused microarchitecture that improves Instruction Level Parallelism (ILP) by supporting both in-order superscalar and very long instruction word (VLIW) dispatch methods in a single pipeline. This design is based on the ARMv7-A&R Instruction Set Architecture (ISA). To provide a performance comparison, we first design an in-order superscalar processor, considering that ARM GPPs always adopt superscalar approaches. We then extend this processor with a VLIW dispatch method to realize the fused microarchitecture. The two designs are both evaluated on a Xilinx 7-series FPGA (XC7K325T-2FFG900C), using the Xilinx Vivado design suite. The results show that, compared with the superscalar processor, the processor working in VLIW mode improves performance by 15% and 8% when running the EEMBC and DSPstone benchmarks, respectively. We also run the two benchmarks on an ARM Cortex-A9 processor, which is integrated in the Zynq-7000 AP SoC device on the Xilinx ZC706 evaluation board. The processor in VLIW mode shows 44% and 30% performance improvements over the ARM Cortex-A9. The fused microarchitecture adopts a combined bimodal and PAp branch prediction method. This method achieves 93.7% prediction accuracy with limited hardware overhead.

Processor board of a CRAY T3e supercomputer with four superscalar Alpha 21164 processors

A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one instruction per clock cycle, a superscalar processor can execute more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to different execution units on the processor. It therefore allows more throughput (the number of instructions that can be executed in a unit of time) than would otherwise be possible at a given clock rate. Each execution unit is not a separate processor (or a core if the processor is a multi-core processor), but an execution resource within a single CPU, such as an arithmetic logic unit.

In Flynn's taxonomy, a single-core superscalar processor is classified as an SISD processor (single instruction stream, single data stream), though a single-core superscalar processor that supports short vector operations could be classified as SIMD (single instruction stream, multiple data streams). A multi-core superscalar processor is classified as an MIMD processor (multiple instruction streams, multiple data streams).

While a superscalar CPU is typically also pipelined, superscalar and pipelining execution are considered different performance enhancement techniques. The former executes multiple instructions in parallel by using multiple execution units, whereas the latter executes multiple instructions in the same execution unit in parallel by dividing the execution unit into different phases.

The superscalar technique is traditionally associated with several identifying characteristics (within a given CPU): instructions are issued from a sequential instruction stream; the CPU dynamically checks for data dependencies between instructions at run time (versus software checking at compile time); and the CPU can execute multiple instructions per clock cycle.

Seymour Cray's CDC 6600 from 1964 is often mentioned as the first superscalar design. The 1967 IBM System/360 Model 91 was another superscalar mainframe. The Motorola MC88100 (1988), the Intel i960CA (1989) and the AMD 29000-series 29050 (1990) microprocessors were the first commercial single-chip superscalar microprocessors. RISC microprocessors like these were the first to have superscalar execution, because RISC architectures free transistors and die area which can be used to include multiple execution units (this was why RISC designs were faster than CISC designs through the 1980s and into the 1990s). Except for CPUs used in low-power applications, embedded systems, and battery-powered devices, essentially all general-purpose CPUs developed since about 1998 are superscalar.
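To make the run-time dependency check behind in-order superscalar dispatch more concrete, here is a minimal Python sketch. It is not the design described above: the names `Instr`, `depends` and `issue_in_order` are hypothetical, every instruction is assumed to take one cycle, and the issue width of two is an illustrative choice. It only shows how a sequential instruction stream might be grouped into per-cycle issue packets when a later instruction reads or overwrites a register produced by an earlier one in the same packet.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical three-operand instruction: one destination, some sources."""
    dest: str
    srcs: Tuple[str, ...]


def depends(later: Instr, earlier: Instr) -> bool:
    """True if `later` must not issue in the same cycle as `earlier`.

    Only RAW and WAW hazards are checked; WAR is assumed harmless here
    because all operands are read at issue time in this simplified model.
    """
    raw = earlier.dest in later.srcs   # read-after-write
    waw = earlier.dest == later.dest   # write-after-write
    return raw or waw


def issue_in_order(program: List[Instr], width: int = 2) -> List[List[Instr]]:
    """Group a sequential stream into issue packets of at most `width`.

    Issue is strictly in order: the first instruction that conflicts with
    the current packet ends the packet, and nothing behind it may bypass it.
    """
    cycles: List[List[Instr]] = []
    i = 0
    while i < len(program):
        packet = [program[i]]
        i += 1
        while (i < len(program) and len(packet) < width
               and not any(depends(program[i], p) for p in packet)):
            packet.append(program[i])
            i += 1
        cycles.append(packet)
    return cycles


program = [
    Instr("r1", ("r2", "r3")),  # independent
    Instr("r4", ("r5", "r6")),  # independent, can pair with the previous one
    Instr("r7", ("r1", "r4")),  # starts a new packet (previous one is full)
    Instr("r8", ("r7",)),       # RAW on r7, cannot pair with the previous one
    Instr("r9", ("r2",)),       # independent of r8, can pair with it
]

cycles = issue_in_order(program)
ipc = len(program) / len(cycles)  # instructions per cycle in this toy model
print(f"{len(cycles)} cycles, IPC = {ipc:.2f}")
```

Under these assumptions the five instructions issue in three cycles at width two, versus five cycles at width one, which is the throughput gain the text attributes to superscalar execution. A VLIW machine would instead rely on the compiler to form these packets ahead of time.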