Encapsulating critical computation subgraphs as application-specific instruction set extensions is an effective technique to enhance the performance of embedded processors. However, the addition of custom functional units to the base processor is required to support the execution of these custom instructions. Although automated tools have been developed to reduce the long design time needed to produce a new extensible processor for each application, short time-to-market, significant non-recurring engineering and design costs are issues. To address these concerns, we introduce an adaptive extensible processor in which custom instructions are generated and added after chip-fabrication. To support this feature, custom functional units (CFUs) are replaced by a reconfigurable functional unit (RFU). The proposed RFU is based on a matrix of functional units which is multi-cycle with the capability of conditional execution. A quantitative approach is utilized to propose an efficient architecture for the RFU and fix its constraints. To generate more effective custom instructions, they are extended over basic blocks and hence, multiple exits custom instructions are proposed. Conditional execution has been added to the RFU to support the multi-exit feature of custom instructions. Experimental results show that multi-exit custom instructions enhance the performance by an average of 67% compared to custom instructions limited to one basic block. A maximum speedup of 4.7, compared to a general embedded processor, and an average speedup of 1.85 was achieved on MiBench benchmark suite.
!!!All Science Journal Classification (ASJC) codes