3-1 HISTORICAL BACKGROUND
Neural networks have been used to perform static branch prediction [3], where the likely direction of a branch is predicted at compile-time by supplying program features, such as control-flow and opcode information, as input to a trained neural network. This approach achieves a 20% misprediction rate compared to a 25% misprediction rate for static heuristics [4]. Static branch prediction performs worse than existing dynamic techniques, but can be useful for performing static compiler optimizations and providing extra information to dynamic branch predictors such as the agree predictor [5].
Learning vector quantization (LVQ), another neural method, has been suggested for dynamic branch prediction by [6]. LVQ prediction is about as accurate as a table-based branch predictor. Unfortunately, LVQ does not lend itself well to high-speed implementation because it performs complex computations involving floating point numbers. By contrast, our predictor has accuracy superior to any table-based method and can be implemented efficiently.
3-2 Prediction with Neural Methods
Prediction with neural methods is a rich area of study. Neural methods are capable of classification, that is, predicting into which of a set of classes a particular instance will fall. Suppose a set S is partitioned into n classes, and we are faced with the problem of determining, for an arbitrary element s ∈ S, to which class s belongs. The elements of S have certain features that correlate with their classifications, and an artificial neural network can learn the correlations between these features and the classification. An artificial neural network is a collection of neurons, some of which receive input and some of which produce output, connected by links. Each link has an associated weight that determines the strength of the connection [7]. For a classification problem, such as deciding to which of n classes an input s belongs, there are n output neurons. In the special case where there are only two classes, there is only one output neuron. Each neuron computes its output from the sum of its inputs using an activation function. During a training phase, the weights are adjusted using a training algorithm. The algorithm uses a set of training data, which are ordered pairs of inputs and corresponding outputs. The neural network learns correlations between the inputs and outputs, and generalizes this learning to other inputs. To predict the class of a new input s, we supply s to the input units of the trained neural network, propagate the values through the network, and examine the n output neurons, classifying s according to the neuron with the strongest output. In the special case where n = 2 and there is only one output neuron, we classify s according to whether the output value exceeds a certain threshold, typically 0 or 1/2.
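The classification rules above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from the text: the weight matrix, feature vector, and function names are invented for the example, and the network here has no hidden layer or training step.

```python
import numpy as np

def classify(weights, x):
    """n-class case: return the index of the strongest output neuron.

    weights: (n_classes, n_features) matrix of link weights
    x:       feature vector for the instance s
    """
    outputs = weights @ x           # each row's dot product is one neuron's output
    return int(np.argmax(outputs))  # pick the neuron with the strongest output

def classify_binary(w, x, threshold=0.0):
    """Two-class special case: one output neuron compared to a threshold."""
    return 1 if w @ x > threshold else 0

# Illustrative weights for three classes over two features
W = np.array([[1.0, -1.0],
              [0.5,  2.0],
              [-1.0, 0.5]])
x = np.array([0.2, 1.0])
print(classify(W, x))  # class of the neuron with the largest dot product
```

Note that with fixed weights this is pure prediction; the training phase described above would adjust W to fit the input/output pairs in the training data.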
3-3 BRANCH PREDICTION WITH PERCEPTRONS
This section provides the background needed to understand our predictor. We describe perceptrons, explain how they can be used in branch prediction, and discuss their strengths and weaknesses. We then describe our basic prediction mechanism, which is essentially a two-level predictor that replaces the pattern history table with a table of perceptrons.
3-3-1 How Perceptrons Work
The perceptron was introduced in [8] as a way to study brain function. We consider the simplest of many types of perceptrons [9]: a single-layer perceptron consisting of one artificial neuron connecting several input units by weighted edges to one output unit.
A perceptron learns a target Boolean function t(x1, …, xn) of n inputs. In our case, the xi are the bits of a global branch history shift register, and the target function predicts whether a particular branch will be taken. Intuitively, a perceptron keeps track of positive
FIG(3-1)
Perceptron Model. The input values x1, …, xn are propagated through the weighted connections by taking their respective products with the weights w1, …, wn. These products are summed, along with the bias weight w0, to produce the output value y.
and negative correlations between branch outcomes in the global history and the branch being predicted. Figure 3-1 shows a graphical model of a perceptron. A perceptron is represented by a vector whose elements are the weights. For our purposes, the weights are signed integers. The output is the dot product of the weights vector, w0..wn, and the input vector, x1..xn (x0 is always set to 1, providing a "bias" input). The output y of a perceptron is computed as
y = w0 + Σ (i = 1 to n) xi wi
The inputs to our perceptrons are bipolar; that is, each xi is either −1, meaning not taken, or 1, meaning taken. A negative output is interpreted as predict not taken. A nonnegative output is interpreted as predict taken.