Source:
http://www.spacecodesign.com/#HardwareSoftwareCoDesignPartitioningResources
Source:
http://www.tik.ee.ethz.ch/education/lectures/hswcd/slides/
Patrick Schaumont
2nd Edition, 2012, xxii+480p
ISBN: 9781461437369
December 2012
About
This textbook provides an introduction to embedded systems design, with emphasis on integration of custom hardware components with software. The key problem addressed in the book is the following: “How can an embedded systems designer strike a balance between flexibility and efficiency?” The book describes how combining hardware design with software design leads to a solution to this important computer engineering problem. The book covers four topics in hardware/software codesign: Fundamentals, the Design Space of Custom Architectures, the Hardware/software Interface and Application Examples. The book comes with an associated design environment that helps the reader to perform experiments in hardware/software codesign. Each chapter also includes exercises and further reading suggestions.
Table of Contents [PDF]
Instructor Resources


Education 


Design Methodologies 


Formal Aspects 


Crypto Hardware 

Modeling in the GEZEL LanguageDatapaths are the basic building blocks in GEZEL, similar to a module in Verilog or an entity in VHDL. Registers, signals and expressions are essential modeling elements. Expressions can be assembled in datapath definitions. Datapaths can finally be composed to larger architectures. 

Register and Signals  GEZEL models synchronous, singleclock designs. Yet, a clock signal is not present in GEZEL language, it is implicit in the design description. By looking at a GEZEL program, you can say precisely how it will behave as a clockcycle true description. You can do this by looking at the kind of variables it uses in calculations. GEZEL has two kinds of variables: signals and registers.
A signal can hold a value within a single clock cycle. It has the same meaning as a wire in an actual implementation. A signal also has a name and a type and is created with the sig keyword. For example, a signal with name v12 and type ns(12) is created as follows. sig v12 : ns(12); This type ns(12) stands for a 12bit unsigned number. Signal v12 can hold values from 0 to 4095. When you force this signal to hold values outside of this range, precision loss will occur. There is one other type available for values, called tc(n). This type represents arbitrarylength signed numbers with two’s complement representation. For example, to create the equivalent of a C integer on a 32bit machine, use the following definition. sig aCinteger : tc(32); Registers are used to store values over multiple clock cycles. In contrast to signals, register variables have two values: a currentvalue and a nextvalue. The currentvalue is the value available at the output of a register, so it is the value obtained when reading from the register. The nextvalue is the value at the input of the register, so it is the value that is being written into the register. A register is created in the same way as a signal but uses the reg keyword. A 16bit unsigned register for example is created as reg r : ns(16); The register lies at the basis of clockcycletrue behavior. There are implicit simulation semantics tied to the register. At the start of each clock cycle, the nextvalue (of the previous clock cycle) is copied into the currentvalue (of the current clock cycle). In between clock edges, the nextvalue is updated based on the currentvalue, constants and inputs. This way, it is possible to create clockcycle true descriptions without mentioning the clock explicitly. The initial value of a register is zero (0), while the initial value of a signal is undefined. 

Expressions  Expressions enable calculations with signals and registers. Expressions are formed using operators that reference the names of signals and registers. For example, an addition of two signals b and c into signal a looks like
a = b + c; When a has insufficient precision to hold all possible combinations of the sum b + c, precision loss can occur. For example, assume the following types for a, b and c: sig a, b, c : ns (8); Clearly, when b + c is bigger than 256, the result cannot be stored in a. GEZEL will throw out bits at the mostsignificant side of the result (overflow). If b + c is 260, then the resulting value in a will be 4 (260 = 256 + 4). In some expressions, intermediate values will occur. In the above expression, b + c is such an intermediate value. A more obvious example is a = ((b+b) + (c+c)); Brackets are used to indicate the order in which this expression is to be evaluated. First, the sums b+b and c+c are obtained. These two intermediate values are combined and assigned to a. Intermediate values need a type, too. GEZEL uses a default type rule to choose the type of intermediate results, such that the intermediate result does not loose precision. Expressions combine signals and registers with operators. Operators have a precedence, a preferred order of evaluation. For example, in an expression such as a = b * b + c * c; the multiplications (*) will be performed before the additions (+), because multiplication has a higher precedence than addition. Precedence rules can be modified by using round brackets. The following bullets introduce the different operators that can be used in expressions, starting at the ones with low precedence and going up to highprecedence operations. Assignment and Selection The assignment operation updates the value of a signal or register. The selection operation conditionally extracts the value of a signal or register.
Bitwise Logical Operations Bitwise logical operations combine two bitpatterns into a new bitpattern. The bits at corresponding indices are combined using a singlebit logical operations. The logical operations are Inclusive Or, Exclusive Or, and And.
Comparison Operations Comparison operations compare the value of two expressions and yield a trueorfalse result. The value true or false is represented as a 1bit unsigned number (ns(1)), with 1 indicating true, and 0 indicating false.
Arithmetic Operations Arithmetic Operations do calculations on all of the bits of a signal or register, treated as an unsigned number or else a two’s complement signed number.
Cast Operation A cast operation converts the value of a signal into one with another type. This way, it is possible to convert for example a 5bit unsigned number into a 6bit signed number. When the target type has enough bits, no precision will be lost. For two’s complement signed numbers, a concept called sign extension is applicable. Sign extension preserves the sign of a two’s complement number when the wordlength increases. When the target type has insufficient bits, some precision can be lost. Bits are chopped off at the mostsignificant side. The resulting bitpattern is interpreted as a signed/unsigned number of the targeted wordlength. For example, if a is ns(8) and holds the value 7, and b is tc(4), then the following operation will leave the bitpattern 0b1111 in b, which is interpreted as 1.
b = (tc(3)) a; Bit Selection Operation A bit selection operation extracts part of a bitpattern in a word. There is a singlebit format as well as a bitvector format.
Lookup Table Operation A Lookup Table Operation offers access to a constant array, which is defined earlier in the code. The lookup table content needs to be defined first, after which it can be accessed using a lookup table operation. A Lookup Table definition is done by enumerating all the elements in the lookup table in a comma separated list as follows:
lookup a : ns(8) = {15, 22, 36, 0x4f}; This defines a lookup table a which holds elements of type ns(8). The table holds 4 elements. The element at index position 0 is 15 and the element at index position 3 is 0x4f (79). The lookup table access operation simply access the array using the index in between round brackets. For example, to access the third element of a, one would use
a(2) 

Signal Flow Graphs  The cycletrue execution model of GEZEL expresses concurrency by allowing multiple expressions to be evaluated in the same clock cycle. A set of expressions that execute together in the same clock cycle are grouped together in a signal flowgraph. Consider the design of a Viterbi Butterfly operation (a wellknown operation in convolutional decoding). This operation processes tuples of data according to an operation called addcompareselect.
y1 = min( d1 + a, d2  a ) y2 = min( d1  a, d2 + a ) Assume the following set of signals and registers.
sig a1, s1, a2, s2 : ns(8); // intermediate signals reg d1, d2, y1, y2 : ns(8); // input and output tuple reg a : ns(8); The signals flowgraph of expressions that implements this equation can be as follows.
always { a1 = d1 + a; s1 = d1  a; a2 = d2 + a; s2 = d2 + a; y1 = (a1 > s2) ? s2 : a1; y2 = (s1 > a2) ? a2 : s1; } The keyword always indicates that the group of expressions following it will execute each clock cycle. A signal flow graph can hold an arbitrary number of expressions. All expressions within a single signal flow graph are concurrent within one clock cycle. The order in which expressions are evaluated is independent of the order in which they appear in the GEZEL program (i.e., it is independent of their lexical order). Rather, the order is determined by the data precedences of signals and registers. A register can always be read, at any moment during a clock cycle. As discussed in Section 2.1 on page 9, a register has both a current value and a next value. For a signal, this is not the case. A signal has only an immediate value, valid within a single clock cycle. Thus, a signal has to be written first before it can be read. It has to be written the first time within a clock cycle based on values in registers and constants. As a consequence of this property of signals and registers, the order of expressions within a signal flow graph becomes irrelevant. For example, if you would write:
always { y1 = (a1 > s2) ? s2 : a1; y2 = (s1 > a2) ? a2 : s1; a1 = d1 + a; s1 = d1  a; a2 = d2 + a; s2 = d2 + a; } 

Named Signal Flow Graphs  Besides the unnamed always signal flow graph, you can create signal flow graphs with a name using the sfg keyword. For example, the previous signal flow graph could be written as:
sfg mysfg { y1 = (a1 > s2) ? s2 : a1; y2 = (s1 > a2) ? a2 : s1; a1 = d1 + a; s1 = d1  a; a2 = d2 + a; s2 = d2 + a; } The difference between a named signal flowgraph (sfg) and the unnamed always is that the former does not automatically execute each clock cycle. GEZEL will allow you to create a controller that schedules the named signal flowgraph. 

Datapath Modules  A datapath corresponds to a module in Verilog or an entity in VHDL. It is a piece of hardware logic that is treated as a single entity by subsequent RT and logic synthesis tools. A datapath combines a number of named signal flow graphs with a list of input and output signals. A signal flow graph can be thought of as an instruction for that datapath. A datapath can have only a single always signal flow graph, but it can have multiple named signal flow graphs.A datapath is the smallest GEZEL unit that can be simulated. So, subsequent examples will be fully selfcontained programs rather than snippets. Here is an example of a 2bit counter as a hardwired datapath
dp counter(out value : ns(2)) { reg c : ns(2); always { value = c; c = c + 1; $display("Cycle ", $cycle, ": counter = ", value); } } system S { counter; } This datapath has a single output port called value. An output port also has a type, indicated after the colon following the port name. The ports define the outline of the datapath. The only way an ‘outsider’ can access the datapath is by reading/writing values on the datapath ports. On line 2, we create a 2bit register. This register is local to the datapath counter. It can be accessed only from within the datapath. On line 37, we define unnamed always signal flowgraph. It contains, besides expressions, also a directive on line 6. A GEZEL directive does affect how the simulator behaves, but it does not affect the simulation outcome. In this case we are using the display directive, which is used to print out values on the datapath. One special variable that is accessed is called $cycle. This variable returns the current simulation cycle. Thus, the effect of the display directive will be to print out the current simulation cycle as well as the output value of the counter. Finally, on lines 1011, the toplevel of the system is expressed. A GEZEL file must always have a system statement. A datapath definition thus consists of three elements: An IO definition, a definition of local signals and registers, and a set of signal flowgraphs. The IOdefinition can create input – as well as output ports. For example, a simple ALU that can add, subtract and accumulate would look as follows.
dp alu(in x, y : ns(8); out z : ns(8)) { reg acc : ns(8); sfg add { z = x + y; } sfg sub { z = x  y; } sfg accumulate { acc = acc + x; z = acc + x; } sfg rst { acc = 0; z = 0; } } There are four named signal flowgraphs in this example. The datapath has two inputs, x and y, and one output, z. There is an internal accumulator register, acc. There is one signal flowgraph call rst. This will be used to reset the accumulator register. During this reset operation, we will also drive the output of the datapath to zero. Not all datapath definitions that one can write down in GEZEL are valid. There are four rules to which a datapath definition must conform. When any of those rules are violated, then either the GEZEL parser will reject your code, or else a runtime error message will be triggered. The four rules are enumerated below.
Here are a few examples of erroneous signal flowgraphs.
// WRONG: output v is not always defined dp bad1(out v : ns(1)) { always {} } // WRONG: a combinatorial loop between signals dp bad2 { sig a, b : ns(1); always { a = b + 1; // a defines b, b defines a b = a + 1; // and both are signals (not registers) } } // WRONG: dangling signal dp bad3 { sig a, b : ns(1); always { a = b + 1; // b is unknown } } // WRONG: a signal is assigned more than once dp bad4 { sig a : ns(1); always { a = 1; a = 5; } } 

Structural Hierarchy  Datapaths can be included inside of other datapaths, thus implementing structural hierarchy. For this purpose, GEZEL provides the keyword use. Consider the example of a 4input AND gate.
1 dp andgate(in a, b : ns(1); out q : ns(1)) { 2 always { 3 q = a & b; 4 } 5 } 6 7 dp andgate2 : andgate 8 dp andgate3 : andgate 9 10 dp fourinputand(in a, b, c, d : ns(1); out q : ns(1)) { 11 sig s1, s2 : ns(1); 12 use andgate ( a, b, s1); 13 use andgate2( c, d, s2); 14 use andgate3(s1, s2, q); 15 always { 16 $display(a," ", b, " ", c, " ", d, " > ", q); 17 } 18 } 19 20 dp tst(out a, b, c, d : ns(1)) { 21 reg n : ns(4); 22 always { 23 n = n + 1; 24 a = n[0]; b = n[1]; c = n[2]; d = n[3]; 25 } 26 } 27 28 dp sysandgate { 29 sig a, b, c, d, q : ns(1); 30 use tst(a, b, c, d); 31 use fourinputand(a, b, c, d, q); 32 } 33 34 system S { 35 sysandgate; 36 } Lines 1018 define a fourinput AND gate using three twoinput AND gates. A use statement allows to include a twoinput AND gate inside of the fourinput AND gate. Connections can be made to datapath inputs, outputs or local signals. Of course, the semantic requirements enumerated earlier must be obeyed. Lines 2026 define a testbench that enumerates all 4bit input patterns by decomposing the bits of a counter. Finally, lines 2832 interconnect the testbench to the fourinput AND gate in a system block. 

Datapath Cloning  Sometimes, multiple copies of one and the same datapath are needed. GEZEL provides a cloning operation to create such an identical copy of a single datapath. The next example shows how three identical AND gates can be created by defining one and then cloning the first AND gate two times.
dp andgate(in a, b : ns(1); out q : ns(1)) { always { q = a & b; } } dp andgate2 : andgate dp andgate3 : andgate Cloning creates an identical but independent copy. If the parent datapath includes a register, then the cloned datapath will contain its’ own register. This completes basic modeling techniques for datapaths. The next section covers the modeling of controllers, that enable the use of datapath with multiple signal flowgraphs. 

Controller Design  The control/datapath model of GEZEL is based on a more generic form of registertransfer level modeling called Finite State Machine and Datapath, or FSMD for short. An FSMD model expresses both datapath operations as well as control operations. It makes a clear distinction however between what is control and what is data processing. An FSMD consists of two crosscoupled state machines. One plays the role of the controller, the other plays the role of the datapath. Information exchange between the two includes conditions (going from the datapath to the controller) and instructions (going from the controller to the datapath).An FSMD provides separate modeling for data processing and for control processing. That is for a good reason, in practice there are many differences between the controller and the datapath. First, the modeling style for the two is different. Datapaths are modeled with expressions on signals and registers. Controllers are modeled with state transition graphs. Second, the logic implementation style of the two also shows differences. A datapath with operators typically exhibits a regular logic style. A controller on the other hand exhibits an irregular logic style.
The FSMD concepts map as follows to the GEZEL model.


Sequencer Controllers  Besides the trivial hardwired controller discussed above, the simplest controller is the sequencer. As the name indicates, a sequencer will execute a set of instructions sequentially, without taking any conditions into account.A typical case where sequencers are useful is for static, fixed schedules. Consider for example a 4tap decimating averaging filter. Such a filter reads four subsequent samples, integrates and dumps the sum of the samples at every fourth sample.
1 dp avg(in i : ns(8); out o : ns(8)) { 2 reg acc : ns(9); 3 sfg phase0 { acc = i; o = 0; } 4 sfg phase12 { acc = acc + i; o = 0;} 5 sfg phase3 { o = (acc + i) >> 2;} 6 } 7 sequencer h_avg(avg) { phase0; 8 phase12; 9 phase12; 10 phase3;} 11 12 dp tst(in i : ns(8); out o : ns(8)) { 13 reg a : ns(8); 14 always { 15 o = a; 16 a = a + 2; 17 $display("C", $cycle, ": i=", o, " o=", i); 18 } 19 } 20 21 dp sysavg { 22 sig i, o : ns(8); 23 use avg(i, o); 24 use tst(o, i); 25 } 26 27 system S { 28 sysavg; 29 } An averaging filter has four phases. As the datapath in lines 16 illustrates, there is an initialization instruction (phase0), an accumulation instruction (phase12) and a termination instruction (phase3). The controller for this datapath is a sequencer with four steps, as shown in lines 710. Lines 1219 show a simple testbench that will feed a string of even numbers to this fourphase averager. Finally, lines 2129 show the system interconnect function. An important motivation for developing FSMD models, instead of plain hardwired datapaths, is that an FSMD allows to express operation sharing in an elegant way. Consider the descriptions in phase0, phase12 and phase3. They specify two assignments on an accumulator register and three assignments to an output port without the use of a multiplexer. When the same behavior would be represented in a single sfg, it would look like this:
reg phase : ns(2); sfg singlephase { acc = (phase == 0) ? i : acc + i; o = (phase == 3) ? (acc + i) >> 2 : 0; phase = phase + 1; } This description style gives you precise control over how the implementation will look like, but requires more modeling as the control operations have to be written down explicitly as expression. 

FSM Controllers  A Finite State Machine implements conditional control sequencing on a datapath. The control model is captured by a state transition graph. A Finite State Machine can be in a welldefined number of states. One of these states is the initial state, it is the state the FSM is in when it first initializes.
A Finite State Machine will take one state transition per clock cycle. During this state transition, a datapath instruction (one or more sfg) can be executed. A state transition can be conditional. In that case, the condition is based on the values of registers in the datapath (or on logical expressions directly derived from it). When state transitions are conditional, then the set of conditions must be complete. This means that, for every if (truebranch), there must be a complimentary else (falsebranch). Consider the following simple example of FSM modeling.
fsm h_avg(avg) { initial s0; state s1, s2, s3; @s0 phase0 > s1; @s1 phase12 > s2; @s2 phase12 > s3; @s3 phase3 > s0; } This description creates four states, called s0, s1, s2 and s3. s0 is the initial state, the others are normal states. A state transition indicates the start state with the @ symbol, and the target state with an arrow (>). In between, a datapath instruction is indicated. A single sfg can be written as such, a group of sfg is specified as a commaseparated list in between round brackets. State transitions can also be made conditional. The state transition condition needs to be a value directly calculated from a register. An example of conditional state transitions is as follows.
@s0 if ( c1 & c2) then (sfg1) > s0; else if ( c1 & ~c2) then (sfg2) > s0; else if (~c1 & c3) then (sfg3) > s0; else if (~c1 & ~c3) then (sfg4) > s0; This can also be written hierarchically, as follows.
@s0 if (c1) then if (c2) then (sfg1) > s0; else (sfg2) > s0; else if (c3) then (sfg3) > s0; else (sfg4) > s0; Next is an example with slightly more complicated FSM control. The example is a raster line drawing routine, known as the Bresenham Algorithm. The strong point of this algorithm is that it can draw lines of arbitrary slope on a discrete (X,Y) grid, and without the use of floating point arithmetic. The complete GEZEL listing illustrates how a slightly more complicated design looks like.
1 // Bresenham line plotter for points in an arbitrary octant 2 dp bresen(in x1_in, y1_in, x2_in, y2_in : tc(12)) { 3 reg x, y : tc(12); // current plot position 4 reg e : tc(12); // accumulated error 5 reg eol : tc(1); // endofloop flag 6 reg einc1, einc2 : tc(12); // increments 7 reg xinc1, xinc2 : tc(12); 8 reg yinc1, yinc2 : tc(12); 9 sig se, sdx, sdy : tc(12); 10 sig asdx, asdy : tc(12); 11 sig stepx, stepy : tc(12); 12 13 sfg init { 14 // evaluate range of pixels and their absolute value 15 sdx = x2_in  x1_in; asdx = (sdx < 0) ? sdx : sdx; 16 sdy = y2_in  y1_in; asdy = (sdy < 0) ? sdy : sdy; 17 // determine direction of x and y increments 18 stepx = (sdx < 0) ? 1 : 1; 19 stepy = (sdy < 0) ? 1 : 1; 20 // initial error 21 se = (asdx > asdy) ? 2 * asdy  asdx : 2 * asdx  asdy; 22 // error increment for straight (einc1) and diagonal (einc2) step 23 einc1 = (asdx > asdy) ? (asdy  asdx) : (asdx  asdy); 24 einc2 = (asdx > asdy) ? asdy : asdx; 25 // increment in x direction for straight and diagonal steps 26 xinc1 = (asdx > asdy) ? stepx : stepx; 27 xinc2 = (asdx > asdy) ? stepx : 0; 28 // increment in y direction for straight and diagonal step 29 yinc1 = (asdx > asdy) ? stepy : stepy; 30 yinc2 = (asdx > asdy) ? 0 : stepy; 31 // initialize registers 32 x = x1_in; y = y1_in; 33 e = se; 34 } 35 36 // endofloop test  check if we reach target 37 sfg looptest { 38 eol = ((x == x2_in) & (y == y2_in)); 39 } 40 41 // loop body: adjust x, y and error accumulator 42 // use error value to decide straight or diagonal step 43 sfg loop { 44 x = (e >= 0) ? x + xinc1 : x + xinc2; 45 y = (e >= 0) ? y + yinc1 : y + yinc2; 46 e = (e >= 0) ? e + einc1 : e + einc2; 47 $display($hex,"Cycle: ",$cycle," Plot point (", x, ",", y, ") "); 48 } 49 } 50 // controller for bresenham algorithm 51 // initializes, draws one line and then waits in state s3 52 fsm f_bresen(bresen) { 53 initial s0; 54 state s1, s2, s3; 55 @s0 (init) > s1; 56 @s1 (loop, looptest) > s2; 57 @s2 if (eol) then (init) > s3; 58 else (loop, looptest) > s2; 59 @s3 (init) > s3; 60 } 61 62 // testbench 63 dp test_bresen(out x1, y1, x2, y2 : tc(12)) { 64 sig sx : tc(12); 65 sfg run { 66 x1 = 5; x2 = 18; y1 = 2; y2 = 8; 67 } 68 } 69 hardwired h_test_bresen(test_bresen) {run;} 70 71 dp sysbresen { 72 sig x1, y1, x2, y2 : tc(12); 73 use bresen(x1, y1, x2, y2); 74 use test_bresen(x1, y1, x2, y2); 75 } 76 77 system S { 78 sysbresen; 79 } The Bresenham datapath accepts two coordinate tuples, indicating the starting resp. ending points of the vector to be drawn. The bulk of the calculation of the algorithm takes place in an initialization phase, for which a single sfg is created (lines 1334). Basically, the Bresenham algorithm works with three accumulators: one for the x coordinate (register x), one for the y coordinate (register y), and one error accumulator (register e). At runtime, the error accumulator is evaluated to decide on the required increments in the x and y accumulators. Not all vectors have the same length, and the Bresenham algorithm only takes a single step (horizontal, vertical or diagonal) per iteration. Because each clock only a single iteration of the Bresenham algortihm is executed, a complete line takes a variable number of clock cycles to generate a vector. Lines 3739 contain a loop test that decide when to terminate a loop. The actual loop body, which contains the error accumulations, is shown in lines 4348. The FSM controller of the Bresenham algorithm is shown in lines 5260. After initialization, the algorithm takes a first iteration of the loop and evaluates the endofloop flag on line 56. From then on, the FSM takes conditional state transitions, which will take it back each time from state s2 to state s2 (line 58), or else terminate the loop into state s3 (line 57). The test (eol) checks when the endofloop flag becomes true. This test is taken on the value in a register, so it actually checks the endofloop condition of the previous iteration. For this reason, the instruction of the transition into s3 is an initialization instruction (line 57). When the output of eol is high, the x and y accumulators are already at there target position, and no more increments should be done. Finally, lines 6379 show a simple testbench for the vector generator. The test will evaluate pixels from the vector running from (5,2) to (18,8) (line 66). 
Source:
http://rijndael.ece.vt.edu/gezel2/documentation.html
Virtual Fashion Education
"chúng tôi chỉ là tôi tớ của anh em, vì Đức Kitô" (2Cr 4,5b)
hienphap.net
News About Tech, Money and Innovation
Modern art using the GPU
Find the perfect theme for your blog.
Learn to Learn
Con tằm đến thác vẫn còn vương tơ
Khoa Vật lý, Đại học Sư phạm Tp.HCM  ĐT :(08)38352020  109
Blog Toán Cao Cấp (M4Ps)
Indulge Travel, Adventure, & New Experiences
"Behind every stack of books there is a flood of knowledge."
The latest news on WordPress.com and the WordPress community.