0% found this document useful (0 votes)
64 views

Document 1

this is a solution

Uploaded by

1fbe4bd97e
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
64 views

Document 1

this is a solution

Uploaded by

1fbe4bd97e
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 21
ERAGHCEIRFODIEMISI22 (solution page 369) A. Try to calculate 14! with a 32-bit int. Verify whether the computation of 14! overflows. B. What if the computation is done with a 64-bit long int? Solution to Problem 3.22 (page 257) A. The computation of 14! would overflow with a 32-bit int. As we learned in Problem 2.35, when we get value x while attempting to compute m!, we can test for overflow by computing x/n and seeing whether it equals (n — 1)! (assuming that we have already ensured that the computation of (v — 1)! did not overflow). In this case we get 1,278,945,280/14 = 91353234.286. As a second test, we can see that any factorial beyond 10! must be a multiple of 100 and therefore have zeros for the last two digits. The correct value of 14! is 87,178,291,200. Further, we can build up a table of factorials computed through 14! with data type int, as shown below: a nt OK? 1 1 Y 2 2° Y 3 6 oY 4 24 OY 5 200 °«=Y 6 mR ¥ 7 5,040 8 40320 -Y 9 362,880 10 3,628,800 ¥ ul 39,916,800 12 479,001,600 13 1,932,053504 = -N 14 -1,278,945,280-N B. Doing the computation with data type Long lets us go up to 201, thus the 14! ‘computation does not overflow. t PERCECEIPFODIEIBIZG (solution page 371) For C code having the general form short loop_vhile(short a, short b) return result; ccc, run with command-line option -0g, produces the following code: 1 2 3 4 5 6 7 8 9 10 u We can see that the compiler used a jump-to-middle translation, using the instruction on line 3 to jump to the test starting with label .L2. Fill in the mis short 1oop_uhile(short a, short b) ain frdi, b in trai loop_vhile: novl $0, Yeax jmp -L2 13: leaq — (irsi,sedi), Yrdx addq erdx, trax subg $1, Yrdi ss empq rsi, edi jg 13 rep; ret jmp ing parts of the C code. Solution to Problem 3.24 (page 260) This assembly code is a fairly straightforward translation of the loop using the jump-to-middle method. The full C code is as follows: short loop_while(short a, short b) { short result = 0; while (a > b) { result = result + (a*b); a=a-l; + return result; PFBCHEEIRFODIEHIBIZS (solution page 371) For C code having the general form long loop_while2(long a, long b) t long result 3 while ( d¢ result = 3 b= + return result; + cc, run with command-line option -01, produces the following code: 2 in frdi, b in frei 1 loop_while2: 2 testq irsi, desi 2 jie -18 4 movq = ersi, trax SALT: 6 imilg edi, Yrax 7 subq = erdi, Mrsi ® testq esi, drei Solution to Problem 3.25 (page 262) While the generated code does not follow the exact pattern of the guarded-do translation, we can see that it is equivalent to the following C code: long loop_while2(long a, long b) { long result = b; while (b > 0) { result = result * a; = b-a; t return result; } We will often see cases, especially when compiling with higher levels of opti- mization, where Gcc takes some liberties in the exact form of the code it generates, while preserving the required functionality. Practice Problem 3.26 (solution page 372) A function test_one has the following overall structure: short test_one(unsigned short x) { short val = 1; while (...) { } return . > ‘The occ C compiler generates the following assembly code: short test_one(unsigned short x) x ie Med 1 test_one: 2 movl 81, eax S jap LS. 6 Lb: 5 xorg irdi, eax © sheq ri shite righe by 4 2 Lb 5 testq trdi, drat jue LS jo andl $0, eax net Reverse engineer the operation of this code and then do the following: ‘A. Determine what loop translation method was used. B._ Use the assembly-code version to fill in the missing parts of the Cade. C. Describe in English what this function computes. Solution to Problem 3.26 (page 264) Being able to work backward from assembly code to C code is a prime example of reverse engineering. A. B. We can see that the code uses the jump-to-middle translation, using the jmp instruction on line 3. Here is the original C code: short test_one(unsigned short x) { short val = 1; while (x) { val “= x; return val & 0; i This code computes the parity of argument x. That is, it returns 1 if there is an odd number of ones in x and 0 if there is an even number. PIRECEIRFOGIEMIBIZE (solution page 372) A function test_two has the following overall structure: short test_two(unsigned short x) { short val = 0; short i; for Ci... rf 2 return val; The ace C compiler generates the following assembly code: test fun_b(unsigned test x) x in frat 1 test_two: 2 movi = $1, Yedx 2 movl $65, eax 4 110: 5 -movg edi, Krex 6 andl $1, Yecx 7 addq trax, ‘trax s org ercx, %rax 9 shrq edi Shite right by 1 m0 addqg $1, re no jne L10 12 reps ret Reverse engineer the operation of this code and then do the following: A. Use the assembly-code version to fill in the missing parts of the C code. B. Explain why there is neither ar to the test portion of the loop. ial test before the loop nor an initial jump Describe in English what this function computes. Solution to Problem 3.28 (page 267) ‘This problem is trickier than Problem 3.26, since the code within the loop is more complex and the overall operation is less familiar. A. Here is the original C code: long fun_b(unsigned long x) { long val = 0; long i; 64; i != 0; i--) { val << 1) | (x & Ox1); x >>= 45 for (i } return val; } B. The code was generated using the guarded-do transformation, but the com- piler detected that, since i is initialized to 64, it will satisfy the test i #0, and therefore the initial test is not required. C. This code reverses the bits in x, creating a mirror image. It does this by shifting the bits of x from left to right, and then filling these bits in as it shifts val from right to left. Dicaceonbly of lazt(leas 8, Leas ¥) = ia Bets (EEREEEREISEIERIISE? (solution page 375) The disassembled code for two functions fret and Inet is shown below, along ‘with the code for a call of ¢sret by function nasa: via feat 19000000000400540 00540 00543 00847 49 89 £8 48 OF af o6 3 Dzaceenbly of laze(long © x in Ends ‘20v0000000800648 : 00848: ao0s4e: 00850: 400855 400860: ‘00865 48 8a 77 OL 42 82 ef OL of eb fz a2 t #368 of 02 tat at 48 89 «2 nov edi trax seul trai rar ea ort (keds) teat sub Sox, bra eallg 400810 aaet> reps retq eallg 400808 pov Mraz, rds BBE ca t2202(10) Each of these instructions is given a label, similar to those in Figure 3.27(a. ‘Starting with the calling of #rst (10) by main, fill in the following table to trace. instruction execution through to the point where the program returns back 10 vizep _ Description FI Rn B ra R a se Call First G0 solution to Froblem 3.52 (page 260) Tracing through the program execution at this level of detail reinforces many aspects of procedure call and return. We can see clearly how control is passed to the function when it is called, and how the calling function resumes upon return. We can also see how arguments get passed through registers %{rdi and %rsi, and how results are returned via register %rax. Instruction State values (at beginning) Label PC Instruction %rdi Yrsi Yrax Yersp ixsp Description MI —0x400560 callq 10 — — Ox?£ffffffes20 - Call first (10) Fl -0x400548 lea 10 — — Ox?fffffffe818 0x400565 Entry of first F2 — 0x40054c sub 10 il — Ox?#ffffffe8is 0x400565 F3 0x400550 callq 9 i — Ox?ffffffie8i8 0x400565 Calllast(9, 11) Ll 0x400540 mov 9 IL — Ox?7fffffffe810 0x400555 Entry of last 12 0x400543 imal 9 11 9 Ox?fffffffe810 0x400555 13 0x400847 retq 9 11 99 Ox7#tfffffe810 0x400555 Return 99 from last F4 0x400555 repz repq 9 i 99 Ox7ffffffie8i8 0x400565 Return 99 from first M2 = 0x400565 mov 9 11 99 Ox7fffffffe820 _ Resume main RESEHEEIRFOBIERIBISS (solution page 376) For a C function having the general structure long rfun(unsigned long x) { it ( ) return : unsigned long mx 5 long rv = rfun(nx); return 3 ‘ace generates the following, assembly code: Jong rfun(unsigned Jong x) x im fed 1 of 2 pushq drbx. 3 movq—krdi, rbx 4 movl $0, eax 5 testq tri, krdi 6 ie 12 7 shrq $2, radi 8 call rfun © addq rb, trax io LB: 1) popq herb 12 ret A. What value does rfun store in the callee-saved register Zrbx? B. Fill in the missing expressions in the C code shown above. Solution to Problem 3.35 (page 290) This problem provides a chance to examine the code for a recursive function. An important lesson to learn is that recursive code has the exact same structure as the other functions we have seen. The stack and register-saving disciplines suffice to make recursive functions operate correctly. A. Register %rbx holds the value of parameter x, so that it can be used to compute the result expression. B. The assembly code was generated from the following C code: long rfun(unsigned long x) { if (x == 0) return 0; unsigned long nx = x>>2; long rv = rfun(nx); return x + rv; PFACHICEIPFOBIENNSIBG (solution page 377) Consider the following declarations: int = P[5]; short Q([2]; int —**R[9]; double *S[10]; short *T(2]; Fill in the following table describing the element size, the total size, and the address of element i for each of these arrays. Array Element size Totalsize = Startaddress_~— Elementi P Q R Ss T 4 20 _ xp _xX+4i- 2 X x4 Qi —g—_ —72 XR —xX+8i- 8 gg Xs __x+ Bi 8 16 ar x + Bi Solution to Problem 3.36 (page 292) This exercise tests your understanding of data sizes and array indexing. Observe that a pointer of any kind is 8 bytes long. Data type short requires 2 bytes, while int requires 4. Array Element size ‘Total size Start address Element i P 4 20 Xp xp t4i Q 2 4 Xq Xq+2i R 8 2 Xp xg +81 s 8 80 Xs Xg + 8 T 8 16 Xp xp + 8 Practice(PFObleny3937 (solution page 377) Suppose xp, the address of short integer array P, and long integer index i are stored in registers Zrdx and %rcx, respectively. For each of the following expressions, give its type, a formula for its value, and an assembly-code implementation. The result should be stored in register %rax if it is a pointer and register element %ax if it has data type short. Expression Type Value Assembly code Plt] -short— -M[x+2] _movw 0x2(%rdx) , %ax PH3+i short® —x+6+ 2i __leaq_Ox6(%rdx, Yrex, 2), Yrax P[ix6-5] short M[x+12i- 10] _movw -OxA(%rdx, %rex, 12), %ax P[2] —short. M[x+4] —movww 0x4(%rdx), %ax &Pli+ 2] —short* x%+2i+4 —JeagOx4(%rdx, Yrcx, 2), Yrax Solution to Problem 3.37 (page 294) This problem is a variant of the one shown for integer array E. It is important to understand the difference between a pointer and the object being pointed to. Since data type short requires 2 bytes, all of the array indices are scaled by a factor of 2. Rather than using mov1, as before, we now use movw. Expression Type Value Assembly Code P[i] short M[xp + 2] movw 2(%rdx) , Zax P+3ti short * xp +6+42i leaq 6(%rdx,%rex,J) ,%rax P[i*6-5] short M[xp+12i—10] — movw -10(%rdx, %rex, 12) , fax P[2] short MLxp +4] movw 4(%rdx) , ax &P (i+2] short * x »+2i+4 leaq 4(%rdx,%rex,2) ,%rax PaCHEEIPFODIEM|BSISS (solution page 377) Consider the following source code, where M and N are constants declared with #define: long PIM] (N]; Jong Q(N] (M1; long sum_element (long i, long j) { return P[iJ[j] + Q(j] (il; + In compiling this program, ccc generates the following assembly code: ong sum_element (long i, long j) iin Srdi, j in frsi leaq subq addq leaq addq movg addq ret wevanueauna sum_element: OC %rdi,8), Mrdx Bi in rdx dedi, trax 7i in rdx Yrsi, Yrdx 7i + jin rdx (hrsi,%rsi,4), yrax 5j in rax Yrax, Yrdi i+ 5j in rdi Q(%rdi,8), trax q+ B(i + 5)) P(,%rdx,8), %rax p+ 8(7i+j) M=5,N=7 Use your reverse engineering skills to determine the values of M and N based on this assembly code. Solution to Problem 3.38 (page 295) This problem requires you to work through the scaling operations to determine the address computations, and to apply Equation 3.1 for row-major indexing. The first step is to annotate the assembly code to determine how the address references are computed: long sum_element (long i, long j) iin frdi, j in %rsi 1 sum_element: 2 leaq 0(,%rdi,8), %rdx Compute 8i 3 subq drdi, %rdx Compute 7i 4 addq drsi, trax Compute 7i + j 5 leaq (krsi,%rsi,4), trax Compute 5j 6 addq drax, ‘hrdi Compute i + Sj 7 movq QC, %rdi,8), %rax Retrieve M[xg + 8 (Sj + i] 8 addq P(,%rdx,8), %rax Add M[xp + 8 (i + J] 9 ret We can see that the reference to matrix P is at byte offset 8 . (7i + j), while the reference to matrix Q is at byte offset 8 - (5j +7). From this, we can determine that P has 7 columns, while Q has 5, giving M =5 and N =7. PEACHCEIRFOBIEMIGM (solution page 696) In the following, let r be the number of rows in a DRAM array, ¢ the number of columns, }, the number of bits needed to address the rows, and b, the number of bits needed to address the columns, For each of the following DRAMs, determine the power-of-2 array dimensions that minimize max(h, , b,), the maximum number of bits needed to address the rows or columns of the array. Figure 6.5 Reading the contents of a memory module. Organization 16x1 16x4 128 x8 s12x4 1,024 x 4 2 Supercel (i) o4ME memory module consisting of fight M8 ORAM asta ‘BAER word at main memory address A controler ‘4-0 ord 10 CPU chip r ‘ ke max(h,, be)

You might also like