An Interesting Pointer Puzzle

http://denniskubes.com/2014/08/11/interesting-pointer-puzzle/

A reader of my blog sent me a question the other day asking me to explain a piece of code with
pointers. I found it to be a very interesting puzzle, and not just because I had to drop into an
object dump with a friend to work through it. The error is consistent, even across platforms. Here
is a slightly modified version of the original code. We will call this file bad.c. See if you can
spot the error.
#include <stdio.h>

int main()
{
    int *a[] = {0,1,2,3,4};
    printf("arr0=%d\n", *a+0);
    printf("arr1=%d\n", *a+1);
    printf("arr2=%d\n", *a+2);
    printf("arr3=%d\n", *a+3);
    printf("arr4=%d\n", *a+4);

    return 0;
}
If you compile that with gcc bad.c, you will get a bunch of warnings. It is never good to ignore
warnings, and sure enough, upon running it you should see output like this.
arr0=0
arr1=4
arr2=8
arr3=12
arr4=16

We ran this on both x64 Mac and Linux and got the same result. The expected output was something
like this.
arr0=0
arr1=1
arr2=2
arr3=3
arr4=4

Try to figure out the error before going further.

The Error
The error is actually easy to overlook. The int *a[] means an array of int pointers. What was
probably intended was an array of ints.

int main()
{
    int *a[] = {0,1,2,3,4};
    // should be int a[] = {0,1,2,3,4};
    // notice no pointer star before a
    printf("arr0=%d\n", *a+0);
    // also *a+0 should probably be *(a + 0)
    ...
}
Change that, recompile, and we get the expected output. With int a[], the *a+n expression
dereferences the value at the start of the a array, in this case 0, and adds n to it. The *a+n was
probably intended to be *(a + n) as well, which reads element n directly: the same output here, but
for a different reason. Mildly interesting.

Setting aside the array of int pointers instead of ints and the dereferencing logic errors, what is
interesting is that the erroneous output is consistent, even across platforms. Do you know why?

The Puzzle
Confused? Run the erroneous version multiple times and you will get the same output. Usually with
pointer errors you get changing memory locations and values. But the error output was consistent. I
had to work through it in an object dump with my friend Seth Hartbecke to figure out what was going on.

#include <stdio.h>

int main()
{
    int *a[] = {0,0,0,0,0};
    // change array to all zeros, compile, same output
}
Let's drop into the assembly output.
000000000040052d <main>:
  40052d: 55                      push   %rbp
  40052e: 48 89 e5                mov    %rsp,%rbp
  400531: 48 83 ec 30             sub    $0x30,%rsp
  400535: 48 c7 45 d0 00 00 00    movq   $0x0,-0x30(%rbp)   # placing array onto stack
  40053c: 00
  40053d: 48 c7 45 d8 00 00 00    movq   $0x0,-0x28(%rbp)
  400544: 00
  400545: 48 c7 45 e0 00 00 00    movq   $0x0,-0x20(%rbp)
  40054c: 00
  40054d: 48 c7 45 e8 00 00 00    movq   $0x0,-0x18(%rbp)
  400554: 00
  400555: 48 c7 45 f0 00 00 00    movq   $0x0,-0x10(%rbp)
  40055c: 00
  40055d: 48 8b 45 d0             mov    -0x30(%rbp),%rax   # move first array value to rax register
  400561: 48 89 c6                mov    %rax,%rsi          # no add just move rax value
  400564: bf 94 06 40 00          mov    $0x400694,%edi
  400569: b8 00 00 00 00          mov    $0x0,%eax
  40056e: e8 9d fe ff ff          callq  400410 <printf@plt>
  400573: 48 8b 45 d0             mov    -0x30(%rbp),%rax   # move first array value to rax register
  400577: 48 83 c0 04             add    $0x4,%rax          # add 4 to the value in the rax register
  40057b: 48 89 c6                mov    %rax,%rsi
  40057e: bf 9d 06 40 00          mov    $0x40069d,%edi
  400583: b8 00 00 00 00          mov    $0x0,%eax
  400588: e8 83 fe ff ff          callq  400410 <printf@plt>
  40058d: 48 8b 45 d0             mov    -0x30(%rbp),%rax   # move first array value to rax register
  400591: 48 83 c0 08             add    $0x8,%rax          # add 8 to the value in the rax register
  400595: 48 89 c6                mov    %rax,%rsi
  400598: bf a6 06 40 00          mov    $0x4006a6,%edi
  40059d: b8 00 00 00 00          mov    $0x0,%eax
  4005a2: e8 69 fe ff ff          callq  400410 <printf@plt>
  4005a7: 48 8b 45 d0             mov    -0x30(%rbp),%rax   # move first array value to rax register
  4005ab: 48 83 c0 0c             add    $0xc,%rax          # add 12 to the value in the rax register
  4005af: 48 89 c6                mov    %rax,%rsi
  4005b2: bf af 06 40 00          mov    $0x4006af,%edi
  4005b7: b8 00 00 00 00          mov    $0x0,%eax
  4005bc: e8 4f fe ff ff          callq  400410 <printf@plt>
  4005c1: 48 8b 45 d0             mov    -0x30(%rbp),%rax   # move first array value to rax register
  4005c5: 48 83 c0 10             add    $0x10,%rax         # add 16 to the value in the rax register
  4005c9: 48 89 c6                mov    %rax,%rsi
  4005cc: bf b8 06 40 00          mov    $0x4006b8,%edi
  4005d1: b8 00 00 00 00          mov    $0x0,%eax
  ...
This is interesting. It takes the first value of the array, 0, and adds 4, 8, 12, and 16 to it in sequence
to give us our output. Where did those values come from? Have you solved it?

A Bad Case of Pointer Math


This is a bad case of pointer math.

int main()
{
    int *a[] = {0,0,0,0,0}; // array of int pointers, pointers hold addresses
    printf("arr0=%d\n", *a+0);
    printf("arr1=%d\n", *a+1);
    printf("arr2=%d\n", *a+2);
    printf("arr3=%d\n", *a+3);
    printf("arr4=%d\n", *a+4);

    ...
}
The first piece of the puzzle is that int *a[] is an array of pointers. The initializers
{0,0,0,0,0} or {0,1,2,3,4} are, rightly so from the compiler's point of view, seen as pointers
holding addresses. There are cases where absolute addresses are used this way, though that is more
common in embedded programming.
The second piece to the puzzle is the 0, 4, 8, 12, 16 in the assembly. These are equivalent to 0 *
4, 1 * 4, 2 * 4, 3 * 4, and 4 * 4 respectively.
Because int *a[] is an array of pointers, the compiler is doing pointer math. The *a + n
expression is evaluated as an int pointer plus n. In this case ints are 4 bytes, and according to
pointer math, adding n to an int pointer advances the address by 4 bytes * n. Since *a holds the
address 0, the result is simply 4 * n.

int main()
{
    int *a[] = {0,0,0,0,0}; // array of int pointers, pointers hold addresses
    printf("arr0=%d\n", *a+0); // *a+0 == 4 bytes * 0 == 0
    printf("arr1=%d\n", *a+1); // *a+1 == 4 bytes * 1 == 4
    printf("arr2=%d\n", *a+2); // *a+2 == 4 bytes * 2 == 8
    printf("arr3=%d\n", *a+3); // *a+3 == 4 bytes * 3 == 12
    printf("arr4=%d\n", *a+4); // *a+4 == 4 bytes * 4 == 16
    ...

}
In the end a consistent, and somewhat interesting, case of bad pointer math. All from one little
character.

