Looping over structure elements using pointers in C


I wrote this code to iterate over the members of a structure. It works fine. Can I use a similar method for structures with mixed-type elements, i.e. some integers, some floats, and so on?

#include <stdio.h>
#include <stdlib.h>

struct newData
{
    int x;
    int y;
    int z;
};

int main()
{
    struct newData data1;
    data1.x = 10;
    data1.y = 20;
    data1.z = 30;

    struct newData *data2 = &data1;
    long int *addr = data2;
    for (int i=0; i<3; i++)
    {
        printf("%d /n", *(addr+i));
    }
}

 


In C, "it works fine" is not good enough, because your compiler is allowed to do this:

struct newData
{
    int x;
    char padding1[523];
    int y;
    char padding2[364];
    int z;
    char padding3[251];
};

Of course, this is an extreme example. But you get the general idea; it's not guaranteed that your loop will work because it's not guaranteed that struct newData is equivalent to int[3].

So no, it's not possible in the general case because it's not always possible in the specific case!


Now, you might be thinking: "What idiots decided this?!" Well, I can't tell you that, but I can tell you why. Computers are very different to each other, and if you want code to run fast then the compiler has to be able to choose how to compile the code. Here's an example:

Processor 8 has an instruction to get individual bytes, and put them in a register:

GETBYTE addr, reg 

This works well with this struct:

struct some_bytes
{
    char age;
    char data;
    char stuff;
};

struct some_bytes can happily take up 3 bytes, and the code is fast. But what about Processor 16? It doesn't have GETBYTE, but it does have GETWORD:

GETWORD even_addr, reghl 

This only accepts an even-numbered address, and reads two bytes; one into the "high" part of the register and one into the "low" part of the register. In order to make the code fast, the compiler has to do this:

struct some_bytes
{
    char age;
    char pad1;
    char data;
    char pad2;
    char stuff;
    char pad3;
};

This means that the code can run faster, but it also means that your loop won't work. The compiler is allowed to do this because your loop relies on something called "Undefined Behaviour": the compiler may assume that it never happens, and if it does happen, the C standard places no requirements on what the program does.

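In fact, you've already run across this behaviour! If your particular compiler defines long int as twice the length of int, then for your loop to print 10, 20 and 30 it would have to be laying the struct out like this:

struct newData
{
    int x;
    int pad1;
    int y;
    int pad2;
    int z;
    int pad3;
};

Each long int read then covers one member plus its padding:

|  x  | pad |  y  | pad |  z  | pad |
| long no.1 | long no.2 | long no.3 |
| int |     | int |     | int |

(The other possibility is that long int and int are the same size on your platform, as they are on 64-bit Windows, in which case the loop happens to line up with no padding at all. Either way, you are relying on something the language doesn't promise.)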

That code is, as you can tell by my precarious diagram, precarious. It probably won't work anywhere else. What's worse, your compiler, if it was being clever, would be able to do this:

for (int i=0; i<3; i++)
{
    printf("%d /n", *(addr+i));
}

Hmm... addr comes from data2, which points to data1, which is a struct newData. The C specification says that only the pointer to the start of the struct will ever be dereferenced, so I can assume that i is always 0 in this loop!

for (int i=0; i<3 && i == 0; i++)
{
    printf("%d /n", *(addr+i));
}

That means it only runs once! Hooray!

printf("%d /n", *(addr + 0)); 

And all I need to compile is this:

int main()
{
    printf("%d /n", 10);
}

Wow, the programmer will be so pleased that I've managed to speed this code up so much!

You won't be pleased. In fact, you'll get unexpected behaviour, and won't be able to work out why. But you would be pleased if you had written code free of Undefined Behaviour, and your compiler had done something similar. So it stays.
