[Tutorial] Compiler string bug

Here is an interesting bug I found the other day (or rather, went looking for). With a certain revision of the compiler (I'm not sure which one), this became invalid code:

Code:
#define STR(%0) "%0"
dcmd relied on this trick and broke as a result, which prompted a compiler modification to introduce the stringise operator:

Code:
#define STR(%0) #%0
This was added to the SA:MP compiler, and later to the official compiler as well, though in a slightly different format. Obviously the operator needs to know where the string ends:

Code:
print(STR(hi));
You want the generated code to be:

Code:
print("hi"); // Valid
Not:

Code:
print("hi);" // Invalid
For this reason the stringise operator treats certain characters as explicit ends of strings, namely ',', ')' and ';'. There are, however, numerous bugs in the implementation (done by me). One is that ':' is not considered a valid end, so:

Code:
new g[32] = b ? "hi" : "there";
is now invalid. However, this was very quickly worked around by wrapping each string in brackets:

Code:
new g[32] = b ? ("hi") : ("there");
The other MAJOR bug is that context is not taken into account, so:

Code:
print(STR(5 * (6 + 7)));
You would expect this to produce:

Code:
print("5 * (6 + 7)");
But the string in fact ends at the first close bracket, regardless of nesting level, giving:

Code:
print("5 * (6 + 7"));
which is again invalid. Recently this became an issue for me, so I went back and looked for a bug in the original fix that stopped "%0" from working, and found one:

Code:
"\"%0\""
The fix does not take escape sequences into account, so it thinks the string ends after the second double quote, which it doesn't. This allows the compiler to expand "%0" to whatever its contents are, inside the string. However, the next part of the compiler IS escape sequence aware, so any further macros will NOT be evaluated. To demonstrate this, here are several similar bits of code with their true outputs:

Input:

Code:
#define MACRO(%0) (%0) + 1
#define STR(%0) #%0
print(STR(MACRO(7)));
Output (invalid):

Code:
print("(7") + 1); // Brackets out of place.
Input:

Code:
#define MACRO(%0) %0 + 1
#define STR(%0) #%0
print(STR(MACRO(7)));
Output (valid):

Code:
print("7 + 1"); // No brackets.
Input:

Code:
#define MACRO(%0) (%0) + 1
#define STR(%0) "%0"
print(STR(MACRO(7)));
Output (odd):

Code:
print("%0"); // STR expanded, %0 not.
Input:

Code:
#define MACRO(%0) "(%0) + 1"
#define STR(%0) #%0
print(STR(MACRO(7)));
Output (odd):

Code:
print("(%0) + 1"); // STR and MACRO expanded, %0 not.
Input:

Code:
#define MACRO(%0) "\"(%0) + 1\""
#define STR(%0) #%0
print(STR(MACRO(7)));
Output (valid):

Code:
print("\"(7) + 1\""); // STR and MACRO expanded.
Input:

Code:
#define MACRO(%0) "(%0) + 1"
#define STR(%0) "\"%0\""
print(STR(MACRO(7)));
Output (wrong):

Code:
print("\"MACRO(7)\""); // "%0" expanded, "MACRO" not.
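The bracket problem described earlier (the string ending at the first close bracket) can be modelled in a few lines. This is only a sketch in Python of the behaviour as described in this post, not the compiler's actual source; the names TERMINATORS, naive_end, depth_aware_end and stringise are made up for illustration, and the depth-aware version is just one possible way the scan could be fixed.

```python
# Characters the post says are treated as explicit ends of a stringised argument.
TERMINATORS = {",", ")", ";"}

def naive_end(text):
    """Index of the first terminator, ignoring bracket nesting (the buggy behaviour)."""
    for i, ch in enumerate(text):
        if ch in TERMINATORS:
            return i
    return len(text)

def depth_aware_end(text):
    """Index of the first terminator that is not inside nested brackets (hypothetical fix)."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1          # closes a bracket opened inside the argument
        elif ch in TERMINATORS and depth == 0:
            return i
    return len(text)

def stringise(text, find_end):
    """Quote everything up to the detected end; 'text' is what follows the '#'."""
    end = find_end(text)
    return '"' + text[:end] + '"' + text[end:]

# Text following '#' once STR(5 * (6 + 7)) has been substituted into print(...).
print("print(" + stringise("5 * (6 + 7));", naive_end))        # print("5 * (6 + 7"));
print("print(" + stringise("5 * (6 + 7));", depth_aware_end))  # print("5 * (6 + 7)");
```

The naive scan reproduces the invalid output shown above, and the same function applied to "(7) + 1);" gives the misplaced bracket from the first MACRO example.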
Interestingly, if you only do:

Code:
#define STR(%0) "\"%0"
It will break all macro expansions after the current one, because the compiler will think it's back in a string (at least for macro parameter expansion; normal macros will be expanded fine).
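To make the escape sequence issue a bit more concrete, here is a rough model in Python of a scanner that tracks whether it is inside a string, with and without escape awareness. It is a sketch of the behaviour described in this post under my own assumptions, not the real compiler code, and the function name scan is hypothetical.

```python
def scan(pattern, escape_aware):
    """Walk a macro pattern and report (param_inside_string, ends_inside_string).

    param_inside_string: whether %0 was seen while the scanner believed it
    was inside a string literal (None if %0 never appears).
    ends_inside_string: whether the scanner still believes it is inside a
    string when the pattern ends.
    """
    in_string = False
    param_inside = None
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape_aware and in_string and ch == "\\":
            i += 2              # skip the escaped character, e.g. \"
            continue
        if ch == '"':
            in_string = not in_string   # every unskipped quote toggles the state
        elif pattern.startswith("%0", i):
            param_inside = in_string
            i += 2
            continue
        i += 1
    return param_inside, in_string

# Raw strings so the backslashes reach the scanner unchanged.
print(scan(r'"%0"', escape_aware=False))      # (True, False): %0 seen as inside a string
print(scan(r'"\"%0\""', escape_aware=False))  # (False, False): %0 seen as outside
print(scan(r'"\"%0"', escape_aware=False))    # (False, True): scanner ends "inside" a string
print(scan(r'"\"%0"', escape_aware=True))     # (True, False): an escape-aware pass disagrees
```

In this model the escape-unaware pass treats every double quote as a toggle, which is why "\"%0\"" lets %0 through and why the unbalanced "\"%0" leaves the scanner believing it is still inside a string afterwards.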