14.04.2015, 18:44
(Last edited by Misiur; 17/04/2015 at 10:45 AM.)
Disclaimer
This is a VERY advanced PAWN topic, if you don't know what to use #emit for, you don't need it, simple as! The compiler will NOT detect mistakes and it is VERY easy to screw your entire mode up using this. If you want a basic introduction to some of the concepts here, read kizla's tutorial first:
Abstract machine and emit
Introduction
I decided to start a topic for discussing this PAWN element. It is partially a tutorial, but it is not designed to tell you exactly how to use it. There has been a slow, self-reinforcing increase in the development of code using "#emit", so this topic is to collect together some of the developments already found and discuss future ones.
Basics
PAWN is open source, the source code for the virtual machine which executes P-code (what PAWN compiles to and what is stored in .AMX files) can be found here:
http://code.******.com/p/pawnscript/...%2Ftrunk%2Famx
The source code for the 4.0 PAWN compiler can be found here (note that SA:MP uses a custom 3.6 compiler):
http://code.******.com/p/pawnscript/...unk%2Fcompiler
The general project is here:
http://www.compuphase.com/pawn/pawn.htm
http://code.******.com/p/pawnscript/
These are all important resources. (At least) three important techniques used below were discovered by studying and understanding how the machine works at a low level and exploiting this knowledge.
The Pawn_Implementer_Guide.pdf (pawn-imp.pdf) can be found here:
http://www.compuphase.com/pawn/Pawn_...nter_Guide.pdf
This is possibly the most important of the resources as it documents all the OpCodes, the .AMX file format, the function call convention, the stack layout, and everything else you need to know.
Know the "-a" compiler switch! It will generate a .ASM file for your mode instead of a .AMX file, which can be "easily" read to see what assembly is generated for different PAWN code. There is also "-l", which only runs the pre-processor, but that is less useful here.
Other concepts will be gone over VERY quickly, if you don't understand anything look it up online - the x86 processor is a stack/register machine (as are MANY other architectures), so there's plenty of general documentation on the concept.
Registers
There are two temporary registers in which most calculations are done. "pri" (PRImary) and "alt" (ALTernate), the former does most of the work and has the most operations available. There are also registers storing execution state such as the next instruction and the current stack pointer. These are read with "LCTRL" and written with "SCTRL".
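As a small illustration of reading execution state, here is a minimal sketch that copies two registers into ordinary variables. It assumes the register numbering used elsewhere in this topic ("LCTRL 4" for the stack pointer, "LCTRL 5" for the frame pointer); the variable names are made up for the example.

```pawn
#include <a_samp>

main()
{
    new frm, stk;
    // LCTRL 5 loads the frame pointer (FRM) in to "pri".
    #emit LCTRL 5
    #emit STOR.S.pri frm
    // LCTRL 4 loads the stack pointer (STK) in to "pri".
    #emit LCTRL 4
    #emit STOR.S.pri stk
    // Both values are byte addresses relative to the data segment.
    printf("FRM = %d, STK = %d", frm, stk);
}
```

With two locals declared, "STK" should sit a couple of cells below "FRM", which is a quick way to check that you are reading the registers you think you are.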
AMX Format
Understanding the AMX file format is crucial to understanding most "#emit" use. Global and static data is stored in a segment called "data" ("DAT"), function code is stored in the "code" segment ("COD"), locals in the "stack" ("STP" and "STK") and some local arrays in the "heap" ("HEA"). Note that the heap and the stack are in the same block of memory but start at opposite ends, so it is possible for them to overwrite each other (in which case you will get a stack/heap collision error at run-time). This memory is allocated by the server when a mode starts and is not stored in the .AMX file (as "COD" and "DAT" are). The size is controlled using "#pragma dynamic <size>".
An entry in one of the lists looks like:
The name table is just a collection of strings in C order. This is different to PAWN packed strings which are stored in cells in little endian:
The strings in this table are NULL delimited, and packed to use as little memory as possible (see the later section on "jagged arrays" for an idea of what this looks like).
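To see the difference between unpacked and packed storage from inside a script, a rough sketch like the following can help. The array names are invented for the example, and it assumes the usual 4-byte cells.

```pawn
#include <a_samp>

main()
{
    // One character per cell, plus the terminator.
    new unpacked[] = "Hello There";
    // Four characters per cell; the "!" prefix requests a packed string.
    new packed[] = !"Hello There";

    printf("unpacked: %d cells, packed: %d cells", sizeof (unpacked), sizeof (packed));
    // Print the first packed cell as a number to see how "Hell" is
    // arranged inside a single cell.
    printf("first packed cell: 0x%x", packed[0]);
}
```

Comparing the printed cell with the bytes you see in a HEX editor is a simple way to convince yourself of the byte order described above.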
Constant arrays (especially string constants) are all stored in the "DAT" segment (there are options to move them, but they're irrelevant in SA:MP, only when ROM is available).
If you wish to explore the AMX file, add this to the top of a mode, compile it and open the .AMX file in a HEX editor such as "XVI":
This pragma will disable the compression normally used to make the AMX smaller, and will thus make the output conform exactly to the structure I posted above.
OpCodes
OpCodes are used to write the basic P-code instructions in to PAWN. There is a long section documenting them all and their uses in the pawn-imp document, so they will be skimmed over very quickly here. They don't CHECK parameters and some won't do what you expect. They take 0 or 1 parameters (some take more, but those aren't supported), or specify their operands in their name:
".S" means "stack", ".C" means "constant", ".I" means "indirection", ".B" means an alternate implementation, ".ADR" means "address", ".pri" means "primary register", ".alt" means "alternate register".
In the code above, the result will be "11". If we assume that "a", "b", and "c" are the only local variables in the current function, the code below will give a result of "12":
The addresses of variables will be explained later, but here the wrong data is loaded and the compiler will tell you nothing.
This code will print "5":
This code will print "-5":
This code will compile but fail to run (not sure why, as the .AMX doesn't actually contain the extra "85", which, by the way, is the OpCode number for "NEG"):
Often it is possible to shuffle the order in which specific opcodes are run to minimise temporary variable or stack usage and maximise the use of "pri" and "alt", but not always.
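As a rough illustration of keeping intermediate results in registers, the sketch below computes "(a + b) * c" without any temporary variable. The variable names are only for the example.

```pawn
#include <a_samp>

main()
{
    new a = 5, b = 6, c = 7, r;
    #emit LOAD.S.pri a
    #emit LOAD.S.alt b
    // pri = a + b
    #emit ADD
    // Reuse "alt" for the next operand; "pri" still holds the sum.
    #emit LOAD.S.alt c
    // pri = pri * alt (signed multiply)
    #emit SMUL
    #emit STOR.S.pri r
    printf("%d", r);
}
```

The sum never touches the stack: it stays in "pri" while "alt" is reloaded, which is the kind of reordering the paragraph above is talking about.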
Stack
Compiles as:
After calling "Func" the stack looks like:
Extra points for spotting that the return address from "Func" doesn't match the code I posted, it's actually for this code used to dump the stack (except "i" and "j") and then some:
You can reference variables by name:
Or by relative address:
Things like the parameters passed to a function do not have a variable name, but they DO have a stack location, so you can still get at them. With what I've just said, you should now be able to write "numargs" in one line of assembly (two if you want to save the value to a variable instead of "pri" or "alt").
The first version creates a COPY of the variable by pushing it to the stack:
The second version modifies the original by pushing the address to the stack:
The third version is essentially the same as the second, but you can't reference the variables by name.
Here are two scripts for you to compile and look at. If you don't learn something about the way the PAWN compiler stores strings, you've done it wrong (note that arrays are ALWAYS passed by reference, as are varargs - look at how "printf" gets "65"):
From this point on I'll assume you understand all of this, and have played around with the examples above and more scripts (using "-a") until you do. To check you understand it, pass an array as the first parameter to "Func" in the second example, without changing the function prototype.
Arrays
Compile and study the following mode:
What do you think this data is:
It comes after the data for "gSingle" (represented as a string of "2a") and before the data for "gDouble" (a string of "0"). Hint: Look at the code calling the second "printf", work through it manually, and look at the differences between the values in that line. All numbers in a .ASM file are in HEX, and all addresses and offsets are in BYTES, not CELLS.
Run that and see what happens, then try to figure out why from the assembly. We'll go one further in a second.
OK, so the above code stores 12 characters in 12 cells (we'll ignore packed strings for now). The following code stores them in 18 cells, but makes accessing them easier:
Clearly the former is preferable from the point of view of memory, the latter is preferable from the point of view of access. We now know how 2D arrays access elements, so can we manipulate this knowledge to combine the best of both worlds? Yes! The first bit of data in a 2D array stores the offset in the main data for each of the second dimensions. By default they are all the same length, but we can manipulate these pointers at a low level to point to different offsets. So instead of:
We end up with something like:
For reference, I will refer to each set of data in the second dimension as a "data set", so:
Is the first data set in the second dimension of the original array.
This is what was meant earlier by extending the abilities of PAWN - you can't have arrays with different numbers of elements in each of the second dimension data sets when using the compiler (and they're tricky without the "-d0" compiler flag to remove run-time bounds checks):
These are called "Jagged" arrays simply because that's how they look if your language supports declaring them natively (which PAWN doesn't - this is a hypothetical declaration):
Snippets
You can even use offsets if you know them:
So what happens if you do:
That will try and load the data 10000 cells after the current frame pointer, which is likely to be outside the stack. PAWN is "clever" - if you try and load an address that is not in the data, stack, or heap parts of the script, it will complain and not let you do it for security reasons. "LOAD" loads a global variable's data, "LOAD.S" loads a local variable's data, and thus both specify from which data area they should be loaded; however, there are two more OpCodes: "LREF" and "LREF.S". Like "LOAD" and "LOAD.S" the former takes a global and the latter takes a local, but instead of loading the data in that variable, they load the data at the address (relative to "DAT") stored in that variable. They too are restricted so you can't just load an address from any variable, but the CONTENTS of the variable are NOT limited, so you can put any pointer in to the variable and it will be loaded.
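A minimal sketch of that indirection, using a global whose address is stored in a local and then read ("LREF.S") and written ("SREF.S") through it. "gValue" and the local names are invented for the example.

```pawn
#include <a_samp>

new gValue = 42;

main()
{
    new pointer, result;
    // Store the address of "gValue" (relative to "DAT") in "pointer".
    #emit CONST.pri gValue
    #emit STOR.S.pri pointer
    // Load the data at the address held in "pointer".
    #emit LREF.S.pri pointer
    #emit STOR.S.pri result
    printf("read through pointer: %d", result);

    // Write through the same pointer.
    #emit CONST.pri 99
    #emit SREF.S.pri pointer
    printf("gValue after write: %d", gValue);
}
```

Nothing here checks what "pointer" contains, which is exactly the property the rest of this section relies on.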
The following will demonstrate this. The "DAT" register stores the offset from the start of the AMX to the start of the data segment, which means that the start of the AMX is at an address of "-DAT" (falling outside the data segment). The pointer to the list of public functions is 32 bytes from the start of the AMX, so has an address of "-DAT + 32" (or "32 - DAT"). Let's load this data using the fact that "LREF" doesn't check if the second address is valid. "LREF" ALWAYS takes a variable, so we get a little bit of indirection:
If you see a "STOR" followed by an "LREF", this is almost certainly what it is doing - accessing information outside the data segment. You can even access variables on the stack this way, but there are better ways. And if you know the offset from the start of the AMX to other server data, you can read and write (using "SREF") it.
In the code above the three calls to "Func1" all have a constant number of parameters at compile-time. "Func1" can take variable arguments but the PAWN compiler mandates that the number be known in advance. The single "printf" call would have 2, 3, or 4, parameters depending on where "Func1" was called from and this is not supported, but can be bypassed. Note that this is not easy:
For more information see this post by Zeex and my follow up (and read the comments above which explain what different chunks of OpCodes are doing):
http://forum.sa-mp.com/showthread.ph...718#post549718
Here "SYSREQ.C" is used to call the native function. Note however that there is a bug in the compiler which means that it will crash if you use "SYSREQ.C" with a function not already used in your code. The simplest way to avoid this is by doing:
Before your "printf" using function. This is public so that it is ALWAYS included in the final .AMX file, a stock function would not be used, thus would be removed and thus would trigger the bug. Just having the function used somewhere isn't enough, it must be in a function that is included in the final output and comes before the "SYSREQ.C" usage, but there are still other ways of doing this.
Brilliant - this will get the return address, put it on the stack, and jump to the start of the given function (always a "PROC" instruction). Except this takes a constant offset, and working them out is impossible unless you know how the compiler will lay out your functions in memory.
There is also "CALL.pri" which will jump to the absolute address stored in "pri", except it won't as it was removed for security reasons. This means the only way left to call functions is by manipulation of the "COD" pointer itself:
That will get the current instruction pointer (always points to the NEXT instruction to run), add an offset of 28 to get the instruction after the "SCTRL" line, push that value as the return address, then store the address of the first instruction of the function to "CIP". Basically, this code manually replicates what "CALL" does. It's also fairly constant, but if you change the instructions between "LCTRL" and "SCTRL" you will also need to change the offset (4 bytes per instruction and operand (counted in comments)) to get the address AFTER "SCTRL".
This will load the address of "Func1" into the "alt" register. The advantage of this is that you don't need the "pointer" register at all; the disadvantages are that this will mean changing the standard call code (and thus the offset) slightly, and that you must specify the function in advance - there's no real dynamic coding beyond maybe a "switch". You can of course easily get around the first problem:
All you need to know to understand how this code works is that data addresses in PAWN are relative to "DAT", which itself is relative to the base address of the .AMX file; and that "funcidx" returns the index in the public function list of the given function. See the section above on the .AMX format for more details. This has the advantage that you can construct the function name at run-time ("format" etc), but you need the complete name.
"idx" is used to detect multiple functions with the same name start, returning one at a time. There are also functions to get the pointer to the public function entry (the address of the name/code pointer pair) and the function name, and to get data using other name parts (some more efficiently):
That code will loop through all ZCMD commands using a 4 character prefix. You could instead use "AMX_GetPublicName" and pass "cmd_" as the last parameter, but it's less efficient.
Libraries
There are a few libraries which rely heavily on "#emit" tricks in order to work; these will be listed and detailed here where they are of interest. Note that I clearly know my own libraries better than other people's, so if the list seems a little biased, it currently is. Please link to other interesting ones.
This library allows you to find the offset from the "DAT" segment of the AMX to other parts of the SA:MP server itself. For example the mode restart time is adjustable in YSF because it is located at a known address. To modify this from PAWN requires knowing the address of the AMX itself. This is found by calling a function ("GetAmxBase"), compiled using a relative offset from the call location, but converted at run-time to use a global absolute address. "GetAmxBase" reads the data located at the address 4 bytes before its return address, i.e. reads its own address according to the call site, and subtracts what it thinks is its absolute address in the AMX, giving the globally absolute address of the AMX base. Once this is known any other offset can be trivially calculated.
Clearly the code above will not compile, but using this you can pass variable arguments through to another function without messing about with the "#emit" code seen earlier:
Here "0" is the number of static (i.e. named) parameters passed to "Func1". "va_printf" is exactly the same as "printf" but uses the parameters of "Func1" instead:
There is a custom syntax to make it clear that something unusual is going on, but the emphasis here is on clarity of execution. The main aim of this is to allow the use of "format" in a function such as "SendClientMessageFormatted".
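For the "SendClientMessageFormatted" use mentioned above, a sketch might look like the following. This is an assumption-laden example: it uses the "va_args<>" / "va_start<N>" / "va_format" spelling found in later YSI releases rather than the older "va_printf(..., 0)" form shown in this topic, and the include path depends on which YSI version you have installed.

```pawn
#include <a_samp>
// Assumption: YSI is installed; the include path differs between versions.
#include <YSI\y_va>

stock SendClientMessageFormatted(playerid, colour, const fmat[], va_args<>)
{
    new str[145];
    // "va_start<3>" says there are three named parameters before the
    // variable ones ("playerid", "colour", and "fmat").
    va_format(str, sizeof (str), fmat, va_start<3>);
    return SendClientMessage(playerid, colour, str);
}
```

It would be called like "SendClientMessageFormatted(playerid, 0xFFFFFFFF, "Score: %d", 42);". Check the y_va release you are using before relying on these exact names.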
Introduction
Further to the disclaimer above:
- If you are writing a gamemode you probably don't need this.
- If you are writing a filterscript you probably don't need this.
- If you are writing a library you probably don't need this.
- If you are looking to optimise your mode this is not the way to go.
Basics
Code:
-a
AMX Format
Understanding the AMX file format is crucial to understanding most "#emit" use. Global and static data is stored in a segment called "data" ("DAT"), function code is stored in the "code" segment ("COD"), locals in the "stack" ("STP" and "STK") and some local arrays in the "heap" ("HEA"). Note that the heap and the stack are in the same block of memory but start at opposite ends, so it is possible for them to overwrite each other (in which case you will get a stack/heap collision error at run-time). This memory is allocated by the server when a mode starts and is not stored in the .AMX file (as "COD" and "DAT" are). The size is controlled using:
Code:
#pragma dynamic <size>
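To get a feel for how much room the shared stack/heap block has left, something like this sketch can be used. It assumes the "heapspace" native from the core include is available in your setup, and the size chosen here is arbitrary.

```pawn
#include <a_samp>

// Arbitrary example size for the shared stack/heap block.
#pragma dynamic 8192

main()
{
    // "heapspace" reports the free space left between the heap and the
    // stack at the point of the call.
    printf("Free stack/heap space: %d", heapspace());
}
```

Calling it from deep inside nested functions, or after declaring large local arrays, should show the free space shrinking.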
- Structure
Code:
4: Size
2: Magic
1: File version (8)
1: Minimum VM version
2: Flags (execution options)
2: Defsize (bytes to store a named function's data)
4: COD offset from the start of the AMX file
4: DAT offset
4: HEA offset
4: STP offset
4: CIP (Current Instruction Pointer), address of "main" or -1.
4: Public functions list pointer
4: Native functions list pointer
4: Libraries list pointer
4: Public variables list pointer
4: Public tags list pointer
4: Pointer to names table
X: Publics list
X: Natives list
X: Library list
X: Pubvar list
X: Tags list
X: Names table
X: COD
X: DAT
X: STP/HEA
Code:
4: Address of the function/variable (or other value)
4: Pointer to the name
Code:
Hello There ; Original string
Hello There ; C storage order
H e l l o T h e r e ; PAWN storage order
lleHhT o ere ; Packed string storage order
If you wish to explore the AMX file, add this to the top of a mode, compile it and open the .AMX file in a HEX editor such as "XVI":
Code:
#pragma compress 0
OpCodes
OpCodes are used to write the basic P-code instructions in to PAWN. There is a long section documenting them all and their uses in the pawn-imp document, so they will be skimmed over very quickly here. They don't CHECK parameters and some won't do what you expect. They take 0 or 1 parameters (some take more, but those aren't supported), or specify their operands in their name:
Code:
new a = 5, b = 6, c;
// Load the data from "a" in to the primary register.
#emit LOAD.S.pri a
// Load the data from "b" in to the alternate register.
#emit LOAD.S.alt b
// Add the numbers and store the result in the primary register.
#emit ADD
// Copy the result to "c".
#emit STOR.S.pri c
In the code above, the result will be "11". If we assume that "a", "b", and "c" are the only local variables in the current function, the code below will give a result of "12":
Code:
new a = 5, b = 6, c;
// Load the local address of "a" in to the primary register.
#emit CONST.pri a
// Load the local address of "b" in to the alternate register.
#emit CONST.alt b
// Add the numbers and store the result in the primary register.
#emit ADD
// Copy the result to "c".
#emit STOR.S.pri c
This code will print "5":
Code:
new a;
#emit CONST.pri 5
#emit STOR.S.pri a
printf("%d", a);
Code:
new a;
#emit CONST.pri 5
#emit NEG
#emit STOR.S.pri a
printf("%d", a);
Code:
new a;
#emit CONST.pri 5
#emit NEG 85
#emit STOR.S.pri a
printf("%d", a);
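To tie the operand suffixes together, here is a small sketch mixing a global load, a stack-relative load, and a constant in one calculation. "gGlobal" and the local names are made up for the example.

```pawn
#include <a_samp>

new gGlobal = 10;

main()
{
    new local = 20, result;
    // No ".S": load the value of a global (data segment) variable.
    #emit LOAD.pri gGlobal
    // ".S": load the value of a local, addressed relative to the frame.
    #emit LOAD.S.alt local
    // pri = pri + alt
    #emit ADD
    // ".C": the operand is a constant, not an address.
    #emit ADD.C 5
    #emit STOR.S.pri result
    printf("%d", result);
}
```

Swapping "LOAD.pri" and "LOAD.S.pri" between the two variables will still compile, which is a cheap way to see for yourself that the compiler does not check these operands.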
Stack
- Calling Convention
Code:
Func(a, b)
{
}

main()
{
    Func(2, 3);
}
Code:
CODE 0	; 0
;program exit point
    halt 0

    proc	; Func
    ; line 2
    ;$lcl b 10
    ;$lcl a c
    zero.pri
    retn

    proc	; main
    ; line 6
    ; line 7
    push.c 3
    ;$par
    push.c 2
    ;$par
    push.c 8
    call Func
    ;$exp
    zero.pri
    retn

STKSIZE 1000
Code:
0     ; Bytes passed to "main"
0     ; "main" return address
0     ; Previous frame
3     ; b
2     ; a
8     ; Bytes passed to "Func"
236   ; "Func" return address
16400 ; Previous frame
      ; Current frame
Code:
#include <a_samp>

Func(a, b)
{
    for (new i = 0, j; i != 40; i += 4)
    {
        #emit LOAD.S.alt i
        #emit LCTRL 5
        #emit ADD
        #emit STOR.S.pri j
        #emit LREF.S.pri j
        #emit STOR.S.pri j
        printf("%d = %d", i, j);
    }
}

main()
{
    Func(2, 3);
}
- Locals
You can reference variables by name:
Code:
#emit LOAD.S.pri a
Code:
#emit LOAD.S.pri 12
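Relative addressing also reaches things that have no name at all. As one possible take on the "numargs" exercise, the sketch below reads the "bytes passed" slot at offset 8 from the frame (see the stack dump above) and converts it to a count. "MyNumArgs" is an invented name, and 4-byte cells are assumed.

```pawn
#include <a_samp>

stock MyNumArgs(...)
{
    new count;
    // [FRM + 8] holds the number of BYTES passed to this function.
    #emit LOAD.S.pri 8
    // Divide by 4 (shift right by 2) to convert bytes to cells.
    #emit SHR.C.pri 2
    #emit STOR.S.pri count
    return count;
}

main()
{
    printf("%d", MyNumArgs(10, 20, 30));
}
```

The first "#emit" line is the whole trick; the shift and the store are only there to turn the byte count in to something you can return.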
- Stack and frame pointer
- Pass-by-value, pass-by-reference, and varargs
Code:
Func(a)
{
    a = 5;
    // a is 5
}

main()
{
    new v = 6;
    // v is 6
    Func(v);
    // v is 6
}
Code:
Func(&a)
{
    a = 4;
    // a is 4
}

main()
{
    new v = 6;
    // v is 6
    Func(v);
    // v is 4
}
Code:
Func(...)
{
    setarg(0, 3);
}

main()
{
    new v = 6;
    // v is 6
    Func(v);
    // v is 3
}
Code:
#emit PUSH.S a
Code:
#emit CONST.pri 5
#emit STOR.S.pri
Code:
#emit PUSH.addr a
Code:
#emit CONST.pri 5
#emit SREF.S.pri
Here are two scripts for you to compile and look at. If you don't learn something about the way the PAWN compiler stores strings, you've done it wrong (note that arrays are ALWAYS passed by reference, as are varargs - look at how "printf" gets "65"):
Code:
#include <a_samp>

stock Func(a, b, c)
{
    printf("%d %d %d", a, b, c);
}

main()
{
    new x[10], v = 7;
    Func(v, 65, x[0]);
    Func(0x65, x[v], x[v + 1]);
    printf("%d %d %d", 65, v, x[0]);
    printf("%d %d %d", 0x65, x[v], x[v + 1]);
}
Code:
#include <a_samp>

stock const cg_@d_@d_@d[10] = "%d %d %d";

stock Func(&a, b, c)
{
    printf(cg_@d_@d_@d, a, b, c);
}

main()
{
    new x[10], v = 7;
    Func(v, 65, x[0]);
    Func(0x65, x[v], x[v + 1]);
    printf(cg_@d_@d_@d, 65, v, x[0]);
    printf(cg_@d_@d_@d, 0x65, x[v], x[v + 1]);
}
- Size
Code:
#pragma dynamic 65536
Compile and study the following mode:
Code:
#include <a_samp>

new gSingle[30] = {42, ...}, gDouble[10][20];

main()
{
    new v = 2;
    printf("%d", gSingle[v]);
    printf("%d", gDouble[v][v]);
}
Code:
dump 28 50 78 a0 c8 f0 118 140 168 190
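To make the role of that table concrete, here is a hedged sketch that performs a 2D lookup by hand, following the same steps the compiler generates (minus the bounds checks). It assumes each slot stores an offset that is added to the slot's own address to reach the row; the stored test value and local names are invented.

```pawn
#include <a_samp>

new gDouble[10][20];

main()
{
    new v = 2, value;
    gDouble[2][2] = 1234;

    #emit CONST.alt gDouble
    #emit LOAD.S.pri v
    // pri = address of slot "v" in the offset table.
    #emit IDXADDR
    #emit MOVE.alt
    // pri = offset stored in that slot.
    #emit LOAD.I
    // pri = slot address + offset = start of row "v".
    #emit ADD
    #emit MOVE.alt
    #emit LOAD.S.pri v
    // pri = address of element "v" within the row.
    #emit IDXADDR
    #emit LOAD.I
    #emit STOR.S.pri value
    // "value" should now hold the same data as gDouble[v][v].
    printf("%d", value);
}
```

Compare this against the "-a" output for the second "printf" in the mode above; the only differences should be the "BOUNDS" instructions.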
- Jagged arrays
Code:
#include <a_samp>

new gStrings[3][4] =
{
    {'H', 'e', 'l', 'l'},
    {'o', '\0', 'H', 'i'},
    {'\0', 'Y', 'o', '\0'}
};

main()
{
    printf("%s", gStrings[0][0]);
    printf("%s", gStrings[1][2]);
    printf("%s", gStrings[2][1]);
}
OK, so the above code stores 12 characters in 12 cells (we'll ignore packed strings for now). The following code stores them in 18 cells, but makes accessing them easier:
Code:
#include <a_samp>

new gStrings[3][6] =
{
    {'H', 'e', 'l', 'l', 'o', '\0'},
    {'H', 'i', '\0', '\0', '\0', '\0'},
    {'Y', 'o', '\0', '\0', '\0', '\0'}
};

main()
{
    printf("%s", gStrings[0]);
    printf("%s", gStrings[1]);
    printf("%s", gStrings[2]);
}
Code:
new gStrings[3][4] =
{
    {'H', 'e', 'l', 'l'},
    {'o', '\0', 'H', 'i'},
    {'\0', 'Y', 'o', '\0'}
};
Code:
new gStrings[3][6, 3, 3] =
{
    {'H', 'e', 'l', 'l', 'o', '\0'},
    {'H', 'i', '\0'},
    {'Y', 'o', '\0'}
};
Code:
{'H', 'e', 'l', 'l'}
This is what was meant earlier by extending the abilities of PAWN - you can't have arrays with different numbers of elements in each of the second dimension data sets when using the compiler (and they're tricky without the "-d0" compiler flag to remove run-time bounds checks):
Code:
#include <a_samp>

new gStrings[3][4] =
{
    {'H', 'e', 'l', 'l'},
    {'o', '\0', 'H', 'i'},
    {'\0', 'Y', 'o', '\0'}
};

main()
{
    // Get the address of the start of the array.
    #emit CONST.pri gStrings
    // Get the address of the pointer to the second data set.
    #emit ADD.C 4
    #emit MOVE.alt
    // Load the data there.
    #emit LOAD.I
    // Add 2 cells and store the result back.
    #emit ADD.C 8
    #emit STOR.I
    // Get the third pointer.
    #emit CONST.pri 4
    #emit ADD
    #emit MOVE.alt
    #emit LOAD.I
    // Add one cell.
    #emit ADD.C 4
    #emit STOR.I
    printf("%s", gStrings[0]);
    printf("%s", gStrings[1]);
    printf("%s", gStrings[2]);
}
Code:
new gStrings[3][6, 3, 3] =
{
    {'H', 'e', 'l', 'l', 'o', '\0'},
    {'H', 'i', '\0'},
    {'Y', 'o', '\0'}
};
- Loading Data
Code:
new local = 42;
#emit LOAD.S.pri local
Code:
new local = 42;
#emit LOAD.S.pri 4
Code:
new local = 42;
#emit LOAD.S.pri 40000
The following will demonstrate this. The "DAT" register stores the offset from the start of the AMX to the start of the data segment, which means that the start of the AMX is at an address of "-DAT" (falling outside the data segment). The pointer to the list of public functions is 32 bytes from the start of the AMX, so has an address of "-DAT + 32" (or "32 - DAT"). Let's load this data using the fact that "LREF" doesn't check if the second address is valid. "LREF" ALWAYS takes a variable, so we get a little bit of indirection:
Code:
new pointer;
#emit LCTRL 1
#emit NEG
#emit ADD.C 32
#emit STOR.S.pri pointer
#emit LREF.S.pri pointer
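The same trick generalises to any header field. The sketch below wraps it in a helper that takes a byte offset from the start of the AMX; "ReadAmxHeaderCell" is a hypothetical name, not part of any library, and the offsets used in "main" are taken from the structure listing earlier in this topic.

```pawn
#include <a_samp>

stock ReadAmxHeaderCell(offset)
{
    new pointer;
    // pri = DAT, the offset of the data segment from the AMX start.
    #emit LCTRL 1
    // pri = -DAT, the address of the AMX start relative to "DAT".
    #emit NEG
    #emit LOAD.S.alt offset
    // pri = offset - DAT
    #emit ADD
    #emit STOR.S.pri pointer
    // Load through "pointer"; the contents are not range checked.
    #emit LREF.S.pri pointer
    #emit STOR.S.pri pointer
    return pointer;
}

main()
{
    printf("COD offset: %d", ReadAmxHeaderCell(12));
    printf("DAT offset: %d", ReadAmxHeaderCell(16));
    printf("Publics list: %d", ReadAmxHeaderCell(32));
}
```

Checking the printed values against the same fields in a HEX editor (with "#pragma compress 0") is a reasonable sanity test before building anything on top of this.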
- Variable Arguments
Code:
Func1(...)
{
    // No idea how many parameters will be here as it's not a constant.
    printf(...);
}

main()
{
    // This is a constant - 2.
    Func1("%d", 6);
    // This is a constant - 3.
    Func1("%d %d", 6, 7);
    // This is a constant - 4.
    Func1("%d %d %d", 6, 7, 8);
}
Code:
stock Func1(...)
{
    // This is the number of parameters which are not variable that are passed
    // to this function (i.e. the number of named parameters).
    static const STATIC_ARGS = 0;
    // Get the number of variable arguments.
    new n = (numargs() - STATIC_ARGS) * BYTES_PER_CELL;
    if (n)
    {
        new arg_start, arg_end;
        // Load the real address of the last static parameter. Do this by
        // loading the address of the last known static parameter and then
        // adding the value of [FRM]. Because there are no static parameters
        // we have to use the number "8" as the stop point address.
        #emit CONST.alt 8
        #emit LCTRL 5
        #emit ADD
        #emit STOR.S.pri arg_start
        // Load the address of the last variable parameter. Do this by adding
        // the number of variable parameters on the value just loaded.
        #emit LOAD.S.alt n
        #emit ADD
        #emit STOR.S.pri arg_end
        // Push the variable arguments. This is done by loading the value of
        // each one in reverse order and pushing them. I'd love to be able to
        // rewrite this to use the values of pri and alt for comparison,
        // instead of having to constantly load and reload two variables.
        do
        {
            #emit LOAD.I
            #emit PUSH.pri
            arg_end -= BYTES_PER_CELL;
            #emit LOAD.S.pri arg_end
        }
        while (arg_end > arg_start);
        // Push the static parameters (none here).
        // Now push the number of arguments passed to format, including both
        // static and variable ones and call the function.
        n += BYTES_PER_CELL * STATIC_ARGS;
        #emit PUSH.S n
        #emit SYSREQ.C printf
        // Remove all data, including the return value, from the stack.
        n += BYTES_PER_CELL;
        #emit LCTRL 4
        #emit LOAD.S.alt n
        #emit ADD
        #emit SCTRL 4
    }
}
Here "SYSREQ.C" is used to call the native function. Note however that there is a bug in the compiler which means that it will crash if you use "SYSREQ.C" with a function not already used in your code. The simplest way to avoid this is by doing:
Code:
forward _Func1_SYSREQ();

public _Func1_SYSREQ()
{
    printf("");
}
- Calling Functions
Code:
#emit CALL Func1
There is also "CALL.pri" which will jump to the absolute address stored in "pri", except it won't as it was removed for security reasons. This means the only way left to call functions is by manipulation of the "COD" pointer itself:
Code:
#emit LCTRL 6
#emit ADD.C 28           // 2
#emit PUSH.pri           // 1
#emit LOAD.S.pri pointer // 2
#emit SCTRL 6            // 2
- Finding Functions
- 1) Constants
Code:
#emit CONST.alt Func1
Code:
#emit CONST.pri Func1
#emit STOR.S.pri pointer
- 2) funcidx
Code:
new idx = funcidx("Func1"), pointer;
// Get the pointer to the public function list.
#emit LCTRL 1
#emit NEG
#emit ADD.C 32
#emit STOR.S.pri pointer
#emit LREF.S.alt pointer
// Get the pointer to the function at the given index.
#emit LCTRL 1
#emit NEG
#emit ADD
#emit LOAD.S.alt idx
#emit SHL.C.alt 3
#emit ADD
#emit STOR.S.pri pointer
#emit LREF.S.pri pointer
#emit STOR.S.pri pointer
// Call the function
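Since "funcidx" only knows about public functions, it is worth checking its result before doing any pointer arithmetic with it. A minimal sketch, assuming the documented behaviour of returning -1 for a name that is not in the public list ("NoSuchFunction" is deliberately not defined):

```pawn
#include <a_samp>

forward Func1();
public Func1()
{
    print("Func1 called");
}

main()
{
    new idx = funcidx("Func1");
    if (idx == -1)
    {
        // Not in the public function list: do NOT use the index.
        print("Func1 is not public");
    }
    else
    {
        printf("Func1 is entry %d in the public function list", idx);
    }
    // A name that does not exist should give -1.
    printf("%d", funcidx("NoSuchFunction"));
}
```

Feeding -1 in to the lookup code above would read 8 bytes before the public list, so the guard is cheap insurance.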
- 3) Searching
- 3) y_amx
Code:
new idx, pointer;
while ((idx = AMX_GetPublicPointer(idx, pointer, "Func1")))
{
    // Call the function.
}
Code:
new idx, buffer[32];
while ((idx = AMX_GetPublicNamePrefix(idx, buffer, _A<cmd_>)))
{
    // Call the function.
}
Libraries
There are a few libraries which rely heavily on "#emit" tricks in order to work; these will be listed and detailed here where they are of interest. Note that I clearly know my own libraries better than other people's, so if the list seems a little biased, it currently is. Please link to other interesting ones.
- y_inline - ******
- pointers - Slice
- y_amx - ******
- phys_memory - Zeex_
This library allows you to find the offset from the "DAT" segment of the AMX to other parts of the SA:MP server itself. For example the mode restart time is adjustable in YSF because it is located at a known address. To modify this from PAWN requires knowing the address of the AMX itself. This is found by calling a function ("GetAmxBase"), compiled using a relative offset from the call location, but converted at run-time to use a global absolute address. "GetAmxBase" reads the data located at the address 4 bytes before its return address, i.e. reads its own address according to the call site, and subtracts what it thinks is its absolute address in the AMX, giving the globally absolute address of the AMX base. Once this is known any other offset can be trivially calculated.
- y_va - ******
Code:
Func1(a, b, c)
{
    y_va_func();
}

y_va_func()
{
    // Push the parameters of the function that CALLED y_va_func, not the
    // parameters to this function itself (there are none).
    Func3(a, b, c);
}

Func3(...)
{
}
Code:
Func1(...)
{
    va_printf("%d %d %d", 0);
}
Code:
Func1(42, 43, 44);