[Tutorial] #emit
#1

Disclaimer

This is a VERY advanced PAWN topic. If you don't know what to use #emit for, you don't need it, simple as that! The compiler will NOT detect mistakes, and it is VERY easy to break your entire mode with it. If you want a basic introduction to some of the concepts here, read kizla's tutorial first:

Abstract machine and emit

Introduction

I decided to start a topic for discussing this PAWN element. It is partially a tutorial, but it is not designed to tell you exactly how to use "#emit". There has been a slow, self-reinforcing increase in the development of code using "#emit", so this topic collects some of the techniques already found and is a place to discuss future ones. Further to the disclaimer above:
  • If you are writing a gamemode you probably don't need this.
  • If you are writing a filterscript you probably don't need this.
  • If you are writing a library you probably don't need this.
  • If you are looking to optimise your mode this is not the way to go.
This is generally used to extend the abilities of PAWN itself, not to write things in PAWN. Given the advanced nature of this tool, this topic is also very advanced, so save yourself the headache - I do not explain what a stack is, what a heap is, what OpCodes are or what registers are, because these are all well documented elsewhere (but I do cover the PAWN specifics of each). Programming in PAWN is very easy and you don't need to know anything about the underlying architecture. To program using #emit you need to understand how the virtual machine itself works (the more the better).

Basics

PAWN is open source, the source code for the virtual machine which executes P-code (what PAWN compiles to and what is stored in .AMX files) can be found here:

http://code.******.com/p/pawnscript/...%2Ftrunk%2Famx

The source code for the 4.0 PAWN compiler can be found here (note that SA:MP uses a custom 3.6 compiler):

http://code.******.com/p/pawnscript/...unk%2Fcompiler

The general project is here:

http://www.compuphase.com/pawn/pawn.htm

http://code.******.com/p/pawnscript/

These are all important resources. (At least) three important techniques used below were discovered by studying and understanding how the machine works at a low level and exploiting this knowledge.

The Pawn_Implementer_Guide.pdf (pawn-imp.pdf) can be found here:

http://www.compuphase.com/pawn/Pawn_...nter_Guide.pdf

This is possibly the most important of the resources as it documents all the OpCodes, the .AMX file format, the function call convention, the stack layout, and everything else you need to know.

Code:
-a
Know this compiler switch! Instead of a .AMX file it generates a .ASM file for your mode, which can be "easily" read to see what assembly is generated for different PAWN code. There is also "-l", which only runs the pre-processor, but that's less useful here.

Other concepts will be gone over VERY quickly. If you don't understand something, look it up online - the x86 processor is a stack/register machine (as are MANY other architectures), so there's plenty of general documentation on the concept.

Registers

There are two temporary registers in which most calculations are done: "pri" (PRImary) and "alt" (ALTernate). The former does most of the work and has the most operations available. There are also registers storing execution state, such as the next instruction and the current stack pointer. These are read with "LCTRL" and written with "SCTRL".
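As a rough mental model, here is a Python sketch of the register set (Python only so it can be run off-server; the class and method names are made up for illustration and are not part of any PAWN tool). The index numbers are the ones pawn-imp documents for "LCTRL" and "SCTRL"; the register values are placeholders.

```python
# Toy model of the AMX registers; NOT the real VM, just the indexing scheme.
CONTROL_REGISTERS = {0: "COD", 1: "DAT", 2: "HEA", 3: "STP", 4: "STK", 5: "FRM", 6: "CIP"}
WRITABLE = {2, 4, 5, 6}  # SCTRL can only set HEA, STK, FRM and CIP


class ToyRegisters:
    def __init__(self, **values):
        self.pri = 0
        self.alt = 0
        # Placeholder values; a real script gets these from the loaded AMX.
        self.control = {name: values.get(name, 0) for name in CONTROL_REGISTERS.values()}

    def lctrl(self, index):
        """LCTRL index: copy a control register in to pri."""
        self.pri = self.control[CONTROL_REGISTERS[index]]

    def sctrl(self, index):
        """SCTRL index: copy pri in to a writable control register."""
        if index not in WRITABLE:
            # The real VM silently ignores these; the toy raises to make
            # mistakes visible.
            raise ValueError("register %d is read-only" % index)
        self.control[CONTROL_REGISTERS[index]] = self.pri


regs = ToyRegisters(DAT=0x1000, FRM=0x4000)
regs.lctrl(5)          # like "#emit LCTRL 5": pri = FRM
frame = regs.pri
```

The snippets later in this topic only ever use indices 1 ("DAT"), 4 ("STK"), 5 ("FRM") and 6 ("CIP").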

AMX Format

Understanding the AMX file format is crucial to understanding most "#emit" use. Global and static data is stored in a segment called "data" ("DAT"), function code is stored in the "code" segment ("COD"), locals in the "stack" ("STP" and "STK") and some local arrays in the "heap" ("HEA"). Note that the heap and the stack are in the same block of memory but start at opposite ends, so it is possible for them to overwrite each other (in which case you will get a stack/heap collision error at run-time). This memory is allocated by the server when a mode starts and is not stored in the .AMX file (as "COD" and "DAT" are). Its size is controlled using:

Code:
#pragma dynamic <size>
  • Structure
The AMX structure in memory (i.e. including "STP" and "HEA"), with field sizes in bytes. See pawn-imp for more details. Segments marked "X" are variable length depending on the amount of data.

Code:
4: Size
2: Magic
1: File version (8)
1: Minimum VM version
2: Flags (execution options)
2: Defsize (bytes to store a named function's data)
4: COD offset from the start of the AMX file
4: DAT offset
4: HEA offset
4: STP offset
4: CIP (Current Instruction Pointer), address of "main" or -1.
4: Public functions list pointer
4: Native functions list pointer
4: Libraries list pointer
4: Public variables list pointer
4: Public tags list pointer
4: Pointer to names table
X: Publics list
X: Natives list
X: Library list
X: Pubvar list
X: Tags list
X: Names table
X: COD
X: DAT
X: STP/HEA
An entry in one of the lists looks like:

Code:
4: Address of the function/variable (or other value)
4: Pointer to the name
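To make the layout concrete, here is a hedged Python sketch that reads the fixed header and the public function list from an uncompressed AMX image. The field order and sizes are copied from the table above; the function and field names are my own labels, and the entry count is derived from the gap between the publics and natives lists, which assumes the lists are stored back to back as shown.

```python
import struct

# size, magic, file version, minimum VM version, flags, defsize,
# then eleven 4-byte offsets (COD ... names table). 56 bytes in total.
HEADER_FORMAT = "<iHbbhh11i"
HEADER_FIELDS = ("size", "magic", "file_version", "amx_version", "flags",
                 "defsize", "cod", "dat", "hea", "stp", "cip", "publics",
                 "natives", "libraries", "pubvars", "tags", "nametable")


def parse_header(image):
    """Return the fixed-size header of an uncompressed AMX image as a dict."""
    size = struct.calcsize(HEADER_FORMAT)
    return dict(zip(HEADER_FIELDS, struct.unpack(HEADER_FORMAT, image[:size])))


def read_publics(image):
    """Yield (address, name) for every entry in the public functions list."""
    header = parse_header(image)
    count = (header["natives"] - header["publics"]) // header["defsize"]
    for i in range(count):
        entry = header["publics"] + i * header["defsize"]
        address, name_offset = struct.unpack_from("<II", image, entry)
        end = image.index(b"\0", name_offset)   # names are C strings
        yield address, image[name_offset:end].decode("ascii")
```

Note that the "publics" field sits 32 bytes from the start of the image, a number that comes up again in the snippets below.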
The name table is just a collection of strings in C order. This is different to PAWN packed strings which are stored in cells in little endian:

Code:
Hello There ; Original string
Hello There ; C storage order
H   e   l   l   o       T   h   e   r   e    ; PAWN storage order
lleHhT o ere ; Packed string storage order
The strings in this table are NULL delimited, and packed to use as little memory as possible (see the later section on "jagged arrays" for an idea of what this looks like).
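The storage orders above can be reproduced with a short Python sketch (the function names are hypothetical; the only assumptions are 32-bit cells on a little-endian machine, as in SA:MP). A packed string puts the first character in the most significant byte of each cell, which is why the bytes look reversed in groups of four in a HEX editor.

```python
import struct


def unpacked_cells(text):
    """One character per 32-bit cell, plus the terminating zero cell."""
    return [ord(c) for c in text] + [0]


def packed_bytes(text):
    """Bytes of a packed string as they sit in a little-endian AMX image."""
    raw = text.encode("ascii") + b"\0"
    raw += b"\0" * (-len(raw) % 4)          # pad to a whole number of cells
    out = b""
    for i in range(0, len(raw), 4):
        # The first character goes in the most significant byte of the cell...
        cell = int.from_bytes(raw[i:i + 4], "big")
        # ...and the cell is then written least significant byte first.
        out += struct.pack("<I", cell)
    return out
```

For "Hello There" this gives the 12 bytes "lleH", "hT o", then a NUL followed by "ere", matching the last line of the listing (where the NUL shows up as a gap).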

Constant arrays (especially string constants) are all stored in the "DAT" segment (there are options to move them, but they only matter when ROM is available, so they're irrelevant in SA:MP).

If you wish to explore the AMX file, add this to the top of a mode, compile it and open the .AMX file in a HEX editor such as "XVI":

Code:
#pragma compress 0
This pragma disables the compression normally used to make the AMX smaller, so the output conforms exactly to the structure posted above.

OpCodes

OpCodes are used to write basic P-code instructions directly in PAWN. There is a long section documenting them all and their uses in the pawn-imp document, so they are only skimmed over here. With "#emit" they take 0 or 1 parameters (some OpCodes take more, but those aren't supported), or specify their operands in their name. Parameters are NOT checked, and some won't do what you expect:

Code:
new
    a = 5,
    b = 6,
    c;
// Load the data from "a" in to the primary register.
#emit LOAD.S.pri a
// Load the data from "b" in to the alternate register.
#emit LOAD.S.alt b
// Add the numbers and store the result in the primary register.
#emit ADD
// Copy the result to "c".
#emit STOR.S.pri c
".S" means "stack", ".C" means "constant", ".I" means "indirection", ".B" means an alternate implementation, ".ADR" means "address", ".pri" means "primary register", ".alt" means "alternate register".

In the code above, the result will be "11". If we assume that "a", "b", and "c" are the only local variables in the current function, the code below will give a result of "-12" (the sum of the frame offsets "-4" and "-8", not of the values):

Code:
new
    a = 5,
    b = 6,
    c;
// Load the local address of "a" in to the primary register.
#emit CONST.pri  a
// Load the local address of "b" in to the alternate register.
#emit CONST.alt  b
// Add the numbers and store the result in the primary register.
#emit ADD
// Copy the result to "c".
#emit STOR.S.pri c
The addresses of variables will be explained later, but here the wrong data is loaded and the compiler will tell you nothing.
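The difference between the two snippets can be modelled in a few lines of Python. This is a sketch under the assumption that "a", "b" and "c" are the only locals and so sit at frame offsets -4, -8 and -12 (see the "Locals" section below); the dictionary stands in for the stack frame and is not how the VM stores anything.

```python
# Frame-relative offsets for three locals declared in this order.
OFFSETS = {"a": -4, "b": -8, "c": -12}


def run(load_opcode):
    frame = {-4: 5, -8: 6, -12: 0}        # a = 5, b = 6, c = 0
    if load_opcode == "LOAD.S":
        pri = frame[OFFSETS["a"]]         # LOAD.S.pri a: the VALUE of a
        alt = frame[OFFSETS["b"]]         # LOAD.S.alt b: the VALUE of b
    elif load_opcode == "CONST":
        pri = OFFSETS["a"]                # CONST.pri a: the OFFSET of a
        alt = OFFSETS["b"]                # CONST.alt b: the OFFSET of b
    else:
        raise ValueError(load_opcode)
    pri = pri + alt                       # ADD
    frame[OFFSETS["c"]] = pri             # STOR.S.pri c
    return frame[OFFSETS["c"]]
```

"run("LOAD.S")" adds the values; "run("CONST")" adds the offsets, which is exactly the kind of mistake the compiler will not warn you about.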

This code will print "5":

Code:
new
    a;
#emit CONST.pri  5
#emit STOR.S.pri a
printf("%d", a);
This code will print "-5":

Code:
new
    a;
#emit CONST.pri  5
#emit NEG
#emit STOR.S.pri a
printf("%d", a);
This code will compile but fail to run. I'm not sure why, as the .AMX doesn't actually contain the extra "85" (which, by the way, is the OpCode number for "NEG"):

Code:
new
    a;
#emit CONST.pri  5
#emit NEG        85
#emit STOR.S.pri a
printf("%d", a);
Often it is possible to shuffle the order in which specific opcodes are run to minimise temporary variable or stack usage and maximise the use of "pri" and "alt", but not always.

Stack
  • Calling Convention
The PAWN stack grows down. Parameters are pushed in reverse order, putting them in order when read up through memory.

Code:
Func(a, b)
{
}

main()
{
    Func(2, 3);
}
Compiles as:

Code:
CODE 0 ; 0
;program exit point
    halt 0

    proc  ; Func
    ; line 2
    ;$lcl b 10
    ;$lcl a c
    zero.pri
    retn

    proc  ; main
    ; line 6
    ; line 7
    push.c 3
    ;$par
    push.c 2
    ;$par
    push.c 8
    call Func
    ;$exp
    zero.pri
    retn


STKSIZE 1000
After calling "Func" the stack looks like:

Code:
0 ; Bytes passed to "main"
0 ; "main" return address
0 ; Previous frame
3 ; b
2 ; a
8 ; Bytes passed to "Func"
236 ; "Func" return address
16400 ; Previous frame
; Current frame
Extra points if you spotted that the return address from "Func" doesn't match the code I posted. It is actually the return address for the code below, which was used to dump the stack (except "i" and "j") and a little beyond:

Code:
#include <a_samp>

Func(a, b)
{
    for (new i = 0, j; i != 40; i += 4)
    {
        #emit LOAD.S.alt i
        #emit LCTRL 5
        #emit ADD
        #emit STOR.S.pri j
        #emit LREF.S.pri j
        #emit STOR.S.pri j
        printf("%d = %d", i, j);
    }
}

main()
{
    Func(2, 3);
}
  • Locals
Local variables are located BELOW the current frame pointer, function parameters are ABOVE the current frame pointer (which is modified by the "PROC" instruction on function entry). In the code above "i" has a relative address of "-4" ("0xFFFFFFFC"), "j" has a relative address of "-8" ("0xFFFFFFF8"), "a" has a relative address of "12", and "b" has a relative address of "16".

You can reference variables by name:

Code:
#emit LOAD.S.pri a
Or by relative address:

Code:
#emit LOAD.S.pri 12
Things like the parameters passed to a function do not have a variable name, but they DO have a stack location, so you can still get at them. With what I've just said, you should now be able to write "numargs" in one line of assembly (two if you want to save the value to a variable instead of "pri" or "alt").
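The frame layout described above can be written down as a small Python model. The helper names are hypothetical and the model assumes 4-byte cells and single-cell locals only; the sample values are the ones from the stack dump earlier in this section.

```python
CELL = 4


def parameter_offset(index):
    """Offset from FRM of the index-th parameter (0-based)."""
    return 12 + index * CELL


def local_offset(index):
    """Offset from FRM of the index-th single-cell local (0-based)."""
    return -(index + 1) * CELL


def build_frame(args, return_address, previous_frame):
    """Map FRM-relative offsets to values for a freshly entered function."""
    frame = {0: previous_frame, 4: return_address, 8: len(args) * CELL}
    for i, value in enumerate(args):
        frame[parameter_offset(i)] = value
    return frame


frame = build_frame([2, 3], return_address=236, previous_frame=16400)
argument_count = frame[8] // CELL    # the cell at FRM + 8 holds the byte count
```

The cell at offset 8 is the "bytes passed" value pushed by the caller, which is all "numargs" needs.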
  • Stack and frame pointer
The frame pointer ("FRM") points to the address at the bottom of the current function's header (the data pushed when the function is called, aka the frame). The stack pointer points to the last data put on to the stack (including locals), so in the second chunk of code above the stack pointer will be 8 bytes below the frame pointer, at "j". If a new variable were declared it would be put below the stack pointer and the stack pointer would be moved down 4 bytes (called a "PUSH"). In the same way, when a variable goes out of scope, it is "POPPED" and the stack pointer is moved back up 4 bytes (or more for many/large variables).
  • Pass-by-value, pass-by-reference, and varargs
Code:
Func(a)
{
    a = 5;
    // a is 5
}

main()
{
    new
        v = 6;
    // v is 6
    Func(v);
    // v is 6
}
Code:
Func(&a)
{
    a = 4;
    // a is 4
}

main()
{
    new
        v = 6;
    // v is 6
    Func(v);
    // v is 4
}
Code:
Func(...)
{
    setarg(0, 3);
}

main()
{
    new
        v = 6;
    // v is 6
    Func(v);
    // v is 3
}
The first version creates a COPY of the variable by pushing its value to the stack in "main":

Code:
#emit PUSH.S v
"Func" then writes only to its own copy:

Code:
#emit CONST.pri 5
#emit STOR.S.pri a
The second version modifies the original by pushing the address to the stack in "main":

Code:
#emit PUSH.ADR v
"Func" then writes through that address:

Code:
#emit CONST.pri 4
#emit SREF.S.pri a
The third version is essentially the same as the second, but you can't reference the variables by name.
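The three cases can be mimicked in Python with a dictionary standing in for memory. This is only a model: the address 100 for "v" is invented, and the comments name the OpCodes the compiler would use at each step.

```python
memory = {}            # address -> value; a stand-in for the data/stack area


def func_by_value(a):
    a = 5                          # STOR.S.pri a: writes the callee's own copy
    return a


def func_by_reference(a):
    memory[a] = 4                  # SREF.S.pri a: writes through the address


def func_varargs(*args):
    memory[args[0]] = 3            # setarg(0, 3): varargs hold addresses too


V_ADDRESS = 100        # hypothetical address of "v" in main
memory[V_ADDRESS] = 6
func_by_value(memory[V_ADDRESS])       # PUSH.S v: push the VALUE
after_value = memory[V_ADDRESS]
func_by_reference(V_ADDRESS)           # PUSH.ADR v: push the ADDRESS
after_reference = memory[V_ADDRESS]
func_varargs(V_ADDRESS)                # varargs are pushed by address as well
after_varargs = memory[V_ADDRESS]
```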

Here are two scripts for you to compile and look at. If you don't learn something about the way the PAWN compiler stores strings, you've done it wrong (note that arrays are ALWAYS passed by reference, as are varargs - look at how "printf" gets "65"):

Code:
#include <a_samp>

stock Func(a, b, c)
{
    printf("%d %d %d", a, b, c);
}

main()
{
    new
        x[10],
        v = 7;
    Func(v, 65, x[0]);
    Func(0x65, x[v], x[v + 1]);
    printf("%d %d %d", 65, v, x[0]);
    printf("%d %d %d", 0x65, x[v], x[v + 1]);
}
Code:
#include <a_samp>

stock const
    cg_@d_@d_@d[10] = "%d %d %d";

stock Func(&a, b, c)
{
    printf(cg_@d_@d_@d, a, b, c);
}

main()
{
    new
        x[10],
        v = 7;
    Func(v, 65, x[0]);
    Func(0x65, x[v], x[v + 1]);
    printf(cg_@d_@d_@d, 65, v, x[0]);
    printf(cg_@d_@d_@d, 0x65, x[v], x[v + 1]);
}
From this point on I'll assume you understand all of this, and have played around with the examples above and more scripts (using "-a") until you do. To check you understand it, pass an array as the first parameter to "Func" in the second example, without changing the function prototype.
  • Size
How does the assembly change when you use:

Code:
#pragma dynamic 65536
Arrays

Compile and study the following mode:

Code:
#include <a_samp>

new
    gSingle[30] = {42, ...},
    gDouble[10][20];

main()
{
    new
        v = 2;
    printf("%d", gSingle[v]);
    printf("%d", gDouble[v][v]);
}
What do you think this data is:

Code:
dump 28 50 78 a0 c8 f0 118 140 168 190
It comes after the data for "gSingle" (represented as a string of "2a") and before the data for "gDouble" (a string of "0"). Hint: Look at the code calling the second "printf", work through it manually, and look at the differences between the values in that line. All numbers in a .ASM file are in HEX, and all addresses and offsets are in BYTES, not CELLS.
  • Jagged arrays
A random little trick:

Code:
#include <a_samp>

new
    gStrings[3][4] =
    {
        {'H',  'e',  'l', 'l'},
        {'o',  '\0', 'H', 'i'},
        {'\0', 'Y',  'o', '\0'}
    };

main()
{
    printf("%s", gStrings[0][0]);
    printf("%s", gStrings[1][2]);
    printf("%s", gStrings[2][1]);
}
Run that and see what happens, then try to figure out why from the assembly. We'll go one further in a second.

OK, so the above code stores 12 characters in 12 cells (we'll ignore packed strings for now). The following code stores them in 18 cells, but makes accessing them easier:

Code:
#include <a_samp>

new
    gStrings[3][6] =
    {
        {'H', 'e',  'l',  'l',  'o', '\0'},
        {'H', 'i', '\0', '\0', '\0', '\0'},
        {'Y', 'o', '\0', '\0', '\0', '\0'}
    };

main()
{
    printf("%s", gStrings[0]);
    printf("%s", gStrings[1]);
    printf("%s", gStrings[2]);
}
Clearly the former is preferable from the point of view of memory, the latter is preferable from the point of view of access. We now know how 2D arrays access elements, so can we manipulate this knowledge to combine the best of both worlds? Yes! The first bit of data in a 2D array stores the offset in the main data for each of the second dimensions. By default they are all the same length, but we can manipulate these pointers at a low level to point to different offsets. So instead of:

Code:
new
    gStrings[3][4] =
    {
        {'H',  'e',  'l', 'l'},
        {'o',  '\0', 'H', 'i'},
        {'\0', 'Y',  'o', '\0'}
    };
We end up with something like:

Code:
new
    gStrings[3][6, 3, 3] =
    {
        {'H', 'e',  'l', 'l', 'o', '\0'},
        {'H', 'i', '\0'},
        {'Y', 'o', '\0'}
    };
For reference, I will refer to each set of data in the second dimension as a "data set", so:

Code:
{'H',  'e',  'l', 'l'}
Is the first data set in the second dimension of the original array.

This is what was meant earlier by extending the abilities of PAWN - you can't have arrays with different numbers of elements in each of the second dimension data sets when using the compiler (and they're tricky without the "-d0" compiler flag to remove run-time bounds checks):

Code:
#include <a_samp>

new
    gStrings[3][4] =
    {
        {'H',  'e',  'l', 'l'},
        {'o',  '\0', 'H', 'i'},
        {'\0', 'Y',  'o', '\0'}
    };

main()
{
    // Get the address of the start of the array.
    #emit CONST.pri gStrings
    // Get the address of the pointer to the second data set.
    #emit ADD.C     4
    #emit MOVE.alt
    // Load the data there.
    #emit LOAD.I
    // Add 2 cells and store the result back.
    #emit ADD.C     8
    #emit STOR.I
    // Get the third pointer.
    #emit CONST.pri 4
    #emit ADD
    #emit MOVE.alt
    #emit LOAD.I
    // Add one cell.
    #emit ADD.C     4
    #emit STOR.I
    printf("%s", gStrings[0]);
    printf("%s", gStrings[1]);
    printf("%s", gStrings[2]);
}
These are called "Jagged" arrays simply because that's how they look if your language supports declaring them natively (which PAWN doesn't - this is a hypothetical declaration):

Code:
new
    gStrings[3][6, 3, 3] =
    {
        {'H', 'e',  'l', 'l', 'o', '\0'},
        {'H', 'i', '\0'},
        {'Y', 'o', '\0'}
    };
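Here is a Python model of what the "#emit" code does to the offset table of "gStrings[3][4]". It assumes each table entry holds a byte offset measured from the entry's own address, which is what the "MOVE.alt" / "LOAD.I" / "ADD" pattern in the generated assembly suggests; check your own .ASM output before relying on it. The function names are made up for the sketch.

```python
CELL = 4


def indirection_table(rows, columns):
    """Initial offset table for new a[rows][columns], one entry per row."""
    # Entry i lives at byte i*CELL; row i starts after the table at
    # rows*CELL + i*columns*CELL. The stored value is the difference.
    return [(rows - i + i * columns) * CELL for i in range(rows)]


def row_string(table, data, row):
    """Follow one table entry and read cells up to the first zero."""
    start = (row * CELL + table[row] - len(table) * CELL) // CELL
    end = data.index(0, start)
    return "".join(chr(c) for c in data[start:end])


data = [ord(c) for c in "Hello\0Hi\0Yo\0"]   # the 12 cells of gStrings[3][4]
table = indirection_table(3, 4)
before = [row_string(table, data, r) for r in range(3)]
table[1] += 2 * CELL     # ADD.C 8 on the second pointer
table[2] += 1 * CELL     # ADD.C 4 on the third pointer
after = [row_string(table, data, r) for r in range(3)]
```

Before the adjustment, rows 1 and 2 start in the middle of "Hello" and on a NUL; afterwards they start at "Hi" and "Yo" without a single data cell having moved.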
Snippets
  • Loading Data
Usually, loading data is very easy:

Code:
new
    local = 42;
#emit LOAD.S.pri local
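You can even use offsets if you know them. Here "local" is assumed to be the first local variable in the function, so its offset is "-4" ("0xFFFFFFFC"):

Code:
new
    local = 42;
#emit LOAD.S.pri 0xFFFFFFFC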
So what happens if you do:

Code:
new
    local = 42;
#emit LOAD.S.pri 40000
That will try and load the data 10000 cells after the current frame pointer, which is likely to be outside the stack. PAWN is "clever" - if you try and load an address that is not in the data, stack, or heap parts of the script, it will complain and not let you do it for security reasons.

"LOAD" loads a global variable's data and "LOAD.S" loads a local variable's data, so both specify which data area to load from. However, there are two more OpCodes: "LREF" and "LREF.S". Like "LOAD" and "LOAD.S", the former takes a global and the latter takes a local, but instead of loading the data in that variable, they load the data at the address (relative to "DAT") stored in that variable. The variable itself is still restricted, so you can't name just any address as the operand, but the CONTENTS of the variable are NOT checked, so you can put any pointer in to the variable and it will be loaded.

The following demonstrates this. The "DAT" register stores the offset from the start of the AMX to the start of the data segment, which means that the start of the AMX is at an address of "-DAT" (falling outside the data segment). The pointer to the list of public functions is 32 bytes from the start of the AMX, so it has an address of "-DAT + 32" (or "32 - DAT"). Let's load this data using the fact that "LREF" doesn't check if the second address is valid. "LREF" ALWAYS takes a variable, so we get a little bit of indirection:

Code:
new
    pointer;
#emit LCTRL 1
#emit NEG
#emit ADD.C 32
#emit STOR.S.pri pointer
#emit LREF.S.pri pointer
If you see a "STOR" followed by an "LREF", this is almost certainly what it is doing - accessing information outside the data segment. You can even access variables on the stack this way, but there are better ways. And if you know the offset from the start of the AMX to other server data, you can read and write (using "SREF") it.
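The address arithmetic is easier to see on a flat byte array. The Python sketch below builds a fake AMX image (all sizes and values invented) and shows that a DAT-relative address of "32 - DAT" lands on the publics pointer in the header.

```python
import struct


def read_cell(image, dat, address):
    """Read the cell at a DAT-relative address from a flat AMX image."""
    return struct.unpack_from("<i", image, dat + address)[0]


# Hypothetical image: 0x100 bytes of header and code, then the data segment.
DAT = 0x100
image = bytearray(0x200)
struct.pack_into("<i", image, 32, 0x38)        # publics pointer in the header
struct.pack_into("<i", image, DAT + 0, 42)     # first global variable

pointer = -DAT + 32                            # LCTRL 1 / NEG / ADD.C 32
publics = read_cell(image, DAT, pointer)       # LREF.S.pri pointer
first_global = read_cell(image, DAT, 0)        # an ordinary LOAD.pri of address 0
```

A negative DAT-relative address simply walks backwards out of the data segment in to the code and header that precede it.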
  • Variable Arguments
Passing many arguments to a vararg function such as "printf" is easy, and using varargs in a function is easy, but getting variable arguments in a function and passing them on to another function is impossible in plain PAWN. The number of parameters passed to a function is fixed at compile-time. Passing variable arguments on to another variable argument function would require altering this number at run-time, as how many parameters are passed to the inner function depends on where the outer function was called from:

Code:
Func1(...)
{
    // No idea how many parameters will be here as it's not a constant.
    printf(...);
}

main()
{
    // This is a constant - 2.
    Func1("%d", 6);
    // This is a constant - 3.
    Func1("%d %d", 6, 7);
    // This is a constant - 4.
    Func1("%d %d %d", 6, 7, 8);
}
In the code above the three calls to "Func1" all have a constant number of parameters at compile-time. "Func1" can take variable arguments but the PAWN compiler mandates that the number be known in advance. The single "printf" call would have 2, 3, or 4 parameters depending on where "Func1" was called from. This is not supported, but it can be bypassed. Note that this is not easy, and that the code assumes "BYTES_PER_CELL" is defined elsewhere (4 for the 32-bit cells SA:MP uses):

Code:
stock Func1(...)
{
    // This is the number of parameters which are not variable that are passed
    // to this function (i.e. the number of named parameters).
    static const
            STATIC_ARGS = 0;
    // Get the number of variable arguments.
    new
        n = (numargs() - STATIC_ARGS) * BYTES_PER_CELL;
    if (n)
    {
        new
            arg_start,
            arg_end;
        
        // Load the real address of the last static parameter. Do this by
        // loading the address of the last known static parameter and then
        // adding the value of [FRM].  Because there are no static parameters
        // we have to use the number "8" as the stop point address.
        #emit CONST.alt        8
        #emit LCTRL          5
        #emit ADD
        #emit STOR.S.pri        arg_start
        
        // Load the address of the last variable parameter. Do this by adding
        // the number of variable parameters on the value just loaded.
        #emit LOAD.S.alt        n
        #emit ADD
        #emit STOR.S.pri        arg_end
        
        // Push the variable arguments. This is done by loading the value of
        // each one in reverse order and pushing them. I'd love to be able to
        // rewrite this to use the values of pri and alt for comparison,
        // instead of having to constantly load and reload two variables.
        do
        {
            #emit LOAD.I
            #emit PUSH.pri
            arg_end -= BYTES_PER_CELL;
            #emit LOAD.S.pri      arg_end
        }
        while (arg_end > arg_start);
        
        // Push the static parameters (none here).
        
        // Now push the size in bytes of the arguments passed to printf,
        // including both static and variable ones, and call the function.
        n += BYTES_PER_CELL * STATIC_ARGS;
        #emit PUSH.S          n
        #emit SYSREQ.C         printf
        
        // Remove all data, including the argument count, from the stack.
        n += BYTES_PER_CELL;
        #emit LCTRL          4
        #emit LOAD.S.alt        n
        #emit ADD
        #emit SCTRL          4
    }
}
For more information see this post by Zeex and my follow up (and read the comments above which explain what different chunks of OpCodes are doing):

http://forum.sa-mp.com/showthread.ph...718#post549718
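The push order is the part that is easiest to get wrong, so here is a Python model of it. A list stands in for the stack (top of stack is the end of the list) and "native" is any Python callable playing the role of "printf"; none of these names exist in PAWN. Unlike the snippet above, the model also pushes static parameters, to show where they would go.

```python
CELL = 4


def forward_varargs(caller_args, native, static_args=0):
    stack = []                                   # top of stack = end of list
    variable = caller_args[static_args:]
    for value in reversed(variable):             # the do/while loop
        stack.append(value)
    for value in reversed(caller_args[:static_args]):
        stack.append(value)                      # static parameters, if any
    stack.append(len(caller_args) * CELL)        # PUSH.S n (a byte count)
    # SYSREQ.C: the native sees the count first, then the arguments in order.
    count = stack[-1] // CELL
    seen = [stack[-2 - i] for i in range(count)]
    result = native(*seen)
    del stack[-(count + 1):]                     # LCTRL 4 / ADD / SCTRL 4
    return result, stack
```

Because the arguments are pushed last to first, the native reads them first to last, and the final clean-up has to remove one more cell than there are arguments.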

Here "SYSREQ.C" is used to call the native function. Note however that there is a bug in the compiler which means that it will crash if you use "SYSREQ.C" with a function not already used in your code. The simplest way to avoid this is by doing:

Code:
forward _Func1_SYSREQ();
public _Func1_SYSREQ()
{
    printf("");
}
Put this before the function that uses "printf". It is public so that it is ALWAYS included in the final .AMX file; a stock function would not be used, so it would be removed and the bug would still trigger. Just having the native used somewhere isn't enough: it must be used in a function that is included in the final output and that comes before the "SYSREQ.C" usage. There are other ways of doing this too.
  • Calling Functions
Calling a normal function in PAWN evaluates to the quite simple:

Code:
#emit CALL Func1
Brilliant - this will get the return address, put it on the stack, and jump to the start of the given function (always a "PROC" instruction). Except this takes a constant offset, and working them out is impossible unless you know how the compiler will lay out your functions in memory.

There is also "CALL.pri" which would jump to the absolute address stored in "pri", except that it was removed for security reasons. This means the only way left to call functions is by manipulating the "CIP" register itself:

Code:
#emit LCTRL      6
#emit ADD.C      28      // 2
#emit PUSH.pri           // 1
#emit LOAD.S.pri pointer // 2
#emit SCTRL      6       // 2
That will get the current instruction pointer (which always points to the NEXT instruction to run), add an offset of 28 to get the address of the instruction after the "SCTRL" line, push that value as the return address, then store the address of the first instruction of the function to "CIP". Basically, this code manually replicates what "CALL" does. It's also fairly constant, but if you change the instructions between "LCTRL" and "SCTRL" you will also need to change the offset (4 bytes per instruction and per operand, as counted in the comments) to get the address AFTER "SCTRL".
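The offset of 28 is just a cell count, which this Python sketch reproduces (the list format is my own; it assumes every OpCode and every operand occupies one 4-byte cell, as in an uncompressed SA:MP .AMX):

```python
CELL = 4

# (mnemonic, number of operands) for the sequence shown above.
SEQUENCE = [("LCTRL", 1), ("ADD.C", 1), ("PUSH.pri", 0),
            ("LOAD.S.pri", 1), ("SCTRL", 1)]


def return_offset(sequence):
    """Bytes from the instruction after LCTRL 6 to the one after SCTRL 6."""
    # CIP already points past "LCTRL 6" when it is read, so skip that entry.
    return sum((1 + operands) * CELL for _, operands in sequence[1:])
```

If you insert another instruction between "LCTRL" and "SCTRL", add it to the list and use the new result in "ADD.C".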
  • Finding Functions
In the last section, the code used the variable "pointer" to specify the location of a function, this is only of any use if you know the address of a function - fortunately there are a number of ways of doing this:
  • 1) Constants
Code:
#emit CONST.alt Func1
This will load the address of "Func1" into the "alt" register. The advantage of this is that you don't need the "pointer" variable at all; the disadvantages are that this means changing the standard call code (and thus the offset) slightly, and that you must specify the function in advance - there's no real dynamic coding beyond maybe a "switch". You can of course easily get around the first problem:

Code:
#emit CONST.pri  Func1
#emit STOR.S.pri pointer
  • 2) funcidx
Public functions have their name stored in the AMX so that they can be accessed externally. This means that we can access a function internally by name (again, if it is public):

Code:
new
    idx = funcidx("Func1"),
    pointer;
// Get the pointer to the public function list.
#emit LCTRL       1
#emit NEG
#emit ADD.C       32
#emit STOR.S.pri  pointer
#emit LREF.S.alt  pointer
// Get the pointer to the function at the given index.
#emit LCTRL 1
#emit NEG
#emit ADD
#emit LOAD.S.alt  idx
#emit SHL.C.alt   3
#emit ADD
#emit STOR.S.pri  pointer
#emit LREF.S.pri  pointer
#emit STOR.S.pri  pointer
// Call the function
All you need to know to understand how this code works is that data addresses in PAWN are relative to "DAT", which itself is relative to the base address of the .AMX file; and that "funcidx" returns the index in the public function list of the given function. See the section above on the .AMX format for more details. This has the advantage that you can construct the function name at run-time ("format" etc), but you need the complete name.
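The two "LREF" steps can be followed on a fake image in Python. Everything here is invented for the sketch (image size, offsets, addresses); the only facts taken from above are that the publics pointer is 32 bytes in to the header and that each entry is 8 bytes with the function address first.

```python
import struct


def public_address(image, dat, idx):
    """Code address of the idx-th public, found the way the #emit code does."""
    # First LREF: read the header field at DAT-relative address 32 - DAT.
    publics = struct.unpack_from("<i", image, dat + (32 - dat))[0]
    # publics is relative to the AMX start; make it DAT-relative and add
    # idx * 8 (the SHL.C.alt 3) to reach the wanted entry.
    entry = publics - dat + idx * 8
    # Second LREF: the first cell of the entry is the function address.
    return struct.unpack_from("<i", image, dat + entry)[0]


DAT = 0x80                                       # hypothetical data offset
image = bytearray(0x100)
struct.pack_into("<i", image, 32, 0x38)          # publics list starts at 0x38
struct.pack_into("<ii", image, 0x38, 0x10, 0)    # entry 0: address 0x10
struct.pack_into("<ii", image, 0x40, 0x24, 0)    # entry 1: address 0x24
```

Here the index plays the role of the value returned by "funcidx".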
  • 3) Searching
If you don't know the full function name (e.g. all "ZCMD" commands start with "cmd_"), you can loop through the public function list and check all the names (watch the byte order). If a function name matches a pattern of your choosing you can get its pointer. I'm not going to go too deep in to this as there is a library which does all this already - if you want more information then read the code, explaining it here is pointless.
  • 4) y_amx
Using "y_amx", the search code using "funcidx" would become:

Code:
new
    idx,
    pointer;
while ((idx = AMX_GetPublicPointer(idx, pointer, "Func1")))
{
    // Call the function.
}
"idx" is used to detect multiple functions with the same name start, returning one at a time. There are also functions to get the pointer to the public function entry (the address of the name/code pointer pair) and the function name, and to get data using other name parts (some more efficiently):

Code:
new
    idx,
    buffer[32];
while ((idx = AMX_GetPublicNamePrefix(idx, buffer, _A<cmd_>)))
{
    // Call the function.
}
That code will loop through all ZCMD commands using a 4 character prefix. You could instead use "AMX_GetPublicName" and pass "cmd_" as the last parameter, but it's less efficient.

Libraries

There are a few libraries which rely heavily on "#emit" tricks in order to work. These will be listed and detailed here where they are of interest. Note that I clearly know my own libraries better than other people's, so if the list seems a little biased, it currently is. Please link to other interesting ones.
  • y_inline - ******
TO COME - THIS WILL BE A BIG SECTION!
  • pointers - Slice
This is basically an expansion of the "jagged arrays" technique above. Instead of changing the offset pointer to point to a different location in the second dimension, this library changes the offset pointer radically, to the point where it actually points to another variable entirely. This can be done to such an extent that it points to variables in other memory spaces (for example stack or heap variables instead of just global variables). The "@" macro looks like an array, but instead takes a parameter which is the (previously prepared) offset from the start of an internal array to the specified data. This parameter is used to modify the offset in the first dimension of the array, and then the second dimension, now moved, is read or written.
  • y_amx - ******
This library provides simple access to the various AMX file elements: information on segment offsets (absolute and relative to "DAT"), various methods to list all public and native functions (and some other rarely used elements), and functions to read and write arbitrary addresses within the AMX address space. See above for more details.
  • phys_memory - Zeex_
This library locates the AMX in global memory. Normal operations in PAWN are relative to the base address, so reading address "0" will read the data 0 bytes from the start of the PAWN data segment. Suitable offsets (see y_amx) can read data from elsewhere in the amx, for example from the code segment coming before the data segment, but these are still all RELATIVE to "DAT".

This library allows you to find the offset from the "DAT" segment of the AMX to other parts of the SA:MP server itself. For example the mode restart time is adjustable in YSF because it is located at a known address. To modify this from PAWN requires knowing the address of the AMX itself. This is found by calling a function ("GetAmxBase"), compiled using a relative offset from the call location, but converted at run-time to use a global absolute address. "GetAmxBase" reads the data located at the address 4 bytes before its return address, i.e. reads its own address according to the call site, and subtracts what it thinks is its absolute address in the AMX, giving the globally absolute address of the AMX base. Once this is known any other offset can be trivially calculated.
  • y_va - ******
We have seen already how to pass the variable arguments passed to a function on to another variable arguments function. This is done by looping through the parameters of the current stack frame and pushing all found data. y_va is similar, but instead of pushing the parameters of the CURRENT stack frame, it pushes the parameters of the PREVIOUS stack frame. This can be demonstrated as:

Code:
Func1(a, b, c)
{
    y_va_func();
}

y_va_func()
{
    // Push the parameters of the function that CALLED y_va_func, not the
    // parameters to this function itself (there are none).
    Func3(a, b, c);
}

Func3(...)
{
}
Clearly the code above will not compile, but using this you can pass variable arguments through to another function without messing about with the "#emit" code seen earlier:

Code:
Func1(...)
{
    va_printf("%d %d %d", 0);
}
Here "0" is the number of static (i.e. named) parameters passed to "Func1". "va_printf" is exactly the same as "printf" but uses the parameters of "Func1" instead:

Code:
Func1(42, 43, 44);
There is a custom syntax to make it clear that something unusual is going on, but the emphasis here is on clarity of execution. The main aim of this is to allow the use of "format" in a function such as "SendClientMessageFormatted".
#2

Why does any CALL crash my compiler?
#3

Quote:
Originally Posted by sprtik
Why does any CALL crash my compiler?
This is a compiler bug.
Use Zeex's PAWN compiler, where this bug is fixed: https://github.com/Zeex/pawn/releases