Yashas

New Code Optimizations
Total Optimization Tricks: 13 | Update: 21st June 2016

I will be sharing a few ideas which you can use to optimize your code. There have been many such topics in the past, but this topic has some new ideas.

Currently this topic has the following tricks:

Arrays are slower than normal variables
Do not use CallLocalFunction & funcidx when you know the function name in advance
Natives are a lot faster than PAWN Code
Conditions in loops
Assigning multiple variables to the same value
Delay declaring local variables
Simplifying & Rephrasing Math to avoid expensive operations
memcpy,strfind,etc work on arrays too
Is using CallRemoteFunction really worth it?
Accessing array elements multiple times
Do not mix floats and integers in an expression by Mauzen
Using Streamer unnecessarily
Good & Bad Use of functions (Optimizing 2D Array manipulation code)

Some of these make a significant improvement whereas some do not. You can ignore some minor optimizations and give priority to writing readable code.

1. Arrays are slower than normal variables

The following code is inefficient:

Code:

new Float:pos[3];
GetPlayerPos(playerid, pos[0], pos[1], pos[2]);

And here is the assembly version of the code above:

Code:

zero.pri
addr.alt fffffff4
fill c ;These 3 instructions are responsible for zeroing all the array elements
break	; 38

addr.pri fffffff4 ;Get the address of the array
add.c 8 ;Add the index (index 2 means 2*4 bytes ahead)
load.i ;This will get the value stored at that address
push.pri ;Now push the argument

addr.pri fffffff4 ;Same as above
add.c 4
load.i
push.pri
addr.pri fffffff4 ;Same as above
load.i
push.pri

Now here is equivalent code written more efficiently:

Code:

new Float:x, Float:y, Float:z;
GetPlayerPos(playerid, x, y, z);

And here is the assembly version:

Code:

push.c 0 //Making room for the variables on the stack
push.c 0
push.c 0
push.adr fffffff4 //Pushing the arguments
push.adr fffffff8
push.adr fffffffc

When you access an array element, the compiled code first has to compute where that element is stored:

Address of first element + 4*Index = the location where Array[Index] is stored (This formula is true only for 1-dimensional arrays)

After computing the address of the array element, the data stored in the element can be retrieved.
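
As a worked instance of that formula, using the addresses from the first assembly listing above (the array pos starts at fffffff4 and each cell is 4 bytes):

Address of pos[1] = fffffff4 + 4*1 = fffffff8
Address of pos[2] = fffffff4 + 4*2 = fffffffc

These are the offsets 4 and 8 that appear in the add.c instructions of that listing.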

This doesn't mean that you must not use arrays. You must instead use arrays wisely. Do not make arrays for no reason when the same thing can be simply done using normal variables.

In my opinion using x, y, z is actually more readable than using an array pos[3].

Speed Tests:
Array (10 Assignments):2444,2448,2473
Non-Array (10 Assignments):972,975,963

Speed Test Code:http://pastebin.com/aMkNtaC2

The non-array version is about 2.5 times faster than the array version.

2. Do not use CallLocalFunction & funcidx when you know the function name in advance

Did you know that CallLocalFunction and funcidx are slow functions? They are very slow because they have to look up the function name you pass as an argument in the list of all publics. That means a lot of internal strcmps.

Code:

if(funcidx("OnPlayerEatBanana") == -1)

You actually don't need that line. You can do the same with zero instructions. If you already know the function name, simply use pre-processor directives to find out if the function exists.

Code:

#if defined OnPlayerEatBanana
//OnPlayerEatBanana has been declared
#endif

Instead of:

Code:

if(CallLocalFunction("OnPlayerEatBanana","ii",playerid,bananaid))

you can write:

Code:

#if defined OnPlayerEatBanana
if(OnPlayerEatBanana(playerid,bananaid))
#else
//That function hasn't been declared
#endif

Speed Tests: (1 public taking zero arguments)
Calling directly:204,226,218
CallLocalFunction:1112,1097,1001

Please note that this is the best case for CallLocalFunction. In reality CallLocalFunction will be much slower since there will be many public functions.

3. Natives are a lot faster than PAWN Code
Avoid writing your own function when there is a native (or a combination of natives) which can do the job.

The reason natives are so much faster is that native functions are executed directly by your computer, whereas all your PAWN code is executed in a virtual computer. For every PAWN instruction, the AMX machine (the virtual computer) has to decode the instruction, fetch the operands and then execute the instruction. Decoding and fetching the operands takes some CPU time.

Code:

stock strcpy(dest[], src[], sz=sizeof(dest))
{
  dest[0] = 0;
  return strcat(dest,src,sz); //Notice that I have used strcat instead of writing my own loops
}

Speed Tests:
Loop based strcpy vs native strcat

Here are two equivalent implementations of strcpy:
http://pastebin.com/Y7RJ21tw

Native:697,700,718,705
Non-Native:5484,5422,5507,5562

4. Conditions in loops

I don't know how many times I have told people about this, but there are still a few who don't do this simple optimization.

Code 1:

Code:

for(new i = 0;i <= GetPlayerPoolSize();i++) {}

Code 2:

Code:

for(new i = 0,j = GetPlayerPoolSize();i <= j;i++) {}

In the first code, GetPlayerPoolSize is called on every iteration. Within the time the loop runs, GetPlayerPoolSize returns the same value every time it is called, so why call it on every iteration?

That's what the second code avoids. It creates a local variable which stores the value returned by GetPlayerPoolSize and uses it in the condition, so the function is called just once and the call overhead is avoided.

Speed Tests:
With Optimization:1102,1080,1069,1091
Without Optimization:2374,2359,2429,2364

Test Code:http://pastebin.com/SLZDGRG4

Though the improvement in the above case may be negligible compared to the code you put inside the loop, there will be times when the condition uses a much slower function:

Code:

for(new i = 0; i < CallRemoteFunction("GetPlayersInTeam", "i", TEAM_ID); i++) 
{
  
}
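
The same caching trick applies here. A minimal sketch, assuming the team size does not change while the loop runs (count is a name introduced for illustration; GetPlayersInTeam and TEAM_ID are the placeholders from the snippet above):

Code:

// Call the slow remote function once and reuse the result in the condition
for(new i = 0, count = CallRemoteFunction("GetPlayersInTeam", "i", TEAM_ID); i < count; i++)
{
    // loop body
}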

5. Assigning multiple variables to the same value & using memset

Code 1:

Code:

x = abc;
y = abc;
z = abc;

Code 2:

Code:

x = 
y = 
z = abc;

Which code do you think is faster?

Code 1:

Code:

load.pri c ;Get abc
stor.pri 8 ;Store it in X
break	; 20
load.pri c ;Get abc
stor.pri 4 ;Store it in Y
break	; 34
load.pri c ;Get abc
stor.pri 0 ;Store it in Z

Code 2:

Code:

load.pri c ;Get abc
stor.pri 0 ;Store it in Z
stor.pri 4 ;Store it in Y
stor.pri 8 ;Store it in X

See the difference? The first code has redundant instructions which fetch abc again and again even though it has already been loaded, whereas the second version fetches abc only once and then stores it in x, y and z.

The obvious conclusion is that Code 2 is faster, but the difference is probably insignificant.

When you have large arrays that have to be set to zero, one or any other value, use memset.

Speed Tests:
Using memset to set all elements of an array (3D) of 100 elements to zero:363,367,372
Setting elements of an array (3D) of 100 elements to zero using for loop:6662,6642,6687

6. Delay declaring local variables
I have seen scripts where all local variables are declared at the top of the function even though some of them are only needed occasionally. The examples should make things clear.

Bad Code:

Code:

public OnPlayerDoSomething(playerid)
{
  new actionid = GetPlayerAction(playerid), pee_id, peed_on_whome, amount_of_pee;
  if(actionid == PLAYER_PEE)
  {

  }
}

Good Code:

Code:

public OnPlayerDoSomething(playerid)
{
  new actionid = GetPlayerAction(playerid);
  if(actionid == PLAYER_PEE)
  {
    new pee_id, peed_on_whome, amount_of_pee;
  }
}

If you have gone through my previous tips, you will know by now that when you create a local variable, the compiler first makes room for it on the stack and then initializes it with zero.

So you should not create local variables when you are not sure that you are going to use them. The second code creates the locals if and only if it needs them, whereas in the first code the locals are created even though you may not use them.

For a few variables this has no significant effect on performance; however, it does improve the readability of the code.

7. Simplifying & Rephrasing Math to avoid expensive operations

I always keep a pen and paper on the desk while I write programs. I write on paper the equations and do some shifting and changes and get a simpler equation.

Here is a classic example where a small rewrite boosts the performance of a snippet:

Code:

new Float:x,Float:y,Float:z;
GetPlayerVelocity(playerid,x,y,z);
if(floatsqrt( (x*x) + (y*y) + (z*z)) > 5.0)

Code:

new Float:x,Float:y,Float:z;
GetPlayerVelocity(playerid,x,y,z);
if( ((x*x) + (y*y) + (z*z)) > 25.0)

Do you notice the change?
I squared both sides of the condition in the if statement and eliminated the slow function 'floatsqrt'. This is valid because neither side of the comparison can be negative.
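
The same idea can be wrapped in a small helper. This is only a sketch: IsPlayerFasterThan is a hypothetical name, and it assumes limit is never negative, since squaring both sides of a comparison is only safe when both are non-negative.

Code:

stock bool:IsPlayerFasterThan(playerid, Float:limit)
{
    new Float:x, Float:y, Float:z;
    GetPlayerVelocity(playerid, x, y, z);
    // Compare squared magnitudes, so no floatsqrt call is needed
    return (x*x + y*y + z*z) > (limit * limit);
}

It would be used as if(IsPlayerFasterThan(playerid, 5.0)), which matches the 25.0 comparison above.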

Here is another one:

Code:

for(new i = 0, j = GetTickCount(); i < 10; i++)
{
  if( j - LastTick[i] > MAX_TIME_ALLOWED)
  {

  }
}

Code:

for(new i = 0, j = GetTickCount() - MAX_TIME_ALLOWED; i < 10; i++)
{
  if(j > LastTick[i])
  {

  }
}

Whoa, I removed MAX_TIME_ALLOWED from the condition. Now the subtraction is done only once, whereas in the first code it is done on every iteration. Even this improvement is insignificant unless the operation you move out of the loop consumes a lot of CPU.

8. memcpy, strfind, etc work on arrays too

After all, strings and arrays are one and the same. The only difference is that a string is terminated by a null character whereas a normal array isn't.

Code:

new DefaultPlayerArray[100] = {1,2,3,4,5,6,7,8,9,10};
new PlayerArray[MAX_PLAYERS][100];

for(new i = sizeof(DefaultPlayerArray) - 1; i != -1; i--) //Start at the last valid index
{
  PlayerArray[playerid][i] = DefaultPlayerArray[i];
}

Here is equivalent code using memcpy:

Code:

memcpy(PlayerArray[playerid], DefaultPlayerArray, 0, sizeof(DefaultPlayerArray)*4, sizeof(PlayerArray[]));

I did some benchmarks for the two codes and here are the results:
Loop Version:
4286ms
4309ms
4410ms

memcpy version:
60ms
62ms
60ms

Similarly, you can use strfind, strmid and many other string functions on arrays. The only problem is that when a str function finds an element with the value 0 in your array, it stops there, because the value 0 means '\0', i.e. the null character.

9. Is using CallRemoteFunction really worth it?
First of all, I would like to say that CallRemoteFunction is horribly slow and must be avoided whenever possible. CallRemoteFunction is usually used to update player variables when you have an anti-cheat in some other script.

Have you ever thought of having an anti-cheat in every script? I actually have one anti-hack in my gamemode which ensures that modified data is not written to the database, and another anti-hack in the administration filterscript which deals with taking action on the cheat (it works independently).

Why two anti-cheats? We have two choices: either make two anti-cheats or use CallRemoteFunction to update the player variable.

Sometimes having two separate anti-cheats is faster. In fact, some of the anti-cheat checks take a quarter of the time CallRemoteFunction takes to call the update function.

It doesn't matter if you are calculating some player variables in each and every script. It's way better than updating them in one script and using CallRemoteFunction to access them.

10. Accessing array elements multiple times

Let's take an example to understand what we are talking about here:

Code:

new val = value[x][y][z];
for(new i = 50; i != -1; --i) Arr[i] = val;

Code:

for(new i = 50; i != -1; --i) Arr[i] = value[x][y][z];

Which one in your opinion is faster?

The first one is faster, as you will know if you read Tip #1 carefully. Calculating the correct address from the array index takes some time. In the second code, the address calculation is done every time the value is copied to Arr, whereas in the first code we calculate the address only once.

Here is the takeaway: if you are going to access an array element multiple times, create a temporary copy of the array element in a local variable and use the local variable.

Speed Tests:
Code 1:2280,2330,2350
Code 2:8008,8183,8147

11. Do not mix floats and integers in an expression
by Mauzen

Maybe this one is too simple, but I at least wanted to add it, as I see people doing that "mistake" very often.

Never mix up floats and integers (even if it does not give a tag mismatch warning). Always use the same datatypes in a single statement.

e.g.

pawn Code:

new Float:result = 2.0 + 1;

// Is compiled as
new Float:result = 2.0 + float(1);

// Which is significantly slower than
new Float:result = 2.0 + 1.0;

pawn Code:

new mindist = 10;
if (GetPlayerDistanceFromPoint(playerid, 237.9, 115.6, 1010.2) < mindist)

// Is compiled as
new mindist = 10;
if (GetPlayerDistanceFromPoint(playerid, 237.9, 115.6, 1010.2) < float(mindist))

// Which is significantly slower than
new Float:mindist = 10.0;
if (GetPlayerDistanceFromPoint(playerid, 237.9, 115.6, 1010.2) < mindist)

For complex mathematical tasks a simple .0 can make a speed difference of a few percent.

12. Using streamer unnecessarily
It has become a habit for everyone to use streamer even when they need just 10 or 20 map icons, 50 objects, etc.

Do you even know what a streamer is? Streamer is a plugin/include which allows you to bypass the SAMP limits. SAMP allows a maximum of 1000 objects and you cannot have more objects than that.

Streamer allows you to bypass that limit by creating an object when a player is within the draw distance of the object and destroying it when there is no player in the vicinity of the object. So basically streamer creates objects when required and destroys them when not required. In this way it allows you to cross the SAMP limits.

When you use a streamer function, say CreateDynamicObject, streamer doesn't really create an object. It adds the object information (X,Y,Z,RotX,RotY,RotZ....) to a database of objects. After a definite number of server ticks/cycles have passed, it goes through all the objects in the database, checks if there is a player close to each object and creates it if required.

You can see streamer adding information about the object to a database here.

Player update starts here
Here is the function responsible for updating objects.

Does it make sense to use streamer when you have fewer than 1000 objects?
Do you really need a streamer?

NO!

If you are sure that you are not going to cross the SAMP limit, then you needn't use a streamer.

This brings us to a new problem. Suppose your existing version has 500 objects, but you are going to update your script so that it needs 1500 objects. Do you now need to convert all the SAMP object natives to streamer natives?

Not if you wrote your initial version smartly.

Here is what I do:

Code:

#define CreateDynamicObject CreateObject

You can now use CreateDynamicObject in your code even though you do not have streamer.

When you know that you are going to need streamer, just remove the defines and include streamer.

An even cleverer way is to use CreateObject for the objects which are present at popular zones, such as a spawn point, where you can assume that a player will almost always be present. For objects which are at remote locations that players hardly ever visit, you should definitely use CreateDynamicObject, because these are the ones which needn't exist all the time. The popular ones are bound to exist all the time even when created through streamer, so using streamer for such objects is not worth the cost.

You will have to do this for many objects to see a reasonable improvement, since streamer is a plugin and hence is already pretty fast compared to PAWN code.

Similarly, you can do it for other natives.

13. Good & Bad Use of functions (Optimizing 2D Array manipulation code)
A common myth is that function calls are very expensive, which isn't true. In fact, raw (empty) function calls are many times faster than dereferencing a 2D array.

Code:

native SLE_algo_foreach_list_init(list:listid, &val);
native SLE_algo_foreach_list_get(feid);

#define foreach::list(%0(%1)) for(new %1, fel_%0@%1_id = SLE_algo_foreach_list_init(%0, %1); SLE_algo_foreach_list_get(fel_%0@%1_id);)

If you look carefully, it just makes a function call to obtain the value, which is faster than dereferencing the array. To clear the mist completely: the function is defined in a plugin. Otherwise, for obvious reasons, it wouldn't be faster, because there is indeed an array being used, but that array lives inside the plugin.

That example was just to show that a function call isn't that costly compared to the other code that you write. This means that you should create functions for large chunks of code which perform a specific task when necessary, especially if it improves the readability of the code.

However, misuse of functions could prove costly especially in loops which has been discussed earlier.

Here is one situation where using a 1D/2D array would fare better.

Quote:

Originally Posted by Vince

One thing I see constantly, and which is not mentioned in the first post, is the excessive usage of GetPlayerName. A player can't change his name while connected (the exception, of course, being SetPlayerName) so it seems redundant to call the function over and over again. I would say using a wrapper is even worse. Simply store it in a variable when the player connects and then use that variable everywhere. The same goes for GetPlayerIP.
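
A minimal sketch of what Vince describes, assuming the name is only read after OnPlayerConnect has run (gPlayerName is a name introduced for illustration). If your script calls SetPlayerName, the stored copy would have to be refreshed there as well.

Code:

new gPlayerName[MAX_PLAYERS][MAX_PLAYER_NAME + 1];

public OnPlayerConnect(playerid)
{
    // Fetch the name once and keep it for the whole session
    GetPlayerName(playerid, gPlayerName[playerid], sizeof(gPlayerName[]));
    return 1;
}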

There is another myth which says "making functions is always slower" which isn't true at all when it comes to dealing with multi-dimensional arrays. If done properly, making functions can actually improve performance significantly.

Code:

for(new y = 0; y < 100; y++)
{
  Array[playerid][y] = y;
}

is considerably slower than

Code:

stock DoSomething(arr[])
{
  for(new y = 0; y < 100; y++)
  {
    arr[y] = y;
  }
}

The reason lies within the roots of the assembly code. A quick look at how arrays are dereferenced in PAWN explains it.

This is the amount of code involved in dereferencing a 2D array:

Code:

#emit CONST.alt arr //Load the address of the array
#emit CONST.pri 2 //We want to access the 2nd sub-array
#emit IDXADDR //Address of the 2nd element of the major array

#emit MOVE.alt //Keep a copy of that address since we need to add it to the offset to get the address of the sub-array

//ALT = PRI = Address of the 2nd element of the major array
#emit LOAD.I

//ALT = Address of the 2nd element of the major array
//PRI = offset relative to the address stored in the ALT to the 2nd sub-array
#emit ADD

//PRI now has the address of the sub-array
#emit MOVE.alt //Move the address of the first element of the sub-array from PRI to ALT

#emit CONST.pri 4 //We want the 4th element of the sub-array
#emit LIDX //Load the value stored at arr[2][4]

Compare it with what it takes to dereference a 1D array:

Code:

#emit CONST.alt array_address
#emit CONST.pri n
#emit IDXADDR //PRI now has the address of the (n + 1)th element

#emit CONST.alt array_address
#emit CONST.pri n
#emit LIDX //PRI now has the value stored in the (n + 1)th element

There is

This is how arrays are passed:

Code:

//Pushing the address of the global string
#emit PUSH.C global_str

//Pushing a local string
#emit PUSH.S cmdtext

This clearly explains why that works. When you pass an array, you push the address of the array, so inside your function you receive a 1D array. The first part of the 2D array dereference code is essentially skipped on every iteration in this case, which makes it a whole lot faster.
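
Putting the pieces together, here is a sketch of how the loop from the beginning of this tip might be rewritten. FillRow is a hypothetical name and the row size of 100 is taken from the example above. The row is passed once, so the function body only performs 1D dereferences.

Code:

new Array[MAX_PLAYERS][100];

stock FillRow(arr[], size)
{
    for(new y = 0; y < size; y++)
    {
        arr[y] = y; // 1D access; the sub-array address was resolved by the caller
    }
}

// Inside some callback where playerid is valid:
FillRow(Array[playerid], sizeof(Array[]));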

It's sad that PAWN doesn't provide pointers.
