Tiny But Super Optimizations
#1

In this thread I'll point out some very small optimizations that can make a huge difference in speed in the long run.

Most of these are very minor, negligible speed differences. The point is THERE ARE DIFFERENCES. There are situations where these would help majorly. Sure, these cases are rare, but they do exist and I've encountered them a lot!

Bear in mind, this thread is still nowhere near complete.

I bet you didn't know that "+= 1" and "var = var + 1" were both twice as fast as "++", so read on.

(Note: lines preceded by a semi-colon in the assembly are comments.)

____________________________________________


Here is a great example to start us off.

Float vectors: Arrays vs Multiple Variables.
I've seen so many people doing things like this:
pawn Code:
new Float:t[3];
Instead of doing this:
pawn Code:
new Float:tX, Float:tY, Float:tZ;
What's wrong with that? Well look here:
pawn Code:
tX = tY = tZ = 1;
The compiler turns that into the following assembly output:
Code:
    const.pri 1
    stor.s.pri fffffff4
    stor.s.pri fffffff8
    stor.s.pri fffffffc
While for this:
pawn Code:
t[0] = t[1] = t[2] = 1;
The compiler does this:
Code:
    addr.pri ffffffe8
    push.pri
    addr.pri ffffffe8
    add.c 4
    push.pri
    addr.pri ffffffe8
    add.c 8
    move.alt
    const.pri 1
    stor.i
    pop.alt
    stor.i
    pop.alt
    stor.i
I'm not going to walk you through the assembly, but I think it's obvious which one is better. It doesn't matter whether you chain them or not; the arrays will always be slower. Just to prove my point, here is the timing of each of these:
Quote:

Timing "accessing and setting singles"...
Mean = 136.00ns
Mode = 135.00ns
Median = 136.00ns
Range = 4.00ns
Timing "accessing and setting array"...
Mean = 233.00ns
Mode = 232.00ns
Median = 233.00ns
Range = 3.00ns
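
For anyone who wants to try a comparison like this themselves, here is a minimal timing sketch. It is not the harness that produced the numbers above. It assumes a SA-MP server (GetTickCount and printf come from a_samp), ITERATIONS is a made-up repetition count, and it assigns 1.0 instead of 1 to avoid a tag mismatch warning. Both loops pay the same loop overhead, so only the difference between the two results means anything.

```pawn
#include <a_samp>

// Assumed repetition count; with 1,000,000 iterations the elapsed
// milliseconds roughly equal the average nanoseconds per iteration.
#define ITERATIONS (1000000)

main()
{
    new Float:tX, Float:tY, Float:tZ;
    new Float:t[3];

    // Time the three separate variables.
    new start = GetTickCount();
    for (new i = 0; i < ITERATIONS; i++)
    {
        tX = tY = tZ = 1.0;
    }
    new singles = GetTickCount() - start;

    // Time the three-cell array.
    start = GetTickCount();
    for (new i = 0; i < ITERATIONS; i++)
    {
        t[0] = t[1] = t[2] = 1.0;
    }
    new arrays = GetTickCount() - start;

    printf("singles: %d ms, array: %d ms", singles, arrays);

    // Read the values once so the compiler does not flag them as unused.
    printf("check: %f %f", tX + tY + tZ, t[0] + t[1] + t[2]);
}
```

GetTickCount only has millisecond resolution, so the loop count has to be large enough for the totals to be well above a few milliseconds.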

____________________________________________


Now for the rest of them.

  • Variable definitions: Chained Definitions vs Defining Each
    What people are doing:
    pawn Code:
        new tmp1;
        new tmp2;
        new tmp3;
        new tmp4;
        new tmp5;
    What people should be doing more often:
    pawn Code:
    new tmp1, tmp2, tmp3, tmp4, tmp5;
    Assembly for what people are doing:
    pawn Code:
        break
        push.c 0
        break
        push.c 0
        break
        push.c 0
        break
        push.c 0
        break
        push.c 0
    Assembly for what people should be doing more often:
    pawn Code:
        push.c 0
        push.c 0
        push.c 0
        push.c 0
        push.c 0
    So basically, the breaks (one per ';'-terminated statement) are almost doubling the time it takes! Don't believe me? Well, look at the time comparison:
    Quote:

    Timing "new chain"...
    Mean = 114.00ns
    Mode = 114.00ns
    Median = 114.00ns
    Range = 2.00ns
    Timing "new each"...
    Mean = 221.00ns
    Mode = 222.00ns
    Median = 222.00ns
    Range = 3.00ns

    It's a shame that PAWN does this, but it is a fact we must face.
  • Setting Variables: Chain Assignments vs Assigning Each
    What people are doing:
    pawn Code:
        tmp1 = 17;
        tmp2 = 17;
        tmp3 = 17;
        tmp4 = 17;
        tmp5 = 17;
    What people should be doing more often:
    pawn Code:
    tmp1 = tmp2 = tmp3 = tmp4 = tmp5 = 17;
    Assembly for what people are doing:
    pawn Code:
        break
        const.pri 11
        stor.s.pri fffffffc
        break
        const.pri 11
        stor.s.pri fffffff8
        break
        const.pri 11
        stor.s.pri fffffff4
        break
        const.pri 11
        stor.s.pri fffffff0
        break
        const.pri 11
        stor.s.pri ffffffec
    Assembly for what people should be doing more often:
    pawn Code:
        const.pri 11
        stor.s.pri ffffffec
        stor.s.pri fffffff0
        stor.s.pri fffffff4
        stor.s.pri fffffff8
        stor.s.pri fffffffc
    So, 6 lines versus 15, which one is faster?
    Quote:

    Timing "set chain"...
    Mean = 121.00ns
    Mode = 118.00ns
    Median = 119.00ns
    Range = 18.00ns
    Timing "set each"...
    Mean = 219.00ns
    Mode = 217.00ns
    Median = 217.00ns
    Range = 9.00ns

    6 lines, obviously. Start chaining your variable assignments if you are setting them to the same value!
  • If Lines: Chained vs Many
    What people are doing:
    pawn Code:
        if(tmp1 == 1)
            if(tmp2 == 1)
                if(tmp3 == 1)
                    if(tmp4 == 1)
                        if(tmp5 == 1)
                            {}
    What people should be doing more often:
    pawn Code:
    if(tmp1 == 1 && tmp2 == 1 && tmp3 == 1 && tmp4 == 1 && tmp5 == 1) {}
    Assembly for what people are doing:
    pawn Code:
        break
        load.s.pri fffffffc
        eq.c.pri 1
        jzer 3
        break
        load.s.pri fffffff8
        eq.c.pri 1
        jzer 4
        break
        load.s.pri fffffff4
        eq.c.pri 1
        jzer 5
        break
        load.s.pri fffffff0
        eq.c.pri 1
        jzer 6
        break
        load.s.pri ffffffec
        eq.c.pri 1
        jzer 7
    Assembly for what people should be doing more often:
    pawn Code:
        break
        load.s.pri fffffffc
        eq.c.pri 1
        jzer 1
        load.s.pri fffffff8
        eq.c.pri 1
        jzer 1
        load.s.pri fffffff4
        eq.c.pri 1
        jzer 1
        load.s.pri fffffff0
        eq.c.pri 1
        jzer 1
        load.s.pri ffffffec
        eq.c.pri 1
        jzer 1
    So at a glance, it's the breaks (one per if statement this time) again!
    Quote:

    Timing "if chain"...
    Mean = 164.00ns
    Mode = 163.00ns
    Median = 164.00ns
    Range = 3.00ns
    Timing "if lot"...
    Mean = 230.00ns
    Mode = 228.00ns
    Median = 228.00ns
    Range = 16.00ns

    They are slowing us down more than we think!
  • Value Testing: Switch vs "If =="
    What people are doing:
    pawn Code:
            if(tmp == 2 ) {}
            else if(tmp == 5 ) {}
            else if(tmp == 8 ) {}
            else if(tmp == 11) {}
            else if(tmp == 14) {}
            else if(tmp == 17) {}
            else if(tmp == 20) {}
            else if(tmp == 23) {}
            else if(tmp == 26) {}
            else if(tmp == 29) {}
            else {}
    What people should be doing more often:
    pawn Code:
            switch(tmp) {
                case 2 : {}
                case 5 : {}
                case 8 : {}
                case 11: {}
                case 14: {}
                case 17: {}
                case 20: {}
                case 23: {}
                case 26: {}
                case 29: {}
                default: {}
            }
    Assembly for what people are doing:
    pawn Code:
            break
            load.pri 0
            eq.c.pri 2
            jzer 0
            jump 1
        l.0
            break
            load.pri 0
            eq.c.pri 5
            jzer 2
            jump 3
        l.2
            break
            load.pri 0
            eq.c.pri 8
            jzer 4
            jump 5
        l.4
            break
            load.pri 0
            eq.c.pri b
            jzer 6
            jump 7
        l.6
            break
            load.pri 0
            eq.c.pri e
            jzer 8
            jump 9
        l.8
            break
            load.pri 0
            eq.c.pri 11
            jzer a
            jump b
        l.a
            break
            load.pri 0
            eq.c.pri 14
            jzer c
            jump d
        l.c
            break
            load.pri 0
            eq.c.pri 17
            jzer e
            jump f
        l.e
            break
            load.pri 0
            eq.c.pri 1a
            jzer 10
            jump 11
        l.10
            break
            load.pri 0
            eq.c.pri 1d
            jzer 12
            jump 13
        l.12
        l.13
        l.11
        l.f
        l.d
        l.b
        l.9
        l.7
        l.5
        l.3
        l.1
    Assembly for what people should be doing more often:
    pawn Code:
            break
            load.pri 0
            switch 0
        l.2
            jump 1
        l.3
            jump 1
        l.4
            jump 1
        l.5
            jump 1
        l.6
            jump 1
        l.7
            jump 1
        l.8
            jump 1
        l.9
            jump 1
        l.a
            jump 1
        l.b
            jump 1
        l.c
            jump 1
        l.0
            casetbl
            case a c
            case 2 2
            case 5 3
            case 8 4
            case b 5
            case e 6
            case 11 7
            case 14 8
            case 17 9
            case 1a a
            case 1d b
        l.1
    There is actually a lot more to this; this is just a basic example. Switches with ranges and the like are far better optimized than any equivalent chain of if tests. And the timings...
    Quote:

    Timing "if-elseif-else"...
    Mean = 408.00ns
    Mode = 407.00ns
    Median = 407.00ns
    Range = 19.00ns
    Timing "switch"...
    Mean = 100.00ns
    Mode = 97.00ns
    Median = 99.00ns
    Range = 12.00ns

    Yeah, 4 times the speed difference. And again, as I said, this is just a basic example. A more complex example would have an even greater difference.
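
As a rough illustration of what a switch with ranges looks like, here is a small sketch. It is not one of the tested cases above: GetTier and its thresholds are invented for the example, and it assumes a SA-MP environment for printf. One switch covers single values, value lists and ranges, where the if version would need several comparisons per branch.

```pawn
#include <a_samp>

// GetTier is a hypothetical helper; the thresholds are invented.
stock GetTier(score)
{
    new tier;
    switch (score)
    {
        case 0: tier = 0;               // single value
        case 1 .. 5: tier = 1;          // range
        case 6, 8, 10 .. 20: tier = 2;  // list mixed with a range
        default: tier = 3;
    }
    return tier;
}

main()
{
    printf("tier for 4: %d", GetTier(4));
    printf("tier for 15: %d", GetTier(15));
}
```

Note that Pawn cases do not fall through, so no break statement is needed after each case.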

To be continued...
(so please don't bother replying yet)
#2

This is nice mate. People can learn bits from this.
#3

Quote:

tmp1 = tmp2 = tmp3 = tmp4 = tmp5 = 17;

I didn't even know you could have multiple assignments within a single statement in Pawn. Mind blown.
#4

Nice to see that stuff actually tested.

Here's another one (that most people are probably already aware of):

pawn Code:
new Float:value = 5.0;

// Slow
value = value / 2;
// converts to
// value = value / float(2);

// Faster
value = value / 2.0;
Also applies to any other operations connecting a float and an integer.

I once measured the time for that, but it was years ago and I have no idea if the post even still exists on the forum.
#5

PHP Code:
enum TPlayerData
{
    UserID,
    Name[25],
    Password[130],
    IP[16],
    GPCI[130],
    PasswordAttempts,
    bool:LoggedIn,
    bool:InClassSelection,
    // Player data
    Money,
    Score,
    AdminLevel,
    Speed
}
new APlayerData[MAX_PLAYERS][TPlayerData];
Consider this enum structure.

Most people do this when clearing all data in a player's account when a player disconnects or connects (to be sure the new player won't get data from the previous player):
PHP Code:
APlayerData[playerid][UserID] = 0;
APlayerData[playerid][Name][0] = 0;
APlayerData[playerid][Password][0] = 0;
APlayerData[playerid][IP][0] = 0;
APlayerData[playerid][GPCI][0] = 0;
APlayerData[playerid][PasswordAttempts] = 0;
APlayerData[playerid][LoggedIn] = false;
APlayerData[playerid][InClassSelection] = false;
APlayerData[playerid][Money] = 0;
APlayerData[playerid][Score] = 0;
APlayerData[playerid][AdminLevel] = 0;
APlayerData[playerid][Speed] = 0;
While it's a lot easier to do this:
PHP Code:
new temp[TPlayerData];
APlayerData[playerid] = temp;
No need for hundreds of lines of code to reset your entire structure one variable at a time and no risk about forgetting to reset one of them.
Just reset them all at once.

I have no idea if it's faster and if it is, how much faster.
I never tested the speed of it, but it surely is less code and less risk to clear all your vars.
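
A possible variation on the same idea, as a sketch only: keep the empty template in a static local so it is not created on the stack again on every call. This is untested for speed, just like the version above. ResetPlayerData is a made-up helper name, the enum is a trimmed copy of the one above, and the callback assumes a SA-MP script.

```pawn
#include <a_samp>

// Trimmed copy of the enum from this post, so the sketch stands alone.
enum TPlayerData
{
    UserID,
    Name[25],
    bool:LoggedIn,
    Money
}
new APlayerData[MAX_PLAYERS][TPlayerData];

// ResetPlayerData is a hypothetical helper name.
stock ResetPlayerData(playerid)
{
    // A static local is zero-initialized once and keeps its storage
    // between calls. It is never written to, so it stays all zeros.
    static blank[TPlayerData];
    APlayerData[playerid] = blank;
}

main() {}

public OnPlayerConnect(playerid)
{
    // Make sure the new player does not inherit the previous player's data.
    ResetPlayerData(playerid);
    return 1;
}
```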
#6

Quote:
Originally Posted by Mauzen
Nice to see that stuff actually tested.

Here's another one (that most people are probably already aware of):

pawn Code:
new Float:value = 5.0;

// Slow
value = value / 2;
// converts to
// value = value / float(2);

// Faster
value = value / 2.0;
Also applies to any other operations connecting a float and an integer.

I once measured the time for that, but it was years ago and I have no idea if the post even still exists on the forum.
I already addressed that actually, I just haven't put it in the post yet. I have a lot more tests I didn't toss in the post yet.
#7

Okay so apparently we can bother posting now =P
Here's a question and at the same time could be a part for this thread.
Is it faster using Macros instead of functions if possible? (as far as my knowledge goes it is in most cases, but would really like to know)
PHP Code:
GetHousePos(houseid,&Float:x,&Float:y,&Float:z)
{
    x=HouseI[houseid][X];
    y=HouseI[houseid][Y];
    z=HouseI[houseid][Z];
    return 1;
}
Could this be turned into the following (well, it compiles as far as I know):
PHP Code:
#define GetHousePos(%0,%1,%2,%3) (%1=HouseI[%0][X],%2=HouseI[%0][Y],%3=HouseI[%0][Z]) 
And which one is faster?
#8

Boohoo, 100 nanoseconds difference. Readability should always take precedence over such extremely negligible improvements.
#9

I think it is up to each person whether to apply these or not, and it doesn't hurt to know them. I wrote the code, I read the code, I understand my code. If I were going to share it, then yes, maybe I would put readability ahead of excessive optimization.

This is just to know what is faster.
#10

Code:
if (a == 1 || a == 2 || a == 3 || a == 4 || a == 5)
Isn't it faster to do this:
Code:
switch (a)
{
	case 1, 2, 3, 4, 5:
	{
		// Do something
	}
}
#11

Quote:
Originally Posted by Vince
Boohoo, 100 nanoseconds difference. Readability should always take precedence over such extremely negligible improvements.
Quote:
Originally Posted by Jay_
Completely agree. This whole optimization thing is major overkill. I think it originated from Y_Less in that huge code optimizations thread that was later proven redundant when it was benchmarked.
Yes, no shit, the speed differences here are negligible, as I mentioned before. The point is that there is a difference. These tiny speed differences can very well matter in certain situations, though.

Quote:
Originally Posted by Darkwood17
Code:
if (a == 1 || a == 2 || a == 3 || a == 4 || a == 5)
Isn't it faster to do this:
Code:
switch (a)
{
	case 1, 2, 3, 4, 5:
	{
		// Do something
	}
}
As I said, this thread is to be continued. I've only put up some of my tests so far. I have lots more that I have already tested but haven't put in the thread yet. That includes if vs switch (one of the more extreme speed differences, totally different operations).

@PrO.GameR, function calls themselves are extremely slow compared to statements. That's obvious.
#12

Quote:
Originally Posted by Darkwood17
Code:
if (a == 1 || a == 2 || a == 3 || a == 4 || a == 5)
Isn't it faster to do this:
Code:
switch (a)
{
	case 1, 2, 3, 4, 5:
	{
		// Do something
	}
}
Which can also be written as:
pawn Code:
case 1 .. 5:
---

Some are minor but it's always good to know even if the difference is not even noticeable.
Anyway, one thing I mostly see in threads about scripting help is the declaration of variables inside loops.
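
A sketch of what that presumably refers to (my reading, not something measured in this thread): a variable declared inside the loop body gets its stack cell reserved, zeroed and released on every iteration, while one declared before the loop is set up once and reused. The variable names are made up for the example, and printf assumes a SA-MP environment.

```pawn
#include <a_samp>

main()
{
    new total;

    // Declared inside the loop body: the cell for tmpInside is
    // created and removed again on each pass.
    for (new i = 0; i < 100; i++)
    {
        new tmpInside;
        tmpInside = i * i;
        total += tmpInside;
    }

    // Declared once before the loop and reused on every pass.
    new tmpOutside;
    for (new i = 0; i < 100; i++)
    {
        tmpOutside = i * i;
        total += tmpOutside;
    }

    printf("total = %d", total);
}
```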
#13

Quote:
Originally Posted by Vince
Boohoo, 100 nanoseconds difference. Readability should always take precedence over such extremely negligible improvements.
Quote:
Originally Posted by Jay_
Completely agree. This whole optimization thing is major overkill. I think it originated from Y_Less in that huge code optimizations thread that was later proven redundant when it was benchmarked.
Coincidentally, all the optimizations mentioned in this topic at the moment improve readability.

Which one is ugly?
Code:
new Float:pos[3];
GetPlayerPos(0, pos[0], pos[1], pos[2]);
OR

Code:
new Float:X, Float:Y, Float:Z;
GetPlayerPos(0, X, Y, Z);
This is the most annoying way to declare variables. Of course, if one tells the compiler to stop compiling with debug information, the optimization would make no difference (assuming so, since the sole purpose of break is to help the internal debugger).
Code:
    new tmp1;
    new tmp2;
    new tmp3;
    new tmp4;
    new tmp5;
Code:
new tmp1, tmp2, tmp3, tmp4, tmp5;
Which is better?
Code:
    tmp1 = 17;
    tmp2 = 17;
    tmp3 = 17;
    tmp4 = 17;
    tmp5 = 17;

    tmp1 = tmp2 = tmp3 = tmp4 = tmp5 = 17;
and what is wrong with this one?
Code:
    if(tmp1 == 1)
        if(tmp2 == 1)
            if(tmp3 == 1)
                if(tmp4 == 1)
                    if(tmp5 == 1) 
                        {}

    if(tmp1 == 1 && tmp2 == 1 && tmp3 == 1 && tmp4 == 1 && tmp5 == 1) {}
These optimizations though tiny actually make the code more readable. There is no reason why someone shouldn't do these optimizations.

Maybe someday in future, I will make a neat topic which will list optimizations that really make a "big" improvement which are actually based on "smart ideas" rather than coding techniques.
#14

Quote:
Originally Posted by Vince
Boohoo, 100 nanoseconds difference. Readability should always take precedence over such extremely negligible improvements.
What's 100 nanoseconds? Actually a lot, considering that it means "twice as fast" in these cases. I did a lot of optimizations like this a while ago for a highly complex math job that couldn't be done in a plugin, and without saving some micros here and some hundred nanos there the function would have been way too slow to be used.
Sure, cases that really need that stuff are rare, but I don't see the problem in using these optimizations in places where it doesn't hurt the readability.
#15

Quote:
Originally Posted by Jay_
Completely agree. This whole optimization thing is major overkill. I think it originated from Y_Less in that huge code optimizations thread that was later proven redundant when it was benchmarked.
Well, he says himself that he completely agrees with that. I suppose people didn't realize the optimizations aren't needed in most cases, and should only be used if there are hot spots in your code (e.g. a loop that's run thousands of times).
#16

Quote:
Originally Posted by Mauzen
What's 100 nanoseconds? Actually a lot, considering that it means "twice as fast" in these cases.
It's the difference between light travelling 30 meters and 60 meters. You won't notice it.
#17

Quote:
Originally Posted by Macluawn
It's the difference between light travelling 30 meters and 60 meters. You won't notice it.
Except in certain cases you WILL notice it. Same applies to both light and these optimizations.
#18

This thread could be useful to me, as my code is a mess.
#19

Quote:
Originally Posted by Sreyas
This thread could be useful to me, as my code is a mess.
This thread is nothing about your mess. It's about tiny optimizations that many people will find extremely useful in many circumstances.

For your mess, you just need to learn how to write code. You clearly got the brunt of it down, just keep paying attention to how it's supposed to be done. Learn some other languages too, that will actually help a lot. Seeing how things are done in multiple languages will give you a great baseline style.
#20

Quote:
Originally Posted by Crayder
This thread is nothing about your mess. It's about tiny optimizations that many people will find extremely useful in many circumstances.

For your mess, you just need to learn how to write code. You clearly got the brunt of it down, just keep paying attention to how it's supposed to be done. Learn some other languages too, that will actually help a lot. Seeing how things are done in multiple languages will give you a great baseline style.
-_- That was not what I meant. For your info, I know how to script in C++ (you don't need to teach me that), Pawn (of course), SQL, a bit of Python and a little bit of Java. I just said my code is not pretty-printed (yeah, I meant that most of the things in this thread are not optimizations, they are just pretty-printing).