Anti Advertising without regex
#1

Hello everyone, today I want to start a community effort to build an anti-advertising function that detects adverts in strings without using regex.

I hope people will contribute.

I've started off with RyDer's stringContainsIP function as a base and modified it, but I would like help from you guys.

You can contribute here: https://github.com/Kar2k/samp-anti-advertising

Current features:

Actual IP detection (use of numbers ranging from 0-255) with port.
Whitespace skipping.
Detection works with other symbols such as (127_0_0_1:7777).

What are all the things to consider when creating this?

Future plans:

- Allow IP detection without a port (currently it only works if a port is used). DONE.
- Instead of considering every character in the string, only look at the numbers, check whether they are valid IP numbers (0-255) and use that to decide on detection. DONE.
- Ignore letters between numbers and dots / colons, e.g. 127aa.0.0zz.1.ff7777. DONE.
- Fixed separation option for the characters between the IP numbers, e.g. 127 $$$$ 0 $$$$ 0 $$$$ 1. Most advertisers that come into your server use a fixed set of characters between the numbers, in this case " $$$$ ". Adding an option to detect that will greatly decrease false positives. DONE.
- Website detection possibly (maybe not, too many false positives).
#2

Well, from my experience some people use commas in IPs, like
127,0,0,1 port 7777
#3

I've updated the detection to skip every number that can't be part of an IP (anything outside 0-255, including signed values).
Added stricter checks for valid IP numbers (numbers like 1-1.1-1.1-1.1-1 are no longer accepted).

Code:
[31/03/2016 10:02:50] 000.000.000.000:7777 - 1.
[31/03/2016 10:02:50] 0.0.0.0:7777 - 1.
[31/03/2016 10:02:50] 127.0.0.1:7777 - 1.
[31/03/2016 10:02:50] 127  .  0.  0.  1:  7777 - 1.
[31/03/2016 10:02:50] 255.255.255.255:7777 - 1.
[31/03/2016 10:02:50] PLS COME JOIN SERVER 37____187____22____119 - 1.
[31/03/2016 10:02:50] PLS COME JOIN SERVER 37 $$$$ 187 $$$$ 22 $$$$ 119 - 1.
[31/03/2016 10:02:50] 0000.000.000.0000:7777 - 0.
[31/03/2016 10:02:50] 255.256.255.255:7777 - 0.
[31/03/2016 10:02:50] -1.-1.-1.-1:7777 - 0.
[31/03/2016 10:02:50] 1-1.1-1.1-1.1-1:7777 - 0.
1 means it's a valid IP.
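
For anyone who wants to see the "valid IP number" rule from the log above in isolation, here is a minimal sketch. ParseOctet and next_pos are hypothetical names for illustration and are not taken from the include. It assumes a number only counts if it is 1 to 3 digits long and no greater than 255, which matches the 000 / 0000 / 256 results above. Signs such as "-1" are left to the caller.

```pawn
#include <a_samp>

// Hypothetical helper (not from the include): reads the run of digits that
// starts at text[pos] and stores the index just past that run in next_pos.
// Returns the value (0-255) if the run is 1 to 3 digits long and in range,
// otherwise -1.
stock ParseOctet(const text[], pos, &next_pos)
{
    new value = 0, digits = 0;

    while(text[pos] >= '0' && text[pos] <= '9')
    {
        // Only accumulate the first 3 digits so long runs cannot overflow.
        if(digits < 3)
        {
            value = (value * 10) + (text[pos] - '0');
        }

        digits ++;
        pos ++;
    }

    next_pos = pos;

    if(digits == 0 || digits > 3 || value > 255)
    {
        return -1;
    }
    return value;
}

main()
{
    new next_pos;

    printf("%d", ParseOctet("255.256.1.1", 0, next_pos)); // prints 255
    printf("%d", ParseOctet("255.256.1.1", 4, next_pos)); // prints -1 (256 is out of range)
    printf("%d", ParseOctet("000", 0, next_pos));         // prints 0
    printf("%d", ParseOctet("0000", 0, next_pos));        // prints -1 (more than 3 digits)
}
```

A detector built on this would call it at every digit it meets, continue from next_pos, and count four valid numbers in a row.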

FULL update at https://github.com/Kar2k/samp-anti-a...8d53a5b2dc5286
#4

Nice one
#5

pawn Code:
// ** INCLUDES

#include <a_samp>
#include <sscanf>

// ** DEFINES

// *** FUNCTIONS

#define strcpy(%0,%1,%2) strcat((%0[0] = '\0', %0), %1, %2)
#define isnull(%1) ((!(%1[0])) || (((%1[0]) == '\1') && (!(%1[1]))))

// ** MAIN

main()
{
    print("Loaded \"anti_advert.amx\".");

    new str[128];

    // Valid
    str = "000.000.000.000:7777";
    printf("%s - %d.", str, IsAdvertisement(str));
    str = "0.0.0.0:7777";
    printf("%s - %d.", str, IsAdvertisement(str));
    str = "255.255.255.255:7777";
    printf("%s - %d.", str, IsAdvertisement(str));
    str = "PLS COME JOIN SERVER 37____187____22____119";
    printf("%s - %d.", str, IsAdvertisement(str));
    str = "PLS COME JOIN SERVER 37 $$$$ 187 $$$$ 22 $$$$ 119";
    printf("%s - %d.", str, IsAdvertisement(str));

    // Invalid
    str = "0000.000.000.0000:7777";
    printf("%s - %d.", str, IsAdvertisement(str));
    str = "255.256.255.255:7777";
    printf("%s - %d.", str, IsAdvertisement(str));
    str = "-1.-1.-1.-1:7777";
    printf("%s - %d.", str, IsAdvertisement(str));
    str = "-32.-187.-22.-119 - NOTE: REMOVE THE -s AND JOIN NOW!";
    printf("%s - %d.", str, IsAdvertisement(str));
}

// ** CALLBACKS

public OnGameModeInit()
{  
    return 1;
}

public OnGameModeExit()
{
    return 1;
}

public OnPlayerText(playerid, text[])
{
    if(IsAdvertisement(text))
    {
        SendClientMessage(playerid, 0xFFFFFFFF, "{FF0000}[ERROR]: {FFFFFF}Your message contains an IP Address (strict detection).");
        return 0;
    }
    return 1;
}

// ** FUNCTIONS

forward bool:IsAdvertisement(text[]);
public bool:IsAdvertisement(text[])
{
    new message[128], extract[2], element[4][4], count_1, count_2, temp, bool:number_next = false, bool:next_number = false, bool:advert = false;
    strcpy(message, text, sizeof(message));

    for(new i = 0, j = strlen(message); i < j; i ++)
    {
        switch(message[i])
        {
            case '0'..'9':
            {
                if(next_number)
                {
                    continue;
                }

                number_next = false;

                strmid(extract, message[i], 0, 1);
                strcat(element[count_1], extract);
               
                count_2 ++;

                if(count_2 == 3 || message[i + 1] == EOS)
                {
                    strmid(extract, message[i + 1], 0, 1);

                    if(IsNumeric(extract))
                    {
                        element[0][0] = EOS;
                        element[1][0] = EOS;
                        element[2][0] = EOS;
                        element[3][0] = EOS;

                        count_1 = 0;
                        count_2 = 0;

                        next_number = true;
                        continue;
                    }

                    temp = strval(element[count_1]);

                    if(count_1 == 0)
                    {
                        if(temp <= 255)
                        {
                            count_1 ++;
                            count_2 = 0;
                        }
                        else
                        {
                            element[count_1][0] = EOS;

                            count_2 = 0;

                            next_number = true;
                        }
                    }
                    else
                    {
                        if(temp <= 255)
                        {
                            count_1 ++;
                            count_2 = 0;
                        }
                        else
                        {
                            element[0][0] = EOS;
                            element[1][0] = EOS;
                            element[2][0] = EOS;
                            element[3][0] = EOS;

                            count_1 = 0;
                            count_2 = 0;

                            next_number = true;
                        }
                    }
                }

                if(count_1 == 4)
                {
                    advert = true;
                    break;
                }
            }
            default:
            {
                next_number = false;

                if(number_next)
                {
                    continue;
                }

                if(!isnull(element[count_1]))
                {
                    temp = strval(element[count_1]);

                    if(count_1 == 0)
                    {
                        if(temp <= 255)
                        {
                            count_1 ++;
                            count_2 = 0;

                            number_next = true;
                        }
                        else
                        {
                            element[count_1][0] = EOS;

                            count_2 = 0;
                        }
                    }
                    else
                    {
                        if(temp <= 255)
                        {
                            count_1 ++;
                            count_2 = 0;

                            number_next = true;
                        }
                        else
                        {
                            element[0][0] = EOS;
                            element[1][0] = EOS;
                            element[2][0] = EOS;
                            element[3][0] = EOS;

                            count_1 = 0;
                            count_2 = 0;
                        }
                    }

                    if(count_1 == 4)
                    {
                        advert = true;
                        break;
                    }
                }
            }
        }
    }
    return advert;
}

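stock IsNumeric(const string[])
{
    // sscanf returns 0 on success, so this is true when the string parses as an integer.
    // The braces make it a "quiet" specifier, so no destination variable is needed.
    return !sscanf(string, "{d}");
}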
Results:
000.000.000.000:7777 - 1.
0.0.0.0:7777 - 1.
255.255.255.255:7777 - 1.
PLS COME JOIN SERVER 37____187____22____119 - 1.
PLS COME JOIN SERVER 37 $$$$ 187 $$$$ 22 $$$$ 119 - 1.
0000.000.000.0000:7777 - 0.
255.256.255.255:7777 - 0.
-1.-1.-1.-1:7777 - 1.
-32.-187.-22.-119 - NOTE: REMOVE THE -s AND JOIN NOW! - 1.

I'm happy with my results. But you may be wondering why the last 2 IPs are detected as valid IPs. Well, in my opinion it's for more security. If negative numbers were ignored, a player could send a message like "-32.-187.-22.-119 - NOTE: REMOVE THE -s AND JOIN NOW!" and it would go through. The player would win the battle and bring all your players onto their server.

Also, the function is way faster than regex-based systems, so speed isn't something to worry about.
#6

Cool.

I was going to add an option to allow or ignore negatives.

I'm also going to add fixed separation checks and an option to require a port.

EDIT: Added it, thanks!
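
In case it helps anyone following along, a "must have a port" option could look roughly like the sketch below. HasPort is a hypothetical name and this is not the code from the include. It assumes the port is written as a colon followed by 1 to 5 digits with optional spaces around the colon, so something like "port 7777" from post #2 would not count.

```pawn
#include <a_samp>

// Hypothetical helper (not from the include): returns true if a port such as
// ":7777" or ":  7777" starts at text[pos], where pos is the index right
// after the fourth IP number.
stock bool:HasPort(const text[], pos)
{
    new value = 0, digits = 0;

    // Skip spaces before the colon.
    while(text[pos] == ' ')
    {
        pos ++;
    }

    if(text[pos] != ':')
    {
        return false;
    }
    pos ++;

    // Skip spaces after the colon.
    while(text[pos] == ' ')
    {
        pos ++;
    }

    // Read at most 6 digits so the value cannot overflow.
    while(text[pos] >= '0' && text[pos] <= '9' && digits < 6)
    {
        value = (value * 10) + (text[pos] - '0');
        digits ++;
        pos ++;
    }
    return digits >= 1 && digits <= 5 && value <= 65535;
}

main()
{
    printf("%d", HasPort("127.0.0.1:7777", 9)); // prints 1
    printf("%d", HasPort("127.0.0.1", 9));      // prints 0
}
```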
#7

What is your argument against regex? Some benchmarks might be useful.
#8

Quote:
Originally Posted by Macluawn
What is your argument against regex? Some benchmarks might be useful.
I'm not sure, the regex plugins crash the server when weird characters are entered (for me at least). So even if I benchmark, I still won't be able to use it properly. That's why I'm creating this.
#9

Quote:
Originally Posted by Kar
I'm not sure, the regex plugins crash the server when weird characters are entered (for me at least). So even if I benchmark, I still won't be able to use it properly. That's why I'm creating this.
Only the one released in 2011 does this. This one's fine: https://sampforum.blast.hk/showthread.php?tid=526725
#10

Quote:
Originally Posted by Kar
I'm not sure, the regex plugins crash the server when weird characters are entered (for me at least). So even if I benchmark, I still won't be able to use it properly. That's why I'm creating this.
How about not going down npm's road, and actually fixing the problem in the plugin instead of making something new?
#11

There is now an option, enabled by default, to check for a fixed separation between the numbers.

This means 127 ....... 0. 1. 1. is not a valid IP but 127.0.1.1 is a valid IP.

If you have people chatting in your server, you should leave this on.

@Macluawn, well, that would be a great idea, but it's never bad to have a working Pawn solution, right?

@kvann, I did test that plugin. I have nothing to back up my claims right now, but there is a reason why I'm not using it. If it had been working 100% for me I would be using it. I really don't remember why I couldn't use it.
#12

Removed - Double post.
#13

I've tested regex, and I have not found any suitable solutions to find matches like "Join for FREE VIP ||| 255.255.255 .255:7777 .." or "PLS COME JOIN SERVER 37 $$$$ 187 $$$$$ 22 $$$$ 119" yet.

I've noticed a few complaints on my include about using fixed separation. Now that I've seen how it all works, I'm going to go with an option to set the max separation offset.

For example, with maxSeparationOffset = 3:

Code:
PLS COME JOIN SERVER 37 $$ 187 $$22 $ 19
would be detected but

Code:
PLS COME JOIN SERVER 37 $$$$$$ 187 $$ 22 $ 19
would not.
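
To make the idea concrete, here is a rough sketch of how such a check could work. It is not the include's code: SeparationWithinOffset is a hypothetical name, and it assumes "offset" means the difference between the longest and the shortest run of non-digit characters between the four numbers. It only measures the separators and does not check that the numbers are 0-255.

```pawn
#include <a_samp>

// Hypothetical helper (not from the include). text[start] must be the first
// digit of the first number. Returns true if four numbers are found and the
// three separators between them differ in length by at most max_offset.
stock bool:SeparationWithinOffset(const text[], start, max_offset)
{
    new pos = start, shortest = cellmax, longest = 0, length;

    for(new gap = 0; gap < 3; gap ++)
    {
        // Skip the current number.
        while(text[pos] >= '0' && text[pos] <= '9')
        {
            pos ++;
        }

        // Measure the separator that follows it.
        length = 0;
        while(text[pos] != EOS && !(text[pos] >= '0' && text[pos] <= '9'))
        {
            length ++;
            pos ++;
        }

        if(text[pos] == EOS)
        {
            return false; // ran out of text before the fourth number
        }

        if(length < shortest) shortest = length;
        if(length > longest) longest = length;
    }
    return (longest - shortest) <= max_offset;
}

main()
{
    // Separator lengths 4, 3 and 3: difference 1, within an offset of 3.
    printf("%d", SeparationWithinOffset("37 $$ 187 $$22 $ 19", 0, 3));      // prints 1
    // Separator lengths 8, 4 and 3: difference 5, outside an offset of 3.
    printf("%d", SeparationWithinOffset("37 $$$$$$ 187 $$ 22 $ 19", 0, 3)); // prints 0
}
```

Under the same assumption, passing 0 as max_offset would behave like the strict fixed separation check.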
#14

((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)([\s\.\$]+)){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\s*:\s*[0-9]+)?

[\s\.\$] is the list of separators. It will match ., $$$ and more.
#15

Nice, couldn't find expressions like that.
#16

Never really liked regex, thanks for this Kar.