[Include] FingerPrint - new sscanf2 custom specifiers
#1

Ok, here they are, some custom specifiers for the sscanf 2.5 plugin!

[Why] another way to parse playerids?
I often see players whose names start with non-alphanumeric characters, like ".:Babul:." or "IlIlIcantItypeIth1s"...
Searching for "bab" with the "u" specifier fails to return the id of player ".:Babul:.", because ".:Bab" doesn't equal "Bab": the comparison already fails at the very first character.
A little routine that compares each player's name to the input through a custom specifier can change this behavior (the strcmp-like comparison that makes the "u" specifier fail).

A sscanf custom specifier takes a string and returns a number, which is ideal for scripting a "u" replacement: the "k<pla>" specifier is born!
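
To illustrate the "takes a string, returns a number" idea, here is a minimal sketch of a custom specifier callback. It is not the code from fingerprint.inc: the specifier name "sub" is made up for this example, and it does a plain case-insensitive substring search instead of a fingerprint comparison.

```pawn
#include <a_samp>
#include <sscanf2>

// Hypothetical specifier, used as "k<sub>". Not part of fingerprint.inc.
// Returns the first connected player whose name contains the input anywhere.
SSCANF:sub(string[])
{
	if (string[0] == '\0') return INVALID_PLAYER_ID; // nothing to search for

	new name[MAX_PLAYER_NAME];
	for (new i = 0; i < MAX_PLAYERS; i++)
	{
		if (!IsPlayerConnected(i)) continue;
		GetPlayerName(i, name, sizeof(name));
		// strfind returns -1 if the input is not found; "true" ignores case
		if (strfind(name, string, true) != -1) return i;
	}
	return INVALID_PLAYER_ID;
}
```

With this in place, sscanf(params,"k<sub>",player) would already find ".:Babul:." for the input "bab", but unlike the fingerprint approach it cannot handle typos or swapped characters.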

[Requirements] to use the new specifiers
-sscanf2.5 update 7 or later
-zcmd

[How] to install
-copy the file "fingerprint.inc" into "pawno/include/"
-add this line to the top section of your script:
pawn Code:
#include <fingerprint>
Now that the functions are available, some variables are still needed for the specifier callbacks to work. I suggest you open the filterscript "FingerPrint.pwn" and have a look at the top, where these variables are declared.

Here's a quick explanation of how to replace the old "u" specifier in a command:
pawn Code:
sscanf(params,"u",player)
becomes
pawn Code:
sscanf(params,"k<pla>",player)
This method solves problems like swapped or missing characters making a name search fail, but it also acts weird when it isn't fed the right data (a long input string is vital for searching). If 3 players with similar fingerprints (derived from "Babul", "Balu", "Blubba") are compared, the callback returns the first player matching the criteria, for example when searching for player "BL". By the way, this behavior depends on the scripting, so feel free to change it.
A long string means a "bigger" fingerprint (more bits set for more characters), because names (mostly) contain different characters (hence the cross sum of all bits).
If players with short names are online, a search for "." or "x" could even make the <pla> specifier return a totally wrong playerid: imagine player "abc" with a fingerprint of only 3 bits set. A fingerprint for "x" differs from the one for "abc" in 4 bits (the 3 bits of "abc" are missing, and the bit of "x" is extra).
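
The "4 differences" between "abc" and "x" can be reproduced with a small sketch. The function names are made up for this example, and the bit layout (one bit per letter, 'a' on bit 0) is an assumption; the include may count the differences in another way, e.g. with its lookup table.

```pawn
#include <a_samp>

// Counts the set bits of a cell (the "cross sum" of its bits).
stock CountSetBits(value)
{
	new count = 0;
	while (value != 0)
	{
		value &= value - 1; // clears the lowest set bit
		count++;
	}
	return count;
}

// Number of characters that appear in exactly one of the two fingerprints.
stock FingerPrintDifference(fpA, fpB)
{
	return CountSetBits(fpA ^ fpB);
}

main()
{
	// Assumed layout: 'a' = bit 0, 'b' = bit 1, 'c' = bit 2, 'x' = bit 23
	new abc = (1 << 0) | (1 << 1) | (1 << 2);
	new x = 1 << 23;
	printf("difference: %d", FingerPrintDifference(abc, x)); // difference: 4
}
```

A search only counts as a hit if this difference stays within "FingerPrintTolerance", which is why a high tolerance lets "x" match "abc".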

possible ways to avoid this weird behavior:
> lower the "FingerPrintTolerance" value to something between 1 and 4, to allow just a few "simple" typos. stupid idea imo
> allow playernames with 8 chars+ only (for search-commands AND registering)
> add a second, additive way to calculate the fingerprints, like delta coding

Since the vehicle names array got added (slightly modified) to the include, any vehicle can be spawned the same way players are looked up, using the k<veh> specifier. Try to spawn a vehicle using your nick! For me, it's
Code:
/veh babul
a buffalo. Anagrams are considered as well:
Code:
/veh maick
spawns a maverick, or:
Code:
/veh 09rfc
spawns an FCR-900. Pretty obvious, isn't it?
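
A /veh command along these lines could look roughly like the sketch below. It assumes that k<veh> hands back a vehicle model id (400 to 611) and that the variables from the top of "FingerPrint.pwn" are declared; the messages, colors and spawn offset are just placeholders.

```pawn
#include <a_samp>
#include <sscanf2>
#include <zcmd>
#include <fingerprint>

CMD:veh(playerid, params[])
{
	new model;
	if (sscanf(params, "k<veh>", model))
		return SendClientMessage(playerid, 0xFF0000FF, "Usage: /veh [name or id]");

	// Assumption: the specifier returns a model id; reject anything else
	if (!(400 <= model <= 611))
		return SendClientMessage(playerid, 0xFF0000FF, "No matching vehicle found.");

	new Float:x, Float:y, Float:z, Float:angle;
	GetPlayerPos(playerid, x, y, z);
	GetPlayerFacingAngle(playerid, angle);
	// random colors (-1, -1), never respawn (-1), spawned slightly beside the player
	CreateVehicle(model, x + 2.0, y, z, angle, -1, -1, -1);
	return 1;
}
```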

[Specifiers] in short:
Code:
specifier	takes				returns

k<pla>		playername/id			playerid	(fingerprint)
k<veh>		vehiclename/id			vehicleid	(fingerprint)
k<wep>		weaponname			weaponid	(fingerprint)
k<clp>		k<pla>				playerid, closest to k<pla>
k<clv>		k<pla>				vehicleid, closest to k<pla>
Each specifier marked with "fingerprint" calculates a value (the binary fingerprint) that represents which characters appear in the string (name), packed into 32 bits.
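
As a rough sketch of how such a fingerprint can be built: the function below is a hypothetical stand-in, not the FingerPrint() from the include, and its bit layout (letters on bits 0 to 25, all digits sharing bit 26, everything else sharing bit 27) is an assumption made for this example.

```pawn
#include <a_samp>

// Hypothetical stand-in for the include's FingerPrint(); the bit layout is assumed.
stock FingerPrintSketch(const name[])
{
	new fp = 0;
	for (new i = 0; name[i] != '\0'; i++)
	{
		new c = tolower(name[i]);              // uppercase is ignored
		if ('a' <= c <= 'z') fp |= 1 << (c - 'a'); // bits 0..25: one bit per letter
		else if ('0' <= c <= '9') fp |= 1 << 26;   // bit 26: any digit
		else fp |= 1 << 27;                        // bit 27: any other character
	}
	return fp;
}
```

Since every character only switches its bit on, the order and the number of repetitions get lost: FingerPrintSketch("Babul") and FingerPrintSketch("bulba") return the same value, which is what makes anagram searches work.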

Two specifiers that don't really fit in here are <clp> and <clv>, which return the closest player/vehicle id.
The /wep command, which uses the k<wep> specifier, explains itself:
Code:
/wep rocker
gives you a rockeT...

[Download] <<< on the left

Did I miss something? Does this script solve at least one of your "player not found" or "invalid player id" problems? Got some constructive criticism? Wanna make me cry? Then let me know ^^
#2

argh, you found the weakest spot of the algorithm in 1 hour? GRRR D:<
Hehe, now seriously: 32 bits storing 26 chars, ignoring uppercase, is ok I guess, but that leaves only a few bits to store the chars 0-9 plus .,:;-_!§$ etc. One more reason to build some sort of 64-bit representation, maybe using 2 cells? Somewhere I saw a "fake" 64-bit implementation, I guess it's time to look it up sometime, but 2 cells will do as well..

Concerning your hooking system: I have read the 2 topics about y_hooks (the first one first), and in the end I only had to edit 2 lines, for the public callbacks:
Code:
hook OnPlayerConnect(playerid){
	new name[MAX_PLAYER_NAME];
	GetPlayerName(playerid,name,sizeof(name));
	PlayerName[playerid]=name;
	PlayerFP[playerid]=FingerPrint(name);
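	// bit count via two 16-bit table lookups; >>> (logical shift) keeps the index non-negative if bit 31 is set
	PlayerFPC[playerid]=WordBinaryCrossSum[PlayerFP[playerid]>>>16]+WordBinaryCrossSum[PlayerFP[playerid]&0xffff];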
	return 1;
}

hook OnPlayerDisconnect(playerid){
	PlayerFP[playerid]=0;
	PlayerFPC[playerid]=0;
	return 1;
}
... and THIS is enough to implement it? What does "at the correct time" mean for the hooked callbacks being called? Are regular publics like OnPlayerConnect() called in a different way or at a different time? I'm not sure what to ask, because I don't really understand what difference it makes ><
So far it compiled fine with the hooking method. Reading the docs again and again until I GET the concept will help me understand it, but that takes time, and I don't feel motivated to learn about hooking without knowing its benefits. Time will tell, maybe searching for a tutorial will do..
Ok, let's concentrate on the bad case for now, the short player names. You spotted that pretty fast, I didn't expect it THIS soon ^^
As you mentioned, these names:
Code:
"456"
"123"
"789"
all result in the SAME fingerprint, so the first player found will be "456", even if you searched for "123" or only "xxx".

Problem: names that are too short aren't recognized properly, and the comparisons fail epically.

Maybe-solution: add delta coding, where "12345" produces "11111" and "123589" produces "111231". Only the first character and the differences between neighbors are stored.
Using this method, "123" = "111", where "456" = "411" and "789" = "711".
That works well as long as the first character is typed properly, but a typo there could cause more trouble.
For repeating letters, like
Code:
****** = {89,95,76,101,115,115} = deltacoded {89,6,-19,25,14,0}
eeeee = {101,101,101,101,101} = deltacoded {101,0,0,0,0}
..and searching
eeees = {101,101,101,101,115} = deltacoded {101,0,0,0,14}
would compare
Code:
101,0,0,0,14 (eeees)
with both
Code:
89,6,-19,25,14,0 (******)
101,0,0,0,0 (eeeee)
and should decide that "eeeee" is closer to "eeees". At least it looks closer, given the 3 zeros in the middle plus the matching 101. Only 1 position differs (14 versus 0), which looks good already.
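
The delta coding idea can be sketched like this. Both function names are made up for this example, and the comparison (counting the positions that differ, plus the length difference) is just one simple way to score "closeness"; nothing of this exists in the include yet.

```pawn
#include <a_samp>

// Stores src[0], followed by the differences between neighboring characters.
// Returns the number of values written to dest.
stock DeltaEncode(const src[], dest[], size = sizeof(dest))
{
	new len = 0;
	for (new i = 0; src[i] != '\0' && len < size; i++)
	{
		dest[len++] = (i == 0) ? (src[0]) : (src[i] - src[i - 1]);
	}
	return len;
}

// Counts the positions where two delta-coded names differ;
// every surplus value of the longer name counts as a difference, too.
stock DeltaMismatches(const a[], lenA, const b[], lenB)
{
	new shorter = (lenA < lenB) ? lenA : lenB;
	new longer = (lenA < lenB) ? lenB : lenA;
	new count = longer - shorter;
	for (new i = 0; i < shorter; i++)
	{
		if (a[i] != b[i]) count++;
	}
	return count;
}

main()
{
	new search[MAX_PLAYER_NAME], candidate[MAX_PLAYER_NAME];
	new lenS = DeltaEncode("eeees", search);    // {101,0,0,0,14}
	new lenC = DeltaEncode("eeeee", candidate); // {101,0,0,0,0}
	printf("mismatches: %d", DeltaMismatches(search, lenS, candidate, lenC)); // mismatches: 1
}
```

Comparing "eeees" against {89,6,-19,25,14,0} the same way gives 5 (four differing positions plus one surplus value), so "eeeee" wins clearly.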

For short player names there will be no scriptable solution except forcing players not to use names shorter than 4 chars. Those are rare, and maybe the idea above helps to avoid falsely returned player ids. I bet it's worth a try.

Maybe the upcoming feedback from testers will help to find a good solution. The rank #1 glitch got revealed by ****** already, but keep searching :P
#3

Oh, I didn't make that clear. In fact, it's hidden in the .inc file:
Quote:

new FingerPrintTolerance=26;//the whole alphabet as tolerance is a bad idea, it will cause weird behavior at short names (when "abc" matches "xyz" with a tolerance of 6, hence the 6 differences)

...I will edit the first post to make sure everybody understands that this script can cause more trouble than benefit if the default tolerance of 26 is not lowered...
The repeating chars issue is in the nature of the script, unfortunately: each character is represented by 1 bit only, so there is no room for multiples. The delta coding variant could help to avoid this, I hope.

From what I understand of the hooking explanation, it's like an abstracted interface to access callbacks? Sounds interesting indeed. It's implemented already, but I'm not actually taking advantage of it in the script (yet)...

>to be edited maybe<
#4

very nice
#5

Awesome
#6

nice work
#7

Nice release.
#8

Awesome