[Tutorial] Using BlueG's MySQL plugin R7 (with cache)
#1

Hey!

Since BlueG's MySQL plugin was updated to version R7 in the beginning of February (2012), there have been a lot of people who haven't been able to understand how to fully thread their queries or how to use the new cache functions. I have received 2 requests for a tutorial on this subject, so I decided to come up with one to (hopefully) help everyone understand how to thread queries and use the cache functionality.

Plugin version R7
* Dropped support for unthreaded queries.
* Added cache functionality with functions cache_get_row, cache_get_field, cache_get_data, cache_get_field_content.
* Switching over to using mysql_function_query, which unlike mysql_query_callback from R6-2, allows scripters to pass custom variables as parameters to the callback.
* Bugfixes, improvements, etc. that this tutorial does not cover.

Plugin version R8 and newer
BlueG has decided to use a public repository for the development of this plugin. This means that others can commit code as well, and some already have. Here are some of the changes:
* Added cache_get_row_int and cache_get_row_float (supported briefly in this tutorial, but I'll continue it some time).
* More security added to internal string handling to prevent buffer overflows.
* Fixed crash issue on gamemode restart (gmx).
* A few more internal issues cleared up.

Benefits of using the cache
* It is faster! Using cache functions instead of mysql_fetch_row is notably faster. For JernejL, loading 1300 vehicles from the database went from 6 seconds to 60 milliseconds (source) and from my personal experience, I can tell that loading 500 houses from a database is now at least 10-20 times faster (source). Here are some more results.
* It does not require the coder to store/free the query result! There's no need to call mysql_store_result() to store the result or mysql_free_result() to free the result after done processing it. Specifically the result freeing has proven itself to be a source of problems, specially for newer scripters.
* It is easier to read/write/modify! (depending on the programmer's own preference)
To continue with, parsing the query result has become simpler compared to several older methods like sscanf.
pawn Code:
sscanf(data, "p<|>s[20]s[20]s[20]", string1, string2, string3);
// Becomes
cache_get_row(0, 0, string1);
cache_get_row(0, 1, string2);
cache_get_row(0, 2, string3);
// or
cache_get_field_content(0, "fieldname1", string1);
cache_get_field_content(0, "fieldname2", string2);
cache_get_field_content(0, "fieldname3", string3);
Functions
Before continuing with tips and examples for converting from older plugin versions to the newer format, lets go over the new functions that version R7 of the plugin introduced.

mysql_function_query(connectionHandle, query[], bool:cache, callback[], format[], {Float,_}:... )
* Replaces mysql_query and mysql_query_callback from older versions.
* Allows scripter to pass extra parameters to the callback with the format[] and following parameters (behavior similar to CallRemoteFunction)
pawn Code:
mysql_function_query(dbHandle, "SELECT ... FROM ...", true, "OnQueryFinished", "siii", "Andre", 1, 2, 3);
forward OnQueryFinished(name[], number_1, number_2, number_3);
public OnQueryFinished(name[], number_1, number_2, number_3)
{
    printf("Data: %s %i %i %i", name, number_1, number_2, number_3);
    // will print: Andre 1 2 3
}
As you can see in the mysql_function_query usage, "siii" stands for "string integer integer integer", which is respective to the data of "Andre 1 2 3", which is originally passed as separate parameters. This very effectively removes the need to use extra variables, PVars or GVars for this purpose. All parameters can be passed to the query function itself (keep it below 20 parameters though).

* Allows scripter to decide whether to use caching or not. The third parameter of the function decides whether the query should take advantage of the new cache functions. If this is set to true, the plugin will populate a vector with the raw data received as the result of the query. Whenever you run a SELECT query, this should be set to true (read above for the benefits of using cache functions).
pawn Code:
// SELECT queries (that return a result set)
mysql_function_query(dbHandle, "SELECT * FROM players", true, "OnPlayerListLoad", "");
// We set caching to true and set the callback to "OnPlayerListLoad" - this callback will be called without parameters (the empty quotes specify that) when the query finishes.

// Other (UPDATE, etc) queries
mysql_function_query(dbHandle, "UPDATE players SET kills = 10 WHERE id = 1", false, "", "");
// We leave the callback and parameters parameters empty and run a simple UPDATE query.
cache_get_data(&num_rows, &num_fields, connectionHandle = 1)
This function is the first that should be called in most cases after running a query with caching set to true. This, of course, unless you're 100% sure that you already know the amount of rows/fields that your query returns.
A very simple example of this:
pawn Code:
mysql_function_query(dbHandle, "SELECT * FROM players WHERE name = 'Andre' LIMIT 0,1", true, "OnPlayerDataLoad", "s", "Andre");
forward OnPlayerDataLoad(name[]);
public OnPlayerDataLoad(name[])
{
    new rows, fields;
    cache_get_data(rows, fields);
    if(!rows)
    {
        print(!"Andre is not a registered account on this server!");
    }
    else
    {
        printf("Andre is registered (1 row with %d fields)", fields);
    }
}
As you can see, it returns the number of rows and fields that the query returned.

cache_get_row(row, idx, dest[], connectionHandle = 1)
This is the most basic function for retrieving all sorts of data. We will be using this the most. The syntax is really simple and as I mentioned before, probably easier to understand in a large block of code (specially when the query returns a lot of fields) than sscanf parsing perhaps. What this function does is take the row index and the field index and fetches its data to the destination string (dest[]) that the scripter specified.
This example is a continuation of the example for cache_get_data. We assume that our table has the following syntax:
1. ID (unique ID, AUTO_INCREMENT) - field index 0
2. name - field index 1
3. level - field index 2
pawn Code:
printf("Andre is registerd (1 row with %d fields)", fields);
new temp[12];
cache_get_row(0, 0, temp);
pritnf("Andre's ID is %d", strval(temp));
cache_get_row(0, 2, temp);
printf("Andre's level is %d", strval(temp));
Since cache_get_row takes a string as the third parameter (destination), we will have to use a temporary placeholder for all sorts of numeric values that we want to store as integers later. An example of this would be:
pawn Code:
cache_get_row(0, 1, temp);
PlayerInfo[playerid][pLevel] = strval(temp);
It should be understood for scripters that indexes in programming start from 0. This means that the first database field would actually be considered as field with index 0. The second field is a field with index 1 and so on...

cache_get_row_int(row, idx, connectionHandle = 1)
Not supported in R7. See this repository.
New updates by cool guys like udan have brought the function I introduced here back in 2012 to the main plugin code. This is extremely useful as it requires the scripter to write less code and is faster as well.
pawn Code:
// Old code
new temp[12];
cache_get_row(0, 1, temp);
PlayerInfo[playerid][pMoney] = strval(temp);
pawn Code:
// New code
PlayerInfo[playerid][pMoney] = cache_get_row_int(0, 1);
Large tests showed that the new function executes 1.5 times faster than using the combination of cache_get_row() and strval().

Float:cache_get_row_float(row, idx, connectionHandle = 1)
Not supported in R7. See this repository.
See this post for the code I wrote back in 2012 and some speed tests.

This function should make you happy if you work with floats a lot. And handling of float values is somewhat slow in PAWN, but a lot faster in C++, so if some of it can be done internally, it is a win-win situation. Similarly to the cache_get_row_int implementation, this function helps you write shorter code that executes faster.
pawn Code:
// Old code
new temp[12];
cache_get_row(0, 10, temp);
PlayerInfo[playerid][pSpeedMultiplier] = floatstr(temp);
pawn Code:
// New code
PlayerInfo[playerid][pSpeedMultiplier] = cache_get_row_float(0, 10);
The new code is ~2 times faster than using cache_get_row() with floatstr(). See the link above for speed tests.

cache_get_field(field_index, dest[], connectionHandle = 1)
The plugin also stores the names of the fields that a query retrieves. I cannot think of an example from my own code where this could become useful, but when you run a query like SELECT * FROM ... with an asterisk that selects all fields, there can be some cases where you need to retrieve the field's name. This function will help you in this case:
pawn Code:
public OnQueryExecute() // whatever "SELECT * FROM ..." query
{
    new fieldname[32];
    cache_get_field(0, fieldname);
    printf("The name of field 0 is '%s'", fieldname);
    return 1;
}
A little bit more complicated example of combining cache_get_data and cache_get_field:
pawn Code:
public OnQueryExecute()
{
    new fieldname[32], fields, rows;
    // retrieve the amount of rows and fields (we'll take advantage of the field count)
    cache_get_data(rows, fields);
    // loop through all fields
    for(new i = 0; i != fields; i++)
    {
        cache_get_field(i, fieldname);
        printf("The name of field %d is '%s'", i, fieldname);
    }
    return 1;
}
cache_get_field_content(row, const field_name[], dest[], connectionHandle = 1)
This is a function that I don't suggest people to use unless you're handling a case where the field index is unknown. But in most cases, the scripters already know the field index! An example of this would be:
pawn Code:
mysql_function_query(dbHandle, "SELECT name,level,kills,deaths FROM players WHERE id = 1", true, "OnPlayerDataLoad", "");
// indexes: name = 0, level = 1, kills = 1, deaths = 2
In this example, you already know the indexes, so there's nothing wrong with using cache_get_row!
But I'm lazy and I want to do this? Sure, why not, the speed difference isn't notable at all. I don't know if I'm a freak if it comes to cases like this, but when I know that the plugin has no need to loop through the names of all fields, it will be at least a little bit faster.

An example of this function, anyways: (based on the example I gave for cache_get_row)
pawn Code:
printf("Andre is registerd (1 row with %d fields)", fields);
new temp[12];
cache_get_field_content(0, "level", temp);
printf("Andre's level is %d", strval(temp));
Converting
Since unthreaded queries are totally unsupported in the new version of the plugin, we should first cover how to convert unthreaded queries into threaded ones. And this time using the new querying function and cache compatibility. This is a most basic example of an old unthreaded query:
pawn Code:
format(query, sizeof(query), "SELECT name,level,kills,deaths FROM players WHERE name = %s LIMIT 0,1", PlayerName);
mysql_query(query);
mysql_store_result();
if(mysql_num_rows())
{
    mysql_fetch_row(data);
    sscanf(data, "p<|>s[24]iii", name, level[playerid], kills[playerid], deaths[playerid]);
    SendClientMessage(playerid, 0xFFFFFFFF, "I have received your data!");
}
mysql_free_result();
Queries like the one above were not recommended before (mixing threaded and unthreaded queries also sometimes had undefined results) and they are totally unsupported now. The new behavior is that when a query finishes, the callback specified in the mysql_function_query line is called. This also means that if we are retrieving data about a player, we will need to pass (send) the player ID as a parameter to the callback. When the query finishes, we use the cache functions described above to store the information we're querying for. This is easy, just watch this example with some handy comments included.
pawn Code:
format(query, sizeof(query), "SELECT name,level,kills,deaths FROM players WHERE name = '%s' LIMIT 0,1", PlayerName);
mysql_function_query(dbHandle, query, true, "OnPlayerDataLoad", "i", playerid);

// somewhere else in your script. In the main scope...
forward OnPlayerDataLoad(playerid);
public OnPlayerDataLoad(playerid)
{
    // Get the amount of rows (0 or 1 due to the LIMIT clause) and fields returned by the query.
    new rows, fields;
    cache_get_data(rows, fields);
    if(rows)
    {
        cache_get_row(0, 0, name); // Store the data from the name field into the name string.
        new temp[12]; // Create a temporary string to store the numeric values we will not retrieve.
        cache_get_row(0, 1, temp); // Store the string representation of the player's level in the temporary string.
        level[playerid] = strval(temp); // Store the numeric representation of the player's level in the right array.
        // The same comments go for the next 2 fields (kills and deaths).
        cache_get_row(0, 2, temp), kills[playerid] = strval(temp);
        cache_get_row(0, 3, temp), deaths[playerid] = strval(temp);
    }
    return 1;
    // Returning 1 to whatever callback that uses the cache functionality will clear out the cache in the plugin.
    // Note that no usage of mysql_free_result is required.
}
A huge benefit from threading is that your server no longer hangs during the execution of the query that is ran through mysql_function_query. This means that while the query is running (and larger queries can take up to even a few seconds to execute), the server will be able to do other things in the background (synchronize player data, call other callbacks, functions, etc). Due to a feature of R7, you no longer need to worry about freeing data and memory leaks: whenever you return 1 to the callback that is ran by the plugin, the vectors containing the cached field data will be cleared automatically.


Useful: protecting your server against race condition attack (thanks to JernejL)
Useful: Multiple approaches to selecting data from MySQL and then parsing it (with and without field names)
Useful: A basic gamemode using MySQL R7 plugin and caching

Thank you for reading. This tutorial was mostly written very late in the evening and every suggestion is welcome. If you have any requests, questions or would like to know more about a certain aspect, don't hesitate to post.

16/03/2013: I added some information about the newer plugin versions and will continue updating this tutorial.
Reply
#2

Very useful !!
Reply
#3

Oh so this is what cache is xD
Pretty cool Tutorial Thanks!
Reply
#4

What about sscanf? Are cache functions faster than sscanf?
Reply
#5

[Vector]:

I'm going to assume faster. As you use sscanf in conjunction with mysql_fetch_row. Whereas using these cache functions, you don't need to use either afaik.

Quote:

Benefits of using the cache
Using cache functions instead of mysql_fetch_row is notably faster. For JernejL, loading 1300 vehicles from the database went from 6 seconds to 60 milliseconds (source) and from my personal experience, I can tell that loading 500 houses from a database is now at least 10-20 times faster (source). [...]

AndreT:

Thanks for the topic, this has helped me understand it better, not having Internet for two months up until 14 days ago made me a bit rusty
Reply
#6

Quote:
Originally Posted by [Vector]
View Post
What about sscanf? Are cache functions faster than sscanf?
Exactly as Burridge said. A call to sscanf to parse the line is faster than using cache_get_row the amount of times necessary, but the sscanf-solution also requires mysql_fetch_row, which takes an extra array (string) to fetch the data into first and then parse from it. With caching, the plugin stores the data internally in a vector and for some reason, such code is faster.

I have only ran tests with the house loading code, and here are some results (500 rows)...
With mysql_fetch_row, no caching
1. mysql_fetch_row calls took 3065 milliseconds
2. sscanf (parsing) took 27ms in total (500 calls to function)
3. the rest of the house loading code took 129ms (irrelevant)
Total: 3221ms

With caching
1. cache functions (loading and parsing) took 166ms
2. the rest of the house loading code took 108ms (irrelevant)
Total: 274ms

From this, we can tell that using the caching functions is about 12 times faster in such example (with loops and huge rows of data being returned).

Burridge, Niko_boy and Ricop522, thanks for the positive feedback. If there's anything you feel that should be added to the tutorial, please let me know!
Reply
#7

This would be pointless if you have under 20-30 columns, thus I'm not converting YET.

Nice job!
Reply
#8

Thanks for your answer, I gotta change all my GM because it's made almost entirely in MySQL with the old functions of R6 but with threaded queries. By the way, I'm gonna test it by myself and post my speed results.
Reply
#9

Nothing is forcing people to update to the new "method", as threaded queries from R6/R6-2 work perfectly. But there are a few cases when scripters should:
1. When they want to organize their code structure better (example: move minigame query results to the minigame block of the code instead of having it in OnQueryFinish),
2. When they need to pass extra parameters to the query finishing callback and the playerid and extraid parameter of OnQueryFinish won't do (or even worse if you use PVars, GVars or even arrays to store this information),
3. When dealing with rather large sets of data and/or more than 1 row of data (in such cases, the caching is notably faster as mentioned in the previous posts).

Any more questions/ideas/suggestions/requests, let me know!
Reply
#10

If I don't want a callback to be called, for example an INSERT query for registration, what do I put as the callback and stuff in mysql_function_query?

I found this in a_mysql.inc

pawn Code:
#define mysql_query(%1,%2,%3,%4) \
    mysql_function_query(%4, %1, false, "OnQueryFinish", "siii", %1, %2, %3, %4)
But what is the syntax..?
Reply
#11

Never understood the R7 version, is annoying for the update.
Reply
#12

R7 is actually a good update and it is nice to see people write new modes using it or convert old modes to use it.
Quote:
Originally Posted by MP2
If I don't want a callback to be called, for example an INSERT query for registration, what do I put as the callback and stuff in mysql_function_query?
Simply leave the callback and specifiers string empty, pass them as "", since mysql_function_query does not have these as optional parameters. There's an example for this in the first post:
Quote:
Originally Posted by AndreT
pawn Code:
// Other (UPDATE, etc) queries
mysql_function_query(1, "UPDATE players SET kills = 10 WHERE id = 1", false, "", "");
// We leave the callback and parameters parameters empty and run a simple UPDATE query.
INSERT query behaves the same way.

And also, a small question/suggestion - why don't you use AUTO INCREMENT for your users? This would give each of your users an unique database ID that you could use in various cases, but it has the best use when you have other tables, for example `houses` where the owner of your house can simply be set as the unique value of player.

Quote:

I found this in a_mysql.inc

pawn Code:
#define mysql_query(%1,%2,%3,%4) \
    mysql_function_query(%4, %1, false, "OnQueryFinish", "siii", %1, %2, %3, %4)
But what is the syntax..?

This is purely to maintain backwards compatibility for scripters that still use the OnQueryFinish callback. It has a predefined set of variables that are sent to OnQueryFinish and this already gives us 2 reasons why not to use it:
1. It is one callback only. This does not provide possibilities for one to divide their code between numerous files or various sections of file without creating a link in OnQueryFinish.
2. It cannot pass extra variables to the callback. You only have the player ID and an "extra ID" in this callback as far as I know.

As for the syntax, however, look up mysql_query behavior in R6-2 - it is similar and produces similar results. However I seriously tell you not to use it since you are writing a new mode as I saw from another post of yours. The new capabilities provide more flexibility.

Quote:
Originally Posted by kikito
Never understood the R7 version, is annoying for the update.
Read the tutorial and ask for help
Reply
#13

Ah I see, thanks for that.

So I am right to just add this?

pawn Code:
#define mysql_query(%0) mysql_function_query(1, %0, false, "", "")
and that will thread?

EDIT: It fails with SELECT queries, mysql_store_result says there is no data to store.
Reply
#14

For queries which do not return anything, yes. However I would suggest changing the name a little bit so it is less generic and its functionality is explained a little more: mysql_query_nodata for example.
Reply
#15

So if I do this

pawn Code:
// SELECT queries (that return a result set)
mysql_function_query(1, "SELECT * FROM players WHERE Name = 'MP2'", true, "OnPlayerLogin", "");
How do I then put the player's 'Score' in to a variable? I know how to do it with RC6, I'm just confused with all this callback stuff. Do I need to use mysql_store_result or what?!
Reply
#16

No need for mysql_store_result. Since you set the third parameter (cache) to true, we will use only the cache_* functions inside OnPlayerLogin.
Yet there's another thing you need to do: pass the playerid to OnPlayerLogin.
pawn Code:
mysql_function_query(1, "SELECT * FROM players WHERE Name = 'MP2'", true, "OnPlayerLogin", "i", playerid);
Then you need to know the field index of field 'Score'. Then use cache_get_row.
Reply
#17

Do you have to use indexes? Can't you use the field name? I may add new fields in the future and don't feel like editing my hundreds of indexes.
Reply
#18

You're right by your assumption, field names can be used and this might be a simpler approach, however many ways are possible.

One: knowing what you query for and thereby knowing field indexes.
Adding a new field that you query for does not require you to modify hundreds of indexes in your code, what you can do is use a query which specifies all the field names that you are loading. For example:
pawn Code:
mysql_function_query(dbHandle, "SELECT level,kills,deaths,online_time,vehicle FROM players WHERE name = 'MP2' LIMIT 0,1", true, "OnPlayerDataLoad", "i", playerid);
// Field indexes: level = 0, kills = 1, deaths = 2, online_time = 3, vehicle = 4
Adding a new field to query for is simple, the new index will be 5 in a case where you append the new field name...
pawn Code:
mysql_function_query(dbHandle, "SELECT level,kills,deaths,online_time,vehicle,health FROM players WHERE name = 'MP2' LIMIT 0,1", true, "OnPlayerDataLoad", "i", playerid);
Field `health` has an index of 5, this means that in your callback, doing:
pawn Code:
cache_get_row(0, 5, temp), health = strval(temp); // yeah, I did not make it a float
This code above assumes you have only one result and the query also uses the limit clause to make sure of this. So you're getting the 6th field of 1st row that the query returned and then storing it in a variable called health.

So what's the advantage to this behavior? You can easily do on-the-run development and modifications to your database!
Lets say you have such query:
SELECT * FROM players WHERE name = 'MP2' LIMIT 0,1
What does this mean? All fields in the table are queried for and will be returned in whatever callback you choose to use. If you are developing a new feature for your server (and this is surely most important if you're making a new gamemode from scratch and actually have it running publicly already), you might want to add a new field. But you can be sure that as soon as you add the new field somewhere in the table (not in the end of it), this can cause trouble.
This issue used to be worse in R6-2 where I used sscanf for parsing my database results. An extra unwanted field added to the parsed string instantly caused a lot of trouble.

This is not as bad in R7 with cache functions, but you should also take note that you do not always need all the fields. For example, you're making the query I mentioned above (SELECT * FROM players WHERE name = 'MP2' LIMIT 0,1) - you do not need to get all the data. You already know the player's name for example, you also might not need their password in case you're making an automatic login or if you compare the passwords in the query itself. So the less fields you have in your result, the faster it actually is. And I'm not saying that this is something that will be noticeable, but it is just about the coding practice and I doubt that specifying the required fields in the query itself is slower than eliminating the unwanted fields later on in your loading callback.

Two: Using field names
There's nothing wrong with this, yes. Use cache_get_field_content which is also documented and given an example in the first post of this topic.
An example for this would be:
pawn Code:
// Query somewhere
mysql_function_query(dbHandle, "SELECT kills,deaths,level FROM players WHERE name = 'MP2' LIMIT 0,1", true, "OnPlayerDataLoad", "i", playerid);

// The callback
forward OnPlayerDataLoad(playerid);
public OnPlayerDataLoad(playerid)
{
    new rows, fields;
    cache_get_data(rows, fields);
    if(!rows)
    {
        // We did not get any rows - account does not exist?
        return 1;
    }
    // Start storing the data
    new temp[12];
    cache_get_field_content(0, "kills", temp), gPlayerData[playerid][pKills] = strval(temp);
    cache_get_field_content(0, "deaths", temp), gPlayerData[playerid][pDeaths] = strval(temp);
    cache_get_field_content(0, "level", temp), gPlayerData[playerid][pLevel] = strval(temp);
    return 1;
}
This code is untested yet should give you the basic idea of this. Also note that you actually don't need to specify the fields you're querying, so your query can very well look like:
SELECT * FROM players WHERE name = 'MP2' LIMIT 0,1

Also, since you're starting out with your new gamemode, perhaps I can help you with your tables as well, do you mind posting your table syntax?

Have fun coding!
Reply
#19

Thank you so much for this, I can't begin to explain how grateful I am. Everything seems to be going along nicely now. I haven't really set up the table properly yet as it's early days.

I now understand the purpose of threaded queries, and shouldn't have said the things I said in the mySQL plugin topic!
Reply
#20

Not bad I guess, thanks for the heads up. Nice to see it forces me using threaded queries, knowing I had to switch over to them a long time ago.
Reply


Forum Jump:


Users browsing this thread: 18 Guest(s)