Optimize some MySQL query
#1

I just finished my biggest single query yet, though I suck at mysql joins and never really worked with big queries yet. So why not post it here (ok, every guy with some experience will rage how inefficient this is, but i dont care cause I already know that)

Код:
SELECT `g1`.`galaxy`, `g1`.`system`, `g1`.`planet`, `g1`.`name`, `g1`.`owner`, `g1`.`ally`, `g1`.`points`, `g1`.`timestamp`, `i1`.`inactive`
FROM galaxydata g1
	JOIN
		(    
		SELECT `p1`.`name`, ($timestamp - `p2`.`timestamp`) AS `inactive`
		FROM
        	        (
                        SELECT `t2`.`name`, `t2`.`gesamtp`, `t2`.`timestamp`
                        FROM playerdata t2
                        WHERE `t2`.`timestamp` = $timestamp
			) p1,
                        playerdata p2
		WHERE `p1`.`name` = `p2`.`name`
			AND `p1`.`gesamtp` = `p2`.`gesamtp`
		GROUP BY `p1`.`name`
		) i1
	ON ( `i1`.`name` = `g1`.`owner` );
Note that $timestamp is a previously acquired variable and you dont need to mind it, for testing just replace it with 1366803483277
What does this actually do? I got two tables, playerdata and galaxydata. Galaxydata contains "planets" with coordinates like <galaxy><system><planet> and (beneath others) the name of the owner.
Playerdata contains a set of player-statistics snapshots for each owner, taken in certain intervals (that timestamp tells the creation time for each snapshot). The join now joins every planet with the "activity" of its owner (means the interval between the $timestamp snapshot, and the oldest snapshot since when the points didnt change). Note that planets can also have no owner, I didnt filter these from the join yet (maybe that can speed it up)

Galaxydata got ~2k lines, later gonna have 50k, playerdata got 20k, might also go up to 50k, and current time per query is roughly 0.15s which is still okay for my needs, but will later grow to a damn long time.

So I dont really mind if someone wants to test his mysql skills, or if Ill optimize it myself later, just wanted to try posting it here. Post your versions and Ill test them and see if they are working, and working faster
Reply
#2

Do you have the SQL code for your tables and some test data to try it ? I work a lot with SQL but I can't give you a better query without having the MySQL tables in front of me.
Reply
#3

Oh right, forgot to add that. Heres a dump: http://homemade-blog.net/testdump.zip

Edit: Noones getting it? Okay then, 5$ via paypal to the user who gives the fastest version and explanations what he did, so I can learn how to do it myself the next time.
Reply
#4

Bumping this.
You can use this as test environment: http://homemade-blog.net/phpMyAdmin
User: gwtest PW: test123 Database: gw
Or use the upper dump to test it on your own DB.

Mentioning again (maybe that makes some people interested) the guy with the fastest, explained version will get 5$ on paypal
Reply
#5

I got a little faster version (0.1660s to 0.1176s). I'm sure it can be optimized further but I don't fully know your tables and I'm not completely sure what you're trying to do -- sorry.

Код:
SELECT `g1`.`galaxy`, `g1`.`system`, `g1`.`planet`, `g1`.`name`, `g1`.`owner`, `g1`.`ally`, `g1`.`points`, `g1`.`timestamp`, `i1`.`inactive`
FROM galaxydata g1
JOIN
	(    
	SELECT `p1`.`name`, (1366803483277 - `p1`.`timestamp`) AS `inactive`
	FROM playerdata p1
	JOIN (
		SELECT name, timestamp, gesamtp
		FROM playerdata
	) p2
		ON p1.name = p2.name
		AND `p1`.`gesamtp` = `p2`.`gesamtp`
		AND p2.`timestamp` = 1366803483277
	GROUP BY `p1`.`name`
	) i1
ON (`i1`.`name` = `g1`.`owner`);
What I did was I filtered p2 in the join, while you did it afterwards in the where. This means you'll fetch a smaller table to begin with.
Reply
#6

I think you should first redesign your database because it doesn't look very optimized. I can help you with that but I need to know what you want to store in it exactly.
Reply
#7

Quote:
Originally Posted by Sinner
Посмотреть сообщение
I got a little faster version (0.1660s to 0.1176s). I'm sure it can be optimized further but I don't fully know your tables and I'm not completely sure what you're trying to do -- sorry.

What I did was I filtered p2 in the join, while you did it afterwards in the where. This means you'll fetch a smaller table to begin with.
Thanks for that, quite simple but effective idea. Didnt think of that before, and it boosts it by 30%, thats quite nice.

Quote:
Originally Posted by R@f
Посмотреть сообщение
I think you should first redesign your database because it doesn't look very optimized. I can help you with that but I need to know what you want to store in it exactly.
Thats the problem, Im not all sure myself what I might add to the database later. Its some experiment for creating and evaluating statistics for a browsergame. Yet I just grab some general data, though the most important stuff. Theres a whole bunch of more data that might be useful to log, but I dont know yet what to add and what to leave out.
However, it also doesnt need to run ultra fast. The DB is used on demand by a handful of people, there arent more than 10-20 queries of that kind per hour, so it doesnt really depend on speed, its more like a cosmetic change.
Reply
#8

Quote:
Originally Posted by Mauzen
Посмотреть сообщение
Thats the problem, Im not all sure myself what I might add to the database later. Its some experiment for creating and evaluating statistics for a browsergame. Yet I just grab some general data, though the most important stuff. Theres a whole bunch of more data that might be useful to log, but I dont know yet what to add and what to leave out.
However, it also doesnt need to run ultra fast. The DB is used on demand by a handful of people, there arent more than 10-20 queries of that kind per hour, so it doesnt really depend on speed, its more like a cosmetic change.
I think r@f is more talking about the design of your database rather then what fields you store. For example, it would be a good idea to make an extra ID a primary key (e.g. userid -- AUTO_INCREMENT) rather then linking the 2 tables by names (name and owner). This will infact speed up your queries tremendously since these unique ids are often indexed by the database system (automatically) -- indexed fields help the database system perform searches faster. Also, if someone changes their name you only need to change 1 table.

The key to fast and efficient queries (which is your question) is in the first place a properly designed database.
Reply
#9

Quote:
Originally Posted by Sinner
Посмотреть сообщение
I think r@f is more talking about the design of your database rather then what fields you store. For example, it would be a good idea to make an extra ID a primary key (e.g. userid -- AUTO_INCREMENT) rather then linking the 2 tables by names (name and owner). This will infact speed up your queries tremendously since these unique ids are often indexed by the database system (automatically) -- indexed fields help the database system perform searches faster. Also, if someone changes their name you only need to change 1 table.

The key to fast and efficient queries (which is your question) is in the first place a properly designed database.
Oh I see. Thats even what I originally planned, but I thought the additional joins would make it rather slow than speed it up. Also using IDs instead of names makes filling the database slightly more complicated, as Im getting the data directly from the browser, so id need an additional query for every insterted row to get the id for the name.
Ill give it a try!
Reply
#10

I know you're just doing it for cosmetic changes but isn't this a bit of premature optimisation? I'm guilty of this myself I find it hard to finish core features before optimising features that work, but not effecient enough in my mind. (being worried about nanoseconds and stuff).
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)