[Tutorial] Advanced Iterators (foreach/y_iterate).

Contents

First Post

Information on the internal implementation of various parts of "y_iterate".

Introduction - A brief word on what this tutorial is about and a recap.
Multi-Dimensional Iterators - A review of multi-dimensional iterators required for later.
Macros - How the "Iterator:" macro works.
Linked Lists - An explanation of the data type used by iterators.
Start Point - How "foreach" loops over arbitrary arrays.
Improvements - Why "foreach" is better than "for" in most cases.
Generic Loop Functions - Using iterators without "foreach".
Reverse Iteration - Going backwards through iterators.

Second Post

Function list.

Functions - A list of all the functions provided by "y_iterate".

Third Post

An introduction to "special iterators" - those that are defined by functions and
not by using "Iterator:".

Special Iterators - An introduction to writing special iterator functions.
Example 1 - Writing an iterator for every positive even integer.
Explanation - A more detailed look at the code for Example 1.
Example 2 - Writing an iterator for every positive even integer.
Example 3 - Writing an iterator for RCON admins, which can be done multiple ways.

Fourth Post

More advanced special iterators.

Iterators In Iterators - Using iterators to create new iterators.
Multi-Dimensional Special Iterators - Special iterators that look like multi-dimensional iterators.
More Parameters - Passing more than one parameter to a special iterator.
Function-Like Iterators - Special iterators that look like regular functions.

Fifth Post

A detailed look at the macros that make up "y_iterate" and make all these
different iterator types work together.

"foreach" Types - A review of all the "foreach" syntax forms.
Redefining "new" - "y_iterate"'s dirty little secret.
Y_FOREACH_SECOND - Detecting one syntax form.
Y_FOREACH_THIRD - Detecting a second syntax form.
Why Redefine "new"? - Why the "#define new" hack is required.
The Remainder - All the other syntax form detections.

"foreach" Types

We will now look at one important chunk of the macros in "y_iterate", used to
determine exactly which version of "foreach" has been entered. To understand
the macros however, we must first understand what we are trying to parse. The
following code demonstrates all the possible variations of "foreach" that can
appear:

pawn Code:
// Original version (implicit declaration):
foreach (Player, playerid)

// New version:
foreach (playerid : Player)

// New version with declaration:
foreach (new playerid : Player)

// New version with tag declaration:
foreach (new Group:group : PlayerGroup)
We need to write macros to correctly identify and process each of these versions
of the keyword. There is also an older version of the command that looked like:

pawn Code:
// Original version (no declaration):
foreachex (Player, playerid)
This was different from "foreach" in that it DIDN'T declare "playerid" as a new
variable, which the standard "foreach" keyword did. This has been deprecated in
the later versions of "y_iterate" because the "new" keyword is now explicit.
Fortunately, this version is much easier to express in the new system.
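
As a rough illustration of that last point, the old "foreachex" behaviour can be written in the new syntax by declaring the variable yourself and leaving "new" out of the loop. This is only a sketch: the include path is an assumption that depends on your YSI version, and "PrintPlayers" is a made-up function name.

```pawn
#include <a_samp>
#include <YSI\y_iterate> // Assumed path; newer YSI releases use a different one.

// Hypothetical helper: prints the ID of every connected player.
stock PrintPlayers()
{
    // Declared by hand, exactly as "foreachex" required.
    new playerid;

    // No "new" in the loop, so no second variable is declared.
    foreach (playerid : Player)
    {
        printf("Player %d is connected", playerid);
    }
}
```

Because "playerid" is declared outside the loop, it is still in scope after the loop ends, which is the main practical difference from the "new" form.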

There are two things that we need to detect - number of colons, and number of
"new" keywords. Zero colons means the old syntax is in use, anything else is
the new syntax. You could in theory have:

pawn Code:
// Original version with tag (implicit declaration):
foreach (PlayerGroup, Group:group)
However, this was never supported and no tagged iterators were ever written for
this older version. We can therefore ignore this possibility.

At this point I'd love to detail the logical progression that led to my
development of these macros, in the hope that people could learn something from
the thought process, but quite simply I can't! I have no idea how I came up
with this system; it was just a lot of thinking resulting in a brainwave that I
can't explain - which is a real shame IMHO. As a result, I can only document
how the macros work.

Redefining "new"

The main "foreach" macro actually revolves around this macro:

pawn Code:
#define new%0|||%9|||%1:%2||| %9|||%0|||%1|||%2|||
That actually redefines the "new" keyword to something different (I'm not too
happy about having to do this). Because it redefines such a pivotal part of the
PAWN language, I have made sure that it does not match common code. As you can
see, the pattern for matching is this:

pawn Code:
new  |||  |||  :  |||
I've used spaces here instead of macro parameters ("%0" etc) to make the pattern
clearer and represent optional code typed by the user. If you have any "new"
variable declarations with this exact sequence of operators after it then I
apologise - this will probably break your code. But I think that's unlikely!

The replacement for the macro is:

pawn Code:
%9|||%0|||%1|||%2|||
We detect the "new" keyword, but we don't have it present in the replacement, so
this code looks for "new" and removes it. It also shuffles the remaining
parameters and removes the detected colon, but the result is still not valid
code.

To further understand the macro, let's look at it being used:

pawn Code:
#define foreach%1(%0) for (new Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||%0||| )
Here "%1" simply detects extra spaces and discards them. We then add the word
"new", which as we've seen is instantly detected and removed. Why not simply
write:

pawn Code:
#define foreach%1(%0) for (Y_FOREACH_THIRD|||Y_FOREACH_SECOND|||%0||| )
You might think that would be the result of the macros after calling the "new"
macro, but you would be wrong - sometimes...

The key fact to note here is that the "new" macro looks for three sets of "|||"
AND a colon. We didn't write the colon in the replacement text for "foreach",
so it must be part of the parameter "%0". If we type:

pawn Code:
foreach (new playerid : Player)
We have "%1" as a single space (discarded) and "%0" as "new playerid : Player".
Manually replacing macros gives:

pawn Code:
// Detect and replace "foreach"....
for (new Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||new playerid : Player||| )

// Detect and replace "new"...
for (Y_FOREACH_THIRD||| Y_FOREACH_SECOND|||new playerid ||| Player||| )
So what happens if we DON'T type a colon? The only time this happens is when we
use the old-style "foreach" syntax:

pawn Code:
foreach (Player, playerid)

// Detect and replace "foreach":
for (new Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||Player, playerid||| )

// DO NOT replace "new" because there is no colon to match.
Compare the outputs side-by-side:

pawn Code:
foreach (new playerid : Player)
// Gives:
for (Y_FOREACH_THIRD||| Y_FOREACH_SECOND|||new playerid ||| Player||| )

foreach (Player, playerid)
// Gives:
for (new Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||Player, playerid||| )
Each "foreach" line declares a new variable (this fact is hidden in the second
one, which was the main motivation for developing the new syntax). In the
resulting generated code each "for" loop has a "new" keyword somewhere in it,
and each one starts with a DIFFERENT macro. The first one starts with
"Y_FOREACH_THIRD"; the second one starts with "Y_FOREACH_SECOND" after its
leading "new" (don't worry that the first one isn't called "Y_FOREACH_FIRST",
that's just because of the order I am describing the macros in). This
demonstrates why we remove the "new" keyword, and why we shuffle the first two
parameters around.

Y_FOREACH_SECOND

This macro deals with the old version of the syntax, and is very simple to
parse. From the earlier parts of this tutorial, we know that the end product of
a "foreach" loop is:

pawn Code:
// This:
foreach (Iter, var)
// Or this:
foreach (new var : Iter)
// Becomes:
for (new var = sizeof (Iter@YSII_Ag) - 1; (var = Iter@YSII_Ag[var]) != sizeof (Iter@YSII_Ag) - 1; )
So we can very quickly construct the "Y_FOREACH_SECOND" macro and step through
its expansion:

pawn Code:
// Macro:
#define Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||%2,%1||| %1 = sizeof (%2@YSII_Ag) - 1; _:(%1 = %2@YSII_Ag[%1]) != sizeof (%2@YSII_Ag) - 1;

// Input:
foreach (Player, playerid)

// Steps:
// Detect and replace "foreach"...
for (new Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||Player, playerid||| )

// DO NOT replace "new" because there is no colon to match.

// Detect and replace "Y_FOREACH_SECOND":
for (new  playerid = sizeof (Player@YSII_Ag) - 1; _:( playerid = Player@YSII_Ag[ playerid]) != sizeof (Player@YSII_Ag) - 1; )

// Done!
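
The generated loop is just a walk over a linked list stored in an array. The sketch below rebuilds it by hand without "y_iterate", so you can see what the loop does. The array "gList" and its contents are invented for this example (including the value 6 used to mark unused slots), and are not how "y_iterate" names or fills its arrays.

```pawn
#include <a_samp>

// Slots 0-4 are the possible values; slot 5 (the last one) is the start point.
// This list contains 1, 3 and 4, linked as: 5 -> 1 -> 3 -> 4 -> 5.
// Slots 0 and 2 are unused and hold 6, an arbitrary out-of-range marker.
new gList[6] = {6, 3, 6, 4, 5, 1};

main()
{
    // Same shape as the loop generated by "foreach".
    for (new var = sizeof (gList) - 1; (var = gList[var]) != sizeof (gList) - 1; )
    {
        printf("%d", var);
    }
}
```

This should print 1, 3 and 4 in that order: the loop starts at the last slot, follows each stored index to the next value, and stops when it gets back to the last slot.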
Y_FOREACH_THIRD

If this macro is the first one listed, instead of "Y_FOREACH_SECOND", then the
user used the new syntax instead of the old syntax, so let us parse that
instead (here "%9" stores the other macro name and discards it):

pawn Code:
// Macro:
#define Y_FOREACH_THIRD|||%9|||%1|||%2||| %1 = sizeof (%2@YSII_Ag) - 1; _:(%1 = %2@YSII_Ag[%1]) != sizeof (%2@YSII_Ag) - 1;

// Input:
foreach (new playerid : Player)

// Steps:
// Detect and replace "foreach"...
for (new Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||new playerid : Player||| )

// Detect and replace "new"...
for (Y_FOREACH_THIRD||| Y_FOREACH_SECOND|||new playerid ||| Player||| )

// Detect and replace "Y_FOREACH_THIRD"...
for (new playerid  = sizeof ( Player@YSII_Ag) - 1; _:(new playerid  =  Player@YSII_Ag[new playerid ]) != sizeof ( Player@YSII_Ag) - 1; )

// DO NOT replace "new" because there are no "|||"s to match.

// Done!
BAH! So close! The "new" keyword is still attached to the "playerid" symbol,
so every time we use one we use the other. In this case, we need to again
detect and remove the "new" keyword - an operation that we already wrote a macro
for:

pawn Code:
#define new%0|||%9|||%1:%2||| %9|||%0|||%1|||%2|||
This is where the "foreach" macros start getting really clever by repeatedly
calling this "new" macro with different macro names as parameters. The first
time we passed the two macros called "Y_FOREACH_SECOND" and "Y_FOREACH_THIRD",
this time we will pass different ones. Let's have a second attempt at that
"Y_FOREACH_THIRD" macro:

pawn Code:
// We previously removed the colon, but now we must put it back to be detected.
#define Y_FOREACH_THIRD|||%9|||%1|||%2||| %1|||Y_FOREACH_FOURTH|||%1:%2|||

// If there is a "new" in "%1", then "Y_FOREACH_FOURTH" will be run.
#define Y_FOREACH_FOURTH|||%0|||%1|||%2||| new %0 = sizeof (%2@YSII_Ag) - 1; _:(%0 = %2@YSII_Ag[%0]) != sizeof (%2@YSII_Ag) - 1;

// Input:
foreach (new playerid : Player)

// Steps:
// Detect and replace "foreach"...
for (new Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||new playerid : Player||| )

// Detect and replace "new"...
for (Y_FOREACH_THIRD||| Y_FOREACH_SECOND|||new playerid ||| Player||| )

// Detect and replace "Y_FOREACH_THIRD"...
for (new playerid |||Y_FOREACH_FOURTH|||new playerid : Player||| )

// Detect and replace "new" again...
for (Y_FOREACH_FOURTH||| playerid |||new playerid ||| Player||| )

// Detect and replace "Y_FOREACH_FOURTH"...
for (new  playerid  = sizeof ( Player@YSII_Ag) - 1; _:(  playerid  =  Player@YSII_Ag[  playerid ]) != sizeof ( Player@YSII_Ag) - 1; )

// DO NOT replace "new" because there are no "|||"s to match.

// Done!
By calling the "new" macro twice we have managed to strip off just the word
"new" from "playerid" so that we know it was present but can add it in only
where we want it.

Why Redefine "new"?

There are a couple of ways to detect specific pieces of text - a define is one,
putting the text directly in the pattern is another. To detect the keyword
"new" we could do:

pawn Code:
#define new%0|||
Or:

pawn Code:
#define OTHER%0new%1|||
The first will detect anything starting with "new", the second will detect
anything that starts with "OTHER" (just as an example) and that includes the
text "new" somewhere after it. The difference is in exactly how much is
matched. A define CALLED "new", as the first one is, will search for only that
exact word - the word "newt" will NOT be matched. The second one will match
exactly the word "OTHER", but then just searches for the three letters "n", "e",
and "w" in order after it, so WILL detect the following:

pawn Code:
MY OTHER newt|||
In that case, "%0" will be a single space and "%1" will be the letter "t". This
is a subtle point when dealing with the pre-processor. This is important
because were we to use the second method, any variable name with "new" within it
would cause a compilation error:

pawn Code:
foreach (newt : Animal)
There is a second reason as well - the result if the match is NOT made. If we
do:

pawn Code:
#define S: MACRO_1 MACRO_2
#define MACRO_1%0MACRO_2%0new%1||| new %1; // Found "new".
#define MACRO_2%1||| %1; // Didn't find "new".
(Side note, the use of "%0" twice in that example is valid - the second one will
overwrite the contents of the first one, but both are discarded anyway).

Then doing:

pawn Code:
S: new var|||
Will give:

pawn Code:
new var;
But doing:

pawn Code:
S: printf("%d", var)|||
Will give:

pawn Code:
MACRO_1   printf("%d", var);
Because we couldn't find "new", the "MACRO_1" macro is never matched and the
text remains, leaving us with the word "MACRO_1" in the source-code where it
shouldn't be, and thus giving a compilation error.

If we instead do something like:

pawn Code:
#define S: new MACRO_3|||
#define new%0|||%1||| new other,%1;
#define MACRO_3|||%1; %1
Then doing:

pawn Code:
S: var|||
Will give:

pawn Code:
new other, var;
And doing:

pawn Code:
S: var;
Will give:

pawn Code:
new var;
In the second case, we have an excess "new" left over from the first macro that
wasn't matched, but we have designed the macros such that this remainder is
still valid in this context, and so that we won't get any compile-time errors.
This is basically how all the macros in "y_iterate" are designed - if the "new"
keyword is not matched, it must be because we have a pattern that requires the
"new" keyword to be present and therefore leaving it in the source code output
is valid.
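
To see the whole-word matching in practice, here is a small sketch of a loop whose variable merely starts with the letters "new". The iterator "Animal", its size, and the function name are all made up for this example, and the include path is an assumption that depends on your YSI version.

```pawn
#include <a_samp>
#include <YSI\y_iterate> // Assumed path; adjust for your YSI version.

// Hypothetical iterator with room for 10 values.
new Iterator:Animal<10>;

stock PrintAnimals()
{
    Iter_Add(Animal, 3);
    Iter_Add(Animal, 7);

    // The variable is called "newt", so it must be declared separately.
    new newt;

    // "newt" is not the whole word "new", so the second "new" match fails
    // and the loop is handled as the no-declaration form.
    foreach (newt : Animal)
    {
        printf("%d", newt);
    }
}
```

Had the detection been written as a plain text search for the letters "n", "e", "w", this loop would have been mistaken for a declaration of a variable called "t".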

The Remainder

The code presented so far is a little inaccurate, but is enough to describe the
basic principles on which the whole code block is based. It shows how we use
the "new" keyword to detect certain patterns and strip out certain words, and how
the macros are shuffled around so that different code is run when "new" is there
and when it isn't. Given this grounding, the remaining macros will not be
detailed to the same extent the first few are, but just to get you started, here
is the REAL version of Y_FOREACH_THIRD (note that the discarded parameter is
"%0" here, not "%9" as it was before):

pawn Code:
#define Y_FOREACH_THIRD|||%0|||%1|||%2||| %1=Y_FOREACH_FIFTH|||Y_FOREACH_FOURTH|||%1:%2|||
"Y_FOREACH_FOURTH" is still there, but "Y_FOREACH_FIFTH" has been added,
separated from whatever is in "%1" by an equals sign (if "%1" is a variable, a
lack of delimiter here could cause the two names to merge and no longer be
valid), to run more code when there is no "new" keyword:

pawn Code:
#define Y_FOREACH_FIFTH|||Y_FOREACH_FOURTH|||%1:%2||| sizeof (%2@YSII_Ag) - 1; _:(%1 = %2@YSII_Ag[%1]) != sizeof (%2@YSII_Ag) - 1;
// Handles:
foreach (playerid : Player)
Again, the real macro is slightly different, because extra indirection handles
iterator arrays and special iterators, but it can basically be read as the code
above:

pawn Code:
#define Y_FOREACH_FIFTH|||Y_FOREACH_FOURTH|||%1:%2||| _Y_ITER_FOREACH_SIZE(%2);_:(%1=_Y_ITER_ARRAY$%2$YSII_Ag[%1])!=_Y_ITER_MAYBE_ARRAY(%2);
"_Y_ITER_MAYBE_ARRAY" gets the size of a slot in an iterator, for both normal
iterators and iterator arrays. "_Y_ITER_FOREACH_SIZE" does the same thing, but
can also detect special iterators defined with "()" syntax. "_Y_ITER_ARRAY"
gets the next slot in the current iterator for both normal iterators and arrays.

The real definition of "Y_FOREACH_FOURTH" is not actually a final output - it is
another redirection designed to try and detect declarations with tag overrides.
It adds in the macros "Y_FOREACH_SIXTH" and "Y_FOREACH_SEVENTH":

pawn Code:
// Tries to detect a colon via "new" despite us not adding it back (as in 3rd).
#define Y_FOREACH_SEVENTH|||%9Y_FOREACH_SIXTH;%0|||%1|||%2||| new %0:%1 = %0:(sizeof (%2@YSII_Ag) - 1); _:(%1 = %0:_Y_ITER_ARRAY$%2$YSII_Ag[%1]) != sizeof (%2@YSII_Ag) - 1;
// Handles:
foreach (new Group:g : CreatedGroup)

#define Y_FOREACH_SIXTH;%1|||Y_FOREACH_SEVENTH|||%2||| %1 = sizeof (%2@YSII_Ag) - 1; _:(%1 = _Y_ITER_ARRAY$%2$YSII_Ag[%1]) != sizeof (%2@YSII_Ag) - 1;
// Handles:
foreach (new playerid : Player)
So to recap:

"Y_FOREACH_SECOND" - Generates code for the original syntax.
"Y_FOREACH_FIFTH" - Generates code for the new syntax with no "new".
"Y_FOREACH_SEVENTH" - Generates code for the new syntax with tags.
"Y_FOREACH_SIXTH" - Generates the remainder of the new syntax ("new", no tags).

The others just call "new" again to try to resolve which version to use.

Hopefully from this description you should now be able to read and comprehend
this code, taken straight from "y_iterate":

pawn Code:
#define foreach%1(%0) for(new Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||%0|||)

// This allows us to use "new" multiple times - stripping off ONLY whole words.
#define new%0|||%9|||%1:%2||| %9|||%0|||%1|||%2|||

// This one is called if the new syntax is required, but the state of "new" is
// as-yet unknown.  This attempts to call "%1" as a macro, if it starts with
// "new" as a whole word then it will (and will also helpfully strip off the
// "new" keyword for us).
#define Y_FOREACH_THIRD|||%0|||%1|||%2||| %1=Y_FOREACH_FIFTH|||Y_FOREACH_FOURTH|||%1:%2|||

// This is called if the "new" macro is called for a second time.
#define Y_FOREACH_FOURTH|||%0=Y_FOREACH_FIFTH|||%1|||%2||| new Y_FOREACH_SIXTH;%0|||Y_FOREACH_SEVENTH|||%2|||

// This is called when there are tags on the "new" declaration.
#define Y_FOREACH_SEVENTH|||%9Y_FOREACH_SIXTH;%0|||%1|||%2||| new %0:%1=%0:_Y_ITER_FOREACH_SIZE(%2);_:(%1=%0:_Y_ITER_ARRAY$%2$YSII_Ag[%1])!=_Y_ITER_MAYBE_ARRAY(%2);

// This is called when there aren't.
#define Y_FOREACH_SIXTH;%0|||Y_FOREACH_SEVENTH|||%2||| %0=_Y_ITER_FOREACH_SIZE(%2);_:(%0=_Y_ITER_ARRAY$%2$YSII_Ag[%0])!=_Y_ITER_MAYBE_ARRAY(%2);

// This is called if "%1" didn't have "new" at the start.
#define Y_FOREACH_FIFTH|||Y_FOREACH_FOURTH|||%1:%2||| _Y_ITER_FOREACH_SIZE(%2);_:(%1=_Y_ITER_ARRAY$%2$YSII_Ag[%1])!=_Y_ITER_MAYBE_ARRAY(%2);

// This is the old version, but DON'T add "new" because that already exists from
// the failed "new" macro call above.
#define Y_FOREACH_SECOND|||Y_FOREACH_THIRD|||%1,%2||| %2=_Y_ITER_FOREACH_SIZE(%1);_:(%2=_Y_ITER_ARRAY$%1$YSII_Ag[%2])!=_Y_ITER_MAYBE_ARRAY(%1);
Messages In This Thread
Advanced Iterators (foreach/y_iterate). - by Ada32 - 14.04.2015, 19:43
Advanced Iterators (foreach/y_iterate) - Part 2 - by Ada32 - 14.04.2015, 19:45
Advanced Iterators (foreach/y_iterate) - Part 3 - by Ada32 - 14.04.2015, 19:47
Re: Advanced Iterators (foreach/y_iterate). - by Pottus - 14.04.2015, 19:49
Advanced Iterators (foreach/y_iterate) - Part 4 - by Ada32 - 14.04.2015, 19:50
Advanced Iterators (foreach/y_iterate) - Part 5 - by Ada32 - 14.04.2015, 19:52
Re: Advanced Iterators (foreach/y_iterate). - by Crayder - 02.05.2015, 07:16
Re: Advanced Iterators (foreach/y_iterate). - by Konstantinos - 02.05.2015, 10:50
Re: Advanced Iterators (foreach/y_iterate). - by Gammix - 10.04.2018, 00:08
