THE PUZZLET PAGE


How_Does_It_Work


FUNCTION:  Anagrams().  An anagram is a word whose letters can be manipulated to spell another word. For example, POST can be used to spell STOP (as well as OPTS, SPOT, TOPS and POTS!), STRAP can be used to spell PARTS, etc.  A numerical anagram does the same thing with digits.  For example, 1254 can be used to make 5124, 21339 can be used to make 31293, etc.

If you pass two integers to Anagrams(), it will analyse both. If one can be used to make the other, it returns TRUE (-1), otherwise FALSE (0).

This code will only find anagrams up to the maximum possible value of its arguments. So, when declaring Anagrams(), its arguments must be declared as type integer (int). See the table below for the limits.

Note that this funtion deals only with numerical anagrams.  If other characters are present in the argument, such as hexadecimal numbers like A25F9, etc, use function IsAnagram(). To see that function, click here now.

Type

Exponent of 2
Value

int 230 1073741824
uint 231 2147483648
int64 261 2305843009213693952
uint64 262 4611686018427387904


Declaring the function must be done like this:

declare Anagrams(p: int) (or whatever type of integer you choose).

Here's the code:


sub Anagrams (a1, a2)
' If argument a1 is an anagram of
' argument a2 returns TRUE (-1),
' otherwise returns FALSE (0).
' This algorithm is around 75%
' faster than the string method.
def length: int
' are both arguments the same length?
length = 1 + int(log10(a1))
if length <> 1 + int(log10(a2))
    return 0: 'no - not anagrams
endif
def a[length]: int
def flag, i, pntr: int
' first part - load a1 into array a[]
pntr = 0
do
    a[pntr] = a1 % 10
    pntr = pntr + 1
    a1 = a1/10
until a1 = 0
' second part - compare a2 to a[]
' by nullifying like terms
do
    i = a2 % 10
    flag = 0
    pntr = 0
    do
        if i = a[pntr]: 'character match
            a[pntr] = -1
            flag = -1
        else: ' no match - try next character
            pntr = pntr + 1
        endif
    until flag | pntr > length - 1
    if not flag: 'not anagrams
        return flag
    endif
    a2 = a2/10
until a2 = 0
return flag


Overview:  the function begins by weeding out the "hopeless" cases - those for which the two arguments don't even have the same length.  If the two lengths are identical, the code goes on to compare both arguments digit by digit. It does this in two parts.

Firstly, it loads the first argument's digits into individual array cells. Then it looks through the second argument's digits. If a digit in the second argument matches one stored in the array, the one in the array is nullified and the process moves onto the next digit. If no match is found, the arguments aren't anagrams of each other, and the program quits.

The nullified characters stop duplicate characters being "matched" twice or more when only one match exists.

If, after all digits have been checked, all characters in the first argument have been matched once and once only to a corresponding digit in the second argument, an anagram exists, and this is signalled back to the calling routine.

Now to look at the code in detail.

We begin by checking that both arguments are the same length, and to do this we need a method of finding the length. A very simple technique is to take logs and look at the exponent.  It works out that the exponent of an integer is always one less than the number of digits the integer contains - that is, its length.  So all that's required is to add one to the exponent, and we have the length.  You can see how it works from the table:

Typical Arguments
Log10
Exponent + 1
Actual Length
3
0.4771
1
1
29
1.4624
2
2
164
2.2148
3
3
3882
3.5891
4
4
78130
4.8928
5
5

The lines of code that use this are:

length = 1 + int(log10(a1))
if length <> 1 + int(log10(a2))
    return 0
endif


The length of the first argument, a1, is derived using the base 10 logs technique just discussed and stored in variable length.

This is then compared to the length of the second argument, a2, derived in exactly the same way.  If they're not of equal lengths, they can't be anagrams, and the function quits by sending FALSE (0) to the calling routine.

If the lengths are equal, we can go on to start comparing individual digits of the arguments.  The algorithm used here requires that the first argument is broken down into individual digits, each one being stored in a different cell of array a[].  The next half dozen lines of code take care this.  I've numbered them to make explanations simpler.


1
2
3
4
5
6
7
8

def a[length]: int
def flag, i, pntr: int
pntr = 0
do
    a[pntr] = a1 % 10
    pntr = pntr + 1
    a1 = a1/10
until a1 = 0

Let's look at the above code in detail now.

1

def a[length]: int.  The last section of code was used to found out how long argument a1 is, the result being stored in variable length.  Line 1 uses that value to set up an array, a[], whose length is exactly the same as a1.

2

def flag, i, pntr: int.  We can define some more variables at this point.  Variables flag and i aren't needed until the next section of code, but, for convenience, they're defined here.  Variable pntr is going to be used as a pointer to the individual cells of a[] in the next few lines of code.

3

pntr = 0.  Variable pntr is initialised ready for use.

4

do.  The DO loop used to load a[] is opened.

5


a[pntr] = a1 % 10We can paraphrase this line by saying: "Take the remainder of dividing a1 by 10 and save it into the cell in array a[] pointed to by variable pntr."  For example, if a1 equals 12345 dividing by 10 leaves a remainder of 5. If pntr at this time has value 3, a[3] will be loaded with 5.

6

pntr = pntr + 1Increments variable pntr ready for the next execution of the DO loop.

7

a1 = a1/10.  This line effectively chops the LSD (That is, least significant, or right-most digit) off a1.  How can the expression a1 = a1/10 do that? For instance, if a1 equal 12345, surely a1/10 should be 1234.5?

No! It works like this. a1 was declared as an integer in the declarations, as discussed in the introductory remarks. An integer has no decimal part. So 1234.5 becomes 1234. The overall effect has been to chop off that LSD, as required.

8

until a1 = 0This is decision time.  Should the DO loop be executed again, or quit so that the code can move on the next part?

Variable a1 just had its LSD chopped off, and this happens every time the DO loop is executed, so it's getting shorter all the time. Eventually, it will be reduced to zero length. At that point, all a1's digits have been stored in a[]'s cells, so that task is complete.

We can therefore paraphrase line 6 as "Execute the DO loop again unless variable a1 has been reduced to zero length."


Now we're ready for the more complex task of comparing a2's individual digits with those from a1, which have now been stored in individual cells of array a[].

The code handling this task is again numbered to make understanding easier.

9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26

do
    i = a2 % 10
    flag = 0
    pntr = 0
    do
        if i = a[pntr]
            a[pntr] = -1
            flag = -1
        else
            pntr = pntr + 1
        endif
    until flag | pntr > length - 1
    if not flag
        return flag
    endif
    a2 = a2/10
until a2 = 0
return flag

Now for a line-by-line exegesis of the code.

9

do.  Opens DO loop containing this section of code.

10

i = a2 % 10.  Variable i is used to store the LSD of argument a2.  Integer arithmetic is used to separate it out.  The expression a2 % 10 means "the remainder after dividing a2 by 10."  So we can paraphrase the whole expression as "Take the remainder of dividing variable a2 by 10 and store the result in variable i."

11

flag = 0.  Initialises flag to zero. This means we're assuming the two arguments aren't anagrams of each other until it's been proved they are, since flag is used for this purpose.

12

pntr = 0.  Variable pntr exists in this code to point at individual cells of array a[], and it's initialised here to point at the very first cell, a[0].

13

do.  Opens a nested DO loop within the main one established at line 9 above.  The purpose of this inner loop is to process a[]'s contents against the specific digit stored in variable i at this time.

14

if i = a[pntr].  A paraphrase of this line is: "If the value stored in variable i at this time is equal to the value stored in the cell of array a[] pointed to by the value in variable pntr . . .".

Remember, the value in i is the LSD of argument a1. So we're working through the cells of array a[] one at a time searching for a match. Since the contents of those cells were taken from argument a2, we're effectively looking for a match between a1's LSD and any digit of a2.

15

a[pntr] = -1.  If such a match is found, the value in a[pntr] is nullified.  I chose to use the value -1 for nullifying, since, a[] would normally only ever hold 0 or positive integers.

16

flag = -1.  If we found a match, we don't need to search for further matches.  In fact, it would create an error to do so, since the same digit may exist twice in argument a1 but not in a2.  So flag is set to TRUE (-1), indicating a match has been found and further tests on this particular digit are not required.

17

else.  Prepares to take alternative action in the event of no match being found in this particular cell, that is, the one pointed to by current value of pntr.

18

pntr = pntr + 1.  Variable pntr is now incremented so that the next cell of array a[] can be tested for a match.

19

endif.  Housekeeping - closes local IF-ENDIF clause.

20

until flag | pntr > length - 1.  Decision time.  Should the inner DO loop be executed again, or quit so that the code can move on to the next instructions?  This line makes that decision.

It can be paraphrased as "Execute the inner DO loop again unless either flag is set to TRUE or the newly-incremented variable pntr's value is equal to variable length."

If flag is set to TRUE, it means a match has been found, and, as explained above, it's important not to check any more digits on this pass once that has happened.

If pntr equals length, it means every digit has been checked already, so there's nothing more to do here. Note that in these circumstance, flag will still have its original setting of FALSE (0).

21



if not flag.  At this point, flag could have either possible value, so the code has to check which it has and decide what to do.  This line effectively says "If flag has value FALSE . . ."

22

return flag.  If the code reaches this line, flag has value FALSE.  That means that a digit in argument a1 has no match in a2.  Therefore, regardless of whether other digits match or not, there's no point in continuing.  The function quits and returns flag's status to let the calling routine know the result.

23

endif.  Housekeeping - closes the IF-ENDIF clause.

24

a2 = a2/10.  In this part of the code, we are loading variable i with argument a2's LSD and comparing it to every digit of argument a1 (as stored in array a[]) one at a time.  It's time now to move onto another of a2's digits. If we chop off its LSD and then take the new LSD, we will be gradually working through its digits one at a time from the right-hand end.

This line prepares for that action by chopping off the right-most digit.  It uses integer arithmetic to do it, in exactly the same way as was used in line 7, explained in detail above.

25

until a2 = 0.  The process continues until all a2's digits have been investigated.  Each time a digit is checked for matches, a2's LSD is chopped off and it gets a bit shorter. Eventually, its length will be reduced to zero.  When this point is reached, all digits have been checked, and the searches can stop. 

So this line can be paraphrased as "Execute the outer DO loop until a2's length has been reduced to zero."

26

 return flag.  The code has completed all possible checks at this time, or it has quit early because a digit in argument a1 wasn't matched in argument a2.

Either way, flag contains the result of asking the question "Is argument a1 an anagram of argument a2?"  It's time to quit and pass the result back to the calling routine.


Earlier versions of this function used strings, but I have found this method, using integer arithmetic and arrays, is almost twice as fast.


MAIN MENU
HOW DOES IT WORK?

Site design/maintenance: Dave Ellis E-mail me!
Last Updated: February 3rd, 2010.