THE PUZZLET PAGE


How_Does_It_Work


FUNCTION:  IsAnagram().  An anagram is a word whose letters can be manipulated to spell another word. For example, POST can be used to spell STOP (as well as OPTS, SPOT, TOPS and POTS!), STRAP can be used to spell PARTS, etc.

If you pass two character strings to Anagrams(), it will analyse both. If one can be used to make the other, it returns TRUE (-1), otherwise FALSE (0).

Note: this function works on all character types.  If you wish to test a purely numerical character, use function Anagrams().  It can only handle integers, but it's much faster than IsAnagram(). To view the Anagrams() function, click here now.

Declaring the function must be done like this:

declare IsAnagram(str1$: string, str2$: string)

Here's the code:


sub IsAnagram(str1$, str2$)
' if str1$ is an anagram of str2$,
' returns -1, else returns 0.
def i$, j$, test$: string
def flag, i, j, L1, L2: int
test$ = str2$: 'copy 2nd argument
L1 = len(str1$): 'holds length of 1st argument
L2 = len(test$): 'holds length of 2nd argument
if L1 <> L2: 'are both arguments the same length?
    flag = 0: 'no
else: 'yes
    i = 1
    do
         'iterate over all characters in 1st argument
         i$ = mid$(str1$, i, 1): 'extract i'th character
        j = 1
        flag = -1
        'iterate over characters of 2nd argument until
        'match found or all characters examined
        do
            j$ = mid$(test$, j, 1): 'extract j'th character
            if i$ = j$: 'i'th and j'th characters matched
                replace$(test$, j, 1, "X"): 'record the match
                flag = 0
        endif
            j = j + 1: 'point to next character
        until flag = 0 | j > L2
        i = i + 1
    until flag | i > L1
    flag = (test$ = string$(L1, "X"))
endif
return flag
 

Overview:  the function begins by weeding out the "hopeless" cases - those for which the two arguments don't even have the same length.  If the two lengths are identical, the code goes on to compare both arguments character by character.  It does this in two parts.

In the first part, it works through the characters of argument #1.  For each character, it searches for a match in argument #2.  If one is found, it replaces the character in argument #2 with an "X" to ensure the character isn't used for matching purposes a second time.  Eventually, all character in argument #1 will have been examined in this way.

In the second part, argument #2 is examined.  If every single character has been replaced by "X", a complete match must exist, and TRUE (-1) is returned to the calling routine, otherwise FALSE (0) is returned. The table below shows typical examples.  They're all of differing lengths, and only one is a perfect match, indicated when the output string contains only "X" characters and nothing else.

Arg #1
Arg #2
new Arg #2
Comments
"10263"
"36219"
"XXXX9"
1 character mismatched
"871139"
"463189"
"46XXXX"
2 characters mismatched
"4567"
"5476"
"XXXX"
perfect match

Now to look at the code in detail.  I've numbered the lines to make the explanation which follows easier to understand.  Here are the numbered lines, with comment-only lines removed for clarity of numbering.


1

2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

' main routine of function
test$ = str2$: 'copy 2nd argument

L1 = len(str1$): 'holds length of 1st argument
L2 = len(test$): 'holds length of 2nd argument
if L1 <> L2: 'are both arguments the same length?
    flag = 0: 'no
else: 'yes
    i = 1
    do
         i$ = mid$(str1$, i, 1): 'extract i'th character
        j = 1
        flag = -1
        do
            j$ = mid$(test$, j, 1): 'extract j'th character
            if i$ = j$: 'i'th and j'th characters matched
                replace$(test$, j, 1, "X"): 'record the match
                flag = 0
        endif
            j = j + 1: 'point to next character
        until flag = 0 | j > L2
        i = i + 1
    until flag | i > L1
    flag = (test$ = string$(L1, "X"))
endif
return flag
 

Let's look at the above code in detail now. 

1

test$ = str2$.  During execution of the code, any character in argument #2 which matches a character in argument #1 will be replaced by an "X" for reasons explained above and again below.  To prevent corruption of argument #2 by these actions, it's copied into variable test$, and the characters of test$ are the ones that get changed into "X", leaving argument #2 unchanged.

2

L1 = len(str1$). Before proceeding further, the code makes a "sanity" check.  If the two arguments are of differing lengths, they can't possibly be anagrams of each other, so there's no point in continuing.  To do this, the lengths of both arguments must be obtained.  Line 2 begins this process by applying the inbuilt function len() to argument 1 (which is contained in variable str1$). len() returns the number of characters in str1$ and saves the count in variable L1.

3

L2 = len(test$).  Line 3 applies the process outlined above for argument #1, saving the length of argument #2 in variable L2.

4

if L1 <> L2.  This line is asking the question "Is the value stored in variable L1 not equal to that stored in variable L2?"  This could be further paraphrased as "Is the length of argument #1 not equal to that of argument #2?"

5

flag = 0.  Line 5 is only executed if the answer to the question in line 4 is "Yes," that is, the two arguments are of different lengths.  In such a case, variable flag is set to FALSE (0).  Because the answer was "Yes," the code jumps immediately to the endif part of the clause, found in line 21.  It skips over all other character-match tests.

6

else.  Indicates there is an alternative action if the answer to the question in line 5 was "No."  It means that both arguments have the same length, so, as an alternative to quitting, the code can carry on to a character-by-character comparison of the two arguments, beginning in line 7.

7

i = 1.  The code is now going to iterate over every character in argument #1 in a DO looop, using variable i as iterator.  It knows how many times to go around the loop by referring to variable L1, which holds the character-count of argument #1.  You'll see this check taking place at the end of the loop in line 21.

8

do.  Opens the main DO loop.

9

i$ = mid$(str1$, i, 1).  For each iteration of the main DO loop, a different character from argument #1 is to be examined.  For convenience, the i'th character is taken each time, since variable i assumes values from 1 to however many characters argument #1 holds.  IBasic's built-in function mid$() is used to do this job, storing the extracted character in variable i$.

10

j = 1.  The code is going to set up another, nested DO loop within the main DO loop set up in line 8.  That means that the first loop stops where it is until the second loop has been completed.

11

flag = -1.  Variable flag will be used in the inner DO-LOOP to indicate success or failure.  It's set to TRUE  (-1) initially by line 11, assuming no character will be found in argument #2 to match the i'th character of argument #1.

12

do.  Opens the inner, nested DO-LOOP.

13

j$ = mid$(test$, j, 1).  Extracts the j'th character of argument 2 and saves it in variable j$, using the identical process to that already described for line 9.  Now we have the i'th character of argument #1 saved in i$ and the j'the character of argument #2 saved in j$.

14

if i$ = j$.  Line 14 goes ahead and tests the two saved characters to see if there is a match.  So it's asking the question "Is the character stored in variable i$ an exact match with the character stored in variable j$?"

15

replace$(test$, j, 1, "X").  This line will only be executed if the answer to the question in line 14 is "Yes."  In such a case, a match has been found.  The code wants to change the matched character in argument #2 with an "X" to prevent it being used again for a match, which could cause an error.  It performs this task by using the built-in function replace$().  For example, if test$ contains "51293, " and j contains 4, execution of line 15 will change test$ to "512X3"

16

flag = 0.  A match was found in line 14, so there's no reason to check any other characters of argument #2.  In fact, it would cause an error to do so if there are repeated characters in the argument.  So line 16 sets flag to 0 to indicate that the search in argument #2 for a particular character from argument #1 can be terminated now.

17

endif.  Housekeeping - closes IF-ENDIF clause opened in line 14.

18

j = j + 1.  Variable j is the iterator used for the DO loop opened in line 12.  It's now incremented ready for the next iteration if required.

19

until flag = 0 | j > L2.  The code now reaches the end of the DO loop and has to make a decision: execute it again or quit?  If the flag is reset to 0, we know a match was found and the code can move on past the DO loop.

Also, it's possible that all the characters of argument #2 have now been examined, and this should also cause the code to move on. A simple test for this condition is to check the value of the newly-incremented value in variable j.  If it now exceeds the value stored in variable L2, the loop has already checked all characters in argument #2.

So we can paraphrase line 19 like this:  "If variable
flag holds value 0 (meaning that a match was found) or the value stored in variable j is greater than that stored in variable L2 (meaning that all characters of argument #2 have now been examined for a match with the i'th character of argument #1 without finding a match), move on to the next line of code, otherwise execute the DO loop again."

20

i = i + 1.  Increments variable i ready for the next iteration of the outer loop if necessary.

21


until flag | i > L1.  Another decision point.  Execute the outer DO loop again, or quit?  There are once again two factors affecting the decision.  Firstly, if variable flag is still set to -1, it means no match was found for the current character of argument #1 anywhere in argument #2.  Therefore, there's no point in examining any other characters in argument #1, and the program can quit.

Secondly, it may be that all the characters of argument #1 have now been examined, and no further checks are therefore required.  This will be indicated when the newly-incremented value in variable i has exceed that stored in variable L1

Thus we can paraphrase line 21 as:  "If variable flag is set to -1 (meaning no match was found for the current character of argument #1 in argument #2) or the value stored in variable i is greater than that stored in variable L1 (meaning that all the characters in argument #1 have now been examined for checks in argument #2) quit the DO loop, otherwise execute it again."

23

flag = (test$ = string$(L1, "X")).  All checks are complete.  If every character in test$ has been changed to "X," the two arguments are anagrams of each other.  An easy test for this can be made using the IBasic function. It's used to generate a string equal in length to argument #1, but containing only the "X" character.

This string is then compared to test$ in the outer bracketed part of the line. If the two strings are identical, the expression returns a TRUE (-1), otherwise FALSE (0).  Either way the result is saved in variable flag, over-writing whatever was stored in there.

23




endif.  Housekeeping - completes IF-ENDIF clause opened in line 4.  Remember, the code would have jumped to this point if the two arguments were of differing lengths, with a 0 in flag.  Note that, since the intervening code was jumped over in these circumstances,  line 22 will not be executed, so flag's contents won't be changed.

24

return flag.  The code has completed execution, and all that remains is to return whatever value is now stored in variable flag to the calling routine, and line 24 does just that.


MAIN MENU
HOW DOES IT WORK?

Site design/maintenance: Dave Ellis E-mail me!
Last Updated:February 4th, 2010.