THE PUZZLET PAGE


How_Does_It_Work


FUNCTION:  AllDigitsDifferent().  Pass an integer to this function and it will inspect each digit in the integer. If they're all different (such as 912, or 293168, etc) the function returns TRUE (-1), otherwise it returns FALSE (0). So 9823127 would return FALSE because the 2 is repeated.

It's up to you what data type you make the argument, depending on the type of integer you want to process.


Type

Exponent of 2
Value

int 230 1073741824
uint 231 2147483648
int64 261 2305843009213693952
uint64 262 4611686018427387904

Declaring the function must be done like this:

declare AllDigitsDifferent(p: int) (or whatever integer type you want).

Here's the code:


sub AllDigitsDifferent(d)
' If all the digits of argument d are
' unique returns TRUE (-1), otherwise
' returns FALSE (0).
' This algorithm is approximately 25%
' faster than one using strings.
def a[11]: int: 'holds individual digits
def flag, count, i1, i2: int

' first part - find length of d and
' store individual digits of d
count = 0
do
    count = count + 1: 'track length of d
    a[count] = d % 10: 'store each digit
    d = d/10: 'chop off the LSD
until d = 0: 'all chopped off!

' second part - check none of d's
' digits are duplicated
flag = -1
i1 = 1
do
    i2 = i1 + 1
    do
        if a[i1] = a[i2]: 'digits are the same
            flag = 0
        endif
        i2 = i2 + 1
    until i2 > count
    i1 = i1 + 1
until i1 = count
return flag

Local array a[11] is created locally to hold the individual digits of argument d. If you work with longer-type integers, you'll need to make this array larger so that it can hold all the digits. Make the array one longer than the maximum number of digits in the data type used.

Variable
flag is a flag used to store the code's result.  It can hold -1 or 0, corresponding to TRUE or FALSE. When a non-identical digit is found before they've all been checked, it prevents further unnecessary checks, improving the efficiency of the code. Ultimately, it is this value which must be returned to the calling routine. 


Variable count is used to hold a count of the number of digits in the function's argument, d.

Variables i1 and i2 are used as iterators in two nested DO loops in the second part of the program, whose use we will look at shortly.

In the description of the code which follows, I have applied line numbers to make it easier to follow. Here they are:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

count = 0
do
    count = count + 1
    a[count] = d % 10
    d = d/10
until d = 0
flag = -1
i1 = 1
do
    i2 = i1 + 1
    do
        if a[i1] = a[i2]
            flag = 0
        endif
        i2 = i2 + 1
    until not flag | i2 > count
    i1 = i1 + 1
until not flag | i1 = count
return flag

Now for a line-by-line exegesis of the code.

1

count = 0.  Initialises variable count to zero in readiness for the DO loop which is about to open and use it.

2

do.  Opens the first DO loop. The code will continue to execute within this loop until either it it finds an exit condition is met, which we'll discuss when we get to line 6.

3

count = count + 1.  Increments variable count so that a check can be kept on the number of times the DO loop is executed. This will be the same as the number of digits in d, all of which have to be processed by the DO loop.

4

a[count] = d % 10.  This expression can be paraphrased as "Divide d by 10 and save the remainder in the cell of array a[] which variable count is currently pointing to." For example, assume d = 24638 and count holds value 3. Dividing d by 10 leaves a remainder of 8, which is saved in a[3].

Be aware that we haven't actually altered d's value at this time, because we didn't save the result anywhere. d still has the value before the trial division took place.

This type of operation is known as integer arithmetic. The same is true of line 5.

5

d = d/10.  To paraphrase this line as well, we can say "Divide d by 10, throw away any remainder, and save the result back into d."  To illustrate, using the previous value for d of 24638. Dividing by 10 gives 2463 remainder 8. Throw the 8 away and store the rest in d, which is now 2463.  The net effect of this action is to discard d's least significant digit. And this time, of course, d's value is permanently changed.

The reason this works is that d was set as an integer in the declaration of the function. This means that it can't have a decimal part.  Going back to the value of 24638 for a moment. 24638/10 would normally be 2463.8, but since we're assigning the result back to n (an integer), the decimal part is discarded and we end up with 2463.

6

until d = 0.  A decision must be made at the end of the DO loop: execute again, or move onto the next instruction? This line makes that decision.  A paraphrase of the expression could be "Execute the DO loop again unless variable d is equal to zero."

We're chopping off d's LSD every time this DO loop is executed, so it's continually getting shorter. Eventually, it will be down to length zero, having value zero. When line 6 is executed it recognises the zero value and moves onto the rest of code starting with line 7.

7

flag = -1.  Variable flag is initialised to TRUE (-1). This means the code is going to assume all of d's digits are different unless it discovers otherwise.

8

i1 = 1.  Iterator i1 is initialised to zero ready to start the second DO loop.

9

do.  Opens the second DO loop.

10

i2 = i1 + 1.  Initialises variable i2 to a value one greater than that of variable i1.  Later on, i1 will have higher values than its initial value, but i2 will always be initialised in the same way.  This means that i2 will never have a value equal to or lower than i1. The reason for this will become apparent shortly.

11

do.  Opens a third DO loop. This one, however, is nested within the second DO loop, so it's kind of like loops 2(a) (outer loop) and 2(b) (inner loop). While loop 2(b) is operating, loop 2(a) is suspended in whatever state is was when 2(b) started. When 2(b) completes, 2(a) picks up where it left off.

12

if a[i1] = a[i2].  The two pointers formed by variables i1 and have different values, so point to two different cells within array a[]. The contents of these cells are two different digits from argument d, should have two different values if all the digits of d are different. That's exactly what this line is checking.

The arrangement of i1 and i2 is such that between them they will eventually point to every possible digit-pair combination within d.  Assume d contains 5 digits. Look at the table below to see how the two DO loops work together to find every possible digit pair from the 5.

Value of i1
Value of i2
Selection
1
2
a[1] and a[2]
1
3
a[1] and a[3]
1
4
a[1] and a[4]
1
5
a[1] and a[5]
2
3
a[2] and a[3]
2
4
a[2] and a[4]
2
5
a[2] and a[5]
3
4
a[3] and a[4]
3
5
a[3] and a[5]
4
5
a[4] and a[5]

The outer DO loop holds i1 at value 1 while the inner DO loop cycles i2 from value 2 through 5. When it's done, control is returned to the outer DO loop which increments i1 to value 2. Then the inner DO loop is given control again, cycling i2 from value 3 through 5, and so on.

Notice how i2 always has a greater value than i1, and that the two values are never the same. This way, the two nested  DO loops between them cycle just once through every possible digit pair. If there are two digits the same, this process will find them without fail.


13

flag = 0. If the code reaches this line, it means that two identical digits have been found. There's obviously no point in continuing with further digit-pair checks.  Variable flag is set to FALSE (0) so that the nested DO loops can be quit and the result signalled back to the calling routine. It's more efficient to quit this way than to continue.

14

endif.  Housekeeping: closes IF-ENDIF clause.

15

i2 = i2 + 1.   i2 is incremented ready for the inner DO loop to select the second half of another digit pair.

16

until not flag | i2 > countThe code now reaches the end of the inner DO loop. It needs to decide whether to execute the loop again, or just move on to the next instruction.

This line is the decision-maker.  It can be paraphrased as "execute the DO loop again unless either flag is FALSE (0) or the value of i2 exceeds that stored in variable count."

The "until not flag" may appear confusing, but it's actually simple. The normal phrase is "until flag" and means "execute the DO loop until flag is set to TRUE (-1)." So "until not flag" means the opposite: "execute the DO loop until flag is set to FALSE (0)."

Clearly, if flag is FALSE (0), a pair of identical digits has been found, so there's no need to look at any more of d's digits. The function can return the FALSE (0) immediately.

And the alternative for exiting the DO loop is if d exceeds count. Remember, count holds the number of digits in argument d, established at the beginning of the function within the first DO loop.

d is incremented by 1 each time the DO loop executes without finding an identical digit pair. So, if all the digit pairs identified so far are different, it will eventually be incremented past the value stored in count, causing the inner DO loop to quit.


17

i1 = i1 + 1.  With the code back in the outer DO loop, variable i1 is incremented ready to start forming the next group of digit pairs.

18

until not flag | i2 = countThe code now reaches the end of the outer DO loop. It needs to decide whether to execute the loop again, or just move on to the next instruction.

This code works almost identically to that in line 16, described in detail above. There are two differences. Firstly, it's controlling the outer DO loop, not the inner.  Secondly, it looks for i2 to equal count, not to exceed it as in the inner DO loop.

Because incrementing occurs after processing in both loops, the variables concerned, i1 and i2, will always end up one greater than the final processed values.  If you look back at the table in line 12 above, you can see that the process values never exceed (but always equal) the correct range.

19

return flag.   One of those two alternatives will cause the program to quit the outer DO loop and this last line of code to be executed. The calling routine will thus receive either a FALSE (0) or TRUE (-1), corresponding to the presence or absence of identical digits within d.


Earlier versions of this function used strings, but I have found this method, using integer arithmetic, is about three times faster.


MAIN MENU
HOW DOES IT WORK?

Site design/maintenance: Dave Ellis E-mail me!
Last Updated: February 3rd, 2010.