Euclidean Algorithm To Compute GCD

By Sudheer S

July 28, 2024

This is a long-form post about the Euclidean algorithm to compute the greatest common divisors of two integers. The article starts from the fundamentals and explains why it works better than the naive algorithm. The author also explains the computational complexity and the mathematics of the algorithm.

Prereseqisites

Fundamental arithmetic and an understanding of the greatest common divisor(GCD).
Introductory concepts in algebra and logarithms, including mathematical induction.
Overview of modular arithmetic.
Familiarity with the Fibonacci series. An excellent video tutorial of the Fibonacci series on Youtube.
An introduction to the big O notation.
Basics Python programming skills.
Curiosity.
Patience to read long articles. It is okay to read it in more than one sitting. Taking notes will be benefitial for the learner. Re-reading the article a few times is encouraged.

Introduction

What Is GCD?

GCD, the greatest common divisor is the largest number that divides the given two integers.

Here is an example:

Let’s find the GCD of 252 and 105.

divisors of 252: 1, 2, 3, 4, 6, 7, 9, 12, 14, 18, 21, 28, 36, 42, 63, 84, 126

divisors of 105: 1, 3, 5, 7, 15, 21, 35

The greatest among the two sets of divisors are 21. Therefore, GCD(252, 105) = 21.

Naive Algorithm To Compute The GCD

Let’s write a Python program to compute the GCD. We’ll call it the naive algorithm to compute the GCD.

def gcd_naive(a, b):
    gcd = 1
    for i in range(1, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            gcd = i
    return gcd
    
# Example usage:
num1 = int(input("Enter the first number: "))
num2 = int(input("Enter the second number: "))
    
print(f"The GCD of {num1} and {num2} is {gcd_naive(num1, num2)}")

Sample execution of the program:

Enter the first number: 252
Enter the second number: 105
The GCD of 252 and 105 is 21

Walkthrough Of The Naive Algorithm Implementation

We set a and b as our integers. We want to find the GCD of a and b. We loop from 1 to the minimum of the two integers a and b. If a is less than b then we loop from 1 to a. In each iteration of the loop, we test if i divides both a and b. If it does, then set it as the gcd. Increment by one and keep iterating. At the end of the loop, we have the GCD. If no number divides both a and b, we have 1 as the GCD. 1 divides all integers and hence 1 divides a and b.

The Computational Complexity Of The Naive Algorithm To Compute The GCD

The computational complexity of the naive algorithm for finding the Greatest Common Divisor (GCD) of two integers a and b is O(min⁡(a,b)).

Explanation:

The naive algorithm iterates through all numbers from 1 to the minimum of a and b.
For each iteration, it checks if both a and b are divisible by the current number.

Since it performs a constant-time operation (checking divisibility) for each of the min⁡(a,b) iterations, the overall time complexity is O(min⁡(a,b)).

If you require a to be always greather than b, then you can simplify the notation to: O(b).

The best way to thoroughly understand the program is to evaluate the steps using a pen and paper.

Measuring The Exeuction Times Of The Naive Algorithm

To get some idea of how long it takes to compute the GCD of two integers, let’s modify our program.

import time

def gcd_naive(a, b):
    gcd = 1
    for i in range(1, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            gcd = i
    return gcd

# Pairs of integers (small to large)
pairs = [
    (12, 15),
    (100, 250),
    (1234, 5678),
    (789456, 123456),
    (987654321, 123456789),
    (234567890123, 987654321098),
    (12345678901234567890, 9876543210987654321)
]

# Compute GCD and measure time
for num1, num2 in pairs:
    start_time = time.time()
    gcd_result = gcd_naive(num1, num2)
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"The GCD of {num1} and {num2} is {gcd_result}")
    print(f"Time taken: {elapsed_time:.10f} seconds\n")

Here’s the output on my computer:

The GCD of 12 and 15 is 3
Time taken: 0.0000202656 seconds

The GCD of 100 and 250 is 50
Time taken: 0.0000064373 seconds

The GCD of 1234 and 5678 is 2
Time taken: 0.0000605583 seconds

The GCD of 789456 and 123456 is 48
Time taken: 0.0066680908 seconds

The GCD of 987654321 and 123456789 is 9
Time taken: 6.4232766628 seconds

The GCD of 234567890123 and 987654321098 is 1
Time taken: 76928.0231845379 seconds

I couldn’t wait till the program finished computing the GCD(234567890123, 987654321098). It was taking too long, days not seconds to execute. Remember, O(b), that is, i is the amount of work done to compute the GCD with this naive algorithm.

The Euclidean Algorithm To Compute The GCD

Start with two integers, a and b, where a≥ b.
Divide a by b, and find the remainder, r.
That is, a=b*q+r, where q is the quotient and r is the remainder.
Replace a with b, and b with r.
Repeat the process until the remainder r is 0.
The GCD is the last non-zero remainder.

a=252, b=105
252 ÷ 105 = 2 with a remainder of 42 (since 252 = 105*2 + 42)
Replace a with 105 and b with 42.
105 ÷ 42 = 2 with a remainder of 21 (since 105 = 42*2 + 21)
Replace a with 42 and b with 21.
42 ÷ 21 = 2 with a remainder of 0 (since 42 = 21*2 + 0)

The remainder is now 0, so the GCD is the last non-zero remainder, which is 21.

Why the Euclidean Algorithm Works

The Euclidean algorithm works based on a key property of divisors and remainders:

Key Property

If a=b*q+r, then any common divisor of a and b is also a common divisor of b and r. Conversely, any common divisor of b and r is also a common divisor of a and b.

Proof of Correctness Of The Key Property

Let d be the GCD of a and b.
By definition, d divides both a and b.
From the equation a=b*q+r, we can see that r=a−b*q.
Since d divides both a and b, it must also divide any linear combination of a and b. Therefore, d divides r.
Hence, the problem of finding the GCD of a and b reduces to finding the GCD of b and r.
This process continues until the remainder r becomes 0. At this point, the last non-zero remainder is the GCD of the original two numbers.

Pondering Over The Key Property

Let a = 100 and b = 12.

r = a - b*q

Let’s vary q from 1 to 8.

r	a	b	q
88	100	12	1
76	100	12	2
64	100	12	3
52	100	12	4
40	100	12	5
28	100	12	6
16	100	12	7
4	100	12	8

For all these linear combinations of a and b, d divides r. Since d divides a and b.

If you multiply a by any integer, then d still divides the result of the multiplication.
The same goes for b. If you multiply b by any integer, then d still divides the result of the multiplication.
d divides a + b
d divides a - b

As an exercise, play with different values of q. Take it to the next level. Find two integers a and b and their greatest common divisor d. Use the equation r = a - b*q, vary q and see if d indeed divides

r
a+b
a-b
any linear combination of a and b.

Rope Length Analogy For The Euclidean Algorithm To Compute The GCD

Imagine you have two ropes: Rope 1 with a length of a = 48 and Rope 2 with a length of b = 18. Your goal is to find the longest possible piece of rope, d, that can be used to measure both ropes exactly without leaving any leftover rope. This piece of rope d is the greatest common divisor (GCD) of the lengths a and b.

Step 1: Compare and Mark

Initial Setup: Rope 1 (48) is longer than Rope 2 (18). First Iteration: Take Rope 2 (18) and place it alongside Rope 1 (48). Starting from one end, mark Rope 1 at the length of Rope 2 (18). Shifting: Now, slide Rope 2 forward so that its tail is at the mark you just made. Mark Rope 1 again at the new position, which will be at 36 (18 + 18). Measure Remaining: The remaining unmarked portion of Rope 1 is 48 - 36 = 12. This leftover piece of length 12 is now our new comparison length.

Step 2: Repeat the Process

Second Iteration: Now, we focus on Rope 2 (length 18) and the new piece of rope (length 12). Place the new rope (12) alongside Rope 2 (18), starting at one end. Mark Rope 2 where the new rope ends. Shifting: Slide the new rope (12) forward, mark again. You’ll have 18 - 12 = 6 as the new remaining length.

Step 3: Continue Until No Remainder

Third Iteration: Now, compare the new piece of rope (length 12) with the remaining length (6). Place the shorter rope (6) alongside the longer one (12). You’ll see that the shorter rope (6) fits exactly twice into the longer one (12) with no leftover rope. The remainder is now 0.

When you reach a remainder of 0, the last piece of rope length that fit into both ropes is the greatest common divisor. In this case, the GCD of 48 and 18 is 6. This means the longest piece of rope that can measure both 48 and 18 without leaving any leftover is 6.

This analogy helps visualize Euclid’s algorithm by breaking it down into a practical and physical process of measuring and marking, making the concept of GCD more intuitive.

Python Program To Implement The Euclidean Algorithm

def euclid_gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

# Example usage
num1 = 48
num2 = 18

gcd = euclid_gcd(num1, num2)
print(f"The GCD of {num1} and {num2} is {gcd}")

Let’s modify the program to run some tests on a few pairs of integer inputs. I will use the same pairs of integers we used in the naive version of this program.

import time

def euclid_gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

# Pairs of integers (small to large)
pairs = [
    (12, 15),
    (100, 250),
    (1234, 5678),
    (789456, 123456),
    (987654321, 123456789),
    (234567890123, 987654321098),
    (12345678901234567890, 9876543210987654321)
]

# Compute GCD and measure time
for num1, num2 in pairs:
    start_time = time.time()
    gcd_result = euclid_gcd(num1, num2)
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"The GCD of {num1} and {num2} is {gcd_result}")
    print(f"Time taken: {elapsed_time:.10f} seconds\n")

Here’s the output on my computer.

The GCD of 12 and 15 is 3
Time taken: 0.0000014305 seconds

The GCD of 100 and 250 is 50
Time taken: 0.0000009537 seconds

The GCD of 1234 and 5678 is 2
Time taken: 0.0000009537 seconds

The GCD of 789456 and 123456 is 48
Time taken: 0.0000009537 seconds

The GCD of 987654321 and 123456789 is 9
Time taken: 0.0000014305 seconds

The GCD of 234567890123 and 987654321098 is 1
Time taken: 0.0000030994 seconds

The GCD of 12345678901234567890 and 9876543210987654321 is 90000000009
Time taken: 0.0000011921 seconds

GCD for all pairs was computed in a fraction of a second. Compare the results to the naive version.

Efficiency Of The Euclidean Algorithm

The Euclidean algorithm is efficient because with each iteration, the size of the numbers involved decreases significantly. Specifically, the remainder r is always less than b, ensuring that the number of iterations is logarithmic. This makes it very fast even for large integers.

In summary, the Euclidean algorithm works by repeatedly applying the division and exploiting the properties of divisors and remainders. Its efficiency and simplicity makes it a fundamental algorithm in number theory and computer science.

Time Complexity Of The Euclidean Algorithm

The time complexity of the Euclidean algorithm is O(log ⁡min⁡(a,b)).

Detailed Analysis Of The Time Complexity Of The Euclidean Algorithm

To understand why the time complexity is logarithmic, let’s take a look at the steps of the algorithm and analyze the size reduction of the numbers involved.

Division Step: In each step of the Euclidean algorithm, we compute the remainder of the expression:

a divided by b (i.e., r = a mod b)

and then replace a with b and b with r.

Reduction in Size: The key observation is that the size of the numbers decrease significantly with each iteration. Specifically, if r = a mod b then r < b. Hence, the pair (a,b) is replaced by (b,r) with r < b.
Worst Case Analysis: The worst-case scenario occurs when the reduction in size is as slow as possible. This happens when the numbers involved are successive Fibonacci numbers. In this case, the algorithm takes the maximum number of steps. If we start with two successive Fibonacci numbers F_n and F_n−1, then the remainder when F_n is divided by F_n−1 is F_n−2, and so on. Each step reduces the Fibonacci index by 1, leading to the sequence: (F_n,F_n−1), (F_n−1,F_n−2), (F_n−2,F_n−3),… Since Fibonacci numbers grow exponentially,

F_n ≈ ϕⁿ/sqrt(5).

where ϕ ≈ 1.618 (the golden ratio).

Number of Steps: The number of steps k required to reduce the problem to a pair involving 1 is proportional to the number of Fibonacci numbers up to the original input size. Thus,

k ≈ log_⁡ϕmax⁡(a,b) ≈ log_⁡ϕmin⁡(a,b)

given that the logarithm base ϕ is a constant factor.

Time Complexity: Converting to base 2 logarithms. Since

log_⁡ϕ(x)=log⁡₂(x)/log_⁡2(ϕ)

Since log_⁡2(ϕ)≈0.694 and 1/0.694 is 1.44092219

k ≈ 1.44log₂(b)

Since we use Big-O notation to denote the asymptotic upper bound, the constant factor 1.44 is omitted, leading to:

k ∈ O(log_⁡2b)

Thus, the time complexity of Euclid’s algorithm, in the worst case, is O(log_2⁡(min⁡(a,b)), where min⁡(a,b) is the smaller of the two input numbers. This is because the number of steps k is proportional to the logarithm of the smaller number in the pair, given the logarithmic growth rate of the Fibonacci sequence.

A note about the change of base logarithm property: the change of base logarithm property allows you to convert a logarithm from one base to another. The formula for changing the base of a logarithm is:

log_⁡b(x)=log_⁡k(x) / log_k(b)

Proof And Detailed Explanation Of The Time Complexity Of The Euclidean Algorithm

If you are new to modulo arithmetic, I recommend you take the Khan Academy’s short course on this subject. If you have a firm ground in algebra, you should be able complete the course in a few hours. Don’t worry, if it takes more. It’s worth it, if you want to delve deeper into computer science.

Modular Arithmetic - a short course by Khan Academy

Modulo Operation And The Distributive Property

The modulo operation has the property:

(a + b) mod m = ((a mod m) + (b mod m)) mod m

This means that you can distribute the modulo operation over addition.

Applying The Distributive Property Of The Modular Operation To Fibonacci Numbers

In the case of the Fibonacci numbers F_n+1 and F_n:

Fibonacci Relationship:

F_n+1 = F_n + F_n−1

Modulo Fn:

F_n+1 mod F_n = (F_n+F_n−1) mod Fn

Using the distributive property of modulo over addition:

(F_n+F_n−1) mod F_n= (F_n mod F_n + F_n−1 mod Fn) mod Fn

Since Fn mod Fn = 0:

(F_n mod F_n + F_n−1 mod F_n) mod F_n = (0 + F_n−1 mod F_n) mod F_n = F_n−1

Thus:

F_n+1 mod F_n = F_n−1

Notice that F_n-1 < F_n. When you divide a smaller number by a larger number, you get the smaller number as the remainder with quotient as 0. Therefore, F_n-1 mod F_n = 0.

Why Fibonacci Numbers Are The Worst Case

Here’s an intuition to understand why Fibonacci numbers represent the worst case:

Reduction Rate: In each step, the algorithm reduces the problem size by subtracting the smaller number from the larger one and taking the modulus. For Fibonacci numbers, the remainder is maximized because

F_k+1 mod F_k = F_k−1 is nearly as large as

F_k itself,

leading to the slowest possible reduction.

Comparison with Other Pairs: For other pairs of numbers, the remainders in each step tend to be smaller relative to the divisor, leading to faster convergence. For instance, if a and b are two large numbers with a>b, the typical case is that a mod b is significantly smaller than b. Therefore, the number of steps required is fewer than in the case of Fibonacci numbers.

Proof Of The Worst-Case Scenario

Review Of The Fibonacci Numbers:

Fibonacci numbers are a sequence of numbers where each number is the sum of the two preceding ones, starting from 0 and 1. The sequence is as follows: 0, 1, 1, 2, 3, 5, 8, 13, 21, and so on. Formally, Fn is defined as:

F₀=0

F₁=1

F_n=F_n−1+F_n−2 for n≥2

The worst-case scenario for the Euclidean algorithm occurs when the algorithm requires the maximum number of steps to compute the GCD of two numbers a and b where a>b.

Inductive Proof

We aim to show that if the Euclidean algorithm requires N steps for a pair of numbers a>b, then a≥F_N+2 and b≥F_N+1.

Base Case: N=1

For N=1, the algorithm requires only one step, meaning b divides a with no remainder. The smallest natural numbers for which this is true are b=1 and a=2, corresponding to Fibonacci numbers F₂ and F₃ respectively.

Induction Hypothesis

Assume that the result holds for all values of N up to M−1. That is, if the algorithm requires M−1 steps, then for the numbers

a^’ and b^’

used in those steps,

a^’≥F_(M−1)+2=F_M+1 and

b^’≥F_(M−1)+1=F_M.

Induction Step

Consider the case where the Euclidean algorithm requires M steps for the pair a>b. The first step of the algorithm can be written as: a=q₀b+r₀ where 0≤r₀<b

The algorithm then proceeds with b and r₀, which requires M−1 steps. By the induction hypothesis:

b≥F_M+1

r0≥_FM

Now, since a=q₀b+r₀ and q₀≥1 (since a>b), a≥b+r₀ a≥F_M+1+F_M

By the properties of Fibonacci numbers: F_M+1+F_M=F_M+2

Thus: a≥F_M+2

This completes the induction step, showing that if the Euclidean algorithm requires M steps, then a≥F_M+2 and b≥F_M+1.

By induction, it follows that the smallest values of a and b that require N steps in the Euclidean algorithm are F_N+2 and F_N+1 respectively. This demonstrates the worst-case scenario for the Euclidean algorithm, where the number of steps required is maximized.

Conclusion

The Euclidean algorithm efficiently computes the GCD with a time complexity of O(log ⁡min⁡(a,b)). This logarithmic complexity arises from the significant reduction in the size of the numbers involved with each iteration, especially noticeable in the worst-case scenario of successive Fibonacci numbers.

Practical Uses

This article is about the algorithms to compute the GCD. We write these programs to study the algorithms.

For practical purposes, you don’t want to write such programs yourselves. If your application requires computing divisors or GCD, you just use the library functions for convenience. These library functions are mature and well optimized for performance. On the other hand, the programs to implement the algorithms in this tutorial are meant to study the time complexities. The algorithms and mathematical concepts discussed in the article could be used as a stepping stone to delve deeper into number theory, cryptography and computer science in general.

To find the divisors of an integer, use the sympy package.

from sympy import divisors

divisors(252)
# [1, 2, 3, 4, 6, 7, 9, 12, 14, 18, 21, 28, 36, 42, 63, 84, 126, 252]

The standar library function to compute the GCD:

from math import gcd
gcd(252, 105)

Output:

Out[3]: 21