Euclidean Algorithm To Compute GCD
By Sudheer S
This is a long-form post about the Euclidean algorithm to compute the greatest common divisors of two integers. The article starts from the fundamentals and explains why it works better than the naive algorithm. The author also explains the computational complexity and the mathematics of the algorithm.
Prereseqisites
- Fundamental arithmetic and an understanding of the greatest common divisor(GCD).
- Introductory concepts in algebra and logarithms, including mathematical induction.
- Overview of modular arithmetic.
- Familiarity with the Fibonacci series. An excellent video tutorial of the Fibonacci series on Youtube.
- An introduction to the big O notation.
- Basics Python programming skills.
- Curiosity.
- Patience to read long articles. It is okay to read it in more than one sitting. Taking notes will be benefitial for the learner. Re-reading the article a few times is encouraged.
Introduction
What Is GCD?
GCD, the greatest common divisor is the largest number that divides the given two integers.
Here is an example:
Let’s find the GCD of 252 and 105.
divisors of 252: 1, 2, 3, 4, 6, 7, 9, 12, 14, 18, 21, 28, 36, 42, 63, 84, 126
divisors of 105: 1, 3, 5, 7, 15, 21, 35
The greatest among the two sets of divisors are 21. Therefore, GCD(252, 105) = 21.
Naive Algorithm To Compute The GCD
Let’s write a Python program to compute the GCD. We’ll call it the naive algorithm to compute the GCD.
def gcd_naive(a, b):
gcd = 1
for i in range(1, min(a, b) + 1):
if a % i == 0 and b % i == 0:
gcd = i
return gcd
# Example usage:
num1 = int(input("Enter the first number: "))
num2 = int(input("Enter the second number: "))
print(f"The GCD of {num1} and {num2} is {gcd_naive(num1, num2)}")
Sample execution of the program:
Enter the first number: 252
Enter the second number: 105
The GCD of 252 and 105 is 21
Walkthrough Of The Naive Algorithm Implementation
We set a
and b
as our integers. We want to find the GCD of a
and b
.
We loop from 1 to the minimum of the two integers a
and b
. If a
is less than b
then we loop from 1 to a
.
In each iteration of the loop, we test if i
divides both a
and b
. If it does, then set it as the gcd
. Increment by one and keep iterating.
At the end of the loop, we have the GCD. If no number divides both a
and b
, we have 1 as the GCD. 1 divides all integers and hence 1 divides a
and b
.
The Computational Complexity Of The Naive Algorithm To Compute The GCD
The computational complexity of the naive algorithm for finding the Greatest Common Divisor (GCD) of two integers a
and b
is O(min(a,b)).
Explanation:
- The naive algorithm iterates through all numbers from 1 to the minimum of
a
andb
. - For each iteration, it checks if both
a
andb
are divisible by the current number.
Since it performs a constant-time operation (checking divisibility) for each of the min(a,b) iterations, the overall time complexity is O(min(a,b)).
If you require a
to be always greather than b
, then you can simplify the notation to: O(b)
.
The best way to thoroughly understand the program is to evaluate the steps using a pen and paper.
Measuring The Exeuction Times Of The Naive Algorithm
To get some idea of how long it takes to compute the GCD of two integers, let’s modify our program.
import time
def gcd_naive(a, b):
gcd = 1
for i in range(1, min(a, b) + 1):
if a % i == 0 and b % i == 0:
gcd = i
return gcd
# Pairs of integers (small to large)
pairs = [
(12, 15),
(100, 250),
(1234, 5678),
(789456, 123456),
(987654321, 123456789),
(234567890123, 987654321098),
(12345678901234567890, 9876543210987654321)
]
# Compute GCD and measure time
for num1, num2 in pairs:
start_time = time.time()
gcd_result = gcd_naive(num1, num2)
end_time = time.time()
elapsed_time = end_time - start_time
print(f"The GCD of {num1} and {num2} is {gcd_result}")
print(f"Time taken: {elapsed_time:.10f} seconds\n")
Here’s the output on my computer:
The GCD of 12 and 15 is 3
Time taken: 0.0000202656 seconds
The GCD of 100 and 250 is 50
Time taken: 0.0000064373 seconds
The GCD of 1234 and 5678 is 2
Time taken: 0.0000605583 seconds
The GCD of 789456 and 123456 is 48
Time taken: 0.0066680908 seconds
The GCD of 987654321 and 123456789 is 9
Time taken: 6.4232766628 seconds
The GCD of 234567890123 and 987654321098 is 1
Time taken: 76928.0231845379 seconds
I couldn’t wait till the program finished computing the GCD(234567890123, 987654321098). It was taking too long, days not seconds to execute. Remember, O(b), that is, i
is the amount of work done to compute the GCD with this naive algorithm.
The Euclidean Algorithm To Compute The GCD
- Start with two integers,
a
andb
, wherea
≥b
. - Divide
a
byb
, and find the remainder,r
. - That is,
a=b*q+r
, whereq
is the quotient andr
is the remainder. - Replace
a
withb
, andb
withr
. - Repeat the process until the remainder
r
is 0. - The GCD is the last non-zero remainder.
a=252, b=105
252 ÷ 105 = 2 with a remainder of 42 (since 252 = 105*2 + 42)
Replace a with 105 and b with 42.
105 ÷ 42 = 2 with a remainder of 21 (since 105 = 42*2 + 21)
Replace a with 42 and b with 21.
42 ÷ 21 = 2 with a remainder of 0 (since 42 = 21*2 + 0)
The remainder is now 0, so the GCD is the last non-zero remainder, which is 21.
Why the Euclidean Algorithm Works
The Euclidean algorithm works based on a key property of divisors and remainders:
Key Property
If a=b*q+r
, then any common divisor of a
and b
is also a common divisor of b
and r
.
Conversely, any common divisor of b
and r
is also a common divisor of a
and b
.
Proof of Correctness Of The Key Property
- Let
d
be the GCD ofa
andb
. - By definition,
d
divides botha
andb
. - From the equation
a=b*q+r
, we can see thatr=a−b*q
. - Since
d
divides botha
andb
, it must also divide any linear combination ofa
andb
. Therefore,d
dividesr
. - Hence, the problem of finding the GCD of
a
andb
reduces to finding the GCD ofb
andr
. - This process continues until the remainder
r
becomes 0. At this point, the last non-zero remainder is the GCD of the original two numbers.
Pondering Over The Key Property
Let a
= 100 and b
= 12.
r = a - b*q
Let’s vary q from 1 to 8.
r | a | b | q |
---|---|---|---|
88 | 100 | 12 | 1 |
76 | 100 | 12 | 2 |
64 | 100 | 12 | 3 |
52 | 100 | 12 | 4 |
40 | 100 | 12 | 5 |
28 | 100 | 12 | 6 |
16 | 100 | 12 | 7 |
4 | 100 | 12 | 8 |
For all these linear combinations of a
and b
, d
divides r
.
Since d
divides a
and b
.
- If you multiply
a
by any integer, thend
still divides the result of the multiplication. - The same goes for
b
. If you multiplyb
by any integer, thend
still divides the result of the multiplication. d
dividesa
+b
d
dividesa
-b
As an exercise, play with different values of q
. Take it to the next level. Find two integers a
and b
and their greatest common divisor d
. Use the equation r = a - b*q
, vary q
and see if d
indeed divides
r
a+b
a-b
- any linear combination of
a
andb
.
Rope Length Analogy For The Euclidean Algorithm To Compute The GCD
Imagine you have two ropes: Rope 1 with a length of a = 48 and Rope 2 with a length of b = 18. Your goal is to find the longest possible piece of rope, d
, that can be used to measure both ropes exactly without leaving any leftover rope. This piece of rope d
is the greatest common divisor (GCD) of the lengths a
and b
.
Step 1: Compare and Mark
Initial Setup: Rope 1 (48) is longer than Rope 2 (18). First Iteration: Take Rope 2 (18) and place it alongside Rope 1 (48). Starting from one end, mark Rope 1 at the length of Rope 2 (18). Shifting: Now, slide Rope 2 forward so that its tail is at the mark you just made. Mark Rope 1 again at the new position, which will be at 36 (18 + 18). Measure Remaining: The remaining unmarked portion of Rope 1 is 48 - 36 = 12. This leftover piece of length 12 is now our new comparison length.
Step 2: Repeat the Process
Second Iteration: Now, we focus on Rope 2 (length 18) and the new piece of rope (length 12). Place the new rope (12) alongside Rope 2 (18), starting at one end. Mark Rope 2 where the new rope ends. Shifting: Slide the new rope (12) forward, mark again. You’ll have 18 - 12 = 6 as the new remaining length.
Step 3: Continue Until No Remainder
Third Iteration: Now, compare the new piece of rope (length 12) with the remaining length (6). Place the shorter rope (6) alongside the longer one (12). You’ll see that the shorter rope (6) fits exactly twice into the longer one (12) with no leftover rope. The remainder is now 0.
When you reach a remainder of 0, the last piece of rope length that fit into both ropes is the greatest common divisor. In this case, the GCD of 48 and 18 is 6. This means the longest piece of rope that can measure both 48 and 18 without leaving any leftover is 6.
This analogy helps visualize Euclid’s algorithm by breaking it down into a practical and physical process of measuring and marking, making the concept of GCD more intuitive.
Python Program To Implement The Euclidean Algorithm
def euclid_gcd(a, b):
while b != 0:
a, b = b, a % b
return a
# Example usage
num1 = 48
num2 = 18
gcd = euclid_gcd(num1, num2)
print(f"The GCD of {num1} and {num2} is {gcd}")
Let’s modify the program to run some tests on a few pairs of integer inputs. I will use the same pairs of integers we used in the naive version of this program.
import time
def euclid_gcd(a, b):
while b != 0:
a, b = b, a % b
return a
# Pairs of integers (small to large)
pairs = [
(12, 15),
(100, 250),
(1234, 5678),
(789456, 123456),
(987654321, 123456789),
(234567890123, 987654321098),
(12345678901234567890, 9876543210987654321)
]
# Compute GCD and measure time
for num1, num2 in pairs:
start_time = time.time()
gcd_result = euclid_gcd(num1, num2)
end_time = time.time()
elapsed_time = end_time - start_time
print(f"The GCD of {num1} and {num2} is {gcd_result}")
print(f"Time taken: {elapsed_time:.10f} seconds\n")
Here’s the output on my computer.
The GCD of 12 and 15 is 3
Time taken: 0.0000014305 seconds
The GCD of 100 and 250 is 50
Time taken: 0.0000009537 seconds
The GCD of 1234 and 5678 is 2
Time taken: 0.0000009537 seconds
The GCD of 789456 and 123456 is 48
Time taken: 0.0000009537 seconds
The GCD of 987654321 and 123456789 is 9
Time taken: 0.0000014305 seconds
The GCD of 234567890123 and 987654321098 is 1
Time taken: 0.0000030994 seconds
The GCD of 12345678901234567890 and 9876543210987654321 is 90000000009
Time taken: 0.0000011921 seconds
GCD for all pairs was computed in a fraction of a second. Compare the results to the naive version.
Efficiency Of The Euclidean Algorithm
The Euclidean algorithm is efficient because with each iteration, the size of the numbers involved decreases significantly. Specifically, the remainder r
is always less than b
, ensuring that the number of iterations is logarithmic. This makes it very fast even for large integers.
In summary, the Euclidean algorithm works by repeatedly applying the division and exploiting the properties of divisors and remainders. Its efficiency and simplicity makes it a fundamental algorithm in number theory and computer science.
Time Complexity Of The Euclidean Algorithm
The time complexity of the Euclidean algorithm is
O(log min(a,b))
.
Detailed Analysis Of The Time Complexity Of The Euclidean Algorithm
To understand why the time complexity is logarithmic, let’s take a look at the steps of the algorithm and analyze the size reduction of the numbers involved.
- Division Step: In each step of the Euclidean algorithm, we compute the remainder of the expression:
a
divided by b
(i.e., r = a mod b
)
and then replace a
with b
and b
with r
.
- Reduction in Size: The key observation is that the size of the numbers decrease significantly with each iteration. Specifically, if
r = a mod b
thenr < b
. Hence, the pair(a,b)
is replaced by(b,r)
withr < b
. - Worst Case Analysis: The worst-case scenario occurs when the reduction in size is as slow as possible. This happens when the numbers involved are successive Fibonacci numbers. In this case, the algorithm takes the maximum number of steps. If we start with two successive Fibonacci numbers Fn and Fn−1, then the remainder when Fn is divided by Fn−1 is Fn−2, and so on. Each step reduces the Fibonacci index by 1, leading to the sequence: (Fn,Fn−1), (Fn−1,Fn−2), (Fn−2,Fn−3),… Since Fibonacci numbers grow exponentially,
Fn ≈ ϕn/sqrt(5).
where ϕ ≈ 1.618
(the golden ratio).
- Number of Steps: The number of steps
k
required to reduce the problem to a pair involving 1 is proportional to the number of Fibonacci numbers up to the original input size. Thus,
k ≈ logϕmax(a,b) ≈ logϕmin(a,b)
given that the logarithm base ϕ
is a constant factor.
Time Complexity: Converting to base 2 logarithms. Since
logϕ(x)=log2(x)/log2(ϕ)
Since log2(ϕ)≈0.694 and 1/0.694 is 1.44092219
k ≈ 1.44log2(b)
Since we use Big-O notation to denote the asymptotic upper bound, the constant factor 1.44 is omitted, leading to:
k ∈ O(log2b)
Thus, the time complexity of Euclid’s algorithm, in the worst case, is O(log2(min(a,b)), where min(a,b)
is the smaller of the two input numbers. This is because the number of steps k
is proportional to the logarithm of the smaller number in the pair, given the logarithmic growth rate of the Fibonacci sequence.
A note about the change of base logarithm property: the change of base logarithm property allows you to convert a logarithm from one base to another. The formula for changing the base of a logarithm is:
logb(x)=logk(x) / logk(b)
Proof And Detailed Explanation Of The Time Complexity Of The Euclidean Algorithm
If you are new to modulo arithmetic, I recommend you take the Khan Academy’s short course on this subject. If you have a firm ground in algebra, you should be able complete the course in a few hours. Don’t worry, if it takes more. It’s worth it, if you want to delve deeper into computer science.
Modular Arithmetic - a short course by Khan Academy
Modulo Operation And The Distributive Property
The modulo operation has the property:
(a + b) mod m = ((a mod m) + (b mod m)) mod m
This means that you can distribute the modulo operation over addition.
Applying The Distributive Property Of The Modular Operation To Fibonacci Numbers
In the case of the Fibonacci numbers Fn+1 and Fn:
Fibonacci Relationship:
Fn+1 = Fn + Fn−1
Modulo Fn:
Fn+1 mod Fn = (Fn+Fn−1) mod Fn
Using the distributive property of modulo over addition:
(Fn+Fn−1) mod Fn= (Fn mod Fn + Fn−1 mod Fn) mod Fn
Since Fn mod Fn = 0:
(Fn mod Fn + Fn−1 mod Fn) mod Fn = (0 + Fn−1 mod Fn) mod Fn = Fn−1
Thus:
Fn+1 mod Fn = Fn−1
Notice that Fn-1 < Fn. When you divide a smaller number by a larger number, you get the smaller number as the remainder with quotient as 0. Therefore, Fn-1 mod Fn = 0.
Why Fibonacci Numbers Are The Worst Case
Here’s an intuition to understand why Fibonacci numbers represent the worst case:
- Reduction Rate: In each step, the algorithm reduces the problem size by subtracting the smaller number from the larger one and taking the modulus. For Fibonacci numbers, the remainder is maximized because
Fk+1 mod Fk = Fk−1 is nearly as large as
Fk itself,
leading to the slowest possible reduction.
- Comparison with Other Pairs: For other pairs of numbers, the remainders in each step tend to be smaller relative to the divisor, leading to faster convergence. For instance, if
a
andb
are two large numbers witha>b
, the typical case is thata mod b
is significantly smaller thanb
. Therefore, the number of steps required is fewer than in the case of Fibonacci numbers.
Proof Of The Worst-Case Scenario
Review Of The Fibonacci Numbers:
Fibonacci numbers are a sequence of numbers where each number is the sum of the two preceding ones, starting from 0 and 1. The sequence is as follows: 0, 1, 1, 2, 3, 5, 8, 13, 21, and so on. Formally, Fn is defined as:
F0=0
F1=1
Fn=Fn−1+Fn−2 for n≥2
The worst-case scenario for the Euclidean algorithm occurs when the algorithm requires the maximum number of steps to compute the GCD of two numbers a
and b
where a>b
.
Inductive Proof
We aim to show that if the Euclidean algorithm requires N steps for a pair of numbers a>b
, then a≥FN+2 and b≥FN+1.
Base Case: N=1
For N=1, the algorithm requires only one step, meaning b
divides a
with no remainder. The smallest natural numbers for which this is true are b=1 and a=2, corresponding to Fibonacci numbers F2 and F3 respectively.
Induction Hypothesis
Assume that the result holds for all values of N up to M−1. That is, if the algorithm requires M−1 steps, then for the numbers
a’ and b’
used in those steps,
a’≥F(M−1)+2=FM+1 and
b’≥F(M−1)+1=FM.
Induction Step
Consider the case where the Euclidean algorithm requires M steps for the pair a>b. The first step of the algorithm can be written as: a=q0b+r0 where 0≤r0<b
The algorithm then proceeds with b
and r0, which requires M−1 steps. By the induction hypothesis:
b≥FM+1
r0≥FM
Now, since a=q0b+r0 and q0≥1 (since a>b), a≥b+r0 a≥FM+1+FM
By the properties of Fibonacci numbers: FM+1+FM=FM+2
Thus: a≥FM+2
This completes the induction step, showing that if the Euclidean algorithm requires M steps, then a≥FM+2 and b≥FM+1.
By induction, it follows that the smallest values of a
and b
that require N steps in the Euclidean algorithm are FN+2 and FN+1 respectively. This demonstrates the worst-case scenario for the Euclidean algorithm, where the number of steps required is maximized.
Conclusion
The Euclidean algorithm efficiently computes the GCD with a time complexity of O(log min(a,b))
. This logarithmic complexity arises from the significant reduction in the size of the numbers involved with each iteration, especially noticeable in the worst-case scenario of successive Fibonacci numbers.
Practical Uses
This article is about the algorithms to compute the GCD. We write these programs to study the algorithms.
For practical purposes, you don’t want to write such programs yourselves. If your application requires computing divisors or GCD, you just use the library functions for convenience. These library functions are mature and well optimized for performance. On the other hand, the programs to implement the algorithms in this tutorial are meant to study the time complexities. The algorithms and mathematical concepts discussed in the article could be used as a stepping stone to delve deeper into number theory, cryptography and computer science in general.
To find the divisors of an integer, use the sympy
package.
from sympy import divisors
divisors(252)
# [1, 2, 3, 4, 6, 7, 9, 12, 14, 18, 21, 28, 36, 42, 63, 84, 126, 252]
The standar library function to compute the GCD:
from math import gcd
gcd(252, 105)
Output:
Out[3]: 21
References
- Wikipedia article on GCD
- Wikipedia article on modular arithmetic
- Wikipedia article on the big O notation
- Wikipedia article on Logarithms
- video tutorial of the Fibonacci series on Youtube.
- Modular Arithmetic - a short course by Khan Academy
- Wikipedia article on mathematical induction
- A short introduction to mathematical induction - video on Youtube by Infinity Learn NEET