Lap, can you quickly show me a long social chain?
You are challenging your laptop. You devised a plan to compute a long social chain. But your first attemps failed, timely speaking. Can you think of an alternative? What about tweaking your computations to get them fast?
- The objective of the challenge:
- The tools you can use and why?
- How to deal with the challenge?
- A first step: getting some perfect numbers.
- A second step: computing friendly numbers.
- The final step: tweaking your algorithm to make it faster.
- Conclusion and take-away:
You want to challenge your laptop.
Working with large and big integers is central to cryptography and cybersecurity applications.
You want to get a sense of feeling about working with big numbers.
You know that a very specific task in the domain, which still challenges teams of computer scientists, is to find the prime decomposition of a very big integer in a reasonable time.
This task is soundly simple in those words.
The formal question is however to find the list of prime numbers, and their multiplicity, with which such a decomposition is found.
But, you know finding big primes is practically computationally very expensive. That's interesting!
There aren't indeed very fast algorithms that can factor a, let'say 'cryptographic number', which is a very big integer used in cybersecurity.
The overall objective of your new challenge is unique. It deserves you to bring a feel of the need of tweaking even the simplest algorithms to speed them up and solve a task.
More specifically, you want to code an efficient algorithm of prime decomposition that will push your laptop in the run.
Intrigued about that problem, you discovered some special fun integers in your discrete math textbook:
You want to exhibit the longest possible social chain with your laptop and with a minimal set of basic python tools and rules.
Let's get started!
The tools you can use and why?
You are only allowed to use the Python core library to work out arithmetic and lists operations in your code, including the modulo operation.
Especially, you constrain yourself not using any specific extra module to code your solution with, except the Datetime and Matplolib for timing considerations.
Computations are long and you guess you will need to track time and to plot some curves while investigating the complexity of the problem.
You indeed want to visualize the complexity of the solutions you find. You want to plot the time required to compute a set of inputs.
You are finally additionnaly warmly welcome to use mathematical tricks to tweak your algorithms with and to speed them up. This is an open book challenge.
You are given the next three basic definitions to start with:
- A perfect number $n$ is positive integer whose proper divisors (all divisors except $n$ itself) add up to $n$.
- Friendly numbers is a couple $(m, n)$ of positive integers where the proper divisors of $m$ add up to $n$, and similarly, the proper divisors of $n$ add up to $m$.
- A social chain is a cyclic finite sequence of $q$ positive integers $(n_1, n_2, \dots, n_q$) in which one element' proper divisors add up to his successor.
This is however declarative knowledge. Although you know what these definitions mean, they don't teach you how to compute those math facts.
You therefore devise a plan for the computation.
You will start with the most basic approach: the naive approach. Listing each and every divisors of an integer.
Computing some numbers in such a way will provide a baseline for the complexity of the problem.
You next devise to code an algorithm of prime decomposition. The Sieve of Erastoten is one example. You can choose one or more examples.
Your bet is a faster running time.
After thourough thinking, you have a three steps plan which the details are given in the next sections.
To tackle the first task, you have to write three utilities that do the following jobs :
- A function that takes a positive integer $n$ and returns a list of all its positive divisors.
- A function that takes a positive integer $n$ and returns True if $n$ is a perfect number.
- A function that takes a positive integer $N$ and returns a list of all perfect numbers $n_1, n_2, \dots n_q$ less than $N$.
To that point, you will be getting a list of perfect numbers. You could therefore explore how long it takes to get a list for large $N$.
Note that in order to compute the divisors of a positive integer $n$ using a naive exhaustion approach, you will simply test each and every integers less than or equal to $n$ for divisibility.
You are furthermore suggested to keep track of the computation time to get a perfect numbers list and to study the temporal complexity this naive approach has.
You can there produce some nice visualisations to show your results.
When having an algorithm that finishes in a reasonable time and that is correct, whatever slow it is, you are ready to verify some particular perfect numbers properties.
For instance, what about the sum $S = \sum_{i=1}^k \frac{1}{d_i}$, where $d_i$ describes the set of all divisors of a positive integer ?
You can conjecture a result and try to prove it.
Finally in this first step, you have now tools to play with perfect numbers. Can you get a list the first ten perfect numbers in less than 30 minutes ? What is the biggest perfect number you can compute?
Your second task is to compute a list of friendly numbers.
You need therefore to modify your solution such that your algorithm checks whether a couple of positive integers are friendly numbers.
Recall. These numbers are defined such that the sum of the divisors of the first number is equal to the second number and vice-versa.
As in the first task, you will still use a naive solution and you will still be confronted to slow computations.
How long does it take compute the first 20 friendly numbers?
Can you produce a time complexity plot? Can you guess the Big Oh complexity of the naive solution?
Well, until now, if you wanted to compute some special numbers with a fast algorithm, you lose the challenge.
Your laptop has not enough computational power to exhibe a large perfect number or a large friendly numbers.
However, this work gives you a baseline solution that you can try to enhance.
This is your last and most important task.
Remembering the prime decomposition of a positive integer, you fortunately learn about a result that you guess will help.
The sum of every divisors of $n$, including $n$ is given by the following formula:
$$s(n)= \prod_{i=1}^k \frac{p_i^{\alpha_i+1}-1}{p_i-1}$$
where $p_i$ et $\alpha_i$ are the components of the prime decomposition of $n$. This is easily checked.
Hence, instead of searching for each and every divisors of $n$ as in the naive approach, you now want to compute for the prime decomposition of that number.
That's your initial objective and you're to succeed.
As already said, there are several algorithms that tackle this problem. Choose one or two of them to implement your new solution.
Then study the time complexity of your solution. What is the longest social chain you can exhibit in 30 minutes?
What is the longest social chain you can compute patiently with your laptop?
You hopefully worked out successfully the three steps of this confrontation with your laptop.
The first two tasks were not fast enough to tackle the problem of exhibing special integers large enough you can work or play with in a reasonable time...
Then you tweaked your solution using prime decompositions. This speeds up your algorithm and still you realize you are limited to relatively small numbers.
Prime decomposition is declarative knowledge, as compared to procedural knowledge. There are multitude algorithms to achieve this task.
You have now a sense of what is a hard problem.
Designing fast algorithms for prime decomposition and a various classes of problems, even resulting from soundly simple descriptions, can be very challenging and is still in active research.
As a successul Data Scientist, you have to design, optimize and work with as fast as possibe learning algorithms.