Abdul Bari Algorithms - Time Complexity
An introduction to analyzing algorithms, comparing functions, and Big O notation, such as Big O, Theta, and Omega.
Introduction to Algorithms
Algorithms vs Programs
Algorithm | Program |
---|---|
Focused on Design | Implementation in programming language on hardware |
Domain Knowledge | Software Engineering |
Any Language | Specific Language |
Analyze | Testing |
Priori Analysis and Posteriori Testing
posteriori-vs-a-priori-analysis-of-algorithms notes
Priori Analysis
Hardware independent
Theoretical
Language independent
Time and space function
Posteriori
Watch time and bytes
Algorithm Characteristics
Input
Algorithms can take 0 or more inputs
Output
Algorithms must generate some result or output
If it doesn’t give you any output, an algorithm is not useful
Even if a function returns void, it should return a result in some other method, like modifying a variable somewhere
Definiteness
Everything should be unambiguous and clear.
If you can’t describe the problem to a human, you don’t know it well enough to write an algorithm.
You can’t pass an imaginary number like without specifying how to deal with it
Finiteness
Algorithms must terminate at some point
A web server, which keeps running until you stop it, is a program, not an algorithm. Programs may use algorithms while running.
Effectiveness
Don’t have any unnecessary procedures in your algorithm.
For example, in chemistry, you wouldn’t boil a chemical and not use it in the experiment. That would be unnecessary.
How to Write and Analyze Algorithms
Swapping two numbers
This is the pseudo code for swapping two values.
We won’t always use full language syntax in algorithms
##+BEGIN_NOTE This particular function only works for languages that support “pass by reference” like C/C++. Read more here ##+END_NOTE
function swap(a, b){
tmp = a;
a = b;
b = tmp;
}
Criteria for Analyzing Algorithms
Time and space are the most important criteria when analyzing algorithms
Time
How long will the algorithm take to run?
Space
How much memory does the algorithm need to run?
There are other important characteristics we may care about, but are usually less important.
Network Traffic
How much data needs to be sent over the network when the algorithm runs?
Power
How much power does the algorithm need to run?
This is important when designing for mobile devices
CPU Registers
Sometimes for low level software, you may need to know hardware details like how many CPU registers your algorithms needs.
Time Analysis
Every “simple” statement in an algorithm takes one “unit” of time
If a procedure has 3 simple statements, it takes 3 units of time.
You can write this as
This is a constant value. It doesn’t matter what you give it, it always takes 3 units of time.
For simplicity, we usually say things like y = 3*a + 6*b
is just 1 unit of time. It’s usually not necessary to go into such great detail
Space Analysis
What is the space complexity of this algorithm? How much memory does it use?
function swap(a,b){
tmp = a;
a = b;
b = tmp;
}
It uses 3 variables always, regardless of the input, so we say which is also constant.
Each statement is one “unit” of time and each variable is one “unit” of space
Frequency Count Method
Frequency Count Method
The time taken by an algorithm can be determined by assigning one “unit” of time for each “statement”
If a statement is repeating, the frequency of execution will determine the time taken by the algorithm to run
Sum of elements in array
function(nums){
sum = 0
for (i = 0; i < nums.length; i++){
sum = sum + nums[i];
}
return sum;
}
Time Complexity
If I give it an array of length n, the sum function runs n times, so the algorithm takes time to run. We call this “order of n”
Space Complexity
we have variables sum
, i
and nums
nums
has space n “units” of space and i
and sum
and 1 “unit” of space.
Since nums
has a space complexity of n
so we say the space complexity is “order of n”
Matrix Addition
function addMatrix(a,b){
for(i= 0; i < a.length; i++){
for(j = 0; j < a[0].length; j++){
c[i][j] = a[i][j] + b[i][j];
}
}
}
Time Complexity
Two for nested for loops, each taking n time.
n procedures executing n times
Order of n^2 or
Space Complexity
variables a,b,c
matrices.
variables i,j
scalar variables.
Time Complexity
How do we analyze time complexity for given code?
What things affect time complexity?
We need to figure out how many times stmt
executes
Normal for loops
stmt
will be executed n times, so it’s
for(i=0; i<n;i++){
stmt()
}
Decrementing for loop
Even though i
is decrementing, it will be executed n times, so it’s also
for(i=n; i>0;i--){
stmt()
}
Increment by two
for(i=0; i<n;i+=2){
stmt()
}
In this code we increment by two, so it will be executed n/2 times. It is because n * anything is
Nested for loops
for(i=0; i<n;i++){
for(j=0; j<n;j++){
stmt()
}
}
Each loop executes n
times so stmt
is executed n
* n
times or
Dependent For Loops
What happens if the inner loop is dependent on the outer loop?
for(i=0; i<n;i++){
for(j=0; j<i;j++){
stmt()
}
}
Let’s make a table to keep track of the values at each interation
i | j | stmt executions |
---|---|---|
0 | 0 | 0 |
1 | 0 , 1 | 1 |
2 | 0, 1, 2 | 2 |
3 | 0, 1, 2, 3 | 3 |
… | … | … |
n | 0,1,2,…,n | n |
How many times is stmt
executed?
The outer loop goes from 0 to n
When i
is 1, stmt
is executed 1 times
When i
is 2, stmt
is executed 2 times
Finding out how many times stmt
is executed is the same as the integer-sum-formula
This is equivalent to 1 + 2 + 3 + 4 … + n
We can use
This can be expanded out to
This is simplified to to because we only care about the biggest exponent.
Outer loop does not execute n times
p=0
for(i=1; p<=n;i++){
p=p+i
stmt()
}
How many times is stmt
executed?
Let’s make a table that shows the values at each iteration
i | p |
---|---|
1 | 0+1=1 |
2 | 1+2=3 |
3 | 1+2+3 |
4 | 1+2+3+4 |
k | 1+2+3+…+k |
We can use the integer-sum-formula again to find p
for a given i
When will this loop stop? when p > n
Let’s replace p
with our formula and simplify
This can be expanded out and simplified to (we know we only care about the exponent for big o time complexity analysis)
Therefore, the time complexity is
Multiply i value
for(i=0; i<n;i=i*2){
stmt()
}
How many times will this execute?
k | i |
---|---|
1 | 1 |
2 | 2 |
3 | 2^2 |
4 | 2^3 |
2^k
assumei>=n
which breaks the loop
The time complexity is
Divide i value
for(i=0; i<n;i=i/2){
stmt()
}
i < 1
The time complexity is
While loops and If
How can we analyze functions with while loops and if statements?
We can make a table of values to understand the exection
while (m != n){
if (m > n){
m=m-n
} else {
n = n-m
}
}
m=16 | n=2 |
---|---|
14 | 2 |
12 | 2 |
10 | 2 |
8 | 2 |
6 | 2 |
4 | 2 |
2 | 2 |
We can see that with an input of 16, it will run 7 times, so 16/2 = 8, which can give us the upper limit
The time complexity is n/2 which simplifies to
Classes of Functions
Types of time functions
These are listed in increasing size
Constant time
Always runs in a fixed amount of time, doesn’t depend on input
or are both constant, which is described as
Logarithmic
Linear
and both simplify to
Quadratic
Cubic
Exponential
Sample values for different classes
0 | 1 | 1 | 2 |
1 | 2 | 4 | 4 |
2 | 4 | 16 | 16 |
3 | 8 | 64 | 256 |
We can see that the growth is much faster for the exponential equations
When n gets large, will always be less than
By Cmglee
Asymptotic Notation
Big O -
Upper Bound
means there are positive constants c and k, such that for all .
If you were to graph big of bounding function, your function’s value would always be less than the big O upper bound
If you have a function
What’s an example of a function that will always be greater?
10 is the constant c, from the Big o definition
Try to use the closest function for the upper bound, even though higher values like could be the upper bound, it’s less useful
Big Omega - Ω
Lower Bound
Similar to Big O notation, but your function will always be greater than the omega function
means there are positive constants c and k, such that for all .
Theta - Θ
Average Bound
means there are positive constants , and k, such that for all n ≥ k.
Consider
In this case
Since this is average notation, you can’t use , it’s out of the bounds
Properties of asymptotic notation
General Property
if is then is
e.g. is
is
Reflexive Property
If is given, then =
ex then
Transitive Property
if is and is
then is
if is upper bound for and is upper bound for , then is also an upper bound for
Symmetric Property
Only true for Θ notation
if is then is
e.g ,
Transpose Symmetric
True for and
if if is
eg ,
then is and is
if is and
is acting as both lower bound as well as upper bound
So
then
ex. If ,
Then
then
When adding, take the bigger term
Also
Comparison of functions
If we have two functions, how can we show which is the upper bound and which is the lower bound?
For example, vs
We can sample values and observe which is greater in a table
2 | ||
3 | ||
4 |
Apply on both sides
vs
vs
This simplifies to vs
2 * log(n) <= 3 * log(n)
We can see that 2log(n) is always less than 3log(n)
Logarithms guide
then
Comparison of functions
First example
apply
Second Example
using the property
using the property
so it simplifies to
so and
Third example
Apply log
The power inside the log statement comes out via the third formula
Apply log
The power inside the log statement comes out via the third formula
Simplify to one
So and
Best, Worst, and Average Case Analysis
Linear Search
Given a list [8,6,12,5,9,7,4,3,16,18]
And we want to search for 7
Linear search starts at the first element, checking each element one at a time moving from left to right
If we search for something that doesn’t exist like 20
it will search from left to right until it reaches the end of the list.
Best Case
The best case is the fastest the algorithm can possibly run
In linear search, If the element you’re searching for is present at the first index, that’s the best case.
Best case time is , it will always take one iteration no matter how long the list is, if the key value is at the first index
Worst Case
The worst case is the slowest the algorithm can run
In linear search, if the element you’re searching for is present at the last index, that’s the worst case.
Worst case is because you have to search every element in the list
Average Case
All possible case times divided by number of cases. Usually this isn’t feasible to find, so we rarely do it and focus on the worst case time instead
To find this, you find all possible cases, add up time taken in each possible case, and divide by the number of cases
What are the cases of linear search?
Key present at first index, key present at second index, etc
If at first index, then 1 comparisons
If at second index, then 2 comparisions
etc
So 1 + 2 +3 + .. +n is the total possible case time
and there are n possible cases
We can use integer-sum-formula to add up the time of all cases
We divide the total case time by the number of cases
We simply by dividing out n at the top and bottom, and are left with
Which is the average case time
Asymptotic notation
Don’t confuse best, worst, and average case with Big O, Big Omega, and Theta
Best case and worst can be expressed using any of these. Big O isn’t only used to express worst case
Best case can be expressed using any of these notations
Best case(n) = 1
Best case(n) =
Best case(n) =
Best case(n) =
Binary Search
If you’re searching for 15
, start at root 20
Is 15
bigger or smaller than 20
? smaller so move left
Then check 10
, is 15
larger than 10
? larger, so move right
Best Case
If the element you’re searching for is the root, the best case time is constant 1
Worst Case
In the worst case, the element you’re searching for is a leaf
So the worst case is the height of the binary tree, which is
So the worst case is
Unbalanced binary search tree
A binary tree could be unbalanced, this binary tree is left skewed
It has the height n
The best case is still 1
when the element is at the root
However, the worst case is the height of the tree, which is n
in this case, so the worst case is n
whereas the worst case for a balanced binary tree is