The idea behind big O notation
Big O notation is the language we use for articulating how long an algorithm takes to run. It's how we compare the efficiency of different approaches to a problem.
With big O notation we express the runtime in terms of—brace yourself—how quickly it grows relative to the input, as the input gets arbitrarily large.
Let's break that down:
- How quickly the runtime grows—Some external factors affect the time it takes for a function to run: the speed of the processor, what else the computer is running, etc. So it's hard to make strong statements about the exact runtime of an algorithm. Instead we use big O notation to express how quickly its runtime grows.
- Relative to the input—Since we're not looking at an exact number, we need something to phrase our runtime growth in terms of. We use the size of the input. So we can say things like the runtime grows "on the order of the size of the input" (O(n)) or "on the order of the square of the size of the input" (O(n2)).
- As the input gets arbitrarily large—Our algorithm may have steps that seem expensive when n is small but are eclipsed eventually by other steps as n gets huge. For big O analysis, we care most about the stuff that grows fastest as the input grows, because everything else is quickly eclipsed as n gets very large. If you know what an asymptote is, you might see why "big O analysis" is sometimes called "asymptotic analysis."
Big O notation is like math except it's an awesome, not-boring kind of math where you get to wave your hands through the details and just focus on what's basically happening.
If this seems abstract so far, that's because it is.
We express complexity using big-O notation. For a problem of size N:
- A constant-time method is "order 1": O(1)
- A linear-time method is "order N": O(N)
- A quadratic-time method is "order N squared": O(N2)
Note that the big-O expressions do not have constants or low-order terms. This is because, when N gets large enough, constants and low-order terms don't matter (a constant-time method will be faster than a linear-time method, which will be faster than a quadratic-time method). See belowfor an example.
Formal definition:A function T(N) is O(F(N)) if for some constant c and for all values of N greater than some value n0: T(N) <= c * F(N)
The idea is that T(N) is the exact complexity of a method or algorithm as a function of the problem size N, and that F(N) is an upper-bound on that complexity (i.e., the actual time/space or whatever for a problem of size N will be no worse than F(N)). In practice, we want the smallest F(N) -- the leastupper bound on the actual complexity.
For example, consider T(N) = 3 * N2 + 5. We can show that T(N) is O(N2) by choosing c = 4 and n0 = 2. This is because for all values of N greater than 2: 3 * N2 + 5 <= 4 * N2
T(N) is not O(N), because whatever constant c and value n0 you choose, I can always find a value of N greater than n0 so that 3 * N2 + 5 is greater than c * N.