need to discover a new method of solving a problem when an older solution works too slowly or uses too many resources. Knowing what basis to use to compare different solutions, and how to decide between them, is an essential skill when dealing with algorithms.
The issue of efficiency has been part of discovering and designing new algorithms since the concept of algorithms first came into being, which is why you see so many different algorithms competing to solve the same problem. The concept of measuring the size of the functions within an algorithm and analyzing how the algorithm works isn't new; both Ada Lovelace and Charles Babbage considered the problems of algorithm efficiency in reference to computers as early as 1843 (see https://www.computerhistory.org/babbage/adalovelace/).
Donald Knuth (https://www-cs-faculty.stanford.edu/~knuth/), computer scientist, mathematician, professor emeritus at Stanford University, and author of the milestone, multivolume book The Art of Computer Programming (Addison-Wesley), devoted much of his research and study to comparing algorithms. He strove to formalize the estimation of an algorithm's resource needs in a mathematical way and to allow a correct comparison between alternative solutions. He coined the term analysis of algorithms, which is the branch of computer science devoted to understanding how algorithms work in a formal way. The analysis measures the resources an algorithm requires, either as the number of operations it takes to reach a solution or as the space it occupies (such as the storage an algorithm requires in computer memory).
Analysis of algorithms requires some mathematical understanding and some computations, but it's extremely beneficial in your journey to discover, appreciate, and effectively use algorithms. This topic is considerably more abstract than other topics in this book. To make the discussion less theoretical, later chapters present the practicalities of such measurement by examining algorithms in detail. The following sections give you the basics.
Simulating using abstract machines
The more operations an algorithm requires, the more complex it is. Complexity is a measure of algorithm efficiency in terms of time usage because each operation takes some time. Given the same problem, complex algorithms are generally less favorable than simple ones because they require more time. Think about the situations in which speed of execution makes the difference, such as in the medical or financial sector, or when flying on automatic pilot in an airplane or space rocket. Measuring algorithm complexity is a challenging task, though a necessary one if you want to employ the right solution. The first measurement technique uses abstract machines like the Random Access Machine (RAM).
Abstract machines aren't real computers but rather theoretical ones: computers imagined in their functioning. It's sort of like daydreaming for computer scientists. You use an abstract machine to consider how well an algorithm would work on a computer without testing it on the real thing, while still reflecting the general kind of hardware you'd use. A RAM computer performs basic arithmetic operations and interacts with information in memory, and that's all. Every time a RAM computer does anything, it takes a time step (a time unit). When you evaluate an algorithm in a RAM simulation, you count time steps using the following procedure:
1 Count each simple operation (an arithmetic one) as a time step.
2 Break complex operations into simple arithmetic operations and count time steps as defined in Step 1.
3 Count every data access from memory as one time step.
To perform this accounting, you write a pseudocode version of your algorithm (as mentioned in Chapter 1) and perform these steps using paper and pencil. In the end, it’s a simple approach based on a basic idea of how computers work, a useful approximation that you can use to compare solutions regardless of the power and speed of your hardware or the programming language you use.
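To see how this accounting works in practice, here's a minimal sketch (my example, not the book's) that annotates a simple summing routine with one plausible RAM tally. The exact bookkeeping conventions vary, so treat the counts as illustrative:

    def sum_list(values):
        total = 0                  # 1 memory write: 1 time step
        for v in values:           # each iteration reads one item: 1 time step
            total = total + v      # 1 addition + 1 memory write: 2 time steps
        return total               # 1 time step

    # For n items, this tally gives roughly 3n + 2 time steps,
    # so the cost of the algorithm grows linearly with n.

Whatever convention you choose, what matters is that the same convention applies to every algorithm you compare, so the relationship between input size and operation count stays meaningful.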
Using a simulation is different from running the algorithm on a computer because a simulation uses a standard, predefined input. Real computer measurements require that you run the code and verify the time required to run it. Running code on a computer is actually a benchmark, another form of efficiency measurement, in which you also account for the application environment (such as the type of hardware used and the software implementation). A benchmark is useful but lacks generality. Consider, for instance, how newer hardware can quickly execute an algorithm that took ages on your previous computer.
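By contrast, a benchmark actually runs the code. Here's a small sketch of one (again, my example) using Python's standard timeit module; the number it prints depends entirely on your hardware, interpreter, and system load, which is exactly why benchmarks lack generality:

    import timeit

    # Time 100 runs of sorting 10,000 random numbers on this machine.
    setup = "import random; data = [random.random() for _ in range(10000)]"
    elapsed = timeit.timeit("sorted(data)", setup=setup, number=100)
    print("Average time per sort: %.6f seconds" % (elapsed / 100))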
Getting even more abstract
If you thought things were abstract before, this section makes those previous sections seem concrete, but grit your teeth and move on because you really are up to the task! Measuring a series of steps devised to achieve a solution to a problem poses quite a few challenges. The previous section discusses counting time steps (number of operations), but sometimes you also need to compute space (such as the memory an algorithm consumes). You consider space when your problem is greedy for resources. Depending on the problem, you may consider an algorithm to be better when it works efficiently with regard to one of these resource consumption aspects:
Running time
Computer memory requirements
Hard-disk usage
Power consumption
Data-transmission speed in a network
Some of these aspects relate to others in an inverse manner, so if, for instance, you want speedier execution time, you can sometimes increase memory or power consumption to get it. Not only can you have different efficiency configurations when running an algorithm, you can also change the hardware characteristics and software implementation to accomplish your goals. In terms of hardware, whether you use a supercomputer or a general-purpose computer does matter, and the software, meaning the language used to write the algorithm, is definitely a game changer. In addition, the quantity and kind of data you feed the algorithm can result in better or worse performance measurements.
RAM simulations count time because, when you can employ a solution in so many environments and its resource usage depends on so many factors, you need a way to simplify comparisons so that they become standard. Otherwise, you can't compare possible alternatives. The solution, as so often happens with many other problems, is to use a single measure and say that one size fits all. In this case, the measure is time, which you make equal to the number of operations, that is, the complexity of the algorithm.
A RAM simulation places the algorithm in a situation that's both language- and machine-agnostic (it's independent of programming language and computer type). However, explaining how a RAM simulation works to others requires quite an effort. The analysis of algorithms proposes to take the number of operations you get from a RAM simulation and turn it into a mathematical function expressing how your algorithm behaves in terms of time, that is, a quantification of the steps or operations required as the number of data inputs grows. For instance, if your algorithm sorts objects, you can express complexity using a function that reports how many operations it needs depending on the number of objects it receives.
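As an illustration (my sketch, not the book's), you can instrument a simple sorting algorithm, such as selection sort, to count its comparisons and watch how the count grows with the input size n:

    def selection_sort_count(items):
        """Sort a copy of items and return the number of comparisons made."""
        data = list(items)
        comparisons = 0
        for i in range(len(data)):
            smallest = i
            for j in range(i + 1, len(data)):
                comparisons += 1           # one comparison per inner step
                if data[j] < data[smallest]:
                    smallest = j
            data[i], data[smallest] = data[smallest], data[i]
        return comparisons

    for n in (10, 100, 1000):
        print(n, selection_sort_count(range(n, 0, -1)))

    # Prints 45, 4950, and 499500: exactly n*(n-1)/2 comparisons,
    # so the complexity function here is roughly f(n) = n*n/2.

The counts don't depend on the computer running the code, only on n, which is precisely the hardware-independent behavior the analysis of algorithms captures.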
Working with functions
A function in mathematics is simply a way to map some inputs to a response. Expressed in a different way, a function is a transformation (based on math operations) that transforms (maps) your input to an answer. For certain values of input (usually denoted by the letters x or n), you have a corresponding answer using the math that defines the function. For instance, a function like f(n) = 2n tells you that when your input is a number n, your answer is the number n multiplied by 2.
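In code (a trivial sketch of the same idea), the function maps each input to its doubled value:

    def f(n):
        return 2 * n   # maps the input n to the answer 2n

    print(f(3))    # 6
    print(f(100))  # 200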
A function describing how an algorithm relates its solution to the quantity of data it receives is something you can analyze without specific hardware or software support. It's also easy to compare