My Top 3 Takeaways from Donald Knuth’s The Art of Computer Programming Vol. 1
A couple of months ago, I came across a lecture on the Analysis of Algorithms taught by Stanford Professor Donald Knuth. The topics covered in that 1-hour lecture genuinely piqued my interest and inspired me to start learning more by reading The Art of Computer Programming series of books. I recently completed the first volume in the 5 volume series, and I am excited to share what I have learned so far. This book on Fundamental Algorithms has made me appreciate the reality that computer programming is no less than art, and that nothing is more beautiful than a well-designed algorithm. By writing this, I hope to internalize these concepts for myself and to inspire others to learn more about the art of computer programming.
Who am I?
Since this is my first post, allow me to introduce myself. My name is Qasim Khawaja, and I am a second-year computing science student at the University of Alberta, and programming is my first love and genuine passion. I’m enthusiastic, and it won’t be an exaggeration if I say that I’m crazy about everything that has to do with coding and technology in general. As far as my professional experience goes, I’m currently working at Intuit as a co-op software developer. When I’m not working, I enjoy challenging myself to become a better developer by continuously learning more about computing science.
Now let’s get started. Shall we?
The Art of Computer Programming Vol. 1 | Fundamental Algorithms
This book series is overwhelmingly impressive; it has now been over 50 years since it was initially published, and this text is still considered the “magnum opus” of computer science. Renowned scientists have called it the bible of all the fundamental algorithms. Although Knuth published the first edition in the 60s, the underlying concepts remain timeless.
I should note that the programming examples in this book all use the MIX assembly language for the hypothetical MIX computer, which I will cover briefly later on, and also include many types of algorithms along with their comprehensive analysis. I found the challenge of learning an assembly language very rewarding, and an excellent reason on its own to read through this book, the in-depth study of algorithms was fascinating as well.
The exercises in this book also take you throughout multiple levels of expertise; the problems start easy with warm-ups and progress in difficulty as they go into research problems that are still unsolved.
Fun fact: At the end of volume 1, there is a very popular blurb by Bill Gates:
“You should definitely send me a résumé if you can read the whole thing.”
If a co-sign from Bill Gates himself doesn’t justify reading this, I don’t know what else will.
Now I’ll discuss the three major takeaways that I had from reading so far:
Topic 1: Analysis of an Algorithm and Asymptotic notation
Mathematical Induction
Before getting into details, let’s first talk about mathematical induction, which is a potent technique of mathematical proof that demonstrates the validity of a statement for all the natural numbers.
Proof by mathematical induction consists of two main steps:
The (base case): prove that the statement holds for the first natural number n. This is usually, n = 0 or n = 1.
The induction step: prove that, if the statement is true for some natural number n, then the statement is true for n + 1.
A simple example to illustrate this is to imagine a line of dominos:
If the first domino falls, and
if any domino (n) in the chain that falls, causes the next domino in line (n+1) to fall,
proves that all dominos will fall.
In computing science, mathematical induction is used in an analysis of an algorithm to prove that the concerned algorithm correctly performs as expected (correction) by taking a certain amount of time (complexity).
It’s one of the most critical and vital concepts in computer programming because each program should work correctly in the least possible time.
It’s also exciting to notice that one program performs better than the other while solving the same problem. To analyze the efficiency of a program, computer scientists use different techniques. Moreover, advanced programs are developed based on different, well-proven algorithms. The process of analyzing the problem-solving capabilities of a specific algorithm is known as the analysis of the algorithm.
Analysis of an Algorithm
There are different ways to approach an analysis of an algorithm, including quantitative mathematical analysis
Some essential steps for analyzing an algorithm include:
- Complete implementation of the algorithm
- Identifying all the possible quantities to describe the execution frequency of all the necessary operations
- Creating a realistic input model to program
- Calculation of space and time complexities
An example of a straightforward algorithm that Knuth analyzes in the book is the algorithm for finding the max integer in a list of integers. Here is a flowchart for this algorithm.
To analyze, let’s break down the number of times each step is executed with an input of size n.
+======+========================+
| Step | Frequency of Execution |
+======+========================+
| 1 | 1 |
+------+------------------------+
| 2 | n |
+------+------------------------+
| 3 | n-1 |
+------+------------------------+
| 4 | A |
+------+------------------------+
| 5 | n-1 |
+------+------------------------+
The only quantity that remains to be analyzed is ‘A,’ which is determined by the input.
We must determine the maximum, minimum, average, and the standard deviation of this value.
The maximum and minimum values for A are easy to determine for this algorithm; the max value for A is n-1, in the case that the list is in descending order, where every iteration will reassign the m value, and the minimum is zero when X[n] is the maximum value.
To determine the average value, we first need to determine the characteristics of the input. In this example, we assume that all the values of X are distinct and each of the n! permutations of these values are equally likely to occur.
So in the case that n = 3. There are six possible arrangements of the values in X.
+================+========+===============+
| Situation | Step 4 | “A” value |
+================+========+===============+
| X[1]<X[2]<X[3] | N/A | 0(Minimum) |
+----------------+--------+---------------+
| X[1]<X[3]<X[2] | m<-X[2]| 1 |
+----------------+--------+---------------+
| X[2]<X[1]<X[3] | N/A | 0(Minimum) |
+----------------+--------+---------------+
| X[2]<X[3]<X[1] | m<-X[1]| 1 |
+----------------+--------+---------------+
| X[3]<X[2]<X[1] | m<-X[2]| |
| | m<-X[1]| 2(Max) |
+----------------+--------+---------------+
| X[3]<X[1]<X[2] | m<-X[3]| 1 |
+----------------+--------+---------------+**THIS TABLE MAY NOT SHOW UP CORRECTLY ON A PHONE SCREEN**
So when n=3, the average A count is ( 0+1+0+1+2+1)/6 = 5/6
Generalizing this for n, we must realize that the probability of A being k is
pₙₖ = (the number of times A=k for n! objects) divided by n!
For example:
p₃₁= (2)/3! = ⅓
p₃₀ = (3)/3! = ½
p₃₂ = (1)/3! = ⅙
The formula for the average of A is found to be
Aₙ =∑ₖkpₙₖ
To verify
A₃ =∑ₖkpₙₖ= 0*⅓+ 1*½+2*⅙ = 5/6
That’s as far as I will go into the analysis; however, Dr. Knuth’s review is a lot more comprehensive and thorough. To see the complete study, check out the book or his lecture, and I hope like me, you also appreciate how algorithms are quantifiably analyzed and studied.
Asymptotic analysis is a type of algorithm analysis that approximates the solution for large values of n for input sizes. Often it’s more helpful to know that the complexity of an algorithm is O(n²) rather than 4n²+2n.
Asymptotic notations are expressions that denote the complexity of algorithms.
The O-notation and the Big-O notation are asymptotic notations, which, as mentioned earlier, are used to find the complexity of an algorithm. These expressions indicate the relationship between the necessary operations (steps) needed to execute an algorithm and the size of the input provided to the algorithm. As the name suggests, O(n) denotes the expression, where n represents the input’s size.
Although not covered in this volume, sorting algorithms are important for understanding this notation and easily comparing their different types.
Some of the most common sorting algorithms with their complexities are as followed:
Merge Sort: The efficiency of the merge sort is way better with more massive data sets, and programmers generally use it when they have to deal with large amounts of data. The complexities of Merge Sort are listed below.
- Worst Case O(n log n)
- Best Case O(n) natural variant, O(n log n) typical
- Average Case O(n log n)
Quick Sort: It is by far the most efficient and most used sorting algorithm, and it works best with smaller data sets. The complexities of Quick Sort are as followed:
- Worst Case O(n²)
- Best Case O(n log n)
- Average Case O(n log n)
Bubble Sort: It is one of the less used sorting algorithms, and its complexities are as followed:
- Worst Case O(n²)
- Best Case O(n)
- Average Case O(n²)
Insertion Sort: It is yet another stable and faster sorting algorithm for small data sets, and its complexities are listed below:
- Worst Case О(n)
- Best Case O(n)
- Average Case О(n²)
Topic 2: Information Structures (Linear Lists and Trees)
Information Structures are my second key takeaway from The Art of Computer Programming Vol 1. because of their usefulness and countless applications in computer sciences. The two main information structures discussed in the book are linear lists and trees.
Linear Lists
Linear lists are a data structure(information structure) that stores a sequence of nodes. The elements of a linear list are ordered in a linear sequential way.
There are three important types of linear lists:
A stack is one type of linear list, and it works on LIFO (Last in First out) functionality. This means that it has a common inserting (push) and exiting (pop) point. If we push the data into the stack one by one, the first item that we pushed will be the last one to pop. The item that accessible from the stack at any point is the youngest item inserted also referred to as the top of the stack. And the last item is the oldest item inserted into the stack, this is referred to as the bottom. In the book, a railway analogy is used to visualize a stack, but I prefer to stick with the visual of this toy.
Stacks are fairly common in computer programming in conjunction with recursive algorithms, an example being the Javascript Call Stack.
This is another type of linear list. As the name would imply, Queues work on FIFO (First in First out) approach and items can only be inserted (Enqueued) from one end(rear) and deleted (Dequeued) from the other (front).
Queues are also fairly common in computer programming, an example would be the Javascript Callback Queue.
This type of linear list is a bit more general because it behaves like a “double-ended” queue. It has two open ends, where items can be inserted and deleted. The two ends are referred to as the left and right of a deque. Deques are also distinguished by being defined as output restricted or input restricted, this assertion limits the insertion and deletion to occur at one end.
Sequential Allocation
This is the most obvious way to store a linear list. It is a type of allocation where we store the nodes in sequence in memory, starting at some base address.
Linked Allocation
In this type of allocation, the linear list does not have to be stored sequentially in memory, but instead, it is possible to store the nodes in the list in arbitrary locations in memory and retain access to the list by managing at least two pieces of information in one node (data and reference to the next node). This allocation takes up more memory but evidently it allows for easier insertion and deletion into the list.
Here is an application of Linked Allocation:
Trees
Trees are nonlinear ways to save data in a hierarchical data structure, and they are especially useful for complex problems.
The most common properties of trees are:
- Nodes
- Pointer (one or more)
Binary Tree
Binary Trees are the most common hierarchical data structure, and in the binary tree, each node has two children (left and right). Each node of the binary tree contains the following component.
- Pointer to the right child
- Pointer to left child
- Data
Traversing a Binary Tree
Traversing the tree means that you have visited all the nodes systematically in a tree at least once. There are three basic types of tree traversal.
- Preorder
- Inorder
- Postorder
Some types of tree structures are as followed:
- General Tree
- Binary Tree (covered)
- AVL Tree
- Red-Black Tree
- N-ary Tree
Topic 3: The Basics of How a Computer Works
(This Section is Under Construction. You may skip this)
To put it simply, a computer receives data via the input device with the instructions to act, and it displays the results on the output device.
Register
These are the storage locations in an internal memory of the computers which are dedicated to speed up the operations of the processors.
There are always a limited number of registers stores the data elements, and they do not need to access the memory for processing.
Word
The basic unit of data in MIX is a byte. A word is a piece of data with a fixed number of bytes, which is managed by the processor hardware instruction set as a unit. The most important properties of the word for particular computer architecture or processor are as followed:
- Size
- Width
- Length
MIX Computer
It is a hypothetical computer, envisioned by Donald Knuth in The Art of Computer Programming Vol 1, its model number is 1009 (sum of the model numbers of 16 other similar computers.) It is neither a fully decimal nor a binary machine, this peculiar property allows algorithms written on the MIX computer to be simulated in any type of machine with slight modification.
Different Parts of the MIX Computer
- 9 main registers: rA, rX, rI1, rI2, rI3, rI4, rI5, rI6, rJ
- rA accumulator register
- rX extension register
- rI1 — rI6 are index registers
- rJ holds the jump address
- Memory and input/output devices, including Tape units, Disk or drum units, Card reader, Typewriter terminal.
- Instructions which occupy the word
How Computer Executes a Program
We all use computers, but have you ever wondered how the programs in it are executed? Here’s how.
- There is an instruction sequence stored in the memory with all the functionalities to perform.
- From that instruction sequence that memory address is copied which points to the first instruction
- That address is sent to the memory in the program counter by the CPU through an address bus.
- The memory sends a copy of the bits’ state on the data bus at the location.
- The CPU takes a copy of that information into the instruction register
- The pointer, which represents the instruction, is incremented to the next address for the next memory instruction.
- The instruction is executed in the instruction register by the CPU
Bottom Line
The theories that back up Computing Science are numerous. This book does an impressive job of introducing the reader to essential concepts to build a foundational understanding of this field.
If you are a programming enthusiast, then I hope this discussion attracted you. If I missed something important or if you have any questions, please feel free to reach out to me in the comments below.
For more discussion, you can also join my other social media pages through the following links.
Linkedin: https://www.linkedin.com/in/qkhawaja
Twitter: https://twitter.com/qasim_ii