Week 2 - More Advanced Data Structures: Linked Lists and Stacks

This week we will look at two more specialised data structures, the linked list and the stack.

The linked list

We will start with a look at linked lists. Linked lists, which are different to the plain lists we discussed last time, differ from arrays and lists in that they are not stored contiguously in memory. Instead, the data is stored as a series of linked nodes. Each node contains one item of data, together with links to the memory locations of the previous and the next node in the linked list.

Linked list diagram

Each node has a link to the previous and the following node. When we add a new item of data to the end of the list, we make the current last node link to the new node, and we link the new node back to that node to form a two-way link.

The first node in the list links to nothing in the reverse direction (indicated in Python by the special value None) and similarly, the final node in the list links to nothing in the forward direction.
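
As a rough illustration, a node of this kind might be sketched in Python as below. The class name `Node` and the attribute names `data`, `prev` and `next` are assumptions chosen for this sketch, not names fixed by the notes.

```python
class Node:
    """One node of a two-way (doubly) linked list. Names are illustrative."""

    def __init__(self, data):
        self.data = data
        self.prev = None  # link to the previous node; None for the first node
        self.next = None  # link to the following node; None for the final node


# Build a two-node list by hand: "a" <-> "b".
first = Node("a")
second = Node("b")
first.next = second   # forward link from the first node to the second
second.prev = first   # backward link, completing the two-way link

print(first.prev)       # None: nothing before the first node
print(second.next)      # None: nothing after the final node
print(first.next.data)  # b
```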

What are the consequences of this?

Remember how we could use simple arithmetic, using the array index, to calculate the location in memory of a given element in an array or list? Can we do this here? We cannot. This is because, in a linked list, items are not stored contiguously in memory. Instead, each node contains references to the memory locations of the previous and the following node, so to reach the item at a given index we have to start at one end and follow the links one node at a time. This makes lookup by index inefficient compared to an array.
On the other hand, as long as we have a reference to both the start and the end of the linked list, it is efficient to add a new member to the end of the linked list. We can just create a new node and link it, both ways, to the end node. Contrast this with arrays, in which we had to create a new array with additional space and copy the elements over. We will explore this in more detail in the exercises this week.
Insertion into the middle of the list has mixed efficiency. On the one hand, we have to find the position at which we want to insert the element (which, as we saw above, is inefficient). On the other hand, the actual insertion is easy: we break the existing link between the node that will come BEFORE the new element and the node that will come AFTER it, and then link the new element in between them. Again, we will look at this in the exercises.
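
The three points above can be sketched in Python as follows. This is only one possible design, not a definitive implementation. The names `LinkedList`, `head`, `tail`, `append`, `node_at` and `insert_after` are assumptions made for this sketch, as is the choice to raise `IndexError` when inserting after a position that does not exist.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.prev = None
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None  # first node, or None when the list is empty
        self.tail = None  # last node, or None when the list is empty

    def append(self, data):
        """Add to the end. No searching is needed because we keep a tail reference."""
        node = Node(data)
        if self.tail is None:      # empty list: the new node is also the head
            self.head = node
        else:
            self.tail.next = node  # old end node links forward to the new node
            node.prev = self.tail  # new node links back to the old end node
        self.tail = node

    def node_at(self, index):
        """Follow the links from the head, one node per step. None if out of range."""
        if index < 0:
            return None
        current = self.head
        for _ in range(index):
            if current is None:
                return None
            current = current.next
        return current

    def insert_after(self, index, data):
        """Find the node at index (slow), then relink around the new node (fast)."""
        before = self.node_at(index)
        if before is None:
            raise IndexError("no node at index " + str(index))
        if before is self.tail:    # inserting after the end is just an append
            self.append(data)
            return
        after = before.next
        node = Node(data)
        node.prev = before
        node.next = after
        before.next = node         # replaces the old before -> after link
        after.prev = node          # replaces the old after -> before link


letters = LinkedList()
for letter in ["a", "b", "d"]:
    letters.append(letter)
letters.insert_after(1, "c")

print(letters.node_at(2).data)  # c
print(letters.node_at(3).data)  # d
print(letters.node_at(9))       # None
```

Note that `node_at` has to take one step per index, whereas `append` does the same small amount of work however long the list is.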

The stack

A stack data structure involves adding items from bottom to top, rather like a stack of plates. When we remove items from the stack, we remove from the top, again just like a stack of plates. The stack is known as a "last in first out" or "LIFO" data structure, because the last things we add to the stack are the first things we remove. Here is an example of a simple stack of numbers.

Simple stack
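
As a minimal sketch of the "last in first out" behaviour, an ordinary Python list can stand in for a stack if we treat the end of the list as the top. The variable names here are illustrative only.

```python
stack = []         # an empty stack; the end of the list is the top

stack.append(1)    # add 1 (the bottom of the stack)
stack.append(2)
stack.append(3)    # 3 is now on top
print(stack)       # [1, 2, 3]

top = stack.pop()  # remove and return the item on top
print(top)         # 3: the last item added is the first one removed
print(stack)       # [1, 2]
```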

A stack can be used for any operation in which we need to navigate back to a previous state. Examples could include:

Browser navigation. When we visit a website, we often need to navigate back to a previous site. When we click the 'Back' button, we want to return to the site immediately preceding the one we are currently viewing. So when you click 'Back', the current site might be removed from the stack so that you return to the previous site.
Directory/folder structure. When navigating the folder system of your computer, you typically start at a 'root' folder (for example C:\ on Windows, or your home directory on Linux) and then navigate to subfolders, for example C:\Pictures. You then might navigate to a sub-sub-folder, such as C:\Pictures\Holiday and then C:\Pictures\Holiday\2018 and so on. In a subfolder you can navigate upwards to the previous folder, so that if you are in C:\Pictures\Holiday and you navigate upwards, you arrive at C:\Pictures and then C:\ if you navigate upwards once more. So the process of navigating upwards removes the current folder from the stack and returns to the previous folder.
"Undo" commands in desktop applications. Each action you take in a desktop application might be stored on a stack, so that if you select "Undo", the topmost operation would be reversed, and then removed from the stack.

(In practice, each of these is now implemented in a slightly more complex way. Modern browsers, for example, let you move both back and forwards through your history. For the purposes of illustrating a stack, we are assuming a simplified implementation in which you can only move back.)
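
The simplified "Back only" browser could be sketched like this. The class name `Browser`, its methods and the page names are all hypothetical, invented for the illustration. The decision to stay on the first page when there is nothing to go back to is also an assumption.

```python
class Browser:
    """Simplified browser history: a stack whose top is the page being viewed."""

    def __init__(self, start_page):
        self.pages = [start_page]

    def visit(self, page):
        self.pages.append(page)  # the new page goes on top of the stack

    def back(self):
        if len(self.pages) > 1:  # assumption: never remove the very first page
            self.pages.pop()     # remove the current page from the stack
        return self.pages[-1]    # the previous page is now on top


browser = Browser("home")
browser.visit("news")
browser.visit("article")

print(browser.back())  # news
print(browser.back())  # home
print(browser.back())  # home (nothing earlier to return to)
```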

Another use of stacks, which you will appreciate more when you have done more programming, is:

Storing function calls in a program.

The two key operations of a stack, adding and removing items, have special terms.

Push. To push an item onto a stack means to add it to the top. The stack may only have a certain capacity, i.e. it can only hold a certain number of items (perhaps due to memory constraints), in which case trying to push onto a full stack causes an error.
Pop. To pop an item off the stack means to remove it from the top. The item is removed, and we also obtain it as a result of the pop operation. If the stack is empty, an error is generated.

An additional operation is:

Peek. To peek at a stack is to obtain the value of the top-most item of the stack without removing it.

Stacks are typically implemented using an internal array or list. When we push an item to a stack, we add an item to the next available space in this internal array. When we pop an item from a stack, we remove it from the last occupied space in the internal array, and "blank out" that position in the internal array.
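
The three operations might look something like the sketch below. It deliberately leans on Python's built-in list, which grows and shrinks by itself, rather than on the fixed internal array just described, so it is not the design asked for in the exercises. The class name `Stack`, the `capacity` parameter and the choice of `OverflowError` and `IndexError` are assumptions made for this sketch. Treating a peek at an empty stack as an error is also an assumption.

```python
class Stack:
    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of items the stack may hold
        self.items = []           # the end of this list is the top of the stack

    def push(self, item):
        if len(self.items) >= self.capacity:
            raise OverflowError("stack is full")
        self.items.append(item)

    def pop(self):
        if not self.items:
            raise IndexError("stack is empty")
        return self.items.pop()   # removes the top item and returns it

    def peek(self):
        if not self.items:
            raise IndexError("stack is empty")
        return self.items[-1]     # returns the top item without removing it


plates = Stack(capacity=2)
plates.push("blue")
plates.push("red")
print(plates.peek())  # red (still on the stack)
print(plates.pop())   # red (now removed)
print(plates.pop())   # blue
```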

Pseudocode

When coding a data structure or algorithm, it is useful to be able to work out the logic of the code before you actually write any real code. Why is this useful?

It gets you to think carefully through the logic of the data structure or algorithm before you actually write code, including any errors that might occur.
Once the logic has been worked out (arguably the hard part, harder than actually writing the code), the data structure or algorithm can then be implemented in any programming language, or even in multiple languages.

We can use pseudocode to represent such logic. Pseudocode is a way of representing the logic of some code as concise, clear and unambiguous statements in English (or another human language).

Starting with a basic example, "Hello World" might look like this in pseudocode:

Print "hello world"

Moving on to a more interesting example, here is some pseudocode to represent some simple logic to determine if you are old enough to vote:

Read age from the user into a variable "age"
If age is 18 or greater
    Print "You are old enough to vote"
Otherwise
    Print "Sadly you are not old enough to vote, but you will be in "
    Print 18 minus age
    Print "years"
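
For comparison, one way this pseudocode might be translated into Python is sketched below. The function name `voting_message` is an assumption for this sketch. The age is passed in as a parameter instead of being read from the user, so that the example runs without any input.

```python
def voting_message(age):
    """Return the message that the pseudocode above would print for this age."""
    if age >= 18:
        return "You are old enough to vote"
    else:
        return ("Sadly you are not old enough to vote, but you will be in "
                + str(18 - age) + " years")


# In a full program the age could instead be read with int(input("Your age: ")).
print(voting_message(21))  # You are old enough to vote
print(voting_message(15))  # Sadly you are not old enough to vote, but you will be in 3 years
```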

And here is some pseudocode to represent a "while" loop to count up to a number that the user enters:

Read a number from the user into a variable "number"
Set a variable "counter" to 0
While counter is less than number
    Print counter
    Increase counter by 1
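
A possible Python version of this loop is sketched below. The function name `count_up_to` is an assumption, and the number is passed in as a parameter instead of being read from the user. Like the pseudocode, it starts at 0 and stops just before the number itself.

```python
def count_up_to(number):
    """Print 0, 1, 2, ... up to (but not including) number.

    Returns the final value of the counter, purely so the result can be checked.
    """
    counter = 0
    while counter < number:
        print(counter)
        counter += 1  # increase counter by 1
    return counter


count_up_to(3)  # prints 0, 1 and 2 on separate lines
```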

As can be seen, these examples of pseudocode represent the logic before we actually write code. They are trivial examples, but hopefully are enough to illustrate the point, and once completed, the pseudocode can then be implemented in real code of any given programming language. Python actually resembles pseudocode more than some other languages, so hopefully converting pseudocode to Python should be quite an easy job!

Higher-level pseudocode

It should also be said that pseudocode can be written at differing levels of detail. The examples above include references to variables, which makes them particularly useful for converting into real code. For simple algorithms, you can go straight to this stage. For more complex algorithms, before you go into this level of detail, you might want to write some higher-level pseudocode which describes the general logic without referencing variables. This helps you think about the overall high-level logic of what must be done before you start thinking about what variables are needed. Higher-level pseudocode for the age tester might look like this:

Read age from the user 
If user's age is 18 or greater
    Tell them that they are old enough to vote
Otherwise
    Tell them that they are not old enough to vote
    Tell them how long they must wait to vote, based on their age

Exercise 1 - Exploring Linked Lists and Stacks on Paper

Question 1: Linked List (paper)

Think about what you would have to do to search for a particular item in a linked list using its index, starting at the beginning.

Draw out a linked list containing the 5 items of data:
```
- Linux 
- Windows 
- Mac OS X
- Android
- iOS
```
Imagine we wish to retrieve the item with index 3 (Android). How could we do this? Draw out how you think it could be done on paper, and ask yourself: how efficient is this, particularly compared to doing the same thing with an array or list?
Imagine that we wish to add a new item (Solaris) to the end of the linked list. Draw a diagram showing the process of creating this new item and adding it to the linked list.

Question 2: Stacks

We are now going to perform another paper-based exercise with stacks, to help you understand them and their operations.

Imagine you have an empty stack. Draw the stack after each operation below, and explain what, if anything, is returned from each operation and any errors that might occur.

push (a), push (b), pop (), push (c), peek (), pop (), pop (), pop (), push (d), push (e), push (f), pop (), push (g), push (h), peek (), push (i), pop (), pop (), pop (), peek ().

Exercise 2: Designing a Stack and Linked List with Pseudocode

This exercise will allow you to think about the logic of stacks and linked lists using pseudocode, before you actually write real code. In two weeks' time, once you have learnt more Python programming, you will actually implement stacks and linked lists using real Python code.

Stack

First try writing, in a text editor, using pseudocode, the logic for implementing a stack.

The stack, in this case, will be implemented using an internal array of a fixed size (so the stack has a fixed capacity). This will not be the case for all stacks, but you can assume it will be in this case. The stack can be visualised as below; note the effect of the push and pop operations on the internal array.
The internal array will initially be empty: each position in the array will hold nothing (in Python this is represented by the value "None").
The push operation should add data to the next available space in the array.
The pop operation should remove data from the top of the stack, which should be the member of the array with the highest index containing data, and return it to the user. It should do this by setting the value to "None" and returning the data being popped.

To answer the question, consider and carry out the following:

As well as the array, what other variable will the stack need to hold, to implement the stack as described above?
Using pseudocode, describe the full logic of the push operation, including how it changes the variable mentioned in the last question.
Using pseudocode, describe the full logic of the pop operation, including how it changes that same variable.
What error checks must the push and pop operations include? Modify your pseudocode to include these error checks.

If you're struggling with the pseudocode, try writing higher-level pseudocode, without the use of variables, first. See the discussion on pseudocode above.

Linked List

Try writing, in a text editor, using pseudocode, the logic for implementing the "add" and "get" operations of a linked list.

The linked list will require two entities to implement it: a node, and the linked list itself.
The "add" operation should add a new item of data to the linked list.
The "get" operation should retrieve an item from the linked list, using its index (where 0 represents the first item in the linked list, 1 represents the second, and so on). It should return "None" if the index does not exist (e.g. index 2 is used when there are only two items, at indices 0 and 1, in the linked list).

To answer the question, consider and carry out the following:

Start with the "add" operation. To efficiently implement this (i.e. you want to avoid having to search all the way through the linked list to find its end) what variable does the linked list need to contain?
What other variable must the linked list contain to be able to implement the "get" operation?
Given these two variables, write the pseudocode for the "add" operation. Do you need to do anything special if the list is empty, as it will be initially?
Now write the pseudocode for the "get" operation. Is this efficient, compared to an array or list?
Can you think of any steps you can take to make the "get" operation a little more efficient?

Advanced: write pseudocode for a third operation, "insertAt". This should insert a node into the linked list containing some data, after a given index. So if index 2 is specified, the node would be added after index 2.

If you're struggling with the pseudocode, try writing higher-level pseudocode, without the use of variables, first, as described above.