Collection types

Python has different data types which represent collections of items.

All these collection types have some properties in common:

  • they contain items which can be counted
  • we can loop over the items in the collection
  • we can test an item for membership in the collection

At the same time, collection types can be different from each other in some aspects:

  • are the items in sequence, or are they without a particular order?
  • can the items in the collection be modified individually?
  • can an item appear multiple times in the collection?

Here is a comparison between the main collection types in Python:

collection type   order        mutability   uniqueness
list              ordered      mutable      non-unique items
tuple             ordered      immutable    non-unique items
set               unordered    mutable      unique items
dictionary        unordered*   mutable      key/value pairs
string            ordered      immutable    Unicode code points
named tuple       ordered      immutable    key/value pairs
frozenset         unordered    immutable    unique items

* Since Python 3.7, dictionaries return items in the order in which they were inserted. But dictionaries are not sequences: their items cannot be accessed by index, and the order of the items cannot be changed.
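
The differences listed in the table can be checked directly. Here is a minimal sketch; the sample values are arbitrary and chosen only for illustration:

```python
# a list is ordered and mutable, and keeps duplicates
numbers_list = [3, 1, 3]
numbers_list[0] = 5                 # items can be replaced individually
print(numbers_list)                 # [5, 1, 3]

# a set discards duplicates
numbers_set = {3, 1, 3}
print(len(numbers_set))             # 2

# a tuple is ordered but immutable
numbers_tuple = (3, 1, 3)
try:
    numbers_tuple[0] = 5
    tuple_changed = True
except TypeError:
    tuple_changed = False           # item assignment is not supported
print(tuple_changed)                # False
```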

Collection properties

Number of items in a collection

Collections are containers of things. All collections have a length – the number of items contained in them.

The built-in function len() returns the number of items in any collection type:

>>> len('hello world') # string
11
>>> len([120, 400, 56, 320]) # list
4

Testing item membership

We can ask a collection if it contains a given item using the in keyword. The result of this expression is a bool:

>>> 10.0 in [0, 10, 20, 30] # items are compared with ==, and 10.0 == 10
True
>>> 10 in [0, 10, 20, 30]
True

The same also works with strings and any other collection type:

>>> 'x' in 'abracadabra'
False
>>> 'a' in 'abracadabra'
True
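
Length and membership behave slightly differently depending on the collection type. The sketch below uses a made-up dictionary named stock (not part of the examples above) to show that a dictionary counts key/value pairs and that in looks only at the keys, while a string also matches substrings:

```python
stock = {'apple': 3, 'pear': 5}     # hypothetical sample data

pair_count = len(stock)             # number of key/value pairs
key_found = 'pear' in stock         # 'in' tests the keys
value_found = 5 in stock            # values are not searched
value_in_values = 5 in stock.values()
substring_found = 'cad' in 'abracadabra'

print(pair_count)        # 2
print(key_found)         # True
print(value_found)       # False
print(value_in_values)   # True
print(substring_found)   # True
```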

Types of loops

There are two kinds of loops in Python:

  • for loops iterate over a collection, or repeat an action a number of times
  • while loops repeat an action as long as a condition is met

For loops

Use for loops when you need to access all items in a collection one by one:

>>> L = ['z', (0, 10, 20), 31.4, None]
>>> for item in L:
...     print(item)
z
(0, 10, 20)
31.4
None
>>> for char in 'abc':
...     char
'a'
'b'
'c'

Dynamic variable assignment

For loops assign their loop variable anew at each iteration (repetition) of the loop. Look at how the variables item and char are introduced in the loop statements of the examples above: at each iteration their value is set to the current item.
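
Because the loop variable is an ordinary variable, it still exists after the loop ends and holds the last item it was assigned. A minimal sketch:

```python
for char in 'abc':
    pass                # do nothing, just step through the string

# the loop is over, but the variable still holds the last item
print(char)             # c
```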

Looping over number ranges

To repeat an action a number of times, we can use the range function to create a sequence of numbers dynamically, and iterate over it:

>>> for i in range(4):
...     i, 'spam!'
(0, 'spam!')
(1, 'spam!')
(2, 'spam!')
(3, 'spam!')

The range function can take one, two or three arguments.

If only one argument is given, the sequence starts at zero and ends just before the given number. In Python 3, range returns a range object rather than a list, so we wrap it in list() to see its items:

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

If two numbers are given, the first number indicates the start value (included), and the second the end value (excluded):

>>> list(range(4, 10))
[4, 5, 6, 7, 8, 9]

If a third number is given, it indicates the step (increment) between the numbers in the sequence:

>>> list(range(4, 20, 3))
[4, 7, 10, 13, 16, 19]
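
The step can also be negative, which counts downwards. The sketch below uses a made-up variable name, countdown, and also shows that a range object supports len() and in like the other collection types:

```python
countdown = range(10, 0, -3)        # start at 10, stop before 0, step -3

print(list(countdown))              # [10, 7, 4, 1]
print(len(countdown))               # 4
print(7 in countdown)               # True

# with a positive step, a start beyond the end gives an empty sequence
print(list(range(5, 2)))            # []
```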

The body of the loop

The ‘body’ of a loop is the indented part after the loop statement. This is the part that gets repeated at each iteration of the loop.

Here’s an example script in which we print some info before, during and after the loop:

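# outside of the loop
print('before the loop')

for i in range(4):
    # body of the loop
    print('doing some stuff', i)

# outside of the loop
print('after the loop')

The script prints:

before the loop
doing some stuff 0
doing some stuff 1
doing some stuff 2
doing some stuff 3
after the loop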

Looking at the output of the script we can see that:

  • code which is outside of the loop body is executed only once
  • code which is inside of the loop body is executed multiple times

Nested loops

Loop statements can be placed in the body of other loop statements. These are called nested loops: an iteration inside an iteration. The inner loop runs a full cycle for each iteration of the outer loop:

>>> for i in range(2): # 1st loop
...     i, 'outer'
...     for j in range(2): # 2nd loop
...         i, j, 'inner'
(0, 'outer')
(0, 0, 'inner')
(0, 1, 'inner')
(1, 'outer')
(1, 0, 'inner')
(1, 1, 'inner')

Here’s another example with a third level of nesting:

>>> for x in range(2):
...     for y in range(2):
...         for z in range(2):
...             x, y, z
(0, 0, 0)
(0, 0, 1)
(0, 1, 0)
(0, 1, 1)
(1, 0, 0)
(1, 0, 1)
(1, 1, 0)
(1, 1, 1)
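
The body of the innermost loop runs once for every combination of the outer and inner items, so the total number of repetitions is the product of the individual loop lengths. A small sketch with made-up row and column labels:

```python
rows = ['a', 'b']              # hypothetical labels
columns = [1, 2, 3]

cells = []
for row in rows:               # outer loop: 2 iterations
    for column in columns:     # inner loop: 3 iterations for each row
        cells.append(row + str(column))

print(cells)        # ['a1', 'a2', 'a3', 'b1', 'b2', 'b3']
print(len(cells))   # 6
```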

Accessing items and/or indexes

When we loop over the items in a collection, each iteration can give us two values:

  • the item itself
  • the index of the item

Sometimes we need only the item, sometimes only the index. Sometimes we need both.

If we need only the item, we can simply loop over the list:

>>> myList = ['parrot', 'ant', 'fish', 'goat', 'cat', 'rabbit', 'frog']
>>> for item in myList:
...     item
'parrot'
'ant'
'fish'
'goat'
'cat'
'rabbit'
'frog'

If we need only the index of the items, we can loop over a range with the same length as the list:

>>> for index in range(len(myList)):
...     index
0
1
2
3
4
5
6

If we need both, we can use the enumerate function to return index and item at each iteration in the loop:

>>> for index, item in enumerate(myList):
...     index, item
(0, 'parrot')
(1, 'ant')
(2, 'fish')
(3, 'goat')
(4, 'cat')
(5, 'rabbit')
(6, 'frog')
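
The enumerate function also accepts an optional start argument, which is handy when the numbering should begin at 1 instead of 0. A short sketch, using a shortened list named pets for illustration:

```python
pets = ['parrot', 'ant', 'fish']    # shortened sample list

lines = []
for number, pet in enumerate(pets, start=1):
    # number counts from 1, pet is the current item
    lines.append(str(number) + '. ' + pet)

print(lines)   # ['1. parrot', '2. ant', '3. fish']
```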

While loops

A while loop is not tied to a particular number of items or iterations, but to a condition – the loop is repeated for as long as this condition is true.

Here’s a first example. The loop runs as long as n is greater than zero:

>>> n = 4
>>> while n > 0:
...     n
...     n -= 1
4
3
2
1

If the condition of a while loop never becomes false, we get caught in an infinite loop – our program runs forever without leaving the loop, and the computer may freeze or crash. So make sure that something in the body of the loop eventually makes the condition false. In the example above, we are decreasing the value of n at each iteration, so after four rounds it stops being greater than 0.
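
A while loop is a good fit when we do not know in advance how many repetitions are needed. The sketch below, with arbitrary starting values, halves a number until it is no longer greater than 1 and counts the steps:

```python
value = 100     # arbitrary starting value
steps = 0

while value > 1:
    value = value // 2    # integer division: 50, 25, 12, 6, 3, 1
    steps += 1            # each round brings the condition closer to false

print(steps)   # 6
print(value)   # 1
```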

The ‘break’ statement

The break statement is used to exit a loop before it reaches the end.

Let’s say you are looping over all items in a list, looking for a certain value or condition. Once this value is found or this condition is met, we have reached our goal and can exit the loop – there’s no need to continue until the end.

Here’s an example. We have a list of names, and we want to find the first name which contains the lowercase character e:

>>> names = ['Graham', 'Eric', 'Terry', 'John', 'Terry', 'Michael']
>>> for i, name in enumerate(names):
...     i
...     if 'e' in name:
...         name
...         break
0
1
2
'Terry'
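
Membership tests on strings are case-sensitive, so the loop above passes over 'Eric' (capital E) and stops at 'Terry'. If uppercase letters should count too, one option is to compare against a lowercase copy of each name. A minimal sketch, where first_match is a made-up variable name:

```python
names = ['Graham', 'Eric', 'Terry', 'John', 'Terry', 'Michael']

first_match = None
for name in names:
    if 'e' in name.lower():    # lowercase copy, so 'E' also counts
        first_match = name
        break                  # stop at the first hit

print(first_match)   # Eric
```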

Using for / else and while / else

Loops can have an additional else clause, which gets executed only if the loop completes normally. If the loop exits before the end (for example with a break statement), the else block is not executed.

Building on the previous example, we could have an additional else clause to be executed when no matching item is found:

>>> for i, name in enumerate(names):
...     i
...     if 'k' in name:
...         name
...         break
... else:
...     'did not find any match'
0
1
2
3
4
5
'did not find any match'

The same construction can also be used with while loops.
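
As a sketch of the while / else form, the example below looks for a divisor of a number. The values and the variable names number, divisor and message are chosen only for illustration:

```python
number = 7      # arbitrary value to test
divisor = 2

while divisor < number:
    if number % divisor == 0:
        message = 'divisible by ' + str(divisor)
        break               # leaving early skips the else block
    divisor += 1
else:
    # reached only when the condition became false without a break
    message = 'no divisor found'

print(message)   # no divisor found
```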

The ‘continue’ statement

The continue statement is used to skip the rest of the current iteration and continue with the next one in the same loop.

Here’s another example: all names are printed except the ones which contain the lowercase character e ('Eric' is printed because the test is case-sensitive):

>>> names = ['Graham', 'Eric', 'Terry', 'John', 'Terry', 'Michael']
>>> for i, name in enumerate(names):
...     if 'e' in name:
...         continue
...     i, name
(0, 'Graham')
(1, 'Eric')
(3, 'John')
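
The same pattern works for any kind of filtering inside a loop. A small sketch that adds up only the odd numbers below 10, using a made-up variable named total:

```python
total = 0
for i in range(10):
    if i % 2 == 0:
        continue        # even number: skip the line below
    total += i          # reached only for 1, 3, 5, 7, 9

print(total)   # 25
```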

Last edited on 11/01/2019