Nested Lists and Loops

We've seen lists with int, float, bool, and str items. When a list contains items of type list, this is called a nested list. Here is an example:

In [2]:
rhymes = [['cat', 'hat', 'fat'], ['mouse', 'house', 'louse'], ['be', 'see', 'key', 'he']]

What is the type of rhymes?

In [3]:
type(rhymes)
Out[3]:
list

What is the type of the first item in rhymes?

In [4]:
type(rhymes[0])
Out[4]:
list
In [5]:
rhymes[0]
Out[5]:
['cat', 'hat', 'fat']

Since rhymes[0] produces the list ['cat', 'hat', 'fat'], indexing into that list at index 1 produces 'hat':

In [6]:
rhymes[0][1]
Out[6]:
'hat'
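
Chained indexing also works with negative indices and slices. Here is a minimal sketch using a hypothetical list named grid, which is not part of the examples above and is introduced only for illustration:

```python
# 'grid' is a hypothetical nested list used only for illustration.
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# The first index picks an inner list; the second picks an item inside it.
middle = grid[1][1]

# Negative indices count from the end at either level.
last_of_last = grid[-1][-1]

# A slice applied after the first index slices the inner list.
tail_of_first = grid[0][1:]

print(middle)         # 5
print(last_of_last)   # 9
print(tail_of_first)  # [2, 3]
```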

Using rhymes, give expressions that would produce each of the following values:

  • 'louse'

  • 'key'

  • ['be', 'see', 'key', 'he']

  • ['hat', 'fat']

Looping over a nested list

Consider the following loop using our rhymes list:

In [7]:
for item in rhymes:
    # do something with item
    print(item)
['cat', 'hat', 'fat']
['mouse', 'house', 'louse']
['be', 'see', 'key', 'he']

On each iteration of the loop, the variable item refers to one of the inner lists.
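
Because item is itself a list, any list operation can be applied to it inside the loop body. The sketch below assumes we want the first word and the length of each inner list; the names first_words and lengths are introduced here for illustration only:

```python
rhymes = [['cat', 'hat', 'fat'], ['mouse', 'house', 'louse'], ['be', 'see', 'key', 'he']]

first_words = []
lengths = []
for item in rhymes:
    # item is an inner list, so indexing and len both work on it
    first_words.append(item[0])
    lengths.append(len(item))

print(first_words)  # ['cat', 'mouse', 'be']
print(lengths)      # [3, 3, 4]
```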

Example: hospital visits

Suppose we have a nested list in which each inner list holds the visit dates for one patient over the last month. We want to calculate the highest number of visits made by any patient. Let's write a function to do this.

First, here is an example call:

>>> max_visits([[2, 6], [3, 10], [15], [23], [1, 8, 15, 22, 29], [14]])
5

Next, the type contract:

(list of list of int) -> int

Next, the header (the def line), which involves naming the parameter:

In [8]:
def max_visits(visits_by_patient):
    """ (list of list of int) -> int 
    
    Return the maximum number of visits made by any patient in visits_by_patient.
    
    >>> max_visits([[2, 6], [3, 10], [15], [23], [1, 8, 15, 22, 29], [14]])
    5
    """

Now, code the body.

If we had a single list of visit dates and wanted to know the number of visits, what would we do? Call len on the list. So we need to do this for each item (each inner list) of the outer list visits_by_patient, keeping track of the largest count seen so far.

In [9]:
def max_visits(visits_by_patient):
    """ (list of list of int) -> int 
    
    Return the maximum number of visits made by any patient in visits_by_patient.
    
    >>> max_visits([[2, 6], [3, 10], [15], [23], [1, 8, 15, 22, 29], [14]])
    5
    """
    
    max_so_far = 0
    for patient_list in visits_by_patient:
        visits = len(patient_list)
        if visits > max_so_far:
            max_so_far = visits
    return max_so_far
In [10]:
max_visits([[2, 6], [3, 10], [15], [23], [1, 8, 15, 22, 29], [14]])
Out[10]:
5
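
If built-in functions are allowed, the comparison inside the loop can be handed to max. The following is a sketch of one possible variant, not a replacement for the version above; the name max_visits_builtin is hypothetical, and default=0 is used so that an empty outer list gives 0, the same result as the loop version:

```python
def max_visits_builtin(visits_by_patient):
    """ (list of list of int) -> int

    Return the maximum number of visits made by any patient in
    visits_by_patient, or 0 if visits_by_patient is empty.

    >>> max_visits_builtin([[2, 6], [3, 10], [15], [23], [1, 8, 15, 22, 29], [14]])
    5
    """

    # build a list with one visit count per patient
    counts = []
    for patient_list in visits_by_patient:
        counts.append(len(patient_list))

    # default=0 is returned when counts is empty
    return max(counts, default=0)


print(max_visits_builtin([[2, 6], [3, 10], [15], [23], [1, 8, 15, 22, 29], [14]]))  # 5
```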

Practice Exercise: symptom count

Suppose we have a nested list where each inner list contains strings that represent the symptoms exhibited by a patient. Write a function that takes this list as a parameter and returns a new list of integers. For each patient, the new list should contain the number of symptoms that patient was exhibiting.

Here is an example:

>>> symptom_count([['fatigue', 'abdominal swelling', 'bruising'], ['loss of appetite', 'fatigue']])
[3, 2]

Follow the design recipe and start by writing the docstring.
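
For checking your work afterwards, here is one possible sketch that follows the same recipe as max_visits. The parameter name symptoms_by_patient is an assumption; the exercise does not prescribe one:

```python
def symptom_count(symptoms_by_patient):
    """ (list of list of str) -> list of int

    Return a new list containing the number of symptoms exhibited by
    each patient in symptoms_by_patient.

    >>> symptom_count([['fatigue', 'abdominal swelling', 'bruising'], ['loss of appetite', 'fatigue']])
    [3, 2]
    """

    counts = []
    for patient_symptoms in symptoms_by_patient:
        # one count per inner list, in the same order as the patients
        counts.append(len(patient_symptoms))
    return counts
```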

Heterogeneous Lists

In Python, the items of a list can be of different types. For example, it is possible to have a list like this:

['Milos', 'Jones', 48, 'male', 'smoker', 210]

that represents one person's personal information. (The last number is total cholesterol in mg/dL.)

We could then use a nested list, in which each inner list has this form, to represent several people:

[ ['Milos', 'Jones', 48, 'male', 'smoker', 210], ['Delia', 'Chan', 39, 'female', 'non-smoker', 170], ['Denise', 'Ross', 62, 'female', 'non-smoker', 150] ]
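
Each field of a person is reached by its position, so it helps to keep track of which index holds which value. Here is a minimal sketch, assuming the field order shown above; the variable people and the index constants are hypothetical names introduced for readability:

```python
people = [['Milos', 'Jones', 48, 'male', 'smoker', 210],
          ['Delia', 'Chan', 39, 'female', 'non-smoker', 170],
          ['Denise', 'Ross', 62, 'female', 'non-smoker', 150]]

# hypothetical names for the positions of the fields we use
FIRST_NAME = 0
LAST_NAME = 1
AGE = 2
CHOLESTEROL = 5

descriptions = []
for person in people:
    # int fields must be converted with str before concatenating
    description = (person[FIRST_NAME] + ' ' + person[LAST_NAME]
                   + ', age ' + str(person[AGE])
                   + ', cholesterol ' + str(person[CHOLESTEROL]) + ' mg/dL')
    descriptions.append(description)

for description in descriptions:
    print(description)
```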

Practice Exercise: last names of female patients

Suppose we have a nested list, where each inner list contains a patient's first name (a str), last name (a str), age in years (an int), sex (a str), and cholesterol in mg/dL (an int). Write a program that produces a list of the last names of female patients.

Nested Loops

To solve some problems, we need to loop not only over the items of the outer list, but also over the items of each inner list. A loop placed inside the body of another loop is called a nested loop.

Example: average heart rates

Suppose we have a nested list that represents repeated heart rate measurements for the same patient over a number of tests. Each inner list corresponds to one test (one situation), during which we monitored the heart rate for a little while and took a few measurements. Now we would like to calculate the average of the measurements for each test.

hr = [[72, 75, 71, 73],              # resting
      [91, 90, 94, 93],              # walking slowly
      [130, 135, 139, 142],          # running on treadmill
      [120, 118, 110, 105, 100, 98]] # after a minute recovery

Suppose we can't use the built-in sum() function, since the point is to practice writing a loop inside an outer loop!

In [26]:
hr = [[72, 75, 71, 73],              # resting
      [91, 90, 94, 93],              # walking slowly
      [130, 135, 139, 142],          # running on treadmill 
      [120, 118, 110, 105, 100, 98]] # after a minute recovery


# start with an empty list that we will build to return (or print)
result = []

# loop over the outer list, each element is a test
for test in hr:
    
    # reset the running total for this test to 0
    # (named total so that it does not hide the built-in sum function)
    total = 0
    
    # loop over the inner list
    for measurement in test:
        total = total + measurement
        
    # finish up with this test before repeating the loop for the next one
    average = total / len(test)
    result.append(average)
    
print(result)
[72.75, 92.0, 136.5, 108.5]
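
The same nested loop can be wrapped in a function so that it can be reused on other lists. This is a sketch under two assumptions: the function name average_per_test is hypothetical, and every inner list has at least one measurement (an empty inner list would cause a division by zero). The accumulator is named total so that it does not hide the built-in sum.

```python
def average_per_test(measurements_by_test):
    """ (list of list of number) -> list of float

    Precondition: every inner list has at least one item.

    Return a new list containing the average of each inner list in
    measurements_by_test.

    >>> average_per_test([[72, 75, 71, 73], [91, 90, 94, 93]])
    [72.75, 92.0]
    """

    averages = []
    for test in measurements_by_test:
        # the running total starts over for every test
        total = 0
        for measurement in test:
            total = total + measurement
        averages.append(total / len(test))
    return averages


hr = [[72, 75, 71, 73],
      [91, 90, 94, 93],
      [130, 135, 139, 142],
      [120, 118, 110, 105, 100, 98]]
print(average_per_test(hr))  # [72.75, 92.0, 136.5, 108.5]
```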

Practice Exercise: heart rate ranges

Suppose we have a nested list that represents repeated heart rate measurements for the same patient over a number of tests. Each inner list contains heart rates measured during one test.

Find the range of the heart rate measurements (the highest value minus the lowest value) for each inner list. Note: do not modify the lists!

  1. First approach: use the built-in functions min and max.
  2. Second approach: do not use built-in functions min and max.

For the sample list:

hr = [[72, 75, 71, 73],              # resting
      [91, 90, 94, 93],              # walking slowly
      [130, 135, 139, 142],          # running on treadmill
      [120, 118, 110, 105, 100, 98]] # after a minute recovery

The result is:

[4, 4, 12, 22]
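
For checking your work afterwards, here is one possible sketch of the second approach, which tracks the lowest and highest values with a nested loop. The function name hr_ranges is hypothetical, and the sketch assumes every inner list has at least one measurement. It only reads from the lists and never modifies them.

```python
def hr_ranges(measurements_by_test):
    """ (list of list of int) -> list of int

    Precondition: every inner list has at least one item.

    Return a new list containing the range (highest minus lowest) of
    each inner list in measurements_by_test.

    >>> hr_ranges([[72, 75, 71, 73], [91, 90, 94, 93]])
    [4, 4]
    """

    ranges = []
    for test in measurements_by_test:
        # start both trackers at the first measurement of this test
        lowest = test[0]
        highest = test[0]
        for measurement in test:
            if measurement < lowest:
                lowest = measurement
            if measurement > highest:
                highest = measurement
        ranges.append(highest - lowest)
    return ranges


hr = [[72, 75, 71, 73],
      [91, 90, 94, 93],
      [130, 135, 139, 142],
      [120, 118, 110, 105, 100, 98]]
print(hr_ranges(hr))  # [4, 4, 12, 22]
```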