A list comprehension is a simple yet powerful built-in Python syntax for creating lists.
For example, to generate the list [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], you can use list(range(1, 11)):
>>> list(range(1, 11))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
But what if you want to generate [1x1, 2x2, 3x3, ..., 10x10]? One approach is to use a loop:
>>> L = []
>>> for x in range(1, 11):
...     L.append(x * x)
...
>>> L
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
The loop works, but it is verbose. A list comprehension generates the same list in a single line:
>>> [x * x for x in range(1, 11)]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
When writing a list comprehension, put the expression that produces each element, here x * x, first, followed by the for clause; the result is a new list. The syntax is very useful, and with a bit of practice it quickly becomes familiar.
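The leading expression is not limited to arithmetic; any expression that uses the loop variable works. A minimal sketch, assuming a made-up list named words (not part of the original example):

```python
# Hypothetical sample data, used only for illustration.
words = ['list', 'comprehension', 'syntax']

# The expression before "for" is evaluated once per element.
lengths = [len(w) for w in words]
shouted = [w.upper() + '!' for w in words]

print(lengths)   # [4, 13, 6]
print(shouted)   # ['LIST!', 'COMPREHENSION!', 'SYNTAX!']
```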
You can add an if condition after the for loop to filter elements. For example, to filter and square only even numbers:
>>> [x * x for x in range(1, 11) if x % 2 == 0]
[4, 16, 36, 64, 100]
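To see what the trailing if does, it may help to compare the comprehension with the loop it replaces. The sketch below uses illustrative variable names and shows that the condition only decides whether an element is kept:

```python
# Loop form: the if guards the append, so odd numbers are skipped.
even_squares_loop = []
for x in range(1, 11):
    if x % 2 == 0:
        even_squares_loop.append(x * x)

# Comprehension form: the same test moves to the end, after the for clause.
even_squares_comp = [x * x for x in range(1, 11) if x % 2 == 0]

print(even_squares_loop == even_squares_comp)  # True
```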
You can also use nested loops to generate Cartesian products:
>>> [m + n for m in 'ABC' for n in 'XYZ']
['AX', 'AY', 'AZ', 'BX', 'BY', 'BZ', 'CX', 'CY', 'CZ']
Three or more nested loops are rarely used.
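The order of the for clauses matters: the leftmost for acts as the outer loop and the next one as the inner loop. As a rough cross-check, the sketch below compares the comprehension with itertools.product from the standard library, which yields pairs in the same order (the variable names are illustrative):

```python
from itertools import product

# Two for clauses: 'ABC' is the outer loop, 'XYZ' the inner loop.
pairs_comp = [m + n for m in 'ABC' for n in 'XYZ']

# product() yields (m, n) tuples with the rightmost input varying fastest.
pairs_product = [m + n for m, n in product('ABC', 'XYZ')]

print(pairs_comp == pairs_product)  # True
```

When more than two sequences are involved, product() tends to be easier to read than a long chain of for clauses.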
List comprehensions allow for very concise code. For instance, listing all files and directories in the current directory can be achieved in one line:
>>> import os # Import the os module; the concept of modules will be covered later
>>> [d for d in os.listdir('.')] # os.listdir lists files and directories
['.emacs.d', '.ssh', '.Trash', 'Adlm', 'Applications', 'Desktop', 'Documents', 'Downloads', 'Library', 'Movies', 'Music', 'Pictures', 'Public', 'VirtualBox VMs', 'Workspace', 'XCode']
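Combined with a filter, the same idea can list only directories. The following is a sketch rather than a fixed recipe: it builds a temporary directory with one made-up subdirectory (docs) and one made-up file (notes.txt) so that the result is predictable, and it uses os.path.isdir as the filter:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical contents, created only for this demonstration.
    os.mkdir(os.path.join(root, 'docs'))
    with open(os.path.join(root, 'notes.txt'), 'w') as f:
        f.write('example')

    # Keep only the names that refer to directories.
    dirs_only = [name for name in os.listdir(root)
                 if os.path.isdir(os.path.join(root, name))]

print(dirs_only)  # ['docs']
```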
A for loop can bind two or more variables on each iteration. For example, dict.items() yields each key together with its value:
>>> d = {'x': 'A', 'y': 'B', 'z': 'C' }
>>> for k, v in d.items():
...     print(k, '=', v)
...
x = A
y = B
z = C
Therefore, list comprehensions can also use two variables to generate a list:
>>> d = {'x': 'A', 'y': 'B', 'z': 'C' }
>>> [k + '=' + v for k, v in d.items()]
['x=A', 'y=B', 'z=C']
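The same unpacking works for any iterable that yields pairs, not just dict.items(). A small sketch, assuming an illustrative list of tuples named points and using the built-in enumerate():

```python
# Hypothetical list of (x, y) pairs.
points = [(1, 2), (3, 4), (5, 6)]

# Each tuple is unpacked into x and y before the expression runs.
sums = [x + y for x, y in points]

# enumerate() yields (index, item) pairs, which unpack the same way.
labels = [str(i) + ':' + s for i, s in enumerate(['a', 'b', 'c'])]

print(sums)    # [3, 7, 11]
print(labels)  # ['0:a', '1:b', '2:c']
```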
Finally, to convert all strings in a list to lowercase:
>>> L = ['Hello', 'World', 'IBM', 'Apple']
>>> [s.lower() for s in L]
['hello', 'world', 'ibm', 'apple']
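If the list also holds non-string values, calling lower() on them raises an AttributeError. One possible way to guard against that, sketched here with a made-up mixed list, is to filter with the built-in isinstance():

```python
# Hypothetical list mixing strings with other types.
mixed = ['Hello', 'World', 18, 'Apple', None]

# Only elements that are strings reach s.lower(); the rest are skipped.
lowered = [s.lower() for s in mixed if isinstance(s, str)]

print(lowered)  # ['hello', 'world', 'apple']
```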
if ... else
Beginners often confuse the two ways if can appear in a list comprehension, and in particular where an else is allowed.
For example, the following code correctly outputs even numbers:
>>> [x for x in range(1, 11) if x % 2 == 0]
[2, 4, 6, 8, 10]
However, we cannot add an else to the final if:
>>> [x for x in range(1, 11) if x % 2 == 0 else 0]
  File "<stdin>", line 1
    [x for x in range(1, 11) if x % 2 == 0 else 0]
                                           ^
SyntaxError: invalid syntax
This is because the if following the for is a filtering condition: it only decides whether an element is kept, so there is nothing for an else to do.
Conversely, an if placed before the for must be accompanied by an else; otherwise an error is reported:
>>> [x if x % 2 == 0 for x in range(1, 11)]
  File "<stdin>", line 1
    [x if x % 2 == 0 for x in range(1, 11)]
                     ^
SyntaxError: invalid syntax
This is because the part before the for is an expression, which must compute a result for every x. The expression x if x % 2 == 0 cannot do so: without an else clause it has no value when the condition is false. The else must therefore be added:
>>> [x if x % 2 == 0 else -x for x in range(1, 11)]
[-1, 2, -3, 4, -5, 6, -7, 8, -9, 10]
Only then can the expression x if x % 2 == 0 else -x before the for loop compute a definite result based on x.
Thus, in a list comprehension, the if ... else before the for is an expression, while the if after the for is a filtering condition and cannot have an else.
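Both forms can appear in the same comprehension, which makes the difference easy to see. In the sketch below (the conditions are chosen only for illustration), the trailing if keeps the even numbers, and the if ... else expression then decides what value each kept number contributes:

```python
# Filter first: only even x survive (2, 4, 6, 8, 10).
# Expression second: multiples of 3 stay positive, the others are negated.
result = [x if x % 3 == 0 else -x for x in range(1, 11) if x % 2 == 0]

print(result)  # [-2, -4, 6, -8, -10]
```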
Using list comprehensions allows you to quickly generate lists and derive one list from another with very concise code.