
Python Split List into Equal Chunks

By Filip on 10/05/2024

Learn several efficient Python techniques to split lists into chunks of equal size, perfect for data processing and iteration.

Introduction

In Python, there are scenarios where you might need to divide a list into smaller sub-lists, or "chunks," of equal size. This can be useful for tasks like batch processing, data manipulation, or improving code readability. Let's explore two common approaches to achieve this: using slicing and utilizing the zip() function.

Step-by-Step Guide

Let's explore how to split a Python list into chunks of equal size.

1. Using Slicing

Slicing is a fundamental Python concept. We can leverage it to split a list into chunks:

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunk_size = 3
chunks = [my_list[i:i + chunk_size] for i in range(0, len(my_list), chunk_size)]
print(chunks)  # Output: [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

Explanation:

  • We define the list (my_list) and the desired chunk size (chunk_size).
  • We use a list comprehension to iterate through the list in steps of chunk_size.
  • For each step, my_list[i:i + chunk_size] extracts a chunk of the list.
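The same slicing logic can be wrapped in a reusable generator. This is only a minimal sketch: the function name chunk_by_slicing and the explicit check on chunk_size are illustrative assumptions, not part of the original example.

```python
def chunk_by_slicing(items, chunk_size):
    """Yield successive slices of `items`, each at most `chunk_size` long.

    Hypothetical helper for illustration; `items` must support len() and slicing.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


numbers = list(range(1, 11))
for chunk in chunk_by_slicing(numbers, 4):
    print(chunk)
# [1, 2, 3, 4]
# [5, 6, 7, 8]
# [9, 10]
```

Because it yields one chunk at a time, the caller can process each chunk without building the full list of chunks first.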

2. Using the zip() Function

The zip() function can be combined with a single shared iterator to achieve chunking. Note that this approach produces tuples rather than lists:

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunk_size = 3
chunks = list(zip(*[iter(my_list)] * chunk_size))
print(chunks)  # Output: [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

Explanation:

  • iter(my_list) creates an iterator from the list.
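  • [iter(my_list)] * chunk_size builds a list holding chunk_size references to that same iterator, not independent copies.
  • zip(* ...) unpacks those references as arguments and pulls one item from each in turn; because they all share one iterator, each tuple receives chunk_size consecutive items.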
  • We convert the result to a list using list().
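To see why the shared iterator matters, the steps can be traced by hand. The sketch below imitates what zip() does for two rounds and contrasts it with independent iterators; the variable names are illustrative only.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

shared = iter(my_list)
refs = [shared] * 3  # three references to the SAME iterator object
assert refs[0] is refs[1] is refs[2]

# zip() calls next() on each argument in turn to build one tuple.
# Since every argument is the same iterator, the items come out consecutively.
first = tuple(next(it) for it in refs)
second = tuple(next(it) for it in refs)
print(first)   # (1, 2, 3)
print(second)  # (4, 5, 6)

# With three independent iterators, each one starts at the beginning instead.
separate = [iter(my_list) for _ in range(3)]
print(next(zip(*separate)))  # (1, 1, 1)
```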

Important Note: The two methods differ when the list length is not evenly divisible by chunk_size. Slicing returns a smaller final chunk, while zip() silently drops the leftover elements (here, the value 10).
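How each approach treats leftover elements can be checked directly. The following sketch uses divmod() to work out the number of full chunks and the remainder for the example list, then compares that with what slicing and zip() actually return.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunk_size = 3

# 10 items in chunks of 3: three full chunks and one leftover item.
full_chunks, leftover = divmod(len(my_list), chunk_size)

sliced = [my_list[i:i + chunk_size] for i in range(0, len(my_list), chunk_size)]
zipped = list(zip(*[iter(my_list)] * chunk_size))

print(full_chunks, leftover)  # 3 1
print(len(sliced))            # 4: three full chunks plus one short chunk
print(sliced[-1])             # [10]
print(len(zipped))            # 3: only the full chunks, 10 is dropped
```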

Code Example

The code demonstrates two methods of splitting a list into smaller chunks of a specified size in Python. The first method uses list slicing within a list comprehension to create sub-lists of the desired chunk size. The second method utilizes the zip() function with an iterator multiplied by the chunk size to group elements into tuples, which are then converted into a list of chunks. Both methods achieve similar results, with the slicing method potentially including a final incomplete chunk and the zip() method only producing complete chunks.

# 1. Using Slicing

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunk_size = 3

chunks = [my_list[i:i + chunk_size] for i in range(0, len(my_list), chunk_size)]
print("Using slicing:", chunks)  

# Output: Using slicing: [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]


# 2. Using the zip() Function

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunk_size = 3

chunks = list(zip(*[iter(my_list)] * chunk_size))
print("Using zip():", chunks)  

# Output: Using zip(): [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

Additional Notes

  • Handling Uneven Splits: Both methods handle uneven splits differently. Slicing includes the remaining elements in a smaller last chunk, while zip() discards them. To capture all elements when using zip(), you can use itertools.zip_longest(*[iter(my_list)]*chunk_size, fillvalue=...) to pad the final chunk with a specified value.

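  • Performance: Neither method is reliably faster. Both do work proportional to the list length and copy each element reference once into a new list or tuple. The relative speed depends on the list length, the chunk size, and the Python version, so measure with timeit on your own data if it matters.

  • Memory Efficiency: zip() returns a lazy iterator, so if you loop over it directly instead of wrapping it in list(), only one chunk is held in memory at a time. Once wrapped in list(), it stores every chunk just as the slicing comprehension does. Slicing can be made lazy in the same way by using a generator expression instead of a list comprehension.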

  • Readability: Slicing might be considered more readable for beginners, as it directly relates to how we access parts of a list. However, the zip() method can be more concise and elegant once understood.

  • Alternatives:

    • numpy.array_split(): If you're working with NumPy arrays, numpy.array_split() splits an array into a given number of sections (rather than into chunks of a given size) and accepts lengths that do not divide evenly; the resulting sections differ in length by at most one element.
    • Third-party libraries: Libraries like more_itertools offer functions like chunked() for splitting iterables into chunks.
  • Choosing the Right Method: The best method depends on your specific needs. Consider the trade-offs between readability, performance, memory efficiency, and whether you need to handle uneven splits in a particular way.
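The itertools.zip_longest() option mentioned under uneven splits can be sketched as follows. The helper name chunk_with_zip_longest and the _MISSING sentinel are assumptions made for illustration; the sentinel is used instead of None so that real None values in the data are not mistaken for padding.

```python
from itertools import zip_longest

_MISSING = object()  # hypothetical padding marker that cannot collide with real data


def chunk_with_zip_longest(items, chunk_size):
    """Yield lists of up to `chunk_size` items; the last list may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    padded = zip_longest(*[iter(items)] * chunk_size, fillvalue=_MISSING)
    for group in padded:
        # Strip the padding that zip_longest added to the final group.
        yield [item for item in group if item is not _MISSING]


print(list(chunk_with_zip_longest(range(1, 11), 3)))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

# Without stripping, the padding stays visible in the final tuple:
print(list(zip_longest(*[iter(range(1, 11))] * 3, fillvalue=None)))
# [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, None, None)]
```

Unlike slicing, this variant works on any iterable, including generators that do not support len() or indexing.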

Summary

This article explains two methods for splitting a Python list into chunks of equal size:

Method | Description | Advantages | Disadvantages
------ | ----------- | ---------- | -------------
Slicing | Uses a list comprehension and slicing (my_list[i:i + chunk_size]) to extract chunks. | Simple and easy to understand. Keeps every element. | The final chunk may be smaller than chunk_size.
zip() Function | Combines zip() and iter() to group items into tuples. | Elegant and concise. | May be less intuitive for beginners. Drops leftover elements that do not fill a complete chunk.

Note: When the list length is not evenly divisible by the chunk size, slicing produces a smaller final chunk, while zip() drops the leftover elements.

Conclusion

Choosing between slicing and the zip() function for chunking depends on your specific needs and priorities in terms of readability, performance, and how you want to handle the last chunk. Understanding both approaches equips you to make informed decisions when working with lists and dividing them into manageable chunks for various programming tasks.
