Searching Subdirectories with Python

Working with files and directories is a routine part of programming. Searching subdirectories in particular comes up in tasks such as organizing system files and analyzing log files. In this tutorial, we will look in detail at how to search subdirectories effectively using Python.

Python provides several libraries for manipulating files and directories. In this article, we will explain how to search subdirectories in Python using the os and glob modules, and compare where each one fits best. The pathlib module offers a third approach, which is covered in a separate tutorial.

Using the os Module

The os module provides various functionalities to interact with the operating system. By using this module, you can create, delete, and modify files and directories, as well as manipulate file paths, attributes, and permissions. In particular, the os.walk() function is useful for traversing the directory tree to search for subdirectories and files.

os.walk() Function

The os.walk() function starts from the directory you pass in and visits that directory and every subdirectory beneath it. It returns a generator rather than a single tuple: on each iteration it yields one 3-tuple of (directory path, list of subdirectory names in that directory, list of file names in that directory). The names in the two lists are bare names, not full paths.


import os

def search_subdirectories(root_dir):
    for dirpath, dirnames, filenames in os.walk(root_dir):
        print(f'Current directory path: {dirpath}')
        print(f'Subdirectories: {dirnames}')
        print(f'Files: {filenames}')
        print('---------------------')
        
search_subdirectories('/mnt/data')  # Specify the path of the root directory to search.

The above code uses os.walk() to output all subdirectories and files of the given directory. It prints the current directory, subdirectories, and files to the console for each path.
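Because os.walk() yields bare file names, a common next step is to join each name with its directory path. The following is a minimal sketch of that idea; the helper name collect_file_paths is hypothetical and not part of the os module.

```python
import os

def collect_file_paths(root_dir):
    """Return a sorted list of full paths for every file under root_dir."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            # filenames holds bare names, so join each one with its directory.
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)
```

Returning a list instead of printing makes the result easier to reuse, for example to open each file or to count them.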

Implementing Filtering

If you only need files with a specific extension or name pattern from a large directory tree, you can filter the results as you walk. The following code keeps only files whose names end in '.txt'.


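import os

def search_text_files(root_dir):
    # dirnames is not needed here, so it is bound to an underscore-prefixed name.
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        # Keep only the file names that end with the .txt extension.
        text_files = [f for f in filenames if f.endswith('.txt')]
        if text_files:
            print(f'Text files found in {dirpath}: {text_files}')

search_text_files('/mnt/data')  # Specify the directory to search.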

The example above prints only files whose names end in '.txt'. Note that str.endswith() is case-sensitive, so a file named 'NOTES.TXT' would not match. You can use different patterns or conditions for filtering as needed.
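For name patterns that go beyond a fixed extension, the standard library's fnmatch module can be combined with os.walk(). This is a sketch under stated assumptions: the helper name find_matching_files and the pattern 'report_*.txt' are hypothetical examples, not something the tutorial's original code defines.

```python
import fnmatch
import os

def find_matching_files(root_dir, pattern):
    """Return sorted full paths of files under root_dir whose names match pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        # fnmatch.filter applies a shell-style wildcard pattern to the bare names.
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

For instance, find_matching_files('/mnt/data', 'report_*.txt') would collect files such as report_1.txt at any depth while skipping notes.txt and report_2.csv.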

Using the glob Module

The glob module lets you search for files using Unix shell-style pathname patterns, with wildcards such as * and ?. This makes it a concise way to find files that match a specific extension or name pattern.

glob.glob() Function

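The glob.glob() function returns a list of all path names matching the specified pattern. In Python 3.5 and above, you can search subdirectories recursively by using the ** pattern together with the recursive=True argument. Without recursive=True, ** behaves like an ordinary * and matches only a single path component.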


import glob

def search_with_glob(pattern):
    files = glob.glob(pattern, recursive=True)
    for file in files:
        print(file)
        
search_with_glob('/mnt/data/**/*.txt')  # Search for all .txt files including subdirectories

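The above code uses the ** pattern to search for all text files in the target directory and its subdirectories. With recursive=True, ** matches zero or more directory levels, so files directly inside /mnt/data are included as well. For searches that can be expressed as a single pattern, glob is usually the shortest option.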
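To make the effect of the recursive argument concrete, here is a small sketch that builds the pattern from a directory and an extension. The helper name find_by_extension is hypothetical; glob.escape() is used on the assumption that the directory name itself might contain wildcard characters such as [ or *.

```python
import glob
import os

def find_by_extension(root_dir, extension, recursive=True):
    """Return sorted paths under root_dir that end with the given extension."""
    # Escape the directory so that only the '**' and '*' parts act as wildcards.
    pattern = os.path.join(glob.escape(root_dir), '**', f'*{extension}')
    return sorted(glob.glob(pattern, recursive=recursive))
```

Calling find_by_extension('/mnt/data', '.txt') searches every level, while passing recursive=False makes ** act like a plain *, so only files exactly one directory below the root are returned.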

Conclusion

In this tutorial, we learned how to search subdirectories using Python. The os.walk() function walks the directory tree, top-down by default, and gives you the subdirectories and files at every level, which suits custom filtering logic. The glob module finds files through pattern matching and is the more compact choice when a single pattern describes what you need. Choose between them according to the task at hand.

Another file-related module in Python is pathlib, which provides an object-oriented approach to make file path manipulation more intuitive. We will cover pathlib in a separate tutorial.

I hope this tutorial provides you with the basics and tips you need to perform file exploration tasks. If you have any further questions or would like to discuss specific use cases, please leave a comment in the section below.