Complete guide to Python's os.scandir function covering directory listing, file system traversal, and practical examples.
Last modified April 11, 2025
This comprehensive guide explores Python’s os.scandir function, which provides efficient directory scanning. We’ll cover basic usage, file attributes, performance benefits, and practical examples.
The os.scandir function scans a directory yielding DirEntry objects. It’s more efficient than os.listdir as it provides file attributes without additional system calls.
Key features: returns an iterator of DirEntry objects, provides file type and attribute information, and is faster than os.listdir for many use cases.
The simplest use of os.scandir lists all entries in a directory. This example shows how to iterate through directory contents.
basic_listing.py
import os
with os.scandir(’.’) as entries: for entry in entries: print(entry.name)
entries = os.scandir(’.’) for entry in entries: print(entry.name) entries.close()
This example shows two ways to scan a directory. The first uses a context manager for automatic resource cleanup. The second requires manual closing.
The context manager approach is preferred as it ensures proper resource cleanup even if exceptions occur during iteration.
DirEntry objects provide methods to check entry types. This example filters files and directories separately.
filter_entries.py
import os
with os.scandir(’.’) as entries: files = [] dirs = []
for entry in entries:
if entry.is_file():
files.append(entry.name)
elif entry.is_dir():
dirs.append(entry.name)
print(“Files:”, files) print(“Directories:”, dirs)
This code scans the current directory and categorizes entries into files and directories. The is_file() and is_dir() methods efficiently check types.
These methods use cached information from the initial scan, avoiding extra system calls that would be needed with os.path.isdir().
DirEntry objects provide access to file metadata. This example shows how to get size and modification time for files.
file_info.py
import os import datetime
with os.scandir(’.’) as entries: for entry in entries: if entry.is_file(): stat = entry.stat() mtime = datetime.datetime.fromtimestamp(stat.st_mtime) print(f"{entry.name:20} {stat.st_size:8} bytes {mtime}")
This code displays file names, sizes, and modification times. The stat() method returns a stat_result object with file attributes.
The stat() method performs a system call, but only when needed, making os.scandir still more efficient than separate os.stat calls.
Combining os.scandir with recursion allows full directory tree traversal. This example implements a simple recursive file search.
recursive_scan.py
import os
def scan_directory(path, indent=0): with os.scandir(path) as entries: for entry in entries: print(" " * indent + entry.name) if entry.is_dir(): full_path = os.path.join(path, entry.name) scan_directory(full_path, indent + 4)
scan_directory(’.')
This function recursively scans directories, printing a tree-like structure. Each level of nesting is indented for better visualization.
For large directory trees, consider using os.walk() instead, which handles some edge cases more robustly.
os.scandir can efficiently filter files by extension or other attributes. This example finds all Python files in a directory.
find_py_files.py
import os
python_files = [] with os.scandir(’.’) as entries: for entry in entries: if entry.is_file() and entry.name.endswith(’.py’): python_files.append(entry.name)
print(“Python files found:”, python_files)
with os.scandir(’.’) as entries: py_files = [e.name for e in entries if e.is_file() and e.name.endswith(’.py’)] print(“Python files (listcomp):”, py_files)
The first approach uses a traditional loop to collect Python files. The second shows a more concise list comprehension version.
Both methods efficiently check file types and extensions without unnecessary system calls.
This example demonstrates the performance difference between os.scandir and os.listdir when checking file types.
performance_comparison.py
import os import time
def using_scandir(): start = time.time() with os.scandir(’.’) as entries: files = [e.name for e in entries if e.is_file()] return time.time() - start
def using_listdir(): start = time.time() files = [f for f in os.listdir(’.’) if os.path.isfile(f)] return time.time() - start
scandir_time = using_scandir() listdir_time = using_listdir()
print(f"os.scandir: {scandir_time:.6f} seconds") print(f"os.listdir: {listdir_time:.6f} seconds") print(f"scandir is {listdir_time/scandir_time:.1f}x faster")
This code measures the time taken to list files using both methods. os.scandir is typically faster as it avoids separate stat calls for type checking.
The performance difference becomes more significant with larger directories or when checking multiple file attributes.
os.scandir provides methods to detect and handle symbolic links. This example shows how to identify them.
symlink_handling.py
import os
if not os.path.exists(’test_link’): os.symlink(file, ’test_link')
with os.scandir(’.’) as entries: for entry in entries: if entry.is_symlink(): target = os.readlink(entry.path) print(f"Symlink: {entry.name} -> {target}") elif entry.is_file(): print(f"File: {entry.name}") elif entry.is_dir(): print(f"Directory: {entry.name}")
This code first creates a symbolic link for testing, then scans the directory identifying different entry types. The is_symlink() method detects links.
Note that is_file() and is_dir() follow symlinks by default. Use follow_symlinks=False with stat() to get information about the link itself.
Resource handling: Always close scandir iterators when done
Permission errors: Handle OSError for protected directories
Symbolic links: Be aware of potential symlink attacks
Case sensitivity: Behavior varies by filesystem
Concurrent modifications: Directory may change during scan
Use context managers: Prefer ‘with’ statement for resource cleanup
Cache attributes: Store needed attributes to avoid repeated calls
Handle exceptions: Catch OSError for permission issues
Consider os.walk: For recursive scans with more features
Document assumptions: Note any expected directory structure
My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I possess more than ten years of experience in teaching programming.
List all Python tutorials.