Step 1: How do I input the right path?
Suppose you want an accurate listing of a particular path. As a reproducible example, we start with a user directory on a Windows 10 system:
path_dir: str = "C:\Users\sselt\Documents\blog_demo"
Running this assignment immediately raises an error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
The interpreter stumbles over the character sequence \U, because it introduces a Unicode escape that must be followed by eight hexadecimal digits. The problem arises because Windows uses the backslash “\” as a path separator, while Linux uses the forward slash “/”. Unfortunately, the backslash is also the character that starts escape sequences in Python string literals, so the two uses collide. Just as we shouldn’t expect countries to agree on a decimal separator any time soon, our only choice is to go for one of three solutions.
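The \U case at least fails loudly. Other letters are worse, because the string is altered without any complaint. Here is a small sketch of that, using a made-up path (C:\temp\new_folder is my assumption for illustration, not part of the example above):

```python
# "\t" and "\n" are valid escape sequences, so this literal compiles
# without any error or warning, but it no longer holds the intended path.
broken: str = "C:\temp\new_folder"

# The raw string keeps every backslash as a literal character.
intended: str = r"C:\temp\new_folder"

print(repr(broken))    # shows a tab and a newline where separators should be
print(repr(intended))  # shows the path with its backslashes intact
print(len(intended) - len(broken))  # each swallowed backslash costs one character
```

So the error message for \U is actually the friendly outcome: it stops you before a mangled path reaches the file system.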
Solution 1 – The Hideous Variant
Simply avoid the Windows separator and instead write the path using Linux separators only:
path_dir: str = "C:/Users/sselt/Documents/blog_demo"
The string no longer contains any backslashes, so nothing is interpreted as an escape sequence, and Windows accepts forward slashes as path separators in most situations.
Solution 2 – The Even More Hideous Variant
Use escape sequences.
path_dir: str = "C:\\Users\sselt\Documents\\blog_demo"
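What bothers me, besides the illegibility, is that the backslash is not escaped at every separator, only before the “U” and the “b”. The other two only work because “\s” and “\D” happen not to be recognized escape sequences, and Python 3.6 and later emit a warning for them.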
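To see what the interpreter makes of the half-escaped string, the following sketch compiles that one assignment from source text and records any warnings. It assumes Python 3.6 or later, where unrecognized escapes such as “\s” produce a DeprecationWarning (a SyntaxWarning from Python 3.12 on); the file name "<solution_2>" is just a label I made up.

```python
import warnings

# The assignment from Solution 2, held as source text so it can be compiled here.
source: str = r'path_dir = "C:\\Users\sselt\Documents\\blog_demo"'

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure no warning is filtered out
    code = compile(source, "<solution_2>", "exec")

# Run the compiled assignment in its own namespace and inspect the result.
namespace: dict = {}
exec(code, namespace)

print(namespace["path_dir"])
for warning in caught:
    print(warning.category.__name__, warning.message)
```

The resulting string is the intended path, but only by luck of the letters that follow each backslash.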
Solution 3 – The Elegant One
Use raw strings with “r” as a prefix to indicate that special characters should not be evaluated.
path_dir: str = r"C:\Users\sselt\Documents\blog_demo"
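As a quick sanity check, here is a minimal sketch showing that the three spellings describe the same location. It uses pathlib.PureWindowsPath, which applies Windows path rules on any operating system, so it also runs on Linux. Note that in the escaped variant I double every backslash, which differs from the partial escaping shown above.

```python
from pathlib import PureWindowsPath

forward: str = "C:/Users/sselt/Documents/blog_demo"       # Solution 1
escaped: str = "C:\\Users\\sselt\\Documents\\blog_demo"   # Solution 2, fully escaped
raw: str = r"C:\Users\sselt\Documents\blog_demo"          # Solution 3

# The escaped and the raw literal produce the very same string.
same_text: bool = escaped == raw

# The forward-slash variant is a different string, but the same Windows path.
same_path: bool = PureWindowsPath(forward) == PureWindowsPath(raw)

print(same_text, same_path)
```

One caveat with raw strings: a raw literal cannot end in a single backslash, so r"C:\Users\" is a syntax error.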
Step 2: Scanning the files
Back to our task: listing all elements in a folder. We already know the path.
The simple command os.listdir returns a list of strings containing only the names of the entries, not their full paths. Here and in all other examples, I use type hinting for additional code documentation. The typing module has been available since Python 3.5; the variable annotation syntax used here requires Python 3.6 or later.
import os
from typing import List
path_dir: str = r"C:\Users\sselt\Documents\blog_demo"
content_dir: List[str] = os.listdir(path_dir)
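If you don't have that exact folder at hand, the same behaviour can be reproduced anywhere. This is a minimal sketch that builds a throwaway directory with the standard tempfile module (my substitution for the blog_demo folder) and confirms that os.listdir hands back bare names only:

```python
import os
import tempfile
from typing import List

with tempfile.TemporaryDirectory() as path_dir:
    created: List[str] = []
    for _ in range(2):
        # mkstemp creates an empty file and returns its full path
        fd, path_file = tempfile.mkstemp(dir=path_dir, suffix=".txt")
        os.close(fd)
        created.append(path_file)

    content_dir: List[str] = sorted(os.listdir(path_dir))

print(content_dir)                               # names only, no directory part
print([os.path.isabs(n) for n in content_dir])   # none of them is an absolute path
```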
The list of names is fine, but I’m more interested in file statistics, for which we have os.stat.
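To give an idea of what os.stat returns, here is a small sketch against a temporary file (created with tempfile.mkstemp, which is my stand-in for a real file in the demo folder). It reads two of the available fields, the size in bytes and the last modification time:

```python
import os
import tempfile

# Create a temporary file with known content; mkstemp returns its full path.
fd, path_file = tempfile.mkstemp()
try:
    os.write(fd, b"hello")
finally:
    os.close(fd)

stats: os.stat_result = os.stat(path_file)
size_bytes: int = stats.st_size      # number of bytes in the file
modified: float = stats.st_mtime     # last modification, seconds since the epoch

os.remove(path_file)                 # clean up the temporary file
print(size_bytes, modified)
```

Note that os.stat needs a path it can resolve, which is why the bare names from os.listdir are not enough on their own.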
Step 3: Concatenating paths
To pass a file to os.stat, we must first combine the directory path and the filename. I have often seen the following constructs in the wild, and even used them myself when starting out:
path_file: str = path_dir + "/" + filename  # A
path_file: str = path_dir + "\\" + filename  # B
path_file: str = "{}/{}".format(path_dir, filename)  # C
path_file: str = f"{path_dir}/{filename}"  # D
A and B are hideous because they concatenate strings with “+”, which is unnecessary in Python.
B is especially hideous: the Windows separator has to be doubled, or the backslash would escape the closing quotation mark.
C and D are somewhat better, since they use string formatting, but they still do not resolve the system-dependence problem. If I run this under Windows, I get a functional but inconsistent path with a mixture of separators.
filename = "some_file"
print("{}/{}".format(path_dir, filename))
C:\Users\sselt\Documents\blog_demo/some_file
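One possible remedy is to let the standard library choose the separator. The sketch below is only an illustration under one assumption: I import ntpath, the Windows flavour of os.path, so that the result is the same on every operating system. On a Windows machine, plain os.path.join and os.path.normpath behave the same way.

```python
import ntpath  # the implementation behind os.path on Windows

path_dir: str = r"C:\Users\sselt\Documents\blog_demo"
filename: str = "some_file"

# String formatting hard-codes "/" and yields mixed separators.
mixed: str = "{}/{}".format(path_dir, filename)

# join inserts the separator that belongs to the path flavour.
joined: str = ntpath.join(path_dir, filename)

# normpath repairs an already mixed path after the fact.
normalized: str = ntpath.normpath(mixed)

print(mixed)
print(joined)
print(normalized)
```

Joining avoids the inconsistency in the first place, while normalizing is more of a clean-up for strings you did not build yourself.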