Last updated: Apr 11, 2024
Reading time·5 min
shutil.copyfileobj()
fileinput
glob()
method to merge text files in PythonTo merge text files in Python:
with open()
statement to open the output file for writing.for
loop to iterate over the file paths.The example project has the following folder structure.
my-project/ └── main.py └── file-1.txt └── file-2.txt
Here are the contents of file-1.txt
.
bobby hadz . com
And here are the contents of file-2.txt
.
one two three four
This is the code for the main.py
file.
file_paths = ['file-1.txt', 'file-2.txt'] with open('output-file.txt', 'w', encoding='utf-8') as output_file: for file_path in file_paths: with open(file_path, 'r', encoding='utf-8') as input_file: for line in input_file: output_file.write(line)
If there is no end-of-line character at the end of each file, you'd have to manually add one.
file_paths = ['file-1.txt', 'file-2.txt'] with open('output-file.txt', 'w', encoding='utf-8') as output_file: for file_path in file_paths: with open(file_path, 'r', encoding='utf-8') as input_file: for line in input_file: output_file.write(line) # ✅ Manually add a newline character at the end of each file output_file.write('\n')
The last line adds a newline (\n
) character at the end of each file.
If I run the script with python main.py
, the following output-file.txt
file
is produced.
bobby hadz . com one two three four
We used the with open() statement to open the output file for writing.
The next step is to iterate over the list of file paths.
On each iteration, we open the file at the current file path for reading.
file_paths = ['file-1.txt', 'file-2.txt'] with open('output-file.txt', 'w', encoding='utf-8') as output_file: for file_path in file_paths: with open(file_path, 'r', encoding='utf-8') as input_file: for line in input_file: output_file.write(line)
The next step is to use a for loop to iterate over the lines of each input file.
On each iteration, we write the current line to the output file.
This approach works well for large files because we iterate over the lines of each file instead of reading the entire file in memory.
If the files you are merging are small in size, you don't have to iterate over the lines of each file.
You can directly read the contents of each file in memory.
file_paths = ['file-1.txt', 'file-2.txt'] with open('output-file.txt', 'w', encoding='utf-8') as output_file: for file_path in file_paths: with open(file_path, 'r', encoding='utf-8') as input_file: output_file.write(input_file.read())
The code sample is very similar to the one from the previous subheading.
However, instead of using a nested for
loop to iterate over the lines of each
input file, we use the file.read()
method to read the entire file in memory.
Once we've read the file, we write its contents to the output file and repeat the process for each of the file paths.
shutil.copyfileobj()
You can also use the shutil.copyfileobj()
method to merge text files in
Python.
The example project has the following folder structure.
my-project/ └── main.py └── file-1.txt └── file-2.txt
Here are the contents of file-1.txt
.
bobby hadz . com
And here are the contents of file-2.txt
.
one two three four
This is the code for the main.py
file.
import shutil file_paths = ['file-1.txt', 'file-2.txt'] with open('output-file.txt', 'wb') as output_file: for file_path in file_paths: with open(file_path, 'rb') as input_file: shutil.copyfileobj(input_file, output_file)
If there is no end-of-line character after the last line of each file, you will have to manually add one.
import shutil file_paths = ['file-1.txt', 'file-2.txt'] with open('output-file.txt', 'wb') as output_file: for file_path in file_paths: with open(file_path, 'rb') as input_file: shutil.copyfileobj(input_file, output_file) # ✅ add end-of-line character manually output_file.write(b'\n')
The last line adds a newline character after copying the contents of each file.
We opened the output-file
in wb
(write binary) mode and used a for
loop to
iterate over the file paths.
On each iteration, we open the current file path in rb
(read binary) mode and
copy its contents into the output file.
The shutil.copyfileobj() method copies the contents of the source file-like object to the output file-like object.
The method optionally takes a third argument - an integer length to specify the buffer size.
By default, the data is read in chunks to avoid uncontrolled memory consumption.
fileinput
You can also use the fileinput.input() method to merge text files in Python.
import fileinput file_paths = ['file-1.txt', 'file-2.txt'] with open('output-file.txt', 'w', encoding='utf-8') as output_file, fileinput.input(file_paths) as input_file: for line in input_file: output_file.write(line)
The code sample uses the with open()
statement to open the output file for
writing and the
fileinput.input()
method to create an instance of the FileInput
class.
The instance can be used as a context manager in the with
statement.
The FileInput
object also automatically closes the input files even if an
exception occurs.
Running the python main.py
command produces a file called output-file.txt
with the following contents.
bobby hadz . com one two three four
glob()
method to merge text files in PythonIf you need to merge all .txt
files that match a given pattern, use the
glob()
method.
The example project has the following folder structure.
my-project/ └── main.py └── text-files/ └── file-1.txt └── file-2.txt
Here are the contents of text-files/file-1.txt
.
bobby hadz . com
And here are the contents of text-files/file-2.txt
.
one two three four
This is the code for the main.py
file.
import glob file_paths = glob.glob('text-files/*.txt') # 👇️ ['text-files/file-1.txt', 'text-files/file-2.txt'] print(file_paths) with open('output-file.txt', 'w', encoding='utf-8') as output_file: for file_path in file_paths: with open(file_path, 'r', encoding='utf-8') as input_file: output_file.write(input_file.read() + '\n')
The glob.glob() method returns a list of path names that match the given pattern.
Once we have a list of the file paths, we use the with open()
statement to
open the output file.
The next steps are to:
Notice that we also append a newline (\n
) character at the end of each input
file.
This is not necessary if your input fields end with a newline character.
You can learn more about the related topics by checking out the following tutorials: