CSV (Comma-Separated Values) files are one of the most common formats for storing and exchanging data. Python provides an easy way to work with CSV files using its built-in csv
module. Whether you're dealing with large datasets or simple tables, understanding how to read from and write to CSV files is a vital skill for any data processing task.
In this blog post, we will cover:
csv
module to read and write CSV files
A CSV file is a plain-text file that contains data separated by commas (or other delimiters, such as tabs or semicolons). It is often used to store tabular data, such as a spreadsheet or database export. Each line in the file corresponds to a row in the table, and each value in the row is separated by a comma.
For example, a simple CSV file might look like this:
Name, Age, Occupation
John, 28, Software Engineer
Jane, 34, Data Scientist
In this case, the first row represents the header, and the following rows represent the data entries.
csv
Module to Read and Write CSV FilesPython’s built-in csv
module allows us to read from and write to CSV files with ease. Let’s explore some common operations.
To read a CSV file, use the csv.reader()
function. This function returns an iterator that can be used to iterate through each row in the file.
import csv
# Open the CSV file in read mode
with open('example.csv', 'r') as file:
# Create a CSV reader object
reader = csv.reader(file)
# Iterate over the rows in the CSV file
for row in reader:
print(row)
['Name', 'Age', 'Occupation']
['John', '28', 'Software Engineer']
['Jane', '34', 'Data Scientist']
To write to a CSV file, you use the csv.writer()
function, which writes a sequence of rows to the file.
import csv
# Open the CSV file in write mode
with open('output.csv', 'w', newline='') as file:
# Create a CSV writer object
writer = csv.writer(file)
# Write rows to the CSV file
writer.writerow(['Name', 'Age', 'Occupation'])
writer.writerow(['John', 28, 'Software Engineer'])
writer.writerow(['Jane', 34, 'Data Scientist'])
The newline=''
argument is used to prevent extra blank lines from being inserted between rows when writing on Windows systems.
CSV files often contain a header row with column names. The csv.DictReader()
and csv.DictWriter()
functions allow for more flexible handling of CSV files by working with dictionaries instead of lists.
The csv.DictReader()
function reads a CSV file and maps each row to a dictionary where the keys are the column names (from the header).
import csv
# Open the CSV file in read mode
with open('example.csv', 'r') as file:
# Create a DictReader object
reader = csv.DictReader(file)
# Iterate over the rows
for row in reader:
print(row)
{'Name': 'John', 'Age': '28', 'Occupation': 'Software Engineer'}
{'Name': 'Jane', 'Age': '34', 'Occupation': 'Data Scientist'}
The csv.DictWriter()
function allows you to write dictionaries to a CSV file, where the keys of the dictionary correspond to the column names.
import csv
# Data to write
data = [
{'Name': 'John', 'Age': 28, 'Occupation': 'Software Engineer'},
{'Name': 'Jane', 'Age': 34, 'Occupation': 'Data Scientist'}
]
# Open the CSV file in write mode
with open('output.csv', 'w', newline='') as file:
# Create a DictWriter object
fieldnames = ['Name', 'Age', 'Occupation']
writer = csv.DictWriter(file, fieldnames=fieldnames)
# Write the header
writer.writeheader()
# Write the rows
writer.writerows(data)
By default, the CSV module uses a comma ,
as the delimiter, but you can specify a custom delimiter if your CSV file uses a different character.
import csv
# Read CSV with semicolon delimiter
with open('example.csv', 'r') as file:
reader = csv.reader(file, delimiter=';')
for row in reader:
print(row)
# Write CSV with semicolon delimiter
with open('output.csv', 'w', newline='') as file:
writer = csv.writer(file, delimiter=';')
writer.writerow(['Name', 'Age', 'Occupation'])
writer.writerow(['John', 28, 'Software Engineer'])
writer.writerow(['Jane', 34, 'Data Scientist'])
CSV files may have different encodings, such as UTF-8 or ISO-8859-1. You can specify the encoding when opening the file to handle different character sets.
import csv
# Open the CSV file with a specific encoding
with open('example.csv', 'r', encoding='utf-8') as file:
reader = csv.reader(file)
for row in reader:
print(row)
# Writing to a file with a specific encoding
with open('output.csv', 'w', newline='', encoding='utf-8') as file:
writer = csv.writer(file)
writer.writerow(['Name', 'Age', 'Occupation'])
writer.writerow(['John', 28, 'Software Engineer'])
writer.writerow(['Jane', 34, 'Data Scientist'])
with
Statements: Always use the with open()
syntax to ensure that the file is properly closed after reading or writing.None
or custom placeholders to handle missing values.'a'
mode.DictReader
and DictWriter
: For more flexibility and readability, prefer using csv.DictReader
and csv.DictWriter
when dealing with CSV files that have headers.Suppose you have a list of dictionaries and you want to write it to a CSV file. You can use the csv.DictWriter
for this.
import csv
data = [
{'Name': 'Alice', 'Age': 24, 'City': 'New York'},
{'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'}
]
with open('people.csv', 'w', newline='') as file:
fieldnames = ['Name', 'Age', 'City']
writer = csv.DictWriter(file, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(data)
You might need to process a CSV file and calculate the sum of certain columns. Here’s how you can do that:
import csv
with open('sales.csv', 'r') as file:
reader = csv.DictReader(file)
total_sales = 0
for row in reader:
total_sales += float(row['Amount']) # Assuming 'Amount' is a column in the CSV
print(f"Total Sales: ${total_sales}")