Python CSV: Read and Write CSV Files


CSV (Comma-Separated Values) files are one of the most common formats for storing and exchanging data. Python provides an easy way to work with CSV files using its built-in csv module. Whether you're dealing with large datasets or simple tables, understanding how to read from and write to CSV files is a vital skill for any data processing task.

In this blog post, we will cover:

  • What is a CSV file?
  • Using Python’s csv module to read and write CSV files
    • Reading CSV files
    • Writing to CSV files
    • Handling CSV files with headers
  • Advanced CSV operations
    • Reading and writing with custom delimiters
    • Handling different encodings
  • Best Practices for working with CSV files
  • Example use cases

What is a CSV File?

A CSV file is a plain-text file that contains data separated by commas (or other delimiters, such as tabs or semicolons). It is often used to store tabular data, such as a spreadsheet or database export. Each line in the file corresponds to a row in the table, and each value in the row is separated by a comma.

For example, a simple CSV file might look like this:

Name, Age, Occupation
John, 28, Software Engineer
Jane, 34, Data Scientist

In this case, the first row represents the header, and the following rows represent the data entries.


Using Python’s csv Module to Read and Write CSV Files

Python’s built-in csv module allows us to read from and write to CSV files with ease. Let’s explore some common operations.

Reading CSV Files

To read a CSV file, use the csv.reader() function. This function returns an iterator that can be used to iterate through each row in the file.

Example: Basic CSV Reading

import csv

# Open the CSV file in read mode
with open('example.csv', 'r') as file:
    # Create a CSV reader object
    reader = csv.reader(file)
    
    # Iterate over the rows in the CSV file
    for row in reader:
        print(row)

Output:

['Name', 'Age', 'Occupation']
['John', '28', 'Software Engineer']
['Jane', '34', 'Data Scientist']

Writing to CSV Files

To write to a CSV file, you use the csv.writer() function, which writes a sequence of rows to the file.

Example: Basic CSV Writing

import csv

# Open the CSV file in write mode
with open('output.csv', 'w', newline='') as file:
    # Create a CSV writer object
    writer = csv.writer(file)
    
    # Write rows to the CSV file
    writer.writerow(['Name', 'Age', 'Occupation'])
    writer.writerow(['John', 28, 'Software Engineer'])
    writer.writerow(['Jane', 34, 'Data Scientist'])

The newline='' argument is used to prevent extra blank lines from being inserted between rows when writing on Windows systems.


Handling CSV Files with Headers

CSV files often contain a header row with column names. The csv.DictReader() and csv.DictWriter() functions allow for more flexible handling of CSV files by working with dictionaries instead of lists.

Reading CSV Files with Headers

The csv.DictReader() function reads a CSV file and maps each row to a dictionary where the keys are the column names (from the header).

Example: Reading with Headers

import csv

# Open the CSV file in read mode
with open('example.csv', 'r') as file:
    # Create a DictReader object
    reader = csv.DictReader(file)
    
    # Iterate over the rows
    for row in reader:
        print(row)

Output:

{'Name': 'John', 'Age': '28', 'Occupation': 'Software Engineer'}
{'Name': 'Jane', 'Age': '34', 'Occupation': 'Data Scientist'}

Writing CSV Files with Headers

The csv.DictWriter() function allows you to write dictionaries to a CSV file, where the keys of the dictionary correspond to the column names.

Example: Writing with Headers

import csv

# Data to write
data = [
    {'Name': 'John', 'Age': 28, 'Occupation': 'Software Engineer'},
    {'Name': 'Jane', 'Age': 34, 'Occupation': 'Data Scientist'}
]

# Open the CSV file in write mode
with open('output.csv', 'w', newline='') as file:
    # Create a DictWriter object
    fieldnames = ['Name', 'Age', 'Occupation']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    
    # Write the header
    writer.writeheader()
    
    # Write the rows
    writer.writerows(data)

Advanced CSV Operations

Reading and Writing with Custom Delimiters

By default, the CSV module uses a comma , as the delimiter, but you can specify a custom delimiter if your CSV file uses a different character.

Example: Using a Semicolon as Delimiter

import csv

# Read CSV with semicolon delimiter
with open('example.csv', 'r') as file:
    reader = csv.reader(file, delimiter=';')
    for row in reader:
        print(row)

# Write CSV with semicolon delimiter
with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter=';')
    writer.writerow(['Name', 'Age', 'Occupation'])
    writer.writerow(['John', 28, 'Software Engineer'])
    writer.writerow(['Jane', 34, 'Data Scientist'])

Handling Different Encodings

CSV files may have different encodings, such as UTF-8 or ISO-8859-1. You can specify the encoding when opening the file to handle different character sets.

Example: Specifying Encoding

import csv

# Open the CSV file with a specific encoding
with open('example.csv', 'r', encoding='utf-8') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

# Writing to a file with a specific encoding
with open('output.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['Name', 'Age', 'Occupation'])
    writer.writerow(['John', 28, 'Software Engineer'])
    writer.writerow(['Jane', 34, 'Data Scientist'])

Best Practices for Working with CSV Files

  1. Use with Statements: Always use the with open() syntax to ensure that the file is properly closed after reading or writing.
  2. Handle Missing Values: CSV files may have missing or incomplete data. You can use None or custom placeholders to handle missing values.
  3. Avoid Overwriting Existing Data: When writing to a file, be careful not to overwrite existing data unless intended. You can append data by using 'a' mode.
  4. Validate Data: Always validate the data before writing it to a CSV file, especially when reading data from external sources.
  5. Use DictReader and DictWriter: For more flexibility and readability, prefer using csv.DictReader and csv.DictWriter when dealing with CSV files that have headers.

Example Use Cases

Example 1: Converting Data from a List of Dictionaries to CSV

Suppose you have a list of dictionaries and you want to write it to a CSV file. You can use the csv.DictWriter for this.

import csv

data = [
    {'Name': 'Alice', 'Age': 24, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'}
]

with open('people.csv', 'w', newline='') as file:
    fieldnames = ['Name', 'Age', 'City']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    
    writer.writeheader()
    writer.writerows(data)

Example 2: Reading and Summarizing CSV Data

You might need to process a CSV file and calculate the sum of certain columns. Here’s how you can do that:

import csv

with open('sales.csv', 'r') as file:
    reader = csv.DictReader(file)
    total_sales = 0
    
    for row in reader:
        total_sales += float(row['Amount'])  # Assuming 'Amount' is a column in the CSV

print(f"Total Sales: ${total_sales}")