3 Ways to Process CSV Files in Python
This article is about 3 ways you can process a CSV file using Python.
Whether you are just starting out with Python, kick-starting your Data Science career, or simply need a quick recap, this article covers 3 ways you can process a CSV file using Python.
Let’s quickly start off with what a CSV file is.
What is a CSV?
CSV stands for Comma Separated Values. A CSV file is a plain text file in which each line holds one record and the values within a record are separated by commas. It is one of the simplest data storage formats and is widely used by Data Scientists and other engineers.
The example dataset used in this article comes from Kaggle, where it is listed as Electric car prices.
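To make the structure concrete, here is a minimal sketch of what CSV text looks like. The column names and values are hypothetical and made up for illustration; they are not taken from the Kaggle dataset.

```python
# Hypothetical sample: column names and values are invented for illustration.
sample = """model,range_km,price_usd
Alpha,400,35000
Beta,520,48000
"""

lines = sample.strip().split("\n")

header_line = lines[0]      # the first line usually holds the column names
record_lines = lines[1:]    # every following line is one record

print(header_line)
print(len(record_lines), "records")

# Each line contains the same number of commas, one between each pair of values.
comma_counts = [line.count(",") for line in lines]
print(comma_counts)
```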
So now let's move on to how to process CSV files in Python.
Processing a CSV file
For the sake of this article, I will be using the Electric car prices dataset as an example.
Using pandas
Pandas is an open-source Python package that is built on top of NumPy. The steps are:
Import the library:
import pandas as pd
Use read_csv() to read your file
read_csv() does what it says: it reads your CSV file into a DataFrame, like this:
df = pd.read_csv("electric_cars.csv")
df.head(5)
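As a minimal sketch of what you can do once the data is in a DataFrame, the snippet below reads a small in-memory CSV instead of electric_cars.csv so that it runs on its own. The column names and values are hypothetical and are not taken from the Kaggle dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for electric_cars.csv; columns and values are made up.
csv_text = """model,range_km,price_usd
Alpha,400,35000
Beta,520,48000
Gamma,310,29000
"""

# read_csv accepts a file path or a file-like object, so StringIO works here.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)                 # (number of rows, number of columns)
print(df.dtypes)                # column types are inferred from the values
print(df["price_usd"].mean())   # a simple aggregate over one column

# Keep only the rows that match a condition.
long_range = df[df["range_km"] > 350]
print(long_range["model"].tolist())
```

With a real file you would pass the file name to read_csv() exactly as shown earlier; the rest of the calls stay the same.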
Using csv.reader
Python has a built-in module called csv which can be used to read and write CSV files. Here are some quick and easy steps:
Import the library:
import csv
Open your CSV file:
# newline='' lets the csv module handle line endings itself
with open('electric_cars.csv', 'r', newline='') as infile:
    r = csv.reader(infile)
    for one_line in r:
        print(one_line)
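csv.reader gives you each row as a list of strings. If you would rather access values by column name, the same module offers csv.DictReader. The sketch below uses a small in-memory CSV with hypothetical column names and values so that it runs without the Kaggle file.

```python
import csv
import io

# Hypothetical stand-in for electric_cars.csv; columns and values are made up.
csv_text = "model,range_km,price_usd\nAlpha,400,35000\nBeta,520,48000\n"

with io.StringIO(csv_text) as infile:
    reader = csv.DictReader(infile)   # the first row becomes the dictionary keys
    records = list(reader)

print(reader.fieldnames)
print(records[0]["model"])

# The csv module returns every value as a string, so convert explicitly.
prices = [int(record["price_usd"]) for record in records]
print(prices)
```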
Split method
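We can also load CSV files with the .split method. Calling .split(',') on a string returns a list of the substrings between the commas. Note that this simple approach leaves the trailing newline on the last value and does not understand quoted fields that themselves contain commas.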
for one_line in open('electric_cars.csv'):
    print(one_line.split(','))
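One thing to keep in mind with this approach is that .split(',') knows nothing about CSV quoting. The sketch below compares it with csv.reader on a single hypothetical line (made up for illustration) in which one quoted value contains a comma.

```python
import csv
import io

# Hypothetical line: the second value is quoted and contains a comma.
line = 'Alpha,"Sedan, long range",35000\n'

# Plain split cuts at every comma and keeps the quotes and the newline.
naive = line.split(",")
print(naive)

# csv.reader understands the quoting and strips the line ending.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed)
```

For files where no value contains a comma, a quote, or a line break, both approaches give the same fields once you strip the trailing newline.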
If you would rather have a tab as the delimiter instead of a comma, you can write the data to a new file like this:
with open('format1.csv', 'w') as outfile:
    for one_line in open('electric_cars.csv'):
        outfile.write('\t'.join(one_line.strip().split(',')) + '\n')
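The same comma-to-tab conversion can also be sketched with the csv module, which is an alternative to the split-and-join approach rather than a replacement for it. The example below works on in-memory text with hypothetical columns and values so it runs on its own; with real files you would pass opened file objects instead of the StringIO objects.

```python
import csv
import io

# Hypothetical input; columns and values are made up for illustration.
csv_text = 'model,trim,price_usd\nAlpha,"Sedan, long range",35000\n'

source = io.StringIO(csv_text)
target = io.StringIO()

reader = csv.reader(source)
# delimiter="\t" writes tab-separated values; lineterminator="\n" keeps
# the line endings simple for this in-memory example.
writer = csv.writer(target, delimiter="\t", lineterminator="\n")
writer.writerows(reader)

tsv_text = target.getvalue()
print(tsv_text)

# The quoted value survives as a single field in the tab-separated output.
second_row = tsv_text.splitlines()[1].split("\t")
print(second_row)
```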
Wrapping it up
There are a variety of ways that you can process CSV files in Python. Some of you may not have heard of these approaches, and some of you may have. It is always good to know more than one way of tackling an issue in Data Science, and you should always be open to learning these different ways!
Nisha Arya is a Data Scientist and Freelance Technical Writer. She is particularly interested in providing Data Science career advice or tutorials and theory based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence is/can benefit the longevity of human life. A keen learner, seeking to broaden her tech knowledge and writing skills, whilst helping guide others.