Writing data from a Pandas Dataframe to a MySQL table

Last updated on 21st January 2020

Pandas is currently one of the most popular Python library used for data analysis. A DataFrame in Pandas is a data structure for storing data in tabular form, i.e., in rows and columns. This article describes how to write the data in a Pandas DataFrame to a MySQL table.

Consider a DataFrame with three records like below.

You can create a database table in MySQL and insert this data using the to_sql() function in Pandas. The to_sql() function requires two mandatory parameters - table name and an SQLAlchemy engine object.

SQLAlchemy is a Python toolkit and Object Relational Mapper (ORM) that allows Python to work with SQL Databases. An engine is the base of any SQLAlchemy application that talks to the database. The engine object is created by calling the create_engine() function with database dialect and connection parameters.

The following Python program creates a new table named users in a MySQL database and populates the table with the three records from our example dataframe.

import pandas as pd
from sqlalchemy import create_engine

# Credentials to database connection
hostname="localhost"
dbname="mydb_name"
uname="my_user_name"
pwd="my_password"

# Create dataframe
df = pd.DataFrame(data=[[111,'Thomas','35','United Kingdom'],
		[222,'Ben',42,'Australia'],
		[333,'Harry',28,'India']],
		columns=['id','name','age','country'])

# Create SQLAlchemy engine to connect to MySQL Database
engine = create_engine("mysql+pymysql://{user}:{pw}@{host}/{db}"
				.format(host=hostname, db=dbname, user=uname, pw=pwd))

# Convert dataframe to sql table                                   
df.to_sql('users', engine, index=False)

After executing the above Python script, login to your MySQL database and query the newly created users table.

SELECT * from users;

id	name	age	country
111	Thomas	35	United Kingdom
222	Ben	42	Australia
333	Harry	28	India

Now let's look at some of the optional parameters that can be passed to the to_sql() function.

index : True or False. Default is True.
If the index paramter in the to_sql function is set to True (index=True) or if the index parameter is omitted then the DataFrame's index is added as the first column in the MySQL database table. If you don't want the index then set index=False
index_label : string. Default is None.
When index=True, the DataFrame's index is included in the table under column name index. You can have a different column name for the index column by setting a string value to the index_label parameter. For example
```
df.to_sql('users', engine, index=True, index_label="RowNum")
```
if_exists : fail, replace, append. Default is fail
This parameter decides what to do if the table already exist in the database. By dafault an error is raised if the table exists. If you set this parameter to replace then the existing table is first dropped before inserting the values. If set to append then the values in the dataframe are appended to the existing table.
dtype : dictionary
The dtype parameter allows you to specify the data type of the table columns as a dictionary object. The keys in the dictionary are column names and values can be one of the SQLAlchemy data types. For example
```
from sqlalchemy.types import String, Integer
df.to_sql('users', engine, dtype={"age": Integer(), "name": String(30)})
```
The above to_sql() function call creates the users table with the age column set as integer type and name as varchar
Columns in users table
```
index bigint(20) 
id bigint(20) 
name varchar(30) 
age int(11) 
country text
```

Adding a PRIMARY KEY

The to_sql() function does not offer any options for creating a Primary Key column. However you can achieve this by calling the .execute method of SQLAlchemy engine.

The following statement sets the id column in users table as the primary key.

engine.execute('ALTER TABLE users ADD PRIMARY KEY (`id`);')

Writing data from a Pandas Dataframe to a MySQL table

Adding a PRIMARY KEY

Related Articles

Post a comment

Comments

Related Articles

Reading data from MySQL to Pandas Dataframe

Different ways to create a Pandas DataFrame

Selecting, Slicing and Filtering data in a Pandas DataFrame