Python – Pandas – Using Dictionary to remap values in Dataframe

When working with data in Pandas, we typically apply many operations to get the data into the desired form before, for example, plotting or visualizing it. One such operation is remapping the values of a specific column in a DataFrame, which can be done in several ways.

The following example shows how, given a DataFrame containing data about events, we can remap the values of a specific column to new values using a dictionary.

The first step, for this example, is to create a sample dataframe with some dummy data:

import pandas as pd

# Creating the DataFrame.
df = pd.DataFrame({'Date': ['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
                   'Event': ['Music', 'Poetry', 'Theatre', 'Comedy'],
                   'Cost': [10000, 5000, 15000, 2000]})
# Printing the dataframe.
print(df)


Now we will remap the values of the Event column to their respective codes:

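# Create the dictionary that maps each event name to its code.
# (Named event_codes rather than dict to avoid shadowing the built-in type.)
event_codes = {'Music': 'M', 'Poetry': 'P', 'Theatre': 'T', 'Comedy': 'C'}
# Printing the dictionary.
print(event_codes)
# Remap the values of the DataFrame. replace() returns a new DataFrame,
# so the result is assigned back to df.
df = df.replace({'Event': event_codes})
print(df)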
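
One detail worth spelling out is that replace() returns a new DataFrame rather than modifying the original, and that values missing from the dictionary are left untouched. A minimal sketch, where the extra 'Dance' row and the names event_codes and remapped are illustrative assumptions and not part of the example above:

```python
import pandas as pd

# Illustrative data; the 'Dance' row has no entry in the mapping below.
df = pd.DataFrame({'Event': ['Music', 'Poetry', 'Dance']})
event_codes = {'Music': 'M', 'Poetry': 'P'}

# replace() returns a new DataFrame; df itself is not modified.
remapped = df.replace({'Event': event_codes})

print(df['Event'].tolist())        # original values are still there
print(remapped['Event'].tolist())  # 'Dance' passes through unchanged
```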


map() Method

We can also use the map() method of a column (a Series) to achieve the same task:

import pandas as pd

# Creating the DataFrame.
df = pd.DataFrame({'Date': ['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
                   'Event': ['Music', 'Poetry', 'Theatre', 'Comedy'],
                   'Cost': [10000, 5000, 15000, 2000]})
# Printing the DataFrame.
print(df)

Now we will remap the values of the 'Event' column to their respective codes.

# Create the dictionary that maps each event name to its code.
# (Named event_codes rather than dict to avoid shadowing the built-in type.)
event_codes = {'Music': 'M', 'Poetry': 'P', 'Theatre': 'T', 'Comedy': 'C'}

# Print the dictionary
print(event_codes)

# Remap the values of the DataFrame
df['Event'] = df['Event'].map(event_codes)

# Print the DataFrame after modification
print(df)
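
Unlike replace(), map() produces NaN for any value that has no key in the dictionary. The sketch below illustrates this with a hypothetical 'Dance' entry that is missing from the mapping; the fillna() fallback is one possible way to keep the original value, not the only one:

```python
import pandas as pd

# Illustrative data; 'Dance' has no key in the mapping below.
events = pd.Series(['Music', 'Poetry', 'Dance'])
event_codes = {'Music': 'M', 'Poetry': 'P'}

# map() yields NaN for values that have no key in the dictionary.
mapped = events.map(event_codes)

# One possible fallback: keep the original value where the mapping found nothing.
mapped_with_fallback = mapped.fillna(events)

print(mapped.tolist())
print(mapped_with_fallback.tolist())
```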

Function Approach

Another approach is to wrap the replacement in a Python function, here equivalent to the replace() approach, so that we can reuse it whenever we need it during our analysis:

def remap(data, dict_labels):
    # dict_labels maps each column name to its {old value: new value} dictionary.
    for field, values in dict_labels.items():
        print("I am remapping %s" % field)
        data.replace({field: values}, inplace=True)
    print("DONE")
    return data
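
As a usage sketch, the function takes a dictionary whose keys are column names and whose values are the per-column mappings. The 'Venue' column and its codes are hypothetical, added only to show two columns being remapped in one call, and the progress messages are omitted for brevity:

```python
import pandas as pd

def remap(data, dict_labels):
    # Same logic as the function above: one replace() call per column.
    for field, values in dict_labels.items():
        data.replace({field: values}, inplace=True)
    return data

# 'Venue' is a made-up column used only for this illustration.
df = pd.DataFrame({'Event': ['Music', 'Poetry', 'Theatre', 'Comedy'],
                   'Venue': ['Hall', 'Cafe', 'Hall', 'Club']})

# Outer keys are column names; inner dictionaries map old values to new ones.
labels = {'Event': {'Music': 'M', 'Poetry': 'P', 'Theatre': 'T', 'Comedy': 'C'},
          'Venue': {'Hall': 'H', 'Cafe': 'CF', 'Club': 'CL'}}

df = remap(df, labels)
print(df)
```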

Update Approach

Taking di as a dictionary and df as a DataFrame, if the keys of di refer to index values, you can use the update() method:

df['col1'].update(pd.Series(di))

For example:

import pandas as pd
import numpy as np

df = pd.DataFrame({'col1': ['w', 10, 20], 'col2': ['a', 30, np.nan]}, index=[1, 2, 0])

  col1 col2
1    w    a
2   10   30
0   20  NaN

di = {0: 'A', 2: 'B'}

The value at index label 0 is mapped to 'A', and the value at index label 2 is mapped to 'B':

df['col1'].update(pd.Series(di))
print(df)

  col1 col2
1    w    a
2    B   30
0    A  NaN

Note how the keys in di are matched against index labels. The position of those labels within the index does not matter.
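
Because df['col1'].update(...) modifies a Series pulled out of the DataFrame, whether the change reaches df can depend on the pandas version and on whether Copy-on-Write is enabled. A sketch of one alternative, using the same df and di as above, calls update() on the DataFrame itself; wrapping the Series in a one-column DataFrame is a choice made here for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['w', 10, 20], 'col2': ['a', 30, np.nan]},
                  index=[1, 2, 0])
di = {0: 'A', 2: 'B'}

# Align on index labels: label 0 receives 'A' and label 2 receives 'B',
# regardless of where those labels sit in the frame.
df.update(pd.DataFrame({'col1': pd.Series(di)}))

print(df)
```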
