PUBLISHED ON: MARCH 15, 2021

Pandas DataFrame convert_dtypes() Method

In this tutorial, we will learn the Python pandas DataFrame.convert_dtypes() method. It converts the columns of a DataFrame to the best possible dtypes, using dtypes that support pd.NA. It returns a copy of the input DataFrame with the new dtypes. Here, "best possible" means the dtype best suited to hold the values in each column.

The syntax of the DataFrame.convert_dtypes() method is shown below.

Syntax

DataFrame.convert_dtypes(infer_objects=True, convert_string=True, convert_integer=True, convert_boolean=True, convert_floating=True)

Parameters

infer_objects: A bool (True or False); the default is True. It indicates whether object dtypes should be converted to the best possible types.

convert_string: A bool (True or False); the default is True. It indicates whether object dtypes should be converted to StringDtype().

convert_integer: A bool (True or False); the default is True. It indicates whether, where possible, columns should be converted to integer extension types.

convert_boolean: A bool (True or False); the default is True. It indicates whether object dtypes should be converted to BooleanDtype().

convert_floating: A bool (True or False); the default is True. It indicates whether, where possible, columns should be converted to floating extension types. If convert_integer is also True, preference is given to integer dtypes when the floats can be faithfully cast to integers.
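To make the effect of these flags concrete, here is a minimal sketch. It assumes pandas 1.2 through 2.x (the releases where convert_floating exists and strings are still inferred as object), and the column names name and count are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: strings with a missing value, and whole-number floats with a NaN.
df = pd.DataFrame({
    "name": ["x", "y", None],
    "count": [1.0, 2.0, np.nan],
})

# All flags at their defaults.
default = df.convert_dtypes()
# Leave string-like object columns untouched.
no_string = df.convert_dtypes(convert_string=False)
# Do not downcast whole-number floats to an integer extension dtype.
no_integer = df.convert_dtypes(convert_integer=False)

print(default.dtypes)     # name -> string, count -> Int64
print(no_string.dtypes)   # name stays object, count -> Int64
print(no_integer.dtypes)  # name -> string, count -> Float64
```

Because count holds only whole numbers, the default call prefers Int64. Turning convert_integer off lets convert_floating take over instead.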

Example 1: Convert the DataFrame to use best possible dtypes using DataFrame.convert_dtypes() Method

We can convert the columns of a DataFrame to the best possible dtypes. In the example below, both columns hold strings and start out with the object dtype.

import pandas as pd
df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': ['d', 'e', 'f']})
print("--------DataType of DataFrame---------")
print(df.dtypes)
print("--------DataType of DataFrame after converting---------")
df1 = df.convert_dtypes()
print(df1.dtypes)

Running the program produces the following output.


--------DataType of DataFrame---------
A object
B object
dtype: object
--------DataType of DataFrame after converting---------
A string
B string
dtype: object

Example 2: Convert the DataFrame Using DataFrame.convert_dtypes() Method

This example is similar to the previous one, but uses different data types.

import pandas as pd
df = pd.DataFrame({'A': [True, True, True], 'B': [True, 2, 3]})
print("--------DataType of DataFrame---------")
print(df.dtypes)
print("--------DataType of DataFrame after converting---------")
df1 = df.convert_dtypes()
print(df1.dtypes)

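Running the program produces the following output. Column A becomes the nullable boolean dtype, while column B mixes a boolean with integers, so no single extension dtype fits and it remains object.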


--------DataType of DataFrame---------
A bool
B object
dtype: object
--------DataType of DataFrame after converting---------
A boolean
B object
dtype: object

Example 3: Convert the DataFrame Using DataFrame.convert_dtypes() Method

Let's look at how the DataFrame.convert_dtypes() method handles a DataFrame that mixes several data types and missing values.

import pandas as pd
import numpy as np
# One column per line; each Series sets its starting dtype explicitly.
df = pd.DataFrame({
    "a": pd.Series([1, 2, 3], dtype=np.dtype("int32")),
    "b": pd.Series(["x", "y", "z"], dtype=np.dtype("O")),
    "c": pd.Series([True, False, np.nan], dtype=np.dtype("O")),
    "d": pd.Series(["h", "i", np.nan], dtype=np.dtype("O")),
    "e": pd.Series([10, np.nan, 20], dtype=np.dtype("float")),
    "f": pd.Series([np.nan, 100.5, 200], dtype=np.dtype("float")),
})
print("--------DataType of DataFrame---------")
print(df.dtypes)
print("--------DataType of DataFrame after converting---------")
df1 = df.convert_dtypes()
print(df1.dtypes)

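Running the program produces the following output. Column e holds whole-number floats with a missing value, so it becomes Int64, and the missing entries in columns c, d, and e are stored as pd.NA.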


--------DataType of DataFrame---------
a int32
b object
c object
d object
e float64
f float64
dtype: object
--------DataType of DataFrame after converting---------
a Int32
b string
c boolean
d string
e Int64
f Float64
dtype: object
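Two points from the introduction are easy to check directly: the result is a copy, and the new dtypes represent missing values as pd.NA. The following is a minimal sketch that reuses only column e from Example 3. The variable name converted is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"e": [10, np.nan, 20]})  # float64 because of the NaN
converted = df.convert_dtypes()

print(df["e"].dtype)                    # float64 (the original is unchanged)
print(converted["e"].dtype)             # Int64
print(converted["e"].iloc[1] is pd.NA)  # True
print(converted["e"].sum())             # 30 (pd.NA is skipped by default)
```

Since convert_dtypes() does not modify the DataFrame in place, the result has to be assigned to a variable, as df1 is in the examples above.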

Conclusion

In this tutorial, we learned about the Python pandas DataFrame.convert_dtypes() method. Through the examples, we saw how the method converts the columns of a DataFrame to the best possible dtypes.



About the author:
I like writing about Python, and frameworks like Pandas, Numpy, Scikit, etc. I am still learning Python. I like sharing what I learn with others through my content.