Signup/Sign In

Working with Xlsx Files in Python - openpyxl Module

Posted in Programming   LAST UPDATED: AUGUST 19, 2021

    Working with Excel files in Python is not that much hard as you might think. In this tutorial, we are going to learn how to create, read and modify .xlsx files using python.

    Introduction

    Xlsx files are the most widely used documents in the technology field. Data Scientists uses spreadsheets more than anyone else in the world and obivously they don't do it manually.

    We will need a module called openpyxl which is used to read, create and work with .xlsx files in python. There are some other modules like xlsxwriter, xlrd, xlwt, etc., but, they don't have methods for performing all the operations on excel files. To install the openpyxl module run the following command in command line:

    pip install openpyxl

    Let's see what all operations one can perform using the openpyxl module after importing the module in our code, which is simple:

    import openpyxl

    Once we have imported the module in our code, all we have to do is use various methods of the module to rad, write and create .xlsx files.


    Creating New .xlsx File

    In this program, we are going to create a new .xlsx file.

    import openpyxl
    
    ## CREATING XLSX FILE
    
    ## initializing the xlsx
    xlsx = openpyxl.Workbook()
    
    ## creating an active sheet to enter data
    sheet = xlsx.active
    
    ## entering data into the A1 and B1 cells in the sheet
    sheet['A1'] = 'Studytonight'
    sheet['B1'] = 'A Programming Site'
    
    ## saving the xlsx file using 'save' method
    xlsx.save('sample.xlsx')

    The above program creates an .xlsx file with the name sample.xlsx in the present working directory.

    Creating Xlsx File using openpyxl python module


    Writing to a Cell

    There are to ways to write to a cell. The first method is the one which we used in the program above and the second method is using the cell() method by passing the row and column numbers.

    Let's see how the second way works:

    import openpyxl
    
    ## initializing the xlsx
    xlsx = openpyxl.Workbook()
    
    ## creating an active sheet to enter data
    sheet = xlsx.active
    
    ## entering data into the cells using 1st method
    sheet['A1'] = 'Studytonight'
    sheet['B2'] = 'Cell B2'
    
    ## entering data into the cells using 2nd method
    sheet.cell(row = 1, column = 2).value = 'A Programming Site'
    sheet.cell(row = 2, column = 1).value = "B1"
    
    ## saving the xlsx file using 'save' method
    xlsx.save('write_to_cell.xlsx')

    adding cell data to Xlsx File using openpyxl python module

    In the second method above, we are getting the cell with the row and column values. After getting the cell, we are assigning a value to it using the value variable.


    Appending Data to a .xlsx file

    The append() method is used to append the data to any cell. Here is an example:

    import openpyxl
    
    ## initializing the xlsx
    xlsx = openpyxl.Workbook()
    
    ## creating an active sheet to enter data
    sheet = xlsx.active
    
    ## creating data to append
    data = [
             [1, 2, 3],
             [4, 5, 6],
             [7, 8, 9],
             [10, 11, 12]
           ]
    
    ## appending row by row to the sheet
    for row in data:
        ## append method is used to append the data to a cell
        sheet.append(row)
    
    ## saving the xlsx file using 'save' method
    xlsx.save('appending.xlsx')

    Appending data to .xlsx file using openpyxl module python

    Using the above program, we have appended 4 rows and 3 column values in our .xlsx file. You can also use tuples or any iteratable object instead of lists.


    Reading Data from a Cell

    We are now going to learn how to read data from a cell in a xlsx file. We will use the previously created .xlsx file to read data from the cell.

    import openpyxl
    
    ## opening the previously created xlsx file using 'load_workbook()' method
    xlsx = openpyxl.load_workbook('sample.xlsx')
    
    ## getting the sheet to active
    sheet = xlsx.active
    
    ## getting the reference of the cells which we want to get the data from
    name = sheet['A1']
    tag = sheet.cell(row = 1, column = 2)
    
    ## printing the values of cells
    print(name.value)
    print(tag.value)

    Output of above Program:

    Studytonight
    A Programming Site
    


    Reading Data from Multiple Cells

    Now we are going to use the appending.xlsx file to read data. It contains numeric values from 1 to 12 saved in the cells in form of 4 rows and 3 columns.

    import openpyxl
    
    ## opening the previously created xlsx file using 'load_workbook()' method
    xlsx = openpyxl.load_workbook('appending.xlsx')
    
    ## getting the sheet to active
    sheet = xlsx.active
    
    ## getting the reference of the cells which we want to get the data from
    values = sheet['A1' : 'C4']
    
    ## printing the values of cells
    for c1, c2, c3 in values:
    print("{} {} {}".format(c1.value, c2.value, c3.value))

    Output of above Program:

    1 2 3
    4 5 6
    7 8 9
    10 11 12

    Slicing method applied on cells returns a tuple containing each row as a tuple. And we can print all the cell's data using a loop.


    Getting Dimensions of an .xlsx Sheet

    Getting the dimensions of an .xlsx sheet is also possible and super-easy using the dimensions method.

    import openpyxl
    
    ## opening the previously created xlsx file using 'load_workbook()' method
    xlsx = openpyxl.load_workbook('appending.xlsx')
    
    ## getting the sheet to active
    sheet = xlsx.active
    
    ## getting the reference of the cells which we want to get the data from
    dimensions = sheet.dimensions
    
    ## printing the dimensions of the sheet
    print(dimensions)

    Output of above Program:

    A1:C4

    The output of the dimensions method is the range of the sheet from which cell to which cell the data is present.


    Getting Data from Rows of .xlsx File

    We can also get data from all the rows of an xlsx file by using the rows method.

    import openpyxl
    
    ## opening the previously created xlsx file using 'load_workbook()' method
    xlsx = openpyxl.load_workbook('appending.xlsx')
    
    ## getting the sheet to active
    sheet = xlsx.active
    
    ## getting the reference of the cells which we want to get the data from
    rows = sheet.rows
    
    ## printing the values of cells using rows
    for row in rows:
        for cell in row:
            print(cell.value, end = ' ')
    
        print("\n")

    Output of above Program:

    1 2 3 
    4 5 6 
    7 8 9 
    10 11 12

    rows method returns a generator which contains all rows of the sheet.


    Getting Data from Columns of .xlsx File

    We can get data from all the columns of an xlsx file by using the columns method.

    import openpyxl
    
    ## opening the previously created xlsx file using 'load_workbook()' method
    xlsx = openpyxl.load_workbook('appending.xlsx')
    
    ## getting the sheet to active
    sheet = xlsx.active
    
    ## getting the reference of the cells which we want to get the data from
    columns = sheet.columns
    
    ## printing the values of cells using rows
    for column in columns:
        for cell in column:
            print(cell.value, end = ' ')
        print("\n")

    Output of above Program:

    1 4 7 10 
    2 5 8 11 
    3 6 9 12

    columns method returns a generator which contains all the columns of the sheet.


    Working with Excel Sheets

    In this section we will see how we can create more sheets, get name of sheets and even change the name of any given sheet etc.

    1. Changing the name of an Excel Sheet

    We can also change the name of a given excel sheet using the title variable. Let's see an example:

    import openpyxl
    
    ## initializing the xlsx
    xlsx = openpyxl.Workbook()
    
    ## creating an active sheet to enter data
    sheet = xlsx.active
    
    ## entering data into the A1 and B1 cells in the sheet
    sheet['A1'] = 'Studytonight'
    sheet['B1'] = 'A Programming Site'
    
    ## setting the title for the sheet
    sheet.title = "Sample"
    
    ## saving the xlsx file using 'save' method
    xlsx.save('sample.xlsx')

    change sheet name using openpyxl module in python


    2. Getting Excel Sheet name

    Getting the names of all the sheets present in xlsx file is super easy using the openpyxl module. We can use the method called get_sheet_names() to get names of all the sheets present in the excel file.

    import openpyxl
    
    ## initializing the xlsx
    xlsx = openpyxl.load_workbook('sample.xlsx')
    
    ## getting all sheet names
    names = xlsx.get_sheet_names()
    
    print(names)

    Output of above Program:

    ['Sample']
    


    3. Creating more than one Sheet in an Excel File

    When creating our first xlsx file, we only created one sheet. Let's see how to create multiple sheets and give names to them.

    import openpyxl
    
    ## initializing the xlsx
    xlsx = openpyxl.Workbook()
    
    ## creating sheets
    xlsx.create_sheet("School")
    xlsx.create_sheet("College")
    xlsx.create_sheet("University")
    
    ## saving the xlsx file using 'save' method
    xlsx.save('multiple_sheets.xlsx')

    Multiple Sheets

    As you can see in the snapshot above that a new excel file is created with 3 new sheets with different names provided by us in the program.


    4. Adding Data to Multiple Sheets

    Entering data into different sheets present in xlsx file can also be done easily. In the program below, we will get the sheets by their names individually or all at once as we have done above. Let's see how to get sheets using their name and enter data into them.

    We will use previously created xlsx file called multiple_sheets.xlsx to enter the data into sheets.

    import openpyxl
    
    ## initializing the xlsx
    xlsx = openpyxl.Workbook()
    
    ## creating sheets
    xlsx.create_sheet("School")
    xlsx.create_sheet("College")
    xlsx.create_sheet("University")
    
    ## getting sheet by it's name
    school = xlsx.get_sheet_by_name("School")
    school['A1'] = 1
    
    ## getting sheet by it's name
    college = xlsx.get_sheet_by_name("College")
    college['A1'] = 2
    
    ## getting sheet by it's name
    university = xlsx.get_sheet_by_name("University")
    university['A1'] = 3
    
    ## saving the xlsx file using 'save' method
    xlsx.save('multiple_sheets.xlsx')

    Creating multiple sheets in excel file using openpyxl python module

    We can't modify the existing xlsx file using the openpyxl module but we can read data from it.


    Conclusion

    We have seen different methods which can be used while working with xlsx files using Python. If you want to explore more methods available in the openpyxl module, then you can try them using the dir() method to get information about all methods of openpyxl module.

    You can also see other modules like xlsxwriter, xlrd, xlwt, etc., for more functionalities. If you have already used any of these modules, do share your experience with us through comments section.

    You may also like:

    About the author:
    I am a Computer Science Student. I am interested in Programming. I love to know about new technologies.
    Tags:PythonOpenpyxlXlsx
    IF YOU LIKE IT, THEN SHARE IT
     

    RELATED POSTS