Why learn to use Excel

In the establishment of the machine learning model, Excel’s powerful statistical analysis capabilities have to be used. At this time, it involves how to store the data directly processed by Python into the excel form and obtain the data of the form for analysis. In simple terms, learn After operating Excel files with Python, you can apply them to machine learning and automate the tedious tasks in daily life.

Basic use of Excel

Python package for processing Excel is openpyxl. If you downloaded the Python library installed by anaconda directly, then when you download it, openpyxl is already installed. If not, you can also install it directly in cmd using pip install openpyxl .

After the installation is complete, we should understand the components of Excel, as shown below:


You should know when you see the above picture. First, we should get the Excel file:


For an existing Excel target file, using the above method, you can directly obtain the excel file. If you want to re-create a new Excel file, you may try the following methods:


Excel has been obtained, but in fact our data is in a table one by one, as shown below:

There can be multiple tables in an Excel file. Generally speaking, the first table is operated, and there are many methods to obtain the table, such as:

    1. sheet=wb.worksheets[0]



    1. sheet=wb[‘吸入类’]



The above two methods are to get the first table. If we use wb = openpyxl.Workbook () method, what about a newly created Excel? We need to create an action table:

    1. wb.create_sheet(index=0,title=’小猪’)



    1. wb.create_sheet(index=1,title=’佩奇’)



After we get the form, we can happily manipulate our data.

For example, at this time, if you want to get the text in cell A1 in the table, you can use the following method:


The above is to get the data of a single cell. Now I have a need to traverse all the data in a table. What should I do? Here we first suppose I have the following form:


Then the code traversed is as follows:

import openpyxl
for row in sheet.iter_rows():
for cell in row:
print(cell.coordinate, cell.value)


The above is traversing line by line. First we get the Excel file, then the first table in it, and then iterate through the number of rows and columns. Openpyxl’s method for obtaining the number of rows is sheet.iter_rows (). The number of columns can be obtained. The direct access to the current row is for cell in row. The code is concise and clear, and the implementation effect is as follows:


So how do you modify the data? Let’s take a look at the code first:




In all three ways, you can modify data in a cell in Excel, and you can also get the value of this cell in the first two ways. The advantage of Python is so simple and clear.

After these operations, we have one more thing to do, and that is, if we save or create Excel after modification, it is actually very simple, and it can be done in one sentence. The code is as follows:


In this way, the operation data is saved to 1234.xlsx under the current directory. It is also possible to write the absolute path.

Excel style

In everyday life, Excel tables are not just simple operations such as writing data, modifying data, and saving data. In fact, if you make Excel more beautiful, such as modifying certain fonts, border styles, and so on.

Font style

Well, let’s talk about how to modify the font style. We have special font styles in openpyxl, which can be imported in the following ways. The code used is as follows:

from openpyxl.styles import Font, PatternFill, Border, Side, Alignment

font = Font(name='微软雅黑', size=10, bold=False, italic=False, vertAlign=None,

underline='none', strike=False, color='FF000000')

Here we first introduce the font style in Excel, and then create a style through the constructor. For example, the font created here is Microsoft Yahei, the font size is 10, no bold, no italics, no underline, no strikethrough, The color is black.

As much as we have created a style, the next step is to set the style to the cell. For example, to set the style to the cell in Table 11, the following two methods will work, the code is as follows:



Fill style

After talking about font styles, the second most commonly used is fill styles, such as filling a cell with a background color, a fill pattern, and so on. Let’s take a look at the operation of the code:

fill = PatternFill(fill_type='darkUp',start_color='FFFF00',end_color='FF0000')

This code implements the following style:

You can see that its pattern type is darkUp, starting color line color, ending color is red, red and yellow grid pattern fill, of course, there are many pattern types, the following are:

 {‘gray0625’, ‘lightHorizontal’, ‘darkVertical’, ‘darkGray’, ‘darkDown’, ‘solid’, ‘lightTrellis’, ‘darkHorizontal’, ‘gray125’, ‘lightGray’, ‘lightDown’, ‘lightUp’, ‘mediumGray’, ‘darkUp’, ‘darkGrid’, ‘darkTrellis’, ‘lightVertical’, ‘lightGrid’}

Which one is used is free to choose according to your needs. The important point is that when the pattern is solid, only the start_color color is filled by default.

Border style

The third thing to introduce is the border style. It can be said that the border style is also used a lot in actual work. It is still good to know about the actual work. The operation code is as follows:

border = Border(left=Side(border_style='dashDotDot',color='9932CC'),

The colors and styles of the top, bottom, left, and right borders are set above. Here I have set the dotted line style. The color randomly selects four darker colors. The effect achieved is shown below (upper red, lower green, left purple, right black):

Of course, there are many parameters for the border. If you want to use it more widely, you can refer to the
openpyxl development documentation

. The blogger only introduces the most commonly used border styles.

Alignment style

The last style is the alignment style. Compared to the above three styles, this alignment style is used almost every time. Now we first make a cell center alignment, the code is as follows:

alignment = Alignment(horizontal='center',vertical='center',text_rotation=0,indent=0)

Here we set the alignment style to horizontal center, vertical center, the text rotation angle is 0 (up to 180 °), and the indentation is 0, such as 90 degrees, the text becomes the style shown in the figure below:

Of course, there are many other parameters for this alignment style, but these are the most used. If you want to know more, please go to the
openpyxl development documentation

for the most detailed parameters. I won’t go into too much detail here.

Excel formula

Link formula

After the style is finished, we need to introduce the application of Excel formulas. For example, some Excel tables have links, and we need to use formulas to set the links to a clickable hyperlink mode. Similarly, when calculating certain values, we also use To Excel formula. Let’s first explain how to make the URL link clickable. The code is as follows:

sheet.cell(6,3).value='=HYPERLINK("%s","%s")' % ("https://www.google.com", "xxx")

The above code is very simple to understand, that is, I have to enter Baidu in the third row and the third column, and Baidu can click to jump to the designated web page (the first parameter is the address to be redirected).

This is actually the formula in excel. Of course, you can also set the exe file in a certain directory and click to execute the exe directly. You can also set a jump to a cell. For example, if you want to click to jump to A1, the code is as follows:

sheet.cell(6,3).value='=HYPERLINK("%s","%s")' % ("#A1", "jump to A1")

Calculation formula operation

Now I have another requirement. For example, I want to calculate the sum of the five numbers A1 to E1 in the table below. What should I do in Excel?

In fact, it is the same as above. As long as you know the calculations and formulas in Excel, you can directly apply them to the code. This way, even if you are a programmer, you can basically master the application rules of formulas, because as long as you know Excel, the code is as follows :


Merge and split Excel cells

Excel is probably already introduced, but the blogger just remembered that there should be a common operation method, which is to merge cells. Here we also look at a piece of code:


The merge is also the operation table structure. Here we merge the numbers of the table just above. You can see that we have merged all the cells from A1 to E1. At the same time, the merged cells display the contents of the first cell.

As the saying goes, splitting occurs when there is a merge, and we also need to split the cells. In this way, we will split the cells that were merged first into the original state. The code is as follows:


Very simple, just add an un in front of the code, but there is one thing to note here. Just merge the cells. The default is filled with the data in the first cell, but splitting the cells will not automatically fill the data. The original whole data is displayed in the first cell, and other split cells do not display any data.

Having summarized so much, I believe it will be helpful to friends who often use Excel. If you write it, you can remember to like it!