LAST UPDATED: JANUARY 7, 2021
Data Structure in Pandas
1. Series is a one-dimensional array-like structure with homogeneous data. For example, the following series is a collection of integers 10, 23, 56, …
10 | 23 | 56 | 17 | 52 | 61 | 73 | 90 | 26 | 72
Key Points
- Homogeneous data
- Size Immutable
- Values of Data Mutable
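As a minimal sketch of the Series described above, the snippet below builds the ten integers with `pd.Series` and overwrites one value in place. The replacement value 11 is an arbitrary choice made for illustration.

```python
import pandas as pd

# The ten integers from the example above.
s = pd.Series([10, 23, 56, 17, 52, 61, 73, 90, 26, 72])

# Homogeneous data: the whole Series shares one dtype (an integer dtype here).
print(s.dtype)
print(len(s))

# Values are mutable: overwrite the first element in place.
# The new value 11 is arbitrary, chosen only for this illustration.
s.iloc[0] = 11
print(s.head(3))
```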
2. DataFrame is a two-dimensional array with heterogeneous data. For example,
Name | Age | Gender | Rating
sam | 45 | male | 3.4
heer | 34 | female | 4.5
Fahim | 34 | male | 4.5
Ena | 23 | female | 3.6
The table represents a sales team in an organization, along with each member's overall performance rating. The data is arranged in rows and columns: each column represents an attribute and each row represents a person.
Data Type of Columns
The data types of the four columns are as follows:
Column | Datatype
Name | String
Age | Integer
Gender | String
Rating | Float
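One way to check these column types is to build the table as a DataFrame and inspect `dtypes`. This is a sketch only: pandas typically reports string columns as `object` (or as a dedicated string dtype in newer versions) rather than as "String", and the exact integer and float widths can vary.

```python
import pandas as pd

# The sales-team table from above, one list per column.
df = pd.DataFrame({
    "Name": ["sam", "heer", "Fahim", "Ena"],
    "Age": [45, 34, 34, 23],
    "Gender": ["male", "female", "male", "female"],
    "Rating": [3.4, 4.5, 4.5, 3.6],
})

# Heterogeneous data: each column carries its own dtype.
print(df.dtypes)
```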
Key Points
- Heterogeneous data
- Size Mutable
- Data Mutable
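The two mutability points can be sketched as follows. The `Senior` column and the new rating value are hypothetical, introduced only to show a DataFrame growing and a cell being overwritten.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["sam", "heer", "Fahim", "Ena"],
    "Age": [45, 34, 34, 23],
    "Gender": ["male", "female", "male", "female"],
    "Rating": [3.4, 4.5, 4.5, 3.6],
})

# Size mutable: add a new column to the existing DataFrame.
# "Senior" is a hypothetical column, not part of the original table.
df["Senior"] = df["Age"] >= 40
print(df.shape)

# Data mutable: overwrite a single cell by row label and column name.
# The value 3.9 is made up for illustration.
df.loc[0, "Rating"] = 3.9
print(df)
```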
3. Panel is a three-dimensional data structure with heterogeneous data. A panel is hard to represent graphically, but it can be pictured as a container of DataFrames. Note that Panel was deprecated in pandas 0.20 and is no longer available in current pandas releases.
Key Points
- Heterogeneous data
- Size Mutable
- Data Mutable
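Recent pandas releases no longer ship a Panel class, so the "container of DataFrames" idea can only be sketched with other tools. The example below assumes a plain dict of DataFrames combined with `pd.concat`. The quarter labels `Q1` and `Q2` and their ratings are invented for illustration.

```python
import pandas as pd

# Two small DataFrames with the same columns. The quarter labels and
# the rating values are made up for this sketch.
q1 = pd.DataFrame({"Name": ["sam", "heer"], "Rating": [3.4, 4.5]})
q2 = pd.DataFrame({"Name": ["sam", "heer"], "Rating": [3.6, 4.4]})

# A plain dict acts as the container of DataFrames.
frames = {"Q1": q1, "Q2": q2}

# Passing a dict to pd.concat uses its keys as an outer index level,
# giving one DataFrame with a two-level row index.
stacked = pd.concat(frames)

# Select the inner DataFrame for one key.
print(stacked.loc["Q2"])
```

This is a stand-in for the third dimension, not a reproduction of the Panel API.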
Operations that you can perform on pandas data structures:
- Merge and join data structures to build larger, combined datasets for your data analysis project.
- Slice large datasets to access only a certain section of the data.
- Group data that shares common labels so that you can find links between a particular group and a particular field of data.
- Add or remove indices to easily label your existing dataset.
- Pivot and reshape your data to get more out of your ordinary datasets.
- Carry out boolean checks to test whether a condition holds over your dataset.
- Sort the elements in your dataset according to your needs.
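The operations listed above can be sketched on the sales-team table as follows. The `regions` table and its `Region` values are hypothetical, added only so that there is something to merge and pivot on.

```python
import pandas as pd

team = pd.DataFrame({
    "Name": ["sam", "heer", "Fahim", "Ena"],
    "Age": [45, 34, 34, 23],
    "Gender": ["male", "female", "male", "female"],
    "Rating": [3.4, 4.5, 4.5, 3.6],
})

# Hypothetical second table, invented for the merge example.
regions = pd.DataFrame({
    "Name": ["sam", "heer", "Fahim", "Ena"],
    "Region": ["North", "North", "South", "South"],
})

# Merge / join: combine the two tables on their shared "Name" column.
merged = team.merge(regions, on="Name")

# Slice: rows labelled 0 through 1 (inclusive with .loc), two columns.
subset = merged.loc[0:1, ["Name", "Rating"]]

# Group: average rating for rows that share a "Gender" label.
mean_rating = merged.groupby("Gender")["Rating"].mean()

# Add and remove an index: use "Name" as the row labels, then undo it.
indexed = merged.set_index("Name")
restored = indexed.reset_index()

# Pivot: one row per Region, one column per Gender.
pivoted = merged.pivot_table(
    index="Region", columns="Gender", values="Rating", aggfunc="mean"
)

# Boolean checks: test a condition over a column, or filter rows by it.
all_adults = (merged["Age"] >= 18).all()
high_rated = merged[merged["Rating"] > 4.0]

# Sort: order rows by age, oldest first.
by_age = merged.sort_values("Age", ascending=False)

print(mean_rating)
print(pivoted)
print(by_age)
```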