Dataset Creation
Before we dive into the Pandas tutorial, let's create a sample dataset of employees working in different departments with salary details.
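The dataset described above could look like the following minimal sketch. The column names, employee names, and salary figures are invented for illustration; one salary is deliberately left missing so it can be used when discussing missing values.

```python
import pandas as pd

# Hypothetical sample data: every name and number here is made up.
data = {
    "EmployeeID": [101, 102, 103, 104, 105, 106],
    "Name": ["Alice", "Bob", "Charlie", "Diana", "Ethan", "Fiona"],
    "Department": ["HR", "IT", "IT", "Finance", "Finance", "HR"],
    "Salary": [50000, 72000, 68000, 61000, None, 54000],  # None becomes NaN
}

df = pd.DataFrame(data)
print(df)
```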
What is Pandas?
Pandas is an open-source Python library used for data manipulation and analysis. It provides powerful data structures: Series (1D) and DataFrame (2D), which make handling structured data easy and efficient.
Key Features of Pandas:
Fast and efficient DataFrame object
Tools for reading and writing data from multiple formats (CSV, Excel, SQL, JSON, etc.)
Data alignment and handling of missing data
Powerful indexing and slicing capabilities
Data aggregation, merging, and grouping
1. Series and DataFrames
Series (1D Data Structure)
A Series is a one-dimensional labeled array capable of holding data of any type.
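As a small sketch of a Series, assuming a few made-up salaries labeled by employee name:

```python
import pandas as pd

# Hypothetical salaries; the index labels are employee names.
salaries = pd.Series(
    [50000, 72000, 68000],
    index=["Alice", "Bob", "Charlie"],
    name="Salary",
)

print(salaries["Bob"])   # access by label
print(salaries.iloc[0])  # access by position
print(salaries.mean())   # vectorized statistics work on the whole Series
```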
DataFrame (2D Data Structure)
A DataFrame is a two-dimensional, size-mutable table with labeled rows and columns. It is heterogeneous in the sense that each column can hold a different data type.
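The sketch below illustrates these properties on a tiny, invented table: each column keeps its own dtype, and selecting a single column returns a Series.

```python
import pandas as pd

# Hypothetical data with three differently typed columns.
df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Salary": [50000.0, 72000.0],
    "FullTime": [True, False],
})

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # one dtype per column

salary_column = df["Salary"]          # a single column is a Series
print(type(salary_column).__name__)
```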
2. Creating DataFrames
From a Dictionary
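A minimal sketch with invented values: a dictionary of lists maps column names to column data, while a list of dictionaries describes the same table one row at a time.

```python
import pandas as pd

# Dictionary of lists: each key becomes a column name.
by_column = {"Name": ["Alice", "Bob"], "Salary": [50000, 72000]}
df_cols = pd.DataFrame(by_column)

# List of dictionaries: each dictionary becomes a row.
by_row = [
    {"Name": "Alice", "Salary": 50000},
    {"Name": "Bob", "Salary": 72000},
]
df_rows = pd.DataFrame(by_row)

print(df_cols)
print(df_cols.equals(df_rows))
```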
From a CSV File
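The sketch below uses `pd.read_csv`. To keep it self-contained, an in-memory string with invented rows stands in for a file; in practice a path such as a hypothetical `employees.csv` would be passed instead.

```python
import io
import pandas as pd

# Made-up CSV content standing in for a file on disk.
csv_text = """EmployeeID,Name,Department,Salary
101,Alice,HR,50000
102,Bob,IT,72000
103,Charlie,IT,68000
"""

df = pd.read_csv(io.StringIO(csv_text))  # a file path works the same way
print(df)
```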
3. Reading and Writing Data
Reading CSV
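`pd.read_csv` accepts options that control what is loaded. The sketch below, on invented in-memory data, picks a subset of columns, uses one of them as the index, and forces a dtype.

```python
import io
import pandas as pd

# Made-up CSV content; a real file path could be passed instead.
csv_text = """EmployeeID,Name,Department,Salary
101,Alice,HR,50000
102,Bob,IT,72000
103,Charlie,IT,68000
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["EmployeeID", "Name", "Salary"],  # load only these columns
    index_col="EmployeeID",                    # use this column as the row index
    dtype={"Salary": "float64"},               # override the inferred dtype
)
print(df)
```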
Writing CSV
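A minimal sketch of `DataFrame.to_csv` on invented data. Called without a path it returns the CSV text; called with a path it writes a file. The file name `employees.csv` and the temporary directory are assumptions made so the example cleans up after itself.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 72000]})

# Without a path, to_csv returns the CSV content as a string.
csv_text = df.to_csv(index=False)  # index=False omits the row index column
print(csv_text)

# With a path, to_csv writes a file; here it is read back to check the result.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "employees.csv")
    df.to_csv(path, index=False)
    round_trip = pd.read_csv(path)

print(round_trip)
```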
Reading Excel
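`pd.read_excel` needs an optional Excel engine such as openpyxl to be installed. The sketch below first writes a small, invented sheet into memory so there is something to read; the sheet name `Employees` is an assumption, and the example falls back to a message if no engine is available.

```python
import io
import pandas as pd

source = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 72000]})

try:
    # Build a workbook in memory to stand in for an .xlsx file on disk.
    buffer = io.BytesIO()
    source.to_excel(buffer, index=False, sheet_name="Employees")
    buffer.seek(0)

    df = pd.read_excel(buffer, sheet_name="Employees")
    print(df)
except ImportError:
    df = None
    print("An Excel engine such as openpyxl is required for this example.")
```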
Writing Excel
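As a sketch, `DataFrame.to_excel` can write several sheets into one workbook through `pd.ExcelWriter`. The file name, sheet names, and data are hypothetical, and an optional engine such as openpyxl must be installed for the calls to succeed.

```python
import os
import tempfile
import pandas as pd

employees = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 72000]})
departments = pd.DataFrame({"Department": ["HR", "IT"]})

try:
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "company.xlsx")

        # One writer, two sheets in the same workbook.
        with pd.ExcelWriter(path) as writer:
            employees.to_excel(writer, sheet_name="Employees", index=False)
            departments.to_excel(writer, sheet_name="Departments", index=False)

        # sheet_name=None loads every sheet into a dict keyed by sheet name.
        sheets = pd.read_excel(path, sheet_name=None)
        sheet_names = list(sheets)
    print(sheet_names)
except ImportError:
    sheet_names = None
    print("An Excel engine such as openpyxl is required for this example.")
```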
4. Data Types and Missing Values
Checking Data Types
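A short sketch of inspecting and converting dtypes. The data is invented, and the `JoinDate` column is a hypothetical addition used only to show date parsing.

```python
import pandas as pd

# Salary and JoinDate start out as text, as often happens after loading raw data.
df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Salary": ["50000", "72000"],
    "JoinDate": ["2021-01-15", "2020-06-01"],
})

print(df.dtypes)  # dtype of each column before conversion

df["Salary"] = df["Salary"].astype("int64")      # text to integers
df["JoinDate"] = pd.to_datetime(df["JoinDate"])  # text to datetimes

print(df.dtypes)  # dtype of each column after conversion
```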
Handling Missing Data
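Before missing values can be handled they have to be found. The sketch below, on invented data, shows how a missing entry affects the column dtype and how `isna` locates it.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Salary": [50000, None, 68000],  # None is stored as NaN
})

print(df["Salary"].dtype)  # float64, because NaN is a floating-point value
print(df.isna())           # True wherever a value is missing
missing_per_column = df.isna().sum()
print(missing_per_column)
```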
5. Indexing Methods: loc, iloc
Using loc
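`loc` selects by label. The sketch below uses invented data with hypothetical string labels (`e1` to `e3`) as the index, so the difference from positional access is visible.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Charlie"],
        "Department": ["HR", "IT", "IT"],
        "Salary": [50000, 72000, 68000],
    },
    index=["e1", "e2", "e3"],
)

row = df.loc["e2"]                              # one row, returned as a Series
subset = df.loc["e1":"e2", ["Name", "Salary"]]  # label slices include both ends
it_names = df.loc[df["Department"] == "IT", "Name"]  # boolean mask plus column

print(row)
print(subset)
print(it_names)
```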
Using iloc
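`iloc` selects by integer position, following Python slicing rules. A minimal sketch on invented data:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Department": ["HR", "IT", "IT"],
    "Salary": [50000, 72000, 68000],
})

first_row = df.iloc[0]            # first row as a Series
first_two = df.iloc[0:2, [0, 2]]  # rows 0 and 1 (end excluded), columns 0 and 2
last_salary = df.iloc[-1, 2]      # last row, third column

print(first_row)
print(first_two)
print(last_salary)
```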
6. Boolean Indexing
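Boolean indexing filters rows with a Series of True/False values. In this sketch the data and the 60000 threshold are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Salary": [50000, 72000, 68000, 61000],
})

mask = df["Salary"] > 60000  # one True/False value per row
high_earners = df[mask]      # keeps only the rows where the mask is True

print(mask)
print(high_earners)
```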
7. Selection Based on Conditions
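Several conditions can be combined with `&` (and), `|` (or), and `~` (not); each condition needs its own parentheses. The sketch below uses invented data and also shows `isin` and `query` as alternatives.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Department": ["HR", "IT", "IT", "Finance"],
    "Salary": [50000, 72000, 68000, 61000],
})

# Both conditions must hold.
it_high = df[(df["Department"] == "IT") & (df["Salary"] > 70000)]

# Membership test against a list of values.
hr_or_finance = df[df["Department"].isin(["HR", "Finance"])]

# The same filter as it_high, written as a query string.
same_as_it_high = df.query("Department == 'IT' and Salary > 70000")

print(it_high)
print(hr_or_finance)
```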
8. Adding and Deleting Columns
Adding a Column
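A minimal sketch of two ways to add a column. The `Bonus` and `TotalPay` columns and the 10% rate are assumptions made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 72000]})

# Direct assignment modifies df in place.
df["Bonus"] = df["Salary"] * 0.10

# assign returns a new DataFrame and can reference existing columns.
df = df.assign(TotalPay=lambda d: d["Salary"] + d["Bonus"])

print(df)
```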
Deleting a Column
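Columns can be removed with `drop`, which returns a new DataFrame, or with `del`, which modifies the original. The data and column names below are invented.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Salary": [50000, 72000],
    "Bonus": [5000, 7200],
    "Notes": ["", "on leave"],
})

trimmed = df.drop(columns=["Notes"])  # df itself is unchanged
del df["Bonus"]                       # removes the column from df in place

print(trimmed.columns.tolist())
print(df.columns.tolist())
```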
9. Handling Missing Data
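Two common strategies are dropping incomplete rows and filling gaps with a substitute value. In this sketch the data is invented, and filling with the column mean is just one possible choice.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Salary": [50000, None, 70000],
})

# Option 1: drop rows where Salary is missing.
dropped = df.dropna(subset=["Salary"])

# Option 2: fill missing salaries with the mean of the known ones.
filled = df.fillna({"Salary": df["Salary"].mean()})

print(dropped)
print(filled)
```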
10. Grouping and Aggregation
Grouping Data
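`groupby` splits rows into groups that share a key and then applies a computation to each group. A minimal sketch on invented data:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Department": ["HR", "IT", "IT", "Finance"],
    "Salary": [50000, 72000, 68000, 61000],
})

mean_by_dept = df.groupby("Department")["Salary"].mean()  # average per group
headcount = df.groupby("Department").size()               # rows per group

print(mean_by_dept)
print(headcount)
```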
Aggregation
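`agg` computes several statistics at once. The sketch below uses named aggregation on invented data; the output column names (`headcount`, `avg_salary`, `max_salary`) are arbitrary choices.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Department": ["HR", "IT", "IT", "Finance"],
    "Salary": [50000, 72000, 68000, 61000],
})

# Each keyword is output_name=(input_column, function).
summary = df.groupby("Department").agg(
    headcount=("Name", "count"),
    avg_salary=("Salary", "mean"),
    max_salary=("Salary", "max"),
)

# agg also works without grouping, on a single column.
overall = df["Salary"].agg(["min", "max"])

print(summary)
print(overall)
```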
11. Merging and Joining DataFrames
Merging DataFrames
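`pd.merge` combines two DataFrames on a shared key column, much like a SQL join. In this sketch both tables and the `DeptID` key are invented; one employee points to a department that does not exist, to show how `how` changes the result.

```python
import pandas as pd

employees = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "DeptID": [1, 2, 2, 4],  # there is no department 4
})
departments = pd.DataFrame({
    "DeptID": [1, 2, 3],
    "DeptName": ["HR", "IT", "Finance"],
})

# inner: keep only rows whose key appears in both tables.
inner = pd.merge(employees, departments, on="DeptID", how="inner")

# left: keep every employee; unmatched departments become NaN.
left = pd.merge(employees, departments, on="DeptID", how="left")

print(inner)
print(left)
```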
Joining DataFrames
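`DataFrame.join` combines DataFrames on their index and performs a left join by default. A minimal sketch with invented data, using employee names as the index:

```python
import pandas as pd

salaries = pd.DataFrame(
    {"Salary": [50000, 72000, 68000]},
    index=["Alice", "Bob", "Charlie"],
)
departments = pd.DataFrame(
    {"Department": ["HR", "IT"]},
    index=["Alice", "Bob"],  # no entry for Charlie
)

joined = salaries.join(departments)                     # left join on the index
inner_joined = salaries.join(departments, how="inner")  # only shared labels

print(joined)
print(inner_joined)
```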
This tutorial introduces the core features of Pandas with practical examples. Stay tuned for more advanced topics!