Pandas
1. What is Pandas?
Pandas is a Python library built on top of NumPy. It is designed for data analysis and manipulation tasks. It provides flexible data structures to work efficiently with structured data, such as CSV files, Excel sheets, SQL tables, and more.
Installing Pandas
Pandas is distributed on PyPI and can be installed with pip by running: pip install pandas
Importing Pandas
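By convention, Pandas is imported under the alias pd. A minimal check that the import works might look like this:

```python
import pandas as pd

# Printing the installed version confirms that the import succeeded.
print(pd.__version__)
```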
2. Series and DataFrames
Series (1D Data Structure)
A Series is similar to a column in a table. It consists of data and an index.
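As a small sketch of the idea, the following builds a Series with made-up values and string labels chosen only for illustration:

```python
import pandas as pd

# A Series pairs each value with an index label.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

print(s)
print(s["b"])  # look up a value by its index label: 20
```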
DataFrame (2D Data Structure)
A DataFrame is a table-like structure with rows and columns.
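A minimal sketch of a DataFrame, using hypothetical column names and values:

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values.
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 30, 35],
})

print(df)
print(df.shape)  # (3, 2): three rows, two columns
```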
3. Creating DataFrames
From Dictionary
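One common approach is to pass a dictionary of column names to lists of values. The data and the row labels below are invented for illustration; the index argument is optional.

```python
import pandas as pd

data = {
    "product": ["pen", "book", "lamp"],
    "price": [1.5, 12.0, 30.0],
}

# Without index=..., rows are labeled 0, 1, 2 by default.
df = pd.DataFrame(data, index=["p1", "p2", "p3"])
print(df)
```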
From List of Lists
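When the data arrives row by row, each inner list can be treated as one row. In this sketch the column names are supplied separately and are purely illustrative.

```python
import pandas as pd

# Each inner list is one row of the table.
rows = [
    ["pen", 1.5],
    ["book", 12.0],
    ["lamp", 30.0],
]

df = pd.DataFrame(rows, columns=["product", "price"])
print(df)
```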
From NumPy Array
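A two-dimensional NumPy array can be wrapped in a DataFrame directly. The array contents and column names here are placeholders.

```python
import numpy as np
import pandas as pd

# A 2x3 array becomes a DataFrame with two rows and three columns.
arr = np.array([[1, 2, 3], [4, 5, 6]])

df = pd.DataFrame(arr, columns=["a", "b", "c"])
print(df)
```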
4. Reading and Writing Data
Reading CSV Files
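The sketch below reads CSV data with pd.read_csv. To keep it self-contained, an in-memory string stands in for a file; in practice a path such as "data.csv" (a hypothetical name) would be passed instead.

```python
import io

import pandas as pd

# Stand-in for a file on disk; pd.read_csv also accepts a file path.
csv_text = "name,age\nAlice,25\nBob,30\n"

df = pd.read_csv(io.StringIO(csv_text))
print(df.head())  # head() shows the first rows (five by default)
```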
Writing to CSV
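A DataFrame can be saved with to_csv. This sketch writes to a temporary directory so it does not touch existing files; the file name people.csv is an assumption.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [25, 30]})

path = os.path.join(tempfile.mkdtemp(), "people.csv")

# index=False leaves the row index out of the file.
df.to_csv(path, index=False)

with open(path) as f:
    text = f.read()
print(text)
```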
Reading Excel Files
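Excel files are read with pd.read_excel, which relies on an optional engine such as openpyxl for .xlsx files. The sketch below first creates a small workbook so there is something to read; the file and sheet names are hypothetical, and the example degrades gracefully if no engine is installed.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "people.xlsx")

try:
    # Create a small workbook first so there is something to read.
    pd.DataFrame({"name": ["Alice", "Bob"], "age": [25, 30]}).to_excel(
        path, index=False, sheet_name="People"
    )
    # sheet_name selects which sheet to load.
    df = pd.read_excel(path, sheet_name="People")
    print(df)
except ImportError:
    # Raised when no Excel engine (for example openpyxl) is installed.
    df = None
    print("Install an Excel engine such as openpyxl to run this example.")
```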
Writing to Excel
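Writing works through to_excel, and pd.ExcelWriter allows several sheets in one workbook. As with reading, an Excel engine such as openpyxl must be installed; the file and sheet names below are placeholders.

```python
import os
import tempfile

import pandas as pd

people = pd.DataFrame({"name": ["Alice", "Bob"], "age": [25, 30]})
scores = pd.DataFrame({"name": ["Alice", "Bob"], "score": [90, 85]})

path = os.path.join(tempfile.mkdtemp(), "report.xlsx")

try:
    # One writer, two sheets in the same workbook.
    with pd.ExcelWriter(path) as writer:
        people.to_excel(writer, sheet_name="People", index=False)
        scores.to_excel(writer, sheet_name="Scores", index=False)
    written = os.path.exists(path)
except ImportError:
    # Raised when no Excel engine (for example openpyxl) is installed.
    written = False

print("workbook written:", written)
```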
5. Data Types and Missing Values
Checking Data Types
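Each column has its own data type. A short sketch, with invented columns, of inspecting and converting types:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob"],
    "age": [25, 30],
    "height": [1.65, 1.80],
})

print(df.dtypes)  # data type of every column
df.info()         # types plus non-null counts per column

# astype converts a column to another type.
df["age"] = df["age"].astype(float)
print(df["age"].dtype)
```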
Handling Missing Values
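Before fixing missing values it helps to find them. This sketch, using made-up data, marks missing entries with isna and counts them per column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", None],
    "age": [25, np.nan, 35],
})

print(df.isna())        # True wherever a value is missing
print(df.isna().sum())  # number of missing values in each column
```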
6. Indexing Methods: loc, iloc
Using loc (Label-based Indexing)
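A brief sketch of loc on a DataFrame with string row labels (the labels and columns are illustrative). Note that label slices include both endpoints.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Alice", "Bob", "Carol"], "age": [25, 30, 35]},
    index=["a", "b", "c"],
)

row = df.loc["b"]                   # one row, returned as a Series
age = df.loc["b", "age"]            # a single value: 30
subset = df.loc["a":"b", ["name"]]  # rows "a" through "b" inclusive

print(row)
print(age)
print(subset)
```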
Using iloc (Integer-based Indexing)
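The same kind of selection by position, sketched with iloc. Positions start at 0 and, unlike label slices, integer slices exclude the end position. The data is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Alice", "Bob", "Carol"], "age": [25, 30, 35]},
    index=["a", "b", "c"],
)

first = df.iloc[0]          # first row, regardless of its label
value = df.iloc[1, 1]       # second row, second column: 30
subset = df.iloc[0:2, 0:1]  # first two rows, first column only

print(first)
print(value)
print(subset)
```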
7. Boolean Indexing
Boolean indexing allows filtering rows based on conditions.
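A minimal sketch of the mechanism: a comparison produces a boolean Series (the mask), and indexing with that mask keeps only the rows where it is True. The threshold and data are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Alice", "Bob", "Carol"], "age": [25, 30, 35]})

# The comparison yields one True/False value per row.
mask = df["age"] > 28
print(mask)

# Indexing with the mask keeps the rows where it is True.
older = df[mask]
print(older)
```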
8. Selection Based on Conditions
Filtering Data
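A few common ways to filter rows, sketched on made-up data: an equality test, membership with isin, and the query method, which takes the condition as a string.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 30, 35],
    "city": ["Paris", "Lima", "Rome"],
})

in_paris = df[df["city"] == "Paris"]               # exact match
in_set = df[df["city"].isin(["Paris", "Rome"])]    # any of several values
older = df.query("age >= 30")                      # condition as a string

print(in_paris)
print(in_set)
print(older)
```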
Using Multiple Conditions
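Conditions are combined with & (and), | (or), and ~ (not). Each condition needs its own parentheses because these operators bind more tightly than comparisons. The data below is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 30, 35],
    "city": ["Paris", "Lima", "Paris"],
})

both = df[(df["age"] > 25) & (df["city"] == "Paris")]    # both must hold
either = df[(df["age"] < 26) | (df["city"] == "Lima")]   # at least one holds
not_paris = df[~(df["city"] == "Paris")]                 # negation

print(both)
print(either)
print(not_paris)
```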
9. Adding and Deleting Columns
Adding a New Column
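Assigning to a new column name creates the column. This sketch, with hypothetical product data, derives one column from two others and adds a second with assign, which returns a new DataFrame instead of modifying the original.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["pen", "book", "lamp"],
    "price": [1.5, 12.0, 30.0],
    "qty": [10, 2, 1],
})

# Column arithmetic is applied row by row.
df["total"] = df["price"] * df["qty"]

# assign returns a copy with the extra column.
flagged = df.assign(expensive=df["price"] > 10)

print(df)
print(flagged)
```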
Deleting a Column
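Two common options, sketched on illustrative data: drop returns a new DataFrame without the column, while del removes the column in place.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["pen", "book"],
    "price": [1.5, 12.0],
    "qty": [10, 2],
})

# drop leaves df unchanged and returns a trimmed copy.
trimmed = df.drop(columns=["qty"])

# del modifies df itself.
del df["price"]

print(trimmed.columns.tolist())
print(df.columns.tolist())
```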
10. Handling Missing Data
Filling Missing Values
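A sketch of two filling strategies on made-up data: fillna with a per-column replacement (here the column mean and a placeholder string, both arbitrary choices), and ffill, which carries the previous value forward.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "score": [90.0, np.nan, 70.0],
    "city": ["Paris", None, "Lima"],
})

# A dictionary gives each column its own fill value.
filled = df.fillna({"score": df["score"].mean(), "city": "unknown"})

# Forward fill repeats the last seen value.
forward = df["score"].ffill()

print(filled)
print(forward)
```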
Dropping Missing Values
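dropna removes rows or columns that contain missing values. The sketch below, on illustrative data, shows the default behavior, a restriction to one column with subset, and column-wise dropping with axis=1.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, np.nan, 35],
    "city": ["Paris", "Lima", None],
})

complete_rows = df.dropna()            # drop rows with any missing value
has_age = df.dropna(subset=["age"])    # only consider the age column
complete_cols = df.dropna(axis=1)      # drop columns with any missing value

print(complete_rows)
print(has_age)
print(complete_cols)
```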
11. Grouping and Aggregation
Grouping Data
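groupby splits the rows into groups that share a key value, after which a computation is applied to each group. A minimal sketch with invented team data:

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["A", "B", "A", "B"],
    "points": [10, 20, 30, 40],
})

# Sum the points within each team.
totals = df.groupby("team")["points"].sum()
print(totals)

# Iterating over a groupby yields (key, sub-DataFrame) pairs.
for team, group in df.groupby("team"):
    print(team, len(group))
```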
Aggregation
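Several statistics can be computed per group in one call with agg. The sketch shows a list of function names and the named-aggregation form, where the output column names (total, average) are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["A", "B", "A", "B"],
    "points": [10, 20, 30, 40],
})

# One output column per listed function.
summary = df.groupby("team")["points"].agg(["sum", "mean", "max"])

# Named aggregation: new_name=(source_column, function).
named = df.groupby("team").agg(
    total=("points", "sum"),
    average=("points", "mean"),
)

print(summary)
print(named)
```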
12. Merging and Joining DataFrames
Merging (Similar to SQL JOIN)
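pd.merge matches rows from two DataFrames on a shared key, and the how argument selects the join type. In this sketch the tables, the key emp_id, and the values are all made up.

```python
import pandas as pd

employees = pd.DataFrame({"emp_id": [1, 2, 3], "name": ["Alice", "Bob", "Carol"]})
salaries = pd.DataFrame({"emp_id": [2, 3, 4], "salary": [50000, 60000, 70000]})

# Inner join: only keys present in both tables.
inner = pd.merge(employees, salaries, on="emp_id", how="inner")

# Left join: every employee, with NaN where no salary matches.
left = pd.merge(employees, salaries, on="emp_id", how="left")

print(inner)
print(left)
```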
Concatenation
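pd.concat stacks DataFrames along an axis without matching on keys. A small sketch with two hypothetical tables:

```python
import pandas as pd

q1 = pd.DataFrame({"name": ["Alice"], "sales": [100]})
q2 = pd.DataFrame({"name": ["Bob"], "sales": [200]})

# Stack rows; ignore_index=True renumbers the result 0, 1, ...
stacked = pd.concat([q1, q2], ignore_index=True)

# axis=1 places the tables side by side instead.
side_by_side = pd.concat([q1, q2], axis=1)

print(stacked)
print(side_by_side)
```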
Conclusion
This tutorial covers the essential functionalities of Pandas, including creating DataFrames, reading and writing data, indexing, filtering, grouping, and merging data. Pandas is an incredibly powerful tool for data analysis, and mastering these concepts will help you work efficiently with structured data in Python.