Iterating over rows and columns in Python Pandas
Iterating over the rows and columns of a Pandas DataFrame is a common operation in data analysis, and pandas offers several ways to do it. Let's explore some of the common methods.
Iterating over Rows:
- Using iterrows(): This method returns an iterator that yields an (index, row) tuple for each row in the DataFrame. The row is a Series whose labels are the column names, so a value is read with row['column_name'].
import pandas as pd

data = {'Name': ['John', 'Jane', 'Bob', 'Alice'],
        'Age': [30, 25, 35, 28],
        'City': ['New York', 'Chicago', 'Chicago', 'Los Angeles']}
df = pd.DataFrame(data)

for index, row in df.iterrows():
    print(f"Index: {index}")
    print(f"Name: {row['Name']}")
    print(f"Age: {row['Age']}")
    print(f"City: {row['City']}")
    print("---------------")
Output:
Index: 0
Name: John
Age: 30
City: New York
---------------
Index: 1
Name: Jane
Age: 25
City: Chicago
---------------
Index: 2
Name: Bob
Age: 35
City: Chicago
---------------
Index: 3
Name: Alice
Age: 28
City: Los Angeles
---------------
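To make the shape of each iterrows() item concrete, here is a small sketch. It builds its own tiny DataFrames (people, numeric_df and the other variable names are illustrative and not part of the example above) and shows that each row arrives as a Series, which also means a row mixing integers and floats is converted to a single dtype.

```python
import pandas as pd

# Illustrative data, separate from the df used in the article.
people = pd.DataFrame({'Name': ['John', 'Jane'], 'Age': [30, 25]})

# Each item from iterrows() is an (index, row) pair; the row is a Series
# whose labels are the column names.
first_index, first_row = next(people.iterrows())
row_type = type(first_row).__name__
row_labels = list(first_row.index)
print(first_index, row_type, row_labels)

# Because a row is a single Series, mixed numeric columns share one dtype:
# the integer column comes back as a float here.
numeric_df = pd.DataFrame({'count': [1, 2], 'ratio': [0.5, 1.5]})
_, numeric_row = next(numeric_df.iterrows())
print(numeric_df['count'].dtype, numeric_row.dtype)
```

If the original column types matter inside the loop, this conversion is worth keeping in mind when reading values from row.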
- Using iterrows() with loc[]: This is the same loop as before, but instead of reading each value with row['column_name'], we use the row's loc[] indexer, row.loc['column_name']. Both forms return the same value.
for index, row in df.iterrows():
    print(f"Index: {index}")
    print(f"Name: {row.loc['Name']}")
    print(f"Age: {row.loc['Age']}")
    print(f"City: {row.loc['City']}")
    print("---------------")
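Another row iterator that pandas provides, not covered above, is itertuples(). The sketch below is a minimal illustration of it on the same sample data; the summaries list is just a hypothetical helper for collecting the formatted lines.

```python
import pandas as pd

data = {'Name': ['John', 'Jane', 'Bob', 'Alice'],
        'Age': [30, 25, 35, 28],
        'City': ['New York', 'Chicago', 'Chicago', 'Los Angeles']}
df = pd.DataFrame(data)

# itertuples() yields one namedtuple per row. The index is exposed as
# .Index and each column as an attribute named after the column.
summaries = []
for row in df.itertuples():
    summaries.append(f"{row.Index}: {row.Name}, {row.Age}, {row.City}")

for line in summaries:
    print(line)
```

Attribute access like row.Name only works when the column names are valid Python identifiers, so iterrows() may still be the simpler choice for columns with spaces in their names.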
Iterating over Columns:
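- Using items(): This method returns an iterator that yields a tuple containing the column name and the column data (a Series) for each column in the DataFrame. Older versions of pandas also offered the same method under the name iteritems(); that name was deprecated in pandas 1.5 and removed in pandas 2.0, so use items().
for column_name, column_data in df.items():
    print(f"Column Name: {column_name}")
    print(f"Column Data: {column_data.tolist()}")
    print("---------------")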
Output:
Column Name: Name
Column Data: ['John', 'Jane', 'Bob', 'Alice']
---------------
Column Name: Age
Column Data: [30, 25, 35, 28]
---------------
Column Name: City
Column Data: ['New York', 'Chicago', 'Chicago', 'Los Angeles']
---------------
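As a small illustration of what the column iterator is useful for, the sketch below uses items() (the current pandas spelling of the column iterator) to build a per-column summary on the same sample data. The unique_counts name is illustrative only.

```python
import pandas as pd

data = {'Name': ['John', 'Jane', 'Bob', 'Alice'],
        'Age': [30, 25, 35, 28],
        'City': ['New York', 'Chicago', 'Chicago', 'Los Angeles']}
df = pd.DataFrame(data)

# Each item is a (column name, Series) pair, so any Series method can be
# applied per column. Here we count the distinct values in each column.
unique_counts = {name: column.nunique() for name, column in df.items()}
print(unique_counts)
```

City has fewer distinct values than rows because 'Chicago' appears twice in the sample data.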
- Using a for loop: We can loop over the column names in df.columns and use the loc[] indexer to select each column's data.
for column_name in df.columns:
    print(f"Column Name: {column_name}")
    print(f"Column Data: {df.loc[:, column_name].tolist()}")
    print("---------------")
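When looping over df.columns, the shorter df[column_name] form selects the same column as df.loc[:, column_name]. The sketch below checks this on the sample data; same_data is a hypothetical name used only to record the comparison.

```python
import pandas as pd

data = {'Name': ['John', 'Jane', 'Bob', 'Alice'],
        'Age': [30, 25, 35, 28],
        'City': ['New York', 'Chicago', 'Chicago', 'Los Angeles']}
df = pd.DataFrame(data)

# Compare the two ways of selecting a column by name.
same_data = {}
for column_name in df.columns:
    by_brackets = df[column_name]
    by_loc = df.loc[:, column_name]
    same_data[column_name] = by_brackets.equals(by_loc)

print(same_data)
```

Which spelling to use is mostly a matter of taste here; loc[] becomes more useful when rows need to be selected at the same time.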
In summary, you can iterate over the rows of a Pandas DataFrame with iterrows(), and over its columns with items() (called iteritems() before pandas 2.0) or a plain for loop over df.columns. Choosing the appropriate method depends on the specific use case and the required output format.
Happy Learning!! Happy Coding!!