Iterating over rows and columns in Python Pandas
Iterating over the rows and columns of a Pandas DataFrame is a common operation in data analysis, and Pandas offers several ways to do it. Let's explore some of the common methods.
Iterating over Rows:
- Using iterrows(): This method returns an iterator that yields an (index, row) tuple for each row in the DataFrame, where row is a Series holding that row's data.
import pandas as pd

# Sample data used throughout the examples
data = {'Name': ['John', 'Jane', 'Bob', 'Alice'],
        'Age': [30, 25, 35, 28],
        'City': ['New York', 'Chicago', 'Chicago', 'Los Angeles']}
df = pd.DataFrame(data)

# Each iteration yields the index label and the row as a Series
for index, row in df.iterrows():
    print(f"Index: {index}")
    print(f"Name: {row['Name']}")
    print(f"Age: {row['Age']}")
    print(f"City: {row['City']}")
    print("---------------")
Output:
Index: 0
Name: John
Age: 30
City: New York
---------------
Index: 1
Name: Jane
Age: 25
City: Chicago
---------------
Index: 2
Name: Bob
Age: 35
City: Chicago
---------------
Index: 3
Name: Alice
Age: 28
City: Los Angeles
---------------
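One thing worth knowing about iterrows() is that each row comes back as a single Series, so values from columns with different dtypes are converted to one common dtype. The sketch below illustrates this with a small hypothetical frame (the names mixed, count and price are assumptions made for the example, not part of the data above):

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column
mixed = pd.DataFrame({'count': [1, 2], 'price': [1.5, 2.5]})

# Take only the first (index, row) pair from the iterator
first_index, first_row = next(mixed.iterrows())

# The row is a Series with a single dtype, so the integer
# from 'count' has been upcast to a float
print(type(first_row))
print(first_row.dtype)
print(first_row['count'])
```

If the original dtypes matter, it is usually safer to read values from the DataFrame columns directly rather than from the row Series.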
- Using iterrows() with loc[]: This is the same loop as the previous method, but instead of accessing the column values using row['column_name'], we can use the row's loc[] indexer.
# row is a Series, so row.loc['Name'] looks up the value by label
for index, row in df.iterrows():
    print(f"Index: {index}")
    print(f"Name: {row.loc['Name']}")
    print(f"Age: {row.loc['Age']}")
    print(f"City: {row.loc['City']}")
    print("---------------")
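As an alternative not covered above, Pandas also provides itertuples(), which yields each row as a named tuple instead of a Series and is generally faster than iterrows(). A minimal sketch, assuming the same sample data and using a hypothetical ages_by_name dictionary to collect the results:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Bob', 'Alice'],
                   'Age': [30, 25, 35, 28],
                   'City': ['New York', 'Chicago', 'Chicago', 'Los Angeles']})

# Hypothetical lookup table built while iterating
ages_by_name = {}

# Each row is a named tuple: the index is available as row.Index
# and each column as an attribute named after the column
for row in df.itertuples():
    ages_by_name[row.Name] = row.Age
    print(f"Index: {row.Index}, Name: {row.Name}, City: {row.City}")

print(ages_by_name)
```

Attribute access like row.Name only works when the column names are valid Python identifiers, which is the case for this sample data.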
Iterating over Columns:
- Using items(): This method returns an iterator that yields a tuple containing the column name and the column data (as a Series) for each column in the DataFrame. It was formerly called iteritems(), which was deprecated in pandas 1.5 and removed in pandas 2.0.
# items() replaces the removed iteritems()
for column_name, column_data in df.items():
    print(f"Column Name: {column_name}")
    print(f"Column Data: {column_data.tolist()}")
    print("---------------")
Output:
Column Name: Name
Column Data: ['John', 'Jane', 'Bob', 'Alice']
---------------
Column Name: Age
Column Data: [30, 25, 35, 28]
---------------
Column Name: City
Column Data: ['New York', 'Chicago', 'Chicago', 'Los Angeles']
---------------
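Because each column arrives as a Series, any Series method can be applied inside the loop. The following sketch, which assumes the same sample data and uses a hypothetical unique_counts dictionary, counts the distinct values in each column:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Bob', 'Alice'],
                   'Age': [30, 25, 35, 28],
                   'City': ['New York', 'Chicago', 'Chicago', 'Los Angeles']})

# Map each column name to its number of distinct values
unique_counts = {}
for column_name, column_data in df.items():
    unique_counts[column_name] = column_data.nunique()

print(unique_counts)
```

Here 'City' has fewer distinct values than rows because 'Chicago' appears twice.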
- Using a for loop: We can use a for loop to iterate over the column names in df.columns and use the loc[] indexer to access the column data.
# df.loc[:, column_name] selects all rows of the given column
for column_name in df.columns:
    print(f"Column Name: {column_name}")
    print(f"Column Data: {df.loc[:, column_name].tolist()}")
    print("---------------")
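The loop over df.columns can also be written more briefly, since iterating over a DataFrame directly yields its column labels and df[column_name] selects the same column as df.loc[:, column_name]. A minimal sketch, assuming the same sample data (the labels list is only there to record what the loop visits):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Bob', 'Alice'],
                   'Age': [30, 25, 35, 28],
                   'City': ['New York', 'Chicago', 'Chicago', 'Los Angeles']})

labels = []
# Iterating over the DataFrame itself yields the column labels
for column_name in df:
    labels.append(column_name)
    # Bracket selection returns the column as a Series
    print(f"{column_name}: {df[column_name].tolist()}")

print(labels)
```

Which spelling to use is mostly a matter of style; df.columns makes the intent more explicit to readers.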
In summary, the rows and columns of a Pandas DataFrame can be iterated with methods such as iterrows() and items() (formerly iteritems()), or with a plain for loop over df.columns. Choosing the appropriate method depends on the specific use case and the required output format.
Happy Learning!! Happy Coding!!