Get the substring of a column in Pandas-Python

Original: https://www.geeksforgeeks.org/get-the-substring-of-the-column-in-pandas-python/

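Now we will see how to get a substring of every value in a column of a Pandas DataFrame. This kind of extraction is very useful when working with data. For example, suppose one column holds the first and last names of different people, and we need the first 3 letters of each name to create usernames.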

Example 1: We can loop over the rows of the column and compute the substring of each value.

# importing pandas as pd
import pandas as pd 

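# creating a dictionary
# (named data so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}

# converting the dictionary to a
# dataframe
df = pd.DataFrame.from_dict(data)

# storing first 3 letters of name
# df.loc[i, 'Name'] writes to the dataframe itself, whereas
# df.iloc[i].Name = ... may only modify a temporary copy
for i in range(len(df)):
    df.loc[i, 'Name'] = df.loc[i, 'Name'][:3]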

df

Output:

[Image: pandas-extract-substring-1]

Note: For more information, refer to Python Extracting rows using Pandas.
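The row-by-row loop of Example 1 can also be written without explicit indexing. The following is a minimal sketch that assumes the same `Name` column; the names `people` and `short_names` are illustrative and do not appear in the original article.

```python
import pandas as pd

# same sample data as in Example 1
people = pd.DataFrame({'Name': ["John Smith", "Mark Wellington",
                                "Rosie Bates", "Emily Edward"]})

# apply a plain Python slice to every value of the column
short_names = people['Name'].apply(lambda name: name[:3])

print(short_names.tolist())  # ['Joh', 'Mar', 'Ros', 'Emi']
```

Note that the lambda raises a TypeError if the column contains missing values (NaN), whereas the `str` accessor used in the following examples passes them through.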

Example 2: In this example, we will use [str.slice()](https://www.geeksforgeeks.org/python-pandas-series-str-slice/).

# importing pandas as pd
import pandas as pd 

# creating a dictionary
# (named data so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}

# converting the dictionary to a
# dataframe
df = pd.DataFrame.from_dict(data)

# storing first 3 letters of name as username
df['UserName'] = df['Name'].str.slice(0, 3)

df

Output:

[Image: pandas-extract-2]
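`str.slice()` also accepts `start`, `stop`, and `step` as keyword arguments, following the usual Python slicing rules. A short sketch of two variations, assuming the same sample names; the variables `names`, `last_three`, and `every_other` are illustrative only.

```python
import pandas as pd

names = pd.Series(["John Smith", "Mark Wellington",
                   "Rosie Bates", "Emily Edward"])

# a negative start counts from the end: last 3 characters
last_three = names.str.slice(start=-3)

# every second character of the first 4 characters
every_other = names.str.slice(start=0, stop=4, step=2)

print(last_three.tolist())   # ['ith', 'ton', 'tes', 'ard']
print(every_other.tolist())  # ['Jh', 'Mr', 'Rs', 'Ei']
```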

Example 3: We can also use the str accessor in a different way, with square brackets.

# importing pandas as pd
import pandas as pd 

# creating a dictionary
# (named data so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}

# converting the dictionary to a dataframe
df = pd.DataFrame.from_dict(data)

# storing first 3 letters of name as username
df['UserName'] = df['Name'].str[:3]

df

Output:

[Image: pandas-extract-21]
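Because the result of `str[...]` is again a Series, further string methods can be chained onto it, and missing values stay missing instead of raising an error. The sketch below assumes that usernames should be lowercase, which the original article does not state; `df_users` and the missing second name are made up for illustration.

```python
import pandas as pd

# hypothetical data with one missing name
df_users = pd.DataFrame({'Name': ["John Smith", None, "Rosie Bates"]})

# slice the first 3 letters, then lowercase them (assumed convention);
# the missing value is passed through as a missing value
df_users['UserName'] = df_users['Name'].str[:3].str.lower()

print(df_users)
```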

Example 4: We can also use str.extract for this task. In this example, we will store the last name of each person in the "LastName" column.

# importing pandas as pd
import pandas as pd 

# creating a dictionary
# (named data so the built-in dict is not shadowed)
data = {'Name': ["John Smith", "Mark Wellington",
                 "Rosie Bates", "Emily Edward"]}

# converting the dictionary to a dataframe
df = pd.DataFrame.from_dict(data)

# storing lastname of each person:
# \b(\w+)$ captures the last word of the string
df['LastName'] = df.Name.str.extract(r'\b(\w+)$',
                                     expand=True)

df

Output:

[Image: pandas-extract-substring-2]
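When the pattern contains named groups, `str.extract` returns one column per group, named after the group. A minimal sketch that splits the sample names into two parts; the group names `FirstName` and `LastName` and the variables `df_names` and `parts` are chosen here for illustration, and the pattern assumes exactly two words per name.

```python
import pandas as pd

df_names = pd.DataFrame({'Name': ["John Smith", "Mark Wellington",
                                  "Rosie Bates", "Emily Edward"]})

# each named group (?P<...>) becomes a column of the result
parts = df_names['Name'].str.extract(
    r'^(?P<FirstName>\w+)\s+(?P<LastName>\w+)$')

print(parts)
```

Rows that do not match the pattern (for example, a name with three words) get NaN in every extracted column.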