Python Pandas: Remove everything after a delimiter in a string


You can use pandas.Series.str.split just as you would use the ordinary split method. Split each string on '::' and index into the list that split creates:

>>> import pandas as pd
>>> df = pd.DataFrame({'text': ["vendor a::ProductA", "vendor b::ProductA", "vendor a::Productb"]})
>>> df
                 text
0  vendor a::ProductA
1  vendor b::ProductA
2  vendor a::Productb
>>> df['text_new'] = df['text'].str.split('::').str[0]
>>> df
                 text  text_new
0  vendor a::ProductA  vendor a
1  vendor b::ProductA  vendor b
2  vendor a::Productb  vendor a
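As a standalone script, the same idea might look like the sketch below. Two things here are assumptions added for illustration and are not part of the original answer: the extra row without a delimiter, and the n=1 argument, which stops splitting after the first '::' since only the leading part is needed.

```python
import pandas as pd

df = pd.DataFrame({
    "text": [
        "vendor a::ProductA",
        "vendor b::ProductA",
        "vendor a::Productb",
        "no delimiter",  # illustrative row, not in the original data
    ]
})

# Split on the first '::' only, then take element 0 of each resulting list.
df["text_new"] = df["text"].str.split("::", n=1).str[0]

result = df["text_new"].tolist()
print(result)
```

A string that does not contain the delimiter comes back unchanged, because splitting it yields a one-element list.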

Here is a non-pandas solution:

>>> df['text_new1'] = [x.split('::')[0] for x in df['text']]
>>> df
                 text  text_new text_new1
0  vendor a::ProductA  vendor a  vendor a
1  vendor b::ProductA  vendor b  vendor b
2  vendor a::Productb  vendor a  vendor a
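The plain-Python approach can also be written with str.partition, which returns the text before the first occurrence of the delimiter without building a full list of pieces. This is a minimal sketch; the helper name strip_after is hypothetical and does not appear in the original answer.

```python
def strip_after(value, delimiter="::"):
    """Return the part of `value` that precedes the first `delimiter`."""
    # partition returns (head, separator, tail); if the delimiter is absent,
    # head is the whole string and the other two parts are empty.
    head, _sep, _tail = value.partition(delimiter)
    return head


texts = ["vendor a::ProductA", "vendor b::ProductA", "vendor a::Productb"]
cleaned = [strip_after(t) for t in texts]
print(cleaned)
```

The resulting list can be assigned to a DataFrame column in the same way as the list comprehension above.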

Edit: Here is a step-by-step explanation of what happens in the pandas solution above:

# Select the pandas.Series object you want
>>> df['text']
0    vendor a::ProductA
1    vendor b::ProductA
2    vendor a::Productb
Name: text, dtype: object

# using pandas.Series.str allows us to implement "normal" string methods
# (like split) on a Series
>>> df['text'].str
<pandas.core.strings.StringMethods object at 0x110af4e48>

# Now we can use the split method to split on our '::' string. You'll see that
# a Series of lists is returned (just like what you'd see outside of pandas)
>>> df['text'].str.split('::')
0    [vendor a, ProductA]
1    [vendor b, ProductA]
2    [vendor a, Productb]
Name: text, dtype: object

# using the pandas.Series.str method, again, we will be able to index through
# the lists returned in the previous step
>>> df['text'].str.split('::').str
<pandas.core.strings.StringMethods object at 0x110b254a8>

# now we can grab the first item in each list above for our desired output
>>> df['text'].str.split('::').str[0]
0    vendor a
1    vendor b
2    vendor a
Name: text, dtype: object
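One practical difference between the two approaches is how they treat missing values. The sketch below assumes a Series containing a None entry, which is not part of the original example: the .str accessor passes the missing value through, whereas the list comprehension would raise an AttributeError when it calls split on None.

```python
import pandas as pd

# Illustrative data with a missing entry (an assumption for this example).
s = pd.Series(["vendor a::ProductA", None, "vendor b::ProductA"])

# Both .str steps skip the missing value instead of raising.
prefix = s.str.split("::").str[0]

print(prefix)
print(prefix.isna().tolist())
```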

I recommend looking at the pandas.Series.str documentation or, better yet, the "Working with text data" section of the pandas documentation.


Original URL: http://outofmemory.cn/zaji/5498845.html
