Search multiple strings for multiple words

I have a dataframe containing one sentence per row. I need to search these sentences for occurrences of certain words. This is how I currently do it:

    import pandas as pd

    p = pd.DataFrame({"sentence": ["this is a test", "yet another test", "now two tests", "test a", "no test"]})

    test_words = ["yet", "test"]
    p["word_test"] = ""
    p["word_yet"] = ""

    # Look up each word in each row, one cell at a time
    for i in range(len(p)):
        for word in test_words:
            # Single .loc[row, column] call avoids chained assignment
            p.loc[i, "word_" + word] = p.loc[i, "sentence"].find(word)

This works as intended, but is it possible to optimize it? It runs fairly slowly on large dataframes.

 


You can use the vectorized str.find, which searches the whole column at once instead of looping over rows:

    p['word_test'] = p.sentence.str.find('test')
    p['word_yet'] = p.sentence.str.find('yet')

               sentence  word_test  word_yet
    0    this is a test         10        -1
    1  yet another test         12         0
    2     now two tests          8        -1
    3            test a          0        -1
    4           no test          3        -1
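Since the question keeps its words in a test_words list, the same call can probably be wrapped in a short loop over that list rather than written out once per word. This is only a sketch reusing the question's own data and column names. It still loops in Python, but once per word rather than once per cell.

```python
import pandas as pd

p = pd.DataFrame({"sentence": ["this is a test", "yet another test",
                               "now two tests", "test a", "no test"]})
test_words = ["yet", "test"]

# One vectorized str.find pass over the whole column for each word.
# The result is the index of the first match, or -1 when the word is absent.
for word in test_words:
    p["word_" + word] = p["sentence"].str.find(word)

print(p)
```

Note that str.find matches substrings, so "test" is also found inside "tests" (row 2 above).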
