Searching For A Whole Word That Contains Leading Or Trailing Special Characters Like - And = Using Regex In Python

I am trying to find the position of a string (word) in a sentence. I am using the function below. This function works perfectly for most words, but for this string GLC-SX-

Solution 1:

You need to escape the key when looking for a literal string, and make sure to use unambiguous (?<!\w) and (?!\w) boundaries:

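import re

def get_start_end(self, sentence, key):
    # Escape the key so characters like - . = are matched literally,
    # and require that no word character touches it on either side.
    r = re.compile(r'(?<!\w){}(?!\w)'.format(re.escape(key)), re.I)
    # search() returns None when there is no match, so m.start() would raise then.
    m = r.search(sentence)
    start = m.start()
    end = m.end()
    return start, end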

The expression r'(?<!\w){}(?!\w)'.format(re.escape(key)) builds a regex like (?<!\w)abc\.def=(?!\w) from the keyword abc.def= (Python versions before 3.7 also escape the =, giving abc\.def\=, which matches the same text). The (?<!\w) lookbehind fails the match if there is a word character immediately to the left of the keyword, and the (?!\w) lookahead fails the match if there is a word character immediately to the right of it.
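As a rough illustration of the explanation above, the sketch below compares \b with the lookaround boundaries on a keyword that ends in a non-word character. The helper name find_span and the sample sentence are made up for this example, and unlike the answer's function it returns None when nothing matches instead of raising.

```python
import re

def find_span(sentence, key):
    """Return (start, end) of key as a whole word in sentence, or None."""
    pattern = r'(?<!\w){}(?!\w)'.format(re.escape(key))
    m = re.search(pattern, sentence, re.I)
    return (m.start(), m.end()) if m else None

sentence = "set abc.def= here"  # hypothetical sample text

# \b needs a word character on exactly one side of the position.
# "=" followed by a space has none, so the \b version finds nothing.
print(re.search(r'\b{}\b'.format(re.escape("abc.def=")), sentence))  # None

# The lookarounds only require that no word character is adjacent.
print(find_span(sentence, "abc.def="))  # (4, 12)

# "bc.def=" is preceded by the word character "a", so it is rejected.
print(find_span(sentence, "bc.def="))   # None
```

The lookaround version treats the start and end of the string, spaces, and punctuation the same way: anything that is not a word character counts as a boundary.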

Solution 2:

This is not an actual answer, but it may help you solve the problem.

You can print the dynamically built pattern to debug it.

import re

def get_start_end(sentence, key):
    # Build the pattern from the key and print it for inspection.
    r = re.compile(r'\b(%s)\b' % key, re.I)
    print(r.pattern)

sentence = "foo-bar is not foo=bar"

get_start_end(sentence, 'o-')
get_start_end(sentence, 'o=')

\b(o-)\b
\b(o=)\b

You can then test the printed pattern manually, for example at https://regex101.com/, to see whether it matches.
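The same check can also be run locally. This is a minimal sketch, assuming a hypothetical helper debug_pattern that prints the pattern together with the match result. It deliberately keeps the unescaped \b(%s)\b pattern from this answer so that the failing cases stay visible.

```python
import re

def debug_pattern(sentence, key):
    """Print the generated pattern and whether it matches; return the match."""
    r = re.compile(r'\b(%s)\b' % key, re.I)
    m = r.search(sentence)
    print(r.pattern, '->', m.span() if m else 'no match')
    return m

sentence = "foo-bar is not foo=bar"

debug_pattern(sentence, 'o-')       # no match: "o" is preceded by another word char
debug_pattern(sentence, 'foo')      # matches at (0, 3)
debug_pattern("foo- bar", 'foo-')   # no match: "-" then " " is not a \b boundary
```

The last call shows the case from the question: a key ending in a special character fails with \b whenever the next character is also a non-word character.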
