Regex Find String After Key Inside Quotes
Solution 1:
Code
"key"\s*:\s*"([^"]*)"
To match the possibility of escaped double quotes you can use the following regex:
"key"\s*:\s*"((?:(?<!\\)\\(?:\\{2})*"|[^"])*)"
This method ensures that an odd number of backslashes \ precedes the double quotation character ", so that \", \\\", \\\\\", etc. are valid, but \\", \\\\", \\\\\\" are not. An even number of backslashes simply produces literal backslash characters, so a double quotation character " preceded by an even number of backslashes terminates the string.
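As a quick check of the odd/even backslash rule, here is a minimal sketch that runs the escaped-quote pattern on two made-up inputs. The names ESCAPED and find_value and both sample strings are only illustrative and are not part of the original answer.

```python
import re

# Pattern from the answer above: a quote may appear inside the value only
# when it is preceded by an odd number of backslashes.
ESCAPED = re.compile(r'"key"\s*:\s*"((?:(?<!\\)\\(?:\\{2})*"|[^"])*)"')


def find_value(text):
    """Return capture group 1, or None when the key is not found."""
    match = ESCAPED.search(text)
    return match.group(1) if match else None


# One backslash (odd): the quote is escaped and stays inside the value.
escaped = r'"key": "say \"hi\" now", "other": "x"'
# Two backslashes (even): they are literal, so the quote ends the value.
terminated = r'"key": "path\\", "other": "x"'

print(find_value(escaped))
print(find_value(terminated))
```

The first call should keep the escaped quotes inside the captured value, while the second should stop right after the two backslashes.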
Matching both strings
If you're looking to match your second string as well, you can use either of the following regexes:
\bkey\b(?:"\s*:\s*|.*?)"([^"]*)"
\bkey\b(?:"\s*:\s*|.*?)"((?:(?<!\\)\\(?:\\{2})*"|[^"])*)"
Usage
import re
s = 'blahblah "key":"value","TargetCRS": "Target","TargetCRScode": "vertical Code","zzz": "aaaa" sadzxc "sss"'
r = re.compile(r'''"key"\s*:\s*"([^"]*)"''')
match = r.search(s)
if match:
    print(match.group(1))
Results
Input
blahblah "key":"value","TargetCRS": "Target","TargetCRScode": "vertical Code","zzz": "aaaa" sadzxc "sss"
blalasdl8ujd key [any_chars_here] "value", blabla asdw "alo":"ebobo", bla"www":"zzzz"
Output
String 1
- Match:
"key":"value"
- Capture group 1:
value
String 2 (when using one of the methods under Matching both strings)
- Match:
key [any_chars_here] "value"
- Capture group 1:
value
Explanation
- "key"  Match this literally
- \s*  Match any number of whitespace characters
- :  Match the colon character literally
- \s*  Match any number of whitespace characters
- "  Match the double quotation character literally
- ([^"]*)  Capture any character not present in the set (any character except the double quotation character ") any number of times into capture group 1
- "  Match the double quotation character literally
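The same breakdown can be kept next to the pattern itself by using Python's re.VERBOSE flag, which ignores whitespace in the pattern and allows # comments. This is only a sketch: the name KEY_VALUE and the sample string are made up for illustration.

```python
import re

# Same pattern as above, spread out with one comment per token.
KEY_VALUE = re.compile(r'''
    "key"      # the literal text "key", quotes included
    \s*        # any number of whitespace characters
    :          # the colon character
    \s*        # any number of whitespace characters
    "          # opening quote of the value
    ([^"]*)    # capture group 1: everything up to the next quote
    "          # closing quote of the value
''', re.VERBOSE)

sample = 'blahblah "key" :  "value","TargetCRS": "Target"'
match = KEY_VALUE.search(sample)
if match:
    print(match.group(0))  # the whole match
    print(match.group(1))  # only the captured value
```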
Matching both strings
- \b  Assert position as a word boundary
- key  Match this literally
- \b  Assert position as a word boundary
- (?:"\s*:\s*|.*?)  Match either of the following:
  - "\s*:\s*  Match " literally, any number of whitespace characters, : literally, then any number of whitespace characters
  - .*?  Match any character any number of times, but as few as possible
- "  Match this literally
- ([^"]*)  Capture any number of any character except " into capture group 1
- "  Match this literally
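To see the combined pattern handle both inputs, here is a small sketch that runs it on the two strings from the Results section. The names BOTH, s1, s2 and values are illustrative only.

```python
import re

# First of the two "matching both strings" patterns from the answer.
BOTH = re.compile(r'\bkey\b(?:"\s*:\s*|.*?)"([^"]*)"')

# String 1: quoted key followed by a colon.
s1 = 'blahblah "key":"value","TargetCRS": "Target","TargetCRScode": "vertical Code","zzz": "aaaa" sadzxc "sss"'
# String 2: bare key with arbitrary text before the quoted value.
s2 = 'blalasdl8ujd key [any_chars_here] "value", blabla asdw "alo":"ebobo", bla"www":"zzzz"'

# Both inputs contain the key, so search() finds a match in each.
values = [BOTH.search(s).group(1) for s in (s1, s2)]
print(values)
```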
Solution 2:
You can use the non-greedy quantifier .*? between the key and the value group:
key.*?"(.*?)"
Demo here.
Update
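You might wonder why it captures the colon : when the key is quoted. After key, the first " the pattern finds is the key's own closing quote, so the colon is the next thing between quotes. To avoid that, you can add an optional quote before key and a backreference \1 after it. Note that the value then ends up in capture group 2: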
("?)key\1.*?"(.*?)"
Another demo here.
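Here is a short sketch comparing the two patterns in Python. The names PLAIN and QUOTED are made up for illustration, and the sample strings are the quoted and unquoted inputs used elsewhere on this page.

```python
import re

PLAIN = re.compile(r'key.*?"(.*?)"')           # first version
QUOTED = re.compile(r'("?)key\1.*?"(.*?)"')    # optional quotes around key

s1 = 'blalasdl8ujd "key":"value", blblabla asdw "alo":"ebobo",blabla"www":"zzzz"'
s2 = 'blalasdl8ujd key [any_chars_here] "value", blabla asdw "alo":"ebobo", bla"www":"zzzz"'

# With a quoted key, the first version captures what sits between the
# key's closing quote and the value's opening quote.
print(PLAIN.search(s1).group(1))

# Group 1 of the second version is the optional quote, so the value
# is read from group 2.
print(QUOTED.search(s1).group(2))
print(QUOTED.search(s2).group(2))
```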
Solution 3:
Solution 4:
There's probably a somewhat more pythonic way to do this, but:
s1 = 'blalasdl8ujd "key":"value", blblabla asdw "alo":"ebobo",blabla"www":"zzzz"'
s2 = 'blalasdl8ujd key [any_chars_here] "value", blabla asdw "alo":"ebobo", bla"www":"zzzz"'

def getValue(string, keyName='key'):
    """Find next quoted value after a key that may or may not be quoted"""
    startKey = string.find(keyName)
    if startKey == -1:
        return None  # key not present
    # if key is quoted, adjust value search range to exclude its closing quote
    if startKey > 0 and string[startKey - 1] == '"':
        endKey = string.find('"', startKey)
    else:
        endKey = startKey + len(keyName)
    startValue = string.find('"', endKey + 1) + 1
    # search from startValue itself so an empty value "" is handled too
    return string[startValue:string.find('"', startValue)]

getValue(s1)  # 'value'
getValue(s2)  # 'value'
I was inspired by the elegance of this answer, but handling the quoted and unquoted cases makes it more than a 1-liner.
You can use a comprehension such as:
next(y[1][1:-1] for y in [[l for l in x.split(':')]
    for x in s1.split(',')] if 'key' in y[0]) # returns 'value' w/o quotes
But that won't handle the case of s2.