Regex Find String After Key Inside Quotes

Input: blalasdl8ujd 'key':'value', blblabla asdw 'alo':'ebobo',blabla'www':'zzzz' or blalasdl8ujd key [any_chars_here] 'value', blabla asdw 'alo':'ebobo', bla'www':'zzzz' I

Solution 1:

Code

"key"\s*:\s*"([^"]*)"

To match the possibility of escaped double quotes you can use the following regex:

"key"\s*:\s*"((?:(?<!\\)\\(?:\\{2})*"|[^"])*)"

This pattern treats a double quotation mark inside the value as escaped only when it is preceded by an odd number of backslashes: \", \\\", \\\\\", etc. are accepted as part of the value, but \\", \\\\", \\\\\\" are not. An even number of backslashes is just a run of escaped backslashes, so the " that follows them terminates the string.
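
The odd/even backslash rule can be checked with a short script. This is only a sketch: the two sample inputs are made up for illustration and are not from the question.

```python
import re

# Escape-aware pattern from above: a quote preceded by an odd number of
# backslashes is kept as part of the value.
pattern = re.compile(r'"key"\s*:\s*"((?:(?<!\\)\\(?:\\{2})*"|[^"])*)"')

# Hypothetical inputs (raw strings, so every backslash is literal).
escaped = r'x "key":"say \"hi\" now", "other":"y"'
even = r'x "key":"ends with backslash\\", "other":"y"'

escaped_value = pattern.search(escaped).group(1)
even_value = pattern.search(even).group(1)

print(escaped_value)  # say \"hi\" now
print(even_value)     # ends with backslash\\
```

In the second input the two backslashes form an escaped backslash, so the quote after them closes the value.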

Matching both strings

If you're looking to match your second string as well, you can use either of the following regexes:

\bkey\b(?:"\s*:\s*|.*?)"([^"]*)"
\bkey\b(?:"\s*:\s*|.*?)"((?:(?<!\\)\\(?:\\{2})*"|[^"])*)"

Usage

import re

s = 'blahblah "key":"value","TargetCRS": "Target","TargetCRScode": "vertical Code","zzz": "aaaa" sadzxc "sss"'
r = re.compile(r'''"key"\s*:\s*"([^"]*)"''')

match = r.search(s)
if match:
    print(match.group(1))

Results

Input

blahblah "key":"value","TargetCRS": "Target","TargetCRScode": "vertical Code","zzz": "aaaa" sadzxc "sss"
blalasdl8ujd key [any_chars_here] "value", blabla asdw "alo":"ebobo", bla"www":"zzzz"

Output

String 1

  • Match: "key":"value"
  • Capture group 1: value

String 2 (when using one of the methods under Matching both strings)

  • Match: key [any_chars_here] "value"
  • Capture group 1: value

Explanation

  • "key" Match this literally
  • \s* Match any number of whitespace characters
  • : Match the colon character literally
  • \s* Match any number of whitespace characters
  • " Match the double quotation character literally
  • ([^"]*) Capture any character not present in the set (any character except the double quotation character ") any number of times into capture group 1
  • " Match the double quotation character literally
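
The same breakdown can be written directly into the pattern with re.VERBOSE, which ignores whitespace in the pattern and allows comments. A minimal sketch, with a made-up sample string:

```python
import re

verbose_pattern = re.compile(r'''
    "key"     # the literal text "key", quotes included
    \s*:\s*   # a colon with optional whitespace on either side
    "         # opening quote of the value
    ([^"]*)   # group 1: every character up to the next double quote
    "         # closing quote of the value
''', re.VERBOSE)

sample = 'blahblah "key" :  "value","zzz": "aaaa"'  # hypothetical input
match = verbose_pattern.search(sample)
verbose_value = match.group(1) if match else None
print(verbose_value)  # value
```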

Matching both strings

  • \b Assert position as a word boundary
  • key Match this literally
  • \b Assert position as a word boundary
  • (?:"\s*:\s*|.*?) Match either of the following
    • "\s*:\s*
      • " Match this literally
      • \s* Match any number of whitespace characters
      • : Match this literally
      • \s* Match any number of whitespace characters
    • .*? Match any character any number of times, but as few as possible
  • " Match this literally
  • ([^"]*) Capture any number of any character except " into capture group 1
  • " Match this literally
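
As a rough check, the combined pattern can be run against both input shapes. The helper name extract_value is hypothetical and only used for this illustration:

```python
import re

both = re.compile(r'\bkey\b(?:"\s*:\s*|.*?)"([^"]*)"')

def extract_value(text):
    """Return the first quoted value after the word key, or None."""
    match = both.search(text)
    return match.group(1) if match else None

quoted_key = 'blahblah "key":"value","TargetCRS": "Target"'
bare_key = 'blalasdl8ujd key [any_chars_here] "value", blabla asdw "alo":"ebobo"'

print(extract_value(quoted_key))    # value
print(extract_value(bare_key))      # value
print(extract_value('monkey "x"'))  # None: \b rejects key inside a longer word
```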

Solution 2:

You can use the non-greedy quantifier .*? between the key and the value group:

key.*?"(.*?)"

Update

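With a quoted key, this pattern captures the colon (:) instead of the value. The first quote after key is the key's own closing quote, so the colon is the next thing between two quotes. To fix that, match an optional quote before key and require the same thing after it with a backreference. The value is then in capture group 2: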

("?)key\1.*?"(.*?)"

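A quick comparison of the two patterns in Python, using shortened copies of the question's two inputs, shows the difference in what each one captures:

```python
import re

s1 = 'blalasdl8ujd "key":"value", blblabla asdw "alo":"ebobo"'
s2 = 'blalasdl8ujd key [any_chars_here] "value", blabla asdw "alo":"ebobo"'

simple = re.compile(r'key.*?"(.*?)"')
with_quotes = re.compile(r'("?)key\1.*?"(.*?)"')

simple_s1 = simple.search(s1).group(1)
simple_s2 = simple.search(s2).group(1)
fixed_s1 = with_quotes.search(s1).group(2)
fixed_s2 = with_quotes.search(s2).group(2)

print(simple_s1)  # :
print(simple_s2)  # value
print(fixed_s1)   # value
print(fixed_s2)   # value
```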

Solution 3:

Check this:

.*(\"key\":\"(\w*)\")

The value is in capture group 2:

https://regex101.com/r/66ikH3/2
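
In Python this pattern could be used as sketched below. Note that \w* only matches letters, digits and underscores, and the pattern allows no whitespace around the colon, so the last two sample strings (made up for illustration) do not match:

```python
import re

pattern = re.compile(r'.*(\"key\":\"(\w*)\")')

s1 = 'blalasdl8ujd "key":"value", blblabla asdw "alo":"ebobo",blabla"www":"zzzz"'
found = pattern.search(s1)
print(found.group(2))  # value

# Hypothetical inputs this pattern does not handle.
with_space = pattern.search('x "key":"vertical Code"')
spaced_colon = pattern.search('x "key": "value"')
print(with_space, spaced_colon)  # None None
```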


Solution 4:

There's probably a somewhat more pythonic way to do this, but:

s1 = 'blalasdl8ujd "key":"value", blblabla asdw "alo":"ebobo",blabla"www":"zzzz"'
s2 = 'blalasdl8ujd key [any_chars_here] "value", blabla asdw "alo":"ebobo", bla"www":"zzzz"'


def getValue(string, keyName='key'):
    """Find next quoted value after a key that may or may not be quoted"""
    startKey = string.find(keyName)
    if startKey == -1:
        return None  # key not present
    endKey = startKey + len(keyName)
    # if key is quoted, start the value search just past its closing quote
    if startKey > 0 and string[startKey - 1] == '"':
        endKey = string.find('"', endKey) + 1
    startValue = string.find('"', endKey) + 1
    endValue = string.find('"', startValue)
    if startValue == 0 or endValue == -1:
        return None  # no complete quoted value after the key
    return string[startValue:endValue]

getValue(s1) #'value'
getValue(s2) #'value'

I was inspired by the elegance of this answer, but handling the quoted and unquoted cases makes it more than a 1-liner.

You can use a comprehension such as:

next(y[1][1:-1] for y in [x.split(':') for x in s1.split(',')]
     if 'key' in y[0]) # returns 'value' w/o quotes

But that won't handle the case of s2, which has no colon between the key and the value.
