python - Using pyparsing to parse a string made of boolean expressions
I use the pyparsing package to parse the following kinds of strings:
atomname * and atomindex 1,2,3
atomname xxx,yyy or atomtype rrr,sss
thiol
not atomindex 1,2,3
not (atomindex 4,5,6) or atomname *
Based on this parsing, I will link the matches to specific function calls that perform the selection of atoms.
All the selection keywords (atomname, atomindex, thiol, ...) are stored in a list (i.e. selkwds).
I tried this, but it failed:
    keyword = oneOf(selkwds, caseless=True).setParseAction(self.__parse_keyword)
    func_call = Forward()
    func_call << (keyword + commaSeparatedList).setParseAction(self.__parse_expression)
    func_call = operatorPrecedence(func_call, [(NOT, 1, opAssoc.RIGHT, self.__not),
                                               (AND, 2, opAssoc.LEFT , self.__and),
                                               (OR , 2, opAssoc.LEFT , self.__or)])
where self.__and, self.__or, self.__not, self.__parse_keyword and self.__parse_expression are methods that modify the tokens for a future eval of the transformed string.
Would you have any idea how to solve this?
Thanks a lot,
Eric
See the embedded comments in this modified version of your parser:
    from pyparsing import *

    selkwds = "atomname atomindex atomtype thiol".split()
    func_name = MatchFirst(map(CaselessKeyword, selkwds))
    NOT,AND,OR = map(CaselessKeyword,"not and or".split())
    keyword = func_name | NOT | AND | OR

    func_call = Forward()

    integer = Word(nums).setParseAction(lambda t: int(t[0]))
    alphaword = Word(alphas,alphanums)

    # you have to be specific about what kind of things can be an arg,
    # otherwise, an argless function call might process the next
    # keyword or boolean operator as an argument;
    # this kind of lookahead is commonly overlooked by those who
    # assume that the parser will try some kind of right-to-left
    # backtracking in order to implicitly find a token that could be
    # mistaken for the current repetition type; pyparsing is purely
    # left-to-right, and only does lookahead if you explicitly tell it to
    # I assume that a func_call could be a function argument, otherwise
    # there is no point in defining it as a Forward
    func_arg = ~keyword + (integer | func_call | alphaword)

    # add Groups to give structure to your parsed data - otherwise everything
    # just runs together - now every function call parses as 2 elements:
    # the keyword and a list of arguments (which may be an empty list,
    # but is still a list)
    func_call << Group(func_name + Group(Optional(delimitedList(func_arg) | '*')))

    # don't name this func_call, it is confusing with what you've
    # already defined above
    func_call_expr = operatorPrecedence(func_call,
                        [(NOT, 1, opAssoc.RIGHT),
                         (AND, 2, opAssoc.LEFT),
                         (OR , 2, opAssoc.LEFT)])
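To make the lookahead point concrete, here is a small side-by-side sketch. It is a cut-down illustration rather than the full grammar: the names greedy_call and guarded_call are made up for this comparison, only two selection keywords are defined, and arguments are limited to plain words.

```python
from pyparsing import (CaselessKeyword, Group, MatchFirst, Optional, Word,
                       alphanums, alphas, delimitedList)

# Reduced keyword set, just enough for the comparison.
func_name = MatchFirst([CaselessKeyword(k) for k in ("atomname", "thiol")])
AND = CaselessKeyword("and")
keyword = func_name | AND
alphaword = Word(alphas, alphanums)

# No lookahead: any word is accepted as an argument, including 'and'.
greedy_call = Group(func_name + Group(Optional(delimitedList(alphaword))))

# Explicit negative lookahead: a keyword is never taken as an argument.
guarded_call = Group(
    func_name + Group(Optional(delimitedList(~keyword + alphaword))))

text = "thiol and atomname xxx"
greedy = greedy_call.parseString(text).asList()
guarded = guarded_call.parseString(text).asList()
print(greedy)   # [['thiol', ['and']]]
print(guarded)  # [['thiol', []]]
```

In the first result the boolean operator has been swallowed as an argument of thiol, so nothing is left for the operator level of the grammar to work with. The second keeps thiol argless and leaves 'and' in the input for the surrounding expression.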
Let's test it out:
    tests = """\
        atomname * and atomindex 1,2,3
        atomname xxx,yyy or atomtype rrr,sss
        thiol
        not atomindex 1,2,3
        not (atomindex 4,5,6) or atomname *""".splitlines()

    for test in tests:
        print(test.strip())
        print(func_call_expr.parseString(test).asList())
        print()
Prints:
    atomname * and atomindex 1,2,3
    [[['atomname', ['*']], 'and', ['atomindex', [1, 2, 3]]]]

    atomname xxx,yyy or atomtype rrr,sss
    [[['atomname', ['xxx', 'yyy']], 'or', ['atomtype', ['rrr', 'sss']]]]

    thiol
    [['thiol', []]]

    not atomindex 1,2,3
    [['not', ['atomindex', [1, 2, 3]]]]

    not (atomindex 4,5,6) or atomname *
    [[['not', ['atomindex', [4, 5, 6]]], 'or', ['atomname', ['*']]]]
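Since the question's goal is to turn each match into an atom selection, here is one possible way to walk the nested lists shown above. This is only a sketch under assumptions: the ATOMS table, its field names, and the helpers select and evaluate are all hypothetical, and thiol is left out because the post does not say how it is defined. It works on the plain list structure, so it needs nothing from pyparsing.

```python
# Hypothetical atom table; the field names and values are made up for this sketch.
ATOMS = [
    {"index": 1, "name": "CA", "type": "C"},
    {"index": 2, "name": "SG", "type": "S"},
    {"index": 3, "name": "N", "type": "N"},
    {"index": 4, "name": "O", "type": "O"},
]
ALL_INDICES = {atom["index"] for atom in ATOMS}

# Map each selection keyword to the (assumed) atom field it filters on.
FIELD_FOR_KEYWORD = {"atomname": "name", "atomtype": "type", "atomindex": "index"}


def select(keyword, args):
    """Return the indices of atoms matched by one keyword and its argument list."""
    if args == ["*"]:  # wildcard selects every atom
        return set(ALL_INDICES)
    field = FIELD_FOR_KEYWORD[keyword]
    return {atom["index"] for atom in ATOMS if atom[field] in args}


def evaluate(node):
    """Recursively turn one parsed group into a set of atom indices."""
    head = node[0]
    if head == "not":  # ['not', operand]
        return ALL_INDICES - evaluate(node[1])
    if isinstance(head, str):  # [keyword, [args...]]
        return select(head, node[1])
    # [operand, op, operand, op, operand, ...] at a single precedence level
    result = evaluate(head)
    for op, operand in zip(node[1::2], node[2::2]):
        rhs = evaluate(operand)
        result = result & rhs if op == "and" else result | rhs
    return result


# Same shape as the output of parseString(...).asList() shown above.
parsed = [[["not", ["atomindex", [1, 2, 3]]], "or", ["atomname", ["SG"]]]]
print(sorted(evaluate(parsed[0])))  # [2, 4]
```

Walking the structure after parsing keeps the grammar free of side effects. The same logic could instead live in parse actions attached to the operatorPrecedence levels, which is closer to what the question attempted.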