Parsing by abduction
24 Feb 2018, 06:46 • logic, nlp
Parsing reveals the syntactic structure of a sentence. Abduction is inference to the best explanation. We show in this post that context-free parsing is a special case of abductive reasoning.
Context-free grammars consist of rules of the following flavour:\[ A \rightarrow \alpha\beta\dots \]
where on the left there's a non-terminal symbol A and on the right there's a sequence of symbols (terminal or non-terminal) which the symbol on the left expands to. In the course of parsing sequences of symbols are replaced by non-terminals according to rules. Grammars of natural languages are highly ambiguous, that is, most sentences have more than one corresponding parse.
In abductive reasoning, we have a set of defeasible rules such as \( p(x) \supset q(x) \) and inference happens by backchaining on these rules. We began with so-called observables (the input) and use abductive rules to "guess" which predicates might make the observables true. In the absence of a deductive proof, we have to assume some knowledge in order to "connect the dots". Formally, given a theory T and a set of observables O, a set I provides an abductive proof if\[ T \cup I \vdash O \]
In the case of parsing, the input sentence is our O. We have to represent the words (and phrases) as a chart, that is, using interword points. For example, for the sentence "Fido barks" we'd have\[ N(fido,1,2) \land V(bark,2,3) \]
We now need an abductive rule which parses an N followed by a V into an NP. The following first-order rule will do that:\[ NP(ab,x,z) \supset N(a,x,y) \land V(b,y,z) \]
Unification of terms ensures that only adjacent words are considered. Backchaining on this rule assumes NP(ab,x,z) which represents an edge in the chart spanning the right-hand side of the rule.
Note that abductive inference is non-deterministic (many abductive proofs are typically possible) so an abductive prover gives us all possible parses according to the given grammar. Note also that what we have is an effectively propositional theory because the observables are all ground and backchaining on grammar rules yields only ground literals.