LasUIE:
Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model

1Sea-NExT Joint Lab, National University of Singapore,
2Wuhan University,   3Harbin Institute of Technology (Shenzhen)

Abstract

Universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM) has revealed great potential in recent studies, where the predictions of various IE tasks are unified into a linearized hierarchical expression under a GLM. Syntactic structure information, a type of effective feature that has been extensively utilized in the IE community, should also be beneficial to UIE. In this work, we propose a novel structure-aware GLM that fully unleashes the power of syntactic knowledge for UIE. A heterogeneous structure inductor is explored to unsupervisedly induce rich heterogeneous structural representations by post-training an existing GLM. In particular, a structural broadcaster is devised to compact various latent trees into explicit high-order forests, helping to guide better generation during decoding. We finally introduce a task-oriented structure fine-tuning mechanism, further adjusting the learned structures to best coincide with the end task's needs. Over 12 IE benchmarks across 7 tasks, our system shows significant improvements over the baseline UIE system. Further in-depth analyses show that our GLM learns rich task-adaptive structural bias that greatly resolves the UIE crux: the long-range dependency and boundary identification issues.


Method

1. Modeling Universal Information Extraction (UIE)

UIE has been proposed to unify all information extraction tasks in the NLP community, converting the structure prediction of IE tasks universally into sequence prediction via generative LMs. All IE jobs essentially revolve around predicting two key elements: <mention spans> and/or their <semantic relations>. In this project, we thus reduce all IE tasks into three prototypes: span extraction, pair extraction and hyper-pair extraction:



  • I) Span Extraction, e.g.,
    • named entity recognition (NER)
    • aspect-based sentiment analysis (ABSA)
    • aspect-term extraction (ATE)
  • II) Pair Extraction, e.g.,
    • relation extraction (RE)
    • aspect-opinion pair extraction (AOP)
    • aspect-based sentiment triplet extraction (ASTE)
  • III) Hyper-pair Extraction, e.g.,
    • event extraction (EE)
    • semantic role labeling (SRL)
    • opinion role labeling (ORL)

Under this scheme, mention spans are described with <Span> terms and their corresponding <Span Attribute> labels, while semantic relations are straightforwardly denoted with <Relation> labels. All IE structures are then cast into a sequential representation, the Linearized Hierarchical Expression (LHE). For example,


  • in span extraction:
    • { ( Span1 , Attr1 ) , ... , ( Spani , Attri ) , ... }
  • in pair extraction:
    • { ... , ( Spani , Attri [ Relk ] Spanj , Attrj ) , ... }
  • in hyper-pair extraction:
    • { ... , ( Spani , Attri [ Relk ] Spanj , Attrj [ Relm ] Spann , Attrn , ... ) , ... }
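To make the LHE format concrete, below is a minimal Python sketch that renders extracted records into the three expressions above. The record layout, helper names and example mentions are illustrative assumptions, not the official LasUIE implementation.

# Minimal sketch of rendering extracted records as LHE target strings.
# Record layout and helper names are illustrative assumptions only.

def render_record(span, attr, relations=()):
    """One "( Span , Attr [ Rel ] Span , Attr ... )" group; relations holds
    (rel_label, tail_span, tail_attr) triples and is empty for plain spans."""
    body = f"{span} , {attr}"
    for rel, tail_span, tail_attr in relations:
        body += f" [ {rel} ] {tail_span} , {tail_attr}"
    return f"( {body} )"

def render_lhe(groups):
    """Wrap all record groups into one LHE target string."""
    return "{ " + " , ".join(groups) + " }"

# Span extraction (e.g., NER): no relations attached.
print(render_lhe([render_record("Barack Obama", "person"),
                  render_record("Hawaii", "location")]))
# -> { ( Barack Obama , person ) , ( Hawaii , location ) }

# Pair extraction (e.g., RE): a relation linking two spans inside one group.
print(render_lhe([render_record("Barack Obama", "person",
                                [("born in", "Hawaii", "location")])]))
# -> { ( Barack Obama , person [ born in ] Hawaii , location ) }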


2. UIE with Structure-aware Generative Language Model

As cast above, UIE faces two key challenges common to all IE tasks:

  • Boundary identification of each span term (for UIE element I: mention spans);

  • Long-range dependence between different span terms (for UIE element II: semantic relations).


We thus propose addressing these two challenges by modeling both the syntactic dependency structure and the constituency structure: the constituency syntax mostly benefits boundary identification, while the dependency structure aids the long-range dependence problem.


To implement this idea, we propose learning a Latent Adaptive Structure-aware Generative Language Model for UIE, a.k.a. LasUIE.


  • Stage-I: unsupervised generic pre-training:
    • generally reusing an off-the-shelf, well-trained generative LM (GLM), e.g., BART or T5.
  • Stage-II: unsupervised structure-aware post-training:
    • a procedure newly introduced in this project, inserted between the pre-training and fine-tuning stages for structure learning.
  • Stage-III: supervised task-oriented structure fine-tuning:
    • a procedure newly introduced in this project, performed along with the task-specific fine-tuning.


2.1. Unsupervised structure-aware post-training

A heterogeneous structure inductor (HSI) module is used to unsupervisedly enrich the backbone GLM with sufficient structural knowledge, reinforcing its awareness of linguistic syntax.
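As a rough illustration of this idea (not the actual HSI architecture), one can score a latent soft dependency adjacency over the GLM encoder states and use it as a structural bias while continuing the GLM's original denoising objective. The layer and variable names below are assumptions made for illustration.

# Rough sketch: induce a latent soft dependency structure over encoder states
# during post-training. Illustrative assumption, not the actual HSI module.
import torch
import torch.nn as nn

class LatentDependencyInductor(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.head_proj = nn.Linear(hidden_size, hidden_size)  # token as head
        self.dep_proj = nn.Linear(hidden_size, hidden_size)   # token as dependent

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, hidden)
        heads = self.head_proj(enc_states)
        deps = self.dep_proj(enc_states)
        scores = torch.matmul(deps, heads.transpose(1, 2))    # (batch, seq, seq)
        # Soft adjacency: each token distributes attachment mass over candidate
        # heads; this matrix can then bias the encoder's self-attention.
        return scores.softmax(dim=-1)

# e.g., adjacency = LatentDependencyInductor(768)(encoder_hidden_states)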



2.2. Supervised task-oriented structure fine-tuning

We further adjust (fine-tune) the syntactic attributes within the GLM via a stochastic policy-gradient algorithm that directly takes the end-task performance as feedback, such that the learned structural features coincide best with the needs of the end task.
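A minimal REINFORCE-style sketch of this step is given below, assuming the log-probability of a sampled output structure and its end-task score (e.g., span/pair F1) are already available; all names are hypothetical placeholders rather than the paper's code.

# Minimal REINFORCE-style update: scale the sampled structure's log-probability
# by the end-task reward and take one gradient step. Placeholder names only.
import torch

def structure_finetune_step(log_prob: torch.Tensor, reward: float,
                            optimizer: torch.optim.Optimizer) -> torch.Tensor:
    """log_prob: summed log-probability of the sampled prediction (scalar tensor
    that requires grad); reward: end-task metric of that prediction, e.g., F1."""
    loss = -reward * log_prob  # policy-gradient (REINFORCE) objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.detach()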



Experiment

▶ The unified modeling of IE (i.e., UIE) is more effective than the traditional separate modeling of specific IE tasks.


▶ In either the separate or the unified IE setup, integrating additional linguistic syntax features into the GLM evidently improves performance on all end tasks. This shows that the syntactic structures in the GLM serve as IE task-invariant features, further contributing to UIE.


▶ On the span extraction type of IE (i.e., NER), the improvements from constituency syntax prevail, while dependency-type structural features dominate the pair-wise tasks, i.e., (hyper-)pair extraction. The constituency structure tends to offer key clues for boundary recognition, whereas dependency trees are more apt at relation detection, solving the long-range dependence issue. When the two are combined, all end tasks are enhanced to the greatest extent.



▶ It is necessary for LMs to automatically learn latent structure information for better UIE. The underlying reason for our model's improvements could be that the dynamically learned, richer structural knowledge in LasUIE largely avoids the noise introduced by external syntax parse annotations.


▶ The structure fine-tuning can indeed effectively adjust the learned structural information to be task-specific. Our system correctly learns the peculiar structural bias for a specific IE task.


BibTeX

@inproceedings{fei2022lasuie,
  author = {Fei, Hao and Wu, Shengqiong and Li, Jingye and Li, Bobo and Li, Fei and Qin, Libo and Zhang, Meishan and Zhang, Min and Chua, Tat-Seng},
  booktitle = {Advances in Neural Information Processing Systems},
  title = {LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model},
  pages = {15460--15475},
  year = {2022}
}