Experimenting word embeddings in assisting legal review
Ngoc Phuoc An Vo, Caroline Privault, et al.
ICAIL 2017
Building a system that can cope with the various phenomena that fall under the umbrella of semantic similarity is far from trivial. In almost all cases, a system's performance does not vary consistently or predictably from corpus to corpus. We analyzed the source of this variance and found that it is related to the word-pair similarity distribution among the topics in the various corpora. We then used this insight to construct a four-module system that takes into account not only string and semantic word similarity, but also word alignment and sentence structure. The system consistently achieves accuracy that is very close to the state of the art or that sets a new state of the art. It is based on a multi-layer architecture and can handle heterogeneous corpora that may not have been generated by the same distribution.
Ngoc Phuoc An Vo, Simone Magnolini, et al.
SemEval 2015
Ngoc Phuoc An Vo, Irene Manotas, et al.
EMNLP 2022