Paralex is a system that learns to answer questions. It uses a large corpus of question paraphrases as its source of supervision, and it answers questions using a large database of facts extracted from the web.

To learn more about Paralex, read "Paraphrase-Driven Learning for Open Question Answering" (Fader et al., ACL 2013).


Paralex learns from a collection of 18 million question-paraphrase pairs scraped from WikiAnswers.


You can download the code and data used to train and evaluate Paralex. The download includes all of the data, including the WikiAnswers paraphrase corpus and the ReVerb database, and is large (11 GB uncompressed). Please read the README file before downloading to make sure it is what you need.


Paralex was developed by the following people at the University of Washington's Turing Center as part of the KnowItAll Project: