Paralex: Paraphrase-Driven Learning for Open Question Answering

About

Paralex is a system that learns to answer questions. It uses a large question-paraphrase corpus as its source of supervision, and answers questions using a massive database of facts extracted from the web.

To learn more about Paralex, read "Paraphrase-Driven Learning for Open Question Answering" (Fader et al., ACL 2013).

Data

Paralex learns from a collection of 18 million question-paraphrase pairs scraped from WikiAnswers.

Please read the README.txt file before downloading.
Download the data here: wikianswers-paraphrases-1.0.tar.gz (543MB compressed, about 4GB uncompressed)
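
Because the archive expands to several gigabytes, it can help to list its contents before unpacking. The following is a minimal sketch using Python's standard tarfile module. The helper names list_archive and extract_archive are hypothetical and are not part of the Paralex distribution, and nothing is assumed about the file layout inside the archive.

```python
import tarfile
from pathlib import Path


def list_archive(archive_path):
    """Return the member names of a tar archive without unpacking it.

    The "r:*" mode lets tarfile detect gzip or bzip2 compression itself.
    """
    with tarfile.open(archive_path, "r:*") as tar:
        return tar.getnames()


def extract_archive(archive_path, dest_dir):
    """Unpack the archive into dest_dir and return the resolved destination.

    Members whose paths would land outside dest_dir are rejected first.
    """
    dest = Path(dest_dir).resolve()
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return dest
```

For example, list_archive("wikianswers-paraphrases-1.0.tar.gz") would return the member names once the file has been downloaded to the working directory. Note that listing a compressed archive still reads through the whole file, so it may take a while.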

Code

You can download the code and data used to train and evaluate Paralex. The download contains all data (including the WikiAnswers paraphrase corpus and the ReVerb database) and is quite large (11GB uncompressed). Please read the README file before downloading to make sure that it is what you need.

Paralex Evaluation README
paralex-evaluation.tar.bz2 (3.4GB compressed, 11GB uncompressed)

Credits

Paralex was developed by the following people at the University of Washington's Turing Center as part of the KnowItAll Project:
