See our github page for all our projects.
REACH stands for Reading and Assembling Contextual
and Holistic Mechanisms from Text. In plain English, REACH is an
information extraction system for the biomedical domain, which aims to
read scientific literature and extract cancer signaling pathways.
REACH implements a fairly complete extraction pipeline, including:
recognition of biochemical entities (proteins, chemicals, etc.),
grounding them to known knowledge bases such as Uniprot, extraction of
BioPAX-like interactions, e.g., phosphorylation, complex assembly,
positive/negative regulations, and coreference resolution, for both
entities and interactions.
This aims to be a one-stop place for natural language (NL) processors
and information extraction (IE) tools.
This package includes Odin, our rule-based IE framework.
For generic text preprocessing, we currently provide Scala APIs for Stanford's CoreNLP and maltparser.
This package also includes a full-fledged Rhetorical Structure Theory (RST) discourse parser,
and a machine learning package containing implementations for common algorithms
for both classification and ranking.
This project investigates the data-driven connections between the language of food and multiple
ommunity characteristics, such as diabetes and overweight rate and geographic locale of authors.
We collected a corpus of over three million food-related posts from Twitter. Using these tweets,
we built a system that predicts various community characteristics, such as likelihood of a community to be more overweight
This archive contains the source code and some of the generated data from our Spinning Straw into Gold:
Using Free Text to Train Monolingual Alignment Models for Non-factoid Question Answering NAACL
This archive contains the source code and data from our Discourse Complements Lexical Semantics for
Non-factoid Answer Reranking ACL 2014 paper.