Enterprise mashup scenarios often involve feeds derived from
data created primarily for human consumption, such as email, news,
calendars, blogs, and web feeds. These data sources can test the
capabilities of current data mashup products, as the attributes
needed to perform joins, aggregations, and other operations are
often buried within unstructured feed text. Information extraction
technology is a key enabler in such scenarios, using annotators to
convert unstructured text into structured information that can
facilitate mashup operations.
Our demo presents the integration of SystemT, an information
extraction system from IBM Research, with IBM’s InfoSphere
MashupHub. We show how to build domain-specific annotators
with SystemT’s declarative rule language, AQL, and how to use
these annotators to combine structured and unstructured
information in an enterprise mashup.
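
As a rough illustration of the general idea only, the following Python sketch
uses a regular expression as a stand-in for an annotator: it extracts a company
name from unstructured feed text and uses that extracted attribute as the join
key against structured records. This is not SystemT, AQL, or the MashupHub API;
the feed items, the customer table, and the function names
(annotate_companies, join_feed_with_customers) are all hypothetical.

```python
import re

# Hypothetical feed items: free text only, no structured attributes.
FEED_ITEMS = [
    {"id": 1, "text": "Outage reported by Acme Corp, contact support for details."},
    {"id": 2, "text": "Globex renewed its support contract this morning."},
    {"id": 3, "text": "Reminder: quarterly review on Friday."},
]

# Hypothetical structured records, e.g. rows from a customer table.
CUSTOMERS = [
    {"name": "Acme Corp", "account_manager": "J. Rivera"},
    {"name": "Globex", "account_manager": "M. Chen"},
]


def annotate_companies(text, known_names):
    """Stand-in annotator: return the known company names mentioned in text."""
    names = sorted(known_names, key=len, reverse=True)  # prefer longer names
    if not names:
        return []
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    return [match.group(0) for match in pattern.finditer(text)]


def join_feed_with_customers(feed_items, customers):
    """Join feed items to customer rows on the extracted company name."""
    by_name = {row["name"]: row for row in customers}
    joined = []
    for item in feed_items:
        for company in annotate_companies(item["text"], by_name):
            joined.append({
                "feed_id": item["id"],
                "company": company,
                "account_manager": by_name[company]["account_manager"],
            })
    return joined


if __name__ == "__main__":
    for row in join_feed_with_customers(FEED_ITEMS, CUSTOMERS):
        print(row)
```

Feed items that mention no known company (item 3 here) simply drop out of the
join, which mirrors why the extraction step matters: without it, the feed text
offers no attribute to join on.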