Keywords

Data Extraction, Unstructured Web Documents, Database Technology

Abstract

Our demo shows how to extract and structure data found in data-rich, unstructured, multiple-record Web documents. Users may either apply pre-built extraction applications or build and apply their own. The demo is significant because it (1) attacks an important data-centric problem and (2) uses database technology to produce good results with minimal effort.

Original Publication Citation

D.M. Campbell, Y. Ding, D.W. Embley, K. Hewett, D.L. Jackman, S.S. Jeffries, Y.S. Jiang, D. Lewis, S.W. Liddle, D.W. Lonsdale, Y.-K. Ng, A.L. Peacock, D.J. Seer, R.D. Smith, S.H. Yau, M. Xu, and L. Xu (1999). A Robust Web Data-Extraction Technique With High Recall and Precision; BYU CS Data Extraction Group Technical Report (11 pages).

Document Type

Report

Publication Date

1999

Publisher

Brigham Young University

Language

English

College

Humanities

Department

Linguistics and English Language

University Standing at Time of Publication

Associate Professor

Included in

Linguistics Commons
