MIP-0907
Paper Description
BibTeX entry
@techreport{MIP-0907,
author={C. Schoenberg and B. Freitag},
title={{Extracting and Storing Document Metadata}},
institution={{Fakult{\"a}t f{\"u}r Informatik und Mathematik, Universit{\"a}t Passau}},
year={2009},
number={MIP-0907}
}
Abstract
This paper gives an overview of information extraction techniques, metadata storage practices, and metadata querying and transformation methods as they are employed in the context of the Verdikt research project.
As a major part of document verification, an abstract internal model of the document to be processed has to be generated. We describe the overall model-generating procedure with a focus on metadata extraction and storage. Good practice is presented and potential problems are discussed.