Abstract |
A new wrapper model for extracting text data from HTML documents is introduced. Kushmerick’s wrapper class (Kushmerick 2000) may fail when sufficiently long delimiters are not found. The wrapper class introduced in this paper partially overcomes this difficulty by using the tree structures of HTML documents. The problem of learning such a wrapper program from given text is considered. Moreover, we try to extend our wrapper to extract portions of HTML, not only text attributes.
|