For Web applications such as information extraction and information filtering, the original Web page must first be segmented into several appropriate information blocks as a preprocessing step for later processing.
Based on an analysis of the information extraction process and the structure of product Web pages, a model for extracting product supply information, built on the page's DOM tree, is established.
Analyzing the content structure of Web pages plays an important role in applications such as Web information retrieval, classification, and information extraction.