The proposed method partitions the record set of a relational table according to its decision attribute values, and then detects approximately duplicate records within each decision-attribute-value class.
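The partition-then-detect idea in the sentence above can be sketched as follows. This is a minimal illustration, not the method from the quoted work. The function names, the `decision_index` parameter, the 0.85 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Mean per-field string similarity of two records (an assumed measure)."""
    ratios = [SequenceMatcher(None, str(x), str(y)).ratio() for x, y in zip(a, b)]
    # Records with no fields to compare are treated as identical.
    return sum(ratios) / len(ratios) if ratios else 1.0


def find_approximate_duplicates(records, decision_index, threshold=0.85):
    """Return index pairs of records judged approximately duplicate.

    records        -- list of tuples, one tuple per table row
    decision_index -- position of the decision attribute (hypothetical parameter)
    threshold      -- minimum mean similarity to report a pair (assumed value)
    """
    # Step 1: partition record indices by decision attribute value.
    classes = defaultdict(list)
    for rec_id, rec in enumerate(records):
        classes[rec[decision_index]].append(rec_id)

    # Step 2: compare records only within the same class.
    pairs = []
    for ids in classes.values():
        for i, j in combinations(ids, 2):
            # The decision attribute is equal within a class, so skip it.
            a = [v for k, v in enumerate(records[i]) if k != decision_index]
            b = [v for k, v in enumerate(records[j]) if k != decision_index]
            if similarity(a, b) >= threshold:
                pairs.append((i, j))
    return pairs
```

Because comparisons happen only inside a class, the number of pairwise checks drops from all pairs in the table to the sum of pairs within each class. That reduction is the practical motivation for partitioning first.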
Data cleaning and transformation is an important research area in data warehousing, and detecting approximately duplicate database records is one of its technical difficulties.