开发成本主要取决于数据清理任务的复杂性。
The development costs depend largely on the complexity of the data cleansing task.
数据清理模式的重要方面是,其注重企业级的可重用性。
An important aspect of the data cleansing pattern is its focus on reusability at an enterprise level.
应用数据清理模式时,务必了解其如何影响以下非功能需求。
When applying the data cleansing pattern, it is important to understand how it impacts the following nonfunctional requirements.
数据清理模式的产品实现对输入数据支持的格式有很大变化。
Product implementations of the data cleansing pattern vary in the range of formats they can support for input data.
因此,数据清理服务器需要能够进行扩展,以处理大量数据。
Therefore, the data cleansing server needs to be able to scale in order to process large data volumes.
潜在收益和客户体验是通过将数据清理作为服务部署来实现的。
The potential revenue benefits and customer experience are realized by deploying data cleansing as a service.
图1显示了在传统上下文中应用数据清理模式的抽象体系结构。
Figure 1 illustrates the high-level architecture of applying the data cleansing pattern in the traditional context.
简单描述了此方法后,我们将了解应用数据清理模式的上下文。
After briefly describing the value of this approach, you'll learn the context in which the data cleansing pattern should be applied.
数据清理服务接受数据质量为未确定的数据作为输入。
The data cleansing service receives data with an undetermined level of data quality as input.
讨论数据清理、数据集成和变换、数据归约的方法。
Methods of data cleaning, data integration and transformation, and data reduction are discussed.
对大量数据应用数据清理模式与对各个记录应用此模式一样常见。
It is fairly common to apply the data cleansing pattern against sets of bulk data as well as individual records.
正如您可能想到的,设计是数据清理流程中最重要和最复杂的阶段。
As you might imagine, design is the most critical and complex phase in the data cleansing process.
数据清理的SOA上下文允许对各个请求字符串进行标准化和匹配。
The SOA context for data cleansing allows for standardization and matching of individual request strings.
转换需求越复杂,变化越多,运行时转换或数据清理服务器一定就越复杂。
The more complex and varied the transformation requirements, the more sophisticated the run-time transformation or the data cleansing server must be.
很多数据清理模式的实现都提供了成熟的工具来开发、测试和部署清理规则。
Many implementations of the data cleansing pattern provide sophisticated tools to develop, test, and deploy the cleansing rules.
从此上下文中来看,数据清理模式允许企业将其验证与匹配功能扩展到创建点。
Viewed in this context, the data cleansing pattern allows an enterprise to extend its capabilities for validation and matching to the point of creation.
数据清理活动(解析或值的分隔、标准化、匹配和生存)被指定为清理规则。
The data cleansing activities (parsing or separation of values, standardization, matching and survivorship) are specified as cleansing rules.
务必注意,数据清理模式经常与其他模式一起应用;图3中绿色框就是这样的例子。
It is important to note that the data cleansing pattern is often applied together with other patterns; the green boxes in Figure 3 are such an example.
数据清理模式的传统上下文是数据库层,经常将数据清理模式应用到此层。
The traditional context of the data cleansing pattern is the database layer, which is where it is most often applied.
它们还可提供各种信息处理功能,如通过分析和评分算法、数据清理规则等进行处理。
They also surface information processing capabilities such as the results of analytical and scoring algorithms, data cleansing rules, etc.
数据的预处理主要是进行数据清理、数据集成、数据转换、数据归约等操作。
Data preprocessing mainly consists of the following operations: data cleaning, data integration, data transformation, and data reduction.
无论计算机提供的数据清理算法构造如何巧妙,也只能解决数据问题当中非常小的一部分。
A computer algorithm for data cleansing, no matter how cleverly constructed, can only address a very small subset of data problems.
数据清理模式指定有关如何在输入时或稍后提高持久性数据的数据质量的建议性实践。
The data cleansing pattern specifies a recommended practice for how to improve the data quality of persistent data either at entry or later.
在保存信息前应用数据清理,可在输入点(如数据输入门户)将业务定义的验证机制包含进来。
Applying data cleansing before the information is persisted allows for the incorporation of business-defined validation mechanisms at the point of entry, such as in data entry portals.
数据整合模式通常与数据清理模式结合在一起,这样就可以在整合的过程中处理数据质量的问题。
The data consolidation pattern is often combined with the data cleansing pattern so that data-quality issues can be addressed during consolidation.
对于指定清理规则的开发人员或设计人员,有必要对要应用数据清理模式的数据源有足够的了解。
A developer or designer who specifies cleansing rules needs a sufficient understanding of the data sources to which the data cleansing pattern will be applied.
它们被放在任务实例的客户属性中,在进行补充数据清理时用于引用补充数据行(任务完成时)。
They are placed in the task instance's custom properties to be used to reference the supplemental data row when a cleanup of the supplemental data is done (when the task is completed).
文中提出了数据清理的一些方法,给出了通过灵敏度分析来进行因子筛选的一种算法。
This paper presents several methods for data cleaning and gives an algorithm for factor screening through sensitivity analysis.
数据清理转换是数据仓库中的一个重要研究领域,其技术难点之一是重复记录的识别。
Data cleaning and transformation is an important research area in data warehousing; one of its technical difficulties is identifying duplicate records.
数据清理模式的转换能力是专用的,重点是通过数据的标准化和匹配提高数据质量和完整性。
The transformation capabilities of the data cleansing pattern are specialized and focus upon improving data quality and integrity by standardizing and matching data.