D. Pyle

Fig. 4.

1 Continuous Redistribution

Continuous redistribution optimally flattens the distribution of a variable and produces, as nearly as the values in the variable permit, a completely flat, or rectangular, distribution. The effect of continuous redistribution is shown in Fig. 5. As Fig. 5 shows, perfect rectangularity is not always possible; it depends on the availability of sufficient discrete values in suitable quantities. Continuous redistribution achieves the greatest flattening that the values allow, and it increases the linearity of the relationships in a data set as much as is possible without distorting the multivariate distribution.
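The flattening idea can be illustrated with a small rank-based sketch. This is not the chapter's own procedure, only one simple way to approximate a rectangular distribution: each value is replaced by the centre of the block of ranks it occupies, scaled into (0, 1). The function name `redistribute` and the sample data are hypothetical. Tied values share one output value, which is why perfect rectangularity depends on how many distinct values the variable has.

```python
from collections import Counter


def redistribute(values):
    """Map each value to its mid-rank position in (0, 1).

    Assumption: a plain rank transform stands in for continuous
    redistribution; the chapter does not specify this algorithm.
    """
    n = len(values)
    counts = Counter(values)
    mapping = {}
    below = 0  # observations strictly smaller than the current value
    for v in sorted(counts):
        c = counts[v]
        mapping[v] = (below + c / 2) / n  # centre of the rank block for v
        below += c
    return [mapping[v] for v in values]


# Hypothetical, heavily skewed sample with two tied groups.
skewed = [1, 2, 2, 3, 10, 100, 1000, 1000]
flat = redistribute(skewed)
print(flat)
# → [0.0625, 0.25, 0.25, 0.4375, 0.5625, 0.6875, 0.875, 0.875]
```

The output preserves the order of the original values while spreading them nearly evenly over the unit interval; the repeated values 2 and 1000 leave gaps that no monotonic mapping can remove.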

3 Data Preparation and Preprocessing

D. Pyle

The problem for all machine learning methods is to find the most probable model for a data set representing some specific domain in the real world. Although there are several possible theoretical approaches to preparing data appropriately, the only current answer is to treat the problem as an empirical one: manipulate the data so as to make the discovery of the most probable model more likely in the face of the noise, distortion, bias, and error inherent in any real-world data set.

9. Arbitrary equal range binning in the CA data set

Relaxation:
  iterate until no change in bin boundary positioning
    select a contiguous pair of bins at random
    position the bin boundary optimally

Discussion of calculating the information gain at each potential splitting point would consume more space than is available here. It is discussed in detail in Chap. 11 of [3] and again in [4]. Also, to allow experimentation, a free demonstration binning tool is available that supports equal range, equal frequency, and LIL binning (footnote 4).
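The relaxation loop can be sketched in Python as follows. The chapter defers the information-gain calculation to [3] and [4], so the criterion used here is an assumption: the boundary between two contiguous bins is placed where the size-weighted Shannon entropy of a class label in those two bins is smallest. The names `entropy`, `pair_cost`, and `relax_bins`, the `seed` parameter, and the toy labels are all hypothetical. Bins are represented as cut indices into a list of labels already sorted by the variable being binned.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def pair_cost(labels, lo, cut, hi):
    """Size-weighted entropy of the bins labels[lo:cut] and labels[cut:hi]."""
    left, right = labels[lo:cut], labels[cut:hi]
    return len(left) * entropy(left) + len(right) * entropy(right)


def relax_bins(labels, cuts, seed=0):
    """Relax interior cut indices until no boundary moves.

    `labels` must be ordered by the variable being binned; `cuts` must be
    strictly increasing indices with 0 < cut < len(labels).
    """
    rng = random.Random(seed)
    cuts = list(cuts)
    changed = True
    while changed:  # iterate until no change in bin boundary positioning
        changed = False
        order = list(range(len(cuts)))
        rng.shuffle(order)  # visit contiguous bin pairs in random order
        for i in order:
            edges = [0] + cuts + [len(labels)]
            lo, hi = edges[i], edges[i + 2]  # outer edges of the bin pair
            # Assumed criterion: the cut with the lowest weighted entropy.
            best = min(range(lo + 1, hi),
                       key=lambda c: pair_cost(labels, lo, c, hi))
            current = pair_cost(labels, lo, cuts[i], hi)
            if pair_cost(labels, lo, best, hi) < current - 1e-12:
                cuts[i] = best  # move only on a strict improvement
                changed = True
    return cuts


# Toy class labels, sorted by the input variable, with arbitrary start cuts.
labels = list("aaaabbbbbbcc")
print(relax_bins(labels, [6, 9]))  # → [4, 10]
```

Because a boundary moves only when the cost strictly decreases, the total weighted entropy falls with every move and the loop terminates. In the toy run the arbitrary starting cuts settle on the positions where the class label changes.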

