The role of centering (subtracting the mean)
For your question: subtract the mean from each measurement,
and then run the regression on the centered data, rather than centering and then aggregating.
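The purpose of centering is to put variables on a common footing. Different variables are measured in different units, and those differences can distort many statistics. Strictly speaking, centering only removes each variable's offset; it is the division by the standard deviation (standardization) that removes the units.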
First calculate the mean of each variable, then subtract that mean from every value of the variable.
This completes the centering of the variables.
For example, if the data set is 1, 2, 3, 6, 3, the mean is 3, so the centered data set is 1-3, 2-3, 3-3, 6-3, 3-3, that is: -2, -1, 0, 3, 0. The centered values sum to 0, so their mean is 0.
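The worked example can be checked with a few lines of Python. This is only a minimal sketch; the function name center is made up for illustration and is not from any library.

```python
def center(values):
    """Subtract the mean of `values` from every element (hypothetical helper)."""
    m = sum(values) / len(values)   # arithmetic mean
    return [v - m for v in values]

data = [1, 2, 3, 6, 3]              # the data set from the example
centered = center(data)
print(centered)                     # [-2.0, -1.0, 0.0, 3.0, 0.0]
```

The values come out as floats because the mean is computed by division, but they match -2, -1, 0, 3, 0.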
Data standardization means subtracting the mean from each value and then dividing the result by the standard deviation.
Data centering means only subtracting the mean from each value of the variable.
In regression analysis, the point of centering and standardizing the data is to remove errors caused by differing units, by each variable's own variability, or by large differences in magnitude between values.
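To see what centering does in a simple regression, here is a rough sketch in plain Python. The helper fit_line and the y values are hypothetical, chosen only for illustration. In a one-predictor least-squares fit, centering the predictor leaves the slope unchanged and makes the intercept equal to the mean of y.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b). Hypothetical helper."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept
    return a, b

x = [1, 2, 3, 6, 3]                    # predictor values
y = [2.1, 3.9, 6.2, 11.8, 6.0]         # made-up responses, for illustration only

mean_x = sum(x) / len(x)
x_centered = [xi - mean_x for xi in x]  # centered predictor

a_raw, b_raw = fit_line(x, y)
a_cen, b_cen = fit_line(x_centered, y)

print("slope, raw vs centered:", b_raw, b_cen)
print("intercept, raw vs centered:", a_raw, a_cen)
```

After centering, the intercept reads as the predicted y at the average x, which is often easier to interpret than the predicted y at x = 0.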
The purpose of centering and standardization is to remove differences in scale between features, so that different features share the same scale and have a comparable influence on the fitted parameters. In short, when the features of the original data are measured on inconsistent scales (units), the data should be preprocessed by centering and standardizing.
Further information:
Because the independent variables in the original data often have different units, analysis becomes more difficult, and with a large amount of data the computed results may be poor due to rounding error. Centering and standardization help to eliminate the influence of differing units and orders of magnitude and to avoid unnecessary errors.
In regression analysis, it is usually necessary to center and standardize the original data. After centering and standardization, the data have a mean of 0 and a standard deviation of 1.
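A minimal sketch of standardization using only the Python standard library is given below. The function name standardize is made up for illustration, and the use of the population standard deviation (statistics.pstdev) is an assumption; with the sample standard deviation (statistics.stdev) the result would differ by a constant factor.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return (v - mean) / standard deviation for each v (hypothetical helper).

    Assumes the population standard deviation; raises if all values are equal.
    """
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("standard deviation is 0; cannot standardize")
    return [(v - m) / s for v in values]

data = [1, 2, 3, 6, 3]
z = standardize(data)

print(z)
print("mean:", mean(z), "std:", pstdev(z))  # approximately 0 and 1
```

Because of floating-point rounding, the printed mean and standard deviation may differ from 0 and 1 in the last few digits.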
