Coding a dataset means transforming every value by adding (or subtracting) the same constant. If the original data are \(x_1, x_2, \dots, x_n\), choose a convenient constant \(a\) and define
\[ y_i = x_i - a \quad\text{(or } y_i = x_i + c \text{ with } c=-a\text{).} \]
We often pick \(a\) close to the centre of the data (e.g. a round number near the mean) to make arithmetic simpler.
Code with \(y_i=x_i-100\):
\(x:\; 98,\,101,\,103,\,97,\,101\) \(y:\; -2,\, 1,\, 3,\,-3,\, 1\)
\(\sum y = (-2)+1+3+(-3)+1 = 0 \Rightarrow \bar y = 0/5 = 0.\) Decode: \(\bar x = \bar y + 100 = \boxed{100}.\)
Note: Standard deviation is unchanged by the shift.
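The example above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only, and the population standard deviation is used here as an assumption (the sample version behaves the same way under a shift).

```python
import math
from statistics import mean, pstdev

x = [98, 101, 103, 97, 101]
a = 100                          # coding constant from the example

y = [xi - a for xi in x]         # coded values
y_bar = mean(y)                  # mean of the coded values
x_bar = y_bar + a                # decode: add the constant back

print(y)                         # [-2, 1, 3, -3, 1]
print(x_bar)                     # 100

# The shift moves the centre but not the spread.
same_spread = math.isclose(pstdev(x), pstdev(y))
print(same_spread)               # True
```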
Code with \(y = x - 15\): \(x:\; 12, 15, 17, 19, 22\) gives \(y:\; -3, 0, 2, 4, 7\).
Median(\(x\)) = 17 \(\Rightarrow\) Median(\(y\)) = \(17-15=2\). Quartiles shift by 15; IQR is unchanged.
Original \(Q_1=15\), \(Q_3=22\) so IQR(\(x\)) = \(7\). Coded \(Q_1=0\), \(Q_3=7\) so IQR(\(y\)) = \(7\) (same).
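A short sketch of the same check in Python follows. The \(x\) values are recovered as \(y + 15\). Quartile conventions differ between textbooks and software, so the individual quartiles computed here (with the default "exclusive" method of `statistics.quantiles`) need not match the values quoted above; the point is that, under any single convention, each quartile shifts by 15 and the IQR does not change. The helper `iqr` is a hypothetical name introduced for illustration.

```python
from statistics import median, quantiles

y = [-3, 0, 2, 4, 7]
shift = 15
x = [yi + shift for yi in y]     # [12, 15, 17, 19, 22]

def iqr(data):
    """Interquartile range using the module's default quartile method."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1

print(median(x), median(y))      # 17 2
print(iqr(x) == iqr(y))          # True

# Each quartile of x sits exactly `shift` above the matching quartile of y.
diffs = [qx - qy for qx, qy in zip(quantiles(x, n=4), quantiles(y, n=4))]
print(diffs)                     # [15.0, 15.0, 15.0]
```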
Suppose midpoints \(m\): 40, 50, 60 with frequencies \(f\): 8, 14, 8. Choose origin \(a=50\), code \(y=m-a\): \(-10, 0, +10\).
\(\sum f = 30\), \(\sum fy = 8(-10) + 14(0) + 8(10) = 0\Rightarrow \bar y = 0.\) Decode: \(\bar x = \bar y + a = 50.\)
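The grouped calculation can be sketched in a few lines of Python. The list names are illustrative; the numbers are the midpoints and frequencies from the example.

```python
midpoints = [40, 50, 60]
freqs = [8, 14, 8]
a = 50                                    # chosen origin

coded = [m - a for m in midpoints]        # [-10, 0, 10]
n = sum(freqs)                            # total frequency
sum_fy = sum(f * y for f, y in zip(freqs, coded))

y_bar = sum_fy / n                        # coded mean
x_bar = y_bar + a                         # decoded mean

# Direct calculation without coding, for comparison.
direct = sum(f * m for f, m in zip(freqs, midpoints)) / n

print(n, sum_fy)                          # 30 0
print(x_bar, direct)                      # 50.0 50.0
```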
For \(x: 7, 9, 10\), take \(a=9\) so \(y=-2,0,1\). Compute \(s_y\) from these small numbers; because a shift does not change the spread, \(s_x = s_y\).
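As a rough illustration, the coded sums can be worked out by hand and then compared with a direct calculation. The text does not say whether the divisor is \(n\) or \(n-1\); this sketch assumes the sample standard deviation (divisor \(n-1\)), and the same equality holds for the population version.

```python
import math
from statistics import stdev

x = [7, 9, 10]
a = 9
y = [xi - a for xi in x]                  # [-2, 0, 1]

n = len(y)
sum_y = sum(y)                            # -1
sum_y2 = sum(yi * yi for yi in y)         # 5

# Sum of squared deviations via the shortcut: sum(y^2) - (sum y)^2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_y = math.sqrt(s_yy / (n - 1))           # assumed divisor: n - 1

# The shift leaves the spread unchanged, so this matches s_x.
print(math.isclose(s_y, stdev(x)))        # True
```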
More generally you can use linear coding: \[ y_i = \frac{x_i - a}{b}\qquad (b\neq 0). \]
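A brief derivation shows how the mean and standard deviation decode under this coding, for \(n\) values and either common definition of the standard deviation. Rearranging gives \(x_i = a + b\,y_i\), so
\[ \bar x = \frac{1}{n}\sum_{i=1}^{n}\bigl(a + b\,y_i\bigr) = a + b\,\bar y, \]
\[ x_i - \bar x = b\,(y_i - \bar y) \;\Rightarrow\; s_x^2 = b^2 s_y^2 \;\Rightarrow\; s_x = |b|\,s_y. \]
Setting \(b = 1\) recovers the pure shift, where the mean moves by \(a\) and the standard deviation is unchanged.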
Code \(y=\dfrac{x-200}{5}\) so that the coded values are small integers. Compute \(\bar y\) and \(s_y\), then decode: \(\bar x = 200 + 5\bar y,\; s_x = 5 s_y.\)
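The decode step can be tried out with a small sketch. The data below are made up purely for illustration (they do not come from the text), and the sample standard deviation is an assumed choice; only the constants 200 and 5 are taken from the example.

```python
import math
from statistics import mean, stdev

x = [190, 195, 200, 205, 215]             # hypothetical data, multiples of 5 near 200
a, b = 200, 5                             # coding constants from the example

y = [(xi - a) / b for xi in x]            # coded values: -2, -1, 0, 1, 3
y_bar = mean(y)
s_y = stdev(y)

x_bar = a + b * y_bar                     # decode the mean
s_x = b * s_y                             # decode the spread (b > 0 here)

print(math.isclose(x_bar, mean(x)))       # True
print(math.isclose(s_x, stdev(x)))        # True
```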