Repeated median regression

This is the current revision of this page, as edited by Peter.schild (talk | contribs) at 15:23, 2 April 2024 (Method: Details of direct vs hierarchical estimates of A_hat, both being mentioned in Siegels original paper.). The present address (URL) is a permanent link to this version.

In robust statistics, repeated median regression, also known as the repeated median estimator, is a robust linear regression algorithm. The estimator has a breakdown point of 50%.[1] Although it is equivariant under scaling, or under linear transformations of either its explanatory variable or its response variable, it is not equivariant under affine transformations that combine both variables.[1] It can be calculated in $O(n^2)$ time by brute force, in $O(n \log^2 n)$ time using more sophisticated techniques,[2] or in $O(n \log n)$ randomized expected time.[3] It may also be calculated using an on-line algorithm with $O(n)$ update time.[4]

Method

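The repeated median method estimates the slope of the regression line $y = A + Bx$ for a set of points $(X_i, Y_i)$ as

$$\widehat{B} = \underset{i}{\operatorname{median}} \; \underset{j \neq i}{\operatorname{median}} \; \operatorname{slope}(i, j)$$

where $\operatorname{slope}(i, j)$ is defined as $(Y_j - Y_i)/(X_j - X_i)$.[5]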
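The nested median for the slope can be illustrated with a minimal brute-force sketch in Python. The function name `repeated_median_slope` and the sample data are hypothetical, and skipping pairs with equal x-coordinates is an assumption made here to avoid division by zero; it is not taken from the cited sources.

```python
from statistics import median

def repeated_median_slope(xs, ys):
    """Brute-force O(n^2) sketch of the repeated median slope estimate."""
    n = len(xs)
    inner_medians = []
    for i in range(n):
        # Slopes of the lines through point i and every other point j.
        slopes = [
            (ys[j] - ys[i]) / (xs[j] - xs[i])
            for j in range(n)
            if j != i and xs[j] != xs[i]  # assumption: skip vertical pairs
        ]
        if slopes:
            inner_medians.append(median(slopes))
    # Outer median over the per-point inner medians.
    return median(inner_medians)

# Points on y = 1 + 2x, with the last observation replaced by an outlier.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 5, 7, 9, 11, 100]
print(repeated_median_slope(xs, ys))  # prints 2.0
```

In this example the single outlier changes only one inner median, so the outer median still recovers the slope of the uncontaminated line.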

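The estimated Y-axis intercept is defined as

$$\widehat{A} = \underset{i}{\operatorname{median}} \; \underset{j \neq i}{\operatorname{median}} \; \operatorname{intercept}(i, j)$$

where $\operatorname{intercept}(i, j)$, the Y-axis intercept of the line through the points $(X_i, Y_i)$ and $(X_j, Y_j)$, is defined as $(X_j Y_i - X_i Y_j)/(X_j - X_i)$.[5]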
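The direct intercept estimate follows the same nested-median pattern. The sketch below is only illustrative: the name `repeated_median_intercept` and the sample data are hypothetical, and pairs with equal x-coordinates are skipped by assumption.

```python
from statistics import median

def repeated_median_intercept(xs, ys):
    """Brute-force O(n^2) sketch of the direct repeated median intercept."""
    n = len(xs)
    inner_medians = []
    for i in range(n):
        # Y-axis intercepts of the lines through point i and each other point j.
        intercepts = [
            (xs[j] * ys[i] - xs[i] * ys[j]) / (xs[j] - xs[i])
            for j in range(n)
            if j != i and xs[j] != xs[i]  # assumption: skip vertical pairs
        ]
        if intercepts:
            inner_medians.append(median(intercepts))
    # Outer median over the per-point inner medians.
    return median(inner_medians)

# Points on y = 1 + 2x, with the last observation replaced by an outlier.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 5, 7, 9, 11, 100]
print(repeated_median_intercept(xs, ys))  # prints 1.0
```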

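A simpler and faster alternative to estimate the intercept $A$ is to use the slope estimate $\widehat{B}$ just obtained, thus:[5]

$$\widehat{A} = \underset{i}{\operatorname{median}} \left( Y_i - \widehat{B} X_i \right)$$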

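Note: The direct method (the repeated median of pairwise intercepts) and the hierarchical method (the median of $Y_i - \widehat{B} X_i$, which reuses the slope estimate) give slightly different values of $\widehat{A}$, with the hierarchical method normally being the better estimate. The hierarchical approach is identical to the method of estimating the intercept in Theil–Sen estimator regression.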
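The hierarchical intercept estimate reduces to a single median once a slope estimate is available. In the minimal sketch below, the function name `hierarchical_intercept`, its parameter `b_hat`, and the sample data are hypothetical; the slope is passed in rather than recomputed, so any slope estimate could be supplied.

```python
from statistics import median

def hierarchical_intercept(xs, ys, b_hat):
    """Median of the residual intercepts Y_i - b_hat * X_i (an O(n) pass plus a median)."""
    return median(y - b_hat * x for x, y in zip(xs, ys))

# Points on y = 1 + 2x, with the last observation replaced by an outlier.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 5, 7, 9, 11, 100]
b_hat = 2.0  # assumed slope estimate, e.g. from a repeated median slope computation
print(hierarchical_intercept(xs, ys, b_hat))  # prints 1.0
```

For this data set the outlier shifts only one residual intercept, so the median is unaffected.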

References

  1. Peter J. Rousseeuw, Nathan S. Netanyahu, and David M. Mount, "New Statistical and Computational Results on the Repeated Median Regression Estimator", in New Directions in Statistical Data Analysis and Robustness, edited by Stephan Morgenthaler, Elvezio Ronchetti, and Werner A. Stahel, Birkhäuser Verlag, Basel, 1993, pp. 177–194.
  2. Stein, Andrew; Werman, Michael (1992). "Finding the repeated median regression line". Proceedings of the Third Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '92). Philadelphia, PA, USA: Society for Industrial and Applied Mathematics. pp. 409–413. ISBN 0-89791-466-X.
  3. Matoušek, J.; Mount, D. M.; Netanyahu, N. S. (1998). "Efficient randomized algorithms for the repeated median line estimator". Algorithmica. 20 (2): 136–150. doi:10.1007/PL00009190. MR 1484533. S2CID 17362967.
  4. Bernholt, Thorsten; Fried, Roland (2003). "Computing the update of the repeated median regression line in linear time". Information Processing Letters. 88 (3): 111–117. doi:10.1016/s0020-0190(03)00350-8. hdl:2003/5224.
  5. Siegel, Andrew (September 1980). "Robust Regression Using Repeated Medians" (PDF). Technical Report No. 172, Series 2, Department of Statistics, Princeton University. Archived (PDF) from the original on July 28, 2018. Retrieved 20 February 2018.