Geometric median

Not to be confused with Median (geometry).

The geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points. This generalizes the median, which has the property of minimizing the sum of distances for one-dimensional data, and provides a central tendency in higher dimensions. It is also known as the 1-median,[1] spatial median,[2] Euclidean minisum point,[2] or Torricelli point.[3]

The geometric median is an important estimator of location in statistics,[4] where it is also known as the L1 estimator.[5] It is also a standard problem in facility location, where it models the problem of locating a facility to minimize the cost of transportation.[6]

The special case of the problem for three points in the plane (that is, m = 3 and n = 2 in the definition below) is sometimes also known as Fermat's problem; it arises in the construction of minimal Steiner trees, and was originally posed as a problem by Pierre de Fermat and solved by Evangelista Torricelli.[7] Its solution is now known as the Fermat point of the triangle formed by the three sample points.[8] The geometric median may in turn be generalized to the problem of minimizing the sum of weighted distances, known as the Weber problem after Alfred Weber's discussion of the problem in his 1909 book on facility location.[2] Some sources instead call Weber's problem the Fermat-Weber problem,[9] but others use this name for the unweighted geometric median problem.[10]

Wesolowsky (1993) provides a survey of the geometric median problem. See Fekete, Mitchell & Beurer (2005) for generalizations of the problem to non-discrete point sets.

1 Definition

Formally, for a given set of m points $x_1, x_2, \ldots, x_m$ with each $x_i \in \mathbb{R}^n$, the geometric median is defined as

$$\underset{y \in \mathbb{R}^n}{\operatorname{arg\,min}} \sum_{i=1}^{m} \left\| x_i - y \right\|_2$$

Here arg min denotes the value of the argument y that minimizes the sum. In this case, it is the point y for which the sum of the Euclidean distances to the $x_i$ is smallest.

2 Properties

For the 1-dimensional case, the geometric median coincides with the median. This is because the univariate median also minimizes the sum of distances from the points.[11]

The geometric median is unique whenever the points are not collinear.[12]

The geometric median is equivariant for Euclidean similarity transformations, including translation and rotation.[11][5] This means that one would get the same result either by transforming the geometric median, or by applying the same transformation to the sample data and finding the geometric median of the transformed data. This property follows from the fact that the geometric median is defined only from pairwise distances, and doesn't depend on the system of orthogonal Cartesian coordinates by which the sample data is represented. In contrast, the component-wise median for a multivariate data set is not in general rotation invariant, nor is it independent of the choice of coordinates.[5]

The geometric median has a breakdown point of 0.5.[5] That is, up to half of the sample data may be arbitrarily corrupted, and the median of the samples will still provide a robust estimator for the location of the uncorrupted data.

3 Special cases

For 3 (non-collinear) points, if any angle of the triangle formed by those points is 120° or more, then the geometric median is the vertex at that angle. If all the angles are less than 120°, the geometric median is the point inside the triangle which subtends an angle of 120° to each of the three pairs of triangle vertices.[11] This is also known as the Fermat point of the triangle formed by the three vertices. (If the three points are collinear then the geometric median is the point between the two other points, as is the case with a one-dimensional median.)

For 4 coplanar points, if one of the four points is inside the triangle formed by the other three points, then the geometric median is that point. Otherwise, the four points form a convex quadrilateral and the
geometric median is the crossing point of the diagonals of the quadrilateral. The geometric median of four coplanar points is the same as the unique Radon point of the four points.[13]

4 Computation

Despite the geometric median's being an easy-to-understand concept, computing it poses a challenge. The centroid or center of mass, defined similarly to the geometric median as minimizing the sum of the squares of the distances to each point, can be found by a simple formula (its coordinates are the averages of the coordinates of the points), but no such formula is known for the geometric median, and it has been shown that no explicit formula, nor an exact algorithm involving only arithmetic operations and kth roots, can exist in general. Therefore only numerical or symbolic approximations to the solution of this problem are possible under this model of computation.[14]

However, it is straightforward to calculate an approximation to the geometric median using an iterative procedure in which each step produces a more accurate approximation. Procedures of this type can be derived from the fact that the sum of distances to the sample points is a convex function, since the distance to each sample point is convex and the sum of convex functions remains convex. Therefore, procedures that decrease the sum of distances at each step cannot get trapped in a local optimum.

One common approach of this type, called Weiszfeld's algorithm after the work of Endre Weiszfeld,[15] is a form of iteratively re-weighted least squares. This algorithm defines a set of weights that are inversely proportional to the distances from the current estimate to the samples, and creates a new estimate that is the weighted average of the samples according to these weights. That is,

$$y_{i+1} = \left( \sum_{j=1}^{m} \frac{x_j}{\| x_j - y_i \|} \right) \Bigg/ \left( \sum_{j=1}^{m} \frac{1}{\| x_j - y_i \|} \right).$$

This method converges for almost all initial positions, but may fail to converge when one of its estimates falls on one of the given points. It can be modified to handle these cases so that it converges for all initial points.[12]

Bose, Maheshwari & Morin (2003) describe more sophisticated geometric optimization procedures for finding approximately optimal solutions to this problem. As Nie, Parrilo & Sturmfels (2008) show, the problem can also be represented as a semidefinite program.

5 Characterization of the geometric median

If y is distinct from all the given points, $x_j$, then y is the geometric median if and only if it satisfies:

$$0 = \sum_{j=1}^{m} \frac{x_j - y}{\| x_j - y \|}.$$

This is equivalent to:

$$y = \left( \sum_{j=1}^{m} \frac{x_j}{\| x_j - y \|} \right) \Bigg/ \left( \sum_{j=1}^{m} \frac{1}{\| x_j - y \|} \right),$$

which is closely related to Weiszfeld's algorithm.

In general, y is the geometric median if and only if there are vectors $u_j$ such that:

$$0 = \sum_{j=1}^{m} u_j$$

where for $x_j \ne y$,

$$u_j = \frac{x_j - y}{\| x_j - y \|}$$

and for $x_j = y$,

$$\| u_j \| \le 1.$$

An equivalent formulation of this condition is

$$\left\| \sum_{1 \le j \le m,\; x_j \ne y} \frac{x_j - y}{\| x_j - y \|} \right\| \le \left| \{\, j \mid 1 \le j \le m,\; x_j = y \,\} \right|.$$

6 Generalizations

The geometric median can be generalized from Euclidean spaces to general Riemannian manifolds (and even metric spaces) using the same idea which is used to define the Fréchet mean on a Riemannian manifold.[16] Let M be a Riemannian manifold with corresponding distance function $d(\cdot, \cdot)$, let $w_1, \ldots, w_n$ be n weights summing to 1, and let $x_1, \ldots, x_n$ be n observations from M. Then we define the weighted geometric median m (or weighted Fréchet median) of the data points as

$$m = \underset{x \in M}{\operatorname{arg\,min}} \sum_{i=1}^{n} w_i \, d(x, x_i).$$

If all the weights are equal, we say simply that m is the geometric median.
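
For the Euclidean case, where d(x, x_i) is the ordinary Euclidean distance, the weighted definition can be made concrete with a small Python sketch of the Weiszfeld iteration from the Computation section, using the usual weighted form of the update (each sample contributes w_j divided by its distance to the current estimate). The function name geometric_median, its parameters (weights, tol, max_iter, eps), the centroid starting point and the distance clamp are all assumptions of this sketch and do not come from the cited sources. In particular, the clamp is only a crude guard against division by zero; it is not the modification of Vardi & Zhang (2000), and the sketch may stop early if an estimate lands exactly on a sample point that is not the median.

```python
import math


def geometric_median(points, weights=None, tol=1e-9, max_iter=1000, eps=1e-12):
    """Approximate the weighted geometric median of Euclidean points.

    points  -- sequence of equal-length coordinate tuples
    weights -- optional positive weights (default: all equal)
    """
    n, dim = len(points), len(points[0])
    if weights is None:
        weights = [1.0 / n] * n

    # Start from the weighted centroid (a convenient, arbitrary choice).
    total = sum(weights)
    y = [sum(w * p[k] for w, p in zip(weights, points)) / total
         for k in range(dim)]

    for _ in range(max_iter):
        numerator = [0.0] * dim
        denominator = 0.0
        for w, p in zip(weights, points):
            # Clamp the distance so a sample equal to y cannot divide by zero.
            d = max(math.dist(p, y), eps)
            numerator = [a + w * c / d for a, c in zip(numerator, p)]
            denominator += w / d
        # New estimate: average of the samples weighted by w_j / distance.
        y_next = [a / denominator for a in numerator]
        if math.dist(y_next, y) < tol:
            return y_next
        y = y_next
    return y
```

Calling geometric_median([(0, 0), (4, 0), (0, 4), (1, 1)]) should return a point close to (1, 1), in line with the four-point special case in which one point lies inside the triangle formed by the other three. Pure standard-library Python is used so that the sketch runs without extra dependencies; an array library would be a natural choice for large data sets.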

7 Notes

[1] The more general k-median problem asks for the location of k cluster centers minimizing the sum of distances from each sample point to its nearest center.

[2] Drezner et al. (2002).

[3] Cieslik (2006).

[4] Lawera & Thompson (1993).

[5] Lopuhaä & Rousseeuw (1991).

[6] Eiselt & Marianov (2011).

[7] Krarup & Vajda (1997).

[8] Spain (1996).

[9] Brimberg (1995).

[10] Bose, Maheshwari & Morin (2003).

[11] Haldane (1948).

[12] Vardi & Zhang (2000).

[13] Cieslik (2006), p. 6; Plastria (2006). The convex case was originally proven by Giovanni Fagnano.

[14] Bajaj (1986); Bajaj (1988). Earlier, Cockayne & Melzak (1969) proved that the Steiner point for 5 points in the plane cannot be constructed with ruler and compass.

[15] Weiszfeld (1937); Kuhn (1973); Chandrasekaran & Tamir (1989).

[16] Fletcher, Venkatasubramanian & Joshi (2009).

8 References

Bajaj, C. (1986). "Proving geometric algorithms nonsolvability: An application of factoring polynomials". Journal of Symbolic Computation 2: 99-102. doi:10.1016/S0747-7171(86)80015-3.

Bajaj, C. (1988). "The algebraic degree of geometric optimization problems". Discrete and Computational Geometry 3: 177-191. doi:10.1007/BF02187906.

Bose, Prosenjit; Maheshwari, Anil; Morin, Pat (2003). "Fast approximations for sums of distances, clustering and the Fermat-Weber problem". Computational Geometry: Theory and Applications 24 (3): 135-146. doi:10.1016/S0925-7721(02)00102-5.

Brimberg, J. (1995). "The Fermat-Weber location problem revisited". Mathematical Programming 71 (1, Ser. A): 71-76. doi:10.1007/BF01592245. MR 1362958.

Chandrasekaran, R.; Tamir, A. (1989). "Open questions concerning Weiszfeld's algorithm for the Fermat-Weber location problem". Mathematical Programming. Series A 44: 293-295. doi:10.1007/BF01587094.

Cieslik, Dietmar (2006). Shortest Connectivity: An Introduction with Applications in Phylogeny. Combinatorial Optimization 17. Springer. p. 3. ISBN 9780387235394.

Cockayne, E. J.; Melzak, Z. A. (1969). "Euclidean constructability in graph minimization problems". Mathematics Magazine 42 (4): 206-208. doi:10.2307/2688541. JSTOR 2688541.

Drezner, Zvi; Klamroth, Kathrin; Schöbel, Anita; Wesolowsky, George O. (2002). Facility Location: Applications and Theory. Springer, Berlin. pp. 1-36. MR 1933966.

Eiselt, H. A.; Marianov, Vladimir (2011). Foundations of Location Analysis. International Series in Operations Research & Management Science 155. Springer. p. 6. ISBN 9781441975720.

Fekete, Sándor P.; Mitchell, Joseph S. B.; Beurer, Karin (2005). "On the continuous Fermat-Weber problem". Operations Research 53 (1): 61-76. arXiv:cs.CG/0310027. doi:10.1287/opre.1040.0137.

Fletcher, P. Thomas; Venkatasubramanian, Suresh; Joshi, Sarang (2009). "The geometric median on Riemannian manifolds with application to robust atlas estimation". Neuroimage 45 (1 Suppl): s143-s152. doi:10.1016/j.neuroimage.2008.10.052. PMC 2735114. PMID 19056498.

Haldane, J. B. S. (1948). "Note on the median of a multivariate distribution". Biometrika 35 (3-4): 414-417. doi:10.1093/biomet/35.3-4.414.

Krarup, Jakob; Vajda, Steven (1997). "On Torricelli's geometrical solution to a problem of Fermat". IMA Journal of Mathematics Applied in Business and Industry 8 (3): 215-224. doi:10.1093/imaman/8.3.215. MR 1473041.

Kuhn, Harold W. (1973). "A note on Fermat's problem". Mathematical Programming 4 (1): 98-107. doi:10.1007/BF01584648.

Lawera, Martin; Thompson, James R. (1993). Proceedings of the 38th Conference on the Design of Experiments. U.S. Army Research Office Report 93-2. pp. 99-126.

Lopuhaä, Hendrick P.; Rousseeuw, Peter J. (1991). "Breakdown points of affine equivariant estimators of multivariate location and covariance matrices". Annals of Statistics 19 (1): 229-248. doi:10.1214/aos/1176347978. JSTOR 2241852.

Nie, Jiawang; Parrilo, Pablo A.; Sturmfels, Bernd (2008). "Semidefinite representation of the k-ellipse". In Dickenstein, A.; Schreyer, F.-O.; Sommese, A.J. Algorithms in Algebraic Geometry. IMA Volumes in Mathematics and its Applications 146. Springer-Verlag. pp. 117-132. arXiv:math/0702005.

Ostresh, L. (1978). "Convergence of a Class of Iterative Methods for Solving Weber Location Problem". Operations Research 26 (4): 597-609. doi:10.1287/opre.26.4.597.

Plastria, Frank (2006). "Four-point Fermat location problems revisited. New proofs and extensions of old results". IMA Journal of Management Mathematics 17 (4): 387-396. doi:10.1093/imaman/dpl007. Zbl 1126.90046.

Spain, P. G. (1996). "The Fermat Point of a Triangle". Mathematics Magazine 69 (2): 131-133. JSTOR 2690672. MR 1573157.

Vardi, Yehuda; Zhang, Cun-Hui (2000). "The multivariate L1-median and associated data depth". Proceedings of the National Academy of Sciences of the United States of America 97 (4): 1423-1426 (electronic). doi:10.1073/pnas.97.4.1423. MR 1740461.

Weber, Alfred (1909). Über den Standort der Industrien, Erster Teil: Reine Theorie des Standortes. Tübingen: Mohr.

Wesolowsky, G. (1993). "The Weber problem: History and perspective". Location Science 1: 5-23.

Weiszfeld, E. (1937). "Sur le point pour lequel la somme des distances de n points donnés est minimum". Tohoku Mathematical Journal 43: 355-386.

9 Text and image sources, contributors, and licenses


9.1 Text
Geometric median Source: http://en.wikipedia.org/wiki/Geometric%20median?oldid=609251103 Contributors: Michael Hardy, Altenmann, Urhixidur, Rich Farmbrough, Art LaPella, O18, 3mta3, Rjwilmsi, LuisPedroCoelho, RussBot, Cheese Sandwich, Hirak 99, SmackBot, Kjetil1001, Mhym, Lambiam, JRSpriggs, Cydebot, Headbomb, David Eppstein, Flowanda, R'n'B, Noyder, Melcombe, FractalFusion, Armin Rigo, Sun Creator, Qwfp, Addbot, DOI bot, AnomieBOT, Citation bot 1, Gaba p, Kiefer.Wolfowitz, Miracle Pen, Duoduoduo, RjwilmsiBot, LelandFB, ClueBot NG and Anonymous: 17

9.2 Images

9.3 Content license


Creative Commons Attribution-Share Alike 3.0
