Statistics - Maple Programming Help


Statistics[RousseeuwCrouxQn]

compute Rousseeuw and Croux' Qn

 


Calling Sequence

RousseeuwCrouxQn(A, ds_options)

RousseeuwCrouxQn(X, rv_options)

Parameters

A - data set or Matrix data set

X - algebraic; random variable or distribution

ds_options - (optional) equation(s) of the form option=value where option is one of correction, ignore, or weights; specify options for computing Rousseeuw and Croux' Qn statistic of a data set

rv_options - (optional) equation of the form numeric=value; specifies options for computing Rousseeuw and Croux' Qn statistic of a random variable

Description

• The RousseeuwCrouxQn function computes a robust measure of the dispersion of the specified data set or random variable, as introduced by Rousseeuw and Croux in [2].

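• This statistic, referred to as Qn in the remainder of this help page, is defined for a sorted data set $A_1 \le A_2 \le \dots \le A_n$ as:

$$Q_n = \operatorname{OrderStatistic}\bigl([\operatorname{seq}(\operatorname{seq}(A_i - A_j,\ i = j+1 \,..\, n),\ j = 1 \,..\, n-1)],\ k\bigr)$$

where k is $\binom{\lfloor n/2 \rfloor + 1}{2}$.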

• Qn is a robust statistic: it has a high breakdown point (the proportion of arbitrarily large observations it can handle before giving an arbitrarily large result). The breakdown point of Qn is the maximum possible value, 1/2.

• Qn is a measure of dispersion, also called a measure of scale: if $Q_n(X) = a$, then for all real constants α and β, we have $Q_n(\alpha X + \beta) = |\alpha|\, a$.

• The first parameter can be a data set, a distribution (see Statistics[Distribution]), a random variable, or an algebraic expression involving random variables (see Statistics[RandomVariable]). For a data set A, RousseeuwCrouxQn computes Qn as defined above. For a distribution or random variable X, RousseeuwCrouxQn computes the asymptotic equivalent, that is, the value that Qn converges to for ever larger samples of X.
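The definition of Qn for a data set can be reproduced with basic Maple commands. The following is a minimal sketch, not the package's own implementation: the names s, n, d, and k are illustrative, and the sample is the one used in the Examples section. It forms all pairwise absolute differences, sorts them, and picks the k-th smallest.

```maple
with(Statistics):

# Illustrative data (the same values as in the Examples section).
s := [1, 5, 2, 2, 7, 4, 1, 6, 9]:
n := nops(s):

# All pairwise absolute differences |s[i] - s[j]| for i > j, sorted ascending.
d := sort([seq(seq(abs(s[i] - s[j]), i = j + 1 .. n), j = 1 .. n - 1)]):

# k = binomial(floor(n/2) + 1, 2); for n = 9 this is binomial(5, 2) = 10.
k := binomial(floor(n/2) + 1, 2):

# The k-th smallest difference; this should agree with RousseeuwCrouxQn(s),
# apart from the built-in command returning a floating-point value.
d[k];
RousseeuwCrouxQn(s);
```

This direct approach stores all n(n-1)/2 differences, so it is only practical for small samples.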

Computation

• By default, all computations involving random variables are performed symbolically (see option numeric below).

• All computations involving data are performed in floating-point; therefore, all data provided must have type/realcons and all returned solutions are floating-point, even if the problem is specified with exact values.

• For more information about computation in the Statistics package, see the Statistics[Computation] help page.
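The floating-point behavior for data can be observed with a small check. This is a sketch under the assumption that a plain list of integers is accepted as a data set; the name q is illustrative.

```maple
with(Statistics):

# Exact integer data; per the notes above, the result is still floating-point.
q := RousseeuwCrouxQn([1, 2, 4, 8]):

# Test whether the returned value is a float rather than an exact integer.
type(q, float);
```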

Data Set Options

• The ds_options argument can contain one or more of the options shown below. More information for some options is available in the Statistics[DescriptiveStatistics] help page.

• ignore=truefalse -- This option controls how missing data is handled by the RousseeuwCrouxQn command. Missing items are represented by undefined or Float(undefined). So, if ignore=false and A contains missing data, the RousseeuwCrouxQn command may return undefined. If ignore=true all missing items in A will be ignored. The default value is false.

• weights=Vector -- Data weights. The number of elements in the weights array must be equal to the number of elements in the original data sample. By default all elements in A are assigned weight 1.

• correction=samplesize or correction=none -- In [2], Rousseeuw and Croux define a correction factor $d_n$ for finite sample size as:

$$d_n = \begin{cases}
0.399 & n = 2 \\
0.994 & n = 3 \\
0.512 & n = 4 \\
0.844 & n = 5 \\
0.611 & n = 6 \\
0.857 & n = 7 \\
0.669 & n = 8 \\
0.872 & n = 9 \\
\dfrac{n}{n + 1.4} & n > 9 \text{ and } n \text{ odd} \\
\dfrac{n}{n + 3.8} & n > 9 \text{ and } n \text{ even}
\end{cases}$$

If the option correction = samplesize is given, then the result is multiplied by this correction factor before it is returned. The default is correction = none, that is, no correction factor is applied.
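The finite-sample factor can also be written out by hand and compared with the built-in option. The sketch below is illustrative only: dn is a hypothetical helper name that is not part of the Statistics package, and the sample s is taken from the Examples section.

```maple
with(Statistics):

# Hypothetical helper: the finite-sample correction factor for sample size n.
dn := proc(n::posint)
    local small;
    if n < 2 then
        error "no correction factor is tabulated for n = 1";
    end if;
    # Tabulated values for n = 2 .. 9.
    small := [0.399, 0.994, 0.512, 0.844, 0.611, 0.857, 0.669, 0.872];
    if n <= 9 then
        small[n - 1];
    elif type(n, odd) then
        n/(n + 1.4);
    else
        n/(n + 3.8);
    end if;
end proc:

s := [1, 5, 2, 2, 7, 4, 1, 6, 9]:

# Manual correction: multiply the uncorrected value by the factor for n = 9 ...
dn(nops(s)) * RousseeuwCrouxQn(s);

# ... which should match the value returned with the built-in option.
RousseeuwCrouxQn(s, 'correction = samplesize');
```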

Random Variable Options

  

The rv_options argument can contain one or more of the options shown below. More information for some options is available in the Statistics[RandomVariables] help page.

• numeric=truefalse -- By default, Qn is computed using exact arithmetic. To compute Qn numerically, specify the numeric or numeric = true option.

Examples

> with(Statistics):

Compute Qn for a data sample.

> s := <1, 5, 2, 2, 7, 4, 1, 6, 9>;

    s := <1, 5, 2, 2, 7, 4, 1, 6, 9>    (1)

> RousseeuwCrouxQn(s);

    2.    (2)

Employ Rousseeuw and Croux's finite sample size correction.

> RousseeuwCrouxQn(s, 'correction = samplesize');

    1.74400000000000    (3)

Let's replace four of the values with very large values.

> t := copy(s):

> t[1 .. 4] := 10^100:

> t;

    <10^100, 10^100, 10^100, 10^100, 7, 4, 1, 6, 9>    (4)

> RousseeuwCrouxQn(t);

    3.    (5)

The value of Qn stays bounded, because it has a high breakdown point.

Compute Qn for a normal distribution.

> RousseeuwCrouxQn('Normal'(3, 5), 'numeric');

    2.25312055012086    (6)

The symbolic result is an expression involving the inverse (see RootOf) of the error function (see erf). It evaluates to the same floating-point number.

> RousseeuwCrouxQn('Normal'(3, 5));

    10 RootOf(4 erf(_Z) - 1)    (7)

> evalf(%);

    2.253120550    (8)
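The form of the symbolic result can be motivated by a short derivation. This is a sketch that assumes the asymptotic Qn equals the 1/4 quantile of |X - Y| for independent copies X and Y of the random variable, which is consistent with k being roughly one quarter of the number of pairs for large n. The symbols σ (the standard deviation, here 5) and q are introduced only for illustration.

```latex
\begin{align*}
X - Y &\sim \mathcal{N}(0,\, 2\sigma^2), \\
P\bigl(|X - Y| \le q\bigr) &= \operatorname{erf}\!\left(\frac{q}{2\sigma}\right) = \frac{1}{4}, \\
q &= 2\sigma \operatorname{erf}^{-1}\!\left(\tfrac{1}{4}\right).
\end{align*}
```

With σ = 5 this gives q = 10 erf^{-1}(1/4), that is, 10 times the root of 4 erf(z) - 1 = 0, in agreement with the RootOf expression and its numeric value of about 2.2531.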

Generate a random sample of size 1000000 from the same distribution and compute the sample's Qn.

> A := Sample('Normal'(3, 5), 1000000):

> RousseeuwCrouxQn(A);

    2.25256098629173    (9)

Consider the following Matrix data set.

> M := Matrix([[3, 1130, 114694], [4, 1527, 127368], [3, 907, 88464], [2, 878, 96484], [4, 995, 128007]]);

         [ 3   1130   114694 ]
         [ 4   1527   127368 ]
    M := [ 3    907    88464 ]    (10)
         [ 2    878    96484 ]
         [ 4    995   128007 ]

We compute Qn for each of the columns.

> RousseeuwCrouxQn(M);

    [1.  117.  12674.]    (11)
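The scale property stated in the Description can also be checked numerically. The sketch below is illustrative: the constants a and b and the name u are arbitrary choices, and the data is again the sample from the first example, entered here as a list.

```maple
with(Statistics):

s := [1, 5, 2, 2, 7, 4, 1, 6, 9]:
a := -3:
b := 10:

# Transform every observation x to a*x + b.
u := map(x -> a*x + b, s):

# By the scale property, these two values should coincide:
# the shift b has no effect, and the factor a enters through its absolute value.
RousseeuwCrouxQn(u);
abs(a) * RousseeuwCrouxQn(s);
```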

References

  

[1] Stuart, Alan, and Ord, Keith. Kendall's Advanced Theory of Statistics. 6th ed. London: Edward Arnold, 1998. Vol. 1: Distribution Theory.

[2] Rousseeuw, Peter J., and Croux, Christophe. Alternatives to the Median Absolute Deviation. Journal of the American Statistical Association 88(424), 1993, pp. 1273-1283.

Compatibility

• The Statistics[RousseeuwCrouxQn] command was introduced in Maple 18.

• For more information on Maple 18 changes, see Updates in Maple 18.

See Also

Statistics

Statistics[Computation]

Statistics[DescriptiveStatistics]

Statistics[Distributions]

Statistics[Median]

Statistics[MedianDeviation]

Statistics[RandomVariables]