ScreePlot - Maple Help

Statistics

 ScreePlot
 plot of the proportion of the total variation attributable to each component in a principal component analysis

Calling Sequence

 ScreePlot( PCArecord )
 ScreePlot( PCArecord, options, plotoptions )
 ScreePlot( dataset )
 ScreePlot( dataset, options, plotoptions )

Parameters

 PCArecord - the resulting record from a principal component analysis
 dataset - data set or DataFrame; Matrix or DataFrame of values with 2 or more columns
 options - (optional) equation(s) of the form option=value, where option is one of color, showcumulative, columngraphoptions, or cumulativesumoptions
 plotoptions - options to be passed to the plots:-dualaxisplot command, or the Statistics:-ColumnGraph command

Options

 • showcumulative : truefalse; controls the display of a cumulative sum chart for the total proportion of variance. The default is true.
 • color : list, name, string; controls the colors for the column graph and the cumulative sum chart respectively. By default, the color list is [ "Niagara Azure", "Niagara Burgundy" ].
 • columngraphoptions : list; options to be passed to the Statistics:-ColumnGraph command
 • cumulativesumoptions : list; options to be passed to the Statistics:-CumulativeSumChart command

Description

 • The ScreePlot command is used to generate a plot of the proportion of the total variation attributable to each component in a principal component analysis. This is useful in determining the number of components required to summarize the data.
 • If a data set or DataFrame is given as the main argument to ScreePlot, a principal component analysis is performed on it first and the resulting components are plotted.
 • When the showcumulative option is set to false, any plotoptions added to the ScreePlot command are passed to Statistics:-ColumnGraph. When showcumulative is set to true (the default), any plotoptions are passed to plots:-dualaxisplot.
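
The quantities in the plot can be written out explicitly. This is a sketch that assumes the usual principal component analysis convention, which this page does not state: the symbols $\lambda_i$ (the variance of the $i$-th component, largest first), $p_i$, and $c_k$ are introduced here for illustration only. For $n$ components, the column graph would show

 $p_i = \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j}, \qquad i = 1, \ldots, n$

and the cumulative sum chart would show

 $c_k = \sum_{i=1}^{k} p_i, \qquad k = 1, \ldots, n$

so that $c_n = 1$. Under this reading, the number of components worth keeping is the smallest $k$ for which $c_k$ is close enough to 1 for the purpose at hand.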

Examples

 > $\mathrm{with}\left(\mathrm{Statistics}\right):$
 > $\mathrm{data}≔⟨⟨2.5|2.4|10.5⟩,⟨0.5|0.7|0.785⟩,⟨2.2|2.9|1.286⟩,⟨1.9|2.2|2.35⟩,⟨3.1|3.0|2.202⟩,⟨2.3|2.7|1.351⟩,⟨2.0|1.6|2.021⟩,⟨1.0|1.1|1.247⟩,⟨1.5|1.6|2.503⟩,⟨1.1|0.9|1.214⟩⟩$
 $\left[\begin{array}{ccc}2.5& 2.4& 10.5\\ 0.5& 0.7& 0.785\\ 2.2& 2.9& 1.286\\ 1.9& 2.2& 2.35\\ 3.1& 3.0& 2.202\\ 2.3& 2.7& 1.351\\ 2.0& 1.6& 2.021\\ 1.0& 1.1& 1.247\\ 1.5& 1.6& 2.503\\ 1.1& 0.9& 1.214\end{array}\right]$ (1)

The ScreePlot command returns a plot of the proportion of the total variance which is attributable to each of the components in a principal component analysis.

 > $\mathrm{PCAnalysis}≔\mathrm{PCA}\left(\mathrm{data},\mathrm{summarize}=\mathrm{true}\right):$
 summary:
 Values     proportion of variance  St. Deviation
 8.3208     0.8782                  2.8846
 1.1119     0.1173                  1.0545
 0.0425     0.0045                  0.2061

The scree plot is built from PCAnalysis:-values and PCAnalysis:-varianceproportion, which appear in the summary above.
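
The summary columns can be checked by hand. The following Python sketch is not Maple code and not part of the Statistics package. It copies the printed Values column and assumes that each proportion is the value divided by the sum of all values, and that St. Deviation is the square root of the value. Both assumptions are inferred from the printed numbers rather than stated on this page.

```python
import math

# Values column copied from the printed PCA summary (already rounded to 4 places).
values = [8.3208, 1.1119, 0.0425]

# Assumed definition: each component's share of the total variance.
total = sum(values)
proportions = [v / total for v in values]

# Assumed relationship: standard deviation is the square root of the variance.
std_devs = [math.sqrt(v) for v in values]

# Print the three columns side by side for comparison with the summary.
for v, p, s in zip(values, proportions, std_devs):
    print(f"{v:8.4f}  {p:8.4f}  {s:8.4f}")
```

The proportions reproduce the summary to four decimal places. The standard deviations agree only up to rounding, because the inputs here are the rounded values that were printed, not the full-precision ones.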

 > $\mathrm{ScreePlot}\left(\mathrm{PCAnalysis}\right)$

In the scree plot above, the variation explained drops sharply after the first component, which accounts for about 88% of the total variance. This suggests that one component may be enough to summarize the data.
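
One common way to turn this reading of the plot into a rule is to keep the smallest number of components whose cumulative proportion reaches a chosen threshold. The Python sketch below illustrates that rule using the proportions printed in the summary. The function name components_needed and the thresholds are hypothetical choices made for this illustration, not something ScreePlot provides.

```python
from itertools import accumulate

# "proportion of variance" column copied from the printed PCA summary.
proportions = [0.8782, 0.1173, 0.0045]


def components_needed(proportions, threshold):
    """Return the smallest k whose cumulative proportion reaches threshold.

    Hypothetical helper for illustration; falls back to all components
    if the threshold is never reached.
    """
    for k, cumulative in enumerate(accumulate(proportions), start=1):
        if cumulative >= threshold:
            return k
    return len(proportions)


print(components_needed(proportions, 0.85))  # 1
print(components_needed(proportions, 0.95))  # 2
```

With an 85% threshold the first component is sufficient, which matches the reading of the plot. A stricter 95% threshold would keep two components, so the answer depends on how much variance the summary is expected to retain.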

 > $\mathrm{data2}≔⟨⟨2.5|2.4|10.5|0.1|0.5⟩,⟨0.5|0.7|0.785|4.3|2.0⟩,⟨2.2|2.9|1.286|5.4|7.0⟩,⟨1.9|2.2|2.35|6.7|3.1⟩,⟨3.1|3.0|2.202|8.1|12⟩,⟨0.1|0.4|0.5|0.6|0.9⟩⟩$
 $\left[\begin{array}{ccccc}2.5& 2.4& 10.5& 0.1& 0.5\\ 0.5& 0.7& 0.785& 4.3& 2.0\\ 2.2& 2.9& 1.286& 5.4& 7.0\\ 1.9& 2.2& 2.35& 6.7& 3.1\\ 3.1& 3.0& 2.202& 8.1& 12\\ 0.1& 0.4& 0.5& 0.6& 0.9\end{array}\right]$ (2)

When a data set with 2 or more columns is passed directly to ScreePlot, the principal component analysis is computed internally and the plot is returned.

 > $\mathrm{ScreePlot}\left(\mathrm{data2},\mathrm{color}=\left["LightSteelBlue","OrangeRed"\right]\right)$

The above plot shows that the first 3 components account for 99.5% of the variance.

It may not always be possible to summarize a dataset using only a few components.

 > $\mathrm{data3}≔\mathrm{Sample}\left(\mathrm{Uniform}\left(1,10\right),\left[10,15\right]\right):$
 > $\mathrm{interface}\left(\mathrm{displayprecision}=3\right):$
 > $\mathrm{ScreePlot}\left(\mathrm{data3},\mathrm{showcumulative}=\mathrm{false},\mathrm{columngraphoptions}=\left[\mathrm{datasetlabels}=\mathrm{absolute}\right]\right)$

Compatibility

 • The Statistics[ScreePlot] command was introduced in Maple 2016.