stats[statplots, scatterplot](data, arg=value, ...)
statplots[scatterplot](data, arg=value, ...)
scatterplot(data, arg=value, ...)
format=f - format the scatter plot according to style f
Important: The stats package has been deprecated. Use the superseding package Statistics instead.
The function scatterplot of the subpackage stats[statplots] gives a scatter plot of the data.
There are several formats available for scatterplots: agglomerated, excised, quantile, and sunflower.
In addition to the above, there are some formats that are valid only for one-dimensional scatter plots: projected, jittered, stacked, and symmetry.
The default, when no format parameter is specified, is to plot the points and classes in the given statistical list(s). In one dimension (scatterplot(data1)), the points are plotted at their x-value, with an assumed y-value of 1. In two or three dimensions (scatterplot(data1, data2, data3)), the lists are combined to form points or regions: data1 supplies the x-values, data2 the y-values, and so on. The points are paired according to the order of the lists and the weight of each data item. Each statistical list must have the same total weight. Classes are plotted as lines, rectangles, or boxes, depending on their pairing.
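As a minimal sketch of the default format, the following pairs two statistical lists into two-dimensional points. The values in xs and ys are made up for illustration; Weight(x, n) is the stats notation for a value x carrying weight n, so both lists below have total weight 5, as the paragraph above requires.

```maple
with(stats[statplots]):

# Hypothetical data; each list has total weight 5.
xs := [1, 2, Weight(3, 2), 5]:   # the value 3 carries weight 2
ys := [2, 4, 5, 6, 9]:           # five items of weight 1

# No format parameter: xs supplies the x-values, ys the y-values.
scatterplot(xs, ys);
```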
The format=agglomerated[n,l] option groups closely spaced points into boxes that represent clusters of points. When there are n points within a cube with side-length l, those points are replaced by a box with side-length l/2. When there are n^2 points within the cube, a bigger box, with side-length l, is plotted in place of the points. Class data are replaced by their classmarks.
The format=excised[p] option deletes the fraction p of the least densely packed points from the plot. Conversely, when p is negative, the corresponding fraction of the most densely packed points is excised. Class data are replaced by their classmarks.
The format=quantile option plots quantile values of the data. In one dimension, the quantile values are plotted as the x-component versus the data value as the y-component. In more than one dimension the statistical lists are sorted according to their quantile, then paired together and plotted. So the r-th-quantile of data1 is plotted against the r-th-quantile of data2. Note that the number of observations in each data list can be different.
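A small sketch of the quantile format, using made-up lists of different lengths to show that the number of observations need not match. The names a and b and their values are assumptions chosen only for illustration.

```maple
with(stats[statplots]):

# Hypothetical data sets with different numbers of observations.
a := [3, 1, 4, 1, 5, 9, 2, 6]:   # 8 observations
b := [2, 7, 1, 8, 2, 8]:         # 6 observations

# Quantile-quantile plot: quantiles of a against quantiles of b.
scatterplot(a, b, format=quantile);

# One-dimensional quantile plot of a alone.
scatterplot(a, format=quantile);
```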
The format=sunflower[l] option replaces points by "sunflowers". Each sunflower has one radial arm for every point of weight one (that is, a point with weight three will cause three radial arms in the sunflower). The plot area is divided up into cubes of length l, and one sunflower is plotted inside each cube representing the number of points within the cube. Class data are replaced by their classmarks. The number of arms in a sunflower corresponds to the total weight of the points within the cube, so fractionally weighted points may cause one arm of the sunflower to be shorter than the rest.
The format=projected option for one-dimensional plots is the default. The points in data1 are plotted at their x-value along the line y=1. This gives an idea of the concentration of the data, but does not reveal the presence of repeated data. Classes are plotted as lines. Missing data are ignored.
The format=jittered option for one-dimensional plots causes the points corresponding to a particular x-value to be scattered along the vertical line at that x-value. This gives a visual idea of the density of the points.
The format=stacked option for one-dimensional plots produces a histogram-style plot. The points in data1 are plotted at their x-value, and stacked on top of each other starting along the line y=1. The stack of points is taller in proportion to the weight of the points. Class data appear as lines, but are not stacked with the points.
The format=symmetry option for one-dimensional plots produces a symmetry plot of the data. In this type of plot, the first half of the sorted data minus the median value is plotted versus the median minus the second half of the sorted data. Therefore, if the data is symmetric (with respect to the median), then the plot will produce points on the straight line y=x. Departure from this line indicates deviation from symmetry.
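The construction described above can be sketched by hand with core Maple commands. This is an illustration of the idea, not the package's implementation: it assumes an even number of unweighted observations, pairs the i-th smallest value with the i-th largest, and puts "median minus upper half" on the x-axis. The data values are made up.

```maple
# Hypothetical, unweighted data with an even number of observations.
data := [15, 2, 12, 3, 18, 5, 13, 8]:

s := sort(data):                      # ascending order
n := nops(s):                         # number of observations (even here)
med := (s[n/2] + s[n/2 + 1]) / 2:     # median of an even-length list

# Pair the i-th smallest with the i-th largest value:
#   x = median - upper value,  y = lower value - median
pts := [seq([med - s[n + 1 - i], s[i] - med], i = 1 .. n/2)]:

# Points on the line y = x indicate symmetry about the median.
plot(pts, style = point);

# The packaged equivalent, for comparison:
with(stats[statplots]):
scatterplot(data, format=symmetry);
```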
These one-dimensional plots are suitable for insertion along the edge of a two-dimensional plot. The utilities statplots(deprecated)[xshift], statplots(deprecated)[xyexchange], and plots[display] are useful for this.
One-dimensional quantile plots are closely related to percentage ogives. See cumulative frequency for more information.
Multi-dimensional quantile plots, or quantile-quantile plots, are useful in comparing multiple data sets. Consider data sets of the maximum daily temperatures in two cities. A scatter plot of one set against the other facilitates comparison of the temperatures in the two cities on each given day. The quantile-quantile plot answers questions such as: does the lowest third of the daily temperatures in one city span a greater range than the lowest third in the other city?
When there are "too many" points in a scatter plot, it is sometimes difficult to see important trends. Agglomerated, excised, and sunflower formats help to group the data so that patterns are more obviously visible.
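The following sketch applies the three grouping formats to the same crowded point set. The data is generated deterministically for illustration only, and the parameter values (0.25, 0.2, 3) are arbitrary assumptions, as is the indexed form format=name[...] used to pass them.

```maple
with(stats[statplots]):

# Hypothetical crowded data: 200 points in the square [-1, 1] x [-1, 1].
xs := [seq(evalf(sin(i)), i = 1 .. 200)]:
ys := [seq(evalf(cos(3*i)), i = 1 .. 200)]:

# One sunflower per square of side 0.25.
scatterplot(xs, ys, format=sunflower[0.25]);

# Drop the 20% of points that are least densely packed.
scatterplot(xs, ys, format=excised[0.2]);

# Replace groups of 3 or more points within a square of side 0.25 by boxes.
scatterplot(xs, ys, format=agglomerated[3, 0.25]);
```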
The command with(stats[statplots]) allows the use of the abbreviated form of this command.
data := [Weight(1,5), 2, Weight(3,7), Weight(4..5,3), missing, 6, 9, 10, 11, 14, 15, 20]:
one can contrast the three styles of 1-D scatter plots by:
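plots[display]({
# jittered plot at the base (assumption: this is the third contrasted style)
scatterplot(data, format=jittered),
# stacked and projected plots, shifted up by 10 and 20
yshift(10, scatterplot(data, format=stacked)),
yshift(20, scatterplot(data, format=projected)),
# horizontal guide lines at y = 10 and y = 20
plot(proc() 10 end proc, 0..20),
plot(proc() 20 end proc, 0..20)
}, view = [0..20, 0..30]);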
the other 1-D scatterplot is more of a summary:
the following is a 2-D scatter plot with 1-D summaries along the sides
data1 := [2.93, 2.58, 2.85, 4.26, 2.94, 4.33, 1.71, 4.42, 3.59, 4.35, 2.07, 1.16, 2.36, 1.16, 4.72]:
data2 := [2.46, 4.34, 0.182, 3.22, 5.37, 10.5, 3.11, -1.99, -0.865, 2.56, 10.6, 10.9, 6.56, 7.22, 4.84]:
The command to create the plot from the Plotting Guide using the data above is