Statistics[TallyInto] - compute data frequencies
|
Calling Sequence
|
|
TallyInto(X, R, options)
|
|
Parameters
|
|
X - Vector; data sample
|
R - range or list(range) or Vector; grouping pattern
|
options - (optional) equation(s) of the form option=value, where option is one of weights, ignore, bins, or output; specify options for the TallyInto function
|
|
|
|
|
Description
|
|
•
|
The TallyInto function groups together the elements of X that belong to the same data range and computes their frequencies. (See also Statistics[Tally].)
|
•
|
The first parameter X is the data set.
|
•
|
The second parameter R specifies how the data is grouped. R can be a range, a list of ranges, or a Vector (or other 1-dimensional rtable) of numbers.
|
|
If ranges are specified, then for each range, TallyInto will compute the number of data items in the corresponding interval. The intervals are not assumed to be disjoint, so any data item may belong to more than one interval (or none of them). Each interval can be divided into a number of equal-sized subintervals using the bins option (see below). Alternatively, you can pass default as the value for R, which tells TallyInto to use the interval between the smallest and the largest data items.
|
|
If R is a Vector of numbers x0 < x1 < ... < xn, then instead Maple uses the intervals [x0, x1), [x1, x2), ..., [x(n-1), xn]. The bins option is ignored in this case.
|
|
If a given range is subdivided into subranges, then each subrange except the rightmost corresponds to an interval that is closed on the left and open on the right; the rightmost interval is closed on both sides. That is, a value equal to a boundary value is counted in the subrange on its right, unless it is the right endpoint of the whole range. Similarly, if R is a Vector, each boundary point is put into the subinterval on its right, except that the rightmost point is put into the last subinterval. Note, however, that both the boundary points and the data points are converted to floating-point, so rounding can place a data point on the opposite side of a boundary from where its exact value lies.
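The grouping rules above can be made concrete with a minimal Python sketch. This is an illustration of the documented interval conventions only, not Maple code and not how TallyInto is implemented; the function names tally_ranges and tally_edges are hypothetical, and each range is treated as a single closed interval (as if bins=1).

```python
import bisect


def tally_ranges(data, ranges):
    """Count the items falling in each closed interval [lo, hi].

    The intervals need not be disjoint, so one item may be counted
    in several intervals or in none of them.
    """
    return [sum(1 for x in data if lo <= float(x) <= hi) for lo, hi in ranges]


def tally_edges(data, edges):
    """Count items in [e0, e1), [e1, e2), ..., [e(n-1), en].

    `edges` must be strictly increasing. Every interval is closed on the
    left and open on the right, except the last, which is closed on both sides.
    """
    counts = [0] * (len(edges) - 1)
    for x in data:
        xf = float(x)  # data is compared as floating-point, as in the text above
        if xf < edges[0] or xf > edges[-1]:
            continue  # outside the covered range: not counted
        if xf == edges[-1]:
            counts[-1] += 1  # the rightmost boundary belongs to the last interval
        else:
            # bisect_right sends a value equal to an inner edge to the interval on its right
            counts[bisect.bisect_right(edges, xf) - 1] += 1
    return counts
```

For the data [0, 1, 1.5, 2, 3, 3, 5], tally_edges with edges [0, 1, 2, 3] gives [1, 2, 3]: the value 2 lands in the interval on its right, both values equal to 3 land in the last interval, and 5 is not counted. With the overlapping ranges (0, 2) and (1, 3), tally_ranges gives [4, 5], since 1, 1.5, and 2 are counted twice.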
|
|
|
Options
|
|
|
The options argument can contain one or more of the following options:
|
•
|
weights=Array or list -- the weights of the data items. If weights are specified, the TallyInto function computes the cumulative weight of the data items in each interval. The weights must have type[realcons], and the returned frequencies are floating-point even if the problem is specified with exact values. The data array and the weights array must have the same number of elements.
|
•
|
ignore=truefalse -- This option specifies how to handle non-numeric data. If ignore is set to true, all non-numeric items in X are ignored.
|
•
|
bins=posint -- If this option is set, every data range in R will be subdivided into the given number of equal subintervals. The default value of bins is 10 if only one range is given and 1 if multiple ranges are given. If R is specified as a Vector, then the bins option is ignored.
|
•
|
output=list or table -- By default (output=list), a list of equations of the form range=frequency is returned. If output=table is given, this list is converted to an object of type table.
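As a rough illustration of how the weights, ignore, and bins options interact, here is a minimal Python sketch. It is not Maple code; the names split_range and weighted_tally are hypothetical, the result is a list of ((lo, hi), total) pairs standing in for range=frequency equations, and raising an error for non-numeric data when ignore is false is an assumption made for the sketch, since the behavior in that case is not described here.

```python
import bisect
from numbers import Real


def split_range(lo, hi, bins):
    """Return the bins + 1 boundaries of `bins` equal-sized subintervals of [lo, hi]."""
    width = (hi - lo) / bins
    return [lo] + [lo + k * width for k in range(1, bins)] + [hi]


def weighted_tally(data, lo, hi, bins=1, weights=None, ignore=False):
    """Sum the weights of the data items falling in each subinterval of [lo, hi]."""
    if weights is None:
        weights = [1.0] * len(data)  # unweighted case: every item counts as 1
    if len(weights) != len(data):
        raise ValueError("data and weights must have the same number of elements")
    edges = split_range(float(lo), float(hi), bins)
    totals = [0.0] * bins  # totals are floating-point even for exact input
    for x, w in zip(data, weights):
        if isinstance(x, bool) or not isinstance(x, Real):
            if ignore:
                continue  # ignore=true: skip non-numeric items
            raise TypeError(f"non-numeric data item: {x!r}")  # assumption, see lead-in
        xf = float(x)
        if not (edges[0] <= xf <= edges[-1]):
            continue
        # closed on the left, open on the right; the last subinterval is closed on both sides
        k = bins - 1 if xf == edges[-1] else bisect.bisect_right(edges, xf) - 1
        totals[k] += float(w)
    return [((edges[k], edges[k + 1]), totals[k]) for k in range(bins)]
```

Calling weighted_tally([1, 2, "n/a", 4], 0, 4, bins=2, weights=[0.5, 1.5, 9.0, 2.0], ignore=True) returns [((0.0, 2.0), 0.5), ((2.0, 4.0), 3.5)]. Wrapping the result in dict(...) gives a lookup structure loosely analogous to output=table.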
|
|
|
Compatibility
|
|
•
|
The R parameter was updated in Maple 16.
|
|
|
Notes
|
|
•
|
The underlying computation is done in floating-point; therefore, all data provided must have type[realcons] and all returned frequencies are floating-point, even if the problem is specified with exact values. For more information about numeric computation in the Statistics package, see the Statistics[Computation] help page.
|
•
|
TallyInto raises an error if it is required to split an infinite interval, that is, if the value of bins is different from 1 and at least one boundary of an interval is infinite.
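The restriction on infinite intervals follows from the arithmetic of equal-sized subdivision: an infinite interval has no finite bin width. The following Python sketch, using a hypothetical helper named split_range_checked, mirrors the documented condition; it is an illustration and not Maple's implementation.

```python
import math


def split_range_checked(lo, hi, bins):
    """Return the boundaries of `bins` equal-sized subintervals of [lo, hi].

    An infinite interval is accepted only when bins == 1, matching the
    condition described above.
    """
    if bins != 1 and (math.isinf(lo) or math.isinf(hi)):
        # (hi - lo) / bins would be infinite, so no inner boundary can be placed
        raise ValueError("cannot split an infinite interval into several bins")
    width = (hi - lo) / bins
    return [lo] + [lo + k * width for k in range(1, bins)] + [hi]
```

With bins equal to 1 no inner boundary is needed, so split_range_checked(0.0, math.inf, 1) simply returns the two endpoints, while split_range_checked(0.0, math.inf, 2) raises an error.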
|
|
|
Examples
|
|
[Example inputs and outputs (1) through (8) are not preserved in this text.]
|
An example with explicitly specified bounds.
|
[Example inputs and outputs (9) and (10) are not preserved in this text.]
|
Compare this histogram:
|
[Histogram not preserved in this text.]