Linear Fit - Maple Help

Statistics

 LinearFit
 fit a linear model function to data

Calling Sequence

 LinearFit(flst, X, Y, v, options)
 LinearFit(flst, XY, v, options)
 LinearFit(falg, X, Y, v, options)
 LinearFit(falg, XY, v, options)
 LinearFit(fop, X, Y, options)
 LinearFit(fop, XY, options)

Parameters

 flst - list(algebraic) or Vector(algebraic); component functions in algebraic form
 X - Vector or Matrix; values of independent variable(s)
 Y - Vector; values of dependent variable
 XY - Matrix; values of independent and dependent variables
 v - name or list(names); name(s) of independent variables in the component functions
 falg - algebraic expression, linear in all its variables except the ones in v; model
 fop - list(procedure) or Vector(procedure); component functions in operator form
 options - (optional) equation(s) of the form option=value where option is one of output, summarize, svdtolerance or weights; specify options for the LinearFit command

Description

 • The LinearFit command fits a model function that is linear in the model parameters to data by minimizing the least-squares error.  It performs both simple and multiple linear regression.  This command accepts the model function in algebraic form in two variants, and in operator form, and data for independent and dependent variables can be specified together or separately.  For more information about the input forms, see the Input Forms help page.
 • Consider the model $y=f\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n}\right)$ where y is the dependent variable and f is the model function of n independent variables ${x}_{1},{x}_{2},\mathrm{...},{x}_{n}$.  This function is a linear combination ${a}_{1}{f}_{1}+{a}_{2}{f}_{2}+\mathrm{...}+{a}_{m}{f}_{m}$ of component functions ${f}_{j}\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n}\right)$, for j from 1 to m.  Given k data points, where each data point is an (n+1)-tuple of numerical values for ${x}_{1},{x}_{2},\mathrm{...},{x}_{n},y$, the LinearFit command finds values of the model parameters ${a}_{1},{a}_{2},\mathrm{...},{a}_{m}$ such that the sum of the k squared residuals is minimized.  The ith residual is the value of $y-f\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n}\right)$ evaluated at the ith data point.
 • In the first two calling sequences, the first parameter flst is a list or Vector of component functions in algebraic form.  Each component is an algebraic expression in the independent variables ${x}_{1},{x}_{2},\mathrm{...},{x}_{n}$.
 • In the second pair of calling sequences, the first parameter falg is an algebraic expression for $f\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n}\right)$ that contains the parameters ${a}_{1},{a}_{2},\mathrm{...},{a}_{m}$ linearly.
 • In the last two calling sequences, the first parameter fop is a list or Vector of component functions in operator form. The jth component is a procedure having n input parameters representing the independent variables ${x}_{1},{x}_{2},\mathrm{...},{x}_{n}$ and returning the single value ${f}_{j}\left({x}_{1},{x}_{2},\mathrm{...},{x}_{n}\right)$.
 • The parameter X is a Matrix containing the values of the independent variables.  Row i in the Matrix contains the n values for the ith data point while column j contains all values of the single variable ${x}_{j}$.  If there is only one independent variable, X can be either a Vector or a k-by-1 Matrix.  The parameter Y is a Vector containing the k values of the dependent variable y. The parameter XY is a Matrix consisting of the n columns of X and, as last column, Y. For X, Y, and XY, one can also use lists or Arrays; for details, see the Input Forms help page.
 • The parameter v is a list of the independent variable names used in falg.  If there is only one independent variable, then v can be a single name.  The order of the names in the list must match exactly the order in which the independent variable values are placed in the columns of X.
 • By default, either the model function with the final parameter values or a Vector containing the parameter values is returned, depending on the input form.  Additional results or a solution module that allows you to query for various settings and results can be obtained with the output option.  For more information, see the Statistics/Regression/Solution help page.
 • Weights for the data points can be supplied through the weights option.
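The minimization described above can be restated compactly in matrix notation. Writing $F$ for the k-by-m design matrix whose entry $F_{ij}$ is the component function $f_j$ evaluated at the ith data point, the least-squares problem and its classical solution read:

```latex
\min_{a_1,\ldots,a_m} \; \sum_{i=1}^{k}
  \Bigl( y_i - \sum_{j=1}^{m} a_j\, f_j(x_{i,1},\ldots,x_{i,n}) \Bigr)^{2}
\;\Longleftrightarrow\;
\min_{a} \; \lVert y - F a \rVert_2^{2},
\qquad\text{solved by the normal equations}\quad
F^{\mathsf{T}} F\, a = F^{\mathsf{T}} y .
```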

Options

 The options argument can contain one or more of the options shown below.  These options are described in more detail on the Statistics/Regression/Options help page.
 • output = name or string -- Specify the form of the solution.  The output option can take as a value the name solutionmodule, or one of the following names (or a list of these names): AtkinsonTstatistic, confidenceintervals, CookDstatistic, degreesoffreedom, externallystandardizedresiduals, internallystandardizedresiduals, leastsquaresfunction, leverages, parametervalues, parametervector, residuals, residualmeansquare, residualstandarddeviation, residualsumofsquares, rsquared, rsquaredadjusted, standarderrors, tprobability, tvalue, variancecovariancematrix. For more information, see the Statistics/Regression/Solution help page.
 • summarize = identical( true, false, embed ) -- Display a summary of the regression model.
 • svdtolerance = realcons(nonnegative) -- Set the tolerance that determines whether a singular-value decomposition is performed.
 • weights = Vector -- Provide weights for the data points.
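The effect of the weights option can be sketched mathematically: minimizing the weighted residual sum $\sum_i w_i r_i^2$ is equivalent to an ordinary least-squares fit after scaling each row of the design matrix and the response by $\sqrt{w_i}$. The following is an illustrative NumPy sketch with made-up data, not Maple's implementation:

```python
import numpy as np

# Hypothetical data: one independent variable t and responses y
t = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.9])
w = np.array([1.0, 1.0, 4.0, 4.0])   # weights: the last two points count more

# Weighted least squares for the model a1 + a2*t: minimizing
# sum_i w_i * r_i^2 equals an ordinary fit after scaling each
# row of the design matrix and the response by sqrt(w_i).
A = np.column_stack([np.ones_like(t), t])
sw = np.sqrt(w)
a, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
print(a)
```

The solution satisfies the weighted normal equations $A^{\mathsf{T}} W A\, a = A^{\mathsf{T}} W y$, which is a quick way to check any weighted fit.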

Notes

 • The underlying computation is done in floating-point; therefore, all data points must have type realcons and all returned solutions are floating-point, even if the problem is specified with exact values.  For more information about numeric computation in the Statistics package, see the Statistics/Computation help page.
 • The LinearFit command uses various methods implemented in a built-in library provided by the Numerical Algorithms Group (NAG).  Normally, a method using QR decomposition is applied.  If it is determined that the system does not have full rank, then a singular-value decomposition (SVD) is performed. The svdtolerance option allows you to specify when an SVD should be performed.  See the Statistics/Regression/Options help page for additional details.
 • To obtain more details as the least-squares problem is being solved, set infolevel[Statistics] to 2 or higher.
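The QR-based method mentioned above can be illustrated outside Maple. This NumPy sketch (not Maple's NAG-backed implementation) reproduces the fit from the first example below, $\left[1,t,t^2\right]$ against the data $(1,2),(2,3),(3,4),(4,3.5),(5,5.8),(6,7)$:

```python
import numpy as np

# Data from the simple example on this page
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 3.0, 4.0, 3.5, 5.8, 7.0])

# Design matrix: one column per component function 1, t, t^2
A = np.column_stack([np.ones_like(t), t, t**2])

# Least squares via QR decomposition: A = Q R, then solve R a = Q^T y
Q, R = np.linalg.qr(A)
a = np.linalg.solve(R, Q.T @ y)

print(a)  # approximately [1.96, 0.165, 0.110714]
```

When A is rank-deficient, R is singular and this back-substitution fails; that is the situation in which an SVD-based solver (compare the svdtolerance option) is used instead.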

Examples

 > $\mathrm{with}\left(\mathrm{Statistics}\right):$

A simple example using the first form for the first argument, flst:

 > $X≔\mathrm{Vector}\left(\left[1,2,3,4,5,6\right],\mathrm{datatype}=\mathrm{float}\right):$
 > $Y≔\mathrm{Vector}\left(\left[2,3,4,3.5,5.8,7\right],\mathrm{datatype}=\mathrm{float}\right):$
 > $\mathrm{LinearFit}\left(\left[1,t,{t}^{2}\right],X,Y,t\right)$
 ${1.96000000000000}{+}{0.164999999999999}{}{t}{+}{0.110714285714286}{}{{t}}^{{2}}$ (1)

The summarize option returns a summary for the regression:

 > $\mathrm{ls}≔\mathrm{LinearFit}\left(\left[1,t,{t}^{2}\right],X,Y,t,\mathrm{summarize}=\mathrm{true}\right):$
 Summary:
 ----------------
 Model: 1.9600000+.16500000*t+.11071429*t^2
 ----------------
 Coefficients:
                Estimate  Std. Error  t-value  P(>|t|)
 Parameter 1    1.9600    1.1720      1.6724   0.1930
 Parameter 2    0.1650    0.7667      0.2152   0.8434
 Parameter 3    0.1107    0.1072      1.0325   0.3778
 ----------------
 R-squared: 0.9252, Adjusted R-squared: 0.8753
 > $\mathrm{ls}$
 ${1.96000000000000}{+}{0.164999999999999}{}{t}{+}{0.110714285714286}{}{{t}}^{{2}}$ (2)

Here is the same example using the second form for the first argument, falg:

 > $\mathrm{LinearFit}\left(a+bt+c{t}^{2},X,Y,t\right)$
 ${1.96000000000000}{+}{0.164999999999999}{}{t}{+}{0.110714285714286}{}{{t}}^{{2}}$ (3)

The summary can also be returned as an embedded table:

 > $\mathrm{LinearFit}\left(\left[1,t,{t}^{2}\right],X,Y,t,\mathrm{summarize}=\mathrm{embed}\right)$
 ${1.96000000000000}{+}{0.164999999999999}{}{t}{+}{0.110714285714286}{}{{t}}^{{2}}$ (4)

Model:

${1.9600000}{+}{0.16500000}{}{t}{+}{0.11071429}{}{{t}}^{{2}}$

 Coefficients:
               Estimate   Standard Error  t-value   P(>|t|)
 Parameter 1   1.96000    1.17199         1.67237   0.193045
 Parameter 2   0.165000   0.766748        0.215194  0.843415
 Parameter 3   0.110714   0.107226        1.03253   0.377769

R-squared: ${0.925169}$

Adjusted R-squared: ${0.875282}$

Residuals

 Residual Sum of Squares: ${1.28771}$
 Residual Mean Square: ${0.429238}$
 Residual Standard Error: ${0.655163}$
 Degrees of Freedom: ${3}$

Five Point Summary

 Minimum: ${-0.891429}$
 First Quartile: ${-0.290357}$
 Median: ${0.155714}$
 Third Quartile: ${0.290595}$
 Maximum: ${0.548571}$

And finally using the third form, fop:

 > $\mathrm{constant_function}≔t↦1$
 ${\mathrm{constant_function}}{≔}{t}{↦}{1}$ (5)
 > $\mathrm{linear_function}≔t↦t$
 ${\mathrm{linear_function}}{≔}{t}{↦}{t}$ (6)
 > $\mathrm{quadratic_function}≔t↦{t}^{2}$
 ${\mathrm{quadratic_function}}{≔}{t}{↦}{{t}}^{{2}}$ (7)
 > $\mathrm{LinearFit}\left(\left[\mathrm{constant_function},\mathrm{linear_function},\mathrm{quadratic_function}\right],X,Y\right)$
 $\left[\begin{array}{c}1.960000000000003\\ 0.16499999999999876\\ 0.11071428571428583\end{array}\right]$ (8)

Use the output=solutionmodule option to see the full results.

 > $m≔\mathrm{LinearFit}\left(\left[1,t,{t}^{2}\right],X,Y,t,\mathrm{output}=\mathrm{solutionmodule}\right)$
 ${m}{≔}{\mathbf{module}}\left({}\right)\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{...}\phantom{\rule[-0.0ex]{0.5em}{0.0ex}}{\mathbf{end module}}$ (9)
 > $m:-\mathrm{Results}\left(\right)$
 $\left[{"residualmeansquare"}{=}{0.4292380952380952}{,}{"residualsumofsquares"}{=}{1.2877142857142856}{,}{"residualstandarddeviation"}{=}{0.6551626479265246}{,}{"degreesoffreedom"}{=}{3}{,}{"parametervalues"}{=}\left(\left[\begin{array}{c}1.960000000000003\\ 0.16499999999999876\\ 0.11071428571428583\end{array}\right]\right){,}{"parametervector"}{=}\left(\left[\begin{array}{c}1.960000000000003\\ 0.16499999999999876\\ 0.11071428571428583\end{array}\right]\right){,}{"leastsquaresfunction"}{=}{1.960000000000003}{+}{0.16499999999999876}{}{t}{+}{0.11071428571428583}{}{{t}}^{{2}}{,}{"standarderrors"}{=}\left(\left[\begin{array}{ccc}1.1719905736659766& 0.7667482580068001& 0.10722615809396434\end{array}\right]\right){,}{"confidenceintervals"}{=}\left(\left[\begin{array}{c}-1.7698002575074514..5.689800257507457\\ -2.275137245482846..2.6051372454828434\\ -0.23052749647805626..0.4519560679066279\end{array}\right]\right){,}{"rsquared"}{=}{0.9251691456243515}{,}{"rsquaredadjusted"}{=}{0.8752819093739191}{,}{"residuals"}{=}\left(\left[\begin{array}{cccccc}-0.23571428571428601& 0.2671428571428569& 0.5485714285714287& -0.8914285714285712& 0.24714285714285728& 0.06428571428571402\end{array}\right]\right){,}{"leverages"}{=}\left(\left[\begin{array}{cccccc}0.8214285714285723& 0.30714285714285733& 0.37142857142857133& 0.3714285714285712& 0.30714285714285705& 0.8214285714285718\end{array}\right]\right){,}{"variancecovariancematrix"}{=}\left(\left[\begin{array}{ccc}1.3735619047619048& -0.8370142857142856& 0.10730952380952377\\ -0.8370142857142856& 0.5879028911564625& -0.08048214285714284\\ 0.10730952380952377& -0.08048214285714284& 0.011497448979591833\end{array}\right]\right){,}{"internallystandardizedresiduals"}{=}\left(\left[\begin{array}{c}-0.8513943978432964\\ 0.4898606875576459\\ 1.056104119396909\\ -1.7161691940199764\\ 0.4531866253875553\\ 0.2321984721390793\end{array}\right]\right){,}{"externallystandardizedresiduals"}{=}\left(\left[\begin{array}{c}-0.7982573178275226\\ 
0.4169943516423314\\ 1.0879453019028917\\ -10.37123092690656\\ 0.38338097229804785\\ 0.19131622481166083\end{array}\right]\right){,}{"CookDstatistic"}{=}\left(\left[\begin{array}{c}1.111471045041062\\ 0.03545852305230702\\ 0.2196913158044327\\ 0.5801223807960794\\ 0.030347969242257366\\ 0.08267140004437516\end{array}\right]\right){,}{"AtkinsonTstatistic"}{=}\left(\left[\begin{array}{c}-1.7120712103005218\\ 0.27763776073374663\\ 0.836310206125244\\ -7.972428631368746\\ 0.25525773727519013\\ 0.41032758892189525\end{array}\right]\right){,}{"tvalue"}{=}\left[{1.6723683995760645}{,}{0.21519448955635634}{,}{1.032530566070126}\right]{,}{"tprobability"}{=}\left[{0.19304505794390803}{,}{0.8434150347843583}{,}{0.3777685126364536}\right]\right]$ (10)

Consider now an experiment in which quantities $x$, $y$, and $z$ influence a quantity $w$ according to the approximate relationship

$w=ax+\frac{b{x}^{2}}{y}+cyz$

with unknown parameters $a$, $b$, and $c$. Six data points are given by the following matrix, with respective columns for $x$, $y$, $z$, and $w$.

 > $\mathrm{ExperimentalData}≔⟨⟨1,1,1,2,2,2⟩|⟨1,2,3,1,2,3⟩|⟨1,2,3,4,5,6⟩|⟨0.531,0.341,0.163,0.641,0.713,-0.040⟩⟩$
 $\left[\begin{array}{cccc}1& 1& 1& 0.531\\ 1& 2& 2& 0.341\\ 1& 3& 3& 0.163\\ 2& 1& 4& 0.641\\ 2& 2& 5& 0.713\\ 2& 3& 6& -0.040\end{array}\right]$ (11)

We can find the fitted model function as follows:

 > $\mathrm{LinearFit}\left(\left[x,\frac{{x}^{2}}{y},yz\right],\mathrm{ExperimentalData},\left[x,y,z\right]\right)$
 ${0.823072918385878}{}{x}{-}\frac{{0.167910114211606}{}{{x}}^{{2}}}{{y}}{-}{0.0758022678386438}{}{y}{}{z}$ (12)

Alternatively, if we have the input and output data separately, we can use the following calling sequence.

 > $\mathrm{Input}≔\mathrm{ExperimentalData}\left[..,..3\right]$
 $\left[\begin{array}{rrr}1& 1& 1\\ 1& 2& 2\\ 1& 3& 3\\ 2& 1& 4\\ 2& 2& 5\\ 2& 3& 6\end{array}\right]$ (13)
 > $\mathrm{Output}≔\mathrm{ExperimentalData}\left[..,4\right]$
 $\left[\begin{array}{c}0.531\\ 0.341\\ 0.163\\ 0.641\\ 0.713\\ -0.040\end{array}\right]$ (14)
 > $\mathrm{LinearFit}\left(\left[x,\frac{{x}^{2}}{y},yz\right],\mathrm{Input},\mathrm{Output},\left[x,y,z\right]\right)$
 ${0.823072918385878}{}{x}{-}\frac{{0.167910114211606}{}{{x}}^{{2}}}{{y}}{-}{0.0758022678386438}{}{y}{}{z}$ (15)

We might want to know the residuals and the parameter values instead of just the model function.

 > $\mathrm{LinearFit}\left(\left[x,\frac{{x}^{2}}{y},yz\right],\mathrm{ExperimentalData},\left[x,y,z\right],\mathrm{output}=\left[\mathrm{parametervector},\mathrm{residuals}\right]\right)$
 $\left[\left[\begin{array}{c}0.8230729183858783\\ -0.16791011421160582\\ -0.07580226783864379\end{array}\right]{,}\left[\begin{array}{cccccc}-0.04836053633562854& -0.09490878992549993& 0.07811753022685414& -0.03029630857075828& 0.16069707003789296& -0.09782486344999755\end{array}\right]\right]$ (16)
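This three-variable fit can also be cross-checked outside Maple. The following NumPy sketch (illustrative only, not Maple's implementation) builds the same design matrix from the component functions $x$, $x^2/y$, and $yz$ and recovers the same parameter vector:

```python
import numpy as np

# ExperimentalData from above: columns x, y, z, w
data = np.array([
    [1, 1, 1,  0.531],
    [1, 2, 2,  0.341],
    [1, 3, 3,  0.163],
    [2, 1, 4,  0.641],
    [2, 2, 5,  0.713],
    [2, 3, 6, -0.040],
])
x, y, z, w = data.T

# Design matrix: one column per component function x, x^2/y, y*z
A = np.column_stack([x, x**2 / y, y * z])
coeffs, *_ = np.linalg.lstsq(A, w, rcond=None)
print(coeffs)  # approximately [0.8231, -0.1679, -0.0758]
```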

Compatibility

 • The XY parameter was introduced in Maple 15.