Linear Fit - Maple Help

Statistics

LinearFit

fit a linear model function to data

Calling Sequence

LinearFit(flst, X, Y, v, options)
LinearFit(flst, XY, v, options)
LinearFit(falg, X, Y, v, options)
LinearFit(falg, XY, v, options)
LinearFit(fop, X, Y, options)
LinearFit(fop, XY, options)

Parameters

flst	-	list(algebraic) or Vector(algebraic); component functions in algebraic form
X	-	Vector or Matrix; values of independent variable(s)
Y	-	Vector; values of dependent variable
XY	-	Matrix; values of independent and dependent variables
v	-	name or list(names); name(s) of independent variables in the component functions
falg	-	algebraic expression, linear in all its variables except the ones in v; model
fop	-	list(procedure) or Vector(procedure); component functions in operator form
options	-	(optional) equation(s) of the form option=value where option is one of output, summarize, svdtolerance or weights; specify options for the LinearFit command

Description

•	The LinearFit command fits a model function that is linear in the model parameters to data by minimizing the least-squares error. It performs both simple and multiple linear regression. The model function can be given in either of two algebraic forms or in operator form, and the data for the independent and dependent variables can be specified together or separately. For more information about the input forms, see the Input Forms help page.

•	Consider the model $y = f (x_{1}, x_{2}, ..., x_{n})$, where y is the dependent variable and f is the model function of n independent variables $x_{1}, x_{2}, ..., x_{n}$. This function is a linear combination $a_{1} f_{1} + a_{2} f_{2} + ... + a_{m} f_{m}$ of component functions $f_{j} (x_{1}, x_{2}, ..., x_{n})$, for j from 1 to m. Given k data points, where each data point is an (n+1)-tuple of numerical values for $x_{1}, x_{2}, ..., x_{n}, y$, the LinearFit command finds values of the model parameters $a_{1}, a_{2}, ..., a_{m}$ such that the sum of the k squared residuals is minimized. The ith residual is the value of $y - f (x_{1}, x_{2}, ..., x_{n})$ evaluated at the ith data point.

•	In the first two calling sequences, the first parameter flst is a list or Vector of component functions in algebraic form. Each component is an algebraic expression in the independent variables $x_{1}, x_{2}, ..., x_{n}$.

•	In the second pair of calling sequences, the first parameter falg is an algebraic expression for $f (x_{1}, x_{2}, ..., x_{n})$, including the parameters $a_{1}, a_{2}, ..., a_{m}$.

•	In the last two calling sequences, the first parameter fop is a list or Vector of component functions in operator form. The jth component is a procedure having n input parameters representing the independent variables $x_{1}, x_{2}, ..., x_{n}$ and returning the single value $f_{j} (x_{1}, x_{2}, ..., x_{n})$.

•	The parameter X is a Matrix containing the values of the independent variables. Row i in the Matrix contains the n values for the ith data point, while column j contains all values of the single variable $x_{j}$. If there is only one independent variable, X can be either a Vector or a k-by-1 Matrix. The parameter Y is a Vector containing the k values of the dependent variable y. The parameter XY is a Matrix consisting of the n columns of X and, as the last column, Y. For X, Y, and XY, one can also use lists or Arrays; for details, see the Input Forms help page.

•	The parameter v is a list of the independent variable names used in flst or falg. If there is only one independent variable, then v can be a single name. The order of the names in the list must match exactly the order in which the independent variable values are placed in the columns of X.

•	By default, either the model function with the final parameter values or a Vector containing the parameter values is returned, depending on the input form. Additional results or a solution module that allows you to query for various settings and results can be obtained with the output option. For more information, see the Statistics/Regression/Solution help page.

•	Weights for the data points can be supplied through the weights option.
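
The combined data form and the weights option described above can be sketched as follows. This is a minimal illustration, not a recommended workflow: the data are the same values used in the Examples section, and the weight values in W are hypothetical, chosen only to show the syntax.

```maple
with(Statistics):

# Illustrative data (same values as in the Examples section).
X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

# Combined form: the columns of XY are X followed by Y.
XY := <X | Y>:
LinearFit([1, t, t^2], XY, t);

# Weighted fit: W holds one (hypothetical) weight per data point.
W := Vector([1, 1, 1, 0.5, 1, 1], datatype = float):
LinearFit([1, t, t^2], X, Y, t, weights = W);
```

The first call should return the same fitted model as the separate X, Y form, since only the packaging of the data differs.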

Options

The options argument can contain one or more of the options shown below. These options are described in more detail on the Statistics/Regression/Options help page.

•	output = name or string -- Specify the form of the solution. The output option can take as a value the name solutionmodule, or one of the following names (or a list of these names): AtkinsonTstatistic, confidenceintervals, CookDstatistic, degreesoffreedom, externallystandardizedresiduals, internallystandardizedresiduals, leastsquaresfunction, leverages, parametervalues, parametervector, residuals, residualmeansquare, residualstandarddeviation, residualsumofsquares, rsquared, rsquaredadjusted, standarderrors, tprobability, tvalue, variancecovariancematrix. For more information, see the Statistics/Regression/Solution help page.

•	summarize = identical(true, false, embed) -- Display a summary of the regression model.

•	svdtolerance = realcons(nonnegative) -- Set the tolerance that determines whether a singular-value decomposition is performed.

•	weights = Vector -- Provide weights for the data points.
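
As a brief sketch of the output option, the call below requests two of the named results listed above in a single list. The data are the illustrative values from the Examples section; the choice of rsquared and residualsumofsquares is arbitrary.

```maple
with(Statistics):

# Illustrative data (same values as in the Examples section).
X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

# Request several named results at once by passing a list of names.
LinearFit([1, t, t^2], X, Y, t, output = [rsquared, residualsumofsquares]);
```

The two values returned should agree with the rsquared and residualsumofsquares entries of the full results shown in the Examples section.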

Notes

•	The underlying computation is done in floating-point; therefore, all data points must have type realcons and all returned solutions are floating-point, even if the problem is specified with exact values. For more information about numeric computation in the Statistics package, see the Statistics/Computation help page.

•	The LinearFit command uses various methods implemented in a built-in library provided by the Numerical Algorithms Group (NAG). Normally, a method using QR decomposition is applied. If it is determined that the system does not have full rank, then a singular-value decomposition (SVD) is performed. The svdtolerance option allows you to specify when an SVD should be performed. See the Statistics/Regression/Options help page for additional details.

•	To obtain more details as the least-squares problem is being solved, set infolevel[Statistics] to 2 or higher.
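
For orientation, the least-squares problem from the Description can be written in matrix form. The following is the standard textbook formulation, not a description of the internals of the NAG routines; the design matrix $A$ and the superscript $(i)$ marking the ith data point are notation introduced here for illustration.

```latex
% Design matrix: entry (i, j) is the j-th component function at the i-th data point.
A_{ij} = f_{j}\bigl(x_{1}^{(i)}, x_{2}^{(i)}, \ldots, x_{n}^{(i)}\bigr),
\qquad i = 1, \ldots, k, \quad j = 1, \ldots, m

% Sum of squared residuals as a function of the parameter vector a.
S(a) = \sum_{i=1}^{k} \Bigl( y_{i} - \sum_{j=1}^{m} a_{j} A_{ij} \Bigr)^{2}
     = \lVert y - A a \rVert_{2}^{2}

% With a reduced QR factorization A = Q R of a full-rank A, the minimizer solves
R a = Q^{T} y
```

When $A$ does not have full column rank, $R$ is singular and this triangular system no longer determines $a$ uniquely, which is the situation in which a method based on the SVD is typically used instead.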

Examples

>	$with (Statistics) :$

A simple example using the first form for the first argument, flst:

>	$X ≔ Vector ([1, 2, 3, 4, 5, 6], datatype = float) :$

>	$Y ≔ Vector ([2, 3, 4, 3.5, 5.8, 7], datatype = float) :$

>	$LinearFit ([1, t, t^{2}], X, Y, t)$

$1.96000000000000 + 0.164999999999999 t + 0.110714285714286 t^{2}$

(1)

The summarize option returns a summary for the regression:

>	$ls ≔ LinearFit ([1, t, t^{2}], X, Y, t, summarize = true) :$

Summary:
----------------
Model: 1.9600000+.16500000*t+.11071429*t^2
----------------
Coefficients:
              Estimate Std. Error t-value P(>|t|)
Parameter 1    1.9600    1.1720      1.6724   0.1930
Parameter 2    0.1650    0.7667      0.2152   0.8434
Parameter 3    0.1107    0.1072      1.0325   0.3778
----------------
R-squared: 0.9252, Adjusted R-squared: 0.8753

$ls$

$1.96000000000000 + 0.164999999999999 t + 0.110714285714286 t^{2}$

(2)

Here is the same example using the second form for the first argument, falg:

>	$LinearFit (a + b t + c t^{2}, X, Y, t)$

$1.96000000000000 + 0.164999999999999 t + 0.110714285714286 t^{2}$

(3)

The summary can also be returned as an embedded table:

>	$LinearFit ([1, t, t^{2}], X, Y, t, summarize = embed)$

$1.96000000000000 + 0.164999999999999 t + 0.110714285714286 t^{2}$

(4)

Model:

$1.9600000 + 0.16500000 t + 0.11071429 t^{2}$

Coefficients	Estimate	Standard Error	t-value	P(>|t|)
Parameter 1	$1.96000$	$1.17199$	$1.67237$	$0.193045$
Parameter 2	$0.165000$	$0.766748$	$0.215194$	$0.843415$
Parameter 3	$0.110714$	$0.107226$	$1.03253$	$0.377769$

R-squared:

$0.925169$

Adjusted R-squared:

$0.875282$

Residuals

Residual Sum of Squares	Residual Mean Square	Residual Standard Error	Degrees of Freedom
$1.28771$	$0.429238$	$0.655163$	$3$

Five Point Summary

Minimum	First Quartile	Median	Third Quartile	Maximum
$−0.891429$	$−0.290357$	$0.155714$	$0.290595$	$0.548571$

And finally using the third form, fop:

>	$constant_function ≔ t \mapsto 1$

$constant_function ≔ t \mapsto 1$

(5)

>	$linear_function ≔ t \mapsto t$

$linear_function ≔ t \mapsto t$

(6)

>	$quadratic_function ≔ t \mapsto t^{2}$

$quadratic_function ≔ t \mapsto t^{2}$

(7)

>	$LinearFit ([constant_function, linear_function, quadratic_function], X, Y)$

$[\begin{array}{c} 1.96000000000000 \\ 0.164999999999999 \\ 0.110714285714286 \end{array}]$

(8)
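
The parameter vector above can be cross-checked against a generic least-squares solve. The sketch below builds the matrix of component-function values by hand and passes it to LinearAlgebra:-LeastSquares; the name A is introduced here for illustration, and this is a consistency check rather than a description of how LinearFit works internally.

```maple
with(Statistics):

# Illustrative data (same values as above).
X := Vector([1, 2, 3, 4, 5, 6], datatype = float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype = float):

# Row i of A holds the component functions 1, t, t^2 evaluated at X[i].
A := Matrix(6, 3, (i, j) -> X[i]^(j - 1), datatype = float):

# Generic least-squares solution of A . a = Y.
LinearAlgebra:-LeastSquares(A, Y);

# Operator-form fit for comparison.
LinearFit([t -> 1, t -> t, t -> t^2], X, Y);
```

Both calls should return the same three parameter values, up to floating-point round-off.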

Use the output=solutionmodule option to see the full results.

>	$m ≔ LinearFit ([1, t, t^{2}], X, Y, t, output = solutionmodule)$

$m ≔ module () export Results, Settings ; end module$

(9)

>	$m :- Results ()$

$[residualmeansquare = 0.429238095238095, residualsumofsquares = 1.28771428571429, residualstandarddeviation = 0.655162647926525, degreesoffreedom = 3, parametervalues = [\begin{array}{c} 1.96000000000000 \\ 0.164999999999999 \\ 0.110714285714286 \end{array}], parametervector = [\begin{array}{c} 1.96000000000000 \\ 0.164999999999999 \\ 0.110714285714286 \end{array}], leastsquaresfunction = 1.96000000000000 + 0.164999999999999 t + 0.110714285714286 t^{2}, standarderrors = [\begin{array}{c} 1.17199057366598 & 0.766748258006800 & 0.107226158093964 \end{array}], confidenceintervals = [\begin{array}{c} −1.76980025750745 .. 5.68980025750746 \\ −2.27513724548285 .. 2.60513724548284 \\ −0.230527496478056 .. 0.451956067906628 \end{array}], rsquared = 0.925169145624351, rsquaredadjusted = 0.875281909373919, residuals = [\begin{array}{c} −0.235714285714286 & 0.267142857142857 & 0.548571428571429 & −0.891428571428571 & 0.247142857142857 & 0.0642857142857140 \end{array}], leverages = [\begin{array}{c} 0.821428571428572 & 0.307142857142857 & 0.371428571428571 & 0.371428571428571 & 0.307142857142857 & 0.821428571428572 \end{array}], variancecovariancematrix = [\begin{array}{c} 1.37356190476190 & −0.837014285714286 & 0.107309523809524 \\ −0.837014285714286 & 0.587902891156463 & −0.0804821428571428 \\ 0.107309523809524 & −0.0804821428571428 & 0.0114974489795918 \end{array}], internallystandardizedresiduals = [\begin{array}{c} −0.851394397843296 \\ 0.489860687557646 \\ 1.05610411939691 \\ −1.71616919401998 \\ 0.453186625387555 \\ 0.232198472139079 \end{array}], externallystandardizedresiduals = [\begin{array}{c} −0.798257317827523 \\ 0.416994351642331 \\ 1.08794530190289 \\ −10.3712309269066 \\ 0.383380972298048 \\ 0.191316224811661 \end{array}], CookDstatistic = [\begin{array}{c} 1.11147104504106 \\ 0.0354585230523070 \\ 0.219691315804433 \\ 0.580122380796079 \\ 0.0303479692422574 \\ 0.0826714000443752 \end{array}], AtkinsonTstatistic = [\begin{array}{c} −1.71207121030052 \\ 0.277637760733747 \\ 0.836310206125244 \\ −7.97242863136875 \\ 0.255257737275190 \\ 0.410327588921895 \end{array}], tvalue = [1.67236839957606, 0.215194489556356, 1.03253056607013], tprobability = [0.193045057943908, 0.843415034784358, 0.377768512636454]]$

(10)

Consider now an experiment in which the quantities $x$, $y$, and $z$ influence a quantity $w$ according to the approximate relationship

$w = a x + \frac{b x^{2}}{y} + c y z$

with unknown parameters $a$, $b$, and $c$. Six data points are given by the following matrix, with respective columns for $x$, $y$, $z$, and $w$.

>	$ExperimentalData ≔ ⟨⟨1, 1, 1, 2, 2, 2⟩ | ⟨1, 2, 3, 1, 2, 3⟩ | ⟨1, 2, 3, 4, 5, 6⟩ | ⟨0.531, 0.341, 0.163, 0.641, 0.713, -0.040⟩⟩$

$ExperimentalData ≔ [\begin{array}{c} 1 & 1 & 1 & 0.531 \\ 1 & 2 & 2 & 0.341 \\ 1 & 3 & 3 & 0.163 \\ 2 & 1 & 4 & 0.641 \\ 2 & 2 & 5 & 0.713 \\ 2 & 3 & 6 & −0.040 \end{array}]$

(11)

We can find the fitted model function as follows:

>	$LinearFit ([x, \frac{x^{2}}{y}, y z], ExperimentalData, [x, y, z])$

$0.823072918385878 x - \frac{0.167910114211606 x^{2}}{y} - 0.0758022678386438 y z$

(12)

Alternatively, if we have the input and output data separately, we can use the following calling sequence.

>	$Input ≔ ExperimentalData [.., .. 3]$

$Input ≔ [\begin{array}{c} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 3 & 3 \\ 2 & 1 & 4 \\ 2 & 2 & 5 \\ 2 & 3 & 6 \end{array}]$

(13)

>	$Output ≔ ExperimentalData [.., 4]$

$Output ≔ [\begin{array}{c} 0.531 \\ 0.341 \\ 0.163 \\ 0.641 \\ 0.713 \\ −0.040 \end{array}]$

(14)

>	$LinearFit ([x, \frac{x^{2}}{y}, y z], Input, Output, [x, y, z])$

$0.823072918385878 x - \frac{0.167910114211606 x^{2}}{y} - 0.0758022678386438 y z$

(15)

We might want to know the residuals and the parameter values instead of just the model function.

>	$LinearFit ([x, \frac{x^{2}}{y}, y z], ExperimentalData, [x, y, z], output = [parametervector, residuals])$

$[[\begin{array}{c} 0.823072918385878 \\ −0.167910114211606 \\ −0.0758022678386438 \end{array}], [\begin{array}{c} −0.0483605363356285 & −0.0949087899254999 & 0.0781175302268541 & −0.0302963085707583 & 0.160697070037893 & −0.0978248634499976 \end{array}]]$

(16)
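
Once a model function has been fitted, it can be evaluated like any other Maple expression. The sketch below, which assumes the same experimental data entered as a Matrix of rows, evaluates the fitted model at the second data point and subtracts the prediction from the observed value; the names fit and pred are introduced here for illustration.

```maple
with(Statistics):

# Same experimental data as above, entered row by row (columns x, y, z, w).
ExperimentalData := Matrix([[1, 1, 1, 0.531], [1, 2, 2, 0.341], [1, 3, 3, 0.163],
                            [2, 1, 4, 0.641], [2, 2, 5, 0.713], [2, 3, 6, -0.040]]):

fit := LinearFit([x, x^2/y, y*z], ExperimentalData, [x, y, z]):

# Predicted w at the second data point (x = 1, y = 2, z = 2).
pred := eval(fit, [x = 1, y = 2, z = 2]);

# Observed value minus prediction: the residual at that data point.
0.341 - pred;
```

The last value should match the second entry of the residuals Vector returned in the previous example.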

Compatibility

•	The XY parameter was introduced in Maple 15.

•	For more information on Maple 15 changes, see Updates in Maple 15.

•	The falg parameter was introduced in Maple 17.

•	For more information on Maple 17 changes, see Updates in Maple 17.

•	The Statistics[LinearFit] command was updated in Maple 2016.

•	The summarize option was introduced in Maple 2016.

•	For more information on Maple 2016 changes, see Updates in Maple 2016.
