OLSMultipleLinearRegression parabola

We have seen how to infer a first-degree curve (i.e. a straight line) from a bunch of observations using the Ordinary Least Squares estimator provided by the Apache Commons Math library for Java.

Things get a bit more complicated here, as we try to estimate a second-degree curve (a parabola).

Assuming that "ols" is a previously defined OLSMultipleLinearRegression object, here is the code that sets its sample data and then extracts the estimated coefficients:
int vars = 2; // 1
int obs = 3; // 2
double[] data = { 4, 1, 1, /**/ 8, 2, 4, /**/ 14, 3, 9, }; // 3

ols.newSampleData(data, obs, vars); // 4

double[] coe = ols.estimateRegressionParameters();
dumpEstimation(coe); // 5
1. A parabola needs two independent variables: x and x squared.
2. As before, we should provide at least one observation more than the number of variables.
3. Here is the input data. In each observation the first component is y, then comes x, and then x squared. These observations are lazily generated from y = x^2 + x + 2, as you may have guessed, so we expect to get back coefficients close to (2, 1, 1): the intercept first, then the coefficients of x and x squared.
4. I didn't try/catch, assuming that the caller of this code would do that for me. Actually, all the exceptions thrown by this package are unchecked (derived from RuntimeException), so we are not forced to try/catch them or declare them in the method signature, and we can simply accept the risk of a sudden death of our application.
5. We have seen this little testing function in the previous post; it works fine here too.
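
To make the expectation in note 3 concrete, here is a small sketch in plain Java (no Commons Math needed) that walks the flattened array in the same y, x, x^2 layout and measures how far a set of coefficients is from the observations. The names ParabolaCheck, evaluate and maxResidual are made up for this example and are not part of the library.

```java
// Hypothetical helper, not part of Apache Commons Math.
final class ParabolaCheck {
    // Value of coe[0] + coe[1]*x + coe[2]*x^2, with the intercept first.
    static double evaluate(double[] coe, double x) {
        return coe[0] + coe[1] * x + coe[2] * x * x;
    }

    // Largest absolute gap between the observed y and the fitted y,
    // reading the flattened layout: y, x, x^2 for each observation.
    static double maxResidual(double[] coe, double[] data) {
        double max = 0.0;
        for (int i = 0; i + 2 < data.length; i += 3) {
            double y = data[i];
            double x = data[i + 1];
            max = Math.max(max, Math.abs(y - evaluate(coe, x)));
        }
        return max;
    }

    public static void main(String[] args) {
        double[] data = { 4, 1, 1, /**/ 8, 2, 4, /**/ 14, 3, 9 };
        double[] expected = { 2, 1, 1 };
        System.out.println(maxResidual(expected, data)); // prints 0.0
    }
}
```

Passing the array returned by estimateRegressionParameters() instead of the expected values should give a result very close to zero.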

Does the flattened data array bother you? Maybe not in such a simple example, but in a more realistic scenario it could be cumbersome to organize the data according to this model. It could be useful to place the y's in a one-dimensional array and the x's in a separate two-dimensional one. There is an OLSMultipleLinearRegression.newSampleData() overload that works exactly this way:
double[] ys = { 4, 8, 14 };
double[][] xs = new double[][] { {1, 1}, {2, 4}, {3, 9} };
ols.newSampleData(ys, xs);

double[] coe = ols.estimateRegressionParameters();
dumpEstimation(coe);
This piece of code should be equivalent to what we have seen above.
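
In a more realistic case we usually start from the raw x values, and the x squared column has to be computed. Here is a minimal sketch of a helper that builds the two-dimensional array in the shape used above. The class name ParabolaData and the method designMatrix are assumptions of this example, not library API.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Apache Commons Math.
final class ParabolaData {
    // One row per observation: { x, x^2 }. No column of ones is added,
    // matching the xs array shown in the post.
    static double[][] designMatrix(double[] x) {
        double[][] rows = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            rows[i] = new double[] { x[i], x[i] * x[i] };
        }
        return rows;
    }

    public static void main(String[] args) {
        double[][] xs = designMatrix(new double[] { 1, 2, 3 });
        System.out.println(Arrays.deepToString(xs));
        // prints [[1.0, 1.0], [2.0, 4.0], [3.0, 9.0]]
    }
}
```

The result can be passed as the second argument of ols.newSampleData(ys, xs).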
