
Curve Fitting with Scilab

Neither Scilab nor Scicoslab has a built-in function for polynomial curve fitting, such as the polyfit function that we
can find in Matlab. However, it's not that difficult to develop (or find) a custom-made function for that purpose.

In [1] we can find a suggestion for this task. What follows is an edited transcription of the 15-line code.
function p = polyfit(x, y, n)
    if length(x) ~= length(y)
        error('x and y vectors must be the same size')
    end
    x = x(:);
    y = y(:);
    V = ones(length(x), n+1);
    for j = n : -1 : 1
        V(:, j) = x .* V(:, j+1);
    end
    [Q, R] = qr(V);
    QTy = Q' * y;
    p = R(1:n+1, 1:n+1) \ QTy(1:n+1);
    p = p.';
endfunction
There are three inputs: vectors x and y, and a scalar n. x and y are the vectors defining our points to be fitted, and
n is the order of the polynomial to be found.
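For comparison only, here is a rough NumPy sketch of the same idea (a Vandermonde matrix solved through QR); numpy.vander, numpy.linalg.qr and the final triangular solve play the roles of the loop, qr and backslash in the Scilab code above. This is an illustration, not part of the original listing:

```python
import numpy as np

def polyfit_qr(x, y, n):
    # Least-squares polynomial fit via Vandermonde matrix + QR,
    # mirroring the Scilab polyfit above (highest-degree coefficient first).
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    if x.size != y.size:
        raise ValueError('x and y vectors must be the same size')
    V = np.vander(x, n + 1)          # columns: x^n, x^(n-1), ..., 1
    Q, R = np.linalg.qr(V)           # reduced QR: V = Q R
    return np.linalg.solve(R, Q.T @ y)

x = np.linspace(0, 1, 10)
y = x**3 - 5*x**2 - 3*x - 7          # exact cubic, no noise
p = polyfit_qr(x, y, 3)
print(p)                             # close to [1, -5, -3, -7]
```

With noise-free data the fit recovers the cubic's coefficients almost exactly, which is a quick way to convince yourself the QR route is correct.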
On the other hand, we also need an equivalent of Matlab's polyval function. Naturally, there are many
possibilities here. The following code is much simpler than the previous one; here we're showing our suggestion.
function y = polyval(p, x)
    y = 0*x;
    p = mtlb_fliplr(p);
    for ix = 1 : length(p)
        y = y + p(ix) * x.^(ix-1);
    end
endfunction
The goal of this function is to take the polynomial found by polyfit, receive an x-value and return the
corresponding y-value produced by that polynomial. There are two inputs: the coefficients of the
polynomial, in p, and the x-value, in x. We just do the required math in a simple and straightforward way. We use the
function mtlb_fliplr (provided in the M2SCI toolbox) just to reverse the order of the coefficients, so that each
coefficient is paired with the right exponent of the variable.
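Again purely as a cross-check, a NumPy version of the same evaluation loop might look like this (the coefficient flip plays the role of mtlb_fliplr):

```python
import numpy as np

def polyval_simple(p, x):
    # Evaluate a polynomial with coefficients p (highest degree first),
    # mirroring the Scilab polyval above: flip p, then sum p[i] * x^i.
    p = np.asarray(p, dtype=float)[::-1]   # same role as mtlb_fliplr
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for i, c in enumerate(p):
        y = y + c * x**i
    return y

p = [1, -5, -3, -7]                        # x^3 - 5x^2 - 3x - 7
print(polyval_simple(p, [0, 0.4, 0.8]))    # -7, -8.936, -12.088
```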
Now, let's see both functions in action.
// Prepare our environment
clear; clc
// Declare our two functions
getf('polyfit.sci')
getf('polyval.sci')
// Define (arbitrarily) the number of points to take into account
np = 10;
// Define the x-vector and two functions, the second function
// is a noised version of the first one
x = linspace(0, 1, np);
y1 = x.^3 - 5*x.^2 - 3*x - 7;
y2 = y1 + .1*rand(1, np);
// Enter the x and y vectors, and the order of the polynomial
// that we want to obtain
p = polyfit(x, y2, 3)
// Define other x-values and find the original function
x = 0 : .4 : 1
y1 = x.^3 - 5*x.^2 - 3*x - 7
// Use polyval to find the equivalent values in the
// noised function
y = polyval(p, x)
// Divide the values just for comparison purposes
ratio = y1./y

Now, the (commented) results shown by Scilab are:


// This is our 3rd-order polynomial from the noised function
p =
    1.1103879  - 5.0400213  - 3.052195  - 6.9291697
// Another set of points generated
x =
    0.    0.4    0.8
y1 =
  - 7.  - 8.936  - 12.088
// Find the y-values of the noised function
y =
  - 6.9291697  - 8.8853863  - 12.028021
// Compare the original vs the noised results
ratio =
    1.0102221    1.0056963    1.0049866
We can see that the polyfit and polyval functions coded in Scilab are working as expected.
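For readers who prefer to sanity-check the whole session outside Scilab, the same experiment can be replayed with NumPy's built-in polyfit and polyval (the seed below is an arbitrary choice for reproducibility; since the noise is random, the exact coefficients will differ from the run above):

```python
import numpy as np

rng = np.random.default_rng(0)               # assumed seed, not from the article
npts = 10
x = np.linspace(0, 1, npts)
y1 = x**3 - 5*x**2 - 3*x - 7
y2 = y1 + 0.1 * rng.random(npts)             # noised version of y1
p = np.polyfit(x, y2, 3)                     # highest-degree coefficient first

# Compare original vs fitted values at a few points, as in the session above
xs = np.array([0, 0.4, 0.8])
ratio = (xs**3 - 5*xs**2 - 3*xs - 7) / np.polyval(p, xs)
print(p)
print(ratio)                                 # each entry close to 1
```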

Curve Fitting for experimental data


In this experiment, we are going to explore another built-in function in Scilab intended
for curve fitting, that is, for finding the parameters or coefficients of a model. Its name
is datafit. Naturally, you can see all the possibilities and uses of the function if you type
help datafit in your command window. The online reference manual should always be
your first source of information.

We are going to use the simplest case for fitting a curve to given or found data.
Let's say that we have collected some results from an experiment. These are the
specific numbers (the same x and y vectors used in the script below):

    x        y
   0.00     1.00
   0.55     0.47
   1.11     3.73
   1.66     2.22
   2.22     2.61
   2.77     1.63
   3.33    -2.13
   3.88     0.62
   4.44    -6.58
   5.00     1.56

If we graph the table in Scilab, we're going to get a scatter plot of those points.

Now, let's say that we know in advance that those measured or somehow collected
points in our experiment are part of a nonlinear function of this type:

    y = exp(C1*x) * cos(C2*x) + C3*sin(x)

Our mission is to find the parameters C1, C2 and C3.

We know that the function datafit is used for fitting data to a model. For a given
function G(p, z), this function finds the best vector of parameters p for approximating
G(p, zi) = 0 for a set of measurement vectors zi. Vector p is found by minimizing

    G(p, z1)' W G(p, z1) + G(p, z2)' W G(p, z2) + ... + G(p, zn)' W G(p, zn)

where

G is a function descriptor

W is a weighting matrix

datafit is an improved version of fit_dat, also available in Scilab.
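To make the criterion concrete, here is a small self-contained Python sketch. The helper, the toy model and the data are all hypothetical (nothing here comes from the Scilab sources); W defaults to the identity:

```python
import numpy as np

def criterion(G, p, Z, W=None):
    # J(p) = G(p,z1)' W G(p,z1) + ... + G(p,zn)' W G(p,zn),
    # where each measurement vector z_i is a column of Z.
    total = 0.0
    for z in Z.T:
        g = np.atleast_1d(G(p, z))
        total += g @ (g if W is None else W @ g)
    return total

# Toy model y = p0*x, with measurements z_i = (x_i, y_i)
G = lambda p, z: z[1] - p[0] * z[0]
Z = np.array([[1.0, 2.0, 3.0],      # x-row
              [2.1, 3.9, 6.2]])     # y-row
print(criterion(G, [2.0], Z))       # 0.1^2 + 0.1^2 + 0.2^2 = 0.06
```

A fitting routine such as datafit then searches for the p that makes this sum as small as possible.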

The first step in our demonstration is to create a file (called OF_datafit1.sci) that
includes our parameterized function (in this case called data_fit_1) and our way to
measure the error (in this case the function is called myerror).

This is one way to do it:


// This function takes vector x and the parameters
// of the function
function y = data_fit_1(x, c)
    y = exp(c(1)*x) .* cos(c(2)*x) + c(3)*sin(x);
endfunction

// This is a way to measure the error, to find the least one.
// The error function will call the parameterized function.
function e = myerror(c, z)
    x = z(1); y = z(2);
    e = y - data_fit_1(x, c);
endfunction

Now, we can create a main script that can use the function datafit and the input data.
One way to do it is like this:
// Clear windows, memory and screen
xdel(winsid()); clear; clc
// Load our functions into memory
getf('OF_datafit1.sci');
// Measured data in vectors x and y
x = [0 0.55 1.11 1.66 2.22 ...
2.77 3.33 3.88 4.44 5];
y = [1 0.47 3.73 2.22 2.61 ...
1.63 -2.13 0.62 -6.58 1.56];
// Plot the original data
plot(x, y, 'ro')
// Prepare vector z with given coordinates
z = [x; y];

// This is our first attempt to find the parameters


c0 = [2 2 2]';
// copt is supposed to be the best result
// err is the value of the error at the end of the process
[copt, err] = datafit(myerror, z, c0);
// Let's see how good the optimization turned out
x = linspace(0, 5, 100);
y = data_fit_1(x, copt);
plot(x, y)
This is the result:
err = 43.654725
copt =
0.1361701
2.3429071
2.5378073

It was a nice try, but the error is very high (it should be close to zero) and the result
is not good at all.
We could manually try different values of c0, the starting point. We can expect Scilab
to deliver different results if we enter different seeds as starting points...

Let's try a different approach. We're going to create a loop of 10 iterations. Each time
we create a random vector for c0 (the seed), and after those 10 attempts we take the best
result. It's another way of approaching the problem, instead of trying one vector
at a time...

I can suggest something similar to this code...

// Let's try random starting seeds between -5 and 5


a = -5; b = 5;
for k = 1 : 10
c0 = a + (b - a)*rand(3, 1);
[copt(:, k), err(k)] = datafit(myerror, z, c0);
end
// The least error after 10 trials is
[m, k] = min(err)
copt = copt(:,k)
// Let's plot the best result found
x = linspace(0, 5, 100);
y = data_fit_1(x, copt);
plot(x, y)

And we get this result...


k = 3
m = 0.0030935
copt =
0.3006878
6.3193538
3.0024572

Much better... now our function has an error of only 0.003 (found in the third
iteration), and the best coefficients found produce a function that hits the experimental
data almost perfectly.
Mission accomplished!
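As a final cross-check, the multi-start strategy can be replayed in Python. Here scipy.optimize.least_squares is an assumed stand-in for datafit (it minimizes the same sum of squared residuals), and with a different random generator the winning iteration and coefficients will not match the Scilab run exactly:

```python
import numpy as np
from scipy.optimize import least_squares

x = np.array([0, 0.55, 1.11, 1.66, 2.22, 2.77, 3.33, 3.88, 4.44, 5])
y = np.array([1, 0.47, 3.73, 2.22, 2.61, 1.63, -2.13, 0.62, -6.58, 1.56])

def residuals(c):
    # e = y - model(x, c), same model as data_fit_1
    return y - (np.exp(c[0]*x) * np.cos(c[1]*x) + c[2]*np.sin(x))

rng = np.random.default_rng(1)       # assumed seed
errs, sols = [], []
for _ in range(10):
    c0 = rng.uniform(-5, 5, size=3)  # random starting seed in [-5, 5]
    sol = least_squares(residuals, c0)
    errs.append(2 * sol.cost)        # 2*cost = sum of squared residuals
    sols.append(sol.x)
k = int(np.argmin(errs))             # keep the best of the 10 attempts
print(k, errs[k], sols[k])
```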
