1. Install WinPython: http://sourceforge.net/projects/winpython/?source=dlp
This installs a Python distribution together with numpy, scipy, and other common packages.
2. Register Python using "WinPython Control Panel.exe" under the WinPython folder.
3. Download and run the scikit-learn installation executable. Make sure its version matches the installed Python version.
--Simple Test
>>> from sklearn import linear_model
>>> clf=linear_model.LinearRegression()
>>> clf.fit([[0,0],[1,1],[2,2]],[0,1,2])
LinearRegression(copy_X=True, fit_intercept=True, normalize=False)
>>> clf.coef_
array([ 0.5, 0.5])
>>>
Sunday, March 30, 2014
Thursday, March 27, 2014
Ridge Regression Model with Generalized Cross Validation
An introduction to GCV can be found here:
http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/ebooks/html/csa/node123.html
In the ridge regression model with generalized cross-validation, we can determine the ridge parameter by selecting the \$\lambda\$ that minimizes
\$GCV(\lambda)=\frac{\frac{1}{m}\sum_{i=1}^{m}(y_{i}-\hat{y}_{i})^{2}}{(1-\mathrm{tr}(H(\lambda))/m)^{2}}\$,
where \$m\$ is the number of samples, \$\hat{y}_{i}\$ are the fitted values, and the hat matrix \$H(\lambda)\$ is given as
\$H(\lambda)=X(X^{T}X+\lambda I)^{-1}X^{T}\$.
The code:
// Computes a ridge fit for every candidate ridge parameter in m_alphas and
// keeps the one with the smallest GCV score.
// m_X holds m samples with n features each, m_Y holds the m targets.
// Requires <Eigen/SVD>, <vector> and <cfloat> (for FLT_MAX).
void CRidgeRegressionGCV::ComputeRegression()
{
    double minGCV = FLT_MAX;
    for (size_t t = 0; t < m_alphas.size(); t++)
    {
        double alpha = m_alphas[t];
        int m = m_X.size();
        int n = m_X[0].size();

        // Build the augmented design matrix [X 1] and the right-hand side y.
        MatrixXd mat(m, n + 1);
        VectorXd rhs(m);
        for (int i = 0; i < m; i++)
        {
            for (int j = 0; j < n + 1; j++)
            {
                if (j < n)
                {
                    mat(i, j) = m_X[i][j];
                    rhs(i) = m_Y[i];
                }
                else
                {
                    mat(i, j) = 1;   // trailing column of ones for the intercept
                }
            }
        }

        // Solve (X^T X + alpha I) w = X^T y with an SVD-based solve.
        MatrixXd tmp = mat.transpose()*mat + alpha*MatrixXd::Identity(n + 1, n + 1);
        JacobiSVD<MatrixXd> svd(tmp, ComputeThinU | ComputeThinV);
        MatrixXd res = svd.solve(mat.transpose()*rhs);

        vector<double> coef(n);
        double intercept;
        for (int i = 0; i < n; i++)
        {
            coef[i] = res(i, 0);
        }
        intercept = res(n, 0);

        // Mean squared residual of this fit.
        double GCV = 0;
        for (int i = 0; i < m; i++)
        {
            double sum = 0;
            for (int j = 0; j < n; j++)
            {
                sum += m_X[i][j] * coef[j];
            }
            sum += intercept;
            double diff = m_Y[i] - sum;
            GCV += diff*diff;
        }
        GCV /= m;

        // Hat matrix H(alpha) = X (X^T X + alpha I)^{-1} X^T, reusing the SVD of tmp.
        MatrixXd tmp2 = svd.solve(mat.transpose());
        MatrixXd H = mat*tmp2;

        // GCV(alpha) = MSE / (1 - tr(H)/m)^2; keep the alpha with the smallest score.
        GCV /= (1 - H.trace()/m)*(1 - H.trace()/m);
        if (GCV < minGCV)
        {
            minGCV = GCV;
            m_coef = coef;
            m_intercept = intercept;
            m_alpha = alpha;
        }
    }
}
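A minimal usage sketch follows. The post does not show the class declaration, so the member layout and public access below are assumptions made purely for illustration; only ComputeRegression() above is from the post.

#include <vector>
using std::vector;

// Hypothetical declaration matching the members ComputeRegression() above relies on.
class CRidgeRegressionGCV
{
public:
    vector<vector<double> > m_X;   // m samples with n features each
    vector<double> m_Y;            // m targets
    vector<double> m_alphas;       // candidate ridge parameters to try
    vector<double> m_coef;         // output: fitted coefficients
    double m_intercept;            // output: fitted intercept
    double m_alpha;                // output: selected ridge parameter
    void ComputeRegression();      // defined above
};

int main()
{
    CRidgeRegressionGCV reg;
    reg.m_X = { {0.0, 1.0}, {1.0, 2.0}, {2.0, 2.5}, {3.0, 4.0} };
    reg.m_Y = { 1.0, 2.0, 2.8, 4.1 };
    reg.m_alphas = { 0.01, 0.1, 1.0, 10.0 };
    reg.ComputeRegression();
    // reg.m_alpha now holds the lambda with the smallest GCV score,
    // reg.m_coef and reg.m_intercept the corresponding ridge fit.
    return 0;
}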
Ridge Regression Model
In the ridge regression model, the optimal \$w\$ is given by:
\$w=(X^{T}X+\alpha I)^{-1}X^{T}Y\$
The code:
// Solves the ridge regression normal equations (X^T X + alpha I) w = X^T y,
// fitting the coefficients plus an intercept via a trailing column of ones.
void CRidgeRegression::ComputeRegression()
{
    int m = m_X.size();
    int n = m_X[0].size();

    // Augmented design matrix [X 1] and target vector y.
    MatrixXd mat(m, n + 1);
    VectorXd rhs(m);
    for (int i = 0; i < m; i++)
    {
        for (int j = 0; j < n + 1; j++)
        {
            if (j < n)
            {
                mat(i, j) = m_X[i][j];
                rhs(i) = m_Y[i];
            }
            else
            {
                mat(i, j) = 1;   // trailing column of ones for the intercept
            }
        }
    }

    // w = (X^T X + alpha I)^{-1} X^T y, computed with an SVD-based solve.
    MatrixXd tmp = mat.transpose()*mat + m_alpha*MatrixXd::Identity(n + 1, n + 1);
    JacobiSVD<MatrixXd> svd(tmp, ComputeThinU | ComputeThinV);
    MatrixXd res = svd.solve(mat.transpose()*rhs);

    m_coef.resize(n);
    for (int i = 0; i < n; i++)
    {
        m_coef[i] = res(i, 0);
    }
    m_intercept = res(n, 0);
}
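As a sanity check on the formula above, the same closed form can also be evaluated directly with Eigen. This is only a sketch; the function name and the assumption that X already carries the trailing column of ones are mine, not from the original post.

#include <Eigen/Dense>
using namespace Eigen;

// w = (X^T X + alpha I)^{-1} X^T Y, with X the augmented design matrix
// (trailing column of ones for the intercept, as in the code above).
VectorXd RidgeClosedForm(const MatrixXd& X, const VectorXd& Y, double alpha)
{
    int p = X.cols();
    MatrixXd A = X.transpose() * X + alpha * MatrixXd::Identity(p, p);
    // A is symmetric positive definite for alpha > 0, so an LDL^T solve suffices.
    return A.ldlt().solve(X.transpose() * Y);
}

The result should agree with the SVD-based solve in ComputeRegression() up to numerical precision.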
Ways to Insert Equation and Code in Blogger
For code, the simplest way is to use the following website to convert it into an HTML block:
http://codeformatter.blogspot.com/
For equations, you can use LaTeX in Blogger by following the instructions on this page:
https://www.codecogs.com/latex/integration/blogger/install.php
Ordinary Least Squares Model
In the ordinary least squares model, the optimal \$w\$ is the least-squares solution
\$w=(X^{T}X)^{-1}X^{T}Y\$, or more generally the pseudoinverse solution \$w=X^{+}Y\$, which is what the SVD-based solve below computes.
A draft function for this is written as
// Fits ordinary least squares by solving the least-squares problem [X 1] w = y
// directly with an SVD (no explicit normal equations).
void CLinearRegression::ComputeRegression()
{
    int m = m_X.size();
    int n = m_X[0].size();

    // Augmented design matrix [X 1] and target vector y.
    MatrixXd mat(m, n + 1);
    VectorXd rhs(m);
    for (int i = 0; i < m; i++)
    {
        for (int j = 0; j < n + 1; j++)
        {
            if (j < n)
            {
                mat(i, j) = m_X[i][j];
                rhs(i) = m_Y[i];
            }
            else
            {
                mat(i, j) = 1;   // trailing column of ones for the intercept
            }
        }
    }

    // JacobiSVD::solve returns the minimum-norm least-squares solution,
    // i.e. the pseudoinverse solution w = X^+ y.
    JacobiSVD<MatrixXd> svd(mat, ComputeThinU | ComputeThinV);
    MatrixXd res = svd.solve(rhs);

    m_coef.resize(n);
    for (int i = 0; i < n; i++)
    {
        m_coef[i] = res(i, 0);
    }
    m_intercept = res(n, 0);
}
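For a quick sanity check, here is a small self-contained sketch (not from the original post) that runs the same SVD-based solve on the toy data from the scikit-learn test at the top of the page; it should recover roughly the same coefficients, (0.5, 0.5), with a zero intercept.

#include <Eigen/SVD>
#include <iostream>
using namespace Eigen;

int main()
{
    // Augmented design matrix [X 1] for the samples (0,0), (1,1), (2,2).
    MatrixXd mat(3, 3);
    mat << 0, 0, 1,
           1, 1, 1,
           2, 2, 1;
    VectorXd rhs(3);
    rhs << 0, 1, 2;

    // Minimum-norm least-squares solution, as in CLinearRegression above.
    JacobiSVD<MatrixXd> svd(mat, ComputeThinU | ComputeThinV);
    VectorXd w = svd.solve(rhs);

    std::cout << "coef = " << w(0) << ", " << w(1)
              << "  intercept = " << w(2) << std::endl;
    return 0;
}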
Wednesday, March 26, 2014
An Introduction to Eigen
What is Eigen?
Eigen is a high-level C++ library of template headers for linear algebra, matrix and vector operations, numerical solvers, and related algorithms.
How to install Eigen?
Get the latest Eigen package from here. Unzip it and include the appropriate header files. There is no need to compile or include binaries.
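Since Eigen is header-only, a minimal program only needs the unzipped folder on the include path. For example (a quick sketch, not from the original post):

#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;

int main()
{
    // Solve a small 2x2 linear system A*x = b.
    MatrixXd A(2, 2);
    A << 2, 1,
         1, 3;
    VectorXd b(2);
    b << 1, 2;
    VectorXd x = A.colPivHouseholderQr().solve(b);
    std::cout << x << std::endl;   // prints the solution (0.2, 0.6)
    return 0;
}

Compile with the Eigen folder on the include path, for example g++ -I /path/to/eigen main.cpp.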
How to use Eigen?
I have included my code for computing ridge regression here to show some sample usages of Eigen. Most of the time it is very straightforward.
#include <Eigen/SVD>
using namespace Eigen;

// Ridge regression: solves (X^T X + alpha I) w = X^T y, fitting the
// coefficients plus an intercept via a trailing column of ones.
void CRidgeRegression::ComputeRegression()
{
    int m = m_X.size();
    int n = m_X[0].size();

    // Augmented design matrix [X 1] and target vector y.
    MatrixXd mat(m, n + 1);
    VectorXd rhs(m);
    for (int i = 0; i < m; i++)
    {
        for (int j = 0; j < n + 1; j++)
        {
            if (j < n)
            {
                mat(i, j) = m_X[i][j];
                rhs(i) = m_Y[i];
            }
            else
            {
                mat(i, j) = 1;   // trailing column of ones for the intercept
            }
        }
    }

    // Build the regularized normal equations and solve them with Eigen's JacobiSVD.
    MatrixXd tmp = mat.transpose()*mat + m_alpha*MatrixXd::Identity(n + 1, n + 1);
    JacobiSVD<MatrixXd> svd(tmp, ComputeThinU | ComputeThinV);
    MatrixXd res = svd.solve(mat.transpose()*rhs);

    m_coef.resize(n);
    for (int i = 0; i < n; i++)
    {
        m_coef[i] = res(i, 0);
    }
    m_intercept = res(n, 0);
}