Householder matrices are square matrices of the form
$$ P = I - \beta v v^T$$
where $\beta$ is a scalar and $v$ is a vector. They have the useful
property that, for suitably chosen $v$ and $\beta$, the product
$P x$ zeroes out all of the coordinates of $x$ but one, that is, $P x =
\|x\| e_1$. The following code, given $x$, finds the values of $\beta$
and $v$ that satisfy this property. The algorithm can be found in several
textbooks ^{1}:
import numpy as np
from scipy import linalg

def house(x):
    """
    Given a vector x, computes a vector v with v[0] = 1
    and a scalar beta such that P = I - beta v v^T
    is orthogonal and P x = ||x|| e_1

    Parameters
    ----------
    x : array, shape (n,) or (n, 1)

    Returns
    -------
    beta : scalar
    v : array, shape (n, 1)
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, np.newaxis]
    sigma = linalg.norm(x[1:, 0]) ** 2
    v = np.vstack((1, x[1:]))
    if sigma == 0:
        beta = 0
    else:
        mu = np.sqrt(x[0, 0] ** 2 + sigma)  # mu = ||x||
        if x[0, 0] <= 0:
            v[0, 0] = x[0, 0] - mu
        else:
            # equivalent formula that avoids cancellation when x[0] > 0
            v[0, 0] = - sigma / (x[0, 0] + mu)
        beta = 2 * (v[0, 0] ** 2) / (sigma + v[0, 0] ** 2)
        v /= v[0, 0]
    return beta, v
As promised, this computes the parameters of $P$ such that $P x = \|x\| e_1$,
exact to 15 decimal places:
>>> n = 5
>>> x = np.random.randn(n)
>>> beta, v = house(x)
>>> P = np.eye(n) - beta * np.dot(v, v.T)
>>> print(np.round(P.dot(x) / linalg.norm(x), decimals=15))
[ 1. -0. -0. 0. -0.]
This property is what makes Householder matrices useful in
the context of numerical analysis. They can be used, for example, to
compute the QR decomposition of a given matrix. The idea is to
successively zero out the sub-diagonal elements, leaving a
triangular matrix at the end. In the first iteration we compute a
Householder matrix $P_0$ such that $P_0 X$ has only zeros below the
diagonal in the first column, then a Householder matrix $P_1$
such that $P_1 P_0 X$ also zeroes out the subdiagonal elements of the second
column, and so on. At the end we will have that $P_{n-1} \cdots P_1 P_0 X$ is
an upper triangular matrix. Since each $P_i$ is orthogonal (and symmetric), the
product $Q = P_0 P_1 \cdots P_{n-1}$ is again an orthogonal matrix, namely the
$Q$ matrix in the QR decomposition $X = QR$.
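As an illustration, here is a minimal sketch of this procedure built on the
house function above (qr_householder is my own name for it; in practice one
should use scipy.linalg.qr, which calls optimized LAPACK routines):

def qr_householder(X):
    """
    QR decomposition of X by successive Householder reflections.
    A didactic sketch; use scipy.linalg.qr in practice.
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    R = X.copy()
    Q = np.eye(n)
    for j in range(min(n - 1, m)):
        # P_j, restricted to the trailing (n-j)-by-(n-j) block
        beta, v = house(R[j:, j])
        H = np.eye(n - j) - beta * np.dot(v, v.T)
        R[j:, j:] = H.dot(R[j:, j:])  # zero out column j below the diagonal
        Q[:, j:] = Q[:, j:].dot(H)    # accumulate Q = P_0 P_1 ... P_{n-1}
    return Q, R

One can check that Q.T.dot(Q) is the identity and that Q.dot(R) reconstructs
X, up to rounding.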
If we choose $X$ to be a 20-by-20 random matrix, with colors representing
different values, we can see the process of the Householder matrices being
applied one by one until an upper triangular matrix is obtained.
A similar application of Householder matrices is to reduce a given
symmetric matrix to tridiagonal form. This proceeds much as in the
QR case, except that now we multiply the matrix $X$ by the Householder
matrices on both the left and the right. Also, in this case
we seek Householder matrices that zero out the elements below the
first subdiagonal, instead of all subdiagonal elements. This
algorithm is used, for example, as a preprocessing step by most dense
eigensolvers.
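To make this concrete, here is a sketch in the same spirit, again built on
house (tridiagonalize is my own name for it; SciPy exposes this kind of
reduction through scipy.linalg.hessenberg, and for symmetric input the
Hessenberg form is tridiagonal):

def tridiagonalize(X):
    """
    Reduce a symmetric matrix to tridiagonal form by two-sided
    Householder similarity transformations. A didactic sketch.
    """
    A = np.asarray(X, dtype=float).copy()
    n = A.shape[0]
    for j in range(n - 2):
        # zero out the entries of column j below the first subdiagonal
        beta, v = house(A[j + 1:, j])
        H = np.eye(n - j - 1) - beta * np.dot(v, v.T)
        A[j + 1:, :] = H.dot(A[j + 1:, :])  # multiply by P_j on the left
        A[:, j + 1:] = A[:, j + 1:].dot(H)  # ... and on the right
    return A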
Ridge coefficients for multiple values of the regularization parameter
can be elegantly computed by reusing the thin SVD decomposition of
the design matrix:
import numpy as np
from scipy import linalg

def ridge(A, b, alphas):
    """
    Return coefficients for regularized least squares

        min ||A x - b||^2 + alpha ||x||^2

    Parameters
    ----------
    A : array, shape (n, p)
    b : array, shape (n,)
    alphas : array, shape (k,)

    Returns
    -------
    coef : array, shape (p, k)
    """
    U, s, Vt = linalg.svd(A, full_matrices=False)
    d = s / (s[np.newaxis, :] ** 2 + alphas[:, np.newaxis])
    return np.dot(d * U.T.dot(b), Vt).T
This can be used to efficiently compute the regularization
path, that is, the coefficients as a function of the
regularization parameter. Since the bottleneck of the algorithm is the
singular value decomposition, which does not depend on the regularization
parameter, computing the coefficients for other values of that
parameter basically comes for free.
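For example, with synthetic data (the variable names and alpha grid below
are my own choices), the whole path can be computed with a single SVD and
then plotted:

import matplotlib.pyplot as plt

rng = np.random.RandomState(0)
A = rng.randn(50, 10)             # 50 samples, 10 features
b = rng.randn(50)
alphas = np.logspace(-3, 3, 100)  # grid of regularization parameters

coefs = ridge(A, b, alphas)       # shape (10, 100): one column per alpha
plt.plot(alphas, coefs.T)
plt.xscale('log')
plt.xlabel('alpha')
plt.ylabel('coefficients')
plt.show()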
A variant of this algorithm can then be used to compute the
optimal regularization parameter in the sense of leave-one-out
cross-validation; it is implemented in scikit-learn's RidgeCV (for
which Mathieu Blondel has an excellent post). This optimal
parameter is denoted by a vertical dotted line in the following
picture; the full code can be found here.
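A minimal usage sketch of that estimator, reusing the synthetic A and b
from the snippet above (the alpha grid is arbitrary):

from sklearn.linear_model import RidgeCV

model = RidgeCV(alphas=np.logspace(-3, 3, 100)).fit(A, b)
print(model.alpha_)  # regularization value selected by leave-one-out CV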
Update: a fast and stable norm was added to scipy.linalg in August
2011 and will be available in scipy 0.10.

Last week I discussed with Gael how we should compute the Euclidean
norm of a vector a using SciPy. Two approaches suggest themselves:
either calling scipy.linalg.norm(a) or computing sqrt(a.T a). But as
I learned later, both have issues. Note: I use single-precision
arithmetic for simplicity, but similar results hold for double-precision.
Overflow and underflow
Both approaches behave terribly in the presence of very large or very
small numbers. Take for example an array with a single entry:
In [0]: a = np.array([1e20], dtype=np.float32)
In [1]: a
Out[1]: array([1.00000002e+20], dtype=float32)
In [2]: scipy.linalg.norm(a)
Out[2]: inf
In [3]: np.sqrt(np.dot(a.T, a))
Out[3]: inf
That is, both methods return infinity. However, the correct answer is
$10^{20}$, which comfortably fits in a single-precision
float. Similar examples can be found where numbers underflow.
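For instance, with a vector whose single entry is $10^{-25}$, the square
underflows to zero even though the norm itself is perfectly representable
in single precision:
In [4]: a = np.array([1e-25], dtype=np.float32)
In [5]: np.sqrt(np.dot(a.T, a))
Out[5]: 0.0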
Stability
Again, scipy.linalg.norm behaves terribly as far as numerical
stability is concerned. In the presence of entries of different
magnitudes, severe cancellation can occur. Take for example an array
whose first entry is 10000, followed by 10000 ones:
a = np.array([1e4] + [1]*10000, dtype=np.float32)
In this case, scipy.linalg.norm will discard all the ones, producing
In [3]: linalg.norm(a) - 1e4
Out[3]: 0.0
when the correct answer is 0.5. In this case $\sqrt{a^T a}$ behaves
much more nicely, since the result of a dot product on single-precision
inputs is accumulated in double precision (but if double precision is
used, results won't be accumulated in quadruple precision):
In [4]: np.sqrt(np.dot(a.T, a)) - 1e4
Out[4]: 0.5
BLAS BLAS BLAS ...
The BLAS function nrm2 automatically scales its input, which makes it
more stable and tolerant to overflow. Luckily, SciPy provides a
mechanism to call some BLAS functions directly:
In [5]: nrm2, = scipy.linalg.get_blas_funcs(('nrm2',), (a,))
Using this function, no overflow occurs (hurray!):
In [95]: a = np.array([1e20], dtype=np.float32)
In [96]: nrm2(a)
Out[96]: 1.0000000200408773e+20
and stability is greatly improved, reusing the array from the stability
example above:
In [98]: a = np.array([1e4] + [1]*10000, dtype=np.float32)
In [99]: nrm2(a) - 1e4
Out[99]: 0.49998750062513864
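The idea behind this robustness is to scale before squaring, so that the
squared values stay in a representable range. A rough two-pass Python
sketch of that idea (the real nrm2 scales on the fly in a single pass):

def scaled_norm(a):
    """
    Euclidean norm with explicit scaling: dividing by the largest
    magnitude prevents the squares from overflowing or underflowing.
    """
    scale = np.abs(a).max()
    if scale == 0:
        return 0.0
    return scale * np.sqrt(np.sum((a / scale) ** 2))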
Update: as of scipy 0.10, this function is used by scipy.linalg.norm.
Timing
Computing the 2-norm of an array is a very cheap operation, so
computations are usually dominated by external factors, such as
memory-access latency or overhead in the Python/C layer. Experimental
benchmarks on an array of size $10^7$ show that nrm2 is marginally slower
than $\sqrt{a^T a}$, because the scaling has a cost, but it is also
more stable and less prone to overflow and underflow. They also show that
scipy.linalg.norm is the slowest (and numerically worst!) of all:
$\sqrt{a^T a}$ | BLAS nrm2(a) | scipy.linalg.norm(a)
0.02 | 0.02 | 0.16