by

Robert Murray, Ph.D.

www.omicronrsch.com

(Revised October 19, 2006)

Most technical analysts are very
familiar with the ordinary simple moving average (MA) and exponentially
weighted moving average (EWMA). These are methods for *smoothing*
the price (or other) series, and various technical indicators such as the MACD
can be constructed from them. These methods date from a time when the only
tools available were graph paper and manual calculation. But now that personal
computers are available, a much wider range of methods is feasible for constructing technical
indicators. This article will describe the **Savitzky-Golay smoothing
filter**, which is a much more sophisticated version of the simple moving
average. This filter is widely available as a routine in C, Basic, or
other languages, from __Numerical Recipes__ (see references). I have
been using the C routine, but perhaps the Basic routine could be incorporated
in a VBA program and used with Excel. The book __Numerical Recipes__
also contains an excellent discussion of this and other types of filters and
other routines (along with the code itself), upon which the present article is
based. (This book also contains many other techniques of Numerical
Analysis that would be useful for technical analysts.)

The simple moving average is the
most basic type of smoothing filter. An average is taken over the past *N*
prices, from the current price to the price *N*-1 days ago, and this
average is taken as the current value of the smoothed price. The
number *N* is the width of the smoothing “window”. This type of
filter is called *causal* because the current value of the smoothed price
depends only on current and past prices – no future prices are used.
Also, there is an inherent time delay in this type of filter of *N*/2 days
(assuming daily data), which is the delay of the “center of gravity” of the
smoothing window relative to the current time. For example, if the price
data were a straight line with a given slope, the average value in the
smoothing window of this straight line would be its value in the middle of the
window. This value is taken as the current value of the smoothed
price. But the middle of the window is delayed by *N*/2 days
relative to the current time, so the smoothed price is delayed by *N*/2
days relative to the current price.
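
This *N*/2 lag can be checked directly. The following is a minimal Python/NumPy sketch (not from any published routine) of a causal simple MA applied to a straight-line price series:

```python
import numpy as np

def simple_ma(prices, n):
    """Causal simple MA: average of the current price and the previous n-1."""
    prices = np.asarray(prices, dtype=float)
    out = np.full(len(prices), np.nan)   # undefined until n prices exist
    for t in range(n - 1, len(prices)):
        out[t] = prices[t - n + 1 : t + 1].mean()
    return out

# On a line rising 1.0 per day, the smoothed value equals the price at the
# center of the window, (n-1)/2 days ago -- a lag of about n/2 days.
line = np.arange(100.0)
ma = simple_ma(line, 10)
print(line[99] - ma[99])   # 4.5
```

The smoothed line is exactly the price series shifted back by half the window width, which is the "center of gravity" delay described above.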

A moving average is used to
smooth out the short-time period fluctuations, presumed *random*, in order
to be able to see the longer-term cycles, which are presumed to be the true
“signal”. A good discussion of moving averages is in the book by Pring, __Technical
Analysis Explained__. Pring states that, depending on the time period
of the cycles you are trying to capture, different window widths *N* of
the moving average are appropriate. This then presumes the existence of
discrete cycles in the data, which can be viewed as a *deterministic signal*
buried in the *stochastic noise*. So the MA can be thought of, from
the point of view of Signal Processing, as a *de-noising* method.
This is also discussed in __Numerical Recipes__. Presumably, the
signal corresponds to the lower-frequency, longer-period part of the *spectrum*,
while the higher-frequency, shorter time-period part consists of noise.
The MA is a *low-pass filter*, which filters out the high-frequency
noise, leaving the low-frequency signal. So the smoothed price series
from the MA is interpreted as a representation of the *signal*, consisting
of cycles of various long-period components. But the periods of these
cyclic signal components are unknown, and the MA with window width *N*
introduces a *time lag* of *N*/2 days, which will correspond to an
unknown *phase shift* that depends on the cycle period. This is the
trouble with using a causal filter. The unknown phase shift might have a
serious effect, unless in the tradition of Technical Analysis it is assumed
that there is a long-term trend that is more or less linear and constant,
except that it suddenly reverses direction after persisting over a fairly long
time. If this were the case, then the phase shift of the MA would be
negligible if the window width *N* is small compared to the time period
between trend reversals. But if the period of the signal cycle is not
much larger than *N*, the phase shift will be large compared to the cycle
period and hard to estimate.

The Savitzky-Golay (SG) smoothing
filter is described in __Numerical Recipes__. It has a wide variety of
configurations, and the simple moving average is obtained as the lowest-order
version of the SG filter. This filter can be set to be either *causal*
or *acausal* (as could the simple MA, for that matter). The *N*-day
window for this filter can be set as in the simple MA to be from the current
time to *N*-1 days in the *past*, or from the current time to *N*-1
days in the *future*, or anything between these values. The *acausal*
filter I prefer has a *symmetrical* window about the present time, so that
using an even value of the window width *N*, the current time is in the
middle of the window. Then the time lag of the smoothing filter will be
zero.

According to __Numerical Recipes__,
using the symmetrical window just described, suppose there is a sharp peak in
the price data that is being smoothed. Then the moving average window
preserves the *area* under this sharp peak, which is called the zero’th
moment. Also, as described above, the *position* of the peak is
preserved, which is called the first moment. But the simple MA smoothing
flattens the peak out, so the *width* of the peak, which is called the
second moment, is not preserved. The idea of the Savitzky-Golay smoothing
filter is to find a filter that smoothes the data but preserves the
higher-order moments, so in particular the widths of features in the data are
preserved (along with the positions – no time lag). So the acausal SG
smoothing gives a more faithful smoothed representation of the price data than
a simple MA.

How the Savitzky-Golay smoothing
filter works is as follows. Consider again the simple MA filter. In
the *N*-day smoothing window, the value of the smoothing at the right end
of the window is taken to be equal to the *average* of all the price
values in the window. In other words, we are taking a *least-squares
fit* of the *N* prices in the window, to a straight horizontal line or
constant value (zero’th order polynomial), and taking the value of the smoothing
at the right end of the window to be equal to this constant value. This
is equivalent to the zero’th-order SG filter. In the SG smoothing filter,
on the other hand, a *least-squares fit* is made to a higher-order
polynomial. A first-order polynomial is a straight line with a constant
slope, described by a linear function. This would lead to a first-order
SG filter if the data in the window were fit to this straight line. A
second-order polynomial is a parabola, described by a quadratic function.
This would lead to the second-order SG filter if the data were fit to a
parabola. In my applications I have been using the fourth-order SG
filter, in which the price data in the window are fit to a fourth-order
polynomial. In general, to preserve the widths of narrow features, the
filter order should increase with the width *N* of the smoothing
window. In practice, typical filter orders are two, four, or occasionally six.
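
The least-squares construction can be sketched directly. The snippet below (a NumPy illustration, not the Numerical Recipes C routine) builds symmetric-window SG coefficients from a polynomial fit and confirms that the zero’th-order case reduces to the simple MA weights:

```python
import math
import numpy as np

def savgol_coeffs(window, order, deriv=0):
    """SG filter coefficients for an odd symmetric window.

    Fitting the window's data with a least-squares polynomial and evaluating
    the fit (or its deriv-th derivative) at the center is a linear operation,
    so it reduces to a fixed set of weights."""
    half = window // 2
    x = np.arange(-half, half + 1)
    A = np.vander(x, order + 1, increasing=True)     # columns 1, x, x^2, ...
    # Row `deriv` of the pseudo-inverse picks out the x^deriv coefficient of
    # the fit; times deriv! it is the derivative at the window center.
    return np.linalg.pinv(A)[deriv] * math.factorial(deriv)

def sg_smooth(y, window, order, deriv=0):
    """Apply the coefficients in a sliding symmetric (acausal) window."""
    c = savgol_coeffs(window, order, deriv)
    half = window // 2
    out = np.full(len(y), np.nan)
    for t in range(half, len(y) - half):
        out[t] = c @ y[t - half : t + half + 1]
    return out

# Order zero is just the simple moving average:
print(np.allclose(savgol_coeffs(11, 0), np.full(11, 1 / 11)))   # True
```

Because the coefficients do not depend on the data, they are computed once and then applied to the whole price series, exactly as with the book's two-routine (coefficients, then convolution) arrangement described later.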

Another feature of the
Savitzky-Golay smoothing filter is its capability to compute the *derivative*
of the smoothed price data. From Calculus, the *first derivative* of
a function is its rate of change at a point, or the slope of the tangent line
to the function at a point. The *second derivative* is the
derivative of the first derivative, and this measures the curvature of the
function at a point. By analogy with Physics, I compute three different
acausal smoothings of the (logarithmic) price data. The first I call the **Relative
Price**, which is the smoothed price with an *N*-day window, minus the
smoothing with a 1024-day window. This is similar to a kind of MACD
indicator. The second indicator is the *first derivative* smoothing
with an *N*-day window, which I call the **Velocity** because velocity
is the first time derivative of position. It is the first *rate of
change* of the smoothed price. The third indicator is the *second
derivative* smoothing with an *N*-day window, which I call the **Acceleration**
because acceleration is the second time derivative of position. It is the
second *rate of change* of the smoothed price. Since the filtering
is acausal and there is no time lag, the *phase* relationships between
these three indicators are preserved, and are analogous to the phase
relationships that exist between a pure sine wave or cycle and its first and
second derivatives. This combination of three indicators I call the **Harmonic
Oscillator**, and I would like to discuss these in greater detail in a
separate article.
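
A sketch of these three indicators in Python/NumPy follows; the synthetic log-price series and the window widths (41 and 201 days, standing in for the article's *N*-day and 1024-day windows) are purely illustrative:

```python
import math
import numpy as np

def sg_fit(y, window, deriv=0, order=4):
    """Acausal 4th-order SG smoothing: least-squares polynomial fit in each
    symmetric window, with the deriv-th derivative taken at the center."""
    half = window // 2
    x = np.arange(-half, half + 1)
    out = np.full(len(y), np.nan)
    for t in range(half, len(y) - half):
        c = np.polynomial.polynomial.polyfit(x, y[t - half : t + half + 1], order)
        out[t] = c[deriv] * math.factorial(deriv)
    return out

rng = np.random.default_rng(0)
logp = np.cumsum(0.001 + 0.02 * rng.standard_normal(400))  # synthetic log prices

short_w, long_w = 41, 201                       # hypothetical window widths
rel_price = sg_fit(logp, short_w) - sg_fit(logp, long_w)   # MACD-like
velocity = sg_fit(logp, short_w, deriv=1)       # smoothed 1st derivative / day
acceleration = sg_fit(logp, short_w, deriv=2)   # smoothed 2nd derivative / day^2
```

Since all three come from the same zero-lag acausal fit, their phase relationships match those of a cycle and its first two derivatives, as discussed next.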

Any smoothing filter is a *low-pass
filter*, meaning that it passes the low frequencies and suppresses the
higher frequencies. With a smoothing window *N* time units in
length, the cutoff frequency of the filter will correspond approximately to
cycles with a period twice this long, so that a half wavelength is *N*
units long. So I call the cycle with period (wavelength) 2*N* the **dominant
cycle**. If a price time series, with daily prices, is filtered by a
smoothing filter with window length *N*, the dominant cycle will be
essentially the highest frequency that is passed by the filter, and the
smoothed price series will look roughly like a wave with a period of 2*N*.
The three indicators discussed above, the Relative Price, Velocity, and
Acceleration, will then each look roughly like waves with period 2*N*.
But the three waves will differ in phase relative to each other. Because
it is an acausal filter with no time lag, the Relative Price wave will be *in
phase* with the fluctuations of the price series itself. The Velocity
is then one-quarter cycle or 90 degrees ahead of the Relative Price, so it is *N*/2
time units ahead. This is because when the Relative Price is at a minimum
[min], the Velocity or slope of the Relative Price is zero and increasing, or
is at an upward moving zero crossing [Z+]. When the Relative Price is
crossing zero moving upward, the Velocity is at its positive maximum
[max]. When the Relative Price is at a maximum, the Velocity is again
zero, but decreasing, so it is at a downward moving zero crossing [Z–].
So it can be seen that the Velocity cycle is 90 degrees or a quarter cycle
ahead of the Relative Price. For the same reasons, since the Acceleration
is the derivative of the Velocity, it is 90 degrees or a quarter cycle ahead of
the Velocity. Keeping track of these phase relationships is obviously
important if we want to use these technical indicators to time buy/sell points,
especially if the cycle we are timing is the dominant cycle or close to
it. An example of this timing problem involving the MACD will be given
shortly.
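
These phase relationships can be verified on an idealized cycle. The sketch below, with a hypothetical 80-day period, confirms that at a minimum [min] of the "price" cycle its derivative sits at an upward zero crossing [Z+]:

```python
import numpy as np

period = 80                       # hypothetical dominant-cycle period (2N)
t = np.arange(400)
price = np.sin(2 * np.pi * t / period)                            # idealized cycle
velocity = (2 * np.pi / period) * np.cos(2 * np.pi * t / period)  # its derivative

tmin = 3 * period // 4            # a minimum of the price cycle [min]
print(price[tmin])                # the minimum, -1.0
print(velocity[tmin])             # ~0, with velocity[tmin + 1] > 0: a Z+ crossing

# The cosine leads the sine by a quarter period, so the Velocity cycle is
# period/4 = N/2 time units ahead of the price cycle, as stated in the text.
```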

As stated previously, the SG
filter can be used for any position of the smoothing window relative to the
current time. If the current time lies on the right end of the window,
then the SG filter is a *causal* filter just like the simple MA. In
fact, the simple MA in this case is the zero’th-order SG filter. But
using the fourth order SG filter, the data in the window are fit by a
fourth-order polynomial instead of a zero’th-order polynomial (constant average
value). The value of the smoothed data is taken as the value of this
polynomial at the right end of the window (current time). But it can be
seen that this will drastically reduce the apparent *time lag* if a
fourth-order causal SG filter is used, as opposed to the simple MA.
Considering again the example of a straight-line price series, if an SG filter
of first order or higher is used, then the fit of this price series to the
polynomial inside the smoothing window is *exact*, so there would be *no
time lag* in this (unrealistic) case. Using the fourth-order SG
filter, if the price data were any polynomial up to fourth order, the time lag
would likewise be *zero*, and the smoothing of the price curve would be *exactly*
the same as the price curve itself. For realistic cases, then, the
fourth-order causal SG smoothing will have virtually no apparent time lag,
compared to simple MA smoothing, so this is a big advantage while still
retaining the *causal* filtering. (However, I prefer the *acausal*
smoothing because it is automatically a *zero-phase* filter with no time lag
at all, so that the phase relationships with the data are preserved.)
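
A sketch of the causal variant (again a NumPy illustration, not the book's routine): fit the past *n* prices with a fourth-order polynomial and take the fit's value at the right end of the window. On a straight-line series the fit is exact, so the lag vanishes:

```python
import numpy as np

def causal_sg(y, n, order=4):
    """Causal SG smoothing: least-squares polynomial fit of the latest n
    points, evaluated at the right end of the window (the current time)."""
    out = np.full(len(y), np.nan)
    x = np.arange(-(n - 1), 1)          # window coordinates ending at x = 0
    for t in range(n - 1, len(y)):
        c = np.polynomial.polynomial.polyfit(x, y[t - n + 1 : t + 1], order)
        out[t] = c[0]                   # the fit's value at x = 0
    return out

# A straight-line "price" is reproduced exactly -- no time lag -- whereas the
# simple MA (the order-0 case) would lag by about n/2 days:
line = np.arange(100.0)
print(causal_sg(line, 20)[99] - line[99])   # ~0: no lag on a line
```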

To use the SG filter, there is a
routine in __Numerical Recipes__ that computes the Savitzky-Golay filter
coefficients, given the number of data points to the left and right of the
current time in the data window, the order of the derivative, and the order of
the smoothing (which I take to be four). This routine just computes the
filter coefficients – it does not depend on the price data series itself.
Then the filter coefficients are utilized in a *convolution* with the
price data to produce the smoothed price data. A separate routine is used
for the convolution, which takes as input the price data series of a given
length, the SG filter coefficients, and outputs the smoothed price
series. This convolution routine uses the **Fast Fourier Transform**,
so the length of the data input must be an integer power of two. This is
accomplished by *zero-padding* the price data, meaning filling up the rest
of the array of length *n* with zeros after the price series. The
zero-padding is necessary for the convolution function to avoid the
“wrap-around problem”, as described in __Numerical Recipes__. This can
happen when there is not enough zero-padding and the data near the two ends of
the data series get mixed together, since the routine treats the data as being *periodic*.
Generally, the minimum number of zeros needed is of the order of the filter window width *N*,
but it does not hurt to use more zeros. Also, I usually de-trend the data
by subtracting a line passing through the first and last data points, then add
this line back after the smoothing. This eliminates any kind of
transients that might occur at the end points, in particular at the *present
point in time*, due to a drastic discontinuity between the price data and
zero.
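
The de-trend / zero-pad / convolve procedure can be sketched as follows (a NumPy stand-in for the Numerical Recipes convolution routine; it assumes a symmetric smoothing kernel):

```python
import numpy as np

def fft_smooth(prices, kernel):
    """FFT convolution with a symmetric acausal smoothing kernel.

    De-trends with the line through the first and last points, zero-pads to a
    power of two (avoiding the wrap-around problem), convolves via the FFT,
    then adds the trend back."""
    prices = np.asarray(prices, dtype=float)
    n, m = len(prices), len(kernel)
    half = m // 2
    trend = prices[0] + (prices[-1] - prices[0]) * np.arange(n) / (n - 1)
    resid = prices - trend               # end points are now exactly zero
    size = 1
    while size < n + m:                  # padding of at least the kernel width
        size *= 2
    kpad = np.zeros(size)                # wrap the kernel so its center sits at index 0
    kpad[: half + 1] = kernel[half:]
    kpad[-half:] = kernel[:half]
    smooth = np.fft.irfft(np.fft.rfft(resid, size) * np.fft.rfft(kpad), size)
    return smooth[:n] + trend

# In the interior this matches a direct windowed average:
rng = np.random.default_rng(1)
prices = 100.0 + np.cumsum(rng.standard_normal(200))
sm = fft_smooth(prices, np.full(9, 1 / 9))
print(abs(sm[100] - prices[96:105].mean()) < 1e-8)   # True
```

Near the ends, the circular convolution picks up only zeros from the padding, which is what lets the smoothing extend right to the ends of the series, as discussed next.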

It should be noted again that
this convolution routine uses the technique of **Fast Fourier Transform (FFT)**.
This means that the smoothing curve can be extended right to the end of the
price data series and beyond. This is unlike the situation where the
moving average is computed directly. For example, in conventional
smoothing with a simple MA of window width *N*, the first smoothed value
would start *N* days after the first day of prices. If acausal
smoothing is used in the conventional smoothing, the smoothed values would
begin *N*/2 days after the first data value, and end *N*/2 days
before the current point in time. But using the convolution routine based
on the FFT, the smoothing values can be *extended* right to the end of the
price series and beyond, even for acausal smoothing. This is another nice
feature of the SG smoothing filter, compared to conventional smoothing
techniques.

When the smoothing is performed,
the output of the SG filter actually extends *beyond* the *present point
in time* into the *future*, where there was only zero-padding
originally. In this way the SG filter produces a **Price
Projection**. This is not the best way to make a price projection – a
better way is to use a **Linear Prediction** filter. However, in a
trending market such as in the late ‘90s, this projection would have a certain
degree of validity. This is merely a more sophisticated version of the
simple method of making a projection from drawing a *trend line* through
the data and extending it into the future, on the presumption that the trend
will be continued. This will be valid in a trending market, one that
exhibits **trend persistence**. This type of market is one for which
the *spectrum* of returns is *peaked* at the low end of the spectrum,
corresponding to the constant trend line or the low-frequency cycles.
These low-frequency cycles can be regarded as the *signal* buried in
high-frequency *noise*. So the price is extrapolated by filtering
out all frequency components with periods less than 2*N*, with a smoothing
window of length *N*, and then these low-frequency components are
extrapolated by the smoothing filter into the future. These low-frequency
components presumably correspond to the *signal*, while the high-frequency
components are the stochastic *noise*, so what we are doing is using a
simple *de-noising* procedure. This type of procedure is also
described in __Numerical Recipes__.

What I normally do is use the
Savitzky-Golay smoothing filter in conjunction with a **Linear Prediction**
filter for the **Price Projection**. The Linear Prediction filter
gives a Price Projection based on **correlation** in the past price series,
and this Price Projection is added on to the end of the past price series to
extrapolate into the future. Then the whole series of past plus future
projected prices is smoothed using the Savitzky-Golay smoothing filter.
By using the smoothing filter on the Price Projection, I hope to filter out the
extraneous high-frequency stochastic *noise*, leaving the lower-frequency
modes, which hopefully contain the *signal*. It should be noted that
in the approach described above, where the past price data are de-trended by a
trend line passing through the first and last points of the past data, this is
equivalent to using a Price Projection consisting simply of a straight trend
line with slope equal to the average return, extending from the last price
point into the future. Then the past price series and this straight-line
future Price Projection are smoothed together with the SG filter, yielding an
improved Price Projection.

Incidentally, it should be
mentioned here that the other simple type of moving average filter is the **exponentially
weighted moving average (EWMA)**. This type of MA is not directly
related to the SG smoothing filter, which uses a finite window of length *N*,
while the EWMA uses an infinitely wide (ideally) window. But it has been
shown [Durbin & Koopman (2001)] that the EWMA is the **Linear Prediction**
filter for an MA(1) process, a *moving average stochastic process* of
order 1. So this is another example of a smoothing filter that can be
used as a Linear Prediction filter. This simple type of prediction filter
evidently works well in some situations, those that are well described by the
MA(1) process (a special case of an ARMA process). Probably it has been
used for the same purpose in Technical Analysis in the past.
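
For reference, the EWMA with weights proportional to exp[-*t*/*N*] can be computed with the usual one-step recursion; the sketch below also exhibits its larger time lag (about *N* days, versus *N*/2 for the simple MA) on a straight-line series:

```python
import numpy as np

def ewma(prices, n):
    """EWMA with weights proportional to exp(-t/n) over past lags t = 0, 1, 2,
    ..., computed recursively (an ideally infinite window)."""
    decay = np.exp(-1.0 / n)
    out = np.empty(len(prices))
    s = prices[0]
    for i, p in enumerate(prices):
        s = (1 - decay) * p + decay * s
        out[i] = s
    return out

# On a line rising 1.0 per day, the steady-state lag is decay/(1 - decay),
# which approaches n for large n (about 19.5 days here for n = 20):
line = np.arange(2000.0)
print(line[-1] - ewma(line, 20)[-1])
```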

The **Moving Average Convergence
Divergence (MACD)** is defined by Pring as a shorter-term EWMA divided by a
longer-term EWMA. These are the moving averages of prices, whereas up to
now I have been talking about *logarithmic* prices. If you take the
logarithm of the *quotient* of two prices, this is the *difference*
of the logarithms of the prices. So I use a logarithmic MACD based on log
prices, which is the *difference* of a shorter-term MA minus a longer-term
MA. If this indicator is based on the acausal SG smoothing instead of the
EWMA, then an example of this kind of MACD indicator would be the **Relative
Price** indicator.

The **MACD** can be used to
time trades. If we idealize the prices as following a definite cycle with
some specified period, then we would want to buy at the low points and sell at
the high points. These buy/sell points would then be the minima/maxima of
the Relative Price indicator defined above, and they would be the
positive/negative zero-crossing points of the Velocity indicator. These
latter points will always be in phase with the buy/sell points using the
acausal SG filter with zero time lag to define the Velocity indicator (at least
for *past* price data). However, the positive/negative zero-crossing
points of the MACD indicator are often used as buy/sell points, and this will
necessarily involve a certain amount of time lag. If the cycle we are
trying to time has a period much longer than the period of the longer EWMA in
the MACD, then the MACD acts as a sort of time-delayed Velocity, so these
buy/sell points are approximately correct (compared to the long time period of
the cycle). However, for shorter period cycles the MACD buy/sell points
will be seriously delayed, by an amount that may be hard to determine.
This is the problem with using causal MA’s for determining buy/sell points.
The buy/sell points determined from an acausal MA with zero time lag, in the *past*
data, will always be in phase (using the Velocity indicator). However, of
course, to use the acausal MA to predict *future* buy/sell points requires
the use of some kind of **Price Projection**, which can itself introduce a
phase lag of unknown amount. So neither method is perfect, but at least
the zero-lag acausal filter gives the correct buy/sell points in the *past*
data.

Now suppose the cycle period that we are trying to time is much shorter than the longer time period of the MACD. Then if the MACD were constructed using the acausal SG smoothing with no time lag, it would be equivalent to the Relative Price indicator discussed above. We would be timing the buy/sell points according to the positive/negative zero crossings of the Relative Price. But we have already seen that the correct buy/sell points are the positive/negative zero crossings of the Velocity indicator, and the Relative Price lags this indicator by a quarter cycle or 90 degrees of phase. So the buy/sell signals will be too late by this amount if they are based on the positive/negative zero crossings of the MACD. If the cycle period is on the order of the longer time period MA of the MACD, then the phase delay will be somewhere between 0 and 90 degrees, even if the MACD is constructed using the zero time lag acausal SG smoothing.

If the MACD is constructed using
a simple MA or EWMA, there will be an additional time lag in the buy/sell signals
of *N*/2 or *N* days. The simple MA with window width *N*
has a time lag of *N*/2. However, if you define the EWMA as exp[-*t*/*N*],
then it will also have an “effective width” *N* (equal “areas” of the two
windows), but a time lag of *N* instead of *N*/2 (from doing a simple
integral). So if the length of the cycle we are trying to time is 2*N*,
corresponding to the *dominant cycle* of the shorter time period MA in the
MACD, then the simple MA smoothing will give an additional quarter cycle of
phase delay, and the EWMA smoothing will give an additional half cycle phase
delay, on top of the quarter cycle due to using the crossing points of the
Relative Price instead of the Velocity. If the simple MA is used to time
the *dominant cycle*, this means that the buy/sell signals are *180
degrees out of phase* with the correct signals. But we don’t know
exactly what the phase delay is, because in the real situation we don’t know
the periods of the cycles we are trying to time. The actual price action
will be a sum of many cycles, and the actual phase delay cannot be determined.
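
The time lags quoted here are the centers of gravity of the two windows; the "simple integral" for the EWMA window exp[-*t*/*N*] is:

```latex
\mathrm{lag}_{\mathrm{EWMA}}
  = \frac{\int_0^\infty t \, e^{-t/N} \, dt}{\int_0^\infty e^{-t/N} \, dt}
  = \frac{N^2}{N} = N,
\qquad
\mathrm{lag}_{\mathrm{MA}}
  = \frac{1}{N}\int_0^N t \, dt = \frac{N}{2}.
```

The denominator in the first expression, the integral of exp[-*t*/*N*] from 0 to infinity, equals *N* – the same as the area of a flat MA window of unit height and width *N* – which is the "equal areas" statement above.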

The way I have used this type of
indicator is to actually *measure* the **correlation** of the indicator
with **future returns**, for a given fixed **time horizon**, and then
*adjust the phase* of the indicator for *maximum correlation*. In
this way any of the various technical indicators based on various types of
smoothing could be used to time buy/sell points, once the *phase* of the
indicator is adjusted for **maximum correlation with future returns**.
But if this is not done, there is no way to tell for sure which indicator will
give the correct phase relationships, because that depends on the periods of
the various cycles we are trying to time. In other words, it depends on
the **correlation** structure of the returns time series. For the *past*
data, the acausal Savitzky-Golay smoothing filter with zero time lag yields a
Velocity indicator that is in phase with the returns, but for the future
returns there is still an unknown phase shift due to the smoothing of the **Price
Projection**. So to be sure of the correct phase relationships and
buy/sell points, this phase shift should be *measured* and corrected for,
by measuring the **correlation** of the indicator with **future returns**.
The correlation tests that can be used for this are also contained in __Numerical
Recipes__, although it is not hard to make one up from scratch.
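
A sketch of this phase-measurement step (a simple NumPy scan made up from scratch, not the Numerical Recipes correlation routines): shift the indicator against future returns and keep the shift with the highest correlation. The period and horizon below are hypothetical:

```python
import numpy as np

def phase_adjust(ind, fut, max_shift):
    """Return (shift, correlation), where the shift s maximizes the
    correlation of ind[t + s] with the future-return series fut[t]."""
    best_s, best_r = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        lo = max(0, -s)
        hi = min(len(fut), len(ind) - s)
        r = np.corrcoef(ind[lo + s : hi + s], fut[lo:hi])[0, 1]
        if r > best_r:
            best_s, best_r = s, r
    return best_s, best_r

# On an idealized cycle, the indicator must be advanced by roughly a quarter
# period (plus half the horizon) to line up with future returns:
period, horizon = 80, 10
t = np.arange(1600)
ind = np.sin(2 * np.pi * t / period)        # a Relative-Price-like cycle
fut = ind[horizon:] - ind[:-horizon]        # future returns over the horizon
s, r = phase_adjust(ind, fut, 40)
print(s)   # near period/4 + horizon/2 = 25
```

In practice the returns contain many cycles plus noise, so the measured correlation (and the best shift) will be far less clean than in this idealized example.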

One excellent area where the acausal Savitzky-Golay smoothing filter can be used is in the construction of Bollinger Bands. Evidently these are usually based on one or another variety of causal moving average, which means that they will have an inherent time lag. Using the acausal SG filter eliminates this time lag. I like to use the acausal SG smoothing with a window width of 1024 days, in which I plot a center curve and two sets of curves on either side of it, corresponding to one and two standard deviations of the log prices away from the center curve. An example of this is shown here for MSFT stock:

The center curve is dark yellow,
the one standard deviation Bollinger Band is dark cyan, and the two standard
deviation Bollinger Band is dark magenta. Also, difficult to see
underneath the price data, is a dark blue acausal smoothing with a window of 40
days. If you assume a return-to-the-mean mechanism, as we seem to have at
the present time for some or most stocks, then the Bollinger Bands can be a useful
guide to buy/sell points. The buy points would be between the lower cyan
and magenta bands, and the sell points would be between the upper cyan and
magenta bands. The price should be within each of these two regions
roughly 15% of the time, if Gaussian statistics are assumed to hold (maybe with
some “fat tails” in addition). As stated previously, all of these curves
can be extended to the present time and beyond, to make a projection, in spite
of the long time scale of the smoothing. Normally I use a **Linear
Prediction** filter of some type for the Price Projection, extending the past
data into the future, and then smooth the whole time series using the SG filter
to obtain the smoothed curves for the Bollinger Bands. But a simple trend
line of the type discussed previously could be used instead of the LP filter.
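
A sketch of such zero-lag Bollinger Bands in NumPy; the 101-day window and synthetic log prices are illustrative stand-ins for the article's 1024-day window and MSFT data:

```python
import numpy as np

def acausal_sg(y, window, order=4):
    """Acausal SG smoothing: least-squares polynomial fit in a symmetric
    window, evaluated at the window center (so there is no time lag)."""
    half = window // 2
    x = np.arange(-half, half + 1)
    out = np.full(len(y), np.nan)
    for t in range(half, len(y) - half):
        c = np.polynomial.polynomial.polyfit(x, y[t - half : t + half + 1], order)
        out[t] = c[0]
    return out

rng = np.random.default_rng(2)
logp = np.cumsum(0.0005 + 0.02 * rng.standard_normal(800))  # synthetic log prices

center = acausal_sg(logp, 101)            # the center curve
resid = logp - center
sigma = np.nanstd(resid)                  # std of log prices about the center
bands = {k: center + k * sigma for k in (-2, -1, 1, 2)}   # 1- and 2-sigma bands

# Fraction of days between the upper 1- and 2-sigma bands -- roughly 15% if
# the residuals were Gaussian:
valid = ~np.isnan(resid)
frac = np.mean((resid[valid] > sigma) & (resid[valid] < 2 * sigma))
print(frac)
```

Here a single overall standard deviation about the center curve is used for simplicity; a rolling standard deviation, as in conventional Bollinger Bands, could be substituted.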

Price smoothing is used in
technical analysis because it provides a way to try to *de-noise* the
price data and discern more clearly any underlying *signal* underneath the
*noise*. A variety of technical indicators can be constructed out of
these smoothing curves, for various time scales of smoothing. In previous
decades, the simple MA and Exponentially Weighted MA were the main methods for
smoothing data, because they were the only methods that were practical to
compute by hand. However, now that we have powerful desktop computers,
the Savitzky-Golay smoothing filter is practical to use, and it offers a number
of advantages over the older methods. The ordinary simple MA is actually
the simplest case of the Savitzky-Golay filter, corresponding to the zero’th
order filter, but the second and fourth order filters are better because they
better preserve the *higher moments* of the data. The *acausal*
filter also has no time lag, and the *causal* filter with the higher order
smoothing has much less apparent time lag than the zero’th order filter (simple
MA). The filter can also compute smoothed *derivatives* of the data,
which offers the potential for an even wider variety of technical indicators to
be constructed. The filter can also serve as a simple *Linear
Prediction* filter for the purpose of constructing a *Price Projection*.
However, there are better and more sophisticated methods for Linear Prediction
that can be used, and these can be used in conjunction with the Savitzky-Golay
smoothing filter. The SG smoothing filter alone is best used for price
projection in trending markets that exhibit *trend persistence*, since the
trend is the low-frequency component of the price series, and the SG smoothing
filter is just a type of *low-pass filter*.

J. Durbin & S. J. Koopman, __Time Series Analysis by
State Space Methods__,

Oxford University Press (2001)

William H. Press, Saul A. Teukolsky, William T. Vetterling, & Brian P. Flannery,

__Numerical Recipes in C: The Art of Scientific Computing, 2nd ed.__,

Cambridge University Press (1992)

Martin J. Pring, __Technical Analysis Explained, 3rd ed.__,

McGraw-Hill (1985)

Robert Murray earned a Ph.D. in theoretical particle physics
from UT-Austin in 1994. He obtained a stockbroker license at about the
same time, and started his financial software company, *Omicron Research
Institute*, soon afterward. This company is devoted to the study of
the Econometrics of the financial markets, and finding optimal trading rules
based on Time Series Analysis and Signal Processing techniques. Robert
has been intensively studying Time Series Analysis, Signal Processing, and
Stochastic Processes since 2001. He has been trading in stocks and
studying Technical Analysis since 1988.