(Revised April 12, 2004)
Robert Murray, Ph.D.
Omicron Research Institute
(Copyright © 2013 Omicron Research Institute. All rights reserved.)
It is common wisdom that the stock market is efficient, meaning that all available information is instantly taken into account by the market, and stocks are fairly priced at all times. This implies that past information cannot be used to predict future stock prices and make a consistent profit. The mathematical expression of this idea is (presumably) that stock price fluctuations (returns) are random Gaussian noise, and hence the price movements of stocks correspond to Brownian motion. This is the content of the Random Walk theory of the stock market. At best, according to this theory, there is a secular trend, which reflects the volatility (risk) of the stock investment.
However, it is clear to most traders and investors that, while the financial markets are indeed very efficient, they are not 100% efficient. The remaining inefficiencies may be utilized to try to make a profit and “beat the market”. The financial markets cannot be completely efficient, because investors and traders are not perfectly informed, are not perfectly rational, and do not react instantaneously to new market information. Instead, some players in the market act on the basis of incomplete information, behave irrationally and display a “herd mentality”, and have a finite “time horizon”. These investor behaviors introduce a small degree of correlation between different market data, which may be exploited by the astute trader or investor. It is our purpose here to try to model these small correlations and develop trading rules to best exploit them. If these trading rules were to become widely known among traders and investors, then they would become ineffective, because the market would become more efficient. If the market were totally efficient, then all financial assets would be fairly priced, volatility would be minimized, and dangerous market manias and crashes would be eliminated. Ideally, we would like to see the market become as efficient as possible, so that all assets are fairly priced. This will be a happy day when it finally arrives, but in the meantime, astute traders and investors can make use of the inefficiencies of the market to make a profit and “beat the market”, at the expense of the less astute traders and investors.
When the market is efficient, the price of stocks is a fair and true reflection of their economic value, and this is the desired situation. Then the most investment capital will be allocated to those companies with the most potential to grow and produce the greatest wealth, and all investors will gain by this. However, if the market is inefficient, then traders who make money on these short-term inefficiencies can only do so when other traders sustain a loss, so it is a zero-sum game. If somebody has to lose because the market is inefficient, it is better for it to be the other guy. However, it would be better for everyone if the market could be made as efficient as possible, so that investment gains are a reflection of an increase in true economic value. This will happen if traders can react to existing market inefficiencies in such a way as to reduce or eliminate them, making the market more stable (and less volatile), rather than causing the market to become even more inefficient and unstable (and more volatile). This should happen, in general, if traders can use good technical indicators to judge the state of the market, and take appropriate action to anticipate impending moves in the market.
The central issue in any active trading strategy is the expected return for a given level of risk. The return is the most likely gain, in percentage terms, per year, and by risk we mean the variance, or average spread of the returns (squared), to be expected per year. In a (stationary Gaussian) stochastic process, both the expected return and the variance (square of the expected “spread” of the returns, or square of the standard deviation) increase proportionally to time. Hence we study the ratio of expected return to variance as a measure of the ratio of return/risk. We begin by studying some of the consequences of the Random Walk model and Efficient Market Hypothesis, and study the expected return and risk associated with an active trading strategy as compared with the “Buy & Hold” strategy.
An efficient market is defined as follows [Sharpe et al. (1995), p.105]:
A (perfectly) efficient market is one in which every security’s price equals its investment value at all times.
This implies that any attempt to identify mispriced securities is futile. An equivalent definition is as follows [Sharpe et al. (1995), p.106]:
A market is efficient with respect to a particular set of information if it is impossible to make abnormal profits by using this set of information to formulate buying and selling decisions.
This implies that in an efficient market, investors should expect to make only “normal” profits and earn a “normal” rate of return on their investments. There are actually three forms of market “efficiency”:
1) Weak-form efficiency means it is impossible to make abnormal profits by using past prices to make decisions about when to buy and sell securities.
2) Semi-strong-form efficiency means it is impossible to make abnormal profits by using all publicly available information to make decisions about when to buy and sell securities.
3) Strong-form efficiency means it is impossible to make abnormal profits by using all information, both public and private, to make decisions about when to buy and sell securities.
The evidence seems to suggest that the market conforms to weak-form efficiency, but it does not conform so well to semi-strong-form efficiency, and even less so to strong-form efficiency [Sharpe et al. (1995)]. However, we will present other evidence to suggest that even the weak-form efficiency is only approximately valid, and it is indeed possible to use past prices alone to formulate profitable short-term trading rules.
But what is the precise meaning of the Efficient Market Hypothesis? Can this hypothesis be formulated in a precise mathematical manner? How can one devise a statistical test to determine the validity of the Efficient Market Hypothesis?
It is already clear that the Efficient Market Hypothesis is rather vague in its formulation. What is the definition of an “abnormal profit”? Presumably this refers to a specific theory of asset pricing such as the Capital Asset Pricing Model (CAPM), in which the expected rate of return is the risk-free rate (T-bill or discount rate) plus a risk premium that is proportional to the volatility of the investment. This model regards the volatility as due to Gaussian stochastic noise superimposed on a constant expected rate of return, which depends on the volatility. So perhaps the Efficient Market Hypothesis should be re-formulated in mathematical terms as follows:
A market is efficient with respect to a particular set of information if the de-trended price returns (over a given time interval) of securities are statistically independent of this set of information.
This would then imply that the prices of securities obey the Random Walk model (with drift), in which the drift velocity is a simple function of the (constant) volatility, and the price fluctuations are due to Gaussian white noise. So we may test this hypothesis by testing for the statistical independence of price returns with the information in question, in particular with the past price returns.
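The test suggested above can be sketched in a minimal simulation. The snippet below generates a Random Walk with drift (all parameter values are illustrative, not calibrated to any real security) and checks that the lag-1 sample autocorrelation of the de-trended returns is statistically indistinguishable from zero, as the hypothesis predicts.

```python
import numpy as np

# Simulate Random Walk (with drift) increments: daily log returns
# r(t) = mu + eps(t), with Gaussian white noise eps ~ N(0, sigma^2).
# mu, sigma, and T are illustrative values, not market estimates.
rng = np.random.default_rng(0)
mu, sigma, T = 0.0004, 0.01, 5000
returns = mu + sigma * rng.standard_normal(T)

# De-trend and compute the lag-1 sample autocorrelation.
x = returns - returns.mean()
rho1 = np.dot(x[1:], x[:-1]) / np.dot(x, x)

# Under the null of independent increments, rho1 ~ N(0, 1/T) asymptotically,
# so it should fall within a few standard errors of zero.
std_err = 1.0 / np.sqrt(T)
print(f"lag-1 autocorrelation = {rho1:+.4f}, standard error = {std_err:.4f}")
```

The same check applied to real return series, over long samples, is what motivates the claim that weak-form efficiency is at least approximately valid.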
In The Econometrics of Financial Markets (1997), p.20, CLM offer another definition of an efficient market, which is in turn quoted from Malkiel (1992):
A capital market is said to be efficient if it fully and correctly reflects all relevant information in determining security prices. Formally, the market is said to be efficient with respect to some information set…if security prices would be unaffected by revealing that information to all participants. Moreover, efficiency with respect to an information set…implies that it is impossible to make economic profits by trading on the basis of [that information set].
These authors make the point that “perfect efficiency is an unrealistic benchmark that is unlikely to hold in practice”. But they consider relative efficiency, or “the efficiency of one market measured against another”, to be a more useful concept. They go on to make the following statement [CLM (1997), p.24]:
Few engineers would ever consider performing a statistical test to determine whether or not a given engine is perfectly efficient—such an engine exists only in the idealized frictionless world of the imagination. But measuring relative efficiency—relative to the frictionless ideal—is commonplace. Indeed, we have come to expect such measurements for many household products: air conditioners, hot water heaters, refrigerators, etc. Similarly, market efficiency is an idealization that is economically unrealizable, but that serves as a useful benchmark for measuring relative efficiency.
So the conclusion is that the Efficient Market Hypothesis is a useful idealization and is approximately true in the real world, but that market inefficiencies can and do exist. It is then up to the astute trader to find and make use of these inefficiencies in order to make profitable trades and “beat the market”.
If investors had perfect access to all economic and market data, had perfect judgment and foresight and knew the correct way to react to all situations, and could react instantaneously, then the financial markets would be perfectly efficient. However, this is not the case. Some investors are more astute than others in these areas, and these astute investors will be able to win and “beat the market”. Some of the main sources of inefficiency in the stock market are the following (arranged in approximate order of importance):
Some of the ways in which technical (and/or fundamental) indicators can be used to profit in the market are the following (arranged in approximate order of importance):
From these lists of market inefficiencies and ways to take advantage of them, it should be clear that it is of paramount importance to maintain an accurate picture of the long-term outlook for the economy, the market, and the individual company. Trading strategies based only on short-term indicators, that ignore the long-term picture, will lose, because there is no way to judge short-term value apart from the long-term context. But when viewed within the context of the long-term picture, technical indicators can be used for short-term trading to increase average returns (with a corresponding increase in risk), while maintaining an appropriate long-term investment strategy. Thus technical indicators should be used within the context of overall portfolio selection and strategy, to maximize returns while minimizing risk.
It should be added that, logically, the market must be inefficient to some extent. The market is kept efficient by the actions of short-term traders and arbitrageurs, but these types of traders are motivated by the profit motive, based on their perception that the market has inefficiencies that can be exploited. If the market were perfectly efficient, then there would be no incentive to do any short-term trading, and then the market would become inefficient because there would be no mechanism to keep it efficient. The conclusion is that the market is maintained at a level of inefficiency that is just sufficient to motivate short-term traders to keep it at that level of efficiency. Thus, if a trader is able to exploit the residual inefficiency to a greater degree than other traders, then he/she should be able to profit from short-term trading. Another way to look at it is that the market should be considered efficient for the average investor. But the average investor limits the complete efficiency of the market as mentioned above. More astute traders and investors will be able to spot inefficiencies and take advantage of them to make a profit. At the same time, less astute investors will tend to lose money in the market, by succumbing to the various pitfalls mentioned above. Trading and investing are very much a competition with definite winners and losers, not merely a game of chance in which winning is a matter of pure dumb luck. It is more like a poker game, in which a skilled player can win by taking the best advantage of lucky situations as they arise, and avoid serious losses in unlucky situations.
It is not exactly clear how the Efficient Market Hypothesis implies a particular mathematical description of the stochastic process of price returns. First of all, it is not clear what is meant, exactly, by “abnormal profits”. Evidently some people do not admit the existence of a secular trend in securities prices, although market averages (in the United States) have been growing roughly exponentially for the past 200 years. However, if the distribution of stock price returns has zero mean for the log prices, then there will be a natural upward bias for the actual price returns due to the volatility, as will be shown below. So we could postulate that the distribution of logarithmic price returns has zero mean, and the secular trend in actual prices arises solely due to the volatility. This is a small effect that is always smaller than the volatility of the prices themselves. But it appears that the secular trend is actually much larger than this effect, so we must include it explicitly in our model. However, from an intuitive point of view, we would expect the exponential secular rise in prices to be due to the corresponding exponential expansion of the overall economy. This being the case, it makes sense to invest for the long term to take advantage of this secular trend. Another point of view, derived from the CAPM, is that the secular trend is a “normal profit”, which compensates investors for the risk that they must bear due to the volatility. However, it is not clear a-priori what value this risk premium should have.
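The volatility-induced upward bias mentioned above follows from the lognormal moment formula E[exp(X)] = exp(μ + σ²/2) for X ~ N(μ, σ²): even if log returns have zero mean, the expected simple return is exp(σ²/2) − 1 > 0. A small numerical check (with a hypothetical volatility value) confirms the size of the effect:

```python
import numpy as np

# If log returns are N(0, sigma^2), simple returns nevertheless have a
# positive mean: E[exp(X)] = exp(sigma^2 / 2), so the expected simple
# return is exp(sigma^2 / 2) - 1 > 0.  sigma here is a hypothetical
# annual volatility, chosen only for illustration.
rng = np.random.default_rng(1)
sigma = 0.20
log_returns = sigma * rng.standard_normal(1_000_000)   # zero-mean log returns

simple_mean = np.exp(log_returns).mean() - 1.0
theoretical = np.exp(sigma**2 / 2.0) - 1.0
print(f"sample mean simple return:  {simple_mean:.4f}")
print(f"theoretical bias exp(sigma^2/2) - 1: {theoretical:.4f}")
```

With a 20% volatility the bias is only about 2% per year, which supports the text's point that this effect alone is far too small to account for the observed secular trend.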
The next question that is unanswered by the EMH is the precise statistical nature of the stochastic process of (log) price returns. Perhaps the most extreme point of view is that the log price returns constitute a strictly stationary time series of Gaussian random variables with constant mean μ and constant variance σ², N[μ, σ²]. If this is the case, then it is indeed impossible to predict any short-term price fluctuations, and any short-term trading strategy is fruitless. The only trading strategy that makes any sense is the buy-and-hold strategy, making use of the long-term secular trend due to the (constant) mean of the log price returns. This specification of the stochastic process as a Gaussian stationary process implies that the mean, covariance, and all higher-order moments of the probability distribution functions are constant and are given by the Gaussian distribution. In particular, the correlation of the next day’s log price return with any function of the past returns, whether linear or nonlinear, will be zero, and no technical indicator will have any predictive power. This is a very strong assumption, and is probably unrealistic. A weaker specification of the stochastic process of log price returns is that it is a second-order stationary time series of independent, identically distributed (IID) random variables with constant mean μ and constant variance σ², IID[μ, σ²]. This implies that the price returns are not necessarily Gaussian, and although the random variables are uncorrelated, there may be higher-order correlations that are nonzero. In fact, the Higher-Order Statistics (HOS) [Nikias & Petropulu (1993)] are associated with non-Gaussian distributions. This would imply that no technical indicator that is a linear function of the past prices, including any “block” linear filter, would have predictive power.
But a technical indicator, which is a nonlinear function of past prices, such as an “adaptive” linear filter in which the filter coefficients are actually functions of the past prices, might well have predictive power. So the EMH could either imply a strictly stationary process or a second-order stationary process for the log price returns. If we assume that the EMH is to be interpreted in the sense of a strictly stationary Gaussian process, we will see that inefficiencies do indeed exist, and the EMH is only approximately correct.
If the past price returns and other data are uncorrelated with future returns, then the EMH is valid and no technical indicator can have predictive power. On the other hand, if some correlation does exist between past prices or other data and future returns, then this correlation may be used to make a partial prediction of the future returns. This can lead to a set of profitable trading rules. The central problem is then to identify the existence of this correlation and then formulate an optimal set of trading rules to take advantage of it. We may define a technical indicator to be some function of the past stock prices that is correlated with future returns. What is the nature of this correlation? In an “ordinary” time series, such as an ARMA (Auto-Regressive Moving Average) series, the correlation is greatest for short time “lags”, and decreases exponentially for longer lags. However, there is evidence that in the case of financial data, a “fractional” process, which has the property that the correlation decreases like a power law rather than exponentially, may more accurately describe the correlation. This makes sense, if investors are basing their investing decisions on the long-term price history of the stock. So it is important to look for correlation going far back into the past, not just in the recent price returns.
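The contrast drawn above between exponential and power-law decay of the autocorrelation can be made concrete. For an AR(1) process the autocorrelation is ρ(k) = φ^k, while for a fractional (long-memory) process with memory parameter d it decays proportionally to k^(2d−1). The values of φ and d below are purely illustrative, not estimates from market data:

```python
import numpy as np

# Exponential vs. power-law decay of autocorrelation.
# AR(1): rho(k) = phi**k.  Fractional process: rho(k) ~ k**(2d - 1).
# phi and d are illustrative parameter choices.
phi, d = 0.5, 0.2
lags = np.arange(1, 101)
ar1_acf = phi ** lags              # exponential: negligible beyond ~20 lags
frac_acf = lags ** (2 * d - 1)     # power law: still non-trivial at lag 100

for k in (1, 10, 100):
    print(f"lag {k:3d}: AR(1) = {phi**k:.2e}, fractional = {k**(2*d-1):.2e}")
```

At lag 100 the AR(1) autocorrelation is utterly negligible while the power-law value is still of order several percent, which is why long-memory models require looking far back into the past.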
On the other hand, it has been demonstrated that there is rather strong correlation in the price “ticks” of stock prices, for time scales less than about 20 minutes. This inefficiency evidently results from the fact that most traders do not have access to real-time quotes, and use 15-minute delayed quotes instead. However, this type of real-time trading is most appropriate for floor traders on the exchange floor, or other financial professionals. The ordinary individual investor or trader would have difficulty doing this kind of real-time trading. This topic is properly the subject of an entirely separate book, so we will not consider it further here.
Of greater interest are the correlations obtained by acting on the basis of technical patterns in the stock prices (as well as earnings and other fundamental and economic data). Let us now examine the nature of this correlation. From Technical Analysis, there would appear to be two basic kinds of correlation. These are price trends and reversion to the mean. The reversion to the mean mechanism must (theoretically) exist to some extent in stock prices, because it is a fundamental mechanism in the establishment of an equilibrium price for the security. If the price fluctuates below this equilibrium price then buying pressure increases, driving the price back up to its equilibrium value. If the price fluctuates above the equilibrium price, then selling pressure drives the price back down to equilibrium. So this phenomenon is fundamental to the pricing mechanism of the market. If the market were perfectly efficient, of course, then the price would never fluctuate from its “true” equilibrium price. But investors and traders sometimes over-react to news stories or other developments, causing a price fluctuation, which then corrects back to equilibrium when the market regains its long-term perspective. This is just the “Buy Low – Sell High” mechanism of the market. So there should be a (negative) correlation between the variation of the price from some equilibrium value, and future returns. The other main type of correlation is price trends. If the price is trending higher, then it should continue to do so. This is the most basic kind of correlation to measure in the financial price return data, as it takes the form of autocorrelation of the price returns series. Unfortunately, it appears that when the long-term correlation of price returns is measured, there seems to be virtually no correlation present. However, this seems to run contrary to real-life experience, in which trending behavior definitely seems to be present. 
What appears to be going on is that short-term correlation seems to exist in the data, so that a trend will develop, run for a while, and then dissipate again. Evidently what is happening is that the correlation of the returns is time-dependent. Over various intervals of time the correlation of returns is either positive or negative, and it all averages out to zero over a long time interval. So to measure this correlation, it is necessary to measure it over short time intervals, and consider a time-dependent autocorrelation function. Likewise, the reversion to the mean correlation is also probably time dependent. Another way of saying this is that as a stochastic process, the price returns process is nonstationary.
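A time-dependent autocorrelation of the kind described above can be estimated with a short sliding window. The sketch below uses synthetic returns from an AR(1) process whose coefficient slowly drifts between roughly ±0.3 (an assumed toy model, not fitted to data), so that correlation forms and dissipates while the full-sample average is near zero:

```python
import numpy as np

# Synthetic nonstationary returns: AR(1) with a slowly varying coefficient
# phi(t) drifting between about -0.3 and +0.3 (illustrative toy model).
rng = np.random.default_rng(2)
T, window = 2000, 125
phi_t = 0.3 * np.sin(2 * np.pi * np.arange(T) / 500)
r = np.zeros(T)
for t in range(1, T):
    r[t] = phi_t[t] * r[t - 1] + 0.01 * rng.standard_normal()

def lag1_autocorr(x):
    x = x - x.mean()
    return np.dot(x[1:], x[:-1]) / np.dot(x, x)

# Rolling (time-dependent) lag-1 autocorrelation over a short window.
rolling = np.array([lag1_autocorr(r[t - window:t]) for t in range(window, T)])
full_sample = lag1_autocorr(r)
print(f"full-sample lag-1 autocorr: {full_sample:+.3f}")
print(f"rolling autocorr range: {rolling.min():+.3f} .. {rolling.max():+.3f}")
```

The rolling estimate swings well away from zero in both directions even though the full-sample estimate is tiny, which is exactly the averaging-out effect conjectured in the text.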
In any event, the correlations in question must inevitably be very small. If there is a correlation of, say, 1% between some technical indicator and the daily price returns, and the standard deviation (volatility) of the daily price returns is also 1%, then this is an expected gain per day of 0.01%. This sounds small, but when compounded over 256 trading days in a year, it yields a compounded annual gain of 2.6%, which is a worthwhile (extra) gain. A 10% correlation between the indicator and the daily returns leads to an annual gain of 29%, which is excellent! On the other hand, a correlation of 3% between a technical indicator and the daily price returns leads to an annual gain of 8%, which is very respectable. But if the autocorrelation function of daily returns is measured as the sample autocorrelation, using, say, T = 1000 days of past price data (about four years), then the measurement error of this autocorrelation has a standard deviation of 1/√T ≈ 0.032, or 3.2%. At the 95% confidence level, the bounds of the measurement error are ±1.96/√T ≈ ±0.062, or ±6.2%. But a correlation at this level leads to an annual gain of 17.2%! Thus we see that practically all the usable correlation in the price data is going to be smaller than the bounds of the measurement error, whether at one standard deviation or at the 95% confidence level. This means that the variance in the returns, or risk, is always of the same order of magnitude as the expected returns. To detect the presence of these small correlations below the level of the measurement error, therefore, it will be necessary to develop some alternative statistical tests, and to be content with a rather large variance in the returns. Nevertheless, even though the value of each correlation coefficient, for a given time lag, is very uncertain, we can verify the existence of this correlation by measuring the square of the correlation coefficient, averaged over a range of time lags. Statistical measures for doing this will be described shortly.
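The back-of-envelope arithmetic above is easy to verify directly. The snippet below compounds a daily edge of (correlation × daily volatility) over 256 trading days, and computes the measurement-error bounds of a sample autocorrelation estimated from T observations:

```python
import numpy as np

# Annual gain from a daily edge of corr * daily_vol, compounded over
# 256 trading days (the figure used in the text).
daily_vol = 0.01
days = 256

def annual_gain(corr):
    return (1.0 + corr * daily_vol) ** days - 1.0

for c in (0.01, 0.03, 0.062, 0.10):
    print(f"correlation {c:5.1%} -> annual gain {annual_gain(c):6.1%}")

# Measurement error of a sample autocorrelation from T observations:
# standard deviation 1/sqrt(T), 95% bounds +/- 1.96/sqrt(T).
T = 1000
std_err = 1.0 / np.sqrt(T)       # about 0.032 (3.2%)
bound95 = 1.96 * std_err         # about 0.062 (6.2%)
print(f"std error = {std_err:.3f}, 95% bound = {bound95:.3f}")
```

These reproduce the figures in the text: 2.6% for a 1% correlation, 8% for 3%, 17.2% at the 95% error bound of 6.2%, and 29% for a 10% correlation.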
However, it should be pointed out that part of the problem arises when we attempt to measure the correlation as a long-term sample average, based on the premise that the statistics are stationary. In fact, as mentioned above, the statistics are almost certainly nonstationary, so some measure of correlation over short time intervals, as a function of time, would be more appropriate. In this regard, the wavelet methods appear very promising for financial data, as opposed to working with the time series directly (time domain) or its Fourier transform (frequency domain), both of which rely on the assumption of stationarity [GSW (2002)].
In statistics, various inferences and deductions are made on the assumption that some sort of “Law of Large Numbers” applies, so that taking some kind of limit as N → ∞ leads to certain conclusions. For example, the theoretical definition of correlation and expectation depends on having a very large “statistical ensemble” consisting of N similar systems, which vary from each other only in the values of their random variables, and then inferences are drawn based on taking the limit as N → ∞. However, in finance there are not available a large number N of “instances” of IBM stock, for example, over a given time interval. In the case of a stationary stochastic time series, we may invoke the postulate that the system is ergodic, which means that time averages may be substituted for ensemble averages. Thus, for example, to calculate mean values or correlations, we take a sum over a range of data extending over a large time interval, rather than a sum over the ensemble (which does not exist). But if the time series is nonstationary, then the assumption of ergodicity cannot be valid either. If the financial time series is nonstationary, as it almost certainly is, then it cannot be a stationary series of Independent, Identically Distributed Normal random variables, as in the Random Walk model. On the other hand, the nonstationarity will make it difficult to actually identify any correlation in the data, using the usual simple tests.
What can we do about this? One solution is to employ more sophisticated tests of correlation in the data, rather than basic sample averages over the time series. For example, instead of a simple measure of correlation as a sample average, we can employ a Linear Prediction Filter to find the best linear regression of the future returns on the past data, over a given time interval. This calculation can be repeated every day, thus leading to a new fit to the data every day, so that the LP coefficients now become time-dependent. Then we can define a new measure of correlation that measures the correlation between the predicted and actual future price returns, calculated in this manner, for price returns over a given time period. (Thus we are using a measure of correlation that depends on the degree of fit of the linear regression.) Depending on the degree of nonstationarity, the time interval used for the calculation of the LP coefficients can be longer or shorter, to capture slowly or rapidly varying correlation in the data. Likewise, other technical indicators can be defined, which utilize only the most recent data, and the degree of correlation between these indicators and future returns can be measured. The other solution to this problem is to replace the ensemble averages with portfolio averages. Thus, considering only a single stock, the advantage due to using a technical indicator may be smaller than the variance of the returns, but if the same trading strategy is used for a large diversified portfolio, then the advantage will be averaged, but the variance will be reduced, and the trading advantage will end up significantly larger than the variance. Also, the correlation in the portfolio data shows up more clearly than for individual stocks, due to the fact that the effects of “exogenous influences” on the individual stock prices cancel out in the average over the portfolio. 
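The rolling Linear Prediction fit described above can be sketched as follows. Every day an LP filter is re-fitted by least squares on a trailing window, a one-step-ahead forecast is made, and the correlation between predicted and realized returns is measured at the end. The window length, filter order, and the synthetic return model are all illustrative choices:

```python
import numpy as np

# Synthetic returns with a small, slowly varying AR(1) component
# (illustrative toy model of nonstationary correlation).
rng = np.random.default_rng(3)
T, window, order = 1500, 250, 5
phi_t = 0.3 * np.sin(2 * np.pi * np.arange(T) / 1500)
r = np.zeros(T)
for t in range(1, T):
    r[t] = phi_t[t] * r[t - 1] + 0.01 * rng.standard_normal()

preds, actuals = [], []
for t in range(window + order, T - 1):
    # Regression design inside the trailing window: each row holds
    # `order` lagged returns; the target is the next return.
    hist = r[t - window:t + 1]
    X = np.column_stack([hist[order - k - 1:-k - 1] for k in range(order)])
    y = hist[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # time-dependent LP coefficients
    preds.append(np.dot(r[t:t - order:-1], coef))  # one-step-ahead forecast
    actuals.append(r[t + 1])

# Measure of correlation between predicted and actual future returns.
corr = np.corrcoef(preds, actuals)[0, 1]
print(f"correlation(predicted, actual) = {corr:+.3f}")
```

Because the fit is redone each day, the LP coefficients track the slowly varying correlation, and the predicted/actual correlation comes out positive even though the full-sample autocorrelation of this series is close to zero.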
So it is important to consider the trading strategy in the context of the portfolio as a whole, rather than each stock separately. Thus, in summary, the ensemble averages in a statistical analysis are replaced in the case of financial data by the use of time-dependent technical indicators whose correlation with the future returns is measured by means of a sample average, and by means of a portfolio average.
In statistical time series, the statistical properties may be specified (in part) by means of expectations. The expectation is an average over a statistical ensemble. Suppose we had a (hypothetical) statistical ensemble of “systems” consisting of N “instances” of the same stock, for example. Then the expected value of some function of the (present and past) data at time n is given by:
E[f(n)] = lim_{N→∞} (1/N) Σ_{i=1}^{N} f_i(n),  where f_i(n) is the value of f(n) in the i-th instance of the ensemble.
This expected value of the function of the past data f(n), at time n, is also called the mean of the function.
The ability to predict future returns from past data depends on the degree of correlation between the past data and the future returns. Suppose we had some function of the past data f(n), at time n, and the future returns r(n+1). Then we could calculate the covariance between the past data and the future return by taking the expected value of their product:
Cov[f(n), r(n+1)] = E[(f(n) − E[f(n)]) (r(n+1) − E[r(n+1)])]
The variance is similarly defined as the covariance of a quantity with itself:
Var[f(n)] = Cov[f(n), f(n)] = E[(f(n) − E[f(n)])²]
Then the correlation is defined in terms of the covariance and variance as follows:
ρ(n) = Cov[f(n), r(n+1)] / √(Var[f(n)] · Var[r(n+1)])
The correlation coefficient ρ(n) varies between the values of –1 and +1. If the two quantities are uncorrelated, then the correlation coefficient is 0, but if they are (positively or negatively) correlated, then the value of the correlation coefficient is different from zero. If a non-zero correlation coefficient exists between the function of the past data f(n), at time n, and the future returns r(n+1), then the past data can be used to make a (partial) forecast of the future returns.
In the real world of finance, there are no statistical ensembles to work with. Therefore, we have to try to approximate the ensemble averages by some other methods. The main method that is used instead of an ensemble average is a sample average, invoking the assumption that the statistics are ergodic. Then the mean value over the ensemble is replaced by the sample mean:
f̄(n) = (1/T) Σ_{t=n−T+1}^{n} f(t),  where T is the number of data points in the sample.
If the statistics are stationary, then this sample mean should not change (on the average) with time. But the more usual case is that the statistics are non-stationary, so this sample mean will be a function of the time n. In this case, we can take the sum only over m of the most recent values of the data, where m is the typical time scale over which the statistics vary. Similarly, in the covariance and variance the ensemble averages are replaced by time averages, invoking the assumption of ergodicity. When the correlation is calculated as above, using sample averages, then the correlation thus calculated is also known as Pearson’s r. Perhaps a more sensitive method of detecting correlation is to use the nonparametric or rank correlation, such as the Spearman rank-order correlation [Numerical Recipes (1992)]. This type of correlation uses only the relative ordering of the different values of the quantities involved, and is considered more robust (insensitive to the requirement of Normal or Gaussian statistics) than the usual sample correlation.
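The contrast between Pearson's r and the Spearman rank-order correlation can be demonstrated with a small sketch. Spearman replaces each value by its rank before correlating, which makes it robust to the heavy tails and outliers typical of financial returns; the synthetic indicator/returns pair below (with one gross outlier injected) is purely illustrative:

```python
import numpy as np

def pearson(x, y):
    # Sample (Pearson) correlation coefficient.
    x, y = x - x.mean(), y - y.mean()
    return np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y))

def spearman(x, y):
    # Rank-order correlation: Pearson correlation of the ranks.
    # (No tie handling; adequate for continuous data.)
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(x), rank(y))

rng = np.random.default_rng(4)
indicator = rng.standard_normal(500)
returns = 0.2 * indicator + rng.standard_normal(500)   # weakly related
returns[0] = 50.0                                      # one gross outlier

print(f"Pearson  r   = {pearson(indicator, returns):+.3f}")
print(f"Spearman rho = {spearman(indicator, returns):+.3f}")
```

The outlier drags Pearson's r toward an arbitrary value, while Spearman's coefficient stays near the true underlying rank correlation; it is also invariant under any monotone transformation of either variable, since ranks are unchanged.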
The Linear Prediction method amounts to a linear regression of the future returns on the past price and other data (to be described below). It turns out that this linear regression involves only the variances and covariances as described above. If we regress the future returns on all the N past returns, or on N functions of the past returns or other data, then the result is the linear function of the N functions which is the best fit to the future returns. We could also consider a non-linear function of the past prices and data, and perform a non-linear regression. This nonlinear regression would then involve expectations of products of the future returns with quadratic or higher functions of the past data. These higher-order expected values constitute what are known as higher-order statistics (HOS) [Nikias & Petropulu (1993)]. Using nonlinear regression methods, an example of which is the neural network, it is possible to model phenomena that cannot be modeled by linear methods. However, it should be noted that a nonlinear model refers to a model that is nonlinear in its parameters. We may still perform a linear regression on a set of technical indicators that are themselves highly nonlinear functions of the prices. In this case we are simply solving for the best linear combination of the nonlinear technical indicators, to fit the data. In most cases of interest to us (in finance), this is more than adequate, and it is not really necessary to use models that are nonlinear in their parameters. This is what might be called a parametric approach. In a nonparametric approach, such as a neural network, we apply the past prices and other data to the input, and let the network itself find the optimal nonlinear function of this data to approximate the future returns. In the parametric approach, we postulate the nonlinear technical indicators a priori, and simply find the linear combination of these indicators that produces the best fit to the future returns.
For a stationary stochastic process, we would expect any departure from the Random Walk model to show up in the autocorrelation coefficient at lag k, $\hat{\rho}(k)$. This is defined over the entire data set, consisting of T data points. However, we expect that in reality financial returns data are highly non-stationary. Hence, there may be temporary correlations present, which form and dissipate due to changing conditions. Since the Random Walk appears (at first glance) to be very closely approximated by the financial time series, we might conjecture that over a long time average, short-term correlations average out. But if these short-term correlations are present, they should show up in the square of the correlation coefficient. Thus we can use the Box-Pierce Q-statistic as a measure of the presence of correlation in the price series returns [CLM (1997)]:
$$Q_m \equiv T \sum_{k=1}^{m} \hat{\rho}^2(k)$$
For large sample sizes T, it can be shown that the sample autocorrelation coefficients are asymptotically independent and normally distributed [CLM (1997)]:
$$\sqrt{T}\,\hat{\rho}(k) \;\overset{a}{\sim}\; \mathcal{N}(0,\, 1)$$
Thus, if the price returns series is Gaussian distributed, then the Q-statistic is distributed like the sum of squares of m Gaussian random variables. So this statistic is asymptotically distributed as the chi-square distribution with m degrees of freedom, $\chi^2_m$.
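The Q-statistic computation can be sketched as follows (assuming NumPy; the function names are ours, and the biased normalization of the sample autocorrelation is one common convention):

```python
import numpy as np

def autocorr(x, k):
    """Sample autocorrelation at lag k (biased normalization, one common convention)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.sum(x[k:] * x[:-k]) / np.sum(x * x)) if k > 0 else 1.0

def box_pierce_q(x, m):
    """Box-Pierce Q-statistic: T times the sum of squared autocorrelations at
    lags 1..m; asymptotically chi-square with m degrees of freedom under the
    Random Walk null."""
    T = len(x)
    return T * sum(autocorr(x, k) ** 2 for k in range(1, m + 1))
```

For a strongly (negatively) autocorrelated series, such as a strictly alternating one, Q(1) comes out close to T, far beyond the chi-square significance threshold; for white noise it stays near m.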
Another test of correlation is the Variance Ratio test [CLM (1997)]. This test is based on the fact that for a Gaussian Random Walk, the variance of the returns after q time steps should be q times the variance of each time step. This is a consequence of the fact that the variances add over any time interval. So we can detect the presence of correlations by measuring the variance after q time steps and comparing this with q times the variance for a single time step. If the q-time step return (at time t) is denoted by
$$r_t(q) \equiv r_t + r_{t-1} + \cdots + r_{t-q+1}$$
then a short calculation shows that the Variance Ratio may be expressed in terms of the correlation coefficients at lags from 1 to q-1 by [CLM (1997), p.49]:
$$\mathrm{VR}(q) = 1 + 2 \sum_{k=1}^{q-1} \left(1 - \frac{k}{q}\right) \hat{\rho}(k)$$
In contrast to the Box-Pierce Q-statistic above, which is a sum of squares of the correlation coefficients, the Variance Ratio statistic is a sum of the correlation coefficients themselves (at different time lags). Hence, if our hypothesis is correct that the correlation coefficients of the non-stationary process tend to average out over time, then the Box-Pierce Q-statistic, which measures the sum of squares of the correlations, will be a more sensitive indicator than the Variance Ratio statistic. However, Campbell, Lo, & MacKinlay (1997) show that in addition to the Box-Pierce Q-statistic, the Variance Ratio statistic also indicates significant departures from the null hypothesis of a Random Walk in various stock market data.
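A sketch of the Variance Ratio statistic in its autocorrelation form (assuming NumPy; the function name is ours):

```python
import numpy as np

def variance_ratio(x, q):
    """Variance Ratio statistic in its autocorrelation form:
    VR(q) = 1 + 2 * sum_{k=1}^{q-1} (1 - k/q) * rho(k)   [cf. CLM (1997), p.49].
    VR = 1 under the Random Walk null; VR < 1 signals negative serial
    correlation (mean reversion), VR > 1 positive serial correlation."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    gamma0 = np.sum(x * x)
    vr = 1.0
    for k in range(1, q):
        rho_k = np.sum(x[k:] * x[:-k]) / gamma0
        vr += 2.0 * (1.0 - k / q) * rho_k
    return float(vr)
```

For a strictly alternating (strongly mean-reverting) series, VR(2) collapses toward zero, as the two-step variance is far smaller than twice the one-step variance.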
One could also construct a variant of the Q-Statistic by using the Rank-Order correlation in place of the ordinary correlation. Another variant could be constructed by first calculating the Linear Prediction coefficients for the regression of the future returns on the past returns, then expressing the correlation coefficient in terms of these LP coefficients. This method should be equivalent, since the LP coefficients are calculated in terms of the covariances between the future returns and past returns, as explained later in connection with Linear Regression. A nonzero result in the calculation of the LP coefficients, at a given percentile level, implies the existence of correlation in the returns at this significance level.
Let us quote here some of the numbers reported by Campbell, Lo, & MacKinlay (1997), in The Econometrics of Financial Markets, Ch. 2. Not all of the results will be quoted here; for the rest please see the book. But first, it should be pointed out that much greater correlation was detected for the market averages than for individual stocks. Furthermore, CLM report “significant positive autocorrelation for the equal-weighted portfolio, and autocorrelation close to zero for the value-weighted portfolio” [CLM (1997), p.72]. Then they go on to say:
“That the returns of individual securities have statistically insignificant autocorrelation is not surprising. Individual returns contain much company-specific or idiosyncratic noise that makes it difficult to detect the presence of predictable components. Since the idiosyncratic noise is largely attenuated by forming portfolios, we would expect to uncover the predictable systematic component more readily when securities are combined. Nevertheless, the weak negative autocorrelations of the individual securities are in interesting contrast to the stronger positive autocorrelation of the portfolio returns.”
Then CLM go on to explain why the positive autocorrelations exist in the portfolio returns [CLM (1997), p.74]:
“Despite the fact that individual security returns are weakly negatively autocorrelated, portfolio returns—which are essentially averages of individual security returns—are strongly positively autocorrelated. This somewhat paradoxical result can mean only one thing: large positive cross-autocorrelations across individual securities across time.”
Thus we see that in order to utilize whatever correlations are present, it must be done within a trading strategy that applies to the whole portfolio, not just to individual stocks. We also see that the construction of this portfolio must take into account the cross-correlation between the different stocks. This may be done using the Markowitz Model portfolio or the Capital Asset Pricing Model (CAPM).
We list here the measured correlation in the daily returns for the CRSP Value-Weighted Index and the CRSP Equal-Weighted Index, over the period from 1962 to 1994 [CLM (1997), p.67]. We also note that the weekly returns for the CRSP Equal-Weighted Index showed significant correlation as well (not shown). The correlation coefficients for the first four lags are shown, as well as the 5-day and 10-day Box-Pierce Q-Statistics. For comparison with the latter, we note that the chi-square value corresponding to the 99.5-percentile is 16.7 for the 5-day statistic (and slightly greater for the 10-day). Thus, a value of the Q-Statistic greater than this indicates that there is less than a 0.5% probability that the result is due to random chance, and greater than a 99.5% chance that it is significant. As can be seen, the Q(5) and Q(10) statistics listed below are much greater than this. Also, a correlation coefficient is statistically significant if it is greater than $3/\sqrt{T}$ (three standard deviations, or the 99.7-percentile), which equals 3.30% for $T = 8{,}179$ and 4.68% for $T = 4{,}090$. Most of the correlation coefficients r(1), r(2), r(3), r(4) are also much greater than this:
CRSP Value-Weighted Index (Daily Returns)

Period  | Size (T) | r(1)   | r(2)  | r(3)  | r(4)  | Q(5)  | Q(10)
--------|----------|--------|-------|-------|-------|-------|------
‘62-‘94 | 8,179    | +17.6% | -0.7% | +0.1% | -0.8% | 263.3 | 269.5
‘62-‘78 | 4,090    | +27.8% | +1.2% | +4.6% | +3.3% | 329.4 | 343.5
‘78-‘94 | 4,089    | +10.8% | -2.2% | -2.9% | -3.5% |  69.5 |  72.1
CRSP Equal-Weighted Index (Daily Returns)

Period  | Size (T) | r(1)   | r(2)   | r(3)   | r(4)   | Q(5)    | Q(10)
--------|----------|--------|--------|--------|--------|---------|--------
‘62-‘94 | 8,179    | +35.0% | +9.3%  | +8.5%  | +9.9%  | 1,301.9 | 1,369.5
‘62-‘78 | 4,090    | +43.1% | +13.0% | +15.3% | +15.2% | 1,062.2 | 1,110.2
‘78-‘94 | 4,089    | +26.2% | +4.6%  | +2.0%  | +4.9%  |   348.9 |   379.5
Thus we see that there are very significant correlations for the daily data for both Value-Weighted and Equal-Weighted indexes, but especially the latter. For the weekly and monthly data there are still some significant correlations, but less than for the daily data. It should also be observed that the correlations are much greater for the first period than for the second, indicating that the market has become more efficient over time.
Of course, these correlations were measured only between the one-day returns and the returns at lags of 1, 2, 3, 4 days (as well as the 5- and 10-day Q-Statistics). There are many other possible functions of the past data that could be used as technical indicators, and which could display significant correlations. If the returns time series is modeled by a fractional process, then long-term persistent correlations are present, which decay over time like a power law rather than exponentially (as with an ordinary time series). If this is the case, then a technical indicator consisting of a long-term average of the past data could be constructed, and the correlation of this indicator with the future returns measured. Various nonlinear functions of the data can also be used as technical indicators, and their correlation with the future returns measured. Once we have decided on a set of technical indicators, we may use linear regression to find the best fit of the future returns to some linear combination of these indicators. This fit could be a simple Linear Prediction filter, which is the optimal fit of the future returns to the most general linear function of the past returns themselves (going back N days), or it could be a regression on a set of N functions of the past data (linear or nonlinear), considered as technical indicators. An example of the latter would be to calculate a set of N+1 moving averages of the past price data (not returns), and then take the N differences between adjacent moving averages as the N independent technical indicators. Of course, many other technical indicators are possible.
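The moving-average construction just described can be sketched as follows (a minimal illustration assuming NumPy; the function name is ours):

```python
import numpy as np

def ma_difference_indicators(log_prices, windows):
    """Given N+1 moving-average window lengths, return the N differences
    between adjacent moving averages of the log prices, evaluated at the most
    recent time.  (Windows must not exceed the length of the series.)"""
    p = np.asarray(log_prices, dtype=float)
    mas = [p[-w:].mean() for w in sorted(windows)]   # shortest window first
    return [mas[i] - mas[i + 1] for i in range(len(mas) - 1)]
```

With three windows this yields two indicators, each comparing a faster average against a slower one, in the spirit of the moving-average crossover indicators familiar from technical analysis.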
If the Random Walk model were to hold (exactly), then each daily log price return would be a Gaussian random variable, uncorrelated and statistically independent from all past log price returns. (That the random variables are distributed Normally, according to the Gaussian distribution, implies that if they are uncorrelated then they are statistically independent. Thus there are no higher-order correlations either.) On the other hand, in real financial data we expect the market to be almost, but not quite, efficient, and hence that there should be weak correlations present in the data. These correlations may then be found by computing the correlation coefficients, or more generally the covariances, between various functions of the data and the future returns. These functions are, in general, what we mean by technical indicators, and they may be linear or nonlinear functions of the past price returns, prices, fundamental data, or other economic data. If the correlations exist, we may then use these to calculate the expected value of the future returns. We do this by means of a general linear regression of the future returns on the set of technical indicators. If the regression of the future returns is done directly on the past returns, then this is called an autoregression. The linear regression is the linear combination of the set of technical indicators that provides the best, or minimum mean square error, fit to the future returns. The linear combination of the technical indicators consists of a set of linear regression coefficients, each one multiplying each technical indicator in the set. In the more general nonlinear regression problem, we could have a situation where a technical indicator depends nonlinearly on a parameter, and we need to find the best value of this nonlinear parameter in a fit to the data. However, generally, since the correlations are so weak and the fit is not very precise anyway, we see no need to employ nonlinear regression. 
If a special situation arises where a technical indicator depends nonlinearly on a parameter that must be fit to the returns data, then this can be dealt with by means of one of the standard methods of parametric nonlinear regression. But this will generally not be needed. A more general technique, an example of which is the Neural Network, is to make a nonparametric nonlinear model of the data. This technique makes as few assumptions as possible about the form of the nonlinear fit, so it tries to find a “general” nonlinear model of the data. However, we prefer the method of choosing a plausible set of technical indicators and fitting them to the data using linear, parametric methods. This allows us to make maximum use of any “prior” knowledge we might have regarding the possible dynamics of the financial markets and sources of inefficiency, based upon our understanding of investor behavior. We may, of course, incorporate nonlinearity into the linear regression by choosing technical indicators that are themselves nonlinear functions of the past price and other data.
We must also make a sharp distinction between the case of a stationary time series and a non-stationary one. In the stationary case, the correlation structure, and indeed the joint probability distribution in general, is constant in time. Hence a technical indicator is the same function of the past prices and economic data for all times; in other words, it has no explicit time dependence. In the non-stationary case, on the other hand, the technical indicator, as a function of the past prices and economic data, changes explicitly with time. This is evidently the usual case in financial time series. Thus we must search for correlations that are time dependent and continually changing. We must find an empirical way to measure the correlation over the recent past, for a given range of the past data over that time interval, and use this to estimate the correlation extrapolated a given time interval into the future. This empirical correlation estimate, however, must not be a fixed function of the past price and economic data. Rather, it can be a fixed nonlinear functional form (as, for example, the definition of the correlation coefficient itself), but the correlation coefficients themselves are time dependent rather than fixed constants. In other words, we measure the (time averaged) correlation coefficient over a short time interval in the recent past, rather than taking a long-term time average over all the data. This short-term correlation coefficient is then itself a function of time. We may also consider various nonlinear functions of the recent short-term data, and measure the correlation coefficient between these and the future returns. This correlation coefficient will again be time dependent.
We need a notation for the sample covariance of two random variables. Let us define the sample covariance as follows:
$$C_{\xi\eta}(n;\, h) \equiv \frac{1}{m} \sum_{k=0}^{m-1} \left[\xi(n-k) - \bar{\xi}_{n,m}\right] \left[\eta(n-k-h) - \bar{\eta}_{n,m}\right]$$

where $\bar{\xi}_{n,m}$ and $\bar{\eta}_{n,m}$ denote the sample means over the same m-point window.
This notation takes into account the sum over only a finite data set of length m, with the present time indicated by n, and the time lag between the two variables denoted by h. Thus this covariance function is an explicit function of the time n, taking into account non-stationarity. It also takes into account the fact that if the variables ξ and η are different, then the covariance differs depending on which variable lags the other by h time units.
Suppose the one-step future price return (or whatever variable we are estimating) is given as a function of N other variables, which could be the N past returns, some nonlinear functions of these returns, economic data, and so forth. Thus we wish to estimate the following equation:
$$u(n+1) = \sum_{i=1}^{N} b_i(n)\, x_i(n) + \varepsilon(n+1)$$
We now calculate the covariance with each of the N variables on the r.h.s. The condition that the regression minimizes the sum of squares of the error terms $\varepsilon(n+1)$ is precisely the condition that these error terms are “white noise”, uncorrelated with the regressor variables. (This is the part of the future return that is “unexplained” by the regressor variables.) The covariance of the regressors with the noise term is then set to zero as the condition for the optimal least-squares fit. This then yields:
$$\mathrm{Cov}\!\left[u(n+1),\, x_j(n)\right] = \sum_{i=1}^{N} b_i(n)\, \mathrm{Cov}\!\left[x_i(n),\, x_j(n)\right], \qquad j = 1, \ldots, N$$
We define the covariance matrix for this set of variables at time n as follows:
$$\Phi_{ij}(n) \equiv \mathrm{Cov}\!\left[x_i(n),\, x_j(n)\right]$$
We also define a covariance vector between the “desired response” $u(n+1)$ and each variable at time n:
$$g_j(n) \equiv \mathrm{Cov}\!\left[u(n+1),\, x_j(n)\right]$$
Hence the linear regression equation becomes:
$$g_j(n) = \sum_{i=1}^{N} \Phi_{ji}(n)\, b_i(n), \qquad \text{or} \qquad \mathbf{g}(n) = \boldsymbol{\Phi}(n)\, \mathbf{b}(n)$$
Assuming the random variables are linearly independent (so that the covariance matrix is not singular), we may invert the covariance matrix and solve for the LP filter coefficients:
$$\mathbf{b}(n) = \boldsymbol{\Phi}^{-1}(n)\, \mathbf{g}(n)$$
This equation then yields the optimal coefficients for the best linear prediction of the one-step-ahead price return $u(n+1)$ in terms of the N variables. Note that these filter coefficients are, in general, functions both of the current time n and of the time interval m over which the covariances are calculated.
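The normal equations above can be solved numerically; the sketch below (assuming NumPy; names are ours) estimates the coefficient vector b from the sample covariance matrix and covariance vector:

```python
import numpy as np

def lp_coefficients(X, u):
    """Solve the normal equations b = Phi^{-1} g for the linear-prediction
    (regression) coefficients: Phi is the sample covariance matrix of the
    regressors, g the covariance vector with the desired response u."""
    X = np.asarray(X, dtype=float)        # shape (samples, N regressors)
    u = np.asarray(u, dtype=float)
    Xc = X - X.mean(axis=0)               # center to work with covariances
    uc = u - u.mean()
    Phi = Xc.T @ Xc / len(u)              # covariance matrix Phi_ij
    g = Xc.T @ uc / len(u)                # covariance vector g_j
    return np.linalg.solve(Phi, g)        # filter coefficients b
```

Centering the data removes the means, so any constant offset in the response is absorbed without an explicit intercept term; the coefficients of an exactly linear response are recovered exactly.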
If we can approximate the random variables as uncorrelated, as for example in the case where the random variables are the past price returns, then we have approximately:
$$b_j(n) \approx \frac{g_j(n)}{\Phi_{jj}(n)} = \frac{\mathrm{Cov}\!\left[u(n+1),\, x_j(n)\right]}{\mathrm{Var}\!\left[x_j(n)\right]}$$
In other words, apart from a normalization factor (involving the possibly time-dependent variances), the filter coefficients are proportional to the correlation coefficients between the regressors $x_j(n)$ and the desired future price return $u(n+1)$. Hence we can contemplate defining a more general Portmanteau statistic as the sum of squares of the LP filter coefficients:
$$Q_N \equiv T \sum_{j=1}^{N} b_j^2(n)$$
This statistic seems to combine the usual Q-Statistic with the Variance Ratio test. Thus a direct measure of the correlation in the data, and hence of the predictive power, is to compare this statistic with the chi-square distribution $\chi^2_N$.
This leads to the interesting idea of computing the filter coefficients for a variety of N values. Then, for each N, the Portmanteau statistic can be compared with the corresponding chi-square distribution, and the value of N for which the percentile is maximum can be chosen. This is a way to choose the optimum value of N for the greatest predictive power with the least variance.
We may choose any set of N technical indicators that we want for our linear regression model. The future returns are regressed on this set of N technical indicators, to yield the best one-step prediction of the future returns based on the present values of the indicators. In general, these technical indicators could be linear or nonlinear functions of the past log returns, log price levels, or other fundamental or economic data. However, it should be noted that an autoregression on the past log price returns is the most general linear function of the log returns. Any set of technical indicators that can be written as a linear function of past log returns should therefore be encompassed within the linear autoregression on past log returns. For example, if we consider a set of moving averages of the log prices, and take as our technical indicators the differences between the moving averages corresponding to different time scales, or if we consider log price levels relative to various averages on different time scales, then these should all be encompassed within the general linear autoregression (LP filter). The advantage of using specific linear functions as technical indicators is that the fit can be accomplished with far fewer fitting parameters, reducing the “noise” or variance of the fit and the danger of “over-fitting” with too many parameters. This is a very worthwhile consideration, so the approach of using specific technical indicators, rather than a general autoregression on the past time series, is definitely attractive. We can also consider various nonlinear functions of the past data, such as the hyperbolic tangent of log price deviations from some long-term average, thereby possibly capturing new nonlinear effects in the model (such as the prediction of manias and crashes).
Let us now consider in more detail the case of “time-dependent correlation”. We consider the simple case of the one-step autocorrelation of a variable, which is time-dependent. We postulate the following equation for the one-step future return $u(n+1)$, which we would like to estimate:
$$u(n+1) = b(n)\, u(n) + \varepsilon(n+1)$$
We calculate the (time averaged) covariance, over m time steps, of both sides with the regressor $u(n)$. The coefficient $b(n)$ is optimized over these m time steps, in the least-squares sense, and the condition that it is optimized is equivalent to the condition that the innovations $\varepsilon(n+1)$ are uncorrelated “white noise”. Hence we arrive at the following optimization condition, for each time n:
$$C_{uu}(n;\, 1) = b(n)\, C_{uu}(n;\, 0) \qquad \Longrightarrow \qquad b(n) = \frac{C_{uu}(n;\, 1)}{C_{uu}(n;\, 0)}$$
At each point in time n, this is the optimum linear regression over the previous m time steps. Hence the product $b(n)\,u(n)$, which we call the “regression function”, itself constitutes a technical indicator that is an independent function of n (since the regression coefficient is not constant, but depends on n). We can thus write:
$$\hat{u}(n+1) = b(n)\, u(n) = \frac{C_{uu}(n;\, 1)}{C_{uu}(n;\, 0)}\, u(n)$$
This is the best linear regression over the past m time units, for each time n. We can clearly generalize this procedure taking into account a set of N technical indicators, and calculating their best linear regression over the past m-day interval. This will then lead to a set of N regression coefficients, which are themselves time-dependent and complicated nonlinear functions of the past data.
If we calculate the regression over a different value of m time units, we arrive at different time-dependent regression coefficients, and a different technical indicator. Presumably, for some value of m there will be an optimal regression, in the sense that the mean-square error of the long-time regression of the future data on the indicator is minimized. This value of m, then, is what we mean by the “time scale” of the correlation. This optimal time scale will be different depending on how many time units ahead we are predicting. So to calculate a prediction at one-day intervals over the next 100 days, for example, we can choose 100 different values of m for the regression, one value for each number of days of prediction. The least squares fit for each m, for each number of days ahead predicted, can be made and the optimal value of m in each case can be determined.
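A rolling estimate of the time-dependent coefficient b(n) over the most recent m observations might be sketched as follows (assuming NumPy; the function name is ours):

```python
import numpy as np

def rolling_b(u, m):
    """Time-dependent one-step regression coefficient, estimated over the most
    recent m transitions:  b(n) = C_uu(n; 1) / C_uu(n; 0).
    Entries before index m are NaN (not enough history)."""
    u = np.asarray(u, dtype=float)
    b = np.full(len(u), np.nan)
    for n in range(m, len(u)):
        w = u[n - m : n + 1]              # window of m+1 consecutive values
        x, y = w[:-1], w[1:]              # lagged and one-step-ahead values
        xc, yc = x - x.mean(), y - y.mean()
        denom = np.sum(xc * xc)
        if denom > 0:
            b[n] = np.sum(xc * yc) / denom
    return b
```

Sweeping m and scoring each choice by the out-of-sample mean-square error of the resulting predictions is one concrete way to search for the optimal “time scale” described above.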
Note that, since the coefficient $b(n)$ is no longer a constant, but a ratio of two quadratic functions of the past data, the regression function $b(n)\,u(n)$ becomes a highly nonlinear function of the variables $u(n), u(n-1), \ldots, u(n-m)$. The upshot is that a time-dependent correlation is equivalent to higher-order statistics. Unless we know of some explicit time dependence to take into account, such as the business cycle for example, the only way to estimate the time-dependent correlation is through nonlinear functions of the past data, embodying higher-order statistics. In the example used here, the nonlinear function in question was the covariance of the past data over the m-day interval, which was then used to estimate the present value of the (two-point) covariance (correlation) of the data. However, other nonlinear functions could be used to estimate higher-order correlations, which may themselves be time dependent. In fact, any statistical indicator we can think of, when calculated over the past m-day interval, can be used as a time-dependent, nonlinear, technical indicator taking into account time-dependent higher-order statistics in the data. In particular, arbitrary products of these statistical indicators with themselves, with each other, and with the covariance functions between themselves and the future data (and each other), may be used as new technical indicators. So we have at our disposal an infinite hierarchy of ever more highly nonlinear functions for use as possible technical indicators.
First let us consider the implications of the straight Random Walk model, according to the Efficient Market Hypothesis, so that there are no technical indicators that are effective. Then the daily (log) returns are given simply by a Gaussian random variable, which is uncorrelated with other daily returns, both past and future, and with constant variance (volatility squared), in addition to the constant secular trend. In the pure Random Walk case, the returns are given simply as follows:
$$u(n) = d + v(n), \qquad v(n) \sim \mathcal{N}(0,\, \sigma^2)$$
The expected value of the return for day n and the variance of (logarithmic) returns are thus:
$$E\!\left[u(n)\right] = d, \qquad \mathrm{Var}\!\left[u(n)\right] = \sigma^2$$
Considering this process from day 0 to day n, the change in the log prices is given in terms of the returns as follows:
$$\ln P(n) - \ln P(0) = \sum_{k=1}^{n} u(k)$$
Since the Gaussian random variables are independent, the expected values and variances in the sum of the log price changes add, and the sum is given by:
$$E\!\left[\ln P(n) - \ln P(0)\right] = n\, d, \qquad \mathrm{Var}\!\left[\ln P(n) - \ln P(0)\right] = n\, \sigma^2$$
The Return/Risk ratio R/R for the log prices is defined to be the expected n-day return divided by the standard deviation, which is the square root of the variance. It is thus given by:
$$R/R = \frac{n\, d}{\sqrt{n\, \sigma^2}} = \frac{d}{\sigma}\, \sqrt{n}$$
Thus for the log prices, in the Random Walk (Gaussian) case (with drift), the Return/Risk ratio increases in proportion to the square root of the time interval. Eventually, for a long enough holding period n, the cumulative return will be much larger than the uncertainty (standard deviation) of the return. This is why, for the Random Walk model, the Buy & Hold strategy is the most sensible one. For a Buy & Hold strategy, the logarithmic gains are additive, since the amount invested each day is proportional to the price on that day, for a constant number of shares. In other words, this corresponds to a continuously compounded gain.
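A quick Monte Carlo check of this square-root growth (illustrative parameter values; assuming NumPy):

```python
import numpy as np

# Monte Carlo check of the Random Walk result R/R = (d/sigma)*sqrt(n) for
# log prices.  The daily drift d and volatility sigma are illustrative only.
rng = np.random.default_rng(0)
d, sigma, n_days, n_paths = 0.0005, 0.01, 250, 20000

# Each row is one independent Gaussian random walk of n_days log returns
returns = d + sigma * rng.standard_normal((n_paths, n_days))
log_gain = returns.sum(axis=1)                 # n-day change in log price

rr_empirical = log_gain.mean() / log_gain.std()
rr_theory = (d / sigma) * np.sqrt(n_days)      # (d/sigma) * sqrt(n)
```

Across the simulated paths the empirical mean and standard deviation of the n-day log gain reproduce nd and σ√n, and hence the √n growth of the Return/Risk ratio.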
Suppose we try a different trading strategy, corresponding to day trading in the Random Walk model. Each day we invest a constant amount, hold it for one day, and then take our profit or loss. Since we are talking about the Random Walk model and there are no technical indicators, there is nothing on which to base our decision as to how much to invest each day. We begin by considering what happens when we invest a constant amount each day, and then consider investing a different amount each day and observe what effect this has on the variance of the returns. (It should not have an effect on the expected returns, provided that the average amount invested is the same, because the expected returns are due solely to the secular trend.)
In the day-trading strategy, we need to consider the actual price returns, not just the logarithmic price returns. In this case, the mathematics becomes a little more complicated, because in terms of actual price returns the distribution is not Gaussian. What process do the price returns themselves follow, if the log price returns follow a Random Walk? The log price return is given in terms of the prices by:
$$u(n) = \ln P(n) - \ln P(n-1)$$
The fractional dollar return or gain per day is thus given by:
$$\frac{P(n) - P(n-1)}{P(n-1)} = e^{u(n)} - 1$$
The cumulative (dollar) gain over time, assuming that a constant amount is invested each day, would be obtained by adding together all of the daily fractional dollar returns, and then multiplying by the (constant) amount invested. In the more general case, the daily fractional dollar returns would be multiplied by the (non-constant) amount invested each day, then the results added up to produce the cumulative dollar gain. If the non-constant amount invested each day is assumed to be given by a constant dollar amount times a log-Normal probability distribution (since there are no technical indicators and it is random), then this simply corresponds to adding another log-Normal random variable to the daily (log) price returns. This should then have the effect of adding the variances of the two random variables, increasing the variance but not the average returns. We will investigate this question shortly.
To calculate expectation values of the fractional daily dollar return, we must now calculate the expectation of the exponential of white noise, $E\!\left[e^{v(n)}\right]$. To do this, we make use of the moments of the Gaussian distribution, which are summarized by the moment generating function of a normal random variable, evaluated by means of a Gaussian integral [Ghahramani (1996)] (this is discussed in the Appendix below):
$$M_X(t) \equiv E\!\left[e^{tX}\right] = e^{\mu t + \sigma^2 t^2 / 2}, \qquad X \sim \mathcal{N}(\mu,\, \sigma^2)$$
We define the Gaussian variable v(n) in terms of the standard normal variable Z with variance unity as follows:
$$v(n) = \sigma\, Z, \qquad Z \sim \mathcal{N}(0,\, 1)$$
Hence we have:
$$E\!\left[e^{v(n)}\right] = E\!\left[e^{\sigma Z}\right] = e^{\sigma^2/2}$$
Thus the expected return on the investment in the Random Walk model is given by:
$$E\!\left[\frac{P(n) - P(n-1)}{P(n-1)}\right] = E\!\left[e^{d + v(n)}\right] - 1 = e^{d + \sigma^2/2} - 1$$
Let us note here that even when the expectation of the logarithmic price returns is zero, the expectation of the price returns themselves is (slightly) positive, due to the variance of the (log) returns and the fact that the distribution of the actual prices is not Gaussian.
Likewise, the variance of the fractional one-day dollar return is given by:
$$\mathrm{Var}\!\left[e^{u(n)} - 1\right] = E\!\left[e^{2u(n)}\right] - \left(E\!\left[e^{u(n)}\right]\right)^2 = e^{2d + \sigma^2}\left(e^{\sigma^2} - 1\right)$$
Once again, this variance vanishes if the variance of the log price returns series itself vanishes.
Similarly, the (compounded) fractional price gain over n days in the Random Walk model is given by:
$$\frac{P(n) - P(0)}{P(0)} = \exp\!\left(\sum_{k=1}^{n} u(k)\right) - 1 = \prod_{k=1}^{n} e^{u(k)} - 1$$
Due to the statistical independence of the Gaussian random variables, we have:
$$E\!\left[\prod_{k=1}^{n} e^{u(k)}\right] = \prod_{k=1}^{n} E\!\left[e^{u(k)}\right] = e^{n(d + \sigma^2/2)}$$
Thus the expected (compounded) fractional price gain over n days is given by:
$$E\!\left[\frac{P(n) - P(0)}{P(0)}\right] = e^{n(d + \sigma^2/2)} - 1$$
Notice that the variance increases the expected return over what it would be with zero variance, namely $e^{nd} - 1$. Using the same method, we may calculate the variance of the n-day (compounded) fractional price returns:
$$E\!\left[\left(\frac{P(n)}{P(0)}\right)^{2}\right] = E\!\left[e^{2\sum_{k=1}^{n} u(k)}\right] = e^{2nd + 2n\sigma^2}$$
Thus the compounded variance is given by:
$$\mathrm{Var}\!\left[\frac{P(n) - P(0)}{P(0)}\right] = e^{2nd + 2n\sigma^2} - e^{2nd + n\sigma^2} = e^{2nd + n\sigma^2}\left(e^{n\sigma^2} - 1\right)$$
Both of these results follow simply from the fact that the variance of the log returns increases linearly with time.
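These log-normal moment formulas can be checked numerically (illustrative parameter values; assuming NumPy):

```python
import numpy as np

# Numerical check of the log-normal formulas for the n-day compounded gain:
#   E[(P(n)-P(0))/P(0)] = exp(n(d + sigma^2/2)) - 1
#   Var[...]            = exp(2nd + n sigma^2) (exp(n sigma^2) - 1)
# Parameter values are illustrative only.
rng = np.random.default_rng(1)
d, sigma, n, paths = 0.001, 0.02, 50, 200000

log_gain = (d + sigma * rng.standard_normal((paths, n))).sum(axis=1)
frac_gain = np.exp(log_gain) - 1.0             # compounded fractional gain

mean_theory = np.exp(n * (d + sigma**2 / 2)) - 1.0
var_theory = np.exp(2 * n * d + n * sigma**2) * (np.exp(n * sigma**2) - 1.0)
```

The simulated mean and variance of the compounded fractional gain agree with the closed-form expressions to within sampling error.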
Now when we calculate the Return/Risk ratio for the dollar fractional price returns, we find:
$$R/R = \frac{e^{n(d + \sigma^2/2)} - 1}{e^{n(d + \sigma^2/2)}\sqrt{e^{n\sigma^2} - 1}}$$
Note that for small $nd$ and $n\sigma^2$, this result is approximately the same as the result for the log prices, as it must be:
$$R/R \approx \frac{n\left(d + \sigma^2/2\right)}{\sigma\sqrt{n}} \approx \frac{d}{\sigma}\, \sqrt{n}$$
However, for large times the behavior is different. We find:
$$R/R \approx e^{-n\sigma^2/2} \longrightarrow 0 \qquad (n \to \infty)$$
This is disturbing news. It indicates that, for the fractional prices, the Return/Risk ratio for a Random Walk process must ultimately go to zero for long times. Ultimately, the standard deviation of the process becomes larger than the expected return, no matter how large the secular trend. This is because the standard deviation of the actual prices itself increases in proportion to the factor $e^{n(d + \sigma^2/2)}$, along with the expected price return, in addition to other factors that depend on the variance. Thus the variance always wins in the end, no matter what the secular trend, for a Random Walk process. So, although the expected return increases exponentially with time, the risk increases at an even faster rate, and the uncertainty of the returns in the Random Walk model grows exponentially faster than the returns themselves. This provides a great deal of incentive to try to control this risk, somehow, through active trading (if possible). (However, it is likely that active trading will actually increase the risk, and decrease the Return/Risk ratio, even though the expected return will increase.)
Actually, this result might not be as bleak as it sounds, because most of the uncertainty is on the upside, not the downside. Since the price can never fall below zero, if we were to calculate the r.m.s. risk on the downside only, the R/R ratio for this downside risk alone must be greater than unity. (Hence it cannot go to zero as in the calculation above.) It is only on the upside that the “risk” of a large price move can be much greater than the expected price. For this reason, in the case of non-Gaussian distributions such as the distribution of actual price returns, it makes sense to consider other measures of risk, such as the Value at Risk (VaR) [Bouchaud & Potters (2000)]. In any event, we must still recognize that in terms of the actual prices, a “law of large numbers” does not appear to operate, in the sense that the width of the distribution in actual dollars, as a fraction of the expected return, does not go to zero for large times.
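To see the decay quantitatively, one can simply evaluate the R/R formula derived above (the function name is ours; parameter values are illustrative):

```python
import numpy as np

def rr_compounded(n, d, sigma):
    """Return/Risk ratio of the n-day compounded fractional gain (Buy & Hold):
    (e^{n(d+s^2/2)} - 1) / (e^{n(d+s^2/2)} * sqrt(e^{n s^2} - 1)).
    It first grows roughly like sqrt(n), then decays like exp(-n sigma^2 / 2)."""
    a = np.exp(n * (d + sigma**2 / 2.0))
    return float((a - 1.0) / (a * np.sqrt(np.exp(n * sigma**2) - 1.0)))
```

With, say, d = 0.0005 and σ = 0.01 per day, the ratio rises over horizons of a few thousand days and then collapses toward zero, exactly the behavior described above.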
The simple fractional price gain over n days in the Random Walk model is given by the sum of the fractional price gains over the time interval:
$$G(n) \equiv \sum_{k=1}^{n} \left(e^{u(k)} - 1\right)$$
The expected simple price gain is thus given by:
$$E\!\left[e^{u(k)} - 1\right] = e^{d + \sigma^2/2} - 1$$
Thus the expected (simple) fractional price gain over n days is given by:
$$E\!\left[G(n)\right] = n\left(e^{d + \sigma^2/2} - 1\right)$$
As expected, the expected gain increases linearly with time. Since the random variables are uncorrelated, the variance is given by the sum of the variances (as can be verified explicitly):
$$\mathrm{Var}\!\left[G(n)\right] = \sum_{k=1}^{n} \mathrm{Var}\!\left[e^{u(k)} - 1\right]$$
Thus
$$\mathrm{Var}\!\left[G(n)\right] = n\, e^{2d + \sigma^2}\left(e^{\sigma^2} - 1\right)$$
Thus, in this case, the variance is also the sum of the one-day variances. This makes sense, since the amount invested each day is constant, rather than changing from day to day with each profit or loss.
Now when we calculate the Return/Risk ratio for the simple dollar fractional price returns, we find:
$$R/R = \frac{n\left(e^{d + \sigma^2/2} - 1\right)}{\sqrt{n}\; e^{d + \sigma^2/2}\sqrt{e^{\sigma^2} - 1}} = \sqrt{n}\; \frac{e^{d + \sigma^2/2} - 1}{e^{d + \sigma^2/2}\sqrt{e^{\sigma^2} - 1}}$$
We see that in this case the R/R ratio depends, for large times n, on the square root of n, just as in the case of the log price changes. This is because the random variables are again additive, rather than multiplicative. Thus the risk is greatly reduced by adding to, or subtracting from, the position in order to maintain a constant dollar amount at all times. Only in this way can the risk grow at a smaller rate than the return over time, so that the secular trend wins out over the variance. In this sense, an active trading strategy is actually much safer than the Buy & Hold strategy. At the same time, since the risk in the Buy & Hold strategy is actually mostly on the upside rather than the downside, the strategy of adjusting the amount invested to a constant amount can forfeit substantial gains, which would otherwise accrue from the exponential growth of the investment. So for the Random Walk model, the Buy & Hold strategy is the optimal one. (Note that the constant-investment strategy described here is very similar to the “Dollar Cost Averaging” strategy.)
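The √n growth of the constant-dollar strategy’s R/R follows directly from the formula above (the function name is ours; parameter values are illustrative):

```python
import numpy as np

def rr_simple(n, d, sigma):
    """Return/Risk ratio of the n-day simple (constant-dollar) strategy:
    sqrt(n) * (e^{d+s^2/2} - 1) / (e^{d+s^2/2} * sqrt(e^{s^2} - 1)),
    which grows like sqrt(n), as for the log prices."""
    a = np.exp(d + sigma**2 / 2.0)
    return float(np.sqrt(n) * (a - 1.0) / (a * np.sqrt(np.exp(sigma**2) - 1.0)))
```

Quadrupling the holding period exactly doubles the ratio, since the per-step factor is independent of n; this is the additive, rather than multiplicative, accumulation of risk described above.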
A portfolio consists of a collection of stocks or other assets of N different companies, each with its own price per share. If the logarithmic prices were additive, then computing the portfolio statistics would be simple because the statistics would be Gaussian (assuming the Random Walk model). However, additive log prices imply that the actual (percentage) price returns are multiplied together, which is the case for a single stock in the Buy & Hold strategy but not for different stocks. In the portfolio as a whole it is the actual price returns that must be added together, of course. Hence we will consider the situation of simple returns in a portfolio in which the amount invested in each security is held constant.
We wish to consider the effect of adding together the returns of N securities with statistically independent returns on the overall return and risk of the portfolio. In general, even though the distribution of the actual price returns is not Gaussian, the statistical independence of the returns means that the expected returns and variances of the individual securities are still additive. Thus we may simply add the expressions derived above for the mean values and variances of the returns of the securities, weighted by the market value of each security. If the market values, returns, and variances of the securities were all equal, then a factor of N would multiply the individual returns, and N would also multiply the variances. Hence the standard deviation is multiplied by √N, so the ratio of expected return to risk is also multiplied by √N. In other words, the R/R ratio is improved by a factor of √N for a diversified portfolio of N securities.
We consider the N securities to each have their own secular trend and daily innovation, with the (constant) dollar amount invested in security s denoted by C_s and its daily fractional return on day k denoted by r_{s,k}, so the total simple return for the portfolio is given by:
G_n = \sum_{s=1}^{N} C_s \sum_{k=1}^{n} r_{s,k}
The expected simple price gain is thus given by:
E[G_n] = \sum_{s=1}^{N} C_s \sum_{k=1}^{n} E[r_{s,k}]
Thus the expected (simple) fractional price gain over n days is given by:
E[G_n] = n \sum_{s=1}^{N} C_s\, d_s
Similarly, since the random variables are uncorrelated, the variance is given by the sum of the variances of the individual securities:
\mathrm{Var}[G_n] = \sum_{s=1}^{N} C_s^2 \sum_{k=1}^{n} \mathrm{Var}[r_{s,k}]
Thus
\mathrm{Var}[G_n] = n \sum_{s=1}^{N} C_s^2\, \sigma_s^2
Now when we calculate the Return/Risk ratio for the simple dollar fractional price returns for N securities, we find:
\frac{E[G_n]}{\sqrt{\mathrm{Var}[G_n]}} = \frac{n \sum_{s} C_s\, d_s}{\sqrt{n \sum_{s} C_s^2\, \sigma_s^2}}
For small secular trend and variance, compared to unity, we find the following approximation:
\frac{E[G_n]}{\sqrt{\mathrm{Var}[G_n]}} \approx \sqrt{n}\;\frac{\sum_{s} C_s\, d_s}{\sqrt{\sum_{s} C_s^2\, \sigma_s^2}}
We see that, once again, the Return/Risk ratio improves with time n, proportionally to √n. However, we now see that, if the secular trend, variance, and amount invested for each individual security were the same, then the above expression would reduce to the corresponding expression for each individual stock, multiplied by a factor of √N. In other words, the Return/Risk ratio (for simple returns) also improves with the number of securities N, roughly in proportion to √N.
In the Random Walk model, the different securities are uncorrelated. Then it can be seen from the above expression that, given the secular trend and variance of each individual security in the portfolio, the optimal R/R ratio is given by adjusting the amounts invested in each security so that the above expression is maximized. If the returns of different securities are correlated, then we can utilize the Markowitz model, which optimizes the return on the portfolio for a given level of risk given the correlation matrix of the overall portfolio. In either case, the greater the “risk aversion”, the lower the risk, but the lower the expected returns as well, from the portfolio. This will be discussed below.
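The √N improvement can be verified with a small simulation. The sketch below uses N independent securities with identical (hypothetical) trend and volatility and equal constant-dollar weights, and compares the portfolio R/R ratio for N = 16 against a single security.

```python
import numpy as np

# Illustrative check of the sqrt(N) diversification effect.
# d and sigma are hypothetical per-security daily values.
rng = np.random.default_rng(1)
d, sigma = 0.0005, 0.01
n_days, n_paths = 252, 2000

def return_risk(n_sec):
    u = d + sigma * rng.standard_normal((n_paths, n_days, n_sec))
    # equal constant-dollar weights 1/N in each security; simple returns add
    gain = (np.exp(u) - 1.0).mean(axis=2).sum(axis=1)
    return gain.mean() / gain.std()

ratio = return_risk(16) / return_risk(1)
print(ratio)   # close to sqrt(16) = 4
```

The measured ratio fluctuates around √16 = 4 due to Monte Carlo noise, in line with the additivity argument above.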
Now we need to investigate how the return and risk are modified by an active trading strategy. We expect the returns to be increased by such a strategy, but the risk is also increased. Given the presence of autocorrelation in the price data, or more generally correlation with past price returns and other fundamental and economic data, we would like to determine to what extent the return and risk are increased by an active trading strategy.
In order to take into account correlation in the log price data, we now admit the possibility that the future returns are a function of past prices and other economic data, whose influence is contained in a small correction to the Random Walk model. We would thus like to express the next day’s (log) return as the sum of a Gaussian random variable, the secular trend, and a small function of all the known past returns and other data:
u_{n+1} = \varepsilon_{n+1} + d + z_{n+1}(u_n, u_{n-1}, \dots)
The variable ε_{n+1} is a Gaussian white-noise variable, which we call the innovation, with mean 0 and variance σ_ε². The constant d represents the secular trend. We call the function z_{n+1} the deterministic factor; it is a function of the past log price returns (and perhaps other data) that partially determines the future returns as a function of the past data. The Random Walk model specifies that the deterministic factor z_{n+1} is zero. If the market is not perfectly efficient, then this factor will be small but nonzero. However, the catch is that this deterministic factor is in general unknown. We can attempt to estimate it in terms of various functions of the past price returns data and other data; these functions are known as technical indicators. We denote the estimate of the deterministic factor z_{n+1} by the technical indicator z̃_{n+1}, which is distinguished by a tilde. Explicitly, the technical indicator z̃_{n+1} is given as follows:
\tilde z_{n+1} = \tilde z(u_n, u_{n-1}, u_{n-2}, \dots)
We can designate the technical indicator at time n, defined over a short-term time interval of length m units into the past, by:
\tilde z_{n+1}^{(m)} = \tilde z(u_n, u_{n-1}, \dots, u_{n-m+1})
In the limit as m → ∞, we have the case of a stationary time series.
The most basic example of such a technical indicator is the case where it is a linear function of the past log price returns data. This corresponds to an autoregressive model, in which the future returns are regressed on the past returns series. In the stationary case, we can define the sample autocorrelation coefficient as follows:
\hat\rho(k) = \frac{\sum_{j=1}^{n-k} (u_j - \bar u)(u_{j+k} - \bar u)}{\sum_{j=1}^{n} (u_j - \bar u)^2}
Then the technical indicator is defined by:
\tilde z_{n+1} = \sum_{j=1}^{m} \phi_j\, u_{n+1-j}
Note that the Linear Prediction coefficients φ_j are functions of the inverse correlation matrix and of the one-step-ahead correlation coefficients. Each of these, in turn, is a ratio of quadratic functions of the price series. So this technical indicator is, in reality, a nonlinear function of the price data. If we suppose that the time series is stationary, then these LP coefficients are constant, so the technical indicator is a linear function of the past data. In the non-stationary case, however, it becomes a complicated nonlinear function of the past data. We may also calculate the (time-dependent) correlation of the future returns with higher-order functions of the past data, thereby leading to technical indicators that are even more highly nonlinear.
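As a sketch of this autoregressive case, the code below fits linear-prediction coefficients by ordinary least squares to simulated returns containing a small AR(1) deterministic component. The coefficient φ = 0.1, the lag depth m = 3, and the sample size are arbitrary illustrative assumptions.

```python
import numpy as np

# Fit an AR(m) linear-prediction indicator to simulated returns.
rng = np.random.default_rng(2)
phi, sigma, n = 0.1, 0.01, 20000   # weak hypothetical autocorrelation

u = np.zeros(n)
for t in range(1, n):              # returns with a small AR(1) deterministic factor
    u[t] = phi * u[t - 1] + sigma * rng.standard_normal()

m = 3                              # indicator looks m days into the past
X = np.column_stack([u[m - j - 1 : n - j - 1] for j in range(m)])
y = u[m:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # linear-prediction coefficients
indicator = X @ coef                           # forecast of the next return

corr = np.corrcoef(indicator, y)[0, 1]
print(coef, corr)   # first coefficient ~ phi, others ~ 0; corr ~ phi
```

The fitted lag-1 coefficient recovers the small deterministic factor, and the indicator's correlation with the next-day return is of the same small order, illustrating why such correlations are hard to distinguish from noise in short samples.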
Let us assume that the technical indicator z̃_{n+1} is a Gaussian random variable of zero mean and variance σ_z̃², with a certain correlation coefficient with the log price returns. Since the technical indicator is assumed to be uncorrelated with the innovations ε_{n+1} (as is everything else), the degree of correlation of the technical indicator with the future (log) price returns is the same as the correlation between the technical indicator and the deterministic factor:
\rho_{\tilde z z} = \frac{E[\tilde z_{n+1}\, z_{n+1}]}{\sigma_{\tilde z}\,\sigma_z}
In the nonstationary case, we postulate that this correlation is actually time-dependent. Since the purpose of the technical indicator is to simulate the deterministic factor as closely as possible, the better the technical indicator, the closer this correlation is to perfect correlation. (Generally, however, the correlation will be very small.)
Since we measure the correlation between the technical indicator (assumed to have mean zero) and the actual returns, not just their deterministic part, a more useful expression is the correlation between the technical indicator and the actual returns:
\rho_{\tilde z u} = \frac{E[\tilde z_{n+1}\,(u_{n+1} - d)]}{\sigma_{\tilde z}\,\sigma_u}
However, the variance of the returns is equal to the sum of the variance of the deterministic part and the innovations, since they are uncorrelated:
\sigma_u^2 = \sigma_z^2 + \sigma_\varepsilon^2
Thus we have the following relationships:
\rho_{\tilde z u}\,\sigma_u = \rho_{\tilde z z}\,\sigma_z \qquad\Longrightarrow\qquad \rho_{\tilde z u} = \rho_{\tilde z z}\,\frac{\sigma_z}{\sigma_u}
Using this, we would like to express all quantities in terms of the correlations and variances with respect to u rather than z, since u is observable and z is not.
In general, a trading rule is connected with any given technical indicator, in that the amount invested is equal to a constant investment plus an amount that is an increasing function of the technical indicator. Exactly how the investment should increase with the technical indicator is an arbitrary decision. For example, we could simply make an investment that is proportional to the technical indicator itself, C_n ∝ z̃_n. However, this technical indicator is supposed to be a model of the log price returns, not the actual fractional price returns. On the other hand, to calculate the actual return from short-term trading, we need to multiply the amount invested by the actual fractional price returns. Thus, in order to make the math simpler and more straightforward, we invest an amount that is proportional to our expectation of the actual (daily) fractional price returns (plus a constant, in general). Thus the amount invested, as a function of time n, is given by:
C_n = C + k\left(e^{\tilde z_n} - 1\right)
The constants of proportionality could in general be different for each security s, possibly depending on the secular trend, as well as the volatility, of each security. We take this as our standard trading rule, given a technical indicator that is supposed to model the log price returns. This form also has the advantage that the amount invested on the short side is limited to a 100% short position (100% of $k), although the margin on the long side is unlimited, so the trading rule has an intrinsic safety feature built in. We assume that the technical indicator is itself a Gaussian random variable with zero mean; note, though, that since it uses only past data, its value at the time of investment is always known. As a function of all the past log price innovations, which are themselves (hypothetically) Gaussian random variables, the technical indicator is exactly Gaussian if it is a linear function of those variables. If it is a nonlinear function of the past data, then it will not in general be Gaussian, but we take it to be approximately Gaussian anyway, for calculational expediency.
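The short-side cap can be seen directly. The sketch below assumes a rule of the hypothetical form C_n = C + k·(e^{z̃_n} − 1), which is one way to realize the description above; the dollar constants are arbitrary illustrative values.

```python
import numpy as np

# Sketch of the short-side safety feature of the assumed trading rule
#   C_n = C + k*(exp(z_n) - 1), z_n = technical indicator value.
C, k = 1000.0, 1000.0                 # hypothetical scale constants, in dollars
z = np.linspace(-6.0, 6.0, 1001)      # a wide range of indicator values

position = C + k * (np.exp(z) - 1.0)  # dollars invested each day

print(position.min())  # bounded below by C - k: at most $k short of the constant
print(position.max())  # long-side margin is unlimited
```

However extreme the (negative) indicator becomes, the exponential term can subtract at most $k from the constant position, while a large positive indicator produces an arbitrarily large long position, matching the asymmetry described above.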
The actual expected dollar gain per day is given by the expectation of the percentage gain per day multiplied by the amount invested each day. The percentage gain per day is given by:
\frac{S_n - S_{n-1}}{S_{n-1}} = e^{u_n} - 1
Let us define a new variable v_n = u_n − d that has mean zero, so that we have:
e^{u_n} - 1 = e^{d}\, e^{v_n} - 1
The expected gain per day is thus given by:
_{}
Using the results from Appendix B (see in particular Case 2), we have for the two expectation values:
_{}
Putting all this together, we have for the expected return:
_{}
If we make an expansion to first order in the secular trend and variance of the returns, but to all orders in the variance of the trading rules, we find:
_{}
From this form, it can be seen that even if the technical indicator is completely uncorrelated with the price returns, there is still a small gain from short-term trading due to the increased volatility. This is contained in the first term, which is proportional to the volatility of the price returns and the secular trend. However, it can also be seen that this additional gain can equally well be achieved simply by investing an additional constant amount. The second term contains the dependence on the correlation between the technical indicator and the daily price returns. The leading term shows that the gain from short term trading is simply proportional to the product of the standard deviations of the log price returns and log trading rules and their correlation, as is to be expected. However, if the variance of the trading rules is large, there are higher order terms that enter as well.
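The leading term can be checked numerically. The sketch below draws a jointly Gaussian indicator z and log return u with correlation ρ (the daily scales σ_z, σ_u, and ρ are hypothetical), and confirms that the expected trading gain E[(e^z − 1)(e^u − 1)] is close to the product ρ·σ_z·σ_u when the variances are small.

```python
import numpy as np

# Numerical check of the leading term of the short-term trading gain.
rng = np.random.default_rng(3)
sig_u, sig_z, rho = 0.01, 0.02, 0.3   # hypothetical daily scales and correlation
n = 1_000_000

e1, e2 = rng.standard_normal((2, n))
u = sig_u * e1
z = sig_z * (rho * e1 + np.sqrt(1.0 - rho**2) * e2)   # corr(z, u) = rho

gain = ((np.exp(z) - 1.0) * (np.exp(u) - 1.0)).mean()
print(gain, rho * sig_z * sig_u)
```

The simulated gain agrees with ρ·σ_z·σ_u to within Monte Carlo error, confirming that for small variances the trading gain is proportional to the correlation times the two standard deviations.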
To calculate the variance due to short-term trading, we compute:
_{}
We may break the first bracket down term-by-term as follows:
_{}
The second bracket may be broken down term-by-term as follows:
_{}
In general, even for this simple model of a lognormal distribution, the variance becomes very complicated to calculate. The exact expression is not particularly illuminating, and is very tedious to compute. However, we can compute an exact expression for the first bracket above without too much trouble, corresponding to the buy-and-hold part of the trading strategy. Using the formulas in Appendix B, we find for the first term in the first bracket:
_{}
We previously arrived at the result:
_{}
The first bracket may then be re-written as:
_{}
Thus the exact form for the variance due to the first bracket is:
_{}
As expected, the variance for the buy-and-hold part of the trading strategy is approximately given by the variance of the log price returns themselves. However, if this variance were large, there would be higher-order corrections due to the fact that the distribution of the actual prices is lognormal, not normal.
The second term in the first bracket in the expression for the variance is given by:
_{}
From Case 1 in Appendix B, the first term may be approximated by:
_{}
From Case 2 in Appendix B, the second term may be approximated by:
_{}
Thus we find:
_{}
Using results arrived at previously, the second term in the second bracket is given by:
_{}
To first order in the secular trend and variance of the log price returns, this term is zero. Thus the second term of the variance is given by:
_{}
To lowest order in the variance of the trading rules, this term is proportional to the product of the variance of the trading rules and the variance of the log price returns. Note that it is nonzero even if the correlation between the trading rules and the log price returns is zero. This is again due to the fact that the distribution of the actual prices is lognormal, not normal.
The third term of the first bracket is given by:
_{}
From Case 3 in Appendix B, the first term may be approximated by:
_{}
From Case 4 in Appendix B, the second term may be approximated by:
_{}
The sum of the first two terms is therefore given by:
_{}
From Case 1 in Appendix B, the third term may be approximated by:
_{}
From Case 2 in Appendix B, the fourth term may be approximated by:
_{}
The sum of the last two terms is therefore given by:
_{}
Using results arrived at previously, the third term in the second bracket is given by:
_{}
Thus we have for the third term of the variance:
_{}
Note that this result is again proportional to the variance of the log price returns, to this order of approximation.
We can now write the final result for the variance, valid to first order in the secular trend and variance of the log price returns, as follows:
_{}
In this expression, the variance is a complicated function of the variance of the technical indicator, multiplied by the variance of the log price returns. Hence for a given trading strategy, the risk is directly proportional to the log price variance. For a given log price variance, the risk is a complicated function of the variance of the technical indicator, but it generally increases exponentially with the variance of the technical indicator.
To get an estimate of the ratio of return to risk for short-term trading, in general we see that this is going to be very complicated. However, we can get a substantial simplification if we consider the situation where the variance due to the short-term trading rules is much greater than that due to the log price returns themselves. Then we can consider only the dominant terms in the variance of the technical indicator, and drop all the other terms. In the expression for the expected return, if we consider the approximation in which the secular trend and variance of the log price returns is negligible compared to the variance of the trading rules, then we obtain:
_{}
In this approximation, we obtain for Case 1:
_{}
In this approximation, we obtain for Case 2:
_{}
In this approximation, we obtain for Case 3:
_{}
In this approximation, we obtain for Case 4:
_{}
In this approximation, we keep only the largest terms in the variance of the technical indicator, so we drop the first and second brackets in the expression for the variance, and keep only the two largest terms in the third bracket. We thus have:
_{}
Thus the variance is given in this approximation by:
_{}
This gives us an estimate for the Reward/Risk ratio, in this approximation:
_{}
Note that this formula may not be accurate when the correlation is near zero, since both the expected return and the variance approach zero. In this case, the lower order terms will evidently dominate. In fact, from the above expression for the variance, keeping only the highest-order terms (in the variance of the technical indicator), we find the following approximate expression for the variance of the returns in the case that the correlation is zero (and the technical indicators do not work):
_{}
This formula shows that the risk due to short-term trading, when the correlation is zero, increases exponentially with the variance of the technical indicator, and is proportional to the variance of the log price returns (in the approximation that this is small). However, when the correlation is substantial, most of the “risk” may be on the upside, not the downside. In fact, it can be seen from the above formula that the “risk” actually increases as the correlation increases.
Thus we see that, for active trading, even though the expected returns increase exponentially with the variance of the trading rules, the variance of the returns increases so much faster that the ratio of return to risk decreases exponentially with the variance of the trading rules. This is true no matter how well correlated the technical indicators are with the log price returns. Thus, to control risk, it is necessary to do short-term trading using only a fraction of the total portfolio equity. To trade with the entire portfolio equity is risky in the extreme.
Hence we see that the variance of the outcome depends in a very complicated way on the variance of the trading rules. However, generally speaking, the variance of the trading rules plays a similar role to the log variance of the price returns. But since the former is likely to be much greater than the latter, it will usually be the determining factor in the risk.
(Consider also small variance of technical indicators, corresponding to a “smooth” indicator.)
When considering lognormal random variables, the expectation values of interest to us take the form of moment-generating functions, or are directly related to them. For a random variable Z, Ghahramani (1996) defines the moment-generating function:
M_Z(t) = E[e^{tZ}]
We are especially interested in Gaussian random variables. For a standard Normal variable (a Gaussian random variable with mean zero and unit variance), the probability distribution function is:
f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}
Hence the moment generating function is given by a Gaussian integral:
M_Z(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tz}\, e^{-z^2/2}\, dz
We then solve this Gaussian integral as follows [Ghahramani (1996), p.346]. We first complete the square in the exponent:
tz - \frac{z^2}{2} = -\frac{(z - t)^2}{2} + \frac{t^2}{2}
This yields for the moment-generating function:
M_Z(t) = e^{t^2/2}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(z - t)^2/2}\, dz
We now define a new variable as follows:
w = z - t
Then the integral becomes:
M_Z(t) = e^{t^2/2}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-w^2/2}\, dw
We note the following value of the Gaussian integral:
\int_{-\infty}^{\infty} e^{-w^2/2}\, dw = \sqrt{2\pi}
Thus we see that the moment-generating function for a single standard Normal variable has the value:
M_Z(t) = e^{t^2/2}
Thus, if Z is a standard Normal random variable, representing a lognormal distribution of returns, then the moment-generating function becomes a convenient method of finding the expectation values of the actual price returns.
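The closed form M_Z(t) = e^{t²/2} is easy to verify by direct simulation; the values of t below are arbitrary test points.

```python
import numpy as np

# Monte Carlo check of M_Z(t) = E[exp(t*Z)] = exp(t^2/2) for standard Normal Z.
rng = np.random.default_rng(4)
z = rng.standard_normal(2_000_000)

mgf = {t: np.exp(t * z).mean() for t in (0.5, 1.0, 2.0)}
for t, value in mgf.items():
    print(t, value, np.exp(t * t / 2.0))
```

The sample averages match e^{t²/2} to within Monte Carlo error; note that the error grows quickly with t, because e^{tZ} becomes heavy-tailed for large t.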
We now repeat the above calculation for the case of a sum of two standard Normal random variables Z_{1} and Z_{2}, which are correlated with correlation coefficient r. In this case we define the moment-generating function as follows:
M(t_1, t_2) = E[e^{\,t_1 Z_1 + t_2 Z_2}]
It is a theorem [Ghahramani (1996), p.414] that the joint probability distribution for two such standard Normal random variables has the general form:
f(z_1, z_2) = \frac{1}{2\pi\sqrt{1 - r^2}}\; e^{-Q/2}, \qquad Q = \frac{z_1^2 - 2 r\, z_1 z_2 + z_2^2}{1 - r^2}
In order to reduce this to Gaussian form, we must first complete the square in the quadratic form Q. To do this, let us write it in the following form:
Q = a\,(z_1 - b\, z_2)^2 + c\, z_2^2
To equate this to the quadratic form Q, we make the following assignments:
a\, z_1^2 - 2 a b\, z_1 z_2 + (a b^2 + c)\, z_2^2 = \frac{z_1^2 - 2 r\, z_1 z_2 + z_2^2}{1 - r^2}
Now we may solve for the coefficients:
a = \frac{1}{1 - r^2}, \qquad b = r, \qquad c = 1
We now define two new variables as follows:
w_1 = \frac{z_1 - r\, z_2}{\sqrt{1 - r^2}}, \qquad w_2 = z_2
We may write this transformation as follows:
\begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} \dfrac{1}{\sqrt{1 - r^2}} & \dfrac{-r}{\sqrt{1 - r^2}} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
Thus we have in the Gaussian integral:
Q = w_1^2 + w_2^2, \qquad dz_1\, dz_2 = \sqrt{1 - r^2}\; dw_1\, dw_2
Finally the joint probability distribution takes the form:
f(w_1, w_2) = \frac{1}{2\pi}\, e^{-(w_1^2 + w_2^2)/2}
Thus we have arrived at a diagonal form for the joint distribution function of the two correlated standard Normal random variables.
To proceed, we now need to express the old variables in terms of the new ones. We have:
z_1 = \sqrt{1 - r^2}\; w_1 + r\, w_2, \qquad z_2 = w_2
Thus:
t_1 z_1 + t_2 z_2 = t_1 \sqrt{1 - r^2}\; w_1 + (r\, t_1 + t_2)\, w_2
Now we may write the moment-generating function as:
M(t_1, t_2) = \frac{1}{2\pi} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{\,t_1\sqrt{1 - r^2}\, w_1 + (r t_1 + t_2)\, w_2}\; e^{-(w_1^2 + w_2^2)/2}\; dw_1\, dw_2
Now it can be seen that this moment-generating function has been reduced to the product of two independent Gaussian integrals of a single variable. In fact, in terms of the one-variable moment-generating function derived above, we have:
M(t_1, t_2) = M_Z\!\left(t_1\sqrt{1 - r^2}\right)\, M_Z\!\left(r\, t_1 + t_2\right)
Now it is just a matter of substituting back the original t variables:
M(t_1, t_2) = e^{\,t_1^2 (1 - r^2)/2}\; e^{\,(r t_1 + t_2)^2/2}
Thus we find for the final result:
M(t_1, t_2) = e^{\,(t_1^2 + 2 r\, t_1 t_2 + t_2^2)/2}
This result may be re-arranged in a form that is a little easier to work with. The sum of the two terms involving the t’s is given by:
\frac{(t_1 + t_2)^2}{4} + \frac{(t_1 - t_2)^2}{4} = \frac{t_1^2 + t_2^2}{2}
The difference of the two terms yields:
\frac{(t_1 + t_2)^2}{4} - \frac{(t_1 - t_2)^2}{4} = t_1 t_2
Thus the above expression may be written in the nicer form:
M(t_1, t_2) = \exp\!\left[\frac{(1 + r)(t_1 + t_2)^2}{4} + \frac{(1 - r)(t_1 - t_2)^2}{4}\right]
In case the two standard Normal random variables are uncorrelated, it can be seen that this expression reduces to the product of the moment-generating functions for the individual variables, as it should. If they are perfectly correlated, on the other hand, then the two Z variables are equivalent, and the sum of the two t’s plays the role of a single t, and once again the result coincides with the single-variable result.
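The two-variable result M(t₁, t₂) = exp[(t₁² + 2r·t₁t₂ + t₂²)/2] can also be verified by simulation; the values of r, t₁, and t₂ below are arbitrary test choices.

```python
import numpy as np

# Monte Carlo check of the bivariate moment-generating function
#   M(t1, t2) = exp[(t1^2 + 2*r*t1*t2 + t2^2)/2]
# for standard Normal Z1, Z2 with correlation r.
rng = np.random.default_rng(5)
r, n = 0.6, 2_000_000

e1, e2 = rng.standard_normal((2, n))
z1 = e1
z2 = r * e1 + np.sqrt(1.0 - r * r) * e2   # corr(z1, z2) = r

t1, t2 = 0.3, 0.5
mc = np.exp(t1 * z1 + t2 * z2).mean()
exact = np.exp(0.5 * (t1**2 + 2.0 * r * t1 * t2 + t2**2))
print(mc, exact)
```

The same construction (building z₂ from z₁ plus an independent Gaussian) is the standard way to generate a pair with a prescribed correlation, mirroring the diagonalization used in the derivation above.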
In this appendix, we calculate some mean values and variances that are needed in the main body of the text. Suppose u is a Normal variable with mean zero and variance σ_u². Then we can write u in terms of the standard Normal variable Z (with unit variance) as u = σ_u Z. Then from the result in Appendix A, the expectation value of the exponential of this variable is given by:
E[e^{tu}] = E[e^{\,t \sigma_u Z}] = M_Z(t\, \sigma_u)
We thus have our first result:
E[e^{tu}] = e^{\,t^2 \sigma_u^2 / 2}
When the variance is small, this expectation value is approximately given by:
E[e^{tu}] \approx 1 + \frac{t^2 \sigma_u^2}{2}
This is due to the fact that the distribution is lognormal, and hence the distribution for the actual prices has a “fatter tail” in the direction of higher prices.
Let us now calculate the variance of this quantity. We have:
E[e^{2tu}] = e^{\,2 t^2 \sigma_u^2}
Thus the variance is given by:
\mathrm{Var}[e^{tu}] = E[e^{2tu}] - \left(E[e^{tu}]\right)^2 = e^{\,t^2 \sigma_u^2}\left(e^{\,t^2 \sigma_u^2} - 1\right)
It can be seen that when the variance of u is small compared to unity, this expression reduces to the variance of u:
\mathrm{Var}[e^{tu}] \approx t^2 \sigma_u^2
This makes sense, since it is approximately the variance of the logarithmic returns.
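This variance formula, Var[e^u] = e^{σ²}(e^{σ²} − 1) for u ~ N(0, σ²), is easy to confirm numerically. In the sketch below, σ is deliberately not small, so the lognormal correction over the naive σ² limit is visible; the value σ = 0.4 is an arbitrary illustrative choice.

```python
import numpy as np

# Monte Carlo check of Var[e^u] = e^{sigma^2}(e^{sigma^2} - 1), u ~ N(0, sigma^2).
rng = np.random.default_rng(6)
sigma = 0.4
u = sigma * rng.standard_normal(4_000_000)

var_mc = np.exp(u).var()
s2 = sigma * sigma
var_exact = np.exp(s2) * (np.exp(s2) - 1.0)
print(var_mc, var_exact, s2)   # exact variance exceeds the small-variance limit sigma^2
```

For σ = 0.4 the exact variance is roughly 27% larger than σ², showing the fat upside tail of the lognormal distribution at work.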
For our next expectation value, consider two variables u_{1} and u_{2}, with correlation coefficient r_{12}. The expectation of the exponentials of these variables was derived in Appendix A and is given by:
E[e^{\,t_1 u_1 + t_2 u_2}] = e^{\,(t_1^2 \sigma_1^2 + 2 r_{12}\, t_1 t_2\, \sigma_1 \sigma_2 + t_2^2 \sigma_2^2)/2}
Now we have:
\mathrm{Cov}[e^{t_1 u_1}, e^{t_2 u_2}] = E[e^{\,t_1 u_1 + t_2 u_2}] - E[e^{t_1 u_1}]\, E[e^{t_2 u_2}]
Using the previous results, we find:
\mathrm{Cov}[e^{t_1 u_1}, e^{t_2 u_2}] = e^{\,(t_1^2 \sigma_1^2 + t_2^2 \sigma_2^2)/2}\left(e^{\,r_{12}\, t_1 t_2\, \sigma_1 \sigma_2} - 1\right)
It is clear by inspection that when the variances are small compared to unity, this expression reduces to the following simple form:
\mathrm{Cov}[e^{t_1 u_1}, e^{t_2 u_2}] \approx r_{12}\, t_1 t_2\, \sigma_1 \sigma_2
This makes sense, since it is approximately the covariance between the two logarithmic returns t_{1}u_{1} and t_{2}u_{2}.
Finally, the results are modified if the mean value of the log variables is not zero. In this case we add a constant to the variables to represent the mean value. Thus we have:
u \;\to\; d + u, \qquad E[d + u] = d
This leads to the result:
E[e^{\,t(d + u)}] = e^{\,t d}\; e^{\,t^2 \sigma_u^2 / 2}
When we keep terms of second order in the mean values, and first order in the variances and covariances, we arrive at the following approximate result:
E[e^{\,t(d + u)}] \approx 1 + t d + \frac{t^2 d^2}{2} + \frac{t^2 \sigma_u^2}{2}
These expectation values are all that we need to calculate any mean values or variances involving two independent variables, such as the daily price returns and the daily values of some technical indicator.
We are going to need four special cases of the above formula for the calculation of the variance in the main text. We expect the secular trend d and the variance of the log price returns σ_u² to be small relative to unity, so we keep only the first-order terms in these quantities. However, the variance of the technical indicator (trading rules) σ_z² could be arbitrarily large, so we calculate the exact result in terms of this quantity.
Case 1: We wish to calculate _{}:
_{}
This may be rewritten in the following form:
_{}
When we keep terms of first order in the mean value and variance of u, and to all orders in the variance of z, we arrive at the following approximate result:
_{}
Note that only the first term depends on the correlation between z and u.
Case 2: We wish to calculate _{}:
_{}
This may be rewritten in the following form:
_{}
When we keep terms of first order in the mean value and variance of u, and to all orders in the variance of z, we arrive at the following approximate result:
_{}
Note that only the first term depends on the correlation between z and u.
Case 3: We wish to calculate _{}:
_{}
This may be rewritten in the following form:
_{}
When we keep terms of first order in the mean value and variance of u, and to all orders in the variance of z, we arrive at the following approximate result:
_{}
Note that only the first term depends on the correlation between z and u.
Case 4: We wish to calculate _{}:
_{}
This may be rewritten in the following form:
_{}
When we keep terms of first order in the mean value and variance of u, and to all orders in the variance of z, we arrive at the following approximate result:
_{}
Note that only the first term depends on the correlation between z and u.
Jean-Philippe Bouchaud & Marc Potters, Theory of Financial Risks, 2nd ed.
Cambridge University Press, Cambridge, UK (2000)
Peter J. Brockwell & Richard A. Davis, Time Series: Theory and Methods, 2nd ed.
Springer-Verlag, New York (1991)
John Y. Campbell, Andrew W. Lo, & A. Craig MacKinlay (CLM),
The Econometrics of Financial Markets,
Princeton University Press, Princeton, NJ (1997)
Wayne A. Fuller, Introduction to Statistical Time Series
John Wiley & Sons, New York (1996)
Ramazan Gençay, Faruk Selçuk, & Brandon Whitcher (GSW),
An Introduction to Wavelets and Other Filtering Methods in Finance and Economics,
Academic Press, San Diego, CA (2002)
Saeed Ghahramani, Fundamentals of Probability
Prentice Hall, Upper Saddle River, NJ (1996)
William H. Greene, Econometric Analysis, 5th ed.
Prentice Hall, Upper Saddle River, NJ (2003)
Rosario N. Mantegna & H. Eugene Stanley,
An Introduction to Econophysics: Correlations and Complexity in Finance,
Cambridge University Press, Cambridge, UK (2000)
Chrysostomos L. Nikias & Athina P. Petropulu,
Higher-Order Spectra Analysis, A Nonlinear Signal Processing Framework
PTR Prentice Hall, Upper Saddle River, NJ (1993)
William H. Press, Saul A. Teukolsky, William T. Vetterling, & Brian P. Flannery,
Numerical Recipes in C, The Art of Scientific Computing, 2nd ed.
Cambridge University Press, Cambridge, UK (1992)
William F. Sharpe, Gordon J. Alexander, & Jeffery V. Bailey, Investments, 5th ed.
Prentice Hall, Englewood Cliffs, NJ (1995)
[1] Malkiel, B., 1992, “Efficient Market Hypothesis”, in Newman, P., M. Milgate, & J. Eatwell (eds.), New Palgrave Dictionary of Money and Finance, Macmillan, London.