Workplace Fund Performance: Luck or Skill?

In this paper we examine whether the Parnassus Workplace fund delivers superior returns despite its restrictive screening based on workplace environment. We use a bootstrap method to evaluate the financial performance of the fund, which allows us to distinguish skill from luck. The distributions of the actual t(α) and the simulated t(α) are compared to infer whether the actual distribution is generated by mere luck or whether the manager exhibits skill. Our results indicate that the fund exhibits stock selection skill: the t-statistic of the actual estimated alpha is more extreme than the simulated t-statistics of alpha. The fact that PARWX beats the simulations suggests that by picking the right funds, investors can outperform the market.


Introduction
We live in a digital age in which a tremendous amount of information is available to almost everybody on almost everything, including mutual funds. Still, it is next to impossible to say whether an active manager who beats the market does so out of skill or luck. It is equally hard to explain why so many investors have lately earned the worst returns of their lives. The record on the consistency of mutual fund performance is mixed: better performing funds continue to perform well in the following periods in some samples and perform worse in others. It is difficult to tell how large a role luck plays in the good performance of a mutual fund manager. Investors are becoming more sophisticated and ask mutual fund managers about alpha, beta, the Sharpe ratio, etc. in order to realize their financial goals. A number of advanced investors might even require that a mutual fund be evaluated against a pre-specified conventional four-factor performance benchmark comprising the three Fama-French factors and a momentum factor. They might want to look at the investment process and see whether it is consistent with results and repeatable over time. Recent studies such as Kosowski et al. (2006) and Fama and French (2010) make it possible to analyze the performance of actively managed mutual funds.
Mutual fund investors focus particularly on the tails of the performance distribution and are mainly interested in identifying very good performers to invest in or very bad performers to avoid. We find that the standard, straightforward multi-factor performance measures used in previous mutual fund studies have little ability to detect whether positive or negative risk-adjusted abnormal performance (the "alpha") is due to skill or luck. The general consensus of mutual fund research is that restrictions placed on the screening criteria for stocks result in less than optimal performance. So, as a test case, we choose a workplace fund with restrictive screening, the Parnassus Workplace fund. This fund has long-term superior performance based on a well-known performance evaluation technique, the four-factor model proposed in Carhart (1997). We then apply the bootstrap methodology, using 1,000 simulations with true alpha set to zero (i.e., assuming that the fund has no stock picking ability), to determine whether the manager or managers of this superior performing fund are skilled or lucky.
The first published bootstrap study applied to mutual fund performance was Kosowski et al. (2006). To our knowledge, this methodology has not previously been applied to the Parnassus fund. The bootstrap approach, as opposed to persistence studies, has several advantages: no assumptions are made about the distribution of fund returns and alphas, and a longer time series of performance can be used. In this study, we are able to separate 'skill' from 'luck' in the performance of the Parnassus fund, even when the distribution of idiosyncratic risk is highly non-normal. Our study provides new evidence on the abnormal performance of the fund, and our results are not easily inferred from other studies. Applying a multi-factor model to a mutual fund involves considerations whose potential consequences cannot easily be studied without bootstrap simulations over a longer time series, which our study uses.

Literature Review
A number of academic studies on the performance of US mutual funds provide mixed evidence. For example, Carhart (1997), Christopherson et al. (1998), and Hendricks et al. (1993) find that most mutual funds do not perform better than their benchmarks and mostly have negative alphas when evaluated on a style-adjusted basis net of expenses and trading costs. These studies show little evidence of positive abnormal performance but present stronger evidence of poorly performing funds. On the other hand, some studies find evidence of stock selection skill among some mutual fund managers, although the evidence is not entirely definitive. Kosowski et al. (2006) use a bootstrap technique to document outperformance by some funds. Avramov and Wermers (2006) show the benefits of investing in actively managed funds from a Bayesian perspective. Cuthbertson et al. (2008) use data on UK equity funds from 1976 to 2002 and find stock picking ability among a small number of the top performing funds that they conclude is not solely due to luck. They also find that the underperforming funds demonstrate bad skill. Cuthbertson et al. conclude that for the majority of funds, positive abnormal performance can be attributed to good luck.
No study of socially responsible funds so far has focused on measuring whether realized performance was driven by skill or mere luck. This paper estimates the manager's investment performance with the Carhart four-factor model and applies a recent bootstrap simulation methodology to distinguish skill from luck. The bootstrap approach is necessary because of non-normalities in individual fund alpha distributions. Using this technique, we examine the performance of the Parnassus Workplace Fund over the May 2005 through June 2012 period. Actual observed performance is then compared to the performance under the hypothesis of pure luck, obtained by setting alpha to zero.

Data
For this study we consider the Parnassus Workplace mutual fund (ticker: PARWX), which is based in San Francisco and invests only in companies that have a reputation for providing good workplace environments for their employees, thus promoting the common good. Common wisdom holds that the more restrictions a fund has, the more difficult it is for it to perform well consistently. The fund's concern for strong workplace environments appears to be exclusively ethical. In reality, it is a restriction ultimately aimed at making money, because a happy and motivated workforce (people who like the company and feel that they are being treated fairly) is going to work harder. This just might turn out to be a successful investing strategy. This study explores whether PARWX has delivered superior results despite its restrictive screens. Active management must produce returns large enough to offset its higher risks and fees. A common view among investors is that many managers who describe themselves as active do not deserve that title, as they do little more than track an index.
The monthly data used for this study come from the Morningstar database and cover May 2005 through June 2012, a total of 86 months. The characteristics of PARWX are as follows. U.S. stocks comprise 96.8% of assets; the price-earnings ratio (P/E), price-cash flow ratio (P/C), and price-to-book ratio (P/B) are 14.5, 9.1, and 1.9, respectively. The average market capitalization of the fund's holdings is $23,063 million, with earnings growth of 15.8 percent as of June 20, 2012. The turnover rate is 47% and the gross expense ratio is 1.16%, with no front-end load, no back-end load, and no 12b-1 fees. The fund is in the Morningstar large growth category, with a prospectus objective of growth. We compare the characteristics of PARWX with the average characteristics of 10,220 domestic stock funds and show the results in Table 1.
The data shown in Table 1 are as of June 30, 2012. P/E, P/C, and P/B are three value factors that, along with earnings growth, the fund uses in selecting its portfolio. The average size of a stock fund's portfolio is defined as the geometric mean of the market capitalizations of all the stocks it owns. Two characteristics that affect the net performance of a fund are (1) the costs of management and administration, summarized in its expense ratio, and (2) the magnitude of its security purchases and sales, summarized in its turnover ratio. Both of these are lower for PARWX than for the average of the 10,220 funds. PARWX has managed to keep its expenses low relative to its peers and within its load structure. The fund has total net assets of $233.7 million. Total net asset figures are useful in gauging a fund's size, agility, and popularity. They help determine whether a small company fund, for example, can remain in its investment-objective category if its asset base reaches an ungainly size.
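As a small illustration of the portfolio-size definition above, the following sketch computes a geometric mean of market capitalizations. The function name geometric_mean and the three capitalization values are hypothetical, and the equal weighting of holdings is an assumption; a data vendor may weight each holding by its portfolio share.

```python
import math

def geometric_mean(values):
    """Equal-weighted geometric mean: exp of the average of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical market capitalizations in $ millions (not fund data).
caps = [1_000.0, 10_000.0, 100_000.0]
print(f"geometric mean:  {geometric_mean(caps):,.0f}")
print(f"arithmetic mean: {sum(caps) / len(caps):,.0f}")
```

The geometric mean is pulled up far less by one very large holding than the arithmetic mean, which is why it is a common summary of the typical company size in a portfolio.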

Methodology
First, we examine the abnormal performance of the PARWX mutual fund using Carhart's (1997) four-factor model. This model augments the three-factor model of Fama and French (1993) with a momentum factor:
$$R_{i,t} = \alpha_i + \beta_i R_{M,t} + s_i \, SMB_t + h_i \, HML_t + m_i \, MOM_t + \varepsilon_{i,t}$$
where $R_{i,t}$ is the excess monthly return of fund i in month t, $R_{M,t}$ is the excess monthly market return in month t, $SMB_t$, $HML_t$, and $MOM_t$ are the monthly premiums of the size, book-to-market, and momentum factors, $\beta_i$, $s_i$, $h_i$, and $m_i$ are the corresponding factor loadings, and $\varepsilon_{i,t}$ is the residual. The intercept $\alpha_i$ (α in what follows) is interpreted as the abnormal performance of fund i; it measures the impact that a fund manager has on fund performance. A positive alpha implies that the manager has a positive impact on fund performance, and a negative alpha implies the opposite.
Alpha is the excess return over what is predicted by the four-factor model. Fund managers are typically paid to generate this alpha. The Fama-French and Carhart models show that much of what appears as alpha under a single-factor benchmark can in fact be attributed to investing in small, value companies with price momentum. To the extent that this is the case, investors have no reason to pay high fees to mutual fund and hedge fund managers for their stock picking.
The regression results are shown in Table 2. Controlling for the three Fama-French (1993) factors and momentum, the estimated alpha of this mutual fund is 0.23 percent per month, with a standard error of 0.08 and a t-statistic of 2.96. In other words, the annual return earned by this fund is 0.23% × 12 = 2.76% more than investors might have expected given its factor exposures. This is a little over twice the fund's annual gross expense ratio of 1.16%. For most funds, it is no accident that alpha is normally negative and similar in size to the fund's total operating and transaction costs. So, prima facie, this fund exhibits skill. The question now is whether a mutual fund such as Parnassus, with a positive alpha relative to the four-factor benchmark over some measurement period, say 86 months, can be identified as a superior fund. Generally, short-term persistence tests are used to see whether this kind of performance persists.
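The four-factor regression described above can be sketched with ordinary least squares. This is a minimal illustration, not the estimation code behind Table 2: the function name carhart_alpha, the synthetic factor series, and the "true" loadings are all hypothetical, and classical (homoskedastic) OLS standard errors are an assumption.

```python
import numpy as np

def carhart_alpha(excess_ret, factors):
    """Regress fund excess returns on an intercept and four factor columns.

    Returns (alpha, t_alpha, coef), where coef = [alpha, b_RM, b_SMB, b_HML, b_MOM].
    Standard errors are classical OLS ones (an assumption of this sketch).
    """
    y = np.asarray(excess_ret, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(factors, dtype=float)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = X.shape[0] - X.shape[1]            # months minus estimated parameters
    sigma2 = resid @ resid / dof             # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef[0], coef[0] / se[0], coef

# Synthetic stand-in data: 86 months, columns RM, SMB, HML, MOM in percent.
rng = np.random.default_rng(0)
T = 86
factors = rng.normal(0.0, 3.0, size=(T, 4))
true_betas = np.array([1.0, 0.2, -0.1, 0.05])   # hypothetical loadings
excess_ret = 0.23 + factors @ true_betas + rng.normal(0.0, 0.7, size=T)

alpha, t_alpha, coef = carhart_alpha(excess_ret, factors)
print(f"alpha = {alpha:.3f}% per month, t(alpha) = {t_alpha:.2f}")
```

With real data, excess_ret would hold the fund's monthly return minus the risk-free rate and factors would hold the published factor series for the same months.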
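The arithmetic in this paragraph can be checked in a few lines. The inputs are the figures quoted in the text; the variable names are ours, and simple multiplication by 12 (no compounding) is the annualization the text uses.

```python
alpha_monthly = 0.23   # percent per month, as quoted from Table 2
se_alpha = 0.08        # standard error of alpha, as quoted
expense_ratio = 1.16   # annual gross expense ratio, percent

alpha_annual = alpha_monthly * 12               # simple annualization
ratio_to_expenses = alpha_annual / expense_ratio
t_from_rounded = alpha_monthly / se_alpha       # t implied by the rounded inputs

print(f"annualized alpha: {alpha_annual:.2f}%")            # 2.76%
print(f"alpha / expense ratio: {ratio_to_expenses:.2f}")   # 2.38
print(f"t from rounded inputs: {t_from_rounded:.3f}")      # 2.875
```

The t-statistic implied by the rounded inputs (2.875) is slightly below the reported 2.96, which is consistent with alpha and its standard error having been rounded to two decimals before being reported.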

Methodology: Bootstrapping Simulation with a True α of Zero
In this study we use the bootstrap simulation approach of Fama and French (2010) to separate the manager's skill from luck. The steps are as follows. First, α and its t-statistic for the PARWX portfolio are estimated from the four-factor Carhart model. Second, the estimated α is subtracted from the fund's monthly returns to generate an adjusted return series with a true α of zero. Third, in each simulation run, 86 random draws (with replacement) from these adjusted returns generate a new return series. Fourth, a simulated α and t(α) are estimated for each simulation run. Because of the random sampling, the estimated α in any one run may deviate from zero even though the underlying α is zero by design. In this "no skill" (true α = 0) simulation, the distribution of estimated alphas is therefore centered on zero, and any estimated α different from zero is generated purely by luck. If the actual estimates lie further out in the tails than this simulated distribution allows, that indicates skill (in the right tail) or "negative skill" (in the left tail). Finally, the t(α) value at each percentile is computed as an average of the percentile values from all 1,000 simulation runs.
Following Kosowski et al. (2006), Fama and French (2010), and Barras et al. (2010), we use the t-statistic of α, t(α), instead of α itself for the analysis. The distributions of the actual t(α) and the simulated t(α) are then compared to infer whether the actual distribution is generated by mere luck or whether the manager exhibits skill. We follow the approach of Fama and French (2010) by comparing the values at the percentiles. For each of the 1,000 simulations we calculate the value at every percentile. For the comparison, we then compute the average value at each percentile across simulations, as well as the percentage of simulations that generated a value at the respective percentile below the actual percentile value.
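The steps above can be sketched for a single fund as follows. This is an illustration under stated assumptions rather than the study's own code: the function names alpha_and_t and zero_alpha_bootstrap and the synthetic data are hypothetical, classical OLS standard errors are assumed, and each draw resamples whole months so that a fund return stays paired with that month's factor returns, as in the design of Fama and French (2010).

```python
import numpy as np

def alpha_and_t(y, X):
    """OLS intercept and its t-statistic; the first column of X must be ones."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se_alpha = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[0, 0])
    return coef[0], coef[0] / se_alpha

def zero_alpha_bootstrap(excess_ret, factors, n_sims=1000, seed=0):
    """Return the actual t(alpha) and n_sims simulated t(alpha) under true alpha = 0."""
    rng = np.random.default_rng(seed)
    y = np.asarray(excess_ret, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(factors, dtype=float)])
    alpha_hat, t_actual = alpha_and_t(y, X)      # step 1: actual estimates
    y_zero = y - alpha_hat                       # step 2: impose alpha = 0
    sim_t = np.empty(n_sims)
    for b in range(n_sims):
        idx = rng.integers(0, len(y), size=len(y))     # step 3: months, with replacement
        _, sim_t[b] = alpha_and_t(y_zero[idx], X[idx])  # step 4: re-estimate
    return t_actual, sim_t

# Synthetic stand-in for 86 months of fund and factor returns (percent).
rng = np.random.default_rng(1)
T = 86
factors = rng.normal(0.0, 3.0, size=(T, 4))
excess_ret = (0.23 + factors @ np.array([1.0, 0.2, -0.1, 0.05])
              + rng.normal(0.0, 0.7, size=T))

t_actual, sim_t = zero_alpha_bootstrap(excess_ret, factors)
p99 = np.percentile(sim_t, 99)
share_below = 100.0 * np.mean(sim_t < t_actual)
print(f"actual t(alpha) = {t_actual:.2f}, simulated 99th percentile = {p99:.2f}")
print(f"{share_below:.1f}% of simulated t(alpha) fall below the actual value")
```

An actual t(α) above the upper simulated percentiles is the pattern that would be read as evidence of skill rather than luck.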
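The percentile comparison described above is defined for a cross-section of funds, so the sketch below uses a hypothetical cross-section of 200 funds rather than the single fund studied here. The function name percentile_comparison, the chosen percentiles, and the synthetic t(α) values are all assumptions made for illustration.

```python
import numpy as np

def percentile_comparison(actual_t, sim_t, pcts=(1, 5, 10, 50, 90, 95, 99)):
    """Compare actual and simulated t(alpha) at selected percentiles.

    actual_t: shape (n_funds,), actual t(alpha) of each fund.
    sim_t:    shape (n_sims, n_funds), simulated t(alpha) under true alpha = 0.
    Returns (actual percentiles, average simulated percentiles,
             percent of simulations whose percentile value is below actual).
    """
    actual_pct = np.percentile(actual_t, pcts)          # shape (n_pcts,)
    sim_pct = np.percentile(sim_t, pcts, axis=1).T      # shape (n_sims, n_pcts)
    avg_sim = sim_pct.mean(axis=0)
    pct_below = 100.0 * (sim_pct < actual_pct).mean(axis=0)
    return actual_pct, avg_sim, pct_below

# Hypothetical data: 1,000 simulation runs over 200 funds.
rng = np.random.default_rng(2)
sim_t = rng.normal(0.0, 1.0, size=(1000, 200))
actual_t = rng.normal(0.0, 1.0, size=200)
actual_t[:10] += 2.0     # a few hypothetical funds given positive "skill"

pcts = (1, 5, 10, 50, 90, 95, 99)
actual_pct, avg_sim, pct_below = percentile_comparison(actual_t, sim_t, pcts)
for p, a, s, below in zip(pcts, actual_pct, avg_sim, pct_below):
    print(f"{p:>3}th  actual {a:6.2f}  avg sim {s:6.2f}  % sims below actual {below:5.1f}")
```

A high share of simulations below the actual value in the upper percentiles indicates that the best actual t(α) values are larger than luck alone would usually produce.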

Results
In this study, we compare the fund's actual four-factor α estimate to the results from 1,000 bootstrap simulations. The fund's returns in a simulation run have the properties of its actual returns, except that true α is set to zero in the return population from which simulation samples are drawn. The simulations thus describe the distribution of α estimates when there is no abnormal performance in fund returns.
Table 3 presents percentiles of the t(α) estimates based on 1,000 simulations with true alpha set to zero. The 99th percentile, the value below which 99 percent of the simulated observations fall, is 2.75. The four-factor t(α) for actual fund returns (2.96) is above this average simulated 99th percentile. The t-statistic of the actual estimated alpha is thus more extreme than the simulated t-statistics of alpha, and as such the fund exhibits skill.

Conclusion
The Parnassus Workplace fund invests in companies that provide good workplace environments for their employees.
To investigate whether the empirical factors of momentum, size, and book-to-market are priced in the data, this study examines whether these factors explain the cross-section of returns and finds evidence that they have significant explanatory power for the returns of momentum, size, and book-to-market sorted portfolios.
In this age of instantaneous information, it may not be enough for an investor to estimate alpha; it is more important to identify superior or inferior funds. Generally, short-term performance-persistence tests are used to do this.
Comparing the distribution of α estimates from the simulations to the α estimate for actual fund returns allows us to draw inferences about the existence of a skilled manager. The t-statistic of the actual estimated alpha is more extreme than the simulated t-statistics of alpha, and as such the fund exhibits skill. The fact that PARWX beats the simulations suggests that by picking the right funds, investors can outperform the market. But the problem remains that the good funds cannot easily be separated from the lucky bad ones that land in the top percentiles.
We acknowledge some limitations of our study. First, it is hard to detect abnormal performance when it exists, particularly for a fund whose style characteristics differ from those of the benchmark portfolio based on the four-factor Carhart model. Second, if abnormal performance is short-lived (say, less than a year), the seven-year results shown in Table 2 may overstate the gains, and a manager's profit opportunities are more likely to be short-lived. Third, we used the Carhart four-factor model as the primary risk model. Other researchers may explore variants of this model, study a large number of different categories of funds over different time periods, and use data free of survivorship bias.