Faith-Based Investing (FBI) Fund Performance: Luck or Skill?

This study uses monthly net return data on 27 actively managed faith-based mutual funds and the Fama-French (2010) bootstrap methodology to distinguish between luck and skill among fund managers. The main benchmark of this study is the Carhart (1997) four-factor model. The evidence suggests that the majority of fund managers do not have enough skill to produce expected benchmark-adjusted net returns that cover costs. Ranking funds on performance after costs, we find that the performance of the majority of funds can be attributed to "bad skill". The evidence is strongest at the 95th percentile and above, and at the 10th percentile and below.


Introduction
There has been growing interest in faith-based organizations and faith-based mutual funds, but the literature on the return performance of these mutual funds is sparse. Mutual fund performance is generally judged against a standard benchmark index. Outperformance (a positive outcome) is generally associated with skill and underperformance (a negative outcome) with bad skill.
Active management of a mutual fund is pursued with the goal of outperforming a multi-factor benchmark index that adjusts for a common set of risk factors. Studies of mutual fund performance indicate that actively managed funds tend to underperform similar passively managed funds that closely replicate the return performance of the benchmark index. This is true both before and after management fees (see Malkiel, 2012).
Performance is influenced by both skill and luck. Given the present limited knowledge, this study addresses the following research questions: 1. How does the performance of faith-based mutual funds stack up in terms of net returns? 2. Is fund performance due to skill or luck?
Alpha compares a mutual fund's risk-adjusted performance to a benchmark index. We begin by examining the overall α for an equally weighted portfolio of faith-based mutual funds. We then turn to the performance of individual funds.
For January 2007 to June 2014, the α (abnormal return) of the portfolio, measured with the Carhart four-factor model, is -0.10647% per month, or -1.2775% per year. The estimate of t(α) is -2.22, which is rather strong evidence that FBI mutual funds as a whole provide returns to investors below those of an equivalent portfolio of the four passive benchmarks of the Carhart (1997) model. Alpha estimated on net returns realized by investors is negative by about the amount of fund expenses. Costs matter.
When we use the four-factor model to explain the monthly percent returns of the aggregate fund portfolio for 2007-2014, we get

R_Pt - R_ft = -0.10647 + 0.99930 (R_Mt - R_ft) + 0.12710 SMB_t + 0.01980 HML_t - 0.04279 MOM_t + e_Pt,

where R_Pt is the return (net of costs) on the equally weighted faith-based mutual fund portfolio for month t, R_ft is the risk-free rate of interest (the one-month T-bill return for month t), R_Mt is the cap-weighted NYSE-Amex-Nasdaq market return, SMB_t and HML_t are the size and value/growth factor returns of the Fama-French three-factor model, and MOM_t is the fourth factor added by Carhart (1997). MOM_t is a factor-mimicking portfolio for one-month return momentum, calculated as the average of the two high-prior-return portfolios minus the average of the two low-prior-return portfolios. The results show that the equal-weighted FBI mutual fund portfolio has almost full exposure (close to 1.0) to the market portfolio, but very little exposure to the size, value/growth, and momentum portfolios (0.127, 0.020, and -0.043). Moreover, the market alone captures 91.4% of the variance of month-by-month equal-weighted fund returns.

We now examine fund-by-fund performance. We compare the range of outcomes expected under the assumption that every manager's true α is zero to the actual observed range of outcomes. If the latter is wider than we expect by chance, we can infer that bad fund managers overpopulate the left tail of outcomes and good managers overpopulate the right tail.

Literature Review
Past academic studies of US mutual funds show little evidence of positive abnormal performance but present stronger evidence of poorly performing funds. The evidence is not entirely definitive. Though several researchers document negative average fund alphas on a style-adjusted basis, net of expenses and trading costs (see Carhart (1997), Christopherson et al. (1998), and Hendricks et al. (1993)), recent papers indicate that some fund managers have stock-selection skills. For example, Kosowski et al. (2006) separated luck from skill by applying a novel bootstrap methodology that creates a sample of monthly pseudo excess returns by randomly re-sampling residuals from a factor benchmark model and imposing a null of zero abnormal performance. Kosowski et al. document outperformance by some funds and show that some fund managers have exhibited skill in picking stocks. Avramov and Wermers (2006) show the benefits of investing in actively managed funds from a Bayesian perspective. Cuthbertson et al. (2008) used data on UK equity funds from 1976 to 2002 and found evidence of stock-picking ability among a small number of top-performing funds that they concluded was not solely due to luck. They also found that the underperforming funds demonstrate bad skill. Cuthbertson et al. concluded that for the majority of funds, positive abnormal performance could be attributed to good luck.
No study of faith-based investing funds so far has focused on measuring whether realized performance was driven by skill or mere luck. This paper separates skill from luck in managers' investment performance using the Carhart four-factor model and the Fama-French (2010) bootstrap simulation methodology. The bootstrap approach is necessary because of non-normalities in individual fund alpha distributions. Using this bootstrap technique, we examine the performance of 27 FBI mutual funds over the January 2007 to June 2014 period. Actual observed performance is then compared to performance under the hypothesis of pure luck, obtained by setting alpha to zero.

Data
The data used in this study come from Morningstar and consist of the monthly returns on 27 actively managed faith-based investing funds over the period January 2007 to June 2014. The dataset also includes data on the four Carhart (1997) risk factors, available from the website of Kenneth French: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/index.html. The website provides a detailed description of all factors and the method of their calculation.

Methodology
Using the methodology of Fama and French (2010), we estimate the four-factor model on each fund's actual returns and then subtract the resulting α estimate from the fund's returns. This gives us returns for each fund that have the properties of the fund's actual returns, except that true α and t(α) for the cloned returns are zero. To generate a chance distribution of α and t(α) estimates, we draw a random sample (with replacement) of months from the cloned population of fund returns. (Drawing the same random sample of months for every fund maintains the cross-correlation of fund returns.) For every fund we then estimate the four-factor model on the random sample of returns. This gives us one chance distribution of α and t(α) estimates. To have many such samples on which to base inferences, we repeat this bootstrap simulation 1,000 times.
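The two steps above (subtract each fund's estimated α to get zero-alpha "cloned" returns, then bootstrap months jointly across funds) can be sketched as follows. This is a simplified illustration on synthetic data: the fund returns, factor series, and loadings are invented, and the alpha estimator is plain OLS.

```python
import numpy as np

rng = np.random.default_rng(1)

def alpha_tstats(returns, factors):
    """OLS alpha t-statistic for each fund (columns of `returns`)."""
    T = factors.shape[0]
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
    resid = returns - X @ coef
    s2 = (resid ** 2).sum(axis=0) / (T - X.shape[1])
    var_alpha = s2 * np.linalg.inv(X.T @ X)[0, 0]
    return coef[0] / np.sqrt(var_alpha)

# Synthetic stand-ins for 27 funds over 90 months (illustrative only).
T, n_funds = 90, 27
factors = rng.normal(0.0, 3.0, (T, 4))      # Mkt-RF, SMB, HML, MOM
returns = (factors @ rng.normal(0.5, 0.3, (4, n_funds))
           + rng.normal(0.0, 1.0, (T, n_funds)))

# Step 1: estimate each fund's alpha and subtract it, giving "cloned"
# returns with true alpha = 0 but the same covariance structure.
X = np.column_stack([np.ones(T), factors])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
cloned = returns - coef[0]                  # broadcast over months

# Step 2: bootstrap -- draw the same random months for every fund to
# preserve cross-correlation, then re-estimate t(alpha) each run.
n_sims = 1000
sim_t = np.empty((n_sims, n_funds))
for s in range(n_sims):
    idx = rng.integers(0, T, T)             # months, with replacement
    sim_t[s] = alpha_tstats(cloned[idx], factors[idx])

# Average simulated percentiles across runs, as in Table 1.
pcts = [1, 10, 50, 90, 99]
avg_pct = np.percentile(sim_t, pcts, axis=1).mean(axis=1)
```

Because every cloned fund has true α of zero, the dispersion in `sim_t` is pure chance, which is exactly the benchmark against which the actual t(α) percentiles are compared.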

Results
We estimate the four-factor Carhart model on monthly net returns for each fund in the sample. This gives us an α estimate for each fund. It is more sensible to rank funds on the t-statistics of their α estimates than on the α estimates themselves. The t-statistic, t(α), is the ratio of an α estimate to its standard error, which measures the precision or reliability of the α estimate. Dividing each α estimate by its standard error gives precision-adjusted estimates that allow meaningful comparisons across funds.
Table 1 shows the percentiles of the distribution of four-factor t(α) estimates for the funds in our sample. For example, the first percentile of t(α) for actual returns is -3.7543, which means that 1% of the 27 funds in our sample have t(α) estimates at or below -3.7543. The tenth percentile of t(α) for actual returns is -2.45842, so 10% of our funds have t(α) estimates at or below -2.45842. At the other end of the distribution, the 90th percentile of t(α) for actual returns is 0.641388; equivalently, t(α) is 0.641388 or above for 10% of our funds. Finally, the 99th percentile of t(α) is 1.403668, so t(α) is 1.403668 or above for 1% of funds. Typically, we look for values of t(α) above 2.00 to infer statistical reliability. Each simulation run gives us a chance distribution of t(α) estimates from a world in which true α is zero for every fund. The Simulation column of Table 1 shows the averages of the percentile values of t(α) obtained from the 1,000 simulation runs. For example, the average of the first percentile values of t(α) from the 1,000 simulation runs is -2.3044345, and the average of the 99th percentile values of t(α) is 2.3168724.
The simulations say we should expect substantial chance dispersion in t(α) estimates even when true α and t(α) are zero. Not surprisingly, high percentiles of t(α) estimates in the simulations are associated with stellar returns. In the simulations, the average t(α) estimate at the 90th percentile is 1.2239229. Recall that t(α) is the ratio of an α estimate to its standard error, and the average standard error is around 0.028. A t(α) of 1.2239229 thus translates to an α estimate of about 0.0343% per month, or about 0.4% per year. Low percentiles of t(α) in the simulations imply depressing investment outcomes; these funds would not have a happy future in the investment management business. It is important to note that all dispersion in the α and t(α) estimates from the simulations is due to chance, since true α is zero for every fund. The good performance in the right tail of the simulation distribution is just good luck, and the poor performance in the left tail is bad luck. Our goal is to use the chance distribution of t(α) from the simulations to draw inferences about performance in the distribution of t(α) estimates for actual fund returns.
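The back-of-the-envelope conversion from t(α) to a monthly and annual α can be checked directly; the standard-error figure is the approximate average quoted above.

```python
# Convert a t(alpha) into a monthly and annual alpha, given the
# approximate average standard error quoted in the text.
t_alpha = 1.2239229
se_alpha = 0.028                     # % per month, approximate average
alpha_monthly = t_alpha * se_alpha   # about 0.0343% per month
alpha_annual = alpha_monthly * 12    # about 0.41% per year
print(alpha_monthly, alpha_annual)
```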
If there are funds that have positive true α, we should find more high values of t(α) in actual fund returns than we observe in the simulations. Conversely, if there are funds that have negative true α, we should find more low (extreme negative) values of t(α) in actual fund returns than in the simulations. Put differently, the worst performing funds should perform worse than we expect just by chance if every fund has a true α of zero, and the best performing funds should perform better than we expect by chance. Concretely, if there are funds with negative and positive true α, the negative values of t(α) at low percentiles should be more extreme for actual fund returns than for the simulations, and the positive values of t(α) at high percentiles should also be more extreme for actual fund returns than for the simulations. Table 1 confirms that poorly performing funds indeed do worse than we expect if true α is zero for all funds. For every percentile, the t(α) estimate for actual fund returns is far below the average value from the simulations. For example, the first percentile of t(α) is -3.75 for actual fund returns, so 1% of actual funds have t(α) of -3.75 or lower. The simulated distribution has fewer funds whose performance is this bad; to include 1% of simulated funds, we have to raise the boundary to -2.30. The 10th percentile of t(α) for actual fund returns, -2.46, is far more extreme than the 10th percentile for simulated fund returns, -1.29. The 50th percentile of t(α) in Table 1 says that half the actual funds have t(α) of -0.82 or less, while the median t(α) from the simulations is almost zero, -0.04. All this suggests that among poorly performing funds, there are many with negative true α. In other words, their poor performance is not entirely due to chance.
Unfortunately, the percentiles of t(α) for actual fund returns are also below the average percentiles from the simulations throughout most of the right tail of the t(α) estimates, which contains the strong performers. For example, Table 1 says the 95th percentile of t(α) for actual fund returns is 1.12, versus 1.63 from the simulations. In other words, the seemingly impressive t(α) estimates of the good performers are actually low relative to what one would expect in a world where true α is zero.
Table 1: Percentiles of four-factor t(α) estimates for actual net fund returns, alongside the average value of t(α) at the selected percentiles from the 1,000 simulation runs in which true α is zero.
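The tail comparison can be made explicit by lining up, percentile by percentile, the actual t(α) values against the simulation averages quoted in the text. The values below are the rounded figures reported above; the verdict label is just shorthand for "actual below (above) the zero-alpha benchmark".

```python
# Table 1 values quoted in the text: t(alpha) percentiles for actual
# fund returns vs. the average from 1,000 zero-alpha simulations.
percentiles = [1, 10, 50, 90, 95, 99]
actual =      [-3.75, -2.46, -0.82, 0.64, 1.12, 1.40]
simulated =   [-2.30, -1.29, -0.04, 1.22, 1.63, 2.32]

for p, a, s in zip(percentiles, actual, simulated):
    verdict = "worse than luck alone" if a < s else "better than luck alone"
    print(f"{p:>3}th pct: actual {a:+.2f} vs simulated {s:+.2f} -> {verdict}")
```

Every actual percentile, left tail and right, sits below its simulated counterpart, which is exactly the pattern the text interprets as an absence of skill sufficient to cover costs.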

Conclusions
Alpha (risk-adjusted performance relative to a benchmark) and beta (systematic risk) are central to investment decisions about actively managed faith-based mutual funds. Investors use both alpha and beta to judge a mutual fund's performance. Investors invariably prefer a high alpha; their preference for beta depends on their tolerance for risk.
The bootstrap technique randomly generates a thousand time series with the same statistical properties as the original fund's return series, but with true alpha set to zero. If many of these simulated series show an alpha higher than the fund's actual alpha, then the probability of having obtained that alpha by chance alone is relatively high. Too much chance does not speak in favor of the fund manager's skill. Often, when a stock fund manager has a good year, it is due to chance.
Much of the benchmark-adjusted, skill/luck-based component of performance, alpha, seems to be siphoned off into higher fees. As long as present excessive costs persist, time does not run in favor of FBI fund shareholders. Many funds mirror the market as a whole; that is, they are "closet" index funds that mimic the market at much higher cost.