Regression Model Explaining the Excess Returns of the US Manufacturing Index

MULTIPLE REGRESSION PROJECT REPORT
Name
Course
Due Date
(Total words = 2943)
Table of Contents
1.0 Executive Summary
2.0 Introduction
3.0 Theoretical Framework
4.0 Economy and Econometric Theory
5.0 Data
6.0 Regression Analysis
6.1 Result Interpretation
7.0 Assumptions Testing
7.1 Normality
7.2 Multicollinearity
7.3 Heteroskedasticity
7.4 Autocorrelation
Reference List
1.0 Executive Summary
An empirical analysis was conducted in RStudio using annual data on the US manufacturing index's gross returns, excess market returns (Mkt-RF), small-minus-big (SMB), high-minus-low (HML), robust-minus-weak (RMW), conservative-minus-aggressive (CMA), and the risk-free rate (RF) to identify the relationship between these variables. The multiple regression output showed that the independent variables (Mkt.RF, HML, RMW, CMA, RF) together explain 86.37% of the variation in the returns of the US manufacturing index after adjusting for the number of independent variables (adjusted R² = 0.8637). Post-estimation tests conducted to confirm the reliability of the model indicated that it met the Gauss-Markov assumptions.
2.0 Introduction
The project aims to develop a regression model to explain the excess returns of the US manufacturing index, which is one of the main industry indices categorized by the four-digit Standard Industrial Classification (SIC) codes. The analysis uses annual data for the US manufacturing index's gross returns, excess market returns, small-minus-big, high-minus-low, robust-minus-weak, conservative-minus-aggressive, and risk-free rate (Fama and French 2017). The question of interest is how these variables relate to the excess returns of the US manufacturing index, with the aim of developing a reliable and accurate model that explains the variability in the index returns.
3.0 Theoretical Framework
Several economic theories suggest potential relationships of interest in the current analysis. The Capital Asset Pricing Model (CAPM) suggests that the excess market returns should have a positive relationship with the excess returns of the US manufacturing index (Rossi 2016; Galagedera 2007). According to CAPM, the market risk premium is the key driver of expected returns, and the excess returns of an individual asset are expected to be positively related to the excess returns of the market (Carhart 1997; Elbannan 2015; Fama and French 2004; Perold 2004). Therefore, a positive coefficient for the excess market returns variable in the regression model is expected.
The Fama-French Five-Factor Model, which extends the earlier three-factor model, suggests that the excess returns of the US manufacturing index are also influenced by other factors: small-minus-big (SMB), high-minus-low (HML), robust-minus-weak (RMW), and conservative-minus-aggressive (CMA). SMB captures the size effect in the stock market, as small-cap stocks tend to outperform large-cap stocks (Fama and French 1993, 2015, 2017). HML captures the value effect, as value stocks tend to outperform growth stocks. RMW captures the profitability effect, as companies with high profitability tend to outperform those with low profitability (Fama and French 1993, 2015, 2017). Finally, CMA captures the investment effect, as companies that invest conservatively tend to outperform those that invest aggressively.
Additionally, the risk-free rate (RF) is included in the model as an independent variable. According to economic theory, the risk-free rate represents the opportunity cost of investing in a risk-free asset, such as US Treasury bills, and is expected to have a negative relationship with the excess returns of the US manufacturing index (Adam et al. 2021; Fabozzi et al. 2008). Therefore, a negative coefficient for the risk-free rate variable in the regression model is expected (Livdan et al. 2009).
4.0 Economy and Econometric Theory
The economic model is a theoretical framework that describes the relationships between economic variables based on economic theory (Gibbard and Varian 1978). In the context of the regression model for the excess returns of the US manufacturing index, the economic model identifies the factors that are expected to influence the excess returns, such as the excess market returns, small-minus-big, high-minus-low, robust-minus-weak, conservative-minus-aggressive, and the risk-free rate, based on economic theory (Chen, Chang, and Du 2012).
On the other hand, the econometric model is a statistical model that estimates the relationship between the dependent variable and the independent variables using empirical data (Baltagi 2008; Chong and Hendry 1986; Intriligator 1983; Nachane 2006). The econometric model estimates the coefficients of the independent variables, which indicate the strength and direction of the relationship between each independent variable and the dependent variable (Giles 1999; Phillips 1988).
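As a sketch of what the econometric model looks like in this setting, the specification below uses the regressors named in Section 6.1 (Mkt.RF, HML, RMW, CMA, RF). It is an assumption about the form of the estimated equation, not a reproduction of the paper's output: the excerpt does not state whether Manuf enters gross or net of the risk-free rate, and SMB is left out only because it is not listed among the reported regressors.

```latex
\mathrm{Manuf}_t = \beta_0
  + \beta_1\,(\mathrm{Mkt}-\mathrm{RF})_t
  + \beta_2\,\mathrm{HML}_t
  + \beta_3\,\mathrm{RMW}_t
  + \beta_4\,\mathrm{CMA}_t
  + \beta_5\,\mathrm{RF}_t
  + u_t
```

Under the theoretical framework in Section 3.0, the expected signs would be \beta_1 > 0 and \beta_5 < 0.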
5.0 Data
The dataset provided for the regression model includes monthly and annual data for the following variables:
• Manuf: The gross return of the US manufacturing index
• Mkt-RF: Excess return on the market, which is the market gross return minus the risk-free rate
• SMB: Small Minus Big
• HML: High Minus Low
• RMW: Robust Minus Weak
• CMA: Conservative Minus Aggressive
• RF: Risk-free rate
The data used in this regression model are time series, as they include observations for the variables at different points in time (Brillinger 2001; Wang et al. 2013). Specifically, there are monthly observations for the variables Manuf, Mkt-RF, SMB, HML, RMW, and RF, and annual observations for the variable CMA (Fu 2011; Hyndman 2015; Liao 2005). The dataset therefore mixes two sampling frequencies, and the analysis treats it as a combination of time series and panel data.
One potential limitation of using time series data is that it can be affected by serial correlation, or autocorrelation, which occurs when successive observations are correlated (Shrestha and Bhatta 2018; Hyndman 2015). Autocorrelated errors make the OLS coefficient estimates inefficient and bias the usual standard errors, which affects the validity of statistical tests. Additionally, time series data can be influenced by seasonality, trend, and cyclical effects, which may need to be accounted for in the regression model (Cryer and Kellet 1991). A potential limitation of panel data is cross-sectional dependence, which occurs when the observations for different individuals or entities are correlated (Cohen 2014). This can likewise lead to biased and inefficient coefficient estimates and can affect the validity of statistical tests. Panel data can also be influenced by unobserved heterogeneity, selection bias, and endogeneity, which may need to be accounted for in the regression model (Hsiao 1985).
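One common way to screen regression residuals for first-order autocorrelation is the Durbin-Watson statistic. The excerpt does not say which test the paper applies in Section 7.4, and the paper's own analysis was run in RStudio, so the Python sketch below is only an illustration; the function name and the toy residuals are hypothetical.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic: sum of squared first differences of the
    residuals divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    return numerator / denominator


# Hypothetical residual patterns, not taken from the paper's data.
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every period
persistent = [1.0, 1.0, 1.0, 1.0]     # no change from period to period

dw_alternating = durbin_watson(alternating)
dw_persistent = durbin_watson(persistent)
print(dw_alternating, dw_persistent)
```

The statistic lies between 0 and 4. Values near 2 are consistent with no first-order autocorrelation, values toward 0 point to positive autocorrelation, and values toward 4 point to negative autocorrelation.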
6.0 Regression Analysis
The Gauss-Markov assumptions are the conditions under which the Ordinary Least Squares (OLS) estimator is unbiased and efficient (Hansen 2022). They matter because, when they hold, OLS is the best linear unbiased estimator (BLUE) of the coefficients in the regression model (Oktaba 1984; Shaffer 1991). First, the relationship between the dependent variable and the independent variables must be linear in the parameters (Hallin 2014). Second, the observations should be randomly drawn from the population of interest. Third, there should be no perfect linear relationship among the independent variables (Hallin 2014). Fourth, the error term must have a zero conditional mean (Grob 2004); that is, the conditional mean of the error term given the independent variables is zero, E(u|X) = 0 (Brace and Musiela 1994). Finally, the error terms should have a constant variance and should not be correlated with one another over time or across observations (Hallin 2014). Normality of the errors is not needed for the BLUE result, but it is commonly added as a classical linear model assumption so that t- and F-tests are exact in finite samples. A violation of the zero conditional mean assumption biases the coefficient estimates, while heteroskedasticity or autocorrelation makes them inefficient and invalidates the usual standard errors, which may affect the validity of the statistical tests and the model's predictive power (Shaffer 1991).
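In matrix notation, the assumptions above can be summarized as follows. The symbols are assumptions introduced here for illustration: y is the n-vector of the dependent variable, X the n-by-(k+1) matrix of regressors including a constant, and u the error vector.

```latex
\begin{aligned}
& y = X\beta + u, \qquad \operatorname{rank}(X) = k + 1, \\
& \mathbb{E}[u \mid X] = 0, \qquad \operatorname{Var}(u \mid X) = \sigma^2 I_n, \\
& \hat{\beta} = (X'X)^{-1} X'y, \qquad
  \mathbb{E}[\hat{\beta} \mid X] = \beta, \qquad
  \operatorname{Var}(\hat{\beta} \mid X) = \sigma^2 (X'X)^{-1}.
\end{aligned}
```

The rank condition rules out perfect collinearity, the zero conditional mean delivers unbiasedness, and the scalar covariance matrix (constant variance, no correlation across observations) is what makes the OLS variance the smallest among linear unbiased estimators.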
6.1 Result Interpretation
The intercept in a linear regression model is the expected value of the dependent variable when all independent variables are zero (Ramsey 1977; Nunez et al. 2011). In this case, the intercept represents the expected value of the dependent variable (stock market return) when all the independent variables (Mkt.RF, HML, RMW, CMA, RF) are zero. Each slope coefficient, by contrast, represents the expected change in the dependent variable (stock market return) for a one-unit change in the corresponding independent variable, holding all other independent variables constant. In other words, a coefficient is the partial effect of an independent variable on the dependent variable (Poole and O’Farrell 1971).
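To make the meaning of the intercept and the partial effects concrete, the following minimal Python sketch estimates a two-regressor model by solving the normal equations. It is not the paper's RStudio code; the function names and the toy data, which are generated exactly from y = 1 + 2·x1 − 3·x2, are hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y: Sequence[float], regressors: Sequence[Sequence[float]]) -> List[float]:
    """Return [intercept, slope_1, ..., slope_k] from X'X b = X'y."""
    rows = [[1.0] + list(r) for r in regressors]  # prepend the constant
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data with a known intercept (1) and slopes (2 and -3).
x = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (3.0, 1.0), (4.0, 3.0)]
y = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in x]

coefs = ols(y, x)
print([round(c, 6) for c in coefs])
```

Because the toy data contain no noise, the estimates recover the generating values: the first element is the value of y when both regressors are zero, and each slope is the change in y for a one-unit change in its regressor with the other held fixed.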
R² measures how well the independent variables explain the variation in the dependent variable (Katipamula et al. 1998). In this case, the R² value of 0.8643 indicates that the independent variables (Mkt.RF, HML, RMW, CMA, RF) explain 86.43% of the variation in the stock market return. Adjusted R² penalizes the addition of independent variables that do not contribute meaningfully to the model's explanatory power (Kramer and Sonnberger 2012). In this case, the adjusted R² value of 0.8637 indicates that the independent variables (Mkt.RF, HML, RMW, CMA, RF) together explain 86.37% of the variation in the stock market return, after adjusting for the number of independent variables.
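The adjustment uses the standard formula adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of regressors. The Python sketch below applies it to the reported R² of 0.8643 with k = 5. The sample size is not given in this excerpt, so n = 600 is a purely hypothetical value and the result is not expected to match the reported 0.8637.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_regressors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    degrees_of_freedom = n_obs - n_regressors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than estimated parameters")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / degrees_of_freedom


# R^2 and k come from the text; n_obs=600 is a hypothetical sample size.
example = adjusted_r_squared(0.8643, n_obs=600, n_regressors=5)
print(round(example, 4))
```

With a large sample relative to the number of regressors, the penalty is small, which is consistent with the reported R² and adjusted R² differing only in the fourth decimal place.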
Omitting an independent variable that has a significant partial effect on the dependent variable can bias the estimated coefficients of the remaining independent variables. The bias arises when the omitted variable is correlated with the included independent variables, because its effect is then partly attributed to them; this is known as omitted variable bias (Fahrmeir et al. 2022). Therefore, it is essential to include all relevant independent variables in the model.
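The standard two-regressor case shows where the bias comes from. The notation is introduced here for illustration: the true model contains x_1 and x_2, the estimated model omits x_2, and \delta_1 is the slope from regressing x_2 on x_1.

```latex
\begin{aligned}
\text{True model:}\quad & y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u, \\
\text{Auxiliary regression:}\quad & x_2 = \delta_0 + \delta_1 x_1 + v, \\
\text{Short regression of } y \text{ on } x_1:\quad
  & \mathbb{E}[\tilde{\beta}_1] = \beta_1 + \beta_2\,\delta_1 .
\end{aligned}
```

The bias term \beta_2 \delta_1 vanishes only if the omitted variable has no partial effect (\beta_2 = 0) or is uncorrelated with the included regressor (\delta_1 = 0).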
The t-value is the number of standard errors by which the coefficient estimate differs from zero. A larger absolute t-value indicates stronger evidence against the null hypothesis that the coefficient is zero (Harrell and Bios 2017). In this case, all the coefficients except for the intercept are statistically significant at the 5% significance level (p-value < 0.05). The p-value is the probability of observing a t-value as extreme or more extreme than the one observed if the null hypothesis (the coefficient is equa...