The goal of the project is to develop a statistical model for a specific business problem and provide a thorough analysis of it.
Choose one topic: prediction of prices for a specific stock, bitcoin, commodity, futures contract, or option; prediction of a country’s GDP level using market indicators; prediction of auto sales; or analysis of bond liquidity.
You should choose a topic that interests you, construct a solid multivariable regression model by brainstorming which factors can be relevant to prediction, obtain data, and use R to carry out the statistical analysis. You are expected to start with a sufficiently large data set so that you can determine which factors are significant for the model (I expect 20-25 candidate factors). Including categorical variables along with continuous variables is strongly encouraged.
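As a rough illustration of the model-building step, here is a minimal R sketch of a multivariable regression that mixes continuous and categorical predictors. The data are simulated and every name in it (sales_data, auto_sales, interest_rate, fuel_price, region) is hypothetical; your own model will have many more candidate factors and real data.

```r
# Simulated, hypothetical data: replace with your own data set.
set.seed(42)
n <- 200
sales_data <- data.frame(
  interest_rate = runif(n, min = 1, max = 8),   # continuous predictor
  fuel_price    = runif(n, min = 2, max = 5),   # continuous predictor
  region        = factor(sample(c("North", "South", "West"), n, replace = TRUE))
)

# Response generated from an assumed linear relationship plus random noise.
sales_data$auto_sales <- with(
  sales_data,
  500 - 30 * interest_rate - 20 * fuel_price +
    ifelse(region == "South", 40, 0) + rnorm(n, sd = 25)
)

# lm() expands the factor 'region' into dummy variables automatically,
# using the first level ("North") as the baseline.
fit_full <- lm(auto_sales ~ interest_rate + fuel_price + region, data = sales_data)
print(summary(fit_full))
```

The summary table reports one coefficient per continuous factor and one per non-baseline level of the categorical factor, each with its own t-test.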
Check the assumptions of the model by carrying out residual analysis and provide recommendations if there are significant violations. Test relevant hypotheses. Remove insignificant variables and analyze the new model using R tools.
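The residual analysis and model reduction described above might look roughly like the following sketch. It uses base R only and simulated data; the names demo, y, x1, x2, noise_factor, and segment are hypothetical, and the choice of the Shapiro-Wilk test and a partial F-test is one reasonable option, not the only acceptable one.

```r
# Simulated, hypothetical data; noise_factor has no true effect by construction.
set.seed(1)
n <- 200
demo <- data.frame(
  x1           = rnorm(n),
  x2           = rnorm(n),
  noise_factor = rnorm(n),
  segment      = factor(sample(c("A", "B"), n, replace = TRUE))
)
demo$y <- with(demo, 10 + 2 * x1 - 1.5 * x2 + ifelse(segment == "B", 3, 0) + rnorm(n))

fit_full <- lm(y ~ x1 + x2 + noise_factor + segment, data = demo)

# Standard diagnostic plots (residuals vs fitted, normal Q-Q, scale-location,
# residuals vs leverage), written to a PDF in the session's temporary directory.
diag_file <- file.path(tempdir(), "diagnostics.pdf")
pdf(diag_file)
par(mfrow = c(2, 2))
plot(fit_full)
dev.off()

# Formal check of residual normality (Shapiro-Wilk).
normality_test <- shapiro.test(residuals(fit_full))
print(normality_test)

# Individual t-tests for each coefficient.
print(coef(summary(fit_full)))

# Drop a candidate variable (in practice, choose it from the tests above)
# and compare the nested models with a partial F-test.
fit_reduced <- update(fit_full, . ~ . - noise_factor)
f_test <- anova(fit_reduced, fit_full)
print(f_test)
```

A large p-value in the partial F-test suggests the dropped variable does not improve the fit; remember to rerun the residual diagnostics on the reduced model as well.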
Submit a report covering the background information (the topic of research), the objective of your research, and a description of the model, followed by rigorous statistical analysis and conclusions.
There is no requirement on the size of the final report, but typically it should be 10-15 pages plus an Appendix containing your R code. Make sure to list all additional resources you use, including the data sources, in the reference list.
An abstract is a concise summary of a research paper or entire thesis. It is an original work, not an excerpted passage. An abstract must be fully self-contained and make sense by itself, without further reference to outside sources or to the actual paper. I require the abstract to be no more than 400 words. The abstract is essentially a short presentation of your work that you can use, for example, during a job interview.
Order of assignment
Cover page
Abstract
Background information
Description of factors (20-25 requested)
Discussion of the assumptions of the model backed up by graphs and analysis
Recommendations in case of violation of the assumptions
Statistical analysis (hypothesis testing)
Proper citation and reference list
Presentable source files and code