Part 1: Analysis [15]

This task involves finding a dataset of interest to you, that contains a mix of categorical (factors) and numerical variables. As a guideline, the dataset would typically have a minimum of two categorical variables and three numerical variables; these minima are guidelines and not hard thresholds.

If you wish you can make use of the following websites to find the dataset:

• The Irish government data repository: https://data.gov.ie/

• Google dataset search: https://datasetsearch.research.google.com/

The task is to use the methods covered in this course to complete an analysis and write a report using R Markdown on the data. The analysis of the data should involve the use of graphical summaries, tables and numerical summaries of the data.

This part of the project will be assessed in terms of:

• Using the functionality and settings of the appropriate functions in R.

• Clearly annotating the code in the R Markdown file.

• Producing clear results for the data.

• Summarizing the conclusions from the analysis appropriately.

Part 2: R Package [15]

This task involves finding an existing R package, that we didn’t use extensively in the course, and write a report demonstrating its use using R Markdown.

The report should demonstrate some of the key functionality of the package, but doesn’t need to demonstrate all of the functions (only the main ones).

This part of the project will be assessed in terms of:

• Clearly summarising the purpose of the package.

• Clearly demonstrating the functionality of some of the main functions in the package on appropriate data.

• Clearly showing the code and output for the demonstration examples.

Part 3: Functions/Programming [15]

This task is to write an R function (or set of functions) that can be used to provide a statistical analysis of interest. The function(s) should be documented by the code having comments and a working example.

The output from the function should use S3 or S4 classes and an appropriate print, summary and plot methods should be developed and demonstrated with an example.

This part of the project will be assessed in terms of:

• Writing a working function to provide an analysis of interest.

• Providing appropriate print, summary and plot methods for the output from the function.

• Clearly commenting the code and writing a clear example.

Tips for R Markdown

• Be aware that a common error is to give the same label to two different code chunks!

“`{r cars} summary(cars)

“`

“`{r cars} plot(cars) “`

You can fix this by changing the label to one of them:

“`{r cars2} plot(cars) “`

• If you want to improve the appearance of your plot in your knitted document you can set up the dimension of your figure:

“`{r, fig.height = 10, fig.width = 7, fig.align = -center-} plot(Nile) “`

• In case of an error in your code, add the option error = TRUE into the R chunk to run the code, show the error message on the knitted file. For example:

“`{r, error = TRUE} x – -a- sum(a) “`

• For all the available options for the R chunk, you can see here: https://yihui.name/ knitr/options/

• R Markdown website: https://rmarkdown.rstudio.com/

• R Markdown cheatsheet is available here: https://www.rstudio.com/resources/cheatsheets/

#rmarkdown

The post This task involves finding a dataset of interest to you, that contains a mix of categorical (factors) and numerical variables. As a guideline, the dataset would typically have a minimum of two categorical variables and three numerical variables; these minima are guidelines and not hard thresholds. If you wish you can make use of the following websites to find the dataset: • The Irish appeared first on My Academic Papers.