Introduction to Clinical Biostatistics MSC1090 Assignment
In this assignment we would like to learn a little about your own research and field of study.
To do so, we invite you to use a representative set of your *own* data. It does not need to be unpublished or new; it only needs to resemble the data you actually deal with in your research.
If you don’t have any data available, you can use other data that is close to your interests, either from the R data sets or from other sources, such as Open Data Toronto.
If you use one of the R data sets, do *not* use the ones we have presented and discussed in class!
The goal of this assignment is for you to incorporate several of the tools and techniques we have discussed in the course so far:
- You must create a git repository.
- You must have at least two modules: a main driver script and a utilities file in which the functions used by the main driver are defined.
- The functions should have arguments and return statements, for receiving information and returning whatever they were designed to compute.
- You must have one loading function that loads the data, whether your own or from another source.
- No global variables of any kind! That is, functions cannot access variables that are not passed to them.
- You must incorporate at least three of the statistical techniques discussed in class, each one in its own function:
  - computation of probabilities or statistical estimators
  - statistical hypothesis testing
  - statistical power analysis
- You can use some of the functions you created for assignments #5 and #6, or modifications of them.
- You are welcome to include other statistical methods that we haven’t discussed in class, but you must briefly explain them and also implement them as functions.
- You can also include shell scripting, in case you need to handle several files, as we did in assignment #3.
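As an illustration of the structure described above, here is a minimal sketch in R. It is not a model solution: the file names `utilities.R` and `main.R`, the function names, the column names `group` and `value`, and the toy numbers are all hypothetical placeholders, and the Welch t-test and two-sample power calculation are only examples of techniques you might choose. The two files are shown in one block so the sketch can be run as is.

```r
# ---- utilities.R (hypothetical file name) ----

# Load a CSV file and return it as a data frame.
load_data <- function(path) {
  data <- read.csv(path, stringsAsFactors = FALSE)
  return(data)
}

# Estimators: mean, standard deviation and size of a numeric vector.
summarize_sample <- function(x) {
  x <- x[!is.na(x)]
  return(list(mean = mean(x), sd = sd(x), n = length(x)))
}

# Hypothesis test: Welch two-sample t-test on two numeric vectors.
compare_groups <- function(x, y, alpha = 0.05) {
  result <- t.test(x, y)
  return(list(statistic = unname(result$statistic),
              p_value = result$p.value,
              reject_null = result$p.value < alpha))
}

# Power analysis: power of a two-sample t-test for a given
# per-group sample size, effect size (delta) and standard deviation.
compute_power <- function(n, delta, sd, alpha = 0.05) {
  result <- power.t.test(n = n, delta = delta, sd = sd, sig.level = alpha)
  return(result$power)
}

# ---- main.R (hypothetical file name) ----
# In a real project this file would begin with: source("utilities.R")

# Everything the functions need is passed in as an argument;
# none of them reads a variable defined outside its own body.
main <- function(path) {
  data <- load_data(path)
  treated <- data$value[data$group == "treated"]
  control <- data$value[data$group == "control"]

  treated_summary <- summarize_sample(treated)
  control_summary <- summarize_sample(control)
  test <- compare_groups(treated, control)
  power <- compute_power(n = min(treated_summary$n, control_summary$n),
                         delta = abs(treated_summary$mean - control_summary$mean),
                         sd = max(treated_summary$sd, control_summary$sd))

  return(list(treated = treated_summary, control = control_summary,
              test = test, power = power))
}

# Toy placeholder data, written to a temporary file so the sketch runs.
# Replace this with the path to your own data file.
toy <- data.frame(
  group = rep(c("treated", "control"), each = 5),
  value = c(5.1, 4.8, 5.6, 5.3, 4.9, 4.2, 4.5, 4.0, 4.4, 4.3)
)
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

results <- main(path)
print(results)
```

Returning a named list from each function keeps the driver short and makes it easy to quote individual values in the report. Note that `power.t.test` assumes a common standard deviation for both groups, so passing the larger of the two sample standard deviations is a conservative simplification rather than a recommendation.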
You will have to submit:
- the git log for the repository you created
- any data file used in the analysis
- your main driver and utilities file
- a short report, including the following sections:
  - Introduction: briefly introduce the field you work in, describe the data you will use and state the goal of your analysis.
  - Methods: describe the statistical methods you will use to analyze your data. If you are using a method not discussed in class, please provide a short description and a justification for choosing it.
  - Implementation: describe how you implemented the methods discussed in the previous section.
  - Results: present the results you obtained, interpreting the actual numerical values in the context of the data. If you have figures, add them here with a brief description and discussion.
  - Discussion: explain what advantages or disadvantages you found in using this implementation. We are especially interested in comparisons with other tools such as SPSS, STATA, SAS, G*Power, etc., particularly if you use those in your lab or group.
  - References: include references (if any) for your data and/or statistical methods.
Submit your main driver script and utilities file, any data set you decided to use, your report and the output of “git log” from your assignment repository to the ‘Assignment Dropbox’.