Data Analysis for Research with RStudio

Statistics for Research

Purpose:

Statistics helps in designing studies, analyzing data, testing hypotheses, and drawing conclusions.

 Types of Research:

 Descriptive – summarize data (mean, SD)

 Inferential – test hypotheses, generalize to populations

 Experimental – test cause-and-effect relationships through a controlled intervention (e.g., clinical trials)

 Observational – analyze associations without intervening (e.g., stroke incidence vs. age)
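 Example (descriptive statistics): a minimal sketch of the "summarize data (mean, SD)" step, written in Python with only the standard library. The blood-pressure values are hypothetical and exist only for illustration. In R, the same summary would typically come from mean(), sd() and median().

```python
import statistics

# Hypothetical sample: systolic blood pressure (mmHg) for eight participants.
systolic_bp = [118, 125, 132, 121, 140, 128, 135, 119]

n = len(systolic_bp)
mean_bp = statistics.mean(systolic_bp)
sd_bp = statistics.stdev(systolic_bp)      # sample SD (n - 1 in the denominator)
median_bp = statistics.median(systolic_bp)

print(f"n = {n}, mean = {mean_bp:.2f}, SD = {sd_bp:.2f}, median = {median_bp:.1f}")
```

 Reporting the sample size next to the mean and SD lets a reader judge how stable the summary is.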

 Key Statistical Methods:

 t-test / ANOVA – compare means between two groups (t-test) or across three or more groups (ANOVA)

 Chi-square test – test for association between categorical variables

 Linear Regression – predict continuous outcomes

 Logistic Regression – predict binary outcomes (e.g., stroke risk)

 Poisson / Negative Binomial (NB) regression – model count data (NB when the counts are overdispersed)

 Cox Regression / Kaplan-Meier (KM) curve – survival / time-to-event analysis

 ARIMA / Time Series – trend forecasting (e.g., pollution, GDP)

 Panel Data Models – repeated observations of the same units over time (e.g., economic or environmental data)
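 Example (two of the methods above): a rough sketch of a Pearson chi-square test on a 2x2 table and a simple least-squares line, using only the Python standard library. The function names chi_square_2x2 and fit_line and all the numbers are hypothetical, chosen for illustration. The chi-square p-value shortcut is valid only for 1 degree of freedom (a 2x2 table), and no continuity correction is applied. In R, the usual routes are chisq.test(table, correct = FALSE) and lm(y ~ x).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, chi2 is a squared standard normal variable,
    # so the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical counts: rows = exposed / unexposed, columns = stroke / no stroke.
observed = [[30, 70], [15, 85]]
chi2, p_value = chi_square_2x2(observed)
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")

# Hypothetical continuous data: predictor x and outcome y.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(x, y)
print(f"fitted line: y = {intercept:.2f} + {slope:.2f} * x")
```

 For real analyses, a statistics package also gives standard errors, diagnostics and support for larger tables, which this sketch leaves out.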

 Visualization Tools:

 Histogram, Boxplot, Violin plot, Scatterplot, ROC curve, Survival curves

 Important Concepts:

 p-value, Confidence Interval, Effect Size, Correlation
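 Example (confidence interval, effect size, correlation): a small sketch of three of the concepts above, using only the Python standard library. The helper names mean_ci, cohens_d and pearson_r and the data are hypothetical. The interval uses a normal approximation, which is an assumption made here for simplicity; with small samples a t-based interval (e.g., from t.test() in R) is wider and usually preferred.

```python
import math
from statistics import NormalDist, mean, stdev, variance


def mean_ci(values, level=0.95):
    """Approximate confidence interval for a mean (normal approximation)."""
    m = mean(values)
    se = stdev(values) / math.sqrt(len(values))   # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)     # about 1.96 for a 95% interval
    return m - z * se, m + z * se


def cohens_d(a, b):
    """Effect size: difference in means divided by the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements for two groups and a paired covariate.
treated = [5.1, 4.9, 5.6, 5.8, 5.3, 5.0]
control = [4.4, 4.7, 4.2, 4.9, 4.5, 4.6]
age = [34, 41, 29, 55, 47, 38]

low, high = mean_ci(treated)
print(f"treated mean 95% CI: ({low:.2f}, {high:.2f})")
print(f"Cohen's d (treated vs control): {cohens_d(treated, control):.2f}")
print(f"correlation (age vs treated): {pearson_r(age, treated):.2f}")
```

 A p-value says how surprising the data would be if there were no effect; the interval and the effect size say how large the effect plausibly is, which is why they are reported together.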

 Common Mistakes:

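 Misusing p-values (e.g., treating p < 0.05 as proof of an effect, or reporting p-values without effect sizes)

 Ignoring model assumptions (e.g., normality, independence, equal variances)

 Overfitting models (too many predictors for the available sample size)

 Not adjusting for confounders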
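 Example (why confounders matter): a brief sketch in Python, standard library only, showing how a crude comparison can point the opposite way from a comparison made within strata of a confounder. The counts are illustrative, and the labels "A", "B", "mild" and "severe" are hypothetical names for two treatments and two severity levels.

```python
# Illustrative counts: (successes, total) for each treatment,
# split by a confounder (disease severity).
data = {
    "mild":   {"A": (81, 87),   "B": (234, 270)},
    "severe": {"A": (192, 263), "B": (55, 80)},
}
arms = ("A", "B")

# Crude (unadjusted) success rate: pool the strata and ignore severity.
crude = {}
for arm in arms:
    successes = sum(data[stratum][arm][0] for stratum in data)
    total = sum(data[stratum][arm][1] for stratum in data)
    crude[arm] = successes / total

# Stratified success rate: compare the treatments within each severity level.
stratified = {
    stratum: {arm: data[stratum][arm][0] / data[stratum][arm][1] for arm in arms}
    for stratum in data
}

print("crude:", {arm: round(rate, 3) for arm, rate in crude.items()})
for stratum, rates in stratified.items():
    print(stratum + ":", {arm: round(rate, 3) for arm, rate in rates.items()})
```

 Here treatment B looks better in the crude comparison, yet treatment A has the higher success rate within both severity levels, because A was given mostly to severe cases. Stratifying, or including the confounder as a covariate in a regression model, removes this distortion.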

 Software Tools:

 R (widely used for academic research and statistical modeling)

 SPSS, Stata, Python (also widely used)

 Reporting:

 Always include: sample size, effect size, p-values, confidence intervals, tables, and clear plots
