RStats Glossary


Data Collection

Power analysis - a procedure often used before data collection to determine the smallest sample size needed to detect an effect of a given magnitude (typically based on theory, previous research, and/or practical importance) at the desired level of significance. It can also be used after data collection to evaluate the strength of the sample
Population - a collection of units to which we want to generalize a set of findings or a statistical model 
Sample - a smaller but hopefully representative collection of units from a population, used to determine truths about the population 
Sampling distribution - the probability distribution of a statistic, the distribution of possible values of a given statistic that we could expect from a given population 
Generalization - the ability of a statistical model to be applied to the population at large 
Qualitative methods - research involving unstructured or semi-structured techniques, such as gathering observations and interviews, with an emphasis on the qualities of individual experience as opposed to measurable, quantifiable data points. Particularly useful in preliminary research or in the form of targeted open-ended questions, these methods can capture nuance and supplemental information that may not be covered in a multiple-choice survey, for instance. This insight may help guide subsequent research, detect misunderstandings or unanticipated problems in a survey or experiment, or help explain low completion rates. 
Quantitative methods - inferring evidence for a theory through measurement of variables that produce numeric outcomes 
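
As a rough illustration of the power analysis entry above, the sketch below estimates the sample size per group for a two-group, two-tailed comparison. It assumes the common normal-approximation formula n = 2 * ((z_alpha + z_power) / d)^2; the function name n_per_group and the default values are hypothetical, and dedicated power software (which uses the t distribution) may give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate sample size per group (normal approximation, two-tailed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the alpha level
    z_power = NormalDist().inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


medium = n_per_group(0.5)  # effect size d = 0.5
large = n_per_group(0.8)   # effect size d = 0.8
print(medium, large)       # 63 25
```

Smaller expected effects require larger samples: halving the effect size roughly quadruples the required n.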

Data Type

Categorical data - any variable including categories of objects or entities (e.g. freshmen, sophomores, juniors, seniors) 

  • Levels - different values within a factor in an experiment. For instance, let’s say 0 mg, 5 mg, 10 mg, and 15 mg of a medicine are administered to patients and the efficacy is compared in an experiment. The four different conditions represent four levels of the independent variable “Dosage”. 

Binary variable - (AKA dichotomous variable) a categorical variable with two mutually exclusive categories (e.g. dead or alive, yes or no, pregnant or not pregnant)
Continuous variable - a variable that can be measured to any level of precision. Time is a continuous variable, because there is no limit to how finely it can be measured.  
Interval data - data measured on a scale along the whole of which intervals are equal 

  • Ratio variable - an interval variable with the additional property that ratios are meaningful

Ordinal variable - ranked data, without a measure of the differences between values (e.g. in a race, we have first place, second place, and third place, although it is not clear how much faster the higher places were)

Research Design


Construct -  an underlying concept, characteristic, ability or skill that a given measure is intended to test. For example, an IQ test is one form of measurement of the construct "intelligence". 
Hypothesis - a prediction about the state of the world. In other words, what do you expect to find?

  • Experimental hypothesis - predicted relationship between variables 
  • Null hypothesis - the reverse of the experimental hypothesis: the claim that your prediction is wrong and the predicted effect does not exist 

Variable - anything that can be measured and can differ across entities or across time 

  • Dependent variable (AKA DV, outcome variable, y) - the variable being tested in a research experiment
  • Independent variable (AKA IV, predictor, factor, explanatory variables, regressor variables) - the variable manipulated or identified by the experimenter as predicting or having an association with the outcome variable 
  • Latent variable - a variable that cannot be directly measured, but is assumed to be related to several variables that can be measured.  

Validity - evidence that a study allows correct inferences about the question it was aimed to answer or that a test measures what it set out to measure conceptually 

  • Content validity - evidence that the content of a test corresponds to the construct it was designed to measure 
  • Ecological validity - evidence that the results of a study, experiment, or test can be applied, and allow inferences, to real-world conditions
  • Type I error - occurs when we believe that there is a genuine effect in the population, when there is not 
  • Type II error - occurs when we believe there is no effect, when in fact, there is  
  • Confounding variables (AKA confounds) - factors unrelated to the experiment that may impact the outcome, perhaps masking an effect or causing an observed effect to appear more influential than it actually is. An example may be test fatigue or boredom during a test. Confounding factors should be carefully considered during the planning phase of a research project and either avoided if possible or addressed in the limitations section of a writeup

    • Practice effect - refers to the possibility that participants' performance may be influenced by repetition or increasing familiarity with a task

    • Suppressor effects - when a predictor has a significant effect, but only when another variable is held constant

    • Randomization - random assignment of participants to varying treatment conditions to prevent test order effects, sampling effects, etc. 

Reliability - the ability of a measure to produce consistent results when the same entities are measured under different conditions 

  • Test-retest reliability - the ability of a measure to produce consistent results when the same entities are tested at two different points in time 
  • Split-half reliability -  a measure of internal consistency in which one half of a test is compared to the other half

Power - the ability of a test to detect an effect of a particular size 

Test Formats

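Normal distribution - a symmetric, bell-shaped probability distribution of a random variable, fully described by its mean and standard deviation, with known probabilities (e.g. about 95% of values fall within two standard deviations of the mean) 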

  • Central limit theorem - theorem stating that when samples are large (a common rule of thumb is above 30), the sampling distribution of the mean will take the shape of a normal distribution regardless of the shape of the population from which the sample was drawn.  
  • Parametric test - a test based on the normal distribution, generally requiring four basic assumptions, normality, homogeneity of variance, interval or ratio data, and independence of observations 
  • Two-tailed test - a test of a non-directional hypothesis, i.e. one that predicts a relationship or difference without specifying its direction
  • Probability distribution - a curve describing an idealized frequency distribution of a particular variable from which it is possible to ascertain the probability to which specific values of that variable will occur  
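
The central limit theorem entry above can be checked with a small simulation. This is only a sketch under stated assumptions: the population is a hypothetical exponential distribution (mean 1, standard deviation 1, strongly skewed), and SAMPLE_SIZE, N_SAMPLES, and the seed are arbitrary choices.

```python
import random
from statistics import fmean, stdev

random.seed(42)  # fixed seed so the sketch is reproducible

SAMPLE_SIZE = 40   # "large" by the n > 30 rule of thumb
N_SAMPLES = 5000   # number of repeated samples drawn from the population

# Mean of each repeated sample from a right-skewed exponential population.
sample_means = [
    fmean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(N_SAMPLES)
]

centre = fmean(sample_means)  # expected to be close to the population mean, 1
spread = stdev(sample_means)  # expected to be close to 1 / sqrt(40), about 0.158

# For a roughly normal sampling distribution, about 68% of the sample means
# should fall within one standard deviation of the centre.
share_within_one_sd = sum(abs(m - centre) <= spread for m in sample_means) / N_SAMPLES
print(round(centre, 2), round(spread, 2), round(share_within_one_sd, 2))
```

Even though individual scores are heavily skewed, the distribution of sample means is approximately bell-shaped, which is why many parametric tests tolerate non-normal populations when samples are large.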

Monotonic relationship -  two variables that demonstrate either an increasing or decreasing relationship, though not necessarily at the same rate. This may result in a curved pattern as opposed to a linear straight line. 

Polynomial - a growth curve, or trend over time 

  • Linear model - a monotonic model based on a straight line, may include correlation or regression
  • Non-linear model

Meta-analysis - a statistical procedure used to assimilate research findings 
Likert scale - a common 5- to 7-point scale used in survey research in which participants rate their agreement or disagreement with a statement, resulting in ordinal data useful in analysis

Inferential statistics -  used for generalization in making predictions (“inferences”) about a population based on the test results from a sample


Normality - refers to the tendency of data to fall along a bell-shaped distribution, a feature on which estimation using parametric testing is mathematically based

  • Skew - a measure of the symmetry of the frequency distribution. Symmetrical distributions have a skew of 0
  • Positive skew - frequent scores are clustered at the lower end of the distribution, and the tail points towards higher or more positive scores
  • Kurtosis - measurement of the degree to which scores cluster in the tails of a frequency distribution
  • Leptokurtic - kurtosis > 0. Too many scores in the tails; the distribution is too peaked
  • Platykurtic - kurtosis < 0. Too few scores in the tails; the distribution is quite flat
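
To make the skew and kurtosis entries above concrete, here is a minimal sketch that computes both from central moments. It assumes the simple population (biased) moment formulas and reports excess kurtosis (kurtosis minus 3, so a normal distribution scores 0); statistical packages often apply small-sample corrections, so their values may differ slightly. The function name and example data are hypothetical.

```python
from statistics import fmean


def skew_and_excess_kurtosis(scores):
    """Return (skew, excess kurtosis) using population moment formulas."""
    m = fmean(scores)
    n = len(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n  # second central moment (variance)
    m3 = sum((x - m) ** 3 for x in scores) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in scores) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 10]  # tail points towards higher scores

sym_skew, sym_kurt = skew_and_excess_kurtosis(symmetric)
pos_skew, _ = skew_and_excess_kurtosis(right_skewed)
print(sym_skew, round(sym_kurt, 2), pos_skew > 0)  # 0.0 -1.3 True
```

The flat, evenly spread scores in `symmetric` give a skew of 0 and a negative (platykurtic) kurtosis, while the long upper tail in `right_skewed` gives a positive skew.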

Sphericity - an assumption in repeated-measures ANOVA that the variances of the differences between data taken from the same participant or entity are equal

  • Mauchly's test of sphericity - a formal test of the assumption of sphericity in repeated-measures ANOVA, used to assess whether the variances of the differences between all combinations of related groups (levels) are equal
  • Epsilon (ε) - an estimate of the departure from sphericity, with a maximum value of 1. Values closer to 1 indicate the assumption is met, while values much less than 1 indicate the assumption is violated
  • Greenhouse-Geisser correction - common conservative correction for a violation of sphericity, particularly when epsilon is less than .75.
  • Huynh-Feldt correction - a more liberal and less common correction for a violation of sphericity than Greenhouse-Geisser, as it can tend to overestimate epsilon. Researchers may consider using this if epsilon is greater than .75.

Homogeneity (of variance) - the assumption that the variance of one variable is stable (i.e. relatively similar) at all levels of another variable

  • Heterogeneity - case in which the variance of one variable differs across levels of another variable
  • Levene’s test - tests the assumption of equality of variance (AKA homogeneity of variance, i.e. whether the spread of values around the mean is similar) across two or more groups, e.g. the levels of the independent variable. 
  • Heteroscedasticity - residuals of each level of the predictor variables have unequal variances
  • Homoscedasticity - an assumption in regression analysis that the residuals at each level of the predictor variable(s) have similar variances. In other words, at each point along a given predictor variable, the spread of residuals should be fairly constant
  • Q-Q plot - a graph plotting the quantiles of a variable against the quantiles of a particular distribution. Values falling along the diagonal of the plot demonstrate similar distributions. Values deviating from the diagonal show deviations from the distribution of interest.  

Multicollinearity - two or more variables are very closely linearly related

Singularity - perfect correlation between variables, (correlation coefficient of either 1 or -1)

Leverage - gauges the influence of the observed value of the outcome variable over the predicted values

Cook's distance - in least-squares regression, Cook's distance is used to determine whether a single data point has an undue influence on the statistical results

Independence - the assumption that one data point does not influence another

Collinearity - used to describe independent variables that are highly correlated, which may negatively impact the validity of the analysis

Tests & Test Statistics


Parametric Tests of Association

Pearson's correlation

  • Test statistic - r
  • Effect size - r
  • Number of variables and data type - two continuous paired observations
  • Assumptions - linearity, no significant outliers, bivariate normality

Chi-square test of independence

  • Test statistic - χ²
  • Effect size - Cramer’s V
  • Number of variables and data type - at least two categorical variables
  • Assumptions - independence of observations, all cells have expected counts greater than or equal to five
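
For the chi-square test of independence listed above, the following sketch works through the expected counts, the chi-square statistic, and Cramer's V by hand. The 2 x 2 table of counts is hypothetical, and the code omits the p-value lookup that a statistics package would provide.

```python
from math import sqrt

# Hypothetical 2 x 2 contingency table of observed counts (rows x columns).
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count per cell if the variables were independent:
# row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Cramer's V rescales chi-square by sample size and the smaller table dimension.
min_dim = min(len(observed), len(observed[0])) - 1
cramers_v = sqrt(chi_square / (n * min_dim))

# Assumption check: every expected count should be at least five.
assumption_met = all(e >= 5 for row in expected for e in row)
print(round(chi_square, 2), round(cramers_v, 2), assumption_met)  # 16.67 0.41 True
```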

Differences Between Groups

t-tests

  • Independent t-test
    • Test statistic - t
    • Effect size - Cohen's d
    • Number of variables and data type - one continuous DV, one categorical IV consisting of two independent groups
    • Assumptions - independence of observations, no significant outliers, normality, homogeneity
  • Dependent t-test
    • Test statistic - t
    • Effect size - Cohen's d
    • Number of variables and data type - one continuous DV, one IV consisting of two categorical related groups or matched pairs
    • Assumptions - no significant outliers, normality

Analysis of Variance (ANOVA) - inferential statistical procedure utilizing the F-ratio to test the overall fit of a linear model, usually defined in terms of group means

  • One-way between-subjects ANOVA
    • Test statistic - F
    • Effect size - eta squared (η²), partial eta squared (η²p)
    • Number of variables and data type - categorical, more than two independent groups
  • Two-way between-subjects ANOVA
    • Test statistic - F
    • Number of variables and data type - categorical, more than two independent groups
  • Repeated measures ANOVA
    • Test statistic - F
    • Number of variables and data type - categorical, more than two repeated observations
  • Mixed ANOVA
    • Test statistic - F
    • Number of variables and data type - categorical, more than two independent groups with repeated observations
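
As an illustration of the independent t-test entry above, this sketch computes the t statistic, its degrees of freedom, and Cohen's d from two small groups. The scores are hypothetical, and the pooled-variance form used here assumes homogeneity of variance; it stops short of looking up a p-value.

```python
from math import sqrt
from statistics import mean, variance

group_a = [5, 6, 7, 8, 9]  # hypothetical scores, condition A
group_b = [3, 4, 5, 6, 7]  # hypothetical scores, condition B

n_a, n_b = len(group_a), len(group_b)
mean_diff = mean(group_a) - mean(group_b)

# Pooled variance: a weighted average of the two sample variances,
# appropriate only when the group variances are similar.
pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)

t_stat = mean_diff / sqrt(pooled_var * (1 / n_a + 1 / n_b))  # difference / standard error
df = n_a + n_b - 2
cohens_d = mean_diff / sqrt(pooled_var)  # difference in pooled standard deviation units
print(round(t_stat, 2), df, round(cohens_d, 2))  # 2.0 8 1.26
```

The t statistic depends on sample size through the standard error, whereas Cohen's d does not, which is why d is reported as the effect size.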

Parametric Tests for Prediction

Regression

  • Linear regression - expands upon correlation, used to predict a variable based on another variable
    • Test statistic - F
    • Effect size - Cohen's f², R², β (beta), b
    • Number of variables and data type - one continuous IV, one continuous DV
    • Assumptions - independence of errors (residuals), linearity, homoscedasticity of residuals, no multicollinearity, no significant outliers or cases with high leverage or influence, normal distribution of errors (residuals)
  • Standard multiple regression - an extension of linear regression used to predict a variable based on two or more variables
    • Test statistic - F
    • Effect size - Cohen's f², R², β (beta), b
    • Number of variables and data type - one continuous DV, two or more continuous or nominal IVs
  • Hierarchical multiple regression
    • Test statistic - F
    • Effect size - Cohen's f², R², β (beta), b
    • Number of variables and data type - one continuous DV, two or more continuous or nominal IVs
  • Binomial logistic regression
    • Test statistic - F
    • Effect size - Cohen's f², R², β (beta), b

Nonparametric Tests

Association

  • Spearman's correlation
    • Test statistic - rs or ρ (rho)
    • Number of variables and data type - two continuous and/or ordinal paired observations
    • Assumptions - monotonic relationship

Differences Between Groups

  • Mann-Whitney U test
    • Test statistic - U, z
    • Number of variables and data type - one continuous or ordinal DV, one categorical IV consisting of two independent groups
  • Wilcoxon's signed-rank test
    • Test statistic - z
    • Number of variables and data type - one continuous or ordinal DV, one categorical IV consisting of two related groups or matched pairs
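
Spearman's correlation, listed above, can be understood as Pearson's correlation applied to ranks. The sketch below assumes that definition (with tied values sharing their average rank); the helper names and the data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 0-based positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))


hours = [1, 2, 3, 4, 5]
score = [1, 4, 9, 16, 100]  # always increasing, but not at a constant rate

rho = spearman_rho(hours, score)
r = pearson_r(hours, score)
print(round(rho, 2), round(r, 2))
```

Because the relationship is perfectly monotonic, the rank-based coefficient reaches 1 while Pearson's r, which measures linear association, is lower.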

Descriptive statistics -  used to summarize data rather than make generalizations or inferences about a population - may include percentages, ratios, measures of central tendency such as mean, and measures of variability (spread of the data).

  • Central tendency (AKA Measures of Central Tendency) - refers to the center of a frequency distribution of observations as measured by the mean, median, and mode
    • Mean (μ) - a simple statistical model of the center of the distribution of scores. A hypothetical estimate of the “typical” score
    • Median - the middle score of a set of observations
    • Mode - the most frequently occurring score in a set of data
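
A short sketch of the three measures of central tendency above, using Python's standard statistics module on hypothetical scores that include one extreme value:

```python
from statistics import mean, median, mode

scores = [2, 3, 3, 4, 5, 7, 18]  # hypothetical scores; 18 is an extreme value

centre_mean = mean(scores)      # pulled upwards by the extreme score
centre_median = median(scores)  # middle score once the data are sorted
centre_mode = mode(scores)      # most frequently occurring score
print(centre_mean, centre_median, centre_mode)  # 6 4 3
```

The mean is more sensitive to extreme scores than the median or mode, which is one reason to report more than one measure for skewed data.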

Degrees of freedom - the number of entities that are allowed to vary when estimating a statistical parameter. Determines the probability distribution for test statistics

Chi-square test of association (AKA Chi-square test of independence) - used to assess the relationship between two or more categorical variables

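F-ratio - a test statistic used in analysis of variance (ANOVA) procedures, calculated as the ratio of the variability explained by the model to the variability it leaves unexplained. It is used to determine whether differences between group means are statistically significant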

Model sum of squares - a measure of the total amount of variability for which a model can account, derived from the difference between the total sum of squares and the residual sum of squares
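
The F-ratio and model sum of squares entries above fit together as follows. This is a minimal sketch for a one-way between-subjects design with hypothetical scores; it computes the sums of squares, the F-ratio, and eta squared, but not the p-value.

```python
from statistics import mean

# Hypothetical scores for three independent groups.
groups = {
    "low": [1, 2, 3],
    "medium": [2, 3, 4],
    "high": [6, 7, 8],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Total variability: squared distance of every score from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
# Residual variability: squared distance of every score from its own group mean.
ss_residual = sum((x - mean(scores)) ** 2 for scores in groups.values() for x in scores)
# Model variability: what remains once residual variability is removed.
ss_model = ss_total - ss_residual

df_model = len(groups) - 1
df_residual = len(all_scores) - len(groups)

f_ratio = (ss_model / df_model) / (ss_residual / df_residual)
eta_squared = ss_model / ss_total
print(ss_total, ss_residual, ss_model, f_ratio, eta_squared)  # 48 6 42 21.0 0.875
```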

Nonparametric Tests

Wilcoxon’s rank-sum test - a non-parametric test to detect differences between two independent samples, nonparametric equivalent to independent t-test, provides same function as Mann-Whitney U test

Wald statistic - a test statistic with a known probability distribution (a chi-square distribution) used to test whether the b coefficient for a predictor in a logistic regression model is significantly different from zero  

Wilcoxon’s signed-rank test - a nonparametric test to detect differences between two dependent samples, nonparametric equivalent to dependent t-test

Spearman’s correlation coefficient - a standardized measure of the strength of relationship between two variables that does not rely on the assumptions of a parametric test

Simple regression - a linear model in which one variable or outcome is predicted from a single predictor variable

Hierarchical multiple regression (AKA sequential multiple regression) - a type of multiple regression where predictor variables are added in separate steps

Test Statistics

α (alpha) level - the probability of making a Type I error, usually set at .05

  • Bonferroni correction - a correction applied to the α level to reduce the probability of a Type I error when multiple significance tests are carried out. The α level is divided by the number of tests conducted.  
  • p-value - the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true. p < .05 is generally considered "statistically significant"
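
The Bonferroni correction described above amounts to a single division. In this sketch the four p-values are hypothetical:

```python
alpha = 0.05
p_values = [0.004, 0.020, 0.030, 0.300]  # hypothetical p-values from four tests

# Divide the alpha level by the number of tests conducted.
corrected_alpha = alpha / len(p_values)

# A test counts as significant only if its p-value beats the corrected level.
significant = [p < corrected_alpha for p in p_values]
print(corrected_alpha, significant)  # 0.0125 [True, False, False, False]
```

Three of the four results would pass the uncorrected .05 level, but only one survives the correction, which shows how the procedure trades power for protection against Type I errors.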

Effect Size - an objective measure of the magnitude of the observed effect

  • Correlation coefficient (AKA r, R, or Pearson's r) - a standardized measure representing the linear relationship between two variables, ranging from -1 to 1. The closer this measure is to zero, the weaker the relationship. Values closer to -1 or 1 represent a stronger negative or positive relationship, respectively. For example, a correlation of .78 between exercise and strength means the variables are positively correlated: as exercise increases, strength tends to increase.
  • Cohen’s d -  a standardized measure of the difference between means
  • Eta squared (η²) (AKA coefficient of determination) - an effect size that is the ratio of the model sum of squares to the total sum of squares
  • z-score - the value of an observation expressed in standard deviation units
  • Cronbach’s α - a measure of the reliability of a scale. The number of items is squared, multiplied by the average covariance between items, then divided by the sum of all elements in the variance-covariance matrix.  
  • β (beta) - standardized regression coefficient. Indicates the strength of relationship between a given predictor and an outcome in standardized form. It is the change in outcome associated with a one standard deviation change in the predictor
    • b - unstandardized regression coefficient, indicates the strength of the relationship between a given predictor and an outcome, in the units of original measurement
  • Nagelkerke’s R²N - a version of the coefficient of determination for logistic regression
  • Partial eta squared (η²p) - the proportion of variance that a variable explains that is not explained by other variables in the analysis
  • Pearson’s r (AKA correlation coefficient) - a standardized measure of the strength of relationship between two variables, ranging from -1 to 1.  
  • Phi Φ - a measure of association between two categorical variables, used with 2 x 2 contingency tables, a variant of the chi-square test  
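
The verbal formula for Cronbach's α in the list above (number of items squared, times the average covariance between items, divided by the sum of all elements of the variance-covariance matrix) can be sketched directly. The function names and the three-item scale below are hypothetical.

```python
from statistics import mean


def covariance(x, y):
    """Sample covariance of two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def cronbach_alpha(items):
    """items: one list of scores per scale item, all from the same respondents."""
    k = len(items)
    # Variance-covariance matrix: variances on the diagonal, covariances elsewhere.
    cov = [[covariance(a, b) for b in items] for a in items]
    off_diagonal = [cov[i][j] for i in range(k) for j in range(k) if i != j]
    mean_cov = sum(off_diagonal) / len(off_diagonal)
    total = sum(sum(row) for row in cov)
    return k ** 2 * mean_cov / total


# Hypothetical responses of five people to a three-item scale.
item_1 = [1, 2, 3, 4, 5]
item_2 = [2, 3, 4, 5, 6]
item_3 = [2, 1, 4, 3, 5]

alpha_value = cronbach_alpha([item_1, item_2, item_3])
print(round(alpha_value, 3))  # 0.951
```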

Confidence interval - for a given statistic calculated for a sample of observations (e.g. the mean), the confidence interval is a range of values around that statistic that is believed to contain, with a certain probability (e.g. 95%), the true value of that statistic (i.e. the population value)
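
A rough sketch of a 95% confidence interval for a mean follows. It assumes the normal approximation (a critical value of about 1.96); with a sample this small, the t distribution would normally be used and would give a somewhat wider interval. The observations are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [4, 8, 6, 5, 3, 7, 9, 6]  # hypothetical observations

m = mean(sample)
se = stdev(sample) / sqrt(len(sample))  # standard error of the mean
z = NormalDist().inv_cdf(0.975)         # about 1.96 for a 95% interval

ci_low, ci_high = m - z * se, m + z * se
print(m, round(ci_low, 2), round(ci_high, 2))  # 6 4.61 7.39
```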

Variance - an estimate of the average variability (spread) of a set of data

  • Standard deviation (σ) - an estimate of the average variability (spread) of a set of data measured in the same units of measurement of the original data, derived from the square root of the variance
  • Standard error (SE, AKA standard error of the mean) - the standard deviation of the sampling distribution of a statistic. For a given statistic (e.g. the mean), it tells how much variability there is in the statistic across samples from the same population. Large values indicate that a statistic from a given sample may not be an accurate reflection of the population from which the sample came
  • Standard error of differences - a measure of the variability of differences between sample means
  • Standardized residuals (AKA studentized residuals) - the unstandardized residual divided by an estimate of its standard deviation that varies point by point
  • Residual - difference between the value the model predicts and the value observed in the data on which the model is based
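
The entries above build on one another: residuals feed the variance, the variance gives the standard deviation, and the standard deviation gives the standard error. A minimal sketch with hypothetical scores, treating the mean as the simplest possible model:

```python
from math import sqrt
from statistics import mean, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

m = mean(scores)
residuals = [x - m for x in scores]  # observed value minus the value the model (the mean) predicts

var = variance(scores)       # sum of squared residuals divided by n - 1
sd = sqrt(var)               # back in the original units of measurement
se = sd / sqrt(len(scores))  # expected variability of the mean across samples
print(m, round(var, 2), round(sd, 2), round(se, 2))  # 5 4.57 2.14 0.76
```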


Bar chart - a graph in which a summary statistic is plotted on the y-axis against a categorical variable on the x-axis

  • Error bar - a graphical representation of the mean of a set of observations including the 95% confidence interval of the mean

Histogram - a graph of a frequency distribution. Differs from a bar chart in that the bars are touching

Boxplot (AKA box-whisker diagram) - a graphical representation of some important characteristics of a set of observations. The center of the plot contains the median, surrounded by a box.

  • Interquartile range - The top and bottoms of the box representing the limits between which the middle 50% of observations fall (the interquartile range) 
  • Whiskers - Two lines extending from the top and the bottom of the box, displaying the most and least extreme scores 
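
The numbers behind a boxplot can be computed as in the sketch below. The scores are hypothetical, and the quartiles use the "inclusive" interpolation method of Python's statistics.quantiles; other software may use slightly different quartile conventions and so report slightly different box limits.

```python
from statistics import median, quantiles

scores = [1, 3, 4, 5, 6, 7, 8, 9, 12, 30]  # hypothetical observations

q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1  # height of the box: the middle 50% of observations

# Following the definition above, the whiskers reach the least and most extreme scores.
whisker_low, whisker_high = min(scores), max(scores)
print(q1, q2, q3, iqr, whisker_low, whisker_high)  # 4.25 6.5 8.75 4.5 1 30
```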

Scatterplot - a graph that plots values of one variable against the corresponding value of another

  • Regression line - a line on a scatterplot representing the regression model of the relationship between variables plotted
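
A regression line for a scatterplot can be obtained with ordinary least squares. The sketch below assumes a simple regression with one predictor; the data are hypothetical, and b0 and b1 are illustrative names for the intercept and slope.

```python
x = [1, 2, 3, 4, 5]  # hypothetical predictor values
y = [2, 4, 5, 4, 5]  # hypothetical outcome values

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Slope: how x and y vary together, relative to how much x varies on its own.
b1 = (
    sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    / sum((xi - mean_x) ** 2 for xi in x)
)
# Intercept: chosen so that the line passes through the point (mean_x, mean_y).
b0 = mean_y - b1 * mean_x

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]
print(round(b0, 2), round(b1, 2))  # 2.2 0.6
```

The residuals from a least-squares line sum to zero (up to rounding), which is a quick sanity check on the fit.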