Data Collection
Power analysis  a procedure often used before data collection to determine the smallest sample size needed to detect an effect of a given magnitude (typically based on theory, previous research, and/or practical importance) at the desired level of significance and power. It can also be used to evaluate the strength of the sample after data collection.
Population  a collection of units to which we want to generalize a set of findings or a statistical model
Sample  a smaller but hopefully representative collection of units from a population, used to determine truths about that population
Sampling distribution  the probability distribution of a statistic, the distribution of possible values of a given statistic that we could expect from a given population
Generalization  the ability of a statistical model to be applied to the population at large
Qualitative methods  research involving unstructured or semi-structured techniques, such as gathering observations and interviews, with an emphasis on the qualities of individual experience as opposed to measurable, quantifiable data points. Particularly useful in preliminary research or in the form of targeted open-ended questions, these methods can capture nuance and supplemental information that may not be covered in, for instance, a multiple-choice survey. This insight may help guide subsequent research, detect misunderstandings or unanticipated problems in a survey or experiment, or help explain low completion rates.
Quantitative methods  inferring evidence for a theory through measurement of variables that produce numeric outcomes
Data Type
Categorical data  any variable made up of categories of objects or entities (e.g. freshmen, sophomores, juniors, seniors)
 Levels  the different values of a factor in an experiment. For instance, suppose 0 mg, 5 mg, 10 mg, and 15 mg of a medicine are administered to patients and the efficacy is compared in an experiment. The four different conditions represent four levels of the independent variable “Dosage”.
Binary variable  (AKA dichotomous variable) a categorical variable with two mutually exclusive categories (e.g. dead or alive, yes or no, pregnant or not pregnant)
Continuous variable  a variable that can be measured to any level of precision. Time is a continuous variable, because there is no limit to how finely it can be measured.
Interval data  data measured on a scale along the whole of which intervals are equal
 Ratio variable  an interval variable with the additional property that ratios are meaningful
Ordinal variable  ranked data, without a measure of differences between values, (e.g. in a race, we have first place, second place, and third place, although it is not clear how much faster higher places were)
Testing
Construct  an underlying concept, characteristic, ability or skill that a given measure is intended to test. For example, an IQ test is one form of measurement of the construct "intelligence".
Hypothesis  a prediction about the state of the world. In other words, what do you expect to find?
 Experimental hypothesis  predicted relationship between variables
 Null hypothesis  the reverse of the experimental hypothesis: your prediction is wrong and the predicted effect does not exist
Variable  anything that can be measured and can differ across entities or across time
 Dependent variable (AKA DV, outcome variable, y)  the variable being measured in a research experiment
 Independent variable (AKA IV, predictor, factor, explanatory variables, regressor variables)  the variable manipulated or identified by the experimenter as predicting or having an association with the outcome variable
 Latent variable  a variable that cannot be directly measured, but is assumed to be related to several variables that can be measured.
Validity  evidence that a study allows correct inferences about the question it was aimed to answer or that a test measures what it set out to measure conceptually
 Content validity  evidence that the content of a test corresponds to the construct it was designed to measure
 Ecological validity  evidence that the results of a study, experiment, or test can be applied to, and allow inferences about, real-world conditions
 Type I error  occurs when we believe that there is a genuine effect in the population when, in fact, there is not
 Type II error  occurs when we believe there is no effect in the population when, in fact, there is

Confounding variables (AKA confounds)  factors unrelated to the experiment that may impact the outcome, perhaps masking an effect or causing an observed effect to appear more influential than it actually is. Examples include test fatigue or boredom during a test. Confounding factors should be carefully considered during the planning phase of a research project and either avoided if possible or addressed in the limitations section of a write-up.

Practice effect  refers to the possibility that participants' performance may be influenced by repetition of, or increasing familiarity with, a task

Suppressor effects  when a predictor has a significant effect, but only when another variable is held constant

Randomization  random assignment of participants to varying treatment conditions to prevent test order effects, sampling effects, etc.

Reliability  the ability of a measure to produce consistent results when the same entities are measured under different conditions
 Test-retest reliability  the ability of a measure to produce consistent results when the same entities are tested at two different points in time
 Split-half reliability  a measure of internal consistency in which one half of a test is compared to the other half
Power  the ability of a test to detect an effect of a particular size
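The link between power, the α level, effect size, and sample size can be sketched in a few lines. This is a minimal illustration only, assuming a two-tailed comparison of two independent group means and a normal approximation; the function name `sample_size_per_group` and the planning values are hypothetical, and dedicated power software based on the t distribution typically returns a slightly larger n.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation)."""
    # Critical z for a two-tailed test at the chosen alpha level.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # z corresponding to the desired power (1 - probability of a Type II error).
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) / effect_size_d) ** 2
    return math.ceil(n)


# Hypothetical planning values: d = 0.5, alpha = .05, power = .80.
print(sample_size_per_group(0.5))
```

Smaller expected effects require larger samples: halving the effect size roughly quadruples the required n.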
Test Formats
Normal distribution  a symmetrical, bell-shaped probability distribution of a random variable that is known to have certain properties (a skew of 0 and a kurtosis of 0)
 Central limit theorem  theorem stating that when samples are large (a common rule of thumb is above about 30), the sampling distribution of the mean will be approximately normal regardless of the shape of the population from which the sample was drawn
 Parametric test  a test based on the normal distribution, generally requiring four basic assumptions: normality, homogeneity of variance, interval or ratio data, and independence of observations
 Two-tailed test  a test of a nondirectional hypothesis, i.e. one that predicts a relationship without specifying its direction
 Probability distribution  a curve describing an idealized frequency distribution of a particular variable, from which it is possible to ascertain the probability with which specific values of that variable will occur
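The central limit theorem can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is an exponential distribution (strongly skewed, mean 1, standard deviation 1), the sample size of 40 and the 2,000 repetitions are arbitrary choices, and `simulate_sample_means` is a hypothetical helper name.

```python
import random
import statistics


def simulate_sample_means(sample_size, n_samples=2000, seed=0):
    """Draw repeated samples from a skewed population and return their means."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = []
    for _ in range(n_samples):
        # Exponential population with rate 1: mean 1, standard deviation 1.
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


means = simulate_sample_means(sample_size=40)
print("mean of sample means:", round(statistics.fmean(means), 3))
print("SD of sample means:  ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(1 / 40 ** 0.5, 3))
```

The sample means should cluster around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size, even though the population itself is far from normal.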
Monotonic relationship  two variables that demonstrate either an increasing or decreasing relationship, though not necessarily at the same rate. This may result in a curved pattern as opposed to a linear straight line.
Polynomial  a growth curve, or trend over time
 Linear model  a monotonic model based on a straight line, may include correlation or regression
 Nonlinear model
Meta-analysis  a statistical procedure used to assimilate research findings
Likert scale  a common 5- to 7-point scale used in survey research in which participants rate their agreement or disagreement with a statement, resulting in ordinal data useful in analysis
Inferential statistics  used for generalization in making predictions (“inferences”) about a population based on the test results from a sample
Analysis of Variance (ANOVA)
Main effect  the unique effect of the predictor variable (or independent variable) on the outcome variable, usually used in context of ANOVA
Interaction effect  the combined effect of two or more predictor variables on the outcome variable
Planned contrasts  a set of comparisons between group means that are constructed before any data is collected
Post hoc tests  a set of comparisons between group means that were not thought of before data was collected, typically comparing the means of all combinations of groups using a strict significance criterion (e.g. Bonferroni correction). They tend to have less power than planned contrasts and are usually used for exploratory work.
Robust test  a term applied to a family of procedures to estimate statistics, reliable even when the normal assumptions are not met. ANOVA is considered robust.
Mixed design  an experimental design incorporating two or more independent variables with both repeated-measures and between-subjects measurements
Unique variance  variance specific to a particular variable
Normality  refers to the tendency of data to fall along a bell-shaped distribution, a feature on which estimation in parametric testing is mathematically based
 Skew  a measure of the symmetry of the frequency distribution. Symmetrical distributions have a skew of 0
 Positive skew  frequent scores are clustered at the lower end of the distribution, and the tail points towards higher or more positive scores
 Kurtosis  measurement of the degree to which scores cluster in the tails of a frequency distribution
 Leptokurtic  kurtosis > 0. Too many scores in the tails, and too peaked
 Platykurtic  kurtosis < 0. Too few scores in the tails, and quite flat
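Skew and kurtosis as described above can be computed from the central moments of a set of scores. This is a rough sketch that assumes the simple population-moment formulas; statistical packages usually apply small-sample adjustments, so their values will differ slightly. Kurtosis is reported here as excess kurtosis, so a normal distribution scores 0. The function names and data are hypothetical.

```python
def central_moment(scores, order):
    """Average of (score - mean) raised to the given power."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** order for x in scores) / len(scores)


def skewness(scores):
    """0 for symmetrical data, positive when the tail points to high scores."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5


def excess_kurtosis(scores):
    """Above 0 is leptokurtic, below 0 is platykurtic."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]  # scores cluster low, tail points high

print(skewness(symmetric))
print(skewness(right_tailed))
print(excess_kurtosis(symmetric))
```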
Sphericity  the assumption in repeated-measures ANOVA that the variances of the differences between data taken from the same participant or entity are equal
 Mauchly's test of sphericity  a formal test of the assumption of sphericity in repeated-measures ANOVA, used to assess whether the variances of the differences between all combinations of related conditions are equal
 Epsilon ε  an estimate of the departure from sphericity, with a maximum value of 1. Values close to 1 indicate the assumption is met, while values much less than 1 indicate the assumption is violated
 Greenhouse-Geisser correction  a common, conservative correction for a violation of sphericity, used particularly when epsilon is less than .75
 Huynh-Feldt correction  a more liberal and less common correction for a violation of sphericity than Greenhouse-Geisser, as it can tend to overestimate epsilon. Researchers may consider using it if epsilon is greater than .75.
Homogeneity of variance  the assumption that the variance of one variable is stable (i.e. relatively similar) at all levels of another variable
 Heterogeneity  case in which the variance of one variable differs across levels of another variable
 Levene’s test  tests the assumption that variances are equal (AKA homogeneity of variance, which concerns the spread of the values, i.e. how far values diverge from the mean) across two or more groups, e.g. the levels of an independent variable. A significant result suggests that the variances differ.
 Heteroscedasticity  residuals of each level of the predictor variables have unequal variances
 Homoscedasticity  an assumption in regression analysis that the residuals at each level of the predictor variable(s) have similar variances. In other words, at each point along a given predictor variable, the spread of residuals should be fairly constant
 Q-Q plot  a graph plotting the quantiles of a variable against the quantiles of a particular distribution. Values falling along the diagonal of the plot demonstrate similar distributions. Values deviating from the diagonal show deviations from the distribution of interest.
Multicollinearity  two or more variables are very closely linearly related
Singularity  perfect correlation between variables (a correlation coefficient of either 1 or −1)
Leverage  gauges the influence of the observed value of the outcome variable over the predicted values
Cook's distance  in least-squares regression, a measure used to determine whether a single data point has an undue influence on the statistical results
Independence  the assumption that one data point does not influence another
Collinearity  used to describe independent variables that are highly correlated, which may negatively impact the validity of the analysis
Statistical Test | Test Statistic | Effect Size | Number of Variables and Data Type | Assumptions
Pearson's correlation | r | r | Two continuous paired observations | Linearity, no significant outliers, bivariate normality
Chi-square test of independence | χ^{2} | Cramer’s V | At least two categorical variables | Independence of observations, all cells have expected counts greater than or equal to five
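The chi-square test of independence and Cramer's V from the table above can be worked through by hand. The following is a minimal sketch with made-up counts; `chi_square_independence` is a hypothetical helper, and it returns only the statistic, degrees of freedom, and effect size (no p-value).

```python
import math


def chi_square_independence(table):
    """Return (chi-square, degrees of freedom, Cramer's V) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    smaller_dim = min(len(row_totals), len(col_totals)) - 1
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    cramers_v = math.sqrt(chi2 / (n * smaller_dim))
    return chi2, df, cramers_v


# Hypothetical 2 x 2 table of counts (rows: group, columns: outcome).
observed = [[30, 10],
            [10, 30]]
print(chi_square_independence(observed))
```

In this made-up table every expected count is 20, so the assumption of expected counts of at least five is met.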
Statistical Test | Test Statistic | Effect Size | Number of Variables & Data Type | Assumptions
t-tests
Independent t-test | t | Cohen's d | One continuous DV, one categorical IV consisting of two independent groups | Independence of observations, no significant outliers, normality, homogeneity
Dependent t-test | t | Cohen's d | One continuous DV, one IV consisting of two categorical related groups or matched pairs | No significant outliers, normality
Analysis of Variance (ANOVA): inferential statistical procedure utilizing the F-ratio to test the overall fit of a linear model, usually defined in terms of group means
One-way between-subjects ANOVA | F | Eta squared η^{2}, η^{2}p | Categorical, more than two independent groups |
Two-way between-subjects ANOVA | F | | Categorical, more than two independent groups |
Repeated measures ANOVA | F | | Categorical, more than two repeated observations |
Mixed ANOVA | F | | Categorical, more than two independent groups with repeated observations |
Statistical Test | Definition | Test Statistic | Effect Size | Number of Variables & Data Type | Assumptions
Regression
Linear regression | Expands upon correlation, used to predict a variable based on another variable | F | Cohen's f^{2}, R^{2}, β (beta), b | One continuous IV, one continuous DV | Independence of errors (residuals), linearity, homoscedasticity of residuals, no multicollinearity, no significant outliers or cases with high leverage or influence, normal distribution of errors (residuals)
Standard multiple regression | An extension of linear regression used to predict a variable based on two or more variables | F | Cohen's f^{2}, R^{2}, β (beta), b | One continuous DV, two or more continuous or nominal IVs |
Hierarchical multiple regression | | F | Cohen's f^{2}, R^{2}, β (beta), b | One continuous DV, two or more continuous or nominal IVs |
Binomial logistic regression | | Wald statistic (χ^{2}) | Nagelkerke's R^{2}_{N}, b | |
Statistical Test | Test Statistic | Effect Size | Number of Variables & Data Type | Assumptions
Association
Spearman's correlation | r_{s} or ρ (rho) | | Two continuous and/or ordinal paired observations | Monotonic relationship
Differences Between Groups
Mann-Whitney U test | U, z | | One continuous or ordinal DV, one categorical IV consisting of two independent groups |
Wilcoxon's signed-rank test | z | | One continuous or ordinal DV, one categorical IV consisting of two related groups or matched pairs |
Descriptive statistics  used to summarize data rather than make generalizations or inferences about a population  may include percentages, ratios, measures of central tendency such as mean, and measures of variability (spread of the data).
 Central tendency (AKA Measures of Central Tendency)  refers to the center of a frequency distribution of observations as measured by the mean, median, and mode
 Mean (μ)  a simple statistical model of the center of the distribution of scores. A hypothetical estimate of the “typical” score
 Median  the middle score of a set of observations
 Mode  the most frequently occurring score in a set of data
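The three measures of central tendency can be computed with Python's standard `statistics` module. This is a small illustration on made-up scores; `describe` is a hypothetical helper name.

```python
import statistics


def describe(scores):
    """Return the mean, median, and mode of a set of scores."""
    return {
        "mean": statistics.mean(scores),      # sum of scores / number of scores
        "median": statistics.median(scores),  # middle score once sorted
        "mode": statistics.mode(scores),      # most frequently occurring score
    }


# Hypothetical scores; the single high score (9) pulls the mean upward.
print(describe([2, 3, 3, 4, 5, 5, 5, 9]))
```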
Degrees of freedom  the number of entities that are allowed to vary when estimating a statistical parameter. Determines the probability distribution for test statistics
Chi-square test of association (AKA Chi-square test of independence)  used to assess the relationship between two or more categorical variables
F-ratio  a test statistic used in analysis of variance (ANOVA), calculated as the ratio of the variability explained by the model to the variability it leaves unexplained, and used to determine whether the differences between group means are statistically significant
Model sum of squares  a measure of the total amount of variability for which a model can account, derived from the difference between the total sum of squares and the residual sum of squares
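The relationship between the sums of squares, the F-ratio, and eta squared can be traced with a short calculation. The sketch below assumes a one-way between-subjects design; the three groups of scores are made up, `one_way_anova` is a hypothetical helper, and no p-value is computed.

```python
def one_way_anova(groups):
    """Sums of squares, F-ratio, and eta squared for independent groups."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)

    # Total variability: every score compared with the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Residual variability: every score compared with its own group mean.
    ss_residual = sum(
        (x - sum(group) / len(group)) ** 2 for group in groups for x in group
    )
    # Model sum of squares: total minus residual.
    ss_model = ss_total - ss_residual

    df_model = len(groups) - 1
    df_residual = len(all_scores) - len(groups)
    f_ratio = (ss_model / df_model) / (ss_residual / df_residual)
    return {
        "ss_total": ss_total,
        "ss_model": ss_model,
        "ss_residual": ss_residual,
        "f_ratio": f_ratio,
        "eta_squared": ss_model / ss_total,
    }


# Hypothetical scores for three independent groups.
print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```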
Nonparametric
Wilcoxon’s rank-sum test  a nonparametric test to detect differences between two independent samples, nonparametric equivalent to the independent t-test, provides the same function as the Mann-Whitney U test
Wald statistic  a test statistic with a known probability distribution (a chi-square distribution) used to test whether the b coefficient for a predictor in a logistic regression model is significantly different from zero
Wilcoxon’s signed-rank test  a nonparametric test to detect differences between two dependent samples, nonparametric equivalent to the dependent t-test
Spearman’s correlation coefficient  a standardized measure of the strength of relationship between two variables that does not rely on the assumptions of a parametric test
Simple regression  a linear model in which one variable or outcome is predicted from a single predictor variable
Hierarchical multiple regression (AKA sequential multiple regression) 
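Spearman's correlation coefficient, defined above, can be obtained by ranking each variable and then computing Pearson's r on the ranks. The following is a minimal sketch with made-up data; tied scores receive the average of the ranks they span, and the helper names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's correlation coefficient for paired observations."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but not along a straight line.
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]
print(round(pearson_r(x, y), 3), spearman_rho(x, y))
```

Because the relationship is monotonic but not linear, Spearman's coefficient reaches 1 while Pearson's r stays below 1.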
Test Statistics
α (alpha) level  the probability of making a Type I error, usually set at .05
 Bonferroni correction  a correction applied to the α level to reduce the probability of a Type I error when multiple significance tests are carried out. The α level is divided by the number of tests conducted.
 p-value  the probability of obtaining an effect at least as large as the one observed if the null hypothesis were true. p < .05 is generally considered "statistically significant"
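The Bonferroni correction described above amounts to a single division. A brief sketch, using made-up p-values and a hypothetical helper name:

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected alpha level and a significance flag per test."""
    corrected_alpha = alpha / len(p_values)  # alpha divided by the number of tests
    return corrected_alpha, [p < corrected_alpha for p in p_values]


# Hypothetical p-values from four significance tests.
corrected_alpha, significant = bonferroni([0.004, 0.02, 0.03, 0.2])
print(corrected_alpha, significant)
```

Only the first test remains significant at the corrected level, although three of the four fall below the uncorrected .05.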
Effect Size  an objective measure of the magnitude of the observed effect
 Correlation coefficient (AKA r, R, or Pearson's r)  a standardized measure representing the linear relationship between two variables, ranging from −1 to 1. The closer this measure is to zero, the weaker the relationship. Numbers closer to −1 or 1 represent a stronger negative or positive relationship, respectively. For example, if the correlation between exercise and strength is .78, the variables are positively correlated: as exercise increases, strength tends to go up.
 Cohen’s d  a standardized measure of the difference between means
 Eta squared η^{2} (AKA coefficient of determination)  an effect size that is the ratio of the model sum of squares to the total sum of squares
 zscore  the value of an observation expressed in standard deviation units
 Cronbach’s α  a measure of the reliability of a scale. The number of items is squared, multiplied by the average covariance between items, then divided by the sum of all elements in the variance-covariance matrix.
 β (beta)  standardized regression coefficient. Indicates the strength of relationship between a given predictor and an outcome in standardized form. It is the change in outcome associated with a one standard deviation change in the predictor
 b  unstandardized regression coefficient, indicates the strength of the relationship between a given predictor and an outcome, in the units of original measurement
 Nagelkerke’s R^{2}_{N}  a version of the coefficient of determination for logistic regression
 Partial eta squared η^{2}p  the proportion of variance the variable explains, when excluding other variables in the analysis
 Pearson’s r (AKA correlation coefficient)  a standardized measure of the strength of relationship between two variables, ranging from −1 to 1.
 Phi Φ  a measure of association between two categorical variables, used with 2 x 2 contingency tables, a variant of the chi-square test
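Two of the statistics listed above, Cohen's d and Cronbach's α, can be computed directly from their verbal definitions. This sketch makes two assumptions that the definitions leave open: Cohen's d is standardized by the pooled standard deviation of the two groups, and covariances use the sample (n − 1) denominator. All data and function names are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Difference between two means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    difference = statistics.fmean(group_a) - statistics.fmean(group_b)
    return difference / math.sqrt(pooled_variance)


def covariance(x, y):
    """Sample covariance of two equally long lists of scores."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def cronbach_alpha(items):
    """items: one list of respondent scores per scale item."""
    k = len(items)
    matrix = [[covariance(a, b) for b in items] for a in items]
    # Average covariance between different items (off-diagonal elements).
    off_diagonal = [matrix[i][j] for i in range(k) for j in range(k) if i != j]
    mean_covariance = sum(off_diagonal) / len(off_diagonal)
    # Sum of all elements in the variance-covariance matrix.
    total = sum(sum(row) for row in matrix)
    return k ** 2 * mean_covariance / total


# Hypothetical data: two groups, and a three-item scale with five respondents.
print(round(cohens_d([5, 6, 7, 8, 9], [3, 4, 5, 6, 7]), 3))
print(round(cronbach_alpha([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [1, 3, 3, 5, 5]]), 3))
```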
Confidence interval  for a given statistic calculated for a sample of observations (e.g. the mean), the confidence interval is a range of values around that statistic that is believed to contain, with a certain probability (e.g. 95%), the true value of that statistic
Variance  an estimate of the average variability (spread) of a set of data
 Standard deviation (σ)  an estimate of the average variability (spread) of a set of data measured in the same units of measurement of the original data, derived from the square root of the variance
 Standard error (SE, AKA standard error of the mean)  the standard deviation of the sampling distribution of a statistic. For a given statistic (e.g. the mean), it tells how much variability there is in the statistic across samples from the same population. Large values indicate that a statistic from a given sample may not be an accurate reflection of the population from which the sample came
 Standard error of differences  a measure of the variability of differences between sample means
 Studentized residuals (a variation on standardized residuals)  the unstandardized residual divided by an estimate of its standard deviation that varies point by point
 Residual  difference between the value the model predicts and the value observed in the data on which the model is based
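The variance, standard deviation, standard error, and a 95% confidence interval for a mean can be chained together as follows. This is a sketch under a stated assumption: the interval uses the normal critical value (about 1.96), which is a common large-sample approximation; for a sample as small as this made-up one, a critical value from the t distribution would give a somewhat wider interval. `mean_confidence_interval` is a hypothetical helper.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, standard error, lower limit, upper limit)."""
    mean = statistics.fmean(sample)
    variance = statistics.variance(sample)    # average squared spread (n - 1)
    std_dev = math.sqrt(variance)             # same units as the original data
    std_error = std_dev / math.sqrt(len(sample))
    # Normal critical value (assumption: large-sample approximation).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return mean, std_error, mean - z * std_error, mean + z * std_error


# Hypothetical sample of eight scores.
print(mean_confidence_interval([4, 8, 6, 5, 3, 7, 9, 6]))
```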
Bar chart  a graph in which a summary statistic is plotted on the yaxis against a categorical variable on the xaxis
 Error bar  a graphical representation of the mean of a set of observations including the 95% confidence interval of the mean
Histogram  frequency distribution. Differs from bar chart in that the bars are touching
Boxplots (AKA box-whisker diagram)  a graphical representation of some important characteristics of a set of observations. The center of the plot contains the median, surrounded by a box.
 Interquartile range  the top and bottom of the box represent the limits between which the middle 50% of observations fall (the interquartile range)
 Whiskers  Two lines extending from the top and the bottom of the plot, displaying the most and least extreme scores
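The numbers a boxplot displays can be computed without drawing anything. This sketch assumes the "inclusive" quartile method of Python's `statistics.quantiles`; other software uses slightly different quartile conventions, and many plotting tools stop the whiskers short of outliers rather than extending them to the most extreme scores as described above. The data and helper name are hypothetical.

```python
import statistics


def boxplot_summary(scores):
    """Median, box limits, interquartile range, and whisker ends."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),   # end of the lower whisker
        "q1": q1,             # bottom of the box
        "median": median,     # line in the center of the box
        "q3": q3,             # top of the box
        "max": max(scores),   # end of the upper whisker
        "iqr": q3 - q1,       # range covering the middle 50% of scores
    }


print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```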
Scatterplot  a graph that plots values of one variable against the corresponding value of another
 Regression line  a line on a scatterplot representing the regression model of the relationship between variables plotted
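A regression line for a scatterplot can be fitted with ordinary least squares. The sketch below assumes simple regression (one predictor) and made-up data; it returns the unstandardized slope b and the intercept, and `residuals` gives observed minus predicted values (some texts define the sign the other way round). The function names are hypothetical.

```python
def fit_regression_line(x, y):
    """Least-squares slope (b) and intercept for one predictor."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Observed outcome minus the value predicted by the line."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = fit_regression_line(x, y)
print(slope, intercept)
print(residuals(x, y, slope, intercept))
```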