Reasoning with Statistics



Williams, Frederick. (1992) Reasoning with Statistics: How to read quantitative research.

Why Do Quantitative Research? Statistics and Research. RWS, Cps. 1, 2

  • Stats can be powerful tool for description or hypothesis testing
  • Stats is no guarantee of a study's worth

Statistics: a branch of applied mathematics that specializes in procedures for describing and reasoning from measures.

Quantitative fundamentals

Quant is appropriate when

  1. measurement can offer a useful description of whatever you are studying
  2. you may wish to make certain descriptive generalizations about the measures
  3. you wish to calculate probabilities that certain generalizations are beyond simple, chance occurrences (including application in hypothesis tests)

Challenges

  • "makes sense"
  • representative sampling
  • context

Statistics terminology

Statistics and quantitative work use applied mathematics to assist in the research process.

  • especially in terms of what is observed and how the observation is reported

Phenomenon: any object or event, the characteristics of which are susceptible to observation.

  • a study concerns how phenomena vary, affect, and relate to other phenomena

Variable: an observable characteristic of an object or event that can be described according to some well-defined classification or measurement scheme.

Data: reports of observations of variables

Measurement: a scheme for the assignment of numbers or symbols to specify different characteristics of a variable.

Population: any class of phenomena arbitrarily defined on the basis of its unique and observable characteristics.

  • Not necessarily people

Sample: a collection of phenomena so selected as to represent some well-defined population.

  • Used when it is infeasible to study an entire population

Descriptive statistics: calculated values that represent certain overall characteristics of a body of data, e.g.

  • values that reflect an average
  • values that represent dispersion of measurements

Sampling statistics: calculated values that represent how sample characteristics are likely to vary from population characteristics

  • Necessary if you are not studying a complete population

Quantitative research plan

Scientific research process:

  • Problem: a precise statement of what knowledge was sought and why it was sought
  • Method: the plan of the research, that is, how the knowledge was gained.
  • Results: a precise statement of the knowledge that was gained.

Problem

  • Research begins with a need for information
  • Problem statements are expressed as a purpose, question, hypothesis
    • Hypothesis is a problem statement susceptible to testing by reasoning from observation
  • Problem statement begins to determine other features of the study
    • Type of statistics to be used
    • An average group value to be determined
    • Preliminary definition of the population to be studied

Method

Method types

  • Descriptive method: a research plan undertaken to define the characteristics or relationships, or both, among variables based on systematic observation of these variables.
    • aka, empirical survey, normative, analytic, clinical
  • Experimental method: a research plan undertaken to test relationships among variables based on systematic observation of variables that are manipulated by the researcher.

Variable types

  • Independent variable: a phenomenon that is manipulated by the researcher and that is predicted to have an effect on another phenomenon.
  • Dependent variable: a phenomenon that is affected by the researcher's manipulation of another phenomenon.

Gathering data

  • Subject (S): an individual member of a population, the minimal unit of observation
    • Total number of subjects observed in a study represented by N.
  • Materials: tools, measurement instruments used to carry out investigation
    • Also, any materials used to manipulate the independent variable
    • Choice of materials determines available statistics
  • Procedures: precise manner by which materials are applied to Ss and how this led to the data
    • Choice of procedures determines available statistics

Statistical Analysis

  • How will the researcher move conceptually from real world observation to statistical modeling and back?
  • Before gathering data, a researcher must consider:
    • Population characteristics (averages, dispersions, ...)
    • Comparisons that might be made
    • Type of measurement scale involved
    • Acceptable error limit
    • Calculations, deductive mathematical rules, operations of the chosen statistical model

Results

Interpreting whatever has been deduced by use of statistical procedures in terms of what we identify them with in the population

  • Statistical results are determined deductively using mathematical reasoning
  • Identification with the real world is obtained inductively
    • Generalized by means of observing a variety of particular instances of the variable under study
    • inductive pattern

Strong statistical modeling will not lead to conclusive results without an overall strong research plan.

  • Researcher must inductively reason from the statistical results to the real world

Levels of Measurement, RWS, Cp. 3

Scale: a specific scheme for assigning numbers or symbols to designate characteristics of a variable.

  • Usually numerals but signification highly variable

Levels of measurement

Nominal scale: the assignment of numbers or symbols to designate subclasses that represent unique characteristics

  • aka, classificatory
  • Mutually exclusive categories, thus we might say categorical variables
  • Weakest level of measurement, denotes least info about observations
  • Frequency used in content analysis studies
    • e.g., frequency of various ethnic groups in children's TV

Ordinal scale: the assignment of numbers or symbols to identify ordered relations of some characteristics, the order having unspecified intervals

  • Categories are ranked
  • Each subclass can be compared with the others as "greater than" or "less than"
  • Interval between subclasses is not constant, nor specified
  • If numerals are used to code responses, it's not possible to do arithmetic using them
  • e.g., Questionnaires with ranking questions or ["strongly agree", "agree", ...]

Interval scale: the assignment of numbers to identify ordered relations of some characteristic, the order having arbitrarily assigned and equal intervals but an arbitrarily assigned zero point

  • Intervals are equal
  • Most statistics arithmetic may be performed on these values
  • But values may not adequately reflect the real world
  • e.g., Fahrenheit and Centigrade temperature scales, numerals assigned by convenience but interval is equal

Ratio scale: the assignment of numbers to identify ordered relations of some characteristic, the order having arbitrarily assigned and equal intervals but an absolute zero point

  • Intervals are equal
  • All statistics arithmetic may be performed on these values
  • Values can be compared as ratios (e.g., twice as long) because zero means none of the quantity
  • e.g., inches and centimeters scales for measuring distance
  • e.g., typical of enumeration (counting) measures
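
A minimal Python sketch of the practical difference between interval and ratio scales: a ratio between two values survives a change of unit only when the zero point is absolute. The readings and lengths are hypothetical; the conversion formulas are the standard ones.

```python
# Interval scale: the zero point is arbitrary, so a ratio depends on the unit.
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

warm_c, cool_c = 20.0, 10.0          # hypothetical temperature readings
ratio_celsius = warm_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# Ratio scale: zero is absolute, so a ratio is the same in any unit.
CM_PER_INCH = 2.54
long_cm, short_cm = 20.0, 10.0       # hypothetical lengths
ratio_cm = long_cm / short_cm
ratio_inches = (long_cm / CM_PER_INCH) / (short_cm / CM_PER_INCH)

print("temperature ratio in Celsius:   ", ratio_celsius)
print("temperature ratio in Fahrenheit:", ratio_fahrenheit)
print("length ratio in centimeters:    ", ratio_cm)
print("length ratio in inches:         ", ratio_inches)
```

The two temperature ratios disagree, so "twice as warm" has no fixed meaning, while the two length ratios agree.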

Common problems

semantic differential: blurry distinction between ordinal and interval scale

  • good . ____ . ____ . ____ . ____ . bad
  • choosing to see this as an interval measure makes available greater statistical tools but introduces risk of error and weaker confidence in validity of results

Measurement adequacy, accuracy

Validity: the degree to which researchers measure what they claim to measure

  • Does the measurement fit the real world characteristics of the phenomenon being examined?

Reliability: the external and internal consistency of measurement

  • If the measures were applied under precise replication of the conditions, would the same results be obtained?

Describing Distributions RWS, Cp. 4

Descriptive statistics: gives capability of reasoning back to the real world in terms of what we have statistically reasoned from data

Distribution: a collection of measurements usually viewed in terms of the frequency with which observations are assigned to each category or point on a measurement scale

Score (X): a particular measurement as expressed as a point along a scale

Frequency table: a graphic representation of interval, ordinal, or nominal data in which each score is given a row that includes its symbol (typically a numeral), a number of symbols representing the various Ss with that response, and a frequency value.

  • Note: always tied to a particular number of observations, see probability distributions

Skewness: the property of a distribution whose peak is displaced toward one end of the measurement scale, with a tail strung out in the opposite direction

  • described in terms of direction, e.g. "positive" or "negative" skew.

Indexes of Central Tendency

Indexes to describe clustering of scores within a distribution.

Mode (Mo): the most frequent score in a distribution

  • If all scores occur with equal frequency, there is no mode
  • If two or more scores share the highest frequency, the distribution is bimodal, trimodal, or multimodal
  • Easiest to obtain

Median (Mdn): the midpoint or midscore in a distribution

  • The point above and below which one half of the scores fall
  • If the scores within the median interval do not divide into equal groups, then the median point is determined by interpolation
  • Easier to obtain than the mean if the scores are ranked
  • May be more representative of central tendency than mean in a very skewed distribution with outliers

Mean (\overline{x}): the sum of the scores in a distribution divided by the number of scores

  • Most sensitive index of central tendency of this group
  • Specifically, this is the arithmetic mean
  • Additional symbols:
    • μ: the mean of a population
    • M: the mean of a sample
  • Mean may be biased in a very skewed distribution
    • it is "pulled away" from central tendency by extremely high or low scores

Deviations from the mean: points away from the mean of a distribution

  • Described in terms of direction, e.g. "positive" or "negative" deviations
  • The sum of negative deviations should equal the sum of positive deviations in absolute value, so all deviations together sum to zero

Symmetrical distribution: the mean and median are the same
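
The indexes above can be sketched in Python with the standard `statistics` module. The scores are hypothetical and include one extreme value to show the mean being pulled away from the median.

```python
from statistics import mean, median, multimode

# Hypothetical scores, invented for illustration; 20 is a deliberate outlier.
scores = [2, 3, 3, 4, 5, 5, 5, 6, 7, 20]

modes = multimode(scores)   # most frequent score(s); several values would mean multimodal
mdn = median(scores)        # midpoint; averages the two middle scores when N is even
m = mean(scores)            # sum of the scores divided by the number of scores

# Deviations from the mean, split by direction.
deviations = [x - m for x in scores]
positive_sum = sum(d for d in deviations if d > 0)
negative_sum = sum(d for d in deviations if d < 0)

print("mode(s):", modes)
print("median: ", mdn)
print("mean:   ", m)
print("sum of positive deviations:", positive_sum)
print("sum of negative deviations:", negative_sum)
```

With these scores the mean lands above the median because of the single high score, and the positive and negative deviations cancel.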

Indexes of dispersion

Indexes to describe scattering of scores across a distribution.

Range: the highest score in a distribution minus the lowest score

  • Total range: highest score minus lowest score + 1
  • Crudest index of dispersion, sensitive to only two scores (high,low)

Variance: the mean of the squared deviation scores about the mean of distribution

  • Literally square each of the deviation values (+1,-1,etc) and find the mean of the results
  • Sum of squares: the sum of the squared deviations, the intermediate value before dividing to find the variance
  • Symbols:
    • s²: the variance of a sample
    • σ²: the variance of a population

Standard deviation: the square root of the variance

  • Basis for estimating the probability of certain scores within a sample
  • Practical advantage: smaller, easier values to work with
  • Symbols:
    • s: the std. dev. of a sample
    • σ: the std. dev. of a population
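
A short Python sketch of the dispersion indexes as defined above, computed by hand and checked against the standard library. The scores are hypothetical. The variance here divides by the number of scores, following the definition in these notes; many texts divide by N − 1 when estimating a population variance from a sample, which is an assumption beyond these notes.

```python
from statistics import pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores

score_range = max(scores) - min(scores)   # highest minus lowest
total_range = score_range + 1             # "total range" as defined above

m = sum(scores) / len(scores)
sum_of_squares = sum((x - m) ** 2 for x in scores)   # squared deviations, summed
variance_by_hand = sum_of_squares / len(scores)      # mean of the squared deviations
std_dev_by_hand = variance_by_hand ** 0.5            # square root of the variance

print("range:", score_range, "total range:", total_range)
print("sum of squares:", sum_of_squares)
print("variance:", variance_by_hand, "library:", pvariance(scores))
print("std. dev.:", std_dev_by_hand, "library:", pstdev(scores))
```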

(Note: see p.44 for geometric expression of these concepts.)

Predicting Parameters, RWS, Cp. 5

Statistic: a characteristic of a sample

  • e.g., mean, variance, standard deviation, etc.

Parameter: a characteristic of a population

  • e.g. estimate based on a statistic

Statistical inference: the process of estimating parameters from statistics

  • examining relationship between statistic and parameter

Random sample: a collection of phenomena so selected that each phenomenon in the population has an equal chance of being selected.

  • Allows us to use the logic of statistical inference, estimating parameters from statistics

Four kinds of distributions

Probability distribution: a distribution considered in terms of the proportion of the total number of units that is represented by each category

  • The proportion may also be expressed as probabilities
    • Sum of all probabilities should be 1.0
  • The probabilities and proportions could then be generalized to different samples and sample sizes
  • Alternative to the frequency table
    • Frequencies are noted in the graphical representation as f
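
As a rough illustration, a frequency table can be turned into a probability distribution by dividing each frequency f by the total number of observations. The responses below are hypothetical.

```python
from collections import Counter

responses = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]   # hypothetical scale responses

freq = Counter(responses)   # score -> frequency f
n = len(responses)

# Proportion of all observations that falls in each category.
proportions = {score: f / n for score, f in sorted(freq.items())}

for score, p in proportions.items():
    print(f"score {score}: f = {freq[score]}, proportion = {p:.2f}")

total = sum(proportions.values())
print("sum of proportions:", round(total, 10))
```

Because the proportions no longer depend on N, the same table can be compared across samples of different sizes.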

Sample distribution: the frequency with which observations in a sample are assigned to each category or point on a measurement scale

  • Mean (M), std. dev. (s)

Population distribution

Population distribution: the frequency with which all units or observations in a population would be assigned or expected in each category or point on a measurement scale

  • Mean (μ), std. dev. (σ)
  • Obtained by random sampling, taking measurements, and calculating distribution characteristics
  • Given the sample characteristics (statistics), we can generalize to population characteristics (parameters)
  • May be identified with a Normal distribution and expressed as probabilities within σ unit segments

Population mean: ??? (see pp 62, 63)

Population standard deviation: σ

Sampling distribution

Sampling distribution: the frequencies with which particular values of a statistic would be expected when sampling randomly from a given population

  • Obtained after taking several samples from the same population
  • May be identified with a Normal distribution and expressed as probabilities within σ unit segments
  • The larger the sample size, the more tightly the sampling distribution clusters about its mean and the more peaked (leptokurtic) it appears

Standard error of the mean (σM): a sampling statistic representing the standard deviation of a distribution of sample means

  • The symbol refers to the standard deviation (σ) of the mean (M)
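
A simulation sketch of the standard error of the mean. It draws many random samples from a hypothetical population, takes the standard deviation of the sample means, and compares it with the common shortcut σM = σ / √N. That shortcut is not stated in these notes and is brought in here as an assumption, as are the population, sample size, and number of samples.

```python
import random
from statistics import mean, pstdev

random.seed(0)                        # fixed seed so the run is repeatable

population = list(range(1, 101))      # hypothetical population of scores 1..100
sigma = pstdev(population)            # population standard deviation
n = 25                                # size of each random sample

# Draw many random samples (with replacement) and record each sample mean.
sample_means = [mean(random.choices(population, k=n)) for _ in range(5000)]

empirical_se = pstdev(sample_means)   # std. dev. of the distribution of sample means
shortcut_se = sigma / n ** 0.5        # assumed formula: sigma / sqrt(N)

print("std. dev. of sample means:", round(empirical_se, 3))
print("sigma / sqrt(N):          ", round(shortcut_se, 3))
```

In practice only one sample is drawn, so the shortcut stands in for the repeated sampling that the simulation carries out.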

Sampling error: an estimate of how statistics may be expected to deviate from parameters when sampling randomly from a given population

  • Random sampling will yield characteristics that tend toward population characteristics
  • Laws of chance enable us to estimate what kinds of deviations to expect

Normal distributions

Deviation segments: the result of dividing the total area underneath a distribution curve according to standard deviations from the mean (e.g., +1σ, −1σ, etc.)

  • The area of a deviation segment is either a proportion or a probability

Particular functional relation: characteristic of a curve by which it is divided according to σ-width segments.

Normal distribution curve: a definition of a particular functional relation between deviations about the mean of a distribution and the probability of these different deviations occurring

  • Theoretical distribution only
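  • Baseline measured in standard deviation (σ) units
  • Areas of segments under a Normal curve (each value applies on either side of the mean):
    • mean to 1σ: .3413
    • 1σ to 2σ: .1359
    • 2σ to 3σ: .0214
    • beyond 3σ: .0013
    • Sum of all segments on both sides: 1.0 (within rounding)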
  • The central feature of the normal distribution is the defined relation between each σ unit and respective areas under the curve
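
The segment areas can be checked with `statistics.NormalDist` from the Python standard library. This is only a sketch: each area is the difference between two cumulative probabilities on one side of the mean.

```python
from statistics import NormalDist

z = NormalDist()   # standard Normal curve: mean 0, sigma 1

# Area of each sigma-width segment on one side of the mean.
mean_to_1 = z.cdf(1) - z.cdf(0)
one_to_2 = z.cdf(2) - z.cdf(1)
two_to_3 = z.cdf(3) - z.cdf(2)
beyond_3 = 1 - z.cdf(3)

# The curve is symmetrical, so both sides together cover the whole area.
total = 2 * (mean_to_1 + one_to_2 + two_to_3 + beyond_3)

print("mean to 1 sigma:", round(mean_to_1, 4))
print("1 to 2 sigma:   ", round(one_to_2, 4))
print("2 to 3 sigma:   ", round(two_to_3, 4))
print("beyond 3 sigma: ", round(beyond_3, 4))
print("total area:     ", round(total, 4))
```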

Testing Hypotheses, RWS, Cp. 6

Step 1: The Problem

Design involving hypothesis testing will incorporate:

  • Research hypothesis
  • Null hypothesis
  • Probability level selected as a criterion for rejecting the null hypothesis
  • Definitions of key terms for precision

Hypotheses: statements we wish to test in our studies

  • Need two kinds to bridge the gap between what sampling statistics can tell us about probabilities and the kinds of statements we want to make about phenomena

Null hypothesis

Null hypothesis: a statement that statistical differences or relationships have occurred for no reason other than laws of chance operating in an unrestricted manner.

  • Evaluated in terms of the probabilities that sampling statistics provide
  • Must be logical alternative to the research hypothesis
  • If it is implied clearly by the research hypothesis, it may be redundant and not explicitly stated

Research hypothesis

Research hypothesis: a statement expressing differences or relationships among phenomena, the acceptance or nonacceptance of which is based on resolving a logical alternative with a null hypothesis

  • The actual research prediction that we want to test
  • Logical opposite of the null hypothesis
  • The relationship between two variables is attributed to a systematic cause rather than to chance

Research hypothesis must meet two goals:

  • Orient readers to the problem under study
  • Provide unambiguous mathematical statement to be tested statistically
    • Must imply a null hypothesis that is susceptible to a probability estimate

Probability level

Significance level (aka, rejection region): a level of probability set by the researcher as grounds for the rejection of a null hypothesis

Statistically significant: the level of calculated probability was sufficiently low as to serve as grounds for rejection of the null hypothesis

Step 2: The Method

The method centers upon how we calculate values of probability associated with the null hypothesis using both descriptive and sampling statistics.

Standard error of the difference between means (σdiff): a sampling statistic representing the standard deviation of a distribution of differences between pairs of sample means

  • You would not actually draw many different pairs of samples but rather calculate the value based on a formula, e.g.:
    • If we have calculated the means of two different samples (Mr and Mn) and their standard errors (σMr and σMn),
    • \sigma_{diff} = \sqrt{\sigma_{M_{r}}^{2} + \sigma_{M_{n}}^{2}}

Using such a value, we can consult a table of probabilities to see if we are within the acceptable probability level.
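
A minimal sketch of this step in Python. The means and standard errors are hypothetical, and the probability is read from the Normal curve instead of a printed table, which is an assumption; with small samples the t distribution would normally be consulted.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample means and their standard errors.
mean_r, mean_n = 55.0, 50.0
se_r, se_n = 1.5, 2.0

# Standard error of the difference between the two means.
sigma_diff = sqrt(se_r ** 2 + se_n ** 2)

# Observed difference expressed in sigma_diff units.
ratio = (mean_r - mean_n) / sigma_diff

# Probability of a difference at least this large in either direction
# if only chance were operating (Normal-curve approximation).
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(ratio)))

print("sigma_diff:", sigma_diff)
print("difference in sigma_diff units:", ratio)
print("two-tailed probability:", round(p_two_tailed, 4))
```

The resulting probability is then compared with the significance level chosen in Step 1.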

Step 3: The Results

Interpreting statistical results

Two-tailed test: a non-directional hypothesis test that incorporates rejection regions in both tails of the probability curve used for a given statistic

One-tailed test: a directional hypothesis test that incorporates a rejection region in only one tail of the probability curve used for a given statistic

  • Only used if there is a very good reason to make a directional prediction
  • Only used if there is near certainty that there will not be an outcome in the opposite direction
  • Increases power when the outcome falls in the predicted direction
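
A sketch of how the two kinds of test place their rejection regions, using the Normal curve as the probability curve. The significance level and the observed value are hypothetical.

```python
from statistics import NormalDist

alpha = 0.05      # significance level chosen by the researcher
observed = 1.8    # hypothetical test value in standard-error units

z = NormalDist()

# One-tailed: the whole rejection region sits in the predicted (upper) tail.
one_tailed_critical = z.inv_cdf(1 - alpha)

# Two-tailed: the rejection region is split evenly between both tails.
two_tailed_critical = z.inv_cdf(1 - alpha / 2)

reject_one_tailed = observed > one_tailed_critical
reject_two_tailed = abs(observed) > two_tailed_critical

print("one-tailed critical value:", round(one_tailed_critical, 3))
print("two-tailed critical value:", round(two_tailed_critical, 3))
print("reject null (one-tailed):", reject_one_tailed)
print("reject null (two-tailed):", reject_two_tailed)
```

The same observed value can be significant under a one-tailed test and not under a two-tailed test, which is why the directional prediction needs a strong justification.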

Stating a conclusion for the study

See pp 75-76 for examples of concluding statements.

Failure to reject the null hypothesis: Negative outcome. Does not mean that the null hypothesis is true.

  • The null hypothesis cannot be rejected, but this does not rule out possible alternatives

Errors in Hypothesis Tests

Type I error: rejecting a null hypothesis that is, in fact, true

Type II error: failing to reject a null hypothesis that is, in fact, false

Power: the probability of rejecting a null hypothesis that is, in fact, false

  • For a one-tailed test in the positive direction, the area under the true alternative curve to the right of the critical value set by the significance level
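
A rough Python sketch of power as an area under the alternative curve, for a one-tailed test in the positive direction. Both curves are assumed Normal with equal spread, and the size of the true difference is hypothetical.

```python
from statistics import NormalDist

alpha = 0.05        # significance level (probability of a Type I error)
true_shift = 2.5    # hypothetical true difference, in standard-error units

# Null curve: centered on zero. Its upper alpha region is the rejection region.
null_curve = NormalDist(mu=0.0, sigma=1.0)
critical_value = null_curve.inv_cdf(1 - alpha)

# Alternative curve: same shape, centered on the true difference.
alternative_curve = NormalDist(mu=true_shift, sigma=1.0)

power = 1 - alternative_curve.cdf(critical_value)     # area right of the critical value
type_ii_rate = alternative_curve.cdf(critical_value)  # area left of it

print("critical value:", round(critical_value, 3))
print("power:", round(power, 3))
print("probability of a Type II error:", round(type_ii_rate, 3))
```

Raising `true_shift` or `alpha` in this sketch moves more of the alternative curve past the critical value, which increases power.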

The t test, RWS, Cp. 7

t test: statistical model that can be used for testing the significance of difference between the means of two populations based on the means and distributions of two samples
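
A minimal sketch of one common form of the t test, the pooled-variance test for two independent samples. The notes do not give the formula, so the pooled form, the N − 1 variance estimate, and d.f. = n1 + n2 − 2 are assumptions taken from standard practice; the scores are hypothetical.

```python
from statistics import mean, variance

# Hypothetical scores for two independent samples.
group_a = [4, 5, 6, 7, 8]
group_b = [1, 2, 3, 4, 5]

n_a, n_b = len(group_a), len(group_b)

# Pooled estimate of the population variance (each sample variance uses N - 1).
pooled_variance = (
    (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
) / (n_a + n_b - 2)

# Standard error of the difference between the two sample means.
se_diff = (pooled_variance * (1 / n_a + 1 / n_b)) ** 0.5

t_value = (mean(group_a) - mean(group_b)) / se_diff
degrees_of_freedom = n_a + n_b - 2   # assumed convention for this form of the test

print("t =", round(t_value, 3), "with d.f. =", degrees_of_freedom)
```

The t value and its d.f. are then looked up in a t table against the chosen significance level.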

degrees of freedom (d.f.): ???

Analysis of Variance: RWS, Cp. 8

Factor: can mean the same thing as independent variable

Analysis of variance (or F-test): used when research hypothesis incorporates two or more population means and tests of difference among their respective sample means

  • Method for making a probability statement about the null hypothesis
  • Will not describe relationships among pairs of means within the larger set of population means
  • Centers upon the question of whether the samples represent the same population in terms of their means
  • Yield statistical value F via this calculation:
    • F = \frac{variance\ between\ groups}{variance\ within\ groups}

Grand mean: a mean of all the groups combined

  • Useful for comparing individual means to see how much groups differ from each other

Between-groups variance: variance among sample groups as compared with the grand mean

  • Grows as the differences among group means grow

Within-groups variance: expected variance among individual scores in the population from which all the samples were drawn

  • If there are no differences among the groups, then the between-groups and within-groups variance will be approximately equal, so F will be close to 1
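
A small worked sketch of the F ratio for a one-way analysis of variance. The three groups are hypothetical, and the degrees of freedom (groups − 1 and N − groups) follow the usual convention, which these notes do not spell out.

```python
from statistics import mean

# Hypothetical scores for three sample groups.
groups = {
    "group_1": [1, 2, 3],
    "group_2": [2, 3, 4],
    "group_3": [6, 7, 8],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)   # mean of all the groups combined

# Between groups: how far each group mean sits from the grand mean.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# Within groups: how far each score sits from its own group mean.
ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

variance_between = ss_between / df_between
variance_within = ss_within / df_within
f_ratio = variance_between / variance_within

print("grand mean:", grand_mean)
print("between-groups variance:", variance_between)
print("within-groups variance: ", variance_within)
print("F =", f_ratio)
```

An F well above 1, as with these invented scores, suggests that the group means differ by more than individual variation alone would produce; an F table gives the associated probability.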

TODO Some detail missing in this section. Might need example.

Factorial Analysis of Variance, RWS, Cp. 9

Factorial designs: designs described by the number of factors and the number of levels each factor has, e.g.

  • 2 x 2 design, incorporates two factors, each having two levels
  • 4 x 2 x 2 design, incorporates three factors, the first having 4 levels, the second and third having 2

Main effects: assessment of the effects of independent variables individually

  • Each main effect is treated as though the other IVs did not exist

Interaction: assessment of the effects of multiple independent variables in a combined form

  • Variation due to interaction effects is only that variation that is attributable neither to the main effects nor to error variance
  • Whatever nonerror variation is observed among the individual groups when the main-effects variation has been removed

Total variation: variation of all scores about a grand mean

Error variance: individual differences among the subjects

  • Variance within groups
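
A sketch of how total variation splits into main effects, interaction, and error in a 2 x 2 design. It assumes a balanced design (equal cell sizes) and uses hypothetical scores; the factor and level names are invented for illustration.

```python
from statistics import mean

# Hypothetical 2 x 2 design: keys are (level of factor A, level of factor B).
cells = {
    ("a1", "b1"): [1, 3],
    ("a1", "b2"): [3, 5],
    ("a2", "b1"): [5, 7],
    ("a2", "b2"): [11, 13],
}

all_scores = [x for scores in cells.values() for x in scores]
grand_mean = mean(all_scores)

def level_scores(position, level):
    """Scores from every cell at the given level (position 0 = A, 1 = B)."""
    return [x for key, s in cells.items() if key[position] == level for x in s]

def main_effect_ss(position):
    """Variation of one factor's level means about the grand mean,
    treating the other factor as though it did not exist."""
    levels = sorted({key[position] for key in cells})
    return sum(
        len(level_scores(position, lv))
        * (mean(level_scores(position, lv)) - grand_mean) ** 2
        for lv in levels
    )

ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
ss_a = main_effect_ss(0)
ss_b = main_effect_ss(1)

# Variation among the cell means, then remove the main effects to leave the interaction.
ss_cells = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in cells.values())
ss_interaction = ss_cells - ss_a - ss_b

# Error: variation of scores within their own cells.
ss_error = sum((x - mean(s)) ** 2 for s in cells.values() for x in s)

print("total:", ss_total)
print("A:", ss_a, "B:", ss_b, "A x B:", ss_interaction, "error:", ss_error)
```

The four components add back up to the total variation, which is the bookkeeping that the factorial F tests rest on.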


TODO Discussion of method, calculation, interpretation


Midterm Exam

Nonparametric Tests, RWS, Cp. 10

Correlation, RWS, Cp. 11

Regression & Multiple Regression. RWS, Cps 12 & 13

Factor Analysis, RWS, Cp. 15

Time Series Analysis, RWS, Cp. 16

Final Exam

Missing from this text

  • pp62; calculations missing from text for selecting range in determining population mean