COMM650/Class notes


Dr. Sheila Murphy

  • COMM650
  • Office hours: Monday 11-1

August 23

Who am I?

  • SM came to ASC directly out of PhD
    • Straight through U of Michigan (undergrad through PhD)
  • Institute for Social Research, UMich
    • Destination for lots of exiled European scientists during WWII
      • Many stopped at MIT first and moved to UM
    • Highly quantitative program
  • Trained in experimental design
    • In addition to survey, interview, focus group, etc.
    • Still using "quasi-experimental design"
  • "If you can put together a good survey, you'll never starve"

Surveys in Annenberg

  • Need to practice
  • Goal of this course, to assemble a survey you actually hope to conduct
  • Draft of survey due at end of class
  • 10 min presentation regarding the survey

5 smaller assignments throughout semester

  • All posted on Blackboard

Guest speaker, Sep 20th

  • Gerry Power
  • Former Annenberg grad
  • "Non-traditional" careers
  • Worked first at Frank N. Magid Associates
    • Media research firm, TV
  • Moved to BBC World Service Trust (non-profit)
    • Large-scale interventions
    • Teaching people how to use media
    • Inserting health, governance info into the media
  • Moved again to InterMedia (US-based)
  • Also giving the lunch talk on Sep 20

History of surveys

  • Egyptian monarchs used surveys for taxation
  • 1790: U.S. conducts first Census, repeat every 10 years
  • 1889: "Life and labour of the people of London"
    • Charles Booth
    • More than "just a headcount"
  • 1920: Introduction of standardized survey
    • Previously, interviewers could word and sequence questions however they saw fit
  • 1932: Rensis Likert published his scale for measuring attitudes
    • Previously, questions tended to be comparative
      • e.g. 20 types of soda, comparing 2 at a time, very time consuming, complicated results
    • Interval scale
      • Reduce number of questions without sacrificing accuracy
  • 1946: Likert became head of Survey Research Center (SRC), UMich
    • Primary peer: National Opinion Research Center (NORC) at UChicago
  • Back in 1936: George Gallup correctly predicted the outcome of the 1936 presidential election
    • Literary Digest, best-known poll of the time, relied on millions of mailed reader ballots, with names drawn from subscriber lists and telephone directories, and called the race wrong
    • Gallup identified selection bias: who is a subscriber? Who has a phone?
    • Leading to new thinking around selection and sampling

Total survey error paradigm

  • Assumption: Accept that truth exists
  • Researcher's job is to represent or describe that truth as accurately as possible
  • Many types of error threaten this task:
    • Measurement error
    • Sampling error
    • Response error
    • Nonresponse error
    • Processing or coding error
    • Statistical error

Initial steps constructing a survey

  • Decide what you need to know
  • Formulate a problem statement (either a research question or hypothesis)
    • Research question: Is there a relationship...?
    • Hypothesis: suggests a direction to the relationship
  • Decide how to measure or operationalize your constructs

Example Hs, RQs

"The mass media portrays women negatively" (broad starting point)

  • RQ: Is there a relationship between stereotyped media portrayals and the self image of young female viewers?
  • H: Stereotyped media portrayals of women cause a more negative self image among young female viewers.


Still too broad

  • "... portrayals in primetime dramas/telenovelas/reality TV/etc ..."
  • Seek existing scales for measuring "self image", e.g. Rosenberg's self-esteem scale

Internal validity

Construct validity

Extent to which the measure is related to the underlying construct

  • Discriminant validity: Extent to which a measure distinguishes between individuals who do and do not have certain characteristics
  • Convergent validity: extent to which a new measure correlates with other previously validated measures
  • Content validity: extent to which a measure thoroughly assesses a particular domain of content
  • Face validity: how compelling is the measure on its face? Would it make sense to a person on the street? "Common sense"?
  • Criterion validity: extent to which the measure agrees with an existing criterion outside of the instrument
    • Predictive validity: extent to which the measure forecasts future performance
    • Concurrent validity: extent to which the measure predicts scores on an established criterion measure administered at the same time

External validity

  • Generalizable to the wider world?
    • Often the weakness of lab studies, student-sampling (freshman psych students)

Reliability

Reliability = true-score variance / (true-score variance + error variance)

Types of reliability

  • Test-retest reliability: same people, same result?
  • Parallel forms reliability: are diff forms equivalent? (e.g., SAT, diff q's, same test)
  • Split half reliability: half items on one version, half on other
    • Not typically a good idea
    • Often used when survey is too long
  • Internal consistency reliability: do items assess one and only one dimension?
  • Interrater reliability: is there consistency between raters (e.g. in a content analysis)?
    • (Number of agreements)/(Number of agreements + number of disagreements)
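
A minimal sketch of simple percent agreement between two raters: agreements divided by all coded units (agreements plus disagreements). The function name and the scene codes are made up for illustration. Percent agreement does not correct for agreement that would happen by chance.

```python
def percent_agreement(rater_a, rater_b):
    """Share of units on which two raters assigned the same code."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must code the same units")
    if not rater_a:
        raise ValueError("No coded units")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return agreements / len(rater_a)


# Hypothetical codes for ten TV scenes (illustrative data only)
coder_1 = ["pos", "neg", "neg", "neu", "pos", "neg", "pos", "neu", "neg", "pos"]
coder_2 = ["pos", "neg", "pos", "neu", "pos", "neg", "neg", "neu", "neg", "pos"]

print(percent_agreement(coder_1, coder_2))  # 8 of 10 scenes match -> 0.8
```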

Jump offs

  • Computer-Assisted Telephone Interviewing Systems (CATI)

Aug 30

while you and i have lips and voices which are for kissing and to sing with who cares if some one-eyed son of a bitch invents an instrument to measure spring with -- e. e. cummings

e.g. Identify factors that predict intervening in spousal abuse

  • You COULD ask someone to respond on a 5-pt scale, how likely are you to intervene...?

Hypothetical scenario

  • Present a hypothetical scenario about a couple who suspects that there is abuse in a neighboring apartment
  • Ask about the characters:
    • Should the people intervene?

Defining spousal abuse

  • Be specific, what is of interest in "spousal abuse"
    • Only physical abuse?
    • Only between married (not cohabitating) couples?
    • Only male on female?
  • Write up a def: "Physical abuse by a husband such as .... meant to cause physical pain on his wife."

Defining intervention

  • "Engaging in one or more of these behaviors."
  • Should Nina ... ?
    • Talk to abused spouse
    • Talk to abuser
    • Offer abused spouse a place to go
    • Offer abused spouse other resources like money
    • Try to physically intervene the next time it happens
    • etc. (Are these all the interventions we need?)
  • We could make an additive score
    • 1 point for YES
    • 0 for No
  • We could make a weighted score
    • Some interventions are worth more points than others
  • We could also offer a Likert scale rather than a yes/no
    • E.g. Talk to abused spouse, "Very likely, likely, somewhat likely, unlikely, very unlikely"
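
The three scoring options above could be sketched roughly as follows. The item keys, the weights, and the 1-to-5 coding of the Likert labels are all assumptions made up for illustration, not values from the course.

```python
# Hypothetical item keys and weights (assumed for illustration only)
WEIGHTS = {
    "talk_to_abused": 1,
    "talk_to_abuser": 2,
    "offer_place": 2,
    "offer_resources": 2,
    "physically_intervene": 3,
}

# Assumed numeric coding of the five Likert labels
LIKERT = {
    "very unlikely": 1,
    "unlikely": 2,
    "somewhat likely": 3,
    "likely": 4,
    "very likely": 5,
}


def additive_score(answers):
    """1 point for each YES, 0 for each NO."""
    return sum(1 for item in WEIGHTS if answers.get(item, False))


def weighted_score(answers, weights=WEIGHTS):
    """Each YES earns that item's weight instead of a flat 1 point."""
    return sum(w for item, w in weights.items() if answers.get(item, False))


def likert_score(answers):
    """Sum of the numeric codes of the chosen Likert labels."""
    return sum(LIKERT[label] for label in answers.values())


# One hypothetical respondent answering yes/no
respondent = {
    "talk_to_abused": True,
    "talk_to_abuser": False,
    "offer_place": True,
    "offer_resources": False,
    "physically_intervene": False,
}
print(additive_score(respondent))  # two YES answers -> 2
print(weighted_score(respondent))  # weights 1 + 2 -> 3
```

The weighted version only makes sense if there is a defensible reason for the weights, which is a judgment call for the researcher.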

How do we know that this measures a single construct?

  • We want all of these to combine into one construct
  • Factor analysis, Cronbach's alpha
    • Perhaps there are two or more subgroups that "hang together"
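
A rough sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The response data are invented. Factor analysis, the other tool mentioned above, is not shown here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all the same length."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("Need at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("Every respondent must answer every item")
    item_columns = list(zip(*rows))  # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("Total scores have no variance")
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical answers from five respondents to four 5-pt items
responses = [
    [5, 4, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 1],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(responses), 3))
```

A high alpha suggests the items "hang together", but it does not by itself prove there is only one underlying dimension.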

"Fence-sitting"

  • Weakness of using Likert response scales
    • People might only want to answer in the middle for each intervention

Variable

  • Measurable counterpart of a construct
  • Manifest variables are variables that have obvious measurements available
  • Latent variables are more difficult to measure
    • e.g. intelligence, aggression

Dependent

  • Variables being explained or predicted

Independent

  • Vary naturally or through some sort of manipulation

Background variables

  • Past behavior
  • Demographic variables
  • Attitudes toward targets
  • Personality, moods and emotions
  • Other individual differences such as perceived risk
  • Exposure to a media campaign

Fishbein and Cappella's integrative model of behavior

  • Full model measures behavior
  • Final product may be attitudes
    • But people also use open-ended questions, focus groups to get preliminary info on attitudes
  • Behavioral intent is imperfect but the best we have

Reality Isomorphism or "Fit"

  • Measurements should be taken in a structure similar to off-line reality

Levels of measurement

  • Progressive
  • Each level has all the benefits of the former plus additional ones
  • In general you want the highest level possible without sacrificing fit

Nominal

  • "Name"
  • Doesn't matter what number you assign to it
    • Male/female, 0/1, 1/0, doesn't matter
  • No particular order
  • Nominal categories should be
    • Exhaustive: account for all possible responses
    • Exclusive: responses fall into one and only one category (including "other")
  • "Please specify: ____" might accompany "Other"
    • Which might inform future versions of the survey

Ordinal

  • Relationship between categories
    • You might order them
    • But the distance between the choices is not equivalent
  • e.g. Ranking football teams where team 1 is much better than team 2 and 3
  • People have a tendency to treat these as if they were interval
  • Often possible to take something ordinal and bump it up to interval
    • Preferable!

Interval

  • Feels ordinal (low to high, high to low) but
  • Assumes equal or roughly equal distance between each rank
  • Most common level of measurement used in the social sciences
  • Most common interval scale is the Likert scale
    • Similar to "bipolar" scale
  • Tend to either offer choices:
    • Strongly disagree, disagree, ..., strongly agree
    • Very minor 1 2 3 4 5 Very severe
    • Could also be smiley faces (D:, ):, :|, :), :D)
  • Advantage of a ten-point scale (1-10)
    • Discourages fence-sitting
    • May capture more slight variation
  • Ideal scales might be:
    • 7pts, all labeled, for bipolar
    • 5pts, all labeled, for unipolar
  • "Semantic differential scales"
    • Good ... Bad
    • Weak ... Strong
    • Boring ... Interesting

Ratio

  • Natural zero point signifies the complete absence of the variable being measured
  • Can assume that the numeric values are ratios of one another
  • Most physical dimensions can be measured on ratio scales; it is much rarer for nonphysical variables such as attitudes to be measured this way

Scale or index

  • Scales are groups of individual questions or items that all try to measure the same underlying construct
  • Advantage of scales is that if they are measured on the same interval scales, you can group them into a single construct

Pre-existing scales

Advantages:

  • Validated, with the "bugs" worked out
  • Can make direct comparisons between your population and what others have found

Disadvantages:

  • Constrained from changing the wording on a scale
  • In general, you should use a scale in its entirety
  • Or use complete scale followed by additional items that interest you

7 decisions to make when using scales

  • Number of response options
  • Labeling of response options
  • Physical format of the scale
  • Balanced v unbalanced
    • Balanced scales have equal positive/negative choices (bad/ok/good)
    • Unbalanced scales don't and are not interval scale
  • Odd v even number of categories
    • Do you want a midpoint?
    • Often you get fence-sitters with a midpoint
  • Forced v nonforced choice
    • If you offer a "don't know", you may get a lot of these responses
    • Some people believe that it's important to provide these options so that people don't just pick the midpoint and skew the results
  • Social desirability (or the extent to which people are willing to tell you the truth)
    • If people are lying because they think that they "should", we have a problem

Reducing social desirability

  • Online survey may reduce SD bias compared with in-person or phone survey
    • Reduce "judgemental" aspect
  • Anonymity
  • Confidentiality
  • Word questions according to "other people", "most people"
  • Use indirect or nonself-report measures
    • Observation, tracking, response time, software assistant
    • Potential probs: observation apprehension, demand characteristics, internal validity

"Simpatico"

  • People try to please the interviewer
  • One way to circumvent this is to use a bigger scale
    • If they are always choosing 6 and 7 on a 7pt scale, expand to 10pt scale

Response time

  • Do you notify the respondent?
  • How do you make meaning?

Sept 13, Question ordering, biases, and ANHCS

  • Academic v. non-academic survey research
  • "Audiencescapes"
  • Descriptive statistics,
    • Inferential statistics
  • Ev Rogers Award, annual award for people who produce educational entertainment
    • Martine Bouman from U of Amsterdam
    • Next Wednesday

Information processing

  • 1970s, "cognitive revolution", "computer metaphor of the mind"
    • Could we model the mind in software, input/output
  • Applying these ways of thinking (psych, info processing) to survey construction
  • Notable practitioners: Sudman, Bradburn & Schwarz; Roger & ...geau

Thinking about answers (S, B, & S)

Suggest the process of answering a question can be broken down into at least 4 separate processes:

  1. Comprehension
  2. Retrieval
  3. Forming a judgement
  4. Editing the answer (for social desirability, or to match response options)

Methods access different stages

  • "Retrospective thinkaloud" verbal protocols
    • But can people access/articulate their own thought processes?
  • Richard Nisbett and Timothy Wilson, "Telling more than we can know"
    • People offer all kinds of reasons for their behavior that go beyond what they can actually observe about their own thinking

Retrieval of autobiographical memories (Ch 7, S, B, & S)

Autobiographical memories have 3 components:

  • Personal memories - visiting the Eiffel tower
  • Autobiographical fact - city where born (no actual personal memory)
    • Usually pretty accurate
  • Generic personal memory (driving to USC - no specific memory but composite)
    • Generally most difficult, nothing particularly notable

Estimation is particularly problematic when the event is ...

  • Frequent
  • Has occurred for a long time
  • Happened in the distant past
  • Not distinctive
  • Mundane or unimportant

Cannell, Miller and Oksenberg (1981)

Proposed 2 main routes to answering survey items:

  • A high road (careful "optimizers")
  • A low road (sloppy, superficial, use heuristics)
    • This "low road" is the same as the "satisficing" proposed by Krosnick and Tourangeau (based on Herb Simon's 1957 concept of satisficing)

Dealing with satisficing

Be specific!

  • Ask about narrow bounds rather than general
  • Think about your "most recent [sex partner]"
  • Frame the question in terms of closer bounds

Satisficing

  • People give you quick'n'dirty
  • Simon won Nobel Prize
  • Big problem for survey researchers

More potential biases

Generic memory

  • R will generate a typical instance as opposed to an actual one

Retrospective bias

  • R will color the past to match current attitudes

How to reduce these potential biases

Note: see Groves Table 7.1

  • Supplement R recall with available records
    • Bank, phone, GPA, IQ, perhaps not from R themselves
  • Cues (what, who, where)
  • Taking more time (slow pace of interview)
  • Diary methods
    • (Smith, 1991 showed poor match between diary and recall. Although diary is generally considered to be more accurate it is still subject to fatigue, social desirability, etc.)
    • (e.g., Nielsen diaries v. Peoplemeter)
  • Experience Sampling Method (ESM)
    • Avoids recall problems altogether by collecting concurrent reports at random moments in time using pager or phone
    • Cons: expensive and places considerable burden on researchers as well as respondents
    • Used in MSM research
  • The Day Reconstruction Method (DRM)
    • Cheaper approximation to ESM
    • Call in only on days where event occurred - report subject id number and describe event

Proxy reporting

  • Single household member asked to report on behavior of entire household

Pros

  • Doesn't overrepresent large families with more members
  • "Household" is more common unit of analysis than individual

Cons

  • Not all members can report on other family members
    • e.g. teenage porn viewing (DiClemente)
  • All members may not be equally able to report the information you want
    • For example to get the best estimates of grocery purchase you probably want the grocery shopper (not always mother)
    • Not everyone in the family knows the family finances

Order effects

Response order

  • Due to limits in working memory (7 +/- 2)
  • First (primacy) and last (recency) response options have an advantage

(But recent work by Bishop and Smith, which looked at split ballots conducted by Gallup from the 1930s to the 1950s, showed that the response order effect size was probably no more than 5%)

  • You can randomize the order of responses

Sequence bias

  • Are you priming a response?
  • Questions that might influence responses to subsequent questions should be placed toward the end
  • If you are unsure whether or not one question will influence the answer to a second question, you can
    • Employ split-half (split-ballot) design
      • Half don't get the first question at all
      • Now you can compare the two halves and find the effect
    • Counter-balance
      • Some get one question first, others get the other question first
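
A sketch of how counterbalancing and response-order randomization might be set up. Respondents are shuffled and dealt alternately into two question-order forms so the halves are equal in size and assigned by chance. The form names and function names are hypothetical.

```python
import random


def assign_forms(respondent_ids, forms=("A_first", "B_first"), seed=None):
    """Randomly split respondents into equal-sized question-order groups."""
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)  # random order, then deal forms out in turn
    return {rid: forms[i % len(forms)] for i, rid in enumerate(ids)}


def shuffled_options(options, rng):
    """Return the response options in a fresh random order for one respondent."""
    opts = list(options)
    rng.shuffle(opts)
    return opts


assignment = assign_forms(range(10), seed=1)
print(assignment)

rng = random.Random(2)
print(shuffled_options(["Radio", "TV", "Newspaper", "Internet"], rng))
```

Comparing answers between the two forms afterwards shows whether question order made a difference. Shuffling options only makes sense for unordered (nominal) choices, not for an agree/disagree scale.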

Adopt a general organization pattern that complements your objectives

Two general patterns

Inverted funnel

  • Ask super specific question right out front
  • Fans of this believe that there are fewer order effects from narrow to broad than reverse

Funnel

  • Start with broad questions, then narrow to specifics
  • 90% of the surveys use this pattern

Softballs: the first several questions should be easy to answer and nonthreatening

  • Questions that are difficult, time consuming, or embarrassing should come near the end
    • Income almost always comes last, most likely the primary hang-up, break-off question
    • Demographic questions usually come at the end

Topically related questions should be grouped together

  • Make it feel like a conversation

Questions should be ordered in such a way as to minimize response bias

  • Such as yea-saying, nay-saying, or fence-sitting
    • Yea-saying: saying yes to everything
    • Nay-saying: saying no to everything
    • Fence-sitting: choosing the midpoint for everything
  • Some psych scales have questions to filter for yea/nay
    • "Do you make all your clothes?"
    • Create two questions that contradict each other if both are answered "yes"
  • Keep relatively short
  • Reverse some items

"Filter questions" and "skip patterns" should be specified so that respondents are not asked irrelevant questions

  • Must not violate rules of convo
  • If people have no children, don't ask children's names
  • Easy on Qualtrics

Screener questions up front

  • Eliminate ineligible respondents right off the bat

Otherwise demographics at end to avoid breakoffs

  • Income last
  • Most people don't want to be in the lowest bracket

History of ANHCS (Annenberg National Health Communication Survey)

The Core

  • First 20 minutes of the survey
  • Contains items measuring media use and health behaviors, knowledge and beliefs, as well as health policy
  • Remains fairly constant over time

The modules

  • Build off of the core
  • The last 5 or so minutes of the survey
  • Available to Annenberg faculty and graduate students for health related research
  • On a competitive basis

Additional ANHCS features

  • Random assignment of respondents to different conditions
  • Present images and stream short video clips
  • Leverage ANHCS data by paying Knowledge Networks to administer a 2nd survey and have them combine the data from ANHCS and the second survey into a single datafile

Sept 20

  • Visit from Gerry Power

Takeaways

  • Threats to internal validity:
    • Referring to "access to information" when you mean "access to media"
    • Or access to "information" when you don't know its quality
  • DHS survey asks: Do you have radio/tv?
    • No other info about signal, reception, preferences, habits

Sample scenario

  • National survey in Angola
  • Baseline to understand the gaps, knowledge, practices in HIV/AIDS
  • Arrive in Luanda, Angola
  • No census, 20-30 yrs out of date
  • Many many borders, corridors for HIV transmission
  • My central interest: knowledge about HIV
  • Logistically difficult to reach certain groups
    • e.g. in Cambodia budget included elephants to reach very remote people
    • Assume that all the work is face-to-face

How to manage border communities?

  • People are Angolans but they live on the borders of Zambia
    • What if they live across the border?
    • What if they have migrated because of conflicts?
  • Who is included in the sample? Who is not?

How to achieve more and better agriculture coverage in African media?

  • How do you begin? Where do you start?
  • Examine existing content?
  • We want to know about coverage?
    • Start with media professionals?
  • Diasporic reporting happens elsewhere
    • "News production environments" vary considerably

Sampling in multi-lingual situations

  • Contacted all BBC bureaus, stringers for names+numbers
  • Snowball before first contact
    • Needed 10 contacts for 4 interviews

Translation? Seven target languages

  • Original survey in English
  • How to get this surveyed?
    • Translate it, then back-translate
  • "Verbatims" in 7 languages and translated back into English
  • 10 week turn around

Limits of comm theories and methods developed in Western tradition?

  • What biases, assumptions will creep up in new contexts?
  • Social construction in methods
  • "The Social Life of Methods", conference at St Hugh's College, Oxford
    • Presentation by someone from b-school at U of Copenhagen
    • Seeking influence of religion (Catholicism, Protestantism) on research methodology
  • Linguistic, epistemological issues
    • One, two, many...
  • Very practice of sharing thoughts with a researcher
    • Roots in confessional
    • Placing value on individual opinion, experience
    • Perhaps giving an opinion is an issue of targeting

Example: universal education for girls research

  • In Somalia
  • No Somali women on the team
  • Research conducted by huge 350lb Somali man
    • But actually the women were more likely to open up to a man
    • Because the man talked about it, he normalized, legitimized it
    • Reduced the potential embarrassment

Africa Talks Climate

  • One of their social science requirements: none of the respondents can know each other
    • But over 50 men showed up, literally everyone in the village
    • Couldn't turn away anyone from the focus group
  • Later, seeking women's input, men said, "why would you want to talk to them?"
    • Finally, with men's permission, they interviewed the women and they said, "why would you want to talk to us?"
  • Go with the design that will yield the highest quality data

Scales?

  • Typically using 5-pt Likert scales also 5+1 with a Don't Know
  • In Burma, considerable fence-sitting, yea/nay-saying

Non-academic jobs

Non-academic project design v. academic

  • "Soft money", if you're not generating grants, you're not getting paid
  • Similar research might address both theoretical and practical questions
  • Designs are not mutually exclusive
  • Important to keep abreast of the latest theory, methods
  • Turn around time is much faster, 3-6mos

Jump offs

Sept 27, Populations and sampling

Midterm

  • "First half of a paper"
  • Starting a lit review
  • Something we can work with for the 2nd half of the course

Terms

Census

  • A survey that attempts to include every member of the population of interest (POI)

Sample

  • A subset of individuals from the larger group of interest (aka the population of interest or the target population)
  • A "well-stirred soup"

Bias

  • Refers to the systematic over or under representation of certain segments of the population (aka a non representative sample)

Stages of drawing a sample

Define the target population aka population of interest (POI)

Demographics

  • The social groupings or categories that we use to describe precisely who it is we are targeting
  • Common demographic categories include: age, sex, education level, race or ethnicity, etc.

Psychographics

  • Attempt to target specific "types" of individuals, focusing not on the standard demographics or location but on their psychological profile

Determine the sampling frame

Survey elements, sampling units, units of analysis (group? individual? org?)

  • Sampling frame: relatively complete list of those in your population of interest eligible to participate in your survey
    • "All Fortune 500 companies..."
    • But what if you want to survey everyone in a town?
  • You must define your pop-of-interest carefully considering both inclusion and exclusion criteria
  • Also you must assess the degree of potential sampling frame error

Selecting a sampling technique

Probability sampling:

  • Each individual or household or "element" has a known probability or chance of being selected

Nonprobability sampling:

  • Does not use a chance selection procedure and is therefore more prone to bias

Simple random sample

Samples in which every member of the population of interest (POI) has an equal and known chance of being selected

  • Not suited to certain kinds of analysis:
    • e.g. if you want to do a network analysis, respondents have to know one another
  • Also, face-to-face interviews
    • You might constrain your sample to a local area, "clustered" or "multi-staged" sample
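
A minimal sketch of drawing a simple random sample, assuming you already have a complete sampling frame as a list. The frame of 1,000 ID numbers is invented. Every element has the same chance, n/N, of being picked.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; each has equal chance n/len(frame)."""
    rng = random.Random(seed)
    return rng.sample(list(frame), n)


# Hypothetical sampling frame of 1,000 ID numbers
frame_ids = range(1000)
chosen = simple_random_sample(frame_ids, 50, seed=42)
print(len(chosen), "selected; inclusion probability =", 50 / 1000)
```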

Stratified random sampling

  • Decide on the subgroups that you want to be sure are represented
  • Be sure that your final sample contains a certain "mix" of people by matching the proportions you want on your key "variables of interest"
  • Never have less than 100 in a particular subgroup
  • Use random selection process but stop entering when a particular subcategory is filled
  • Cochran (1961) generally suggests that the biggest gains are made in up to the first 6 categories or strata in a given variable
  • The father of stratified random sampling is Jerzy Neyman who demonstrated in 1934 that stratified RS can produce samples equal or superior in quality to SRS particularly if you use a system called the Neyman allocation

              Male   Female
  White        50      50
  AfAm         50      50
  Latino/a     50      50
  Asian        50      50
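
A sketch of the "random selection, stop when a subcategory is filled" procedure with a quota of 50 per cell. The frame is synthetic (200 people per cell) and the field names are hypothetical. This gives an equal-allocation design, not the Neyman allocation.

```python
import random
from collections import Counter


def stratified_sample(frame, cell_of, quota_per_cell, seed=None):
    """Take people in random order; skip anyone whose cell is already full."""
    rng = random.Random(seed)
    people = list(frame)
    rng.shuffle(people)  # random order stands in for random selection
    filled = Counter()
    sample = []
    for person in people:
        cell = cell_of(person)
        if filled[cell] < quota_per_cell:
            sample.append(person)
            filled[cell] += 1
    return sample, filled


# Synthetic sampling frame: 200 people in each group-by-sex cell
groups = ["White", "AfAm", "Latino/a", "Asian"]
sexes = ["Male", "Female"]
frame = []
for g in groups:
    for s in sexes:
        for _ in range(200):
            frame.append({"id": len(frame), "group": g, "sex": s})


def cell_of(person):
    return (person["group"], person["sex"])


sample, filled = stratified_sample(frame, cell_of, quota_per_cell=50, seed=7)
for cell, count in sorted(filled.items()):
    print(cell, count)
```

If a cell in the real frame has fewer eligible people than the quota, that cell will simply come up short, so check the filled counts before analysis.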

Cluster sampling (aka Area Sampling or Multistage sampling)

  • The population is divided into mutually exclusive and exhaustive categories (e.g. counties)
  • A subset of these clusters is then selected, hence "area sampling"
  • Most often used in face-to-face surveys to control costs and logistics

Rate of homogeneity (ROH)

  • Measure of similarity within a cluster

Nonprobability sampling

There are at least 3 appropriate situations for nonprobability sampling:

  1. Hard to identify groups, e.g. gang members
  2. Very specific groups, e.g. patients with rare disease
  3. Program evaluations

Quota sampling

  • Takes this idea of selecting respondents on the basis of their demographics one step further
  • No sampling frame
  • Collects a sample until study's demographics resemble those of the population as a whole or matches those demographics you are particularly interested in
  • Most common demographic categories used include: age, gender, race, income

Snowball sampling

  • A nonrandom sample that uses participants to recruit other participants who have the characteristics of interest
  • For use only in hard to reach or rare populations of interest
    • e.g., commercial sex workers, "straightedge kids"

Convenience sampling

  • Grabbing almost any warm bodies without worrying about demographics and representativeness
  • Cheapest and fastest, but you get what you pay for

Determine the sample size

  • People are overly swayed by the N=1 experience
    • e.g. your one friend bought a lemon but Consumer Reports tells you the car is great

Sample size should be dictated by ...

  • Homogeneity of the population (e.g. close races need more respondents than landslide)
    • Japan smaller sample than the US
  • Type of data and future analysis (dichotomous (chi-square) v. interval (regression))
  • Precision required (what is the acceptable level of error?)
    • e.g. +/- 3% was too much in the Bush/Gore election
  • Magnitude of the difference you are trying to detect (power, i.e. 1 - beta)
  • Number of subgroups of interest
    • Need to almost double your sample for each subgroup of interest
  • Present goals: complex questions where you are interested in multiple variables need larger samples (more variance)
  • Future goals
    • For example, in planning a longitudinal or panel poll in which you reinterview the same individuals at several points in time, you will want to draw a larger initial sample because of attrition over time at each "wave" of data collection
  • Typically plan for at least 10% attrition between each wave
    • Tougher for elderly or highly mobile populations
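
A back-of-the-envelope sketch of inflating the initial sample for a panel study, assuming the same attrition rate between every pair of consecutive waves. The 10% default follows the rule of thumb above. The target of 500 completes and the wave count are made-up inputs.

```python
import math


def initial_sample_size(final_n, waves, attrition=0.10):
    """Wave-1 sample needed so that about final_n remain at the last wave."""
    if waves < 1 or not 0 <= attrition < 1:
        raise ValueError("Need waves >= 1 and 0 <= attrition < 1")
    retention = (1 - attrition) ** (waves - 1)  # share surviving all waves
    return math.ceil(final_n / retention)


# Want 500 respondents left at wave 3, losing 10% between waves
print(initial_sample_size(500, waves=3))  # 500 / 0.81 -> 618
```

For elderly or highly mobile populations a higher attrition value would be the safer assumption.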

Power

  • Sample size
  • Magnitude of effect expecting
  • Variability

Sample size calculator
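
A minimal sketch of what a sample size calculator does for a single proportion, using the textbook formula n = z^2 * p(1-p) / e^2. It assumes a simple random sample from a large population and the worst case p = 0.5. It does not account for subgroups, clustering, or nonresponse.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Respondents needed for a +/- margin on a proportion (SRS, large population)."""
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 at 95%
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


print(sample_size_for_proportion(0.03))  # +/- 3 points -> 1068
print(sample_size_for_proportion(0.05))  # +/- 5 points -> 385
```

Halving the margin of error roughly quadruples the required sample, which is why tight margins get expensive quickly.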

VERY large sample size

  • Statistical significance is a function of sample size
  • If the sample is very large, everything starts to come up statistically significant
  • Large N, small effect size
    • But not necessarily meaningful
  • Might have to do a sample from within it
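
To illustrate the point about very large samples, this sketch holds a tiny difference between two group means fixed and only changes N. It uses a two-sample z-test with a known, equal standard deviation, which is a simplifying assumption. The effect size of 0.05 SD is made up.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(diff, sd, n_per_group):
    """Two-sided p-value for a difference in means (z-test, equal n, known sd)."""
    se = sd * sqrt(2 / n_per_group)  # standard error of the difference
    z = abs(diff) / se
    return 2 * (1 - NormalDist().cdf(z))


# Same tiny effect (0.05 standard deviations), growing sample sizes
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(two_sided_p(0.05, 1.0, n), 4))
```

The effect size never changes, only the p-value does, so significance alone says nothing about whether the difference is meaningful.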

Oct 11, Survey mode

Housekeeping

  • 10/18, Next homework due (4, 5, 6)
  • 11/22, optional presentation day
  • 11/29, 10 min presentations
  • 12/3, papers due

4 Modes of survey administration

  • Face to face
  • Telephone
  • Online
  • By mail

Factors driving survey mode

Appropriateness

  • Can I really reach my target population?

Quality of resulting data

Logistics

  • Time, resources, monies

Cost

  • Face to face is most $$

Amount and type of information you need

  • Limits to how many questions you can ask in diff modes

Potential biases (Schwarz et al 1991)

Acronyms to know

  • CAPI, computer assisted personal interview
  • CASI, computer assisted self-interview
  • CATI, computer assisted telephone interview
  • IVR, interactive voice response (telephone counterpart to ACASI, audio computer assisted self-interview)
  • TDE, touchtone data entry

Brief history

Pre 1960s

  • Up through early 60s, it was all face-to-face
    • People tended to agree ("politeness norm"), 70-90% response rates
  • Phone surveys were only rarely used after the Alf Landon Literary Digest debacle in 1936
    • Also, no area codes until 1960s
    • Very expensive long distance
    • People were not accustomed to talking to strangers on the phone
  • Mail surveys began to be used widely in the 1940s but only for highly specialized populations (e.g. USC alumni)
    • Lists did not exist and response rate was very low

1960s-1980s

  • By the late 1960s, shifts in tech/cultural norms
    • For a time, the 'politeness norm' carried over to the phone
    • Electronic typewriters enabled people to make copies on normal paper (instead of mimeo)
    • Both telephone + mail enabled surveys to cover the nation and eliminated the need for "complex multistage samples", CHEAPER

1980s

  • National household samples (phone)
  • Mail when postal addresses are adequate and cost is a concern
  • Face-to-face when even small coverage omissions are not acceptable (still expensive!)

1990s-present

  • Trend toward gated communities and locked apartment bldgs
  • Move toward unlisted numbers
  • Telemarketers poisoned public, mistrust, end of "politeness"
  • Less tolerant of long surveys
  • Answering machines, caller ID, multiple lines
  • Still not OK to call someone on a cellphone because of the billing
    • 25% of U.S. is cellphone only
  • Result: mail surveys currently can have higher response rate than phone

Note: face-to-face surveys almost always exclude Hawaii and Alaska

Factors to consider in selecting survey mode

  • Coverage of population of interest
  • Availability of a sampling frame
    • Giving up probability sampling
  • Unit of analysis
    • Most common: household
  • Degree of interviewer involvement
  • Logistics
    • Type of info collected, cost, time, length, response rate, etc.
    • Post office will sell a list of addresses

Hypo: Surveying a fortune 500

  • Start with a letter, follow up with call?
  • Face to face?
  • Trusted listserv?

Coverage of population of interest

  • Surveys of households (FTF or mail) have much better coverage than telephone
  • 50% of homes have internet access but no sampling frame

Logistics

  • Channel of comm
    • Sometimes even with computer entry, researchers admin with pen+paper because people might find the laptop to be rude
  • Cost
  • Turnaround time
    • Data entry: entered, cleaned
    • Data cleaning: eliminate "wild codes"
      • Computer assistance can clean data for you
    • Analysis
  • Type of info collected (e.g. reaction time)

Teleforms

  • Scantron, teleforms
  • Somewhat out of fashion because the computer is cheaper

Length

  • F-to-F, 1 hour unpaid, 90min paid (in single sitting)
  • Mail, 1 hr if paid
  • Online, half hour if unpaid and topic is of interest (otherwise 5 min)
  • Phone, 10-to-15 min

Lottery incentives?

  • Sometimes a lottery ticket is better than a small amount of money
  • If you pre-pay someone, they may feel "reciprocity"

Online incentive?

  • Gift certificate
  • Paypal accounts

Response rate

Number of surveys completed / number attempted

Best rates tend to be:

  • FTF
  • Telephone
  • Mail
  • Online

Why not keep going?

  • If the number of people refusing to participate goes up, you're more likely to have a systematic bias
  • Jeopardizing the representativeness
    • 80% very good
    • 60% acceptable
    • Under 50% questionable
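
As a rough illustration of the arithmetic above, the sketch below computes a response rate and labels it with the rule-of-thumb cutoffs from these notes. The function names and the counts are hypothetical, and the "borderline" label for rates between 50% and 60% is an assumption, since the notes leave that band unnamed.

```python
def response_rate(completed, attempted):
    """Completed surveys divided by attempted contacts."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return completed / attempted


def rate_label(rate):
    """Label a rate using the rule-of-thumb cutoffs from the notes.

    Assumption: each label applies from its cutoff up to the next one.
    """
    if rate >= 0.80:
        return "very good"
    if rate >= 0.60:
        return "acceptable"
    if rate >= 0.50:
        return "borderline"  # assumed label; the notes leave 50-60% unnamed
    return "questionable"


# Hypothetical counts for illustration only.
rate = response_rate(completed=312, attempted=480)
print(f"{rate:.0%} -> {rate_label(rate)}")
```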

Refusal conversion: Increasing response rate

  • Emphasize the importance of the study
  • Personalize by emphasizing the respondent's importance to the study
  • Offer incentives ($, raffle, coupons)
  • University sponsorship
  • "Stamped" as opposed to metered return postage
  • Personalized postcard pre-notification and follow-up
  • Increase the number of follow-up contacts to at least 3
  • The color mint green
  • Notification of cutoff date (if mail survey)
  • "Foot-in-the-door", get agreement to a small request first (fill out screener to see if eligible)
  • Burden - if your survey is indeed short highlight length
  • Interviewer training and manner (address concerns of R "not selling anything")
  • Interviewer matching (or switch to a more skilled "converter" on multiple attempts)

What is appropriate monetary incentive?

  • Close to the amount they'd make in the same time
    • Make it "worth their time"

The leverage-saliency theory (Groves, Singer and Corning, 2000)

  • Figure out what is most important to people

Major survey modes

Mail surveys

  • Self-administered
  • Return stamped envelope

Internet surveys

  • Preferable for international sampling
  • Cheaper than mail
  • Data entered automatically
  • No sampling frame
  • Site-based surveys for a certain kind of skew

Telephone surveys

  • Random digit dialing (RDD) pool is helpful
  • Not that expensive
  • Fast turnaround, good to do immediately after an event

Personal interviews

  • OK response rate (65-75%)
  • Useful for sensitive info

Biases by mode interactions

  • Time pressure
  • Nonverbal + clarification cues
  • Perceived confidentiality
  • External distractions
  • Self selection of respondents
  • Order effects
  • Context effects
  • Response order effects
    • Worse in self-admin
  • Recall
  • Social desirability
    • Worst on FTF
  • Question form
    • FTF open ended questions are longer, more hetero
    • Fence-sitting worst on self-admin

Multimodal surveys

  • "Mixed mode surveys" (Dillman, Smyth, Christian, 2009)
    • Controversial....
    • Still need "unimode or unified design", questions are standardized across modes
  • Occasionally there is a need for mode specific design

Nov 1, Piloting surveys

Housekeeping

  • Team B: Elisheva, David
    • Send pilot survey to teammates
    • Do the "intensive" pilot of your own + the other ppl in your group
    • Do the "polishing" version with people also in Team A
    • Pilot on paper, not Qualtrics

3 distinct standards that all survey questions should meet

Content standards

  • Are your questions assessing what you want to assess?
  • Internal validity, measuring what you think you are

Cognitive standards

  • Do Rs understand the questions?
  • Are they able to answer (ie, do they have the information required to answer them? The ability to answer them?)
  • Are they willing to answer? (social desirability or privacy issues)

Usability standards

  • Can interviewers and Rs complete survey as intended? (e.g. fatigue, scales, etc.)

The purposes of pretests (Converse & Presser, 1986)

For specific questions:

  • Ensuring sufficient variation among respondents
    • If there is no variation? Throw it out! Waste of time.
  • Meaning
    • Internal validity
  • Task difficulty (too hard to answer)
    • People might drop out if they can't answer on the spot
  • Respondent interest and attention
    • Softball, warmup questions can make ppl feel comfortable, good about taking the survey

For questionnaire as a whole:

  • "Flow" and naturalness of sections
    • Is it like a conversation?
    • Do you explain the purpose?
    • Also need IRB instructions
  • Order of questions
    • Is it funnel?
    • Dealing with "order effects"?
  • Skip patterns
  • Scales
    • Keep your responses increasing from left to right
    • SM tends to use 7+ scales
  • Timing
    • Need to get a sense of timing when piloting
  • Respondent interest and attention
    • Fatigue! Is the same alternative checked over + over?
  • Respondent well-being
    • Is it upsetting?
  • Appropriate reading level?
    • 4th grade level
    • If bilingual, be sure that you are using appropriate terms (not just auto translate)
  • Is it manageable?
    • Page breaks?
    • Put questions into grids, test on Qualtrics

Side note: don't refer to humans as "subjects"; use "participants" in a focus group or "respondents" in a survey

6 ways to review surveys

1. Expert content review

  • Subject matter expert
  • Survey design expert
  • An expert panel is most effective + efficient (least $) way of debugging yr survey
  • Strength: good for finding ambiguous terms, order and measurement problems
    • Especially good to be sure you aren't overlooking a MAJOR scale used in this area
  • Graesser, Kennedy, Wiemer-Hastings, and Ottati (1999) developed a list of potential problems
    • Graesser et al. have developed a computer program that is supposed to identify these problems as a rough expert appraisal, but it is inferior to human experts.

2. Focus groups

  • Focus groups are most useful early on in the survey construction process to get a sense of key issues, nomenclature, and a general sense of how the target population thinks about the issues
  • Pros: quick turnaround
  • Cons: Ps may not be representative, bandwagon effects, not good for specific wording issues

3. Cognitive interviews

NRC held workshop in 1983 that raised interest in cog psych in survey design

  • Protocol analysis ("talk aloud")
    • Rs think aloud as they answer questions, narration
  • Retrospective think-alouds
    • Rs describe how they arrived at their answers either just after each question or at the end of the survey
  • Confidence ratings
    • Rs assess how confident they are in each response
  • Paraphrasing
    • Rs restate each question in their own words
  • Definitions
    • Rs provide definitions of key terms in the question
  • Probes
    • Interviewer asks Rs follow-up questions designed to reveal their response strategies, for example, "Could you tell me more about that? Could you give me an example?"

4. Pretests or piloting

Two outcomes:

  • Interviewer debriefing, interviewer feedback
    • Improving wording, order, streamlining
    • If it's hard for the interviewer to speak aloud, there may be a problem with the question wording
  • Quantitative info
    • Based on response of a small sample
    • Look at item distribution (items with no variance may be dropped or scale changed) and missing data
Participating survey
  • More in depth (using draft version)
  • Uses talk alouds, definitions, ask about alternate wordings

Rs are told that it's a pretest...

  • Ask Rs for feedback, "what did that question mean to you...?"
  • Drawback: very time intensive
    • Small number of Rs (Converse & Presser recommend 25-75 participants)
Polishing survey
  • More like dress rehearsal
    • Show must go on!
  • Done on very polished (near finished) instrument
  • Would include a timing of each section or page
    • Online survey will give you a sense of the timing

5. Behavior coding

Videotape interviews + rate interviewer behavior

  • Note wording, clarification issues

6. Randomized or split ballot experiments

  • If you are uncertain of wording, pilot with two versions
    • Assess the outcomes
  • Look at length of time taken to answer
    • And expectations regarding outcome
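
A minimal sketch of the random assignment behind a split ballot, assuming you already have a list of pilot respondent IDs. The function name, the seed, and the 20 IDs are hypothetical; the point is only that chance, not the researcher, decides who sees which wording.

```python
import random


def split_ballot(respondent_ids, seed=None):
    """Randomly assign each respondent to wording version "A" or "B".

    Shuffling and then alternating keeps the two groups within one
    respondent of each other in size.
    """
    rng = random.Random(seed)
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)
    return {rid: "AB"[i % 2] for i, rid in enumerate(shuffled)}


# Hypothetical pilot of 20 respondents; a fixed seed makes the split repeatable.
assignment = split_ballot(range(1, 21), seed=650)
group_a = sorted(r for r, v in assignment.items() if v == "A")
group_b = sorted(r for r, v in assignment.items() if v == "B")
print(len(group_a), len(group_b))
```

After fielding both versions, compare the item distributions and completion times between the two groups.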

Other points

Don't reinvent the wheel

  • Double check for pre-existing scales

Timing

  • Self-administered - overall time or "sections" if possible
  • Administered - per page / sections
  • Online - can do for individual items, by section and overall

POI

  • For pilots, go to your least common denominator

Interviewing

  • Interviewer must "deliver" the interview, performance!
  • High energy!

Ways of evaluating pretests

  • Margin notes
  • Oral debriefing with Rs
  • Written reports section by section
  • Field observation
  • Optional: statistical analysis if you have enough pretest Rs

Questions for interviewers

  • Did any questions make R uncomfortable or confused?
  • Did you have to repeat any questions?
  • What questions were the most difficult or awkward?
  • Did any of the sections seem to drag?
  • Were there any sections where the respondent would have liked the opportunity to say more?

Nov 8

  • Following up on pilot surveys

Income measures

  • Try to accommodate 95% of yr target population

Dealing with "sacred" questions

  • You may find that a survey starts to go stale over time
    • Current events, etc change the terms
  • Need to be self-critical, -reflective and make changes

Time scales

  • Minutes v hours?

Social desirability

  • Presenting R with scenarios may be a good way to address this

Pilot

  • Enter the data into an SPSS data sheet

Nov 22: Reliability, Validity

2 basic properties of empirical measurement:

  • Reliability: will repeated trials show consistent results?
  • Validity: are you measuring what you think you are measuring?

Note: You can have reliable results that are not valid

Steps to analyzing data

Remember: use reason and interpret the data. Don't just rely on what SPSS spits out at you.

1. Clean data

Garbage in, garbage out (GIGO)

  • Eliminate missing data
  • Check for wild codes: values outside the acceptable range for an item
    • If you enter zeros for no answer, SPSS treats 0 as a valid response. The SPSS convention for missing data is a period (.)
    • Listwise missing data: if any data is missing in a case, remove the whole case
    • Pairwise missing data: a case is dropped only from the analyses that involve the item it is missing, so each analysis can rest on a different set of cases (results in "swiss cheese data")
  • Outliers: more than 3 std. deviations from the mean; a value in the top 0.5% of the population may be legitimate but will seriously skew your data
  • Response biases (nay-sayers, yea-sayers)
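
The cleaning steps above can be sketched in plain Python. This is an illustration under stated assumptions, not what SPSS does internally: the item names, valid ranges, and pilot rows are hypothetical, and `None` stands in for SPSS's period.

```python
from statistics import mean, stdev

MISSING = None  # stands in for SPSS's "." missing value


def clean_wild_codes(rows, valid):
    """Replace any value outside its item's valid set with MISSING."""
    return [
        {item: (value if value in valid[item] else MISSING)
         for item, value in row.items()}
        for row in rows
    ]


def listwise(rows):
    """Keep only cases with no missing values on any item."""
    return [row for row in rows if MISSING not in row.values()]


def pairwise(rows, item_x, item_y):
    """Keep cases that have values on both items of one specific analysis."""
    return [(row[item_x], row[item_y]) for row in rows
            if row[item_x] is not MISSING and row[item_y] is not MISSING]


def outliers(values, cutoff=3.0):
    """Return values more than `cutoff` standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) > cutoff * s]


# Hypothetical pilot data: q1 and q2 are 1-7 scales, q3 is 1-5.
valid = {"q1": range(1, 8), "q2": range(1, 8), "q3": range(1, 6)}
raw = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": 0, "q2": 6, "q3": 2},   # 0 was typed in for "no answer"
    {"q1": 7, "q2": 77, "q3": 5},  # 77 is a wild code (out of range)
    {"q1": 2, "q2": 3, "q3": 1},
]
cleaned = clean_wild_codes(raw, valid)
print(len(listwise(cleaned)))              # cases usable in every analysis
print(len(pairwise(cleaned, "q1", "q3")))  # cases usable for q1 vs q3 only
```

Listwise deletion keeps fewer cases but every analysis uses the same people; pairwise keeps more data at the cost of each analysis resting on a different subset.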

Check distribution of items

Need to ask these questions about every single item in the survey.

Is data normally distributed?

  • If not, this may substantially limit your ability to analyze the data.
  • Can be non-normal if there is a significant skew to one side or the other

Is data skewed?

  • Items with means nearer the center of the response options are better. (If very skewed, should you adjust your scale to measure smaller differences?)

If no or almost no variance, should you cut the item?
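
One way to run these per-item checks outside SPSS is sketched below. The function name and the three pilot items are hypothetical, and the skewness used is the simple population (Fisher-Pearson) coefficient, which may differ slightly from the sample-adjusted figure a statistics package reports.

```python
from statistics import mean, pvariance


def describe_item(responses):
    """Return mean, variance, and skewness for one item's responses.

    Skew is 0 for a symmetric distribution and positive when the long
    tail is on the high side. It is None when the item has no variance.
    """
    m = mean(responses)
    var = pvariance(responses, m)
    if var == 0:
        return {"mean": m, "variance": 0, "skew": None}
    third_moment = mean([(x - m) ** 3 for x in responses])
    return {"mean": m, "variance": var, "skew": third_moment / var ** 1.5}


# Hypothetical responses on a 1-7 scale.
pilot = {
    "symmetric": [1, 2, 3, 4, 5, 6, 7],
    "piled_low": [1, 1, 1, 1, 2, 2, 7],
    "constant": [4, 4, 4, 4, 4, 4, 4],
}
for name, responses in pilot.items():
    print(name, describe_item(responses))
```

An item like `constant` is a candidate for cutting; an item like `piled_low` may need a scale that distinguishes better among low responses.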

Check validity of your items

Interpretation of the validity coefficient

  • Can range from -1 to 1 but almost always between 0 and 1.
  • Rarely exceeds .5
    • Cohen suggests that .1 = small correlation, .3 moderate, and .5 large

Coefficient of determination

  • Percent of variance explained
  • Validity coefficient squared
    • e.g. validity coefficient of .4 predicts .16 or 16% of the variance
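
A small sketch of the two rules of thumb above: squaring the validity coefficient to get variance explained, and labeling its size with Cohen's cutoffs. The function names are hypothetical, and the "negligible" label for values under .1 is an assumption not stated in the notes.

```python
def variance_explained(r):
    """Coefficient of determination: the validity coefficient squared."""
    return r ** 2


def cohen_label(r):
    """Label the size of a correlation using Cohen's cutoffs (.1, .3, .5)."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "moderate"
    if size >= 0.1:
        return "small"
    return "negligible"  # assumed label for values under .1


r = 0.4
print(f"r = {r}: {cohen_label(r)}, explains {variance_explained(r):.0%} of variance")
```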

Validity coefficients influenced by sample size

  • With a small sample, the validity coefficient may come out very small or nonsignificant, even when the measure is actually predictive

Scales

Easy mistake

  • Don't forget to reverse code items for flipped scales!!!
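
Reverse coding is simple arithmetic: subtract the response from the sum of the scale's endpoints. A minimal sketch, assuming a 1-7 scale by default (the function name and defaults are hypothetical):

```python
def reverse_code(value, low=1, high=7):
    """Flip a response on a low..high scale (7 becomes 1, 6 becomes 2, ...)."""
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the {low}-{high} scale")
    return low + high - value


# Hypothetical responses to a negatively worded item on a 1-7 scale.
flipped_item = [7, 6, 4, 2, 1]
print([reverse_code(v) for v in flipped_item])
```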

Item-scale correlations for a given item

  • Corrected item-scale correlation: with all items excluding this one
  • Uncorrected item-scale correlation: with all items including itself
  • Compare the 2 to assess impact of specific items
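
To make the corrected/uncorrected distinction concrete, here is a sketch that computes both for one item. The three items and five respondents are hypothetical, and the scale total is assumed to be a simple sum of item scores.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def item_scale_correlations(items, name):
    """Correlate one item with the scale total, with and without itself.

    `items` maps item name -> responses (same respondents, same order).
    """
    n = len(items[name])
    total = [sum(col[i] for col in items.values()) for i in range(n)]
    rest = [t - x for t, x in zip(total, items[name])]
    return {
        "uncorrected": pearson(items[name], total),  # total includes the item
        "corrected": pearson(items[name], rest),     # total excludes the item
    }


# Hypothetical 3-item scale answered by 5 respondents.
scale = {
    "q1": [1, 2, 3, 4, 5],
    "q2": [2, 2, 3, 5, 5],
    "q3": [1, 3, 3, 4, 4],
}
result = item_scale_correlations(scale, "q1")
print(result)
```

The uncorrected value is inflated because the item is correlated with itself inside the total, which matters most on short scales.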

Check internal consistency or Reliability

  • How strongly do your items correlate with one another?
  • Number of items in the scale?

Coefficient alpha

  • Common measure of reliability of items in terms of internal consistency
  • Proportion of variance in the scale attributed to the "true" score (as opposed to measurement error)

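Variance-based form and standardized (average intercorrelation) form:

\[
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_x^2}\right)
\qquad
\alpha = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
\]

Where \alpha is the alpha coefficient (Note: SPSS does both of these)

  • k is the number of items on the test
  • \sigma_i^2 (sigma squared sub i) is the variance of specific item i
  • \sigma_x^2 (sigma squared sub x) is the total variance of the scale
  • \bar{r} is the average intercorrelation among items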

Imagine you have a 6 item scale and the intercorrelation among items is .5.

  • Alpha equals 6(.5) / [1+.5(6-1)] = 3/3.5=.857.
  • Very good reliability
Interpretation of the alpha coefficient
  • 0 implies no reliability (all measurement error)
  • 1 implies perfect reliability (no measurement error)
  • (Over .90 might consider shortening the scale, over .80 is considered good, .70 is acceptable, below .60 unacceptable)
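
A sketch of two common ways to compute alpha: from the item variances and total scale variance (the k and variance terms defined above), and from the average intercorrelation, as in the worked 6-item example. Function names and the toy data are hypothetical; SPSS's RELIABILITY output is the reference, not this code.

```python
from statistics import variance


def cronbach_alpha(items):
    """Alpha from raw scores. `items` is a list of per-item response lists,
    all covering the same respondents in the same order."""
    k = len(items)
    n = len(items[0])
    totals = [sum(item[i] for item in items) for i in range(n)]
    sum_item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


def standardized_alpha(k, mean_r):
    """Alpha from the number of items and their average intercorrelation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# The worked example from the notes: 6 items, average intercorrelation .5
print(round(standardized_alpha(6, 0.5), 3))

# Hypothetical raw data: 3 items, 5 respondents.
toy = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(toy), 3))
```

Note how the standardized form rises with k even when the average intercorrelation stays fixed, which is why a very long scale can show a high alpha.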

Factor analysis

Seeking 1 underlying factor to explain relationship among variables

  • Set of items is not necessarily a scale
  • Items may share no common underlying latent variable or they may share several
  • Factor analysis is a generic term for set of statistical techniques that reduce set of observable variables into a small number of latent variables
  • Factor analysis begins with the assumption that a single factor will be sufficient to explain the pattern of responses and then performs a statistical check on how well the single factor solution fares.
  • If one factor is not adequate to explain the pattern of results it will try a 2 factor solution and so on until the unexplained residual correlations are small.
  • For our purposes factor analysis is particularly useful for construct validation (of latent variables that are not easily operationalized) and assessing the number of factors within a scale.

Terminology:

  • Load higher: an item that loads higher on a factor has a stronger relationship to that factor

Rules of thumb:

  • Only include variables that you believe are related to one another
  • Sample size: you need at least 50 cases
    • Based on correlations which are unstable with small samples (Tabachnick and Fidell, 2001)
  • Have at least 3 observed variables for each "factor"
    • Factors of only 1 (singlet) or 2 (doublet) items are undesirable (Thurstone)
    • Add more differently-worded items if need be
  • If relationships are curvilinear or strongly nonnormal, factor analysis will not work.

Determining extraction method

Common Factor Analysis (CFA)
  • Typically used only when there is a lot of error
  • Represents hypothetical variables and analyses only the common variance of the observed variables
Principal Component Analysis (PCA)
  • Grounded in actual items
  • Considers the total variance (common and unique to individual items) to account for the maximum proportion of variance with the minimum number of factors
  • Use when measurement error is relatively small

Determining the method of rotation

Regardless of which one you pick, you can specify that a simple structure be maximized

Orthogonal rotation:

  • Factors are forced to keep the angle between the reference axes perpendicular (90 degrees)
    • First two factors are 2-D, factor three adds a third dimension
  • Making sure factors are as different as can be
  • VARIMAX, most common

Oblique rotation

  • Less common, more complicated rotation that allows rotation angle to vary
  • Used when you suspect underlying factors are correlated
    • (Residual correlations after orthogonal rotation are .15 or more)
  • PROMAX, most common

Determining the number of factors

Note: ultimately this is a subjective decision but there are some criteria for determining the number of factors statistically

Kaiser-Guttman rule of eigenvalues

  • An indicator of the amount of info or variance accounted for by a factor
  • Over 1 (you can set this minimum differently)

The percentage of variance explained

  • Keep extracting factors until some % of variance has been explained

Scree test

  • Image showing rate of change in size of eigenvalue
  • Look for an "elbow"
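
The first three criteria can be sketched once you have the eigenvalues from your factor output. Everything here is illustrative: the eigenvalues are made up for a hypothetical 6-item scale, the function names are hypothetical, and the 60% variance target is an assumption since the notes only say "some %".

```python
def kaiser_count(eigenvalues, minimum=1.0):
    """Kaiser-Guttman rule: count factors with eigenvalue over the minimum."""
    return sum(1 for ev in eigenvalues if ev > minimum)


def cumulative_variance(eigenvalues):
    """Cumulative share of total variance, largest eigenvalue first."""
    total = sum(eigenvalues)
    running, shares = 0.0, []
    for ev in sorted(eigenvalues, reverse=True):
        running += ev
        shares.append(running / total)
    return shares


def factors_for_share(eigenvalues, target=0.60):
    """Smallest number of factors whose cumulative share reaches the target."""
    for n, share in enumerate(cumulative_variance(eigenvalues), start=1):
        if share >= target:
            return n
    return len(eigenvalues)


# Hypothetical eigenvalues for a 6-item scale (they sum to the item count).
eigenvalues = [2.9, 1.4, 0.7, 0.5, 0.3, 0.2]
print("Kaiser-Guttman:", kaiser_count(eigenvalues))
print("Factors for 60% of variance:", factors_for_share(eigenvalues))

# Crude text scree plot: look for the "elbow" where the bars level off.
for i, ev in enumerate(eigenvalues, start=1):
    print(f"factor {i}: {'#' * round(ev * 10)} {ev}")
```

These criteria can disagree with each other, which is why interpretability remains the final check.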

Size of residuals

  • If you are extracting the right factors, residuals will be small

Most important

  • Interpretability - factors should be "theoretically meaningful"

Interpreting factors

  • Examine factor loadings for significance
    • Items with highest loadings are considered most similar to underlying latent factor
  • Try to name the factors that reach significance
    • If you can't name this structure, it might not be meaningful

Reviewing survey, remember...

Construct validity

  • Measure what it's supposed to measure

Convergent validity

  • Does item/scale correlate with other established measures of same construct?

Face validity

  • Common sense

Criterion-related validity

  • Is measure a good predictor of some external criteria?

Content validity

  • Does measurement reflect the entire domain?

Exploratory v. Confirmatory factor analysis

Exploratory

  • Analysis explores underlying factor structure

Confirmatory

  • Begin with a structure and plan a study to confirm (or disconfirm) the hypothesis
  • Need to use LISREL or another kind of SEM (structural equation modeling)

Examples

  • Green & Salkind (2008)
  • Williams and Monge (2001)

Factor analysis

Data reduction

  • Identify factors that explain shared variance among a set of variables

Assumptions

  • Measured variables are linearly related to the factors plus errors
  • Measured items are multivariately normally distributed (for chi-squared test used in maximum likelihood)
Syntax v menus
  • When you run 100s of factor analyses, syntax can be easier to manage
  • Possible to use pulldown to generate a chunk of syntax and then manually edit for future analyses
Rotation
  • After rotation, items are shifted to "best fit"
  • Looking for items that are high on one component and low on the other

Reliability analysis

Item-total statistics

  • Alpha rises with the number of items, so with too many items a high alpha can be misleading
  • Look at "Scale Variance if Item Deleted" and "Cronbach's Alpha if Item Deleted"
    • Cut items whose removal would raise alpha

Nov 29

Adam

  • Mapping (Norman, 1988)
  • Natural mapping (Skalski et al, in press)

Skalski's typology of mapping

  • Directional n.m.
  • Kinesic
  • Incomplete tangible
  • Realistic tangible

Psychology of n.m.

  • Activation of mental models

Next steps

  • Using Mechanical Turk to get a bigger non-student sample

Andrew

  • Crimes against Children Research Center (CCRC)
  • Lots of descriptive surveys

Elisheva

"Give the public what it wants to have and part of what it ought to have whether it wants it or not." -- Herbert Bayard Swope

Take aways

  • "PCA": Principal Component Analysis
  • Oblique rotation is best when there are interdependencies
    • Promax: oblique
    • Varimax: orthogonal