Chapter 02: The basic components – scales and items

Coaley, K. (2009): An Introduction to Psychological Assessment and Psychometrics. Sage.

THE BASIC COMPONENTS – SCALES AND ITEMS

Some basic questions (and answers) related to this chapter:

• Different types of scales and items?
Scales include nominal, ordinal, interval and ratio. Items include analogies and odd-one-out (e.g. intelligence tests); constructing, making or demonstrating an ability (e.g. performance tests); stem, options and distractors (e.g. ability and aptitude tests); yes-no, true-false, like-dislike (e.g. personality questionnaires).

• Disadvantages and advantages?
– Nominal scales classify people into categories, e.g. man or woman. The categories are not ordered in any way, and only a few arithmetic operations are meaningful on a nominal scale.
– Ordinal scales classify people in rank order, e.g. from fastest to slowest, but the scale does not show people’s absolute position or the actual differences between them, only their rank relative to the others in the set. Statistics: median and mode.
– Interval scales assign each person a number that also represents how large the difference between individuals is, but the lack of a true zero point means that the absolute level of what is measured cannot be known, e.g. a depression scale or the Celsius scale. Statistics: means, variance, standard deviation, Pearson product-moment correlations.
– Ratio scales are like interval scales but with a true zero point, e.g. metres, kilograms, reaction time. This is the most ideal scale: it supports mathematical analysis and is the best suited to statistical analysis.

• A short account of Classical Item Analysis, Item Response Theory and Rasch modelling.
Classical item analysis builds on Classical Test Theory (CTT), which is based on the assumption that OBSERVED SCORE = TRUE SCORE + ERROR. It focuses on test-level information.
IRT focuses on Item-Characteristic Curves, which plot the probability of a correct answer against the level of ability. It also focuses on item difficulty and discrimination.
Rasch modelling aims at a set of items that get progressively more difficult, so that, ideally, a person who answers a question wrong will also answer all the following questions wrong, and a person who answers a question right will also have answered all the previous questions right. The model itself is probabilistic, so this pattern is expected rather than guaranteed.

• What is the nature of attitudes and what are the key characteristics?
Attitudes are underlying tendencies of individuals to respond in certain ways to certain stimuli. There are many definitions…

• Describe some different approaches to scaling and measurement of attitudes.
Thurstone scales (pair comparisons and equal-appearing intervals), Likert scales (involving rating items on a continuum from 1-5), Guttman’s scalogram (a cumulative ordinal scale), the Semantic differential (involving having people rate a specific concept on a scale such as ”Good 1 2 3 4 5 6 7 Bad”).

KINDS OF SCALES

A FUNDAMENTAL PROBLEM with measuring psychological concepts:
The lack of an external scale against which one can judge the usefulness/reliability of one’s own scales.

⇒ Because of this, it is common to use a reference population/norm group.

Absolute scores = i.e. raw scores. They do not depend on or relate to the scores of others, e.g. cognitive tests.

Relative scores = scores related to how others perform, e.g. a depression test.

4 DIFFERENT KINDS OF SCALES:

Nominal
How to use: People are classified into categories using numbers, e.g. man = 1, woman = 2.
Statistics: Frequency (i.e. the number of people in each category), proportions, Chi-square test to understand associations between categories.
Disadv.: Only a few arithmetic operations are possible. The categories are not ordered in any way and cannot be compared quantitatively.

Ordinal
How to use: People are placed in rank order with respect to a particular variable, e.g. from first to last or low to high in a competition.
Statistics: Median (the middle position in the rank order), mode (the most common/frequent score).
Disadv.: The scale does not show people’s absolute position, only their position relative to the others in the set. It is not possible to derive the actual differences between two people, only their relative placement in the rank order.

Interval (a scalar scale)
How to use: Numbers are assigned to an individual and show whether he or she is >, < or = others, but also represent how much difference there is between individuals, e.g. a depression scale from 1-10, Celsius temperatures, IQ etc.
Statistics: Mean (average), variance, standard deviation, Pearson product-moment correlations.
Disadv.: The lack of a true zero point means that one cannot know the absolute level of what is being measured.

Ratio (a scalar scale)
How to use: Like the interval scale, but with a natural/absolute zero point, e.g. metres, kilograms, reaction time etc. The most ideal and meaningful scale.
Statistics: Can be used for mathematical analysis and is overall the best suited to statistical operations.

It is important to know the type of scale if one is to make calculations from it. Psychological assessments often use ordinal or interval scales.
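The link between scale type and meaningful statistics can be sketched in a few lines of Python. This is only an illustration: all data values and variable names below are made up, and the statistics chosen are the ones listed for each scale above.

```python
from collections import Counter
from statistics import mean, median, mode, pstdev

# Hypothetical data, for illustration only.
gender_codes = [1, 2, 2, 1, 2]          # nominal: 1 = man, 2 = woman
race_ranks = [1, 2, 3, 4, 5]            # ordinal: finishing positions
depression_scores = [3, 5, 5, 7, 10]    # interval: scale from 1-10
reaction_times_ms = [200.0, 400.0]      # ratio: has a true zero point

# Nominal: only frequencies and proportions are meaningful.
frequencies = Counter(gender_codes)
proportions = {code: n / len(gender_codes) for code, n in frequencies.items()}

# Ordinal: the median (and mode) are meaningful; differences between ranks are not.
median_rank = median(race_ranks)

# Interval: differences are meaningful, so mean and standard deviation apply.
mean_score = mean(depression_scores)
sd_score = pstdev(depression_scores)
most_common_score = mode(depression_scores)

# Ratio: a true zero makes statements like "twice as slow" meaningful.
times_slower = reaction_times_ms[1] / reaction_times_ms[0]
```

Computing a mean of the nominal codes would run without error but would not mean anything, which is why the scale type has to be known before choosing a calculation.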

CONSTRUCTION AND ANALYSIS OF ITEMS

Unidimensionality = a scale (and items) that measures only the intended attribute/trait.

Different items and their use: These include analogies and odd-one-out (e.g. intelligence tests); constructing, making or demonstrating an ability (e.g. performance tests); stem, options and distractors (e.g. ability and aptitude tests); yes-no, true-false, like-dislike (e.g. personality questionnaires).

UNDERSTANDING EFFECTIVENESS AND FUNCTIONING OF ITEMS
TWO APPROACHES:

Classical item analysis

• Based on Classical Test Theory (CTT) ≈ classical psychometrics
• Focus on TEST LEVEL INFO
• The core idea is that OBSERVED SCORE = TRUE SCORE + ERROR
• The most important thing in test construction and administration is to minimize the amount of error
• Methods include Descriptive Statistical Analysis, Distractor Analysis, Item Difficulty Analysis, Item Discrimination Analysis, Analysis of Item-Total Correlation
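Two of the methods above, item difficulty analysis and item-total correlation, can be sketched as follows. The response matrix is hypothetical, and the "corrected" item-total correlation (item score against the total of the remaining items) is one common variant chosen here as an assumption, not necessarily the one used in the book.

```python
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical responses: rows = persons, columns = items (1 = correct, 0 = wrong).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
n_items = len(responses[0])

# Item difficulty: the proportion of persons who answer the item correctly.
difficulty = [mean([row[j] for row in responses]) for j in range(n_items)]

# Corrected item-total correlation: item score vs. total score on the other items.
item_total = []
for j in range(n_items):
    item = [row[j] for row in responses]
    rest = [sum(row) - row[j] for row in responses]
    item_total.append(pearson(item, rest))
```

A low or negative item-total correlation would suggest that an item does not measure the same thing as the rest of the test.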

Item response theory (IRT)

• A range of models designed to investigate the relationship between a person’s response to an item and the attribute being measured
• Mathematical functions and graphs:
e.g. the ITEM-CHARACTERISTIC CURVE (represents the probability of a correct response vs. the level of the attribute)
• Examining the curve helps to determine: item difficulty, item discrimination, probability of correct response by guessing
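An item-characteristic curve with the three properties named above can be sketched with a three-parameter logistic function. The notes do not say which IRT model is meant, so the choice of this model, the function name `icc` and the parameter names are assumptions made for illustration.

```python
import math

def icc(theta, difficulty, discrimination=1.0, guessing=0.0):
    """Probability of a correct response under a three-parameter logistic model.

    theta          -- the person's level of the attribute
    difficulty     -- the point on the attribute scale where the curve is steepest
    discrimination -- how steeply the probability rises around that point
    guessing       -- lower bound: probability of a correct response by guessing
    """
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))
    return guessing + (1.0 - guessing) * logistic

# Hypothetical item with four answer options, so pure guessing gives 0.25.
curve = [icc(theta, difficulty=0.0, discrimination=1.5, guessing=0.25)
         for theta in (-3, -2, -1, 0, 1, 2, 3)]
```

Plotting `curve` against the ability levels gives the familiar S-shape: it starts near the guessing level and approaches 1 as ability increases.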

Comparing CTT and IRT

CTT is criticized for being a weak model, while IRT is criticized for being highly technical.

CTT focuses too much on test-level information rather than on item difficulty and discrimination, while most IRT models are mainly designed to assess intellectual functioning and need large samples to work with.

CTT is criticized for assuming that measurement error is constant for everyone, while IRT is criticized for assuming that constructs are unidimensional.

Benefits of CTT and IRT

CTT analyses can be carried out using smaller (representative) samples, use simpler mathematics, allow a more straightforward estimation of parameters and are easy to apply, while IRT is stronger and more precise, and this precision can yield more accurate methods of selecting items.

RASCH SCALING

• Assumes that all items measure only a single trait.
• Items are successively harder to ”pass”. In the ideal case, with 10 items, a person who gets item number 4 wrong will also get items 5-10 wrong, while a person who gets item number 9 right will also get items 1-8 right. Because the model is probabilistic, this pattern is expected rather than guaranteed.
• Is based upon knowledge of ability level and item difficulty.
• Unidimensionality + a one-parameter logistic function which models performance on the underlying trait.
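The one-parameter logistic function mentioned above can be sketched like this: the probability of passing an item depends only on the difference between the person's ability and the item's difficulty. The function name and the difficulty values are hypothetical.

```python
import math

def rasch_probability(ability, item_difficulty):
    """Probability of passing an item under the one-parameter (Rasch) model."""
    return 1.0 / (1.0 + math.exp(-(ability - item_difficulty)))

# Hypothetical items ordered from easiest to hardest (same scale as ability).
item_difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]

# A person of average ability: the probability of passing falls item by item,
# but never reaches exactly 0 or 1.
ability = 0.0
probabilities = [rasch_probability(ability, b) for b in item_difficulties]
```

When ability equals item difficulty the probability is 0.5, which is how item difficulty and person ability end up on the same scale.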

MEASURING ATTITUDES

Attitudes = underlying tendencies of individuals to respond in certain ways. Other definitions include experiences and the evaluation of a stimulus.
• Useful when trying to describe/explain present and past behaviour as well as when attempting to predict future behaviour.

Measuring attitudes with:

• direct observation (e.g. of children)
• direct questioning/interview
• assessment techniques, above all questionnaires. The most common method is the development of a scale consisting of positive and negative statements.

Thurstone scales
Pair comparisons: asking many people (”judges”) to compare items with each other and identify the more positive one.

Equal-appearing intervals: the judges are asked to sort items into 11 categories on a continuum from unfavourable – neutral – favourable.

Likert scales
A large number of favourable and unfavourable items are developed and administered to a large group of people who rate them on a continuum from 1-5 (sometimes 1-7).
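Scoring such a scale can be sketched as follows. Because the scale mixes favourable and unfavourable items, the unfavourable ones are usually reverse-scored before summing, so that a high total always means a more favourable attitude. The function name, the item positions and the ratings are hypothetical.

```python
def score_likert(ratings, reverse_keyed, scale_min=1, scale_max=5):
    """Sum Likert ratings, reverse-scoring the unfavourable items.

    ratings       -- one rating per item, each between scale_min and scale_max
    reverse_keyed -- set of item positions (0-based) that are unfavourably worded
    """
    total = 0
    for index, rating in enumerate(ratings):
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"rating {rating} is outside {scale_min}-{scale_max}")
        if index in reverse_keyed:
            # On a 1-5 scale this maps 1 -> 5, 2 -> 4, 3 -> 3, 4 -> 2, 5 -> 1.
            rating = scale_max + scale_min - rating
        total += rating
    return total

# Hypothetical respondent: items 1 and 3 are unfavourable statements.
total_score = score_likert([5, 1, 4, 2], reverse_keyed={1, 3})
```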

Guttman’s scalogram
Items are arranged in increasing order of difficulty of being accepted, so that if an individual agrees with an item, the individual must also agree with all of the items below that item. (The aim is a cumulative ordinal scale.)
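Whether a response pattern fits this cumulative idea can be checked with a small function. This is a minimal sketch: the function name is made up, and it assumes the responses are already ordered from the easiest to the hardest item to accept.

```python
def is_cumulative_pattern(responses):
    """Return True if the responses form a perfect cumulative pattern.

    responses -- list of 1 (agree) / 0 (disagree), ordered from the easiest
                 to the hardest item to accept
    """
    seen_disagree = False
    for response in responses:
        if response == 0:
            seen_disagree = True
        elif seen_disagree:
            # Agreeing with a harder item after rejecting an easier one
            # breaks the cumulative order.
            return False
    return True

# Hypothetical respondents.
fits = is_cumulative_pattern([1, 1, 1, 0, 0])
breaks = is_cumulative_pattern([1, 0, 1, 0, 0])
```

In real data some respondents will break the pattern, so a scale is normally judged by how rare such breaks are rather than by requiring that none occur.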

The Semantic differential
A numerical rating scale. Uses questionnaires which list bipolar adjectives (e.g. good 1 2 3 4 5 6 7 bad, valuable 1 2 3 4 5 6 7 worthless) linked by seven points, one of which the respondent has to mark to indicate his or her attitude towards a given concept.

Limitations of scales measuring attitudes: A common problem is that of social desirability. People’s responses can be influenced, e.g. because they want to preserve a certain self-image or do not want to reveal their true attitudes.
