What test reliability is

Reliability is the degree to which a test measures consistently: under similar conditions it gives a similar result, not driven by chance.

5 min read · By Equipo Kokoro · Updated June 2026

Reliability is the degree to which a test measures consistently: if it were repeated under similar conditions, it would give a similar result, not one driven by chance. It’s one of the two key properties of any serious assessment, along with validity. It answers a simple question: is this result stable, or would it change if the person took the test again?

Reliability: one concept, one name

Reliability is about consistency over repeated measurement. It’s worth keeping it distinct from accuracy or correctness: a result can be highly stable, giving the same outcome again and again, without that stability telling you whether the test measures the right thing. Stability is one property; relevance is another.
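One common way to put a number on consistency is a test-retest correlation: the same people take the same test twice, and the two sets of scores are correlated. The sketch below is illustrative only. The scores are invented, and `pearson` is a helper written for this example, not part of any Kokoro tooling. It also shows why stability is not correctness: adding 5 points to every score leaves the correlation unchanged.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same six people on two sittings of one test.
first_sitting = [12, 15, 9, 20, 17, 11]
second_sitting = [13, 14, 10, 19, 18, 11]

stable = pearson(first_sitting, second_sitting)

# A version of the test that inflates every score by 5 points is just as
# consistent, even though every individual score is now off by 5.
inflated_first = [s + 5 for s in first_sitting]
inflated_second = [s + 5 for s in second_sitting]
biased = pearson(inflated_first, inflated_second)

print(f"test-retest r (original): {stable:.2f}")
print(f"test-retest r (inflated): {biased:.2f}")
```

Both lines print the same value, because a correlation only tracks whether people keep their relative positions from one sitting to the next. It says nothing about whether the scores mean what the test claims they mean.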

Why it matters: the margin of error

No psychological test measures without error. Tiredness, the day’s focus, a misread question or simple chance all introduce variation. Reliability estimates how much of a result is stable signal and how much is noise. The more reliable a test is, the more confidence we can have that the score reflects something real and not just the moment. That is also why a score is best read as a range rather than an exact figure: the less reliable the test, the wider that range.
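In classical test theory, the margin of error is usually expressed as the standard error of measurement, SEM = SD × sqrt(1 − reliability), where SD is the standard deviation of scores. The sketch below applies that formula. The function names, the scale (SD of 10, observed score of 60) and the two reliability values are assumptions chosen for illustration, not figures from any real test.

```python
from math import sqrt

def standard_error_of_measurement(score_sd, reliability):
    """SEM = SD * sqrt(1 - reliability), from classical test theory."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * sqrt(1.0 - reliability)

def score_range(observed, score_sd, reliability, z=1.96):
    """Approximate band around an observed score (z=1.96 is roughly 95%)."""
    margin = z * standard_error_of_measurement(score_sd, reliability)
    return observed - margin, observed + margin

# Hypothetical scale: scores have an SD of 10 and one person scores 60.
for r in (0.70, 0.90):
    low, high = score_range(60, 10, r)
    print(f"reliability {r:.2f}: roughly {low:.1f} to {high:.1f}")
```

The band around the same observed score narrows as reliability rises. A perfectly reliable test would have an SEM of zero, and the range would collapse to a single figure.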

The relationship with validity

Reliability and validity are not the same thing, and it’s best not to confuse them:

Aspect	Reliability	Validity
Question	Does it measure consistently?	Does it measure the right thing?
What it is	Consistency	Relevance
Relationship	Necessary for validity	Requires reliability, but also something more

A test can be consistent and still invalid, but it cannot be valid if it isn’t consistent. That’s why reliability is the first requirement, not the last. For the other half of the picture, see “What test validity is”.
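Classical test theory gives this dependence a concrete form: an observed score cannot correlate with any outside criterion more strongly than the square root of its reliability, assuming measurement error is unrelated to the criterion. The sketch below computes that ceiling. `validity_ceiling` is a name invented for this example, and the reliability values are illustrative.

```python
from math import sqrt

def validity_ceiling(reliability):
    """Upper bound on a test's correlation with any external criterion.

    Under classical test theory this bound is sqrt(reliability).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sqrt(reliability)

# Illustrative reliability values, not taken from any real test.
for r in (0.0, 0.49, 0.81, 1.0):
    print(f"reliability {r:.2f} -> validity can be at most {validity_ceiling(r):.2f}")
```

A test with zero reliability has a ceiling of zero, so it cannot be valid at all. High reliability only raises the ceiling: it does not guarantee that the test reaches it.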

In short

Reliability is the consistency with which a test measures: if repeated, it would give a similar result. It estimates how much of a score is stable signal and how much is noise, which is why results are best read as ranges, not as exact figures. It’s distinct from validity (consistency versus relevance) and is a prerequisite for it. In Kokoro, the tests in the library are designed to deliver stable, comparable signal; you can see the approach in the science behind it.
