SERVQUAL Service Quality (July 2014 updated)

July 2014 updated
QUANTITATIVE RESEARCH METHODS
SAMPLE OF
REGRESSION & MANOVA PROCEDURES
Prepared by
Michael Ling
Reference: Parasuraman, A., Zeithaml, V. A., and L. L. Berry (1988),
“SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions of
Service Quality,” Journal of Retailing, Vol. 64, No. 1, 12-40.

INTRODUCTION
Service quality is regarded as an important attribute of a business, yet it is
hard to measure because of its distinctive features: intangibility, heterogeneity
and inseparability of production and consumption. In the absence of an objective
measure of service quality, customers’ perception is taken as the standard of
measure.
The paper contributes to the marketing literature by developing the service
quality concept and deriving the SERVQUAL scale. The key research aim is to
find a universal service quality scale that is applicable to all service
categories.
Factor analysis is employed as the data reduction method in the development
of SERVQUAL. The paper details how the scale is reduced from an initial 97-item
scale across 10 service dimensions to a 22-item scale across 5 service
dimensions: tangibles, reliability, responsiveness, assurance and empathy.

SUMMARY
Based on the service literature, the initial SERVQUAL scale consists of 97
items across ten service dimensions: tangibles, reliability, responsiveness,
communication, credibility, security, competence, courtesy, understanding/knowing
the customer, and access. Each item is represented by two kinds of statements –
expectation statements (E’s) that measure customer expectations about the firms in
a service category and perception statements (P’s) that measure customer
perceptions about the performance of a particular firm in the same service category.
Data collection is conducted in two stages. During the first stage, 200
respondents from five service categories are selected and given
self-administered questionnaires. All responses are pooled for analysis,
regardless of service category. Based on the disconfirmation model in the
customer satisfaction literature, a difference score Q = P – E is formed for each of
the 97 items, and the coefficient alpha values (α) for the service dimensions range
from 0.55 to 0.78. Coefficient α values are then improved through an iterative
process of deleting items with low item-to-total correlations to achieve better
reliability. The outcome is a reduced set of 54 items, with coefficient α values
ranging from 0.72 to 0.83. Finally, the factor structure is reduced to 34 items across 7
dimensions, with coefficient α values ranging from 0.72 to 0.94.
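The purification loop described above can be illustrated with a minimal sketch.
It is not the authors' procedure or data: the function names and the tiny table
of Q scores are hypothetical, chosen only to show how a difference score,
coefficient alpha and a corrected item-to-total correlation are computed, and how
deleting the weakest item can raise alpha.

```python
from math import sqrt
from statistics import variance


def difference_scores(perceptions, expectations):
    """Item-level gap scores Q = P - E for one respondent."""
    return [p - e for p, e in zip(perceptions, expectations)]


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items table of scores."""
    k = len(rows[0])
    item_variances = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variances / total_variance)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(rows):
    """Correlation of each item with the sum of the remaining items."""
    result = []
    for i in range(len(rows[0])):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]
        result.append(pearson(item, rest))
    return result


def drop_item(rows, index):
    """Return the table with one item (column) removed."""
    return [row[:index] + row[index + 1:] for row in rows]


# Hypothetical Q scores: 5 respondents x 4 items of one dimension.
# The fourth item is deliberately unrelated to the other three.
q_scores = [
    [-2, -1, -2, 1],
    [-1, -1, -1, -2],
    [0, 0, -1, 0],
    [1, 1, 0, -1],
    [2, 1, 2, 0],
]

alpha_all = cronbach_alpha(q_scores)
correlations = corrected_item_total(q_scores)
weakest = correlations.index(min(correlations))  # item with lowest correlation
alpha_reduced = cronbach_alpha(drop_item(q_scores, weakest))

print(f"alpha with all items: {alpha_all:.3f}")
print(f"weakest item index:   {weakest}")
print(f"alpha after deletion: {alpha_reduced:.3f}")
```

In this toy table the unrelated fourth item has the lowest corrected
item-to-total correlation, and removing it raises alpha. One pass of the
iterative process amounts to repeating this step until no weak item remains.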
During the second stage, four samples of 200 respondents are selected, one
from each of four service firms. Again, the respondents complete
self-administered questionnaires, this time made up of 34 items. The data are
sorted into the four corresponding groups and analysed. The outcome is a 22-item
scale across five dimensions: tangibles, reliability, responsiveness, assurance
and empathy.

CRITIQUE
The difference approach
The difference approach, Q = P - E, used in the evaluation of SERVQUAL is
based on the disconfirmation model in the customer satisfaction literature. The
authors argue that the “idea” of a difference score is not new and this approach has
been used in role conflict research.
Consider the equation Q = P – E, where the same Q value can be obtained
from various combinations of P’s and E’s. For example, a difference of 1
between P and E can come from any of these scenarios: P = 2, E = 1; P = 3,
E = 2; P = 4, E = 3; P = 5, E = 4; P = 6, E = 5; P = 7, E = 6. The difference score, Q,
does not capture the individual P’s and E’s, so valuable information could be left out.
A major concern is whether the customers’ perception of service quality is the same
regardless of the individual P’s and E’s. The authors have neither discussed this
point nor conducted trials to test this possibility.
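The loss of information can be made concrete by enumerating every (P, E) pair
that yields a given Q. The sketch below assumes a 7-point response scale,
inferred from the example above where P runs up to 7. The names are
illustrative only.

```python
from collections import defaultdict

# Assumption: both P and E are rated on a 7-point scale (1 to 7).
SCALE = range(1, 8)

# Group every possible (P, E) pair by its difference score Q = P - E.
pairs_by_gap = defaultdict(list)
for p in SCALE:
    for e in SCALE:
        pairs_by_gap[p - e].append((p, e))

for q in sorted(pairs_by_gap):
    print(f"Q = {q:+d}: {len(pairs_by_gap[q])} pair(s) {pairs_by_gap[q]}")
```

Under this assumption, 49 distinct (P, E) combinations collapse into only 13
possible Q values, and Q = 1 alone covers the six scenarios listed above, from
a firm rated low against low expectations to one rated high against high
expectations.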
Dimensions of SERVQUAL
The final refined SERVQUAL scale consists of five dimensions, which are
“designed to be applicable across a broad spectrum of services”. A concern is
whether these five service dimensions are sufficient to account for the variations of
quality across all service categories. The sample data has been drawn from a limited
number (five) of service categories and a limited number (four) of service firms. Is it
possible that complex SERVQUAL dimensions (larger number of dimensions) are
required in some services such as movie ticketing but not required in other services
such as airline ticketing? The authors should address the external validity of
SERVQUAL by cross-validating their results against a much broader range of service
categories.
Each SERVQUAL dimension is measured by only four to five items. A concern is
whether this number of items is sufficient. Is it possible that service quality
is influenced by contextual factors (depending on service category), so that
some service categories, because of their complex nature, need to be measured
by a larger number of items than others?
Reliability
The inter-item reliability (coefficient alpha) of the final refined scale ranges
from 0.52 to 0.84, where “the reliabilities are consistently high across all four
samples” and “the total–scale reliability is close to 0.9”. This is a good outcome.
However, given the limited data samples, a concern is whether the reliability can be
sustained across all service categories. Again, the issue of external validity of
SERVQUAL should be addressed by cross-validating the results against a much
broader range of service categories.
Amongst the test items, nine pairs of P and E statements (items #10 to
#13, items #18 to #22) are negatively worded, all of which come from the
Responsiveness and the Empathy dimensions. It is well understood that negatively
worded statements are designed to reduce systematic response bias. There are a
couple of concerns here. Firstly, the negatively worded items are not spread out
across the five dimensions, which would be a better way to reduce bias.
Secondly, some of the negatively worded items are not straightforward to understand
and interpret. For example, “It is not realistic for customers to expect prompt service
from employees of these firms” (E11). There is a potential data quality problem here.
The SERVQUAL items are ordinal, which means that polychoric correlations
might be needed to estimate the correlations if the underlying distributions are
assumed to be continuous.
Questionnaire administration
The questionnaire is made up of “97-statement expectations part followed by a
97-statement perceptions part”. There are a couple of concerns here. Why is the
expectations part before the perceptions part and not the other way round? Why are
the individual items, P’s and E’s, not grouped together? Focus groups should be
conducted prior to data collection to find out how the expectation and perception
statements should be set up.
The 97 statement pairs make the questionnaire lengthy. A concern is that
respondents might lose interest and attention before answering all the
questions. Again, there is a potential data quality problem.
Convergent validity
Separate one-way ANOVA procedures have been used in the evaluation of
the association between SERVQUAL scores (dependent variables) and Overall Q
(independent variable) across each of the five SERVQUAL dimensions. Some of the
concerns are as below.
i. In ANOVA/MANOVA procedures, the dependent and independent variables
are interval (or continuous) and categorical respectively. Here, the dependent
variables (or SERVQUAL scores) are ordinal, not interval, variables. No
discussion is provided to explain how this might affect the results.
ii. No consideration is given to the experiment-wise Type I error rate, even
though multiple ANOVA procedures are used. In the second data collection
stage, six one-way ANOVAs are conducted: one for each of the five SERVQUAL
dimensions and one for the combined scale. With 6 F tests at .05 each, the
experiment-wise probability of a Type I error could be as high as 30
percent (about 26 percent if the six tests were independent). It is
important to discuss whether the probability should be set at this level.
iii. Why is the MANOVA omnibus test not conducted prior to the ANOVAs? Apart
from protecting against an inflated Type I error probability, the MANOVA
procedure also takes into account the intercorrelations among the SERVQUAL
dimensions.
iv. The assumptions of ANOVA, such as independence, normality and
homogeneity of variance for each test group, are not tested. No descriptive
statistics (such as skewness and kurtosis) or Shapiro-Wilk statistic are
provided. No Levene’s test of homogeneity of variances is reported.
v. No effect size, such as Cohen’s measure, is reported.
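Two of the points above, the experiment-wise error rate in (ii) and the effect
size in (v), involve simple arithmetic that can be sketched as follows. The
function names and the small groups of scores are hypothetical and do not come
from the paper. Eta-squared and Cohen's f are used here as one common choice of
effect size for one-way ANOVA, which is an assumption about what "Cohen's
measure" refers to.

```python
from math import sqrt


def familywise_error(n_tests, alpha):
    """Return (Bonferroni upper bound, rate if the tests are independent)."""
    upper_bound = min(1.0, n_tests * alpha)
    if_independent = 1 - (1 - alpha) ** n_tests
    return upper_bound, if_independent


def bonferroni_alpha(n_tests, familywise_alpha):
    """Per-test alpha that keeps the family-wise rate at or below the target."""
    return familywise_alpha / n_tests


def eta_squared(groups):
    """Proportion of total variance explained by group membership."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    return ss_between / ss_total


def cohens_f(eta_sq):
    """Cohen's f effect size derived from eta-squared."""
    return sqrt(eta_sq / (1 - eta_sq))


# Six F tests at .05 each, as in point (ii).
upper, independent = familywise_error(6, 0.05)
print(f"upper bound: {upper:.3f}, if independent: {independent:.3f}")
print(f"Bonferroni per-test alpha: {bonferroni_alpha(6, 0.05):.4f}")

# Hypothetical SERVQUAL scores for three Overall Q categories.
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
eta_sq = eta_squared(toy_groups)
print(f"eta-squared: {eta_sq:.2f}, Cohen's f: {cohens_f(eta_sq):.2f}")
```

The 30 percent figure corresponds to the upper bound; if the six tests were
independent the rate would be about 26.5 percent. Because the SERVQUAL
dimensions are intercorrelated, the true rate lies somewhere below the bound,
which is one reason the MANOVA omnibus test in (iii) is relevant.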
Overall Assessment
The concerns raised in this critique are that the difference approach,
Q = P – E, might be too simplified and omit critical information; that the five
dimensions might not be sufficient to cover all service categories; that the
items in the dimensions might be influenced by contextual factors; that the
negatively worded items are not spread out across the dimensions; that the
questionnaire is lengthy; that the presentation of the perception-expectation
statements is not justified; and that the convergent validity analysis has
shortcomings. The strengths of the paper are the new conceptual framework of
SERVQUAL and the high reliabilities achieved. The weaknesses of the paper are
the concerns raised above and the unproven applicability of SERVQUAL across all
service categories.

CONCLUSION
The contribution of the paper is its development of the service quality scale,
SERVQUAL, in the marketing discipline. The final refined scale consists of 22 items
across five service dimensions, which is the result of an iterative process of data
reduction based on samples drawn from five service categories and four service
firms.
Though the reliabilities of the measurement scale are consistently high
(total-scale Cronbach’s alpha close to 0.9) across the samples, this critique
raises concerns over the difference model, Q = P – E, and other areas such as
item dimensions, validity, reliability, questionnaire administration and
generalization.
The research could have been improved by addressing the concerns raised in this
critique. In particular, the difference model should be examined more closely to
ascertain whether customer perceptions can be summarized by the difference
scores, which is a key assumption upon which SERVQUAL is built. Other
improvements include testing whether the reliability coefficients of the SERVQUAL
dimensions hold across a broader range of service categories, and testing the
convergent validity of SERVQUAL more rigorously.
