MATH PLAY’S EFFECTS ON MATHEMATICS ATTITUDE AND AWARENESS: A
RASCH ANALYSIS
By
Jean Bowen
B.Sc., University of Northern British Columbia, 1999

THESIS IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
IN
MATHEMATICS

UNIVERSITY OF NORTHERN BRITISH COLUMBIA
March 2018
© Jean Bowen, 2018

Abstract
Where children see mathematics in the world and the attitudes children hold toward
mathematics are not thoroughly understood. The Mathematics Attitude Assessment (MAA)
and the Difficulty Associating Words with Mathematics (DAWM) are instruments developed
to examine mathematics attitude and the propensity of primary students to indicate words
as mathematics related. These instruments were used to measure any changes among pre-intervention, post-intervention and one-month post-intervention in this quasi-experimental
design. The intervention was Math Play: a collection of mathematics-based games and
activities with emphasis on exploration and problem solving. Rasch analysis and traditional
analysis techniques were used to examine the participants, the assessment instruments and
any effects. The difference scores were analysed using a 2X2 Two-Factor ANOVA. A
statistically significant difference was found for pre-treatment to one-month post-treatment
for the DAWM with each participant as an experimental unit. No statistically significant
differences were found for the MAA results.

Table of Contents

Abstract ..................................................................................................................................... ii
Table of Contents ..................................................................................................................... iii
List of Tables ............................................................................................................................. vi
List of Figures ......................................................................................................................... viii
Acknowledgments ..................................................................................................................... x
Dedication ................................................................................................................................ xi
Chapter One Introduction and Background ............................................................................. 1
Attitudes Toward Mathematics ........................................................................................... 1
Summary ............................................................................................................................ 11
Statement of Problem ........................................................................................................ 12
The Research Questions ..................................................................................................... 13
Specific Research Questions .............................................................................................. 14
Research Objectives ........................................................................................................... 14
Significance of this Study ................................................................................................... 15
Chapter Two – Rasch Analysis ................................................................................................ 16
Rasch Analysis as a Measurement Instrument .................................................................. 16
What is Rasch Analysis? ..................................................................................................... 16
Chapter Three Methods ......................................................................................................... 29
Procedure ........................................................................................................................... 29
Approval Process ................................................................................................................ 29
Group Placement ............................................................................................................... 30
Math Play Facilitators......................................................................................................... 31
Timeline .............................................................................................................................. 32
Assessments ....................................................................................................................... 33
The Assessment Instruments ............................................................................................. 33
Difficulty Associating Words with Math (DAWM) assessment. ..................................... 33
Math Attitude Assessment............................................................................................. 34
Study Design ....................................................................................................................... 36
Intervention Group ............................................................................................................ 37
The Games .......................................................................................................................... 39
Collected Data .................................................................................................................... 41
Ethical Considerations ........................................................................................................ 41
Chapter Four Results .............................................................................................................. 42
Difficulty Associating Words with Mathematics ................................................................ 43
Missing Data Analysis. ........................................................................................................ 60
Traditional Statistical Analysis of the DAWM Results ........................................................ 61
DAWM with Students as the Experimental Unit ................................................................ 64
Summary of Analysis of DAWM Assessment Intervention ................................................ 76
DAWM with Schools as a Block and Treatment and Control Separating the Units ........... 77
Math Attitude Assessment (MAA) ..................................................................................... 80
Traditional Statistical Analysis of the MAA Results............................................................ 95
Overall Summary .............................................................................................................. 108
Chapter Five Discussion ....................................................................................................... 110
Conclusion ........................................................................................................................ 111
Delimitations, Limitations and Future Research .............................................................. 114
Implications ...................................................................................................................... 116
Summary .......................................................................................................................... 117
References ............................................................................................................................ 118
Appendix .............................................................................................................................. 123
Appendix A Forms and Consent ....................................................................................... 123
Appendix B Attitude Assessment Instruments ................................................................ 150
Appendix C Games and Instructions ................................................................................ 160
Appendix D Rasch Material .............................................................................................. 200

List of Tables
Table 1 Fit Statistics and Interpretations .............................................................................. 26
Table 2 Reasonable Item Mean-Squares Ranges for Infit and Outfit ................................... 28
Table 3 Excerpt DAWM Participant Measures ...................................................................... 44
Table 4 Item Characteristics for the Difficulty Associating Words with Mathematics DAWM
................................................................................................................................................ 48
Table 5 Anchored Logit Measure, Gender and Group (Treatment or Control) for February,
April and May DAWM Excerpt ............................................................................................... 59
Table 6 DAWM Descriptive Statistics by Group and Gender April – February Difference
Measure ................................................................................................................................. 63
Table 7 DAWM Descriptive Statistics by Group and Gender May – April Difference Measure
................................................................................................................................................ 63
Table 8 DAWM Descriptive Statistics by Group and Gender May – February Difference
Measure ................................................................................................................................. 63
Table 9 DAWM Between-Subject Effects Statistical Significance (p) .................................... 71
Table 10 DAWM Estimated Marginal Means and Standard Error April – February ............. 72
Table 11 DAWM Estimated Marginal Means and Standard Error May – April ..................... 73
Table 12 DAWM Estimated Marginal Means and Standard Error May – February .............. 74
Table 13 The Number of Participants in each Between Subject Factor ............................... 78
Table 14 Excerpt MAA Participant Measures........................................................................ 83
Table 15 Item Difficulty Math Attitude Assessment (MAA) .................................................. 85
Table 16 Items from the MAA Ordered Hardest to Easiest by Logit Measure ..................... 86
Table 17 Measure, Gender and Group (Treatment or Control) for February, April and May
MAA Excerpt........................................................................................................................... 93
Table 18 MAA Descriptive Statistics by Group and Gender April – February Difference
Measure ................................................................................................................................. 96
Table 19 MAA Descriptive Statistics by Group and Gender May – April Difference Measure
................................................................................................................................................ 96
Table 20 MAA Descriptive Statistics by Group and Gender May – February Difference
Measure ................................................................................................................................. 97
Table 21 MAA Between-Subject Effects Statistical Significance (p).................................... 103
Table 22 MAA Estimated Marginal Means and Standard Error April – February ............... 104
Table 23 MAA Estimated Marginal Means and Standard Error May – April ...................... 105
Table 24 MAA Estimated Marginal Means and Standard Error May – February ............... 106

List of Figures
Figure 1. An excerpt from work by Stodolsky et al. (1991, p. 97) indicating the percent of
students using the indicated words to describe either mathematics or social studies. ......... 9
Figure 2. A sample of the DAWM assessment. ...................................................................... 34
Figure 3. The statements from the MAA ordered by factor. ................................................. 35
Figure 4. DAWM item and outfit z scores: A graphical display of the fit. .............................. 51
Figure 5. DAWM Persons and items mapped by difficulty. Persons on the left and items on
the right. A dot ( ∙ ) represents 1 to 2 people and a number sign (#) represents 3 people. . 54
Figure 6. DAWM item versus average person measure for item endorsement. ................... 56
Figure 7. DAWM frequency distribution and normal curve for April – February data. ......... 65
Figure 8. DAWM frequency distribution and normal curve for May – April data. ................ 66
Figure 9. DAWM frequency distribution and normal curve for May – February data. ......... 67
Figure 10. DAWM box-plot for April – February data test for outliers with difference score
scale on the vertical axis. ....................................................................................................... 68
Figure 11. DAWM box-plot for May – April data test for outliers with difference score scale
on the vertical axis. ................................................................................................................ 69
Figure 12. DAWM box-plot for May – February data test for outliers with difference score
scale on the vertical axis. ....................................................................................................... 69
Figure 13. DAWM estimated marginal means versus group April – February separated by
gender. ................................................................................................................................... 73
Figure 14. DAWM estimated marginal means versus group May – April by gender. ........... 74
Figure 15. DAWM estimated marginal means versus group May – February by gender...... 75
Figure 16. Treatment and control groups versus blocks of the April – February data .......... 79
Figure 17. MAA items z scores of Outfit. ............................................................................... 88
Figure 18. MAA item z score of outfit item 6 removed. ........................................................ 89
Figure 19. MAA person ability and item difficulty map. ........................................................ 91
Figure 20. MAA item versus person measure for item endorsement. .................................. 92
Figure 21. MAA frequency distribution and normal curve for April – February data. .......... 98
Figure 22. MAA frequency distribution and normal curve for May – April data. .................. 99
Figure 23. MAA frequency distribution and normal curve for May – February data. ......... 100
Figure 24. MAA box-plot for April – February data test for outliers. .................................. 101
Figure 25. MAA box-plot for May – April data test for outliers. .......................................... 101
Figure 26. MAA box-plot for May – February data test for outliers. ................................... 102
Figure 27. MAA estimated marginal means versus group April – February separated by
gender with 2XSE error bars. ............................................................................................... 104
Figure 28. MAA estimated marginal means versus group May – April separated by gender
with 2XSE error bars. ............................................................................................................ 105
Figure 29. MAA estimated marginal means versus group May – February separated by
gender with 2XSE error bars. ............................................................................................... 106

Acknowledgments
I would like to thank and acknowledge the support of:
My family – I would not have been able to straighten this tangled mess of letters, numbers
and symbols without your love and encouragement.
My committee (in every iteration) – Your guidance and support were unparalleled and
unfaltering.
My colleagues – Your patience, understanding and assistance were major contributors to
the end-product.
My friends – Are you still there? Of course you are! You are always there to lend an ear, a
tissue or a critical eye: whichever complemented the situation.
School District 57 – Every one of you was so welcoming and inclusive in your space. This
would not have been possible without you.
MATH 190 volunteer facilitators – What can I say? You were all prime figures in allowing
this work to happen. Thank you.

Dedication

In memory of:
Florence Elaine Taylor (nee Coates)
Feb 19, 1930 – Dec. 25, 2017
A perfectly cooked breakfast, any kitten or puppy, and a string of pearls around my neck on
my wedding day.

and
Gracie Brown-Ryburn Leitch (nee Yates)
May 15, 1923 – Apr. 10, 2018
“They are in terrible, terrible trouble,” a strong cup of tea in a proper tea cup, and my first
Blizzard®.

Two amazing, strong women who impacted my life in countless ways.

Chapter One Introduction and Background
Where children see mathematics in the world around them and the attitudes
children hold toward mathematics are not thoroughly understood. Attitudes toward
mathematics, also referred to as mathematical affect, are an ongoing area of research.
This research does not all focus on the same aspects of attitude. For example, studies
variously address definitions of attitudes toward mathematics, methods of measuring
those attitudes, the results from various attitude assessments and methods used to
change those attitudes. Studies in these areas include various age groups.
Attitudes Toward Mathematics
What is mathematics attitude? The answer to this question is not a simple one.
The literature provides a number of definitions of attitudes towards mathematics and
extensive information on factors which influence these attitudes. Attitudes towards
mathematics are a specific type of mathematical affect, or emotion associated with
mathematics. Mathematics attitude is often not defined in studies on the subject. Instead,
the studies focus on factors that influence attitudes.
According to Hannula (2002, p. 1), mathematics attitude consists of the emotions
associated with mathematics. Specifically, Hannula’s emotion-based definition was
broken down into “the emotions aroused in the situation”, “emotions associated with the
stimuli”, “expected consequences” and “relating the situation to personal values.”
Hannula’s definition is more complex than the earlier definition of Jadav and Quinn
(1987), who simply defined liking mathematics as having a positive attitude towards
mathematics: a uni-dimensional definition.
The study by Jadav and Quinn (1987) was a meta-analysis of articles investigating
attitudes toward mathematics. The meta-analysis required that self-concept (or
self-image) not be a factor associated with attitude. This exclusion of self-concept is in
contrast to the factors later examined by Tapia and Marsh (2004) and Baumert, Koller,
Ludtke, Marsh and Trautwein (2005), as both studies included self-concept in their factor
list. Tapia and Marsh (2004) included self-confidence, which is directly related to
self-concept, in the factors they examined. The study by Baumert et al. (2005) focussed on
self-concept. The list of factors across studies which may contribute to or define
mathematical attitude is extensive and at times contradictory. Whether uni-dimensional or
multifactorial, there is no widely accepted definition of attitudes toward mathematics.
Math attitude assessment instruments. With the variation in the factors being
included in the definitions of attitudes towards mathematics, it is not a surprise that there
are a number of ways that attitudes toward mathematics have been assessed. Hannula
(2002) collected information through dialogues. The dialogues were then examined for
statements that described mathematics affect and the researcher provided
interpretations on the dialogues. This method was the least common method of
information collection.
Self-reporting through questionnaires was a far more common method for
collecting data. In “A review of instruments created to assess affect in mathematics,”
Chamberlin (2010) reviewed a progression of instruments used to assess mathematics
affect for secondary school to college age students. The first instrument reviewed was
The National Longitudinal Study of Mathematical Abilities (NLSMA). The NLSMA identified
attitude as “uni-dimensional” (Chamberlin, 2010). The NLSMA did not define or specify
attributes that form attitude. Instead, the NLSMA directly asked what the participant’s
attitude toward mathematics was. Chamberlin explained that Aiken expanded the
attitude assessments to include enjoyment and value of mathematics in the Mathematics
Attitude Inventory (MAI) (Chamberlin, 2010). The next step in the evolution of
mathematics affect assessment instruments covered in Chamberlin’s review was the
Fennema-Sherman Mathematics Attitude Scales (FSMAS), developed by Fennema and
Sherman in 1976 (Chamberlin, 2010). The FSMAS instrument assessed mathematics affect
based on four components: attitude, self-efficacy, anxiety, and value of mathematics.
When Chamberlin did the review in 2010, the FSMAS was still being used. The drawback to
the FSMAS was the language used. Chamberlin was concerned that, because the meanings
of words change over time, the test had become dated (Chamberlin, 2010).
The final instrument reviewed by Chamberlin was the Attitude Towards
Mathematics Inventory (ATMI) created by Tapia and Marsh (Tapia & Marsh, 2004). Unlike
the NLSMA which examined attitude as a uni-dimensional attribute or the FSMAS which
includes attitude in the factors examined when identifying mathematical affect, Tapia and
Marsh did not directly include attitude in the factors but instead looked at factors they
thought contributed to attitude (Chamberlin, 2010). At the onset of the study, the ATMI
assessed attitude defined by six attributes: self-confidence (self-concept), anxiety, value,
enjoyment, motivation, and teacher/parent expectations. However, through their analysis
of the results they modified the assessment to include only self-confidence, value,
enjoyment and motivation (Tapia & Marsh, 2004). (See Appendix B for the complete ATMI.)
After the review by Chamberlin was published, two of the instruments reviewed
were modified: the Fennema-Sherman Mathematics Attitude Scales and the ATMI. The
Fennema-Sherman Mathematics Attitude Scales were re-examined and modified by Doepken,
Lawsky, and Padwa (2013) and the language was updated. In 2012, Chapman and Lim
re-evaluated the ATMI and found such a strong link between motivation and enjoyment that
they suggested removing the motivation factor (Chapman & Lim, 2012). Removal of
motivation is in contrast to work by Hannula (2006), who focussed exclusively on
motivation. It is evident that the creation of an instrument to assess attitudes toward
mathematics is a complex and ever-developing process.
Assessing attitudes toward mathematics in primary students. All the previously
discussed mathematics attitude assessment instruments were designed for and used with
students from high school age and up. However, questionnaires are employed with
primary students as well. These questionnaires included various factors and were of
varying lengths. Tezer and Karasel (2010) used a 10-item questionnaire to assess the
attitudes of 230 Grade 2 and 3 students. The students responded to each item by choosing
from faces representing “very happy, happy, neutral and sad”. Some example
statements include, ‘“What I learn in my math course I use in my daily life” with which
idea do you emotionally agree to?’ and ‘“The course I mostly like is math” which facial
expression would you reply with for this idea?’ The study reported that a positive
attitude toward mathematics was found.
Another questionnaire-based mathematics attitude assessment was used by
Dowker, Bennett and Smith (2012). They used a 28-question format and focussed on
seven areas: “maths in general, written sums, mental sums, easy maths, difficult maths,
maths tests, and understanding the teacher”. The participants for this study were 44
grade 3 students and 45 grade 5 students. The participants responded to each item with a
self-rating, a degree of liking, level of anxiety and level of unhappiness. An attitude
“above neutral” was reported with no gender difference.
Another study was a meta-analysis by Quinn and Jadav (1987). They included grades 2
to 6 and examined the results from 1758 students. One of the studies included 11
questions that were not specified in the article nor were any factors specified. A second
study in the meta-analysis used the Survey of School Attitudes. The Survey of School
Attitudes was designed by T. P. Hogan in 1975 (Jadav & Quinn, 1987). The focus of this
study was not to assess attitude as much as it was to look at the relationship between
attitude and achievement. The studies that were included in the meta-analysis had four
attributes in common:
(1) Mathematics and reading evaluative data regarding attitude and achievement had
to be collected from testing that occurred at the same time and for the same
grade on two or more occasions.
(2) The attitude assessment needed to include liking a subject, but no self-concept.
(3) The achievement assessment needed to be obtained from specific assessment
tests and not teacher grades.
(4) The original data had to be available to be re-examined.
The component of this study most relevant to attitudes of primary students toward
mathematics is point (2): the attitude assessment needed to include liking a subject, but
no self-concept. This is counter to some of the previous attitude assessment instruments
(Tapia & Marsh, 2004; Aiken as cited in Chamberlin, 2010).
Levine (1972) used a questionnaire with children as young as grade 3. The sample
included 144 grade 3, 4, and 6 students. Unlike some of the other studies, which considered
only elementary school students’ attitudes toward mathematics, this study was designed to
compare attitudes across school subjects. Levine included the following statements:
1) I enjoy studying this subject the most.
2) I do my best work in this subject.
3) I think this subject is the most important subject I study in school.
4) My parents are able to help me most in this subject.
5) My parents feel that this should be my best subject.
6) I wish this was my best subject.
7) I feel I need the most help in this subject.
8) I feel my teacher does her (his) best job in teaching this subject.
9) This is my teacher’s favorite subject. (p. 53)
Students and their parents were asked to rank English, Mathematics, Science and Social
Studies using the nine statements.
Results from assessing attitudes toward mathematics. The results from various
mathematics attitude assessments are not consistent. Dowker et al. (2012) found that
students tend to have a positive attitude toward mathematics. However, Hannula (2002)
reports that attitudes towards mathematics vary but tend to worsen as children progress
through school. Arnold, Fisher, Doctoroff, and Dobbs (as cited in Geist, 2010) claim that
the attitudes children have form early and are difficult to change. Kogce, Yildiz, Aydin and
Altindag (2009) found a correlation between early attitudes and later attitudes of
students as they progress from elementary school to secondary school. Findings on the
role gender has in affecting mathematics attitude vary. No gender difference was found by
Kogce et al. (2009) or Ma and Kishor (1997). However, a gender difference was found by
Hannula (2002).
Taken together, these findings could be interpreted to mean that students’ generally
positive attitudes toward mathematics are hard to change, but that when attitudes do
change they tend to decline. This gives mathematics researchers and educators an
incentive to assess attitudes early and take steps to keep attitudes positive.
When students like mathematics and how they describe it. Liking of mathematics
was found to be based on getting the correct answers and finding the work easy
(Stodolsky, Salk, & Glaessner, 1991). They found this was in contrast to the liking of social
studies. Social studies was liked if the topic was interesting or the activities were enjoyed.
The same study also found that when students were asked to describe
mathematics and social studies, the number and nature of the words they chose were
very different (Stodolsky et al., 1991). The words and concepts in Figure 1 are a summary
of the discussions held with the grade 5 students in the study. The words were not given
to them. The students’ descriptions of mathematics and social studies broke down as
shown in Figure 1.
The list for social studies continued with six more categories. What is of note is the
difference in the percent of students that agree about the mathematics definition versus
that of social studies. Also, the number of categories used to define social studies is
greater. Mathematics was described with fewer words that were more specific. Social
studies was described with more words that were more general. Games did not appear
on the list for mathematics.

Words used to define Math       Words used to define Social Studies
Addition                 77%    Special place, event, person, period   67%
Subtraction              68%    History                                48%
Multiplication           62%    About People                           30%
Numbers                  62%    Cultures                               20%
Division                 58%    Wars                                   13%
Fractions or decimals    30%    Projects                               12%
Measurement              12%    Social living                          10%
Doing Problems           10%    Events-dates                           10%
Word Problems            10%    Maps: read & make                       8%
Geometry                 10%    Land forms                              8%
Counting money            7%    Reading                                 7%
Telling time              5%    Definitions                             7%
Miscellaneous            13%    Countries                               7%

Figure 1. An excerpt from work by Stodolsky et al. (1991, p. 97) indicating the percent of
students using the indicated words to describe either mathematics or social studies.
The relationship between mathematics attitude and achievement. The link
between attitudes and achievement has been studied extensively but the results from
these studies are often contradictory. These contradictions may in part arise because of
the different factors used to define attitude and the different instruments used to assess
attitude. In a study of seventh graders by Marsh et al. (2005), self-concept, a component
of attitude in several assessment instruments, was found to predict achievement.
Dowker et al. (2012) also found a link between self-rating and performance in third and
fifth grade students. De Lourdes, Monteiro and Peixoto (2012) found that there was not
a link between achievement and attitude. This finding did not support work by Anttonen
(1969), who found a correlation between attitude and achievement. Moenikia and
Zahed-Babelan (2009) identified attitude toward mathematics as a cause for achievement in
mathematics. The possibility that there is a link between attitude and achievement
further supports the need to identify attitude and try to improve it as much as possible.
Mathematics achievement receives a lot of focus, but studies which identify a
relationship between attitude and achievement lend support for the need to focus on
attitude and achievement simultaneously.
Why Play with Mathematics? How children experience and learn mathematics is
diverse. Children will engage in mathematics play during free play sessions (Ginsburg as
cited in Capacity Building Series, 2011). Ginsburg goes on to explain that the way a child
thinks “is not limited to the concrete and mechanical; it is often complex and abstract”
(Capacity Building Series, 2011, p. 1). MacDonald (2014) states that algorithms are needed to
ensure that students are able to solve problems in a timely manner and to allow students
to perform mathematics at a higher level. Teaching algorithms is necessary because there
are some concepts students need to master that would take too long to work out through
discovery, if the students found the solutions at all. However, Robinson (2011) has offered
the view that we are educating the creativity out of our children. The article by
MacDonald referenced work by Mighton, who agreed with the need for discovery and
creativity in mathematics (Mighton as cited in MacDonald, 2014). Mathematics education
is not limited to topics taught in the classroom.

“Play expands intelligence, stimulates the imagination, encourages creative
problem solving, and helps develop confidence, self-esteem, and a positive attitude
toward learning.” This is a quote from Dr. Mustad in an article titled CMEC Statement on
Play-Based Learning produced by the Council of Ministers of Education, Canada (2010).
Now consider the findings of Geist (2010), who stated that a dislike of mathematics is
influenced by high stakes situations and the stress associated with timed tests (especially
for female students). Finally, add the work of Tapia and Marsh (2004), who found that self-confidence, enjoyment, motivation and value were four factors which could be used to
measure attitudes toward mathematics. These findings could be put together to create a
possible instrument to affect attitudes toward mathematics.
Summary
There is not a concise or consistent definition of attitudes toward mathematics.
This contributes to the variety of instruments used when studies address attitudes toward
mathematics. Attitudes have been found to be positive but decline as students progress
through school. Work on attitudes toward mathematics has often been done with participants ranging from older
elementary students to university students. Addressing attitudes toward mathematics in
the late primary years may help researchers’ and educators’ understanding of attitudes
toward mathematics in the later years.
Identifying attitudes toward mathematics and improving attitudes toward
mathematics are two separate yet related concepts. Improving attitudes may be done by
addressing the aspects of mathematics which are related to why people dislike it and by
increasing the exposure to aspects of mathematics through play. The idea of play is
supported by the benefits associated with play. Some of the benefits of play directly
address factors which have been identified as influences of attitudes toward
mathematics.
Statement of Problem
The focus of this study addresses two components: 1) How can Primary school students’ awareness of
mathematics in the world around them, their mathematics attitude, and changes in that awareness
and attitude be measured? 2) Where do
grade 2 and/or 3 students see mathematics in the world around them, what are the
attitudes of primary students toward mathematics, and does a “Math Play” intervention
improve either of these? For the purpose of this study Math Play was created as a
collection of mathematics-based games and activities where the emphasis is exploration
and problem solving, not correction and criticism.
The lack of a consistent definition of attitude toward mathematics and of
consistently used instruments to assess those attitudes may contribute to the
variety in study results. It is like measuring the quality of a picture without defining
quality and without specifying what a picture is. Is it a painting, a Polaroid, a photo on a
smartphone or graffiti on a wall? Is quality a subjective thing, or is the interest in framing,
lighting, composition, and perspective? A consistent definition and an instrument with
known properties that is used repeatedly makes comparison of research findings, changes
with time and measuring effects of interventions easier.

It is one thing to know what attitudes exist, but if those attitudes are not positive,
improving those attitudes is crucial. Introducing mathematics in a way that addresses or
removes areas of mathematics that have been identified as being related to negative
attitudes toward mathematics may be able to improve attitudes toward mathematics.
Mathematics was described using words associated with arithmetic by over half of
the students in the study by Slassner et al. (1991), and several areas of mathematics were
not noted (for example: patterns, probability, graphs, solving equations). The association
with arithmetic may be to the detriment of the other areas in day-to-day life that use and
involve mathematics. Expanding what students think of as mathematics, and what they associate
with it, may expand their understanding of the subject and its connection to everyday
activities they take part in.
The Research Questions
1) Do the mathematics instruments Difficulty Associating Words with Mathematics
(DAWM), which was created for this study, and Mathematics Attitude Assessment
instrument (MAA), which was modified for this study, have the necessary
psychometric properties to allow assessment of change in attitudes?
2) If a component of mathematics that is exploration and experience based, and that includes
neither timed tests nor grading (Math Play), is introduced,
will the overall attitudes toward mathematics improve? Across both genders? And
will improvement persist one month post-treatment?

Specific Research Questions
1) Is the Difficulty Associating Words with Mathematics assessment (DAWM) an
appropriate assessment instrument for the participants?
2) Will Math Play change the propensity of participants’ endorsement of words
associated with mathematics?
3) If there is a change in the DAWM, will it persist after Math Play has ended?
4) Is there a gender difference for propensity to associate words with mathematics?
5) Is the Math Attitude Assessment (MAA) an appropriate instrument to assess the
attitudes of the participants toward mathematics?
6) Will Math Play change the expressed attitude of the participants toward
mathematics?
7) If there is a change in the MAA results, will it persist after Math Play has ended?
8) Is there a difference in the MAA results between the genders?
9) Is there a correlation between the MAA and the DAWM assessment results?
Research Objectives
There are multiple objectives for this research. First, instruments that are appropriate for a
Primary school student participant group must be modified or created to assess attitudes
toward mathematics and the propensity to identify mathematics in the world.
Second, the psychometric properties of these instruments must be assessed.
Third, an intervention must be designed (Math Play: low-stress, low-risk, math-related games
with a high likelihood of success). Fourth, the effectiveness of this intervention must be
assessed.
Significance of this Study
Research significance. This study will contribute to the understanding of attitudes
toward mathematics for Primary school students and attitudes toward mathematics
measurement instruments for Primary school students as well as the propensity to
endorse words and concepts as being mathematics related. Further, this study will
determine if Math Play has any effect on mathematics affect (the emotions associated
with mathematics) and the propensity to endorse words and concepts as mathematics
related.
Practical significance. This study will contribute a possible technique for improving
attitudes toward mathematics as well as increasing awareness of where mathematics is in
the world. This study will also provide a possible technique for improving the
understanding of what mathematics is through improving the vocabulary of students with
relation to mathematics.

Chapter Two – Rasch Analysis
Rasch Analysis as a Measurement Instrument
Studies of mathematics attitude scale development have used a wide variety of
methods to investigate the properties of the instruments to be employed. Others who
have then used these instruments may or may not adapt the instruments for their target
samples. Often a re-evaluation of the properties of the instrument is not done when the
instrument is used with a new population. Rasch analysis is an approach to evaluate the
instruments. Rasch analysis promises sample free estimates of item difficulty and person
ability estimates free of the effects of the sample of items used to measure the person’s
characteristics (Bond & Fox, 2012; Linacre, 2012a,b,c,d, n.d.; Lochhead, 2009; Sebok,
2010).
What is Rasch Analysis?
Not all people are of equal ability and not all questions are of equal difficulty. For
example, on a multiplication assessment there could be the following two items: (A) Find
the product 2 × 4, and (B) Find the product 246 × 14. The first item (A) would be
considered easier than the second item (B). If two people in the class wrote
the test, the mathematics skill of a person who correctly
answered only the first question would be expected to be lower than the ability of a person who correctly
answered both questions. We can then assess each person’s ability by how many items
they got correct; few disagree with this approach, in which a higher score means more skill. What
happens if one takes the difficulty of the question or item into account when assessing
ability? If Person M got the first item correct and the second item wrong, and Person N
got the first item (the easier item) incorrect and the second item (the harder item)
correct: which examinee has a greater ability? Classical Test Theory, with its observed score
practices (Crocker & Algina, 2006), does not address this conundrum: a score of ‘1’ is a
score of ‘1’. The issue of the ability of Person N, who scores a difficult item correctly and
an easy item incorrectly, is not addressed even if it is noticed. Rasch analysis, developed
by Georg Rasch in Denmark (Rasch, 1993), takes both the difficulty of the question and
the ability of the respondent into consideration when creating measures for persons and
items (Linacre, 2012a). Other statistics (fit statistics) developed in the realization of these
Rasch models by Wright, Linacre and others (Huang, 2015; Linacre, 2012a, b, c, d, e) allow
for the assessment of aberrant behaviors such as getting easy items incorrect and difficult
items correct.
Item properties such as difficulty and ease are not confined to dichotomously
scored achievement items. Linacre introduces the concept of difficulty and ease using
items endorsed by participants indicating their liking of science activities, on a three-point
scale. The participants had to indicate “like”, “neutral”, or “dislike”, for items including
but not limited to: “watch a rat”, “go to zoo” and “talk w/friends about plants” (Linacre,
2012b, p. 3). The responses were then used to order the items in terms of difficulty. The
Rasch analysis of the responses indicated that the order of difficulty of these sample items, from easiest to most
difficult, was: “go to zoo”, “talk w/friends about plants”, then
“watch a rat” (Linacre, 2012b, p. 5). Other examples Linacre used were: it is easier to “hit
a single” than a “home run”, and “division” is more difficult than “addition” (Linacre,
2012b, p. 11).
The items included by Linacre in the science example were items a researcher
decided to include in the measure of “liking of science” (Linacre, 2012b, p. 3). Andrich
explained that psychometric researchers construct variables for use in measurement of
the concept of interest, for example how happy someone is (Andrich, 1988). This type of
measurement is in contrast to measuring how long a piece of wood is in metres; one
metre is always one metre. Rasch analysis is an instrument that creates a measurement
system comparable to the metre stick, where an increase of 1 metre is always the same
value. The increase of one unit in a Rasch analysis scale is always the same for the set of
data being analysed.
In general, Rasch models may be thought of as using examinee responses on a set
of items to produce estimates of examinee ability and item difficulty that maximize the
probability of these combinations of observed responses. Both the examinee ability
estimates and the item difficulty estimates are theoretically invariant of the specific
situation should the data sufficiently fit the Rasch model. Data that do not fit the Rasch
model are considered poor data: similar to outliers in classical measure theory. The idea
of the model fitting the data in Rasch Analysis is different from that of traditional
statistical analysis where the model is chosen to fit the data. This is explored and
discussed at length by Smith and Smith (2004), who describe the
differences between what they refer to as the Traditional Paradigm, where the model fits the
data, and the Rasch Paradigm, where the data fit the model. Chapters 7 and 8 of
Introduction to Rasch Measurement (Smith & Smith, 2004) are dedicated to this discussion. One of their
key points is that the ordering of items based on item difficulty should be a direct result of the
data, not a model.
Rasch Analysis includes a variety of models (dichotomous, rating scale, and partial
credit); only those used in this study will be discussed in further detail.
The following symbols will be used in the equations for Rasch measures:
$P_{ni}$ is the probability of person $n$ succeeding on item $i$,
$n$ is the person number,
$i$ is the item number,
$D_i$ is the estimated item difficulty,
$B_n$ is the estimated person ability.

Rasch dichotomous model. The dichotomous Rasch model with person ability $B_n$
and item difficulty $D_i$ is given by (Equation 1) (Linacre, 2012a):

$$\ln\left(\frac{P_{ni}}{1 - P_{ni}}\right) = B_n - D_i .$$

Linacre (2012c, p. 21) describes this as the “log-odds of a person $n$ succeeding on item $i$ =
Ability of person $n$ − Difficulty of item $i$”. The units used for the log-odds are “logits”.
According to Linacre (n.d.), “one logit is the distance along the line of the variable that
increased the odds of observing the event specified in the measurement model by a
factor of 2.718.., the value of “e”, the base of the “natural” or Napierian logarithms used
for the calculation of “log-” odds.” This process takes responses that are scored zero
(failure) or one (success), referred to as dichotomous items, and creates a linear interval
scale.
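Equation 1 can be illustrated numerically. The following is a minimal sketch, not part of the Winsteps software; the function names `rasch_probability` and `odds` are hypothetical and chosen only for this illustration. It solves Equation 1 for the probability of success and checks that a one-logit advantage multiplies the odds of success by e.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of success implied by Equation 1 (hypothetical helper).

    Solving ln(P / (1 - P)) = B - D for P gives the logistic function.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def odds(probability: float) -> float:
    """Odds of success, P / (1 - P)."""
    return probability / (1.0 - probability)


# A person whose ability equals the item difficulty succeeds half the time.
p_matched = rasch_probability(ability=0.0, difficulty=0.0)

# A person one logit above the item difficulty has odds larger by a factor of e.
p_one_logit = rasch_probability(ability=1.0, difficulty=0.0)
odds_ratio = odds(p_one_logit) / odds(p_matched)

print(round(p_matched, 3), round(p_one_logit, 3), round(odds_ratio, 3))
```

Because only the difference between ability and difficulty enters the function, the same probability results for any person and item that are the same distance apart on the logit scale.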
Figure 2 shows the probability of success against the difference between person
ability and item difficulty. Appendix D expands the explanation of the formation of the
graph and the effects of a probability of zero and a probability of 1 for $P_{ni}$ (Linacre, 2012a,
steps 109 to 114). Appendix D also expands on the development of the person and item
measures and shows the sigmoidal curve that is asymptotic as the logit difference scores
approach infinity and negative infinity.

Figure 2. A sample curve for probability of success versus logit difference ($B_n - D_i$) (Linacre,
2012a)
There are various computer software programs that may be used for Rasch
Analysis: Winsteps, RUMM2020, Facets, Quest and ConQuest are a few examples (Sick,
2009). The equations described here are those used by Winsteps. Winsteps uses several
iterations to estimate $B_n$ and $D_i$. Initially all persons and items have the same estimate
of zero. These estimates are then modified by the PROX (Normal Approximation)
estimation algorithm (Equation 2) (Linacre, 2012e):

$$B_n = \mu_n + \sqrt{1 + \frac{\sigma_n^2}{2.9}} \, \ln\left(\frac{R_n}{N_n - R_n}\right)$$

where $B_n$ is the current person ability estimate for person $n$, $\mu_n$ is the mean item
difficulty of the items responded to by person $n$, and $\sigma_n$ is the standard deviation of the
difficulties of the items responded to by person $n$. $R_n$ is the raw score for person $n$ and $N_n$ is the maximum
possible score on items responded to by person $n$.
Similarly, item difficulty is estimated as shown in Equation 3 (Linacre, 2012e):

$$D_i = \mu_i - \sqrt{1 + \frac{\sigma_i^2}{2.9}} \, \ln\left(\frac{R_i}{N_i - R_i}\right)$$

where $D_i$ is the current item difficulty estimate for item $i$, $\mu_i$ is the mean ability of the
persons who responded to item $i$, and $\sigma_i$ is the standard deviation of the abilities of the persons
who responded to the item. $R_i$ is the raw score for item $i$ and $N_i$ is the maximum possible
score on the item. This iteration process is repeated until the change in the sum of squared
residuals for the overall output for each person and item is less than 0.5 logits. Once this
PROX stage is complete the JMLE (Joint Maximum Likelihood Estimation) iterations begin.
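The person update in Equation 2 can be sketched in a few lines. This is an illustration under stated assumptions rather than the Winsteps implementation: the function name `prox_person_estimate` is hypothetical, only a single update for one person is shown, and extreme raw scores (zero or the maximum) are rejected because the logarithm is undefined for them.

```python
import math


def prox_person_estimate(mean_item_difficulty: float,
                         sd_item_difficulty: float,
                         raw_score: float,
                         max_score: float) -> float:
    """One PROX update of a person ability, following Equation 2.

    Hypothetical helper for illustration. The constant 2.9 is the value
    given in Equation 2.
    """
    if raw_score <= 0 or raw_score >= max_score:
        # ln(R / (N - R)) is undefined for extreme scores.
        raise ValueError("PROX needs a raw score strictly between 0 and the maximum")
    expansion = math.sqrt(1.0 + sd_item_difficulty ** 2 / 2.9)
    return mean_item_difficulty + expansion * math.log(raw_score / (max_score - raw_score))


# With all items at the starting estimate of zero (mean 0, SD 0), a person
# with 8 of 10 correct receives ln(8 / 2) logits.
first_pass = prox_person_estimate(0.0, 0.0, raw_score=8, max_score=10)

# Once the item difficulties are spread out (SD 1 logit here, an invented
# value), the same raw score maps to a larger ability estimate.
later_pass = prox_person_estimate(0.0, 1.0, raw_score=8, max_score=10)

print(round(first_pass, 3), round(later_pass, 3))
```

The square-root term acts as an expansion factor: the more widely the item difficulties are spread, the further a given raw score moves the person from the mean item difficulty.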

The JMLE estimation process iterates using Equation 4 (Linacre, 2012e):

$$y' = y + \frac{\sum(\text{observed} - \text{expected})}{\sum \text{cell variance}}$$

where the expected value of person A on item 1 is calculated by Equation 5 (Linacre, 2012e):

$$P\{A1 = 1\} = \frac{e^{(\text{Person A logit} - \text{Item 1 logit})}}{1 + e^{(\text{Person A logit} - \text{Item 1 logit})}},$$

and (Equation 6) (Linacre, 2012e):

$$\text{Cell variance} = \text{expected value} \times (1 - \text{expected value}).$$

Equations 4 to 6 are used to calculate both $B_n$ and $D_i$. The JMLE iterations are continued
until the residuals are close to zero. Specifically, since Rasch measures are calculated to
two decimal places, iterations cease once the residuals are close enough to zero that the
reported Rasch measure is not changed.
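The iteration in Equations 4 to 6 can be sketched for a single person. The sketch below is a simplification and not the Winsteps procedure: it holds the item difficulties fixed and updates only one person's ability, whereas JMLE alternates between persons and items. The function names and the stopping tolerance are assumptions made for this illustration.

```python
import math


def expected_score(ability: float, difficulty: float) -> float:
    """Expected value of a dichotomous response (Equation 5)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_person_ability(responses, difficulties, start=0.0,
                            tolerance=1e-4, max_iterations=100):
    """Repeat the Equation 4 update for one person with items held fixed.

    Hypothetical helper. Extreme scores (all 0 or all 1) have no finite
    estimate and are not handled here.
    """
    ability = start
    for _ in range(max_iterations):
        expected = [expected_score(ability, d) for d in difficulties]
        residual_sum = sum(x - e for x, e in zip(responses, expected))
        variance_sum = sum(e * (1.0 - e) for e in expected)  # Equation 6
        step = residual_sum / variance_sum                   # Equation 4
        ability += step
        if abs(step) < tolerance:
            break
    return ability


# Invented item difficulties, symmetric about zero.
item_difficulties = [-1.0, -0.5, 0.5, 1.0]
ability_two_correct = estimate_person_ability([1, 1, 0, 0], item_difficulties)
ability_three_correct = estimate_person_ability([1, 1, 1, 0], item_difficulties)
print(round(ability_two_correct, 2), round(ability_three_correct, 2))
```

At convergence the sum of the expected scores matches the observed raw score, which is why the residuals in the numerator of Equation 4 approach zero.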
Model Standard Error, or Model SE, is also reported in a Rasch Analysis. For the
dichotomous model, the Model Standard Error is (Equation 7) (Linacre,
2012e):

$$SE(B_n, D_i) = \frac{1}{\sqrt{\sum P_{ni}(1 - P_{ni})}} .$$
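A short sketch of Equation 7 follows. The function name `model_standard_error` is hypothetical, and the probabilities are invented values used only to show how the standard error behaves.

```python
import math


def model_standard_error(probabilities) -> float:
    """Model SE for a dichotomous measure (Equation 7).

    `probabilities` holds the modelled success probabilities P_ni for every
    response that contributes to the measure (hypothetical helper).
    """
    information = sum(p * (1.0 - p) for p in probabilities)
    return 1.0 / math.sqrt(information)


# Four well-targeted items (P = 0.5 each) give an SE of 1 logit.
se_four_items = model_standard_error([0.5] * 4)

# Sixteen such items halve the SE.
se_sixteen_items = model_standard_error([0.5] * 16)

# Poorly targeted items (P = 0.9) carry less information, so the SE is larger.
se_off_target = model_standard_error([0.9] * 16)

print(se_four_items, se_sixteen_items, round(se_off_target, 3))
```

The standard error therefore shrinks as more responses are collected and as items are better matched to the ability of the person being measured.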

Rasch-Andrich rating scale model. With the introduction of more than two
possible responses, the model changes slightly to accommodate them. David Andrich (1978a,
b, c) expanded on the Rasch Model to allow the analysis of Likert-type rating scale data.
The Rasch-Andrich rating scale model equation (Equation 8) (Linacre, 2012d) is:

$$\ln\left(\frac{P_{nij}}{P_{ni(j-1)}}\right) = B_n - D_i - F_j ,$$

where $j$ is a response on a Likert-type scale, $F_j$ is the step calibration or step difficulty and
all other variables are as described previously. The rating scale final values are reached
after several iterations using equations similar to Equations 2 through 6 (Linacre, 2012e).
This model “specifies the probability, $P_{nij}$, that a person $n$ of ability $B_n$ is observed
in category $j$ of a rating scale applied to item $i$ of difficulty $D_i$ as opposed to the
probability $P_{ni(j-1)}$ of being observed in category $(j-1)$,” (Linacre, 2012a, p. 1). Model
Standard Error for a Rasch-Andrich rating scale model is calculated using (Equation 9)
(Linacre, 2012e):

$$S.E. = \frac{1}{\sqrt{\sum_{n \text{ or } i}\left(\sum_{j=0}^{m} j^2 P_{nij} - \left(\sum_{j=0}^{m} j P_{nij}\right)^2\right)}}$$

Again, the model is linear and the possible responses are modelled into equal difficulty
difference intervals.
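The category probabilities implied by Equation 8 can be obtained by accumulating the adjacent-category log-odds. The sketch below is an illustration only: the function name `rating_scale_probabilities` is hypothetical, the threshold values are invented, and categories are assumed to be numbered from 0 to m.

```python
import math


def rating_scale_probabilities(ability: float, difficulty: float, thresholds):
    """Category probabilities P_ni0 .. P_nim under Equation 8.

    `thresholds` is the list [F_1, ..., F_m] of step calibrations
    (hypothetical helper). Category 0 is the reference category.
    """
    # Cumulative log-odds relative to category 0: sum of (B - D - F_k), k = 1..j.
    cumulative = [0.0]
    for step in thresholds:
        cumulative.append(cumulative[-1] + (ability - difficulty - step))
    weights = [math.exp(value) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


# A three-category scale with invented step calibrations of -1 and +1 logits.
probabilities = rating_scale_probabilities(ability=0.5, difficulty=0.0,
                                           thresholds=[-1.0, 1.0])
print([round(p, 3) for p in probabilities])
```

With a single threshold of zero the function reduces to the dichotomous model of Equation 1, which is one way to check that the two models agree.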
Rasch models and fit. Each Rasch Model is created for the data it is representing.
Once the Rasch measures are calculated, the persons and the items are examined for
“fit”. “Fit” refers to how well the data fit the model (Linacre, n.d.). If the fit for an item or
person is poor, the item or person is referred to as “misbehaving.” Fit is commonly
quantified in two ways, mean square and t (or standardized z). The mean square statistic
approximates a chi-square distribution. This statistic is divided by its degrees of freedom
to produce an expected value of ‘1’ (Bond & Fox, 2012).
Fit statistics for Rasch Analysis are part of what makes Rasch Analysis so effective.
An example that illustrates this effectiveness is the results of the 2002 Olympics. The
results from a judge in figure skating were questioned. Linacre (2002) and Looney (as
cited in Linacre, 2002) suggest that the patterns, indicated by fit statistics, in the judge’s
responses may have been detected had Rasch Analysis been employed.
Outfit. Outfit is sensitive to outliers. It is expressed as both an outfit mean square
and an associated t (or standardized z) value. Outfit mean square is calculated as the
“average of the standardized residuals, ($z_{ni}^2$)” (Bond & Fox, 2012, p. 285). This is an unweighted
average. As such, “unexpected responses far from a person’s or item’s measure” have
more of an effect on the outfit statistic (p. 285). The effects of variations in response
patterns are described in Appendix D. The outfit mean square equation (Equation 10) is:

$$\text{Outfit} = \frac{\sum z_{ni}^2}{N}$$

where $z_{ni}$ is the standardized residual of each person on each item (Bond & Fox, 2012, pp. 285-286). Specifically (Equation 11) (Linacre, 2012e):

$$z^2 = \frac{(\text{observed} - \text{expected})^2}{\sigma^2} .$$
Infit. Infit is sensitive to patterns in the responses. Infit mean square is a weighted
average. As such, there is more of an impact on the infit mean square for “unexpected
responses close to person or item’s measure” (Bond & Fox, 2012, p. 285). The equation for infit
mean square (Equation 12) is:

$$\text{Infit} = \frac{\sum z_{ni}^2 W_{ni}}{\sum W_{ni}}$$

where $W_{ni}$ is the variance for an individual across a given item (Bond & Fox, 2012, p. 286).
Appendix D includes a table that demonstrates the effects of various patterns on the
Outfit and Infit mean squares.
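Equations 10 to 12 can be compared on a small invented example. The sketch below assumes dichotomous responses, so that the variance of each response is the expected value times one minus the expected value (Equation 6). The function name `fit_mean_squares` and the response patterns are hypothetical.

```python
def fit_mean_squares(observed, expected):
    """Return (outfit, infit) mean squares for dichotomous responses.

    Hypothetical helper. `expected` holds the modelled probabilities of
    success, so each response variance W is expected * (1 - expected).
    """
    variances = [e * (1.0 - e) for e in expected]
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    z_squared = [r / w for r, w in zip(squared_residuals, variances)]  # Equation 11
    outfit = sum(z_squared) / len(z_squared)                           # Equation 10
    # Equation 12: the weights W cancel the denominators of z squared.
    infit = sum(squared_residuals) / sum(variances)
    return outfit, infit


# Modelled probabilities for five items, ordered from easy to hard.
expected = [0.9, 0.7, 0.5, 0.3, 0.1]

# A pattern close to what the model predicts.
orderly_outfit, orderly_infit = fit_mean_squares([1, 1, 1, 0, 0], expected)

# The easiest item is missed and the hardest item is answered correctly.
erratic_outfit, erratic_infit = fit_mean_squares([0, 1, 1, 0, 1], expected)

print(round(orderly_outfit, 2), round(orderly_infit, 2))
print(round(erratic_outfit, 2), round(erratic_infit, 2))
```

In the second pattern the two surprising responses are on items far from the person's measure, so the unweighted outfit rises more than the variance-weighted infit, which is the contrast the two definitions describe.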
Both infit and outfit are also described using a t (or a standardized z statistic)
(Bond & Fox, 2012). The use of t or z varies, but as the sample size, n, increases (n > 30)
the t-distribution approximates the z-distribution. Whether t or z is reported, this fit
statistic must be examined. In practice, the usual guideline is to examine t or z scores if
they exceed the absolute value of 2 (Bond & Fox, p. 286). This holds for both the infit and
outfit statistics. For consistency, z will be used.
Fit mean square, MS, values are greater than or equal to zero, with an expected
value of one. According to the Rasch website (Linacre, n.d.), mean square values less than
0.5 do not degrade the measure, values from 0.5 to 1.5 are ideal for the measure, and values from 1.5 to 2
do not add to or take away from the measure. Mean square values greater than 2 degrade
the measure but may be caused by as few as one result (Linacre, 2012e).

While Linacre (n.d.) and Bond and Fox (2012) recommend a magnitude of MS
approach, others, such as R. M. Smith (Review of Reviews of Bond and Fox, 2002), prefer a
statistical probability approach. Standardized fit statistics are based on a z-test, and
values outside of -2 to +2 (p < .05) are of greatest concern.
Items and persons with fit concerns must be examined. The exact values that are
of concern vary based on source and sample size (Table 1) (Bond & Fox, 2012, p. 240;
Linacre, 2012e).
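Table 1
Fit Statistics and Interpretations

| Bond and Fox (2012): Mean Square | Bond and Fox (2012): z | Linacre (2012e): Mean Square | Linacre (2012e): z | General Interpretation: Response Pattern | General Interpretation: Variation | General Interpretation: Interpretation | General Interpretation: Misfit Type |
|---|---|---|---|---|---|---|---|
| > 1.3 | > 2.0 | > 1.2 | > 2.0 | Too haphazard | Too much | Unpredictable | Underfit |
| < 0.75 | < -2.0 | < 0.8 | < -2.0 | Too determined | Too little | Guttman | Overfit |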

Table 1 summarizes fit guidelines from both Bond and Fox (2012) and Linacre (2012e). The table
shows the slight differences in the Mean Square values for underfit and overfit, but the z
values are the same for both. When z < -2 (overfit), the data are too predictable and there
may be variables influencing the response that are not being detected. Overfit is also
described as redundant, muted or cramped. Guttman refers to a response pattern that
does not have a transition area. Most responses to a dichotomous item will have a
region where the responses vary; for example, in 1111011010000 the middle four responses
are in the transition region. A Guttman response pattern would be 111111000000 (Bond
& Fox, 2012, p. 239). When the z score is between -1.9 and 1.9, inclusively, the data are
predictable. If z > 2.0 the data are considered underfit, too unpredictable, or noisy.
Underfit degrades the quality of the measure, whereas overfit makes the results appear better than they
may be (Bond & Fox, 2012). When investigating fit concerns, Linacre (2012e)
suggests examining outfit (outliers) before infit (inlying patterns) and the size of the
concern, as indicated by mean square, before the significance of the concern, as indicated
by z. This differs from Bond and Fox (2012), who recommend investigating infit
concerns before outfit concerns.
Fit guidelines may be affected by both sample size and the stakes around the data
collected. As sample size increases, the mean square values that indicate underfit
decrease (Smith, Schumacker & Bush, 1998). Table 2 (Linacre, 2012e) indicates the
changes in the suggested fit mean square statistics for infit and outfit for situations with
varying levels of investment.

Table 2
Reasonable Item Mean-Squares Ranges for Infit and Outfit

| Type of Test | Range |
|---|---|
| Multiple Choice (High stakes) | 0.8 – 1.2 |
| Multiple Choice (average stakes) | 0.7 – 1.3 |
| Rating Scale (Survey) | 0.6 – 1.4 |
| Clinical observation | 0.5 – 1.7 |
| Judged (agreement encouraged) | 0.4 – 1.2 |

To avoid any confusion, the guideline being used should be stated. Fit statistics
indicate where problems may occur and are examined for both the participants and the
items. Any participant or item outside the chosen fit guidelines should be examined.
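Screening against a stated guideline can be sketched as a simple lookup. The ranges below are those of Table 2; the dictionary keys, the function name `classify_mean_square`, and the returned labels are assumptions made for this illustration and are not part of any Rasch software.

```python
# Reasonable mean-square ranges from Table 2 (Linacre, 2012e).
REASONABLE_RANGES = {
    "multiple choice (high stakes)": (0.8, 1.2),
    "multiple choice (average stakes)": (0.7, 1.3),
    "rating scale (survey)": (0.6, 1.4),
    "clinical observation": (0.5, 1.7),
    "judged (agreement encouraged)": (0.4, 1.2),
}


def classify_mean_square(mean_square: float, test_type: str) -> str:
    """Label an infit or outfit mean square against the chosen guideline.

    Hypothetical helper: values above the range are labelled underfit,
    values below the range are labelled overfit.
    """
    low, high = REASONABLE_RANGES[test_type]
    if mean_square > high:
        return "underfit"
    if mean_square < low:
        return "overfit"
    return "within range"


# The same mean square can be flagged under one guideline and not another.
survey_label = classify_mean_square(1.5, "rating scale (survey)")
clinical_label = classify_mean_square(1.5, "clinical observation")
print(survey_label, clinical_label)
```

Making the guideline an explicit argument keeps the choice of range visible, which is the point of stating the guideline being used.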
With the use of Rasch analysis techniques, the result is a linear measure
that fits the data and takes the ability of participants together with item difficulty to create
measures that are invariant across the population. Granger (2010, p. 7) summarized it by
saying, “Rasch analysis provides an internally valid measure that, when developed from
an appropriate sample, is independent of the particular sample to which it is applied,
meaning that the findings for the sample extrapolate to its population.” The idea of
independence of the sample is further discussed in Smith and Smith
(2004). These properties of Rasch analysis make it an excellent choice for both major goals
of this study.

Chapter Three – Methods
Procedure
Between December 2014 and June 2015, 12 grade 2 and/or grade 3 classrooms
participated in a Math Play intervention designed to increase awareness of mathematics
in the world and improve attitude toward mathematics. The sample came from grade 2
and/or grade 3 classes identified with the help of Cindy Heitman, District Principal,
Curriculum and Instruction, School District 57. Split classes were included because
excluding them would have been a limiting factor: due to school sizes, several schools do not have
classes with only grade 2 or grade 3 students, so requiring classes to be
exclusively grade 2 or grade 3 would reduce the number of classes that could participate.
Approval Process
Tentative approval for the Math Play study was granted by Ms. Cindy Heitman,
District Principal, Curriculum and Instruction, School District 57. Final approval from
School District 57 was conditional on UNBC Research Ethics Board (REB) approval and the
supervising committee approval. These approvals were all granted: forms are located in
Appendix A. Approval to work in each school was obtained from the school principals.
Once participating schools were confirmed, the individual teachers were contacted for
approval. Finally, the parents of the children and the children in the class were contacted
to seek consent. Information letters and consent forms were provided to all parties
involved. Copies of sample documents are located in Appendix A. Participation in the
assessments was voluntary and could be withdrawn at any time by any party. The
children were also given the opportunity to withdraw any time during the assessment
component. For the classes assigned to the treatment group, the entire class participated
in the Math Play sessions. No data were collected from the children during these sessions.
Group Placement
Between February and the end of May half of the classes were involved in Math
Play intervention and assessments. For the same time frame, the other classes, the
control group, only participated in attitude assessments. To ensure that all participants (the
elementary students, their teachers and aides) were given access to any benefits
associated with Math Play, Math Play sessions took place in control group classes after
the final attitude assessments were complete, in late May and June.
Placement into treatment or control group was not completely random. It was
based on which group the teacher requested (when the requests could be
accommodated) and matching the class. For example, when matching classes, if there
were two grade 3 classes in a single school, one was placed in the control group and the
other was in the treatment group. If there was only one grade 3 class from a specific
school the class was paired with a grade 3 class from another school, one in treatment
and one in the control group. The only class that was not paired that way was one grade
3/4 split class where the grade 3 students took part in the study. This class was paired
with a grade 3 class. There was one class that consisted of two individual classes merged
into one with two teachers working in the classroom. This merged class was part of the
treatment group.

Math Play Facilitators
Students from the University of Northern British Columbia’s MATH 190 (Math for
Elementary Educators) class facilitated the Math Play sessions in the grade 2/3
classrooms. To avoid confusion, for the purpose of this study facilitator refers to MATH
190 student volunteers and teacher refers to the grade 2/3 classroom teacher. There
were two facilitators per class, with one exception. The class that was two classes merged
into one had four facilitators due to the doubled class size. Participation as facilitators in
the Math Play study was part of the MATH 190 course work but an alternative was
provided if MATH 190 students did not want to or were not able to participate. All the
students from MATH 190 who wanted to take part were able to, with one exception. This
student’s availability did not match with any of the times that the classroom teachers had
provided. The total number of volunteer facilitators was 14.
Before facilitators entered the classrooms, they obtained criminal record checks.
After obtaining criminal record checks and prior to the intervention, the facilitators visited
their treatment group class for one hour a week for two weeks. This was to mitigate a
possible Hawthorne effect and familiarize the facilitators and the classroom teachers with
each other (Oswald, Sherratt & Smith, 2014). The researcher was present to observe all
Math Play sessions and the pre-intervention visits. For the intervention, facilitator visits
lasted for one hour, and occurred once per week for a total of five weeks (5 hours). If a
regular visit was scheduled for a holiday or non-instructional day the Math Play session
was rescheduled. Time was allotted for training of and feedback from the facilitators.

Initially, facilitators were going to be placed in the classes based on the results of a
mathematics attitude assessment, which the facilitators took before the study began
(Tapia & Marsh, 2004) but due to scheduling restrictions that was not feasible. Instead,
the placement depended on matching the times the classroom teachers had available and
the time the facilitators were available.
Facilitators were responsible for transportation to and from the Math Play
sessions. They were also responsible for reviewing and understanding the instructions
provided on session facilitation (Appendix C). Short overview meetings to review
instructions and finalize the plan for each session were held immediately prior to each
Math Play session. Feedback was gathered after each Math Play session. This allowed for
immediate problem resolutions if any arose.
Timeline
Recruitment of classes began December 2014. The timeline was restricted by the
semester dates of Winter 2015. Recruitment of facilitators was not able to begin until the
Winter 2015 semester began. Normally School District 57’s Spring Break and UNBC’s
Reading Break do not coincide, but in the winter of 2015 the two breaks coincided: February 16
to March 1, 2015.
• December 2014 to January 2015: Recruit classes to participate in the study (12 classrooms in total)
• January 2015: Facilitators were recruited, obtained criminal record checks, completed self-attitude assessments using the ATMI and trained for the intervention; classroom teachers handed out and collected consent forms
• February 2 – 15, 2015: Attitude assessment in the classes where Math Play will occur; facilitators perform classroom support for an hour a week
• February 16 – March 1, 2015: Spring Break for School District 57/Reading Break for UNBC
• March 2 – April 2, 2015: Facilitators lead Math Play sessions in the class for one hour a week
• April 7 – 10, 2015: Attitude assessment in all classes
• End of May 2015: Attitude assessment in all classes
• June 2015: Math Play visits in control group classes

Assessments
The classroom teacher administered the attitude assessments. This was done as
indicated in the timeline. The exact day the assessments took place in each class was
decided by the teacher, based on their schedule, as long as it was in the indicated weeks.
The Assessment Instruments
Difficulty Associating Words with Math (DAWM) assessment. Difficulty
Associating Words with Mathematics (DAWM) was designed to examine the propensity
of students to indicate they associate words and/or concepts with mathematics. The
DAWM assessment was inspired by the ideas in work by Stodolsky et al. (1991), who
compiled a list of words students used while discussing mathematics. The DAWM was
then created: a list of words and/or concepts that may be identified as mathematics
related (Figure 2). The DAWM was designed to assess a student’s propensity to indicate
words or concepts as being related to mathematics.

Figure 2. A sample of the DAWM assessment.
Math Attitude Assessment. Attitudes were measured using an assessment
instrument modelled after the ATMI by Martha Tapia and George E. Marsh II (2004)
(Appendix B). After conferring with Dr. Cindy Hardy and Ms. Cindy Heitman, School
District 57 District Principal, Curriculum and Instruction, it was decided that 10 to 15
questions would be better suited to the age group than the 40 questions in the original
ATMI. Quinn and Jadav (1987) used an assessment with 11 questions on a group of similar
ages. Dowker et al. (2012) used a total of 28 questions. To meet the requirements set out
by School District 57, together with the lengths of previous assessments, the MAA
assessment length was set at 12 statements. Consistent with the work by Tapia and
Marsh (2004), the MAA contained statements based on the following factors:
self-confidence (SC), value (V), enjoyment (E) and motivation (M). Due to the similarity of the
target age group, the assessment language was similar in complexity to that used by
Levine (1972), as the study by Levine was designed for grade 3 students.
Math Attitude Assessment (MAA)
 1. SC  Math is easy.
 2. SC  I know I can get math questions right.
 3. SC  I find math hard.
 4. V   Math is useful.
 5. V   I can think of ways to use math.
 6. V   Math is useless.
 7. E   Math makes me feel happy.
 8. E   I am sad when I have to do math.
 9. E   Math makes me scared.
10. M   I want to do math next year.
11. M   When I grow up I want a job that uses math.
12. M   Next year I want to stay away from math.

Figure 3. The statements from the MAA ordered by factor.
There were three response categories: Yes (agree), Sometimes (neutral), or No
(disagree). The students were told to circle the response that best described how they felt

(Appendix C). The assessment statements were read to the students by their classroom
teachers.
Basic background information was collected: age, grade, and gender. Each
student’s name was assigned a random number for identification. This was done to
ensure that the students’ results were kept confidential.
Study Design
This was a quasi-experimental pre-post treatment design. The lack of random assignment
characterizes it as a quasi-experimental design (Hurlburt, 2006). Schools and teachers
were self-selecting. Classes had to be kept intact, so participants were not randomly
assigned to the treatment or control group. This meant that if there were variables that
caused differences between the groups, they may have affected the results. The control
group should have matured at a similar rate while being exposed to comparable curricular
content, helping to reduce the effect of the non-random placement of the participants
(Creswell, 2014; Hurlburt, 2006; Pagano, 1998).
Each classroom teacher knew that their class was part of the treatment group
and this may have affected topics discussed in class or ways that they approached topics.
The parents knew if their child was part of the treatment group and this too may have
affected home behaviour. The participants in the treatment group may have known they
were part of the treatment group. Instructions were given that requested that the
connection between Math Play and the assessments not be made for the students, but
how consistently these instructions were followed cannot be confirmed. Participant
awareness could contribute to the possibility of a Hawthorne effect. To mitigate the
Hawthorne effect, pre-visits took place. A study by Oswald et al. (2014) suggested six
steps to help reduce the Hawthorne effect and found that having the participants in the
study have a good relationship with the researchers and feel comfortable
were the most important steps in Hawthorne effect mitigation. The pre-visits occurred to
allow relationships between the facilitators and observer and the student participants to
develop.
Intervention Group
Classroom visits pre-Math Play. During the two weeks from February 2-15 the
facilitators were in the classroom to provide support. They were there for one hour a
week for two weeks. The amount of time allotted to these class visits was dictated by the
restrictions on the timeline. The teacher instructed the facilitators on their role during the
visits. Most pre-intervention visits consisted of the facilitators listening to a lesson then
circulating to support students on their work for that lesson. In one class visit the
facilitators took part in an outdoor school play session.
Math Play sessions. Before each Math Play session, written instructions that
included “Before you start you will need,” “The goal of this game,” “How to play,” “What
to do to teach the game,” “Ways to modify the game,” “Ways to correct without it feeling
like criticism,” “Avoid statements like…” and “Ask,” were distributed to the facilitators.
Facilitators were provided with instructional videos (see Appendix C for a complete list of
games and instructions).

During the Math Play sessions, from March 2, 2015 through April 2, 2015, teachers
were asked to be present to supervise the classroom (student behaviour), but did not
participate directly in the Math Play sessions, with a few exceptions. If the class was not
understanding the instructions or if a facilitator was absent the teacher supported the
facilitator(s). The classroom teachers were given specific instructions on the language to
be used (positive) and the instructions for the activities. All supplies for Math Play, other
than pencils, were brought in by the facilitators.
Classroom teachers were instructed to not bring any of the games used during
Math Play sessions into the class while the research was taking place. The teachers were
asked to teach math as they normally would.
Each Math Play session involved three games or activities, with one exception: the
day the facilitators taught the participants “Magic Number” they had an extra game.
These games and activities were a combination of paper and pencil games and
store-bought games. Each session involved at least one of each type (paper and pencil,
and store-bought). Each game was used only once in the five-week intervention. The Math Play
sessions started with the entire class working on a game together facilitated by the
volunteers. The class was then given the opportunity to try the game on their own with
the facilitators circulating to help. Group work was encouraged.
After the initial activity, the class was split in two groups, with each facilitator
presenting and supporting a new game. The facilitators then switched groups. This
method resulted in approximately 20 minutes per game.
The Games
The games were selected to not directly overlap with grade 2 and 3 curriculum.
This was done to avoid curriculum-based games being presented in one class but not in
others. Most of the games were selected for their focus on logic and problem-solving
strategies that must be employed to complete the game. A complete list of the games
used and the instructions for the games can be found in Appendix C.
During the Math Play sessions the students were encouraged to find the
mathematics in each game. The facilitators tried to help the students identify it. Some
facilitators had more success with this than others did.
The facilitators were instructed to only use positive language. Success on a game
was not based on getting the “right answer”; instead the focus was on attempting to
complete the game, making progress toward completion, and/or figuring out what
success would look like for that game. Phrases like, “Your solution does not look quite like
what I got, can you tell me what you did?” or “Your first step looks great, can you tell me
how you did that and use that same strategy to complete the next step?” were
encouraged. Facilitators were instructed to avoid phrases like, “That is wrong,” or “You
need to correct that.” This approach was used to try to avoid the feeling of getting the
wrong answer, which was identified as being related to a negative attitude toward
mathematics (Stodolsky et al., 1991). Focus was placed on taking part and engaging with
the activities, not on getting the right answer.

Paper and pencil games. Paper and pencil games varied but often were based on
grid style games. Some students chose to work independently, and others worked in
groups. A focus was allowing the students to work in a way they were comfortable with.
They were not forced into working groups. An example of a paper and pencil game
used follows. Several of the games came from a collection compiled by Susan Milner
(Milner, n.d.).
Hidato. Hidato is a grid game. The grids were made of squares connected to each
other. The goal is to place a natural number in each square in the grid so that consecutive
numbers are able to be connected by a horizontal, diagonal or vertical chain. Some of the
squares were already filled.
This game was presented on a large version made for demonstration purposes.
The facilitators explained the game and worked through several examples with the
students and then the students were given paper versions of the game and encouraged to
work through them. A variety of difficulty levels were available to accommodate the
difference in abilities in the classes. Students who were not able to place the numbers
in the correct places were encouraged to figure out which numbers were missing
and place those numbers in the grid.
Boxed games. A variety of store-bought games were brought in. Like the paper
and pencil games, these games were worked on in groups and independently. An
example of a boxed or store-bought game follows. A full list of games and instructions is
in Appendix C.
Q-bitz. Q-bitz is a pattern matching game. A card that shows a pattern is selected.
There are 16 six-sided dice. Each die has a pattern on each side. The 16 dice are then
oriented in a four by four layout to match the pattern on the card. The variety in the
difficulty level of the cards was appropriate for the group and enabled pattern matching
success for most of the students. If students made their own patterns, instead of trying to
match the pattern cards, they were encouraged to describe the pattern they made.
Collected Data
The information in the assessments was coded and entered in a secure file to
ensure privacy. Information collected regarding the students was handled by the
classroom teacher and researcher. When the data were entered they were separated from
any identifiers of the participants except a number, which was recorded on the original
assessments and in a separate file.
Ethical Considerations
Research Ethics Board approval was obtained. The original application was
resubmitted to accommodate for an increase in the number of classes involved in the
study.
Ethics is always of concern. To minimize the concern of shared confidential
information the volunteers did not at any point handle or see the personal information of
the participants. The teacher handled the consent forms and the consent forms were
then secured and given to the researcher. The facilitators signed confidentiality forms to
ensure they were aware of and would respect confidentiality concerns.
Chapter Four Results
Two measures were used to examine the attitude and effect of Math Play with
163 grade 2 and 3 students in schools in Prince George in School District 57: the
Difficulty Associating Words with Mathematics (DAWM) and the Math Attitude
Assessment (MAA). The data were analysed with Rasch Analysis using Winsteps (Version
3.92.1) followed by the Rasch measures being used in traditional statistical analysis
techniques employing SPSS (Version 24). The data were also sorted, organized and
graphed using Excel 2016. The analysis was done on the DAWM data followed by the
MAA data.
The analyses took place in six stages. First, the participants’ responses were
analysed using Rasch Analysis. Second, Rasch Analysis techniques were employed to
examine the items in the assessments. Third, commonly used Rasch Analysis techniques
were used to examine scale. Fourth, each participant’s measure was determined for each
of the three assessments, using anchored item difficulty. Fifth, the adequacy of the data
for use in traditional statistical analysis was explored using SPSS. Finally, data were
analysed using traditional statistics methods in SPSS (Field, 2011). In the first four stages
of analyses Rasch analysis techniques were employed for data preparation, then
traditional statistical analysis techniques were used in the investigation of the treatment
effect related to the research questions of this thesis.

Difficulty Associating Words with Mathematics
Rasch Analysis DAWM. Rasch Analysis of the DAWM assessment took place in
four stages. First, participants’ responses were analysed, to determine if any participant’s
results warranted being removed from the study. Second, items were analysed, in the
DAWM assessment to determine if all items should be retained. Third, the scale in the
assessment instrument was examined to determine if it was appropriate for the
participants. Fourth, anchored values were determined for further analysis.
Rasch Analysis of participant results DAWM February treatment and control. The
responses from all 163 participants were analyzed in Winsteps to examine the
participants for misbehaving responses. Displayed in Table 3 is a selection of high, low
and midrange endorsement rates for the 19 words or concepts, referred to as items, used
in the DAWM. Each row corresponds to one student. Gaps indicate where entries are not
reported due to space considerations in this table.
The “Score” in Table 3 indicates the actual number of words or items the
participant indicated as being related to mathematics. Each row represents the results
from a single participant. The mean score was 6.3 with a standard deviation of 3.3. The
second column reports “% Endorse”. It is the percent of the 19 items which the
participant indicated as being associated with mathematics. The scores are converted to
Logit Measures (Equations 2, 4, 5 & 6). The “Logit Measure” column represents the
student’s propensity to associate words with mathematics. Figure 5 also shows individual
participants’ logit measures on a map known as a “Wright map”. The larger the number,
the higher the participant’s propensity was to endorse words as being associated with
mathematics. For example, a participant with a logit measure of -11.92 (in the last row of
Table 3) is least likely to endorse words as being associated with mathematics (they
selected two of the 19 possible choices), whereas a participant who endorsed all 19
words had a logit measure of -4.60.
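As a rough illustration of what a logit measure expresses, the dichotomous Rasch model gives the probability of endorsing an item as a function of the difference between the person measure and the item difficulty. The sketch below is not the Winsteps estimation procedure used in this study; the function name and the example values are assumptions chosen only for illustration.

```python
import math

def endorsement_probability(person_measure, item_difficulty):
    """Dichotomous Rasch model: P(endorse) = exp(B - D) / (1 + exp(B - D)),
    where B is the person measure and D the item difficulty, both in logits."""
    return 1.0 / (1.0 + math.exp(-(person_measure - item_difficulty)))

# Hypothetical values, in logits.
print(endorsement_probability(0.0, 0.0))   # person and item are matched
print(endorsement_probability(-1.0, 1.0))  # item is two logits "harder" than the person
```

When the person measure equals the item difficulty the modelled probability is one half, and it falls as the item becomes harder to endorse relative to the person.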

Table 3
Excerpt DAWM Participant Measures

SCORE  % ENDORSE  LOGIT MEASURE  MODEL SE  INFIT MS  INFIT z  OUTFIT MS  OUTFIT z   CORR
   19      100.0          -4.60      0.21      0.25     -0.6       0.53      -0.6    .00
   19      100.0          -4.60      0.21      0.25     -0.6       0.53      -0.6    .00
   19      100.0          -4.60      0.21      0.25     -0.6       0.53      -0.6    .00
   18       94.7          -4.64      0.24      0.22     -0.4       0.48      -0.7    .25
   15       80.0          -4.83      0.39      0.38     -0.1       0.62      -0.2    .42
   15       80.0          -4.83      0.39      0.26     -0.3       0.41      -0.5    .50
   14       73.7          -5.02      0.52      0.60      0.1       0.55       0.1    .50

    9       47.4          -6.75      0.62      1.51      1.6       1.59       0.8    .50
    9       47.4          -6.75      0.62      1.46      1.5       1.25       0.6    .54
    8       42.1          -7.14      0.66      0.75     -0.7       0.43      -0.2    .75

    4       21.1          -9.51      1.03      3.09      2.0       3.11       1.5    .48

    3       15.8         -10.68      1.10      0.22     -1.4       0.06      -0.8    .86
    3       15.8         -10.68      1.10      0.22     -1.4       0.06      -0.8    .86
    3       15.8         -10.68      1.10      0.22     -1.4       0.06      -0.8    .86
    2       10.5         -11.92      1.08      0.91     -0.1       0.17      -0.5    .68

The “Model SE”, or Model Standard Error is a measure of the “best case” error
(Equations 7 & 9). The general trend is that as the score decreases, and therefore the
Logit Measure decreases, the Model SE increases.
The grey row in Table 3 is an example of a participant with an infit MS of 1.51, which
neither helps nor impedes the measure, and an infit z of 1.6, which is between -2 and +2
and indicates the data were predictable. There are two values associated with infit and outfit
statistics, the mean square, “MS” and standardized fit statistic, “z”. The fit statistic is a
mean square value made of a ratio of an approximate Chi-square statistic divided by its
degrees of freedom (Equations 10 & 12). The expected value is ‘1’ but values range from
zero to infinity. Infit is sensitive to patterns and outfit is sensitive to outliers. The Mean
Square indicates the magnitude of the randomness and the z is an indicator of the
probability of fit of the model. Mean Square values were examined and given more
weight when identifying areas of concern than the z statistic. Giving the Mean Square
values more attention is suggested by Linacre (2012b). An entry with a mean square
greater than ‘2’ is considered to possibly degrade the measure but this value can be
caused by as few as one observation. At a p <.05, Mean Square values between 1.5 and
2.0 do not impede or help the measure. Mean square values between 0.5 and 1.5
contribute the most to the measure. Values less than 0.5 do not degrade the measure but
are considered an indication of a lack of new information and may cause high reliability
and separation coefficients. The z statistic is affected by large sample size, which we have,
and large misfit values for z may not be as much of a concern as they would be if there
was a small sample size. Mean Square concerns are addressed before z concerns.
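The difference between infit and outfit can be made concrete with a small sketch. It uses the standard definitions (outfit as the unweighted mean of squared standardized residuals, infit as the information-weighted mean square) rather than the thesis's own Equations 10 and 12, which are not reproduced here. The function names, item difficulties and response patterns are hypothetical.

```python
import math

def rasch_probability(person_measure, item_difficulty):
    """Modelled probability of endorsement under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(person_measure - item_difficulty)))

def fit_mean_squares(responses, person_measure, item_difficulties):
    """Return (infit_ms, outfit_ms) for one person's dichotomous responses."""
    squared_residuals = []
    variances = []
    for x, d in zip(responses, item_difficulties):
        p = rasch_probability(person_measure, d)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: plain average of squared standardized residuals (outlier-sensitive).
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: residuals weighted by model variance (sensitive to response patterns
    # on items near the person's measure).
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit

# Hypothetical item difficulties and two response patterns for a person at 0 logits.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(fit_mean_squares([1, 1, 1, 0, 0], 0.0, difficulties))  # orderly pattern
print(fit_mean_squares([0, 1, 1, 0, 1], 0.0, difficulties))  # surprising pattern
```

The orderly pattern gives mean squares below 1, while the pattern with unexpected responses on the easiest and hardest items gives values above 1, with outfit inflated more than infit.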
Any participants whose MS or z fell outside of the indicated ranges were further
examined to determine if it was advisable to remove their data from the study. No
participants were removed because even though their values did fall outside of the ideal
ranges the responses were recorded properly and there was no indication that they were
not a reflection of the participant’s propensity to indicate a word as associated with
mathematics.
The last column indicates the correlation between the participant results and the
ability measure, referred to as a point-measure correlation. The participants who
endorsed all 19 words had a correlation of zero indicating no relationship between the
participant’s ability and the measure. Except for the three participants who endorsed all
19 words, all participants showed a correlation greater than zero.
Of the original 163 participants, 135 wrote all three assessments (February, April
and May), an attrition rate of 17.1%. Of the 135 participants in the DAWM data, 66
self-identified as male and 69 self-identified as female. The mean age of the 163 students was
7.89 with a standard deviation of 0.63. The minimum age was 7 and the maximum age
was 10. Of the 163 who completed the original assessment, only 4 students did not fill in
an age. With the exception of the participants who did not write all three assessments, all
participants were retained.
Rasch Analysis of the items for the DAWM. Difficulty Associating Words with
Mathematics (DAWM) for the entire group in February was used to assess the DAWM as
an instrument and to create an anchor for item values for future analysis. The items
(words) showed an item reliability of .98, indicating enough items of varying difficulty
were in the assessment. The Item Real Separation was 7.51, which can be
interpreted as the 19 items being separable into 7 statistically distinct endorsement
levels. The collection of words showed a range of difficulties of endorsement from -7.71
to 3.22. Individual item summary statistics are given in Table 4. Person Real
Separation of 1.68 and Person Reliability of .74 were identified. The Person Real
Separation of 1.68 suggests that the assessment may not be quite sensitive enough to
separate people of high and low abilities (any score < 2) but is close to the guideline of 2.
The assessment may benefit from more items. The assessment is sensitive enough to identify
treatment. Person Reliability is analogous to the traditional “test” reliability often
reported using Cronbach’s alpha. Person Reliability is close to the .8 guideline for low
stakes standardized exams and exceeds the .7 recommended for classroom exams (Wells
& Wollack, 2003). This Person Reliability indicates good internal consistency. The logit
measures of Items 7 and 12, “maps” and “working on maps” respectively, are an extra
indication of the participants’ consistency in their responses.
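The separation and reliability values reported above are two expressions of the same quantity. A minimal sketch, assuming the usual Rasch relationship R = G² / (1 + G²) between reliability R and separation G, reproduces the reported reliabilities from the reported separations. The function name is illustrative only.

```python
def reliability_from_separation(separation):
    """Reliability R = G^2 / (1 + G^2), where G is the separation index."""
    return separation ** 2 / (1.0 + separation ** 2)

print(round(reliability_from_separation(1.68), 2))  # person separation 1.68 -> 0.74
print(round(reliability_from_separation(7.51), 2))  # item separation 7.51 -> 0.98
```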
Table 4 shows a range from the easiest word to associate with mathematics (the word most
likely to be associated with mathematics), “adding”, with a logit measure of -7.71 (at the
bottom of Table 4), to the most difficult word(s) for the students to associate with
mathematics (least likely to be associated with mathematics), “walking a dog”, with
a logit measure of 3.22 (at the top of Table 4). The standard error ranges between 0.2
and 1.06. The mean of the item logit measures is zero, as is standard practice for Rasch
Analysis.
Frequency of word selections for the original 163 participants ranged from a
maximum endorsement of 162 for “adding”, 159 for “subtracting”, and 158 participants
selecting “counting”, down to 11 selecting “singing” and “playing tag” and a minimum of
10 selecting “walking a dog” (Table 2).
Table 4
Item Characteristics for the Difficulty Associating Words with Mathematics DAWM

ITEM  WORD(S)              SCORE  % ENDORSE  LOGIT MEASURE  MODEL SE  INFIT MS  INFIT z  OUTFIT MS  OUTFIT z  CORR.
   4  walking a dog           10        6.1           3.22      0.43      1.10      0.4       0.45      -0.8    .49
   5  singing                 11        6.7           3.05      0.40      0.78     -0.6       0.47      -0.8    .54
  14  playing tag             11        6.7           3.05      0.40      1.11      0.5       0.96       0.2    .44
  13  watching a movie        12        7.4           2.89      0.38      0.10      0.1       0.59      -0.6    .49
  18  painting                13        8.0           2.75      0.37      1.02      0.2       0.72      -0.3    .48
   9  cleaning your room      16        9.8           2.40      0.33      0.80     -0.8       0.39      -1.3    .56
   8  cooking                 31       19.0           1.25      0.24      0.86     -1.0       0.66      -0.8    .56
   3  doing a maze            32       19.6           1.20      0.24      0.84     -1.1       0.73      -0.6    .56
  19  playing board games     38       23.3           0.88      0.22      0.92     -0.7       0.93      -0.1    .53
  10  puzzles                 39       23.9           0.83      0.22      0.91     -0.8       0.77      -0.6    .55
   7  maps                    45       27.6           0.55      0.21      0.95     -0.4       0.69      -1.0    .55
  12  working on maps         49       30.8           0.38      0.21      0.93     -0.6       0.69      -1.1    .56
  17  studying people         58       35.6           0.02      0.20      1.25      2.6       1.12       0.5    .44
  16  playing cards           61       37.4          -0.10      0.20      1.13      1.5       1.04       0.2    .48
  15  reading                 79       48.5          -0.76      0.19      0.90     -1.2       0.83      -0.7    .57
   2  writing                112       68.7          -2.04      0.21      1.25      2.2       2.28       3.9    .38
  11  counting               158       96.9          -5.82      0.54      0.93      0.0       1.82       1.1    .19
   6  subtracting            159       97.5          -6.04      0.56      1.53      1.2       3.22       1.9    .04
   1  adding                 162       99.4          -7.71      1.06      0.56     -0.3       0.02      -2.6    .24

From Table 4, infit ranges from 0.56 to 1.53 with z scores from -1.2 to 2.6. Outfits
range from 0.02 to 3.22 with z scores from -2.6 to 3.9. This indicates only four items of
possible concern:
• Item 17, “studying people”, infit z = 2.6
• Item 2, “writing”, infit z = 2.2, outfit MS = 2.28 and z = 3.9
• Item 6, “subtracting”, outfit MS = 3.22
• Item 1, “adding”, outfit z = -2.6.
For the items with fit concerns:
• Item 17, “studying people”, was kept because infit and outfit MS
  are considered before z and the infit and outfit MS values were within the 0 to 2
  range (Table 4).
• Item 2, “writing”, showed higher values for outfit MS (2.28) and z score (3.9),
  indicating random responses or outliers (an indication of a few students not
  selecting writing even though most did), but was kept in the assessment (Table 4).
  Infit MS is considered before z and infit MS is less than +2.
• Item 6, “subtracting”, showed a high outfit value of 3.22 with z = 1.9. An outfit of 3.22
  indicates a few random responses or outliers (an indication of a few students not
  selecting subtracting even though most did with relative ease), which did not
  justify removal (Table 4).
• Item 1, “adding”, does not add information to the test (infit of 0.56 and outfit of
  0.02) but was kept because it does not take away from the information of the test.
  The lack of variation is expected. Low fit scores are a result of the ease of
  recognition of the item (Table 4).

The data were examined with the guideline that an MS > 1.3 is considered underfit
(unpredictable, noisy, or having unmodeled noise) and an MS < 0.7 is considered overfit
(too predictable, redundant, muted, or cramped) (Bond & Fox, 2012). Underfit degrades
the quality, whereas overfit makes the results appear better than they may be. From
Table 4, Items 1, 2, 11, and 6 demonstrate underfit and so are unpredictable. Items 4, 5,
13, 18, 9, 8, 3, 7, 12 and 1 show overfit so are behaving predictably and therefore
redundantly. These items were retained. Overfit and underfit items do not need to be
removed from the assessment but need to be examined.
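The guideline can be written as a simple rule. The sketch below is only an illustration of the 0.7 and 1.3 thresholds quoted above; the function name is hypothetical, and the example values are outfit MS entries taken from Table 4.

```python
def classify_fit(mean_square, underfit_above=1.3, overfit_below=0.7):
    """Label a mean-square fit value using the 0.7 / 1.3 guideline."""
    if mean_square > underfit_above:
        return "underfit"   # noisier than the model expects
    if mean_square < overfit_below:
        return "overfit"    # more predictable than the model expects
    return "acceptable"

# Outfit MS values for three items from Table 4.
for word, outfit_ms in [("subtracting", 3.22), ("reading", 0.83), ("adding", 0.02)]:
    print(word, classify_fit(outfit_ms))
```

A flagged item is a prompt for closer examination rather than an automatic removal, which is how the guideline was applied here.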
The last column in Table 4 indicated the Pearson-Point Correlation of the item
with the assessment. Correlations ranged from .04 to .57. The words easily endorsed by
the participants (counting, subtracting and adding) had the lowest correlations (.19, .04
and .24 respectively). Sixteen of the nineteen items in the DAWM assessment were
indicated to have a moderate positive correlation (.38 to .57). No items showed a
negative correlation. There is only one item with a very low correlation (r = .04), Item 6.
Having one item with a very low correlation is not a concern.
The common mathematics related words, “counting”, “subtracting” and
“adding”, were included in the assessment so the young participants had words they
would more commonly associate with mathematics. As can be seen by the scores in Table
3 there was a drop in endorsement of words shown above those three words, indicating
that several students may have only endorsed those three words.
The values from Table 4 can be represented as the bubbles shown in Figure 4. The
bubble plot is a combination of measures, errors and fit. The items that were endorsed
most easily and often and had the most fit issues (Items 2, 11, 6 and 1) can be seen
standing out from the majority of the other items. The large impact of one participant not
endorsing Item 1 is shown in the large bubble on the far left bottom of Figure 4.

[Figure 4: bubble plot titled “DAWM Item Difficulty versus Outfit z”. Vertical axis:
Measures, -10 to 6 (less to more). Horizontal axis: Outfit z, -4 to 6 (overfit to underfit).]

Figure 4. DAWM item and outfit z scores: A graphical display of the fit.
Items with -2 < zMS < +2, on the horizontal axis, are generally of no concern. The
radius of the bubble is determined by relative Standard Error. The placement along the
vertical axis is determined by the Logit Measure. The zMS outfit determines the left to
right positioning. Item 2 is underfit. It is unpredictable but is retained because writing is
an important component of mathematics. Item 1, when its error is taken into
consideration, may still be within the -2 to +2 guideline values.

All items were retained in the DAWM assessment. There were a few items
with outfit concerns but infit was considered before outfit and infit MS was less than +2
for all of the items. Item 6, “subtracting”, had a low correlation and was highly endorsed,
but was kept because it was one of the common mathematics related words. The DAWM
assessment item logit measures will be used as anchors in future analyses.
Rasch Analysis of the dichotomous scale for the DAWM. Figures 5 and 6 are
common displays used in Rasch analysis to examine an assessment instrument. Figure 5 is
a person and item map, referred to as a ‘Wright’ map. The numbers on the far left (-12 to
3) indicate the logit measure. On the left of the centre line, the participants are mapped.
The number sign (#) represents 3 participants and the dot ( ∙ ) represents one or two
participants. On the right side of the centre line, “WA” followed by a number represents
each item. For example, “WA4”, at the top of the map, represents Item 4, “walking a
dog”. Along the centre dashed line are “S”, “T”, and “M”. The “S” represents one
standard deviation. The “T” represents two standard deviations and the “M” represents
the mean of the data on the corresponding side.
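The layout of such a map can be sketched in a few lines. This is not the Winsteps display itself: the function name and binning rule (rounding each measure to the nearest whole logit) are assumptions, and only a handful of measures copied from Tables 3 and 4 are used.

```python
import math

def wright_map_rows(person_measures, item_measures, low, high):
    """Build rows for a simple person-item map, highest logit first.

    person_measures: list of person logit measures.
    item_measures: dict mapping an item label (e.g. "WA4") to its logit measure.
    Each '#' stands for 3 persons and a '.' for a remainder of 1 or 2.
    """
    def nearest_logit(value):
        return math.floor(value + 0.5)

    rows = []
    for logit in range(high, low - 1, -1):
        count = sum(1 for b in person_measures if nearest_logit(b) == logit)
        marks = "#" * (count // 3) + ("." if count % 3 else "")
        labels = [name for name, d in item_measures.items() if nearest_logit(d) == logit]
        rows.append((logit, marks, labels))
    return rows

# A small illustrative subset of measures, not the full data set.
persons = [-4.60, -4.60, -4.60, -4.64, -10.68]
items = {"WA4": 3.22, "WA1": -7.71}
for logit, marks, labels in wright_map_rows(persons, items, low=-12, high=4):
    print(f"{logit:>4} {marks:>6} | {' '.join(labels)}")
```

Persons appear to the left of the divider and items to the right, so the vertical distance between a cluster of persons and an item shows how far that item sits from those persons on the logit scale.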
The separation index indicated seven statistically distinct groups. Figure 5 suggests
four to five groups that may have differed in difficulty in a practical way. The ease of
indicating “adding”, “counting” and “subtracting” as being related to mathematics can be
seen by their positions on the bottom right. The position of most of the participants
below the easiest items suggests the DAWM difficulty level was not consistent with the
abilities of the participants. This suggests the difficulty of the items exceeded the ability of
most of the participants. It would be analogous to giving a grade two class a mathematics
achievement test designed to assess the ability of a grade 3 class. The scale is not ideal for
these participants at the time of the assessment.

Figure 5. DAWM Persons and items mapped by difficulty. Persons on the left and items on
the right. A dot ( ∙ ) represents 1 to 2 people and a number sign (#) represents 3 people.¹
¹ While there are more advanced graphics available, for simplicity and consistency (over
decades of use) the Rasch displays are used for Figures 5, 6, 19 and 20.

Figure 6 is a scale assessment figure used to analyse the dichotomous scale. It
shows the average person logit measure for the endorsement of each item as a “1” and
the average logit measure of the persons who did not endorse the word as a “0”. For
example, the average logit measure of a person not choosing Item 1, “adding”, was about
-7 and the average logit measure of a person choosing Item 1 was about -0.7. The “m”
indicated missing data. In a dichotomous scale like this the “0” is expected to be to the
left of the “1” indicating the participants who do not endorse the item have a lower
average propensity or lower ability to endorse than the participants who do endorse the
item.
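The quantity plotted in Figure 6 can be sketched as a grouped average. The function name and the example data below are hypothetical; missing responses are represented here by `None` and are left out of both averages.

```python
def average_measure_by_response(responses, person_measures):
    """Average person logit measure for those scoring 0 and those scoring 1 on one item.

    responses: one entry per person (1 = endorsed, 0 = not endorsed, None = missing).
    person_measures: the matching person logit measures.
    """
    groups = {0: [], 1: []}
    for x, b in zip(responses, person_measures):
        if x in groups:  # anything else (e.g. None) is treated as missing
            groups[x].append(b)
    return {k: (sum(v) / len(v) if v else None) for k, v in groups.items()}

# Hypothetical responses to a single item from five persons.
result = average_measure_by_response([1, 0, 1, None, 0], [-4.6, -10.7, -5.0, -7.0, -9.5])
print(result)
```

For a well-behaved dichotomous item the average for the “0” group is expected to be lower than the average for the “1” group. When very few persons fall in one group, as with the most easily endorsed items, that group's average rests on little data.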
All but one of the items, Item 6, “subtracting”, show at least one logit difference
between the average of the participants endorsing the item and the ones not endorsing
the item. Figure 6 shows that the average ability of the participants who endorsed Item 4,
“walking a dog”, is higher than the average ability of the participants who endorsed Item 1,
“adding”. For items like 1, 6 and 11, which were endorsed by almost every participant,
the zero estimate is unstable because the average ability of the participants who did not
endorse the item is calculated using few participants (in the case of “adding”, all but one
participant endorsed it).

Figure 6. DAWM item versus average person measure for item endorsement.

As a result of the item analysis, all items were retained in further analyses and all
item logit measures were determined for anchoring. All participants were also retained
for further analyses.
Anchoring is a common technique in Rasch Analysis. It is defined as “measures
obtained from one analysis (or construct theory) imposed on another to place it in the
same frame of reference” (Linacre, 2012d). All items other than Item 1, “adding”, and
Item 11, “counting”, were anchored using the item logit measures in Table 4, calculated
using the February data from the 163 participants. Items 1 and 11 were not anchored
because of their ease of endorsement and their fit concerns. With anchoring, the number
of items anchored should be at least 5, the fit should be good and the items should
have a range of logit measures (Linacre, 2012e). Though Items 6 and 2 did not have ideal
fit, they were anchored to ensure a range of anchoring items across the logit measures.
Anchored Logit Measures for February, April and May DAWM. With anchors set,
person participation confirmed, and item retention confirmed the logit measures were
determined for the participants in February, April and May. An excerpt of the measures is
found in Table 5. Of the original 163, the 135 who wrote all three assessments were kept
for the final analysis.
Analysis of the groups. The 135 participants were from 12 classes in School
District 57. Classes were grade 2, grade 2/3 split and grade 3 (one was a grade 3/4 split
from which only grade 3 students participated). Participants were not able to be
randomly assigned to the treatment or control group because of the restrictions on the
class that they were in and the need to match classes. Neither could the participants be
pre-tested to assign them to groups. Both the treatment group and the control group
wrote the DAWM assessment and the MAA in February (pre-treatment), April
(post-treatment) and May (a month after treatment ended). The items for DAWM and MAA
were the same on successive sittings but the items were presented in a different order for
each assessment. Note, one class wrote the February assessment after our first visit (not
a Math Play visit but the first visit).


Table 5 is an excerpt from the complete list of logit measures for the control, “C”,
and treatment, “T”, groups for all three assessments with anchoring. The table was ordered
based on the logit measure in February (lowest to highest). Blank lines show where data
were not presented due to space considerations. The first column contains the student
ID. The second column indicates the self-identified gender of the participants.
The logit measures in Table 5 indicate a propensity to endorse words as being
associated with mathematics. For example, a participant who received a 6.03 in
February associated all 19 words with mathematics and the participant who received a
-6.19 endorsed only 1 word as being associated with mathematics. Therefore, in this scale
the higher the measure, the higher the propensity of the participant to endorse
words or concepts as mathematics related.
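To illustrate how a logit measure translates into a propensity to endorse, the following sketch evaluates the standard dichotomous Rasch model for the two extreme February measures mentioned above. The item difficulty of 0 logits is an assumption chosen for illustration; it is not one of the anchored DAWM item measures, and the function name is hypothetical.

```python
import math


def endorsement_probability(person_measure, item_difficulty):
    """Dichotomous Rasch model: P(endorse) = exp(b - d) / (1 + exp(b - d))."""
    return 1.0 / (1.0 + math.exp(-(person_measure - item_difficulty)))


# Assumption for illustration only; not an anchored DAWM item measure.
hypothetical_item_difficulty = 0.0

high = endorsement_probability(6.03, hypothetical_item_difficulty)
low = endorsement_probability(-6.19, hypothetical_item_difficulty)
print(round(high, 3), round(low, 3))
```

Under this assumption, a person at 6.03 logits is almost certain to endorse an item of average difficulty, while a person at -6.19 logits is almost certain not to.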
In Table 5, three rows have been highlighted. The first highlighted row shows a
male from the control group whose logit measure of -6.19 in February increased to
1.63 in April and decreased to 0.11 in May. This participant had a difficult time associating
regular words and concepts with mathematics in February but in April, when presented
with the same words, endorsed more words as being related to mathematics. However,
by May, his propensity had decreased. The second highlighted row shows the results from a
female in the treatment group. She showed a decrease from the February logit measure
of -0.82 to -2.36 in April and then another decrease to -4.14 in May. The third participant
highlighted is another male but from the treatment group. Unlike the other highlighted
participants, this participant showed an increase in his logit measure each time he wrote
an assessment. He had a higher propensity to endorse the words as mathematics related
on each successive assessment. The changes observed for the participants varied.
Table 5
Anchored Logit Measure, Gender and Group (Treatment or Control) for February, April
and May DAWM Excerpt

Person ID   Gender   Feb Logit   Apr Logit   May Logit   Group
                     Measure     Measure     Measure
57518       M          -6.19        1.63        0.11     C
83816       M          -5.12        0.88        0.11     C
80911       M          -4.09       -6.31       -4.14     T

21512       M          -2.36       -1.45       -0.32     T
26214       F          -2.36       -2.36       -0.32     T
59311       M          -2.36       -2.36       -0.32     T
23213       F          -2.36       -6.31        0.5      T
23813       M          -2.36        0.11        1.62     T
48220       M          -2.28       -0.82       -3.56     C

7520        M          -1.43       -0.82        0.11     C
39216       M          -1.43        3.66        2.00     C
87816       F          -1.43        0.93        2.40     C
4810        F          -0.82       -2.36       -4.14     T

46210       F          -0.32        0.89        1.25     T
815         F          -0.32        1.63        1.62     T
66621       M          -0.32        2.01        2.00     C
41410       M          -0.32        1.63        2.00     T
45918       F           0.11       -1.43       -0.32     C
52013       M           0.11        0.11        0.11     T
76414       F           0.11       -0.82        0.11     T

45115       M           2.82        2.40        2.00     T
73421       M           6.03        0.88       -0.81     C
81317       F           6.03        2.40        3.30     C
57814       M           6.03        0.51        3.30     T


This concludes the Rasch Analyses portion of the results for the DAWM. First, the
participants were analysed. All participants who wrote all three assessments were
retained for further analysis. Second, the items were analysed, and all items were
retained in the DAWM assessment. Third, anchor values were determined for future
analysis. Finally, initial Logit Measures were determined for all participants, using
anchoring, for the February, April and May assessments.
Missing Data Analysis.
Though Rasch Analysis accommodates missing data for a participant, sources
indicate participants must be paired for ANOVA (Laerd, n.d.; Pagano, 1998; Plonsky, 1997),
the next stage of analysis. Of the original 163 participants, 135 were present for all three
assessments. There was no reason to suspect the missing data were anything other than
random; therefore, two options were explored for how to accommodate missing data:
omitting missing data or imputing data based on the regression equations. The benefit of
imputing data was an increase in sample size. The disadvantage was “overestimates [in]
model fit and correlation estimates” and “weaken[ed] variance” (Humphries, n.d.). The
benefit of deleting missing data was ease and comparisons of the actual data. The
disadvantage of deleting data was loss of data and therefore reduction of power, and if
the missing data were not random, a bias may be introduced (Humphries, n.d.). The
regression line created using the February control group, 𝑥𝑖 , and the April control group,
𝑦̂𝑖 , was used to predict values of missing data in those groups. The regression equation
for the control group was 𝑦̂𝑖 = .3998𝑥𝑖 + .1764. Once the imputed data were present, a
t-test was done to compare the February mean measure with and without data imputed.
There was not a statistically significant difference between the February control group
with and without the imputed data (p = .875). The same process was repeated for the
April control group data and again no statistically significant difference was found (p =
.964). As a result, all missing data were omitted. Data from participants who did not write
all three assessments were omitted from the traditional statistical analysis portion of the
analysis.
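The regression imputation that was explored (and ultimately not used) can be sketched as follows. The slope and intercept are the reported control group values; the records, the function name and the assumption that a missing April measure is predicted from the February measure are illustrative only.

```python
SLOPE = 0.3998      # reported slope for the control group regression
INTERCEPT = 0.1764  # reported intercept for the control group regression


def impute_april(february_measure):
    """Predict an April logit measure from a February logit measure."""
    return SLOPE * february_measure + INTERCEPT


# Hypothetical records: (February measure, April measure or None if missing).
records = [(-2.36, -1.45), (0.11, None), (-0.32, 0.89)]

# Keep observed April values; fill missing ones with the regression prediction.
completed = [
    (feb, apr if apr is not None else impute_april(feb))
    for feb, apr in records
]
print(completed)
```

Because every imputed value lies exactly on the regression line, imputation of this kind adds no scatter, which is consistent with the cited concern about weakened variance.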
Traditional Statistical Analysis of the DAWM Results
Adequacy for Traditional Statistical Analyses of the DAWM results. The logit
measures mentioned in Table 5 were used in SPSS for further analyses. Various analyses
were attempted but were not used because they did not satisfy the necessary
assumptions on data behavior. Two-Way Repeated Measure ANOVA, also called a
Two-Factor Repeated Measure ANOVA, would have been best suited to the situation based on
three repeated measures of both a treatment and control group and so was the preferred
method of analysis. This technique requires sphericity. A data set passes the test for
sphericity if the population variances of the differences between all possible pairs of
repeated measures are equal (van den Berg, 2017). The data for this study resulted in a
significance of p = .003 on the test for sphericity using Mauchly’s W and therefore the
assumption of sphericity was violated.
Next, ANCOVA was considered but the lack of homogeneity of regression lines for the
February, April and May data (homogeneity of variance) disallowed its use.
Two different approaches were used to analyse the DAWM data: three 2X2
Two-Factor (treatment by gender) ANOVAs on the difference scores (April – February, May –
April and May – February) with each student as an experimental unit and a three-way
ANOVA (treatment, gender and block) on the difference scores with each class as an
experimental unit and each school as a block. The adequacy of the data for use in both
ANOVA analyses is explored further in this chapter. The choice of method of analysis was
guided by Tabachnick and Fidell (2007).
Descriptive statistics for the DAWM difference scores. The final count for the
participants was 50 in the control group and 85 in the treatment group. The difference in
size is due, in part, to one treatment group class consisting of two classes combined.
Tables 6, 7 and 8 contain the descriptive statistics for April – February, May – April, and
May – February.
Most of the difference Rasch Measures are positive, indicating a higher Rasch
Measure on the second assessment. An example of a negative difference is in Table 6:
Female Control Group (-0.36), indicating a mean Rasch Measure higher in February than
in April. The mean difference in the Rasch Measures for the May – April Control Group
Females was zero (Table 7) and close to zero for the Female Treatment group. The
statistical significance of the various mean difference Rasch Measures is examined
further using ANOVA, later in this chapter.
In Tables 6, 7 and 8, n represents the sample size. Whether the differences in these
descriptive data are statistically significant is explored later.
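As a reading aid for the descriptive tables, the “Total” mean of each group is the sample-size-weighted average of the female and male means. The sketch below reproduces the April – February totals from the rounded subgroup values in Table 6; the function name is hypothetical and small discrepancies from rounding are to be expected.

```python
def pooled_mean(groups):
    """Return the n-weighted mean of a list of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(m * n for m, n in groups) / total_n


# April - February subgroup means and sample sizes as reported in Table 6.
control_total = pooled_mean([(-0.36, 25), (0.84, 25)])
treatment_total = pooled_mean([(0.90, 44), (0.27, 41)])
print(round(control_total, 2), round(treatment_total, 2))
```

The results agree with the reported totals of 0.24 for the control group and 0.60 for the treatment group.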


Table 6
DAWM Descriptive Statistics by Group and Gender April – February Difference Measure

Group       Gender    Mean     SD      n
Control     Female    -0.36    1.49    25
            Male       0.84    3.09    25
            Total      0.24    2.47    50
Treatment   Female     0.90    1.79    44
            Male       0.27    2.06    41
            Total      0.60    1.94    85

Table 7
DAWM Descriptive Statistics by Group and Gender May – April Difference Measure

Group       Gender    Mean     SD      n
Control     Female     0.00    1.12    25
            Male      -0.45    1.79    25
            Total     -0.23    1.49    50
Treatment   Female    -0.05    1.75    44
            Male       0.52    1.86    41
            Total      0.23    1.82    85

Table 8
DAWM Descriptive Statistics by Group and Gender May – February Difference Measure

Group       Gender    Mean     SD      n
Control     Female    -0.36    1.71    25
            Male       0.39    2.93    25
            Total      0.01    2.40    50
Treatment   Female     0.85    1.83    44
            Male       0.79    2.29    41
            Total      0.82    2.05    85


DAWM with Students as the Experimental Unit
Checking assumptions for analysis. The difference scores were examined to
determine if it was advisable to use ANOVA. The assumptions were: the dependent
variable has to be continuous, the independent variables must consist of independent
categories, observations must be independent, homogeneity of variance for each
combination of groups, data must be approximately normal, and no significant outliers
(Laerd, n.d.). Normal distribution, homogeneity of variance and independent samples
were the only assumptions that needed to be met according to Field (2009), and only
normality and homogeneity of variance were required by Pagano (1998). Some sources
replace the requirement for homogeneity of variance with homogeneity of error
variance (Kutner et al., 2004). Levene’s test is suggested for testing variance (Laerd, n.d.),
but Levene’s test tests for equality of error variance and so satisfies both assumptions.
The Rasch measure for each student is on a continuous scale, satisfying the first
assumption. The independent variables Gender and Treatment or Control group are
independent, meeting the second assumption. No participants were in both the
Treatment and Control group, meeting the assumption of observations being
independent. The results of Levene’s test for equality of error variances indicated it was
reasonable to assume equal variance (April – February p = .056, May – April p = .130, May
– February p = .141). The other assumptions were also examined.
The April – February difference score data do not show normality (p = .001,
Shapiro-Wilk), indicating that the assumption of normality is not supported. The
May – April data also do not show normality (p < .001, Shapiro-Wilk). The May –
February data do appear normal (p = .093, Shapiro-Wilk). ANOVA is robust to
non-normal data, and the histograms suggest that, though the data are not normal, they
do somewhat follow the normal curves (Figures 7, 8 and 9). Due to the sample size, n,
being large (n > 50), it is advised to look beyond the test statistics (Field, 2009; Laerd, n.d.).
As can be seen in Figure 7 (April – February histogram), there are a few data points at the
low end that may be contributing to the non-normal results, but as discussed previously
there is no valid reason to reject the participants’ data.

Figure 7. DAWM frequency distribution and normal curve for April – February data.
In Figure 7 the high peak of entries just above zero and the few entries below
-5.00 can be observed not following the normal curve.


Figure 8. DAWM frequency distribution and normal curve for May – April data.
Figure 8 also shows a peak near zero and a few participants near the ends of the
normal curve but outside of those entries does have a general normal shape.
Figure 9 shows the May – February data which does have an approximately
normal distribution (p = .093).


Figure 9. DAWM frequency distribution and normal curve for May – February data.
Further examination of normality was warranted. Though the Shapiro-Wilk tests
suggest the data are not normal (except May – February), the kurtosis values together with
skewness indicate the data do not have an undue level of lack of
normality (except May – April). According to Tabachnick and Fidell (2007), when
determining a level of skewness, one examines how far from the mean the centre of the
distribution is, and kurtosis values other than zero are an indicator of peaks that are not
ideal. More specifically, if skewness is below -0.5 or above 0.5 and kurtosis is below -2 or
above +2, concerns for normality are justified (Brown, 2016). The April – February kurtosis
value was 1.731 (SE = .414). The kurtosis value for May – April was 2.534 (SE = .414), which
does support lack of normality. The May – February kurtosis value was 1.049 (SE = .414),
supporting the normality found previously. Skewness was also minimal for two data sets
(April – February .127 ± .209 and May – February -.030 ± .209). The May – April skew of
.870 ± .209 was a bit high. Together these results suggest the level of lack of normality is
not at a level of concern (except for May – April).
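The screening rule cited above (skewness outside ±0.5, kurtosis outside ±2) can be applied to the reported values as in the sketch below. The dictionary layout and function name are assumptions made for illustration; the skewness and kurtosis values are those reported in the text.

```python
# Reported skewness and kurtosis values for the three sets of difference scores.
reported = {
    "April - February": {"skew": 0.127, "kurtosis": 1.731},
    "May - April": {"skew": 0.870, "kurtosis": 2.534},
    "May - February": {"skew": -0.030, "kurtosis": 1.049},
}


def normality_flags(skew, kurtosis, skew_limit=0.5, kurtosis_limit=2.0):
    """Flag each statistic that falls outside the cited guideline."""
    return {
        "skew": abs(skew) > skew_limit,
        "kurtosis": abs(kurtosis) > kurtosis_limit,
    }


flags = {name: normality_flags(**values) for name, values in reported.items()}
for name, result in flags.items():
    print(name, result)
```

Only the May – April difference scores exceed both guidelines, which matches the exception noted in the text.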
In a data set of 135, approximately seven participants would be expected to be
outliers (α = .05). Figures 10, 11 and 12 show box-plots of the difference scores.

Figure 10. DAWM box-plot for April – February data test for outliers with difference score
scale on the vertical axis.

Figure 10 shows 13 outliers. This is the largest number of outliers on any of the
difference scores. This is more than the expected seven outliers on a group this size.


Figure 11. DAWM box-plot for May – April data test for outliers with difference score
scale on the vertical axis.
There were a total of six outliers reported for the May – April data (Figure 11),
which is less than the seven expected.

Figure 12. DAWM box-plot for May – February data test for outliers with difference score
scale on the vertical axis.


There were a total of 8 outliers reported in the May – February data (Figure 12),
only one more than the expected seven.
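The expected number of outliers used in this comparison follows from multiplying the sample size by the significance level. A minimal sketch of that arithmetic, using the outlier counts reported for Figures 10, 11 and 12 (variable names are illustrative):

```python
n_participants = 135
alpha = 0.05

# Expected number of outliers at this significance level (about 6.75, i.e. roughly seven).
expected = n_participants * alpha

# Outlier counts reported from the box-plots.
observed = {"April - February": 13, "May - April": 6, "May - February": 8}

# Difference between observed and (rounded) expected counts.
excess = {name: count - round(expected) for name, count in observed.items()}
print(expected, excess)
```

Only the April – February difference scores exceed the expected count by a wide margin.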
The assumptions for use of ANOVA (continuous variable, matched pairs, no
significant outliers, normality and equal variance) (Laerd, n.d.) were met with the
following exceptions: the May – April data lack normality, and the April – February data had
13 outliers. ANOVA is, however, robust to a lack of normality (Field, 2009). One source
indicated that the ANOVA results for data like April – February must be interpreted with
caution due to the number of outliers (Laerd, n.d.). However, another source replaces the
assumption of no outliers with an assumption of equal sample size (Plonsky, 1997). Yet
another source has equality of variance and equal sample size as the only two
assumptions for ANOVA (Pagano, 1998, p. 418). Only results from participants who wrote
all three assessments (February, April and May) were used, therefore sample sizes were
equal across assessments. Also, equality of variance is met by all three sets of difference
scores. Therefore, a 2X2 Two-Factor ANOVA was used to analyse the data (April –
February, May – April and May – February).
Traditional Statistical Analysis of the treatment on DAWM assessment. Three
2X2 Two-Factor (group and gender) ANOVA tests on the April – February, May – April and
May – February difference scores were performed. The significance results for the Tests
of Between-Subject Effects are reported in Table 9.
For the logit measures of each of the three difference scores (April – February, May –
April and May – February), the between-subject effects were examined. Table 9 shows the p
values for the between-subject effects. A significant interaction was found between
Group (treatment and control) and Gender (p = .017) in April – February. The interaction is
illustrated in Figure 13 (graphically shown as crossing lines). No other statistically
significant interactions were reported (last row of Table 9). The power for the April –
February data was .22. The power for the May – April and the May – February data was
higher at .72 and .79 respectively.

Table 9
DAWM Between-Subject Effects Statistical Significance (p)

Source                  April – Feb    May – April    May – Feb
                        p              p              p
Gender                  .563           .838           .378
Treatment               .829           .127           .040
Gender and treatment    .017           .092           .307

A statistically significant difference was found between the treatment group and
the control group in the May – February data (p = .040) (Table 9) with a
small to medium treatment effect size (partial eta squared of .032, Cohen’s d = 0.337). There
was not a significant difference between gender or treatment groups for the May – April data
(p = .838, p = .127 respectively) with a minimal treatment effect size (partial eta squared of
.018, Cohen’s d = 0.30). Due to the interaction in the April – February data (p = .017) and the
possible interaction in May – April (p = .092), the most interpretable data set is May –
February. A trivial effect size (Cohen’s d = 0.15) was determined for the April – February
data.
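The text does not state which standard deviation was used to standardise the mean difference, so the following sketch simply computes two common variants from the rounded May – February totals in Table 8. Both function names are hypothetical, and because the inputs are rounded summary statistics the results may differ slightly from software output.

```python
import math


def glass_delta(mean_t, mean_c, sd_c):
    """Mean difference standardised by the control group standard deviation."""
    return (mean_t - mean_c) / sd_c


def cohen_d_pooled(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Mean difference standardised by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


# May - February totals from Table 8 (treatment: 0.82, 2.05, 85; control: 0.01, 2.40, 50).
d_control_sd = glass_delta(0.82, 0.01, 2.40)
d_pooled = cohen_d_pooled(0.82, 2.05, 85, 0.01, 2.40, 50)
print(round(d_control_sd, 3), round(d_pooled, 3))
```

With these rounded inputs, standardising by the control group standard deviation gives a value close to the reported 0.337, while the pooled version is somewhat larger; both fall in the small to medium range.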
The means and standard errors for each of the difference scores are shown in
Tables 10, 11, and 12.

Table 10
DAWM Estimated Marginal Means and Standard Error April – February

Group       Gender    Estimated        2xSE
                      Marginal Mean
Control     Female    -.356            .848
            Male       .840            .848
Treatment   Female     .901            .638
            Male       .271            .622

As opposed to the earlier means reported, the means in Tables 10, 11, and 12 are
Estimated Marginal Means, which are the means adjusted for the other variables present.
Table 10 shows the April – February data. Twice the Standard Error is also reported in
Table 10.
In Figures 13, 14, and 15, the data from Tables 10, 11 and 12 are represented
graphically. The interaction observed in the April – February data and the possible
interaction in the May – April data are visible as crossing lines in the graphs.
Wagenmakers, Krypotos, Criss, and Iverson (2012) explore interactions characterised by
the crossing of lines, indicate that they are nonremovable, and caution against
interpreting any main effect when an interaction is present.
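The crossing pattern can be checked numerically from the estimated marginal means in Table 10. The sketch below compares the female minus male gap in each group; a change of sign between groups corresponds to crossing lines. Variable names are illustrative, and the sketch describes the pattern only, not its statistical significance.

```python
# April - February estimated marginal means from Table 10.
emm = {
    ("Control", "Female"): -0.356,
    ("Control", "Male"): 0.840,
    ("Treatment", "Female"): 0.901,
    ("Treatment", "Male"): 0.271,
}

# Female minus male gap within each group.
gap_control = emm[("Control", "Female")] - emm[("Control", "Male")]
gap_treatment = emm[("Treatment", "Female")] - emm[("Treatment", "Male")]

# Interaction contrast: how much the gender gap changes between groups.
interaction_contrast = gap_treatment - gap_control

# The lines cross when the gender gap changes sign between groups.
lines_cross = gap_control * gap_treatment < 0
print(round(interaction_contrast, 3), lines_cross)
```

Here the gap is negative in the control group and positive in the treatment group, so the two gender lines cross between Control and Treatment.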


[Figure 13 plot: logit measure estimated marginal means (vertical axis) versus group (Control, Treatment), April – February, with separate lines for female and male participants.]

Figure 13. DAWM estimated marginal means versus group April – February separated by
gender.
Figure 13 shows the results from Table 10 graphically. Logit measure is on the
vertical axis. The left column shows the control group separated by gender. The right
column shows the treatment group separated by gender.
Table 11
DAWM Estimated Marginal Means and Standard Error May – April

Group       Gender    Estimated        2xSE
                      Marginal Mean
Control     Female    -.003            .678
            Male      -.454            .687
Treatment   Female    -.051            .512
            Male       .523            .528

Table 11 shows the May – April data. The data are represented graphically in
Figure 14.

[Figure 14 plot: logit measure estimated marginal means (vertical axis) versus group (Control, Treatment), May – April, with separate lines for female and male participants.]

Figure 14. DAWM estimated marginal means versus group May – April by gender.
Figure 14 is a visual representation of Table 11.
Table 12
DAWM Estimated Marginal Means and Standard Error May – February

Group       Gender    Estimated        2xSE
                      Marginal Mean
Control     Female    -.359            .876
            Male       .386            .876
Treatment   Female     .850            .660
            Male       .794            .684

Table 12 shows a range of Estimated Marginal Means between -0.359 and 0.850
with twice the Standard Error between .660 and .876. Figure 15 shows this information
graphically. The power for the May – February data was .79.


[Figure 15 plot: logit measure estimated marginal means (vertical axis) versus group (Control, Treatment), May – February, with separate lines for female and male participants.]

Figure 15. DAWM estimated marginal means versus group May – February by gender.
The final trend of increase in the estimated marginal means can be seen in Figure
15. There is no statistically significant difference between male and female participants in
the control and treatment groups; however, when the means of the female and male
participants are taken into consideration together, there is a statistically significant
difference between the control and treatment groups with a small to moderate effect size
(p = .040, Cohen’s d = 0.337). The effect size for the female participants was large
(Cohen’s d = 1.380). The male participants showed a medium effect size (Cohen’s d =
.466).


Summary of Analysis of DAWM Assessment Intervention
All participants’ scores were retained for the analyses if they wrote all three
assessments (February, April and May). Though some participants did not perfectly fit the
Rasch model, their results were retained for analysis. There was enough variety in the
item difficulty (item reliability of .98). Rasch Analyses indicated that the 19 items, though
difficult for the participants, had 7 statistically different levels of difficulty (Item
Separation 7.51), and that the assessment had good internal consistency and an
appropriate reliability for a low-stakes test (Person Reliability
.74) (Wells & Wollack, 2003). The assessment may not have been sensitive enough to
reliably determine individual participants’ scores (Person Separation 1.68). All items were
also retained for the analyses. The average logit measure of participants who did not
endorse a word as being associated with mathematics (0) was lower than the average
logit measure of participants who did endorse a word as being associated with
mathematics (1), indicating the scale was appropriate. Anchors were used for all but Items
1 and 11. Overall, the DAWM instrument is reasonably suitable for the task of assessing
the treatment effect in this study.
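The separation and reliability values quoted in this summary are linked. In Rasch measurement, reliability R is commonly obtained from the separation index G as R = G² / (1 + G²). The sketch below applies that relation to the reported DAWM separations; the function name is illustrative.

```python
def reliability_from_separation(separation):
    """Convert a Rasch separation index G into reliability R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


item_reliability = reliability_from_separation(7.51)    # reported item separation
person_reliability = reliability_from_separation(1.68)  # reported person separation
print(round(item_reliability, 2), round(person_reliability, 2))
```

The results round to .98 and .74, in agreement with the item and person reliabilities reported above.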
Summary of DAWM traditional statistical analysis with students as the
experimental unit. The difference scores of the logit measures for each assessment (April –
February, May – April, May – February) for the DAWM met the assumptions necessary to
perform a 2X2 Two-Factor (treatment, gender) ANOVA. A statistically significant
interaction was indicated for the April – February data (p = .017). The April – February data
had a power of .22, well below the .8 guideline. The only statistically significant difference
was identified for the May – February data for treatment (p = .040) with a small to
medium effect size (partial eta squared of 0.032, Cohen’s d = 0.337). The power for the
May – February data was .79, close to the .8 guideline. For the May – February data the effect
size for the female participants was larger than that of the male participants (Cohen’s d =
1.38 and d = .466 respectively). This indicated a difference between control and
treatment from before the intervention to a month after the intervention ended.
DAWM with Schools as a Block and Treatment and Control Separating the Units
The experimental units. The study took place in five schools. In three of the
schools there was one class in the control group and one class in the treatment group. In
one school there were two treatment group classes and one control group class. In
another school there was one treatment group class and two control group classes. To
facilitate comparing blocks (schools), if a school had more than one class in either the
treatment or control group, the classes were combined into one. The result was five
blocks each with a treatment group and a control group (each of these groups as an
experimental unit). Table 13 contains the summary of the number of participants
separated by treatment and control, gender, and block.


Descriptive data for each group in each block are not included.
Table 13
The Number of Participants in each Between Subject Factor

Between-Subject Factor    n
Treatment                 85
Control                   50
Male                      66
Female                    69
Block 1                   24
Block 2                   50
Block 3                   27
Block 4                   22
Block 5                   12

Checking assumptions for analysis. The difference scores were examined at
length with each student as an experimental unit. Due to the robustness of ANOVA to
non-normal data, normality of errors was explored. The variance was investigated using
Levene’s test and, at a significance level of α = .05, equal variance was reported for May –
April (p = .290) and May – February (p = .406). For April – February (p = .010), equal
variance was assumed at a significance level of α = .01. All other assumptions were met as
discussed previously.


Summary of traditional statistical analysis of DAWM with Blocks. Three
difference scores were analysed in a Three Factor (Treatment, Gender and Block) ANOVA
(April – February, May – April and May – February), resulting in the model:
𝑦𝑖𝑗𝑘 = 𝜇 + 𝜏𝑖 + 𝛾𝑗 + 𝛽𝑘 + 𝜏𝑖 ∗ 𝛾𝑗 + 𝜏𝑖 ∗ 𝛽𝑘 + 𝛾𝑗 ∗ 𝛽𝑘 + 𝜏𝑖 ∗ 𝛾𝑗 ∗ 𝛽𝑘 + 𝜀𝑖𝑗𝑘 , where
𝜏𝑖 represents the treatment, 𝛾𝑗 represents the gender and 𝛽𝑘 represents the block
effects. For the April – February data, a statistically significant main effect was identified for
Block (p = .025); however, there was also a statistically significant interaction between
Block and Treatment (p = .001). Figure 16 illustrates this interaction between the
intervention and the schools, which indicates that different schools responded
differently to the treatment.

[Figure 16 plot: estimated means (vertical axis) versus block (1 to 5), April – February, with separate lines for the treatment (t) and control (c) groups.]

Figure 16. Treatment and control groups versus blocks of the April – February data


The May – April Three-Way ANOVA did not indicate any statistically significant
effects or interactions.
The May – February data were also analysed. As with the April – February data,
there was a statistically significant Block effect (p = .004) and an interaction between
Block and Treatment (p = .001). A statistically significant three-way interaction among
Treatment, Gender and Block was also identified (p = .016). There were no Blocks reported
by Tukey HSD as being statistically significantly different.

Math Attitude Assessment (MAA)
Each time the participants wrote this assessment (MAA) the order the statements
were presented in was different. There were a total of four versions of the MAA. During
data entry an error was found in one version of the MAA. One version (Version 2) of the
MAA had Statement 11, “I know I can get math questions right”, twice and Statement 1,
“Math is easy”, was not present (Appendix B). Rasch Analysis can be done with missing
values. A quasi-random process was used to select one of the responses for Statement 11
and Statement 1 was left as missing data. Of the original 163 participants, 44 received
Version 2, the affected MAA.
The analysis for the MAA data was conducted in the same order as the DAWM.
The participants, items and scale were examined, then anchor values set using Rasch
Analysis techniques then the difference scores were used in traditional statistical analysis.
Rasch Analyses MAA. Rasch Analysis of the MAA results took place in four stages.
First, the participants were analysed to determine if any participants warranted removal
from the study. Second, the items in the MAA were assessed to determine if all items
should be retained. Third, the scale in the assessment was examined to determine if it
was appropriate for the participants. Fourth, anchored values were determined for further
analysis.
Rasch Analysis of participants’ results MAA February treatment and control. All
participants’ data for February were analysed together in Winsteps to identify
misbehaving data. Table 14 shows an excerpt from the full person measure results. All
participants’ data were examined if they had a high infit or outfit value, but no responses
warranted removal. The MAA included 12 statements with “No”, “Sometimes”, and “Yes”
as the responses. “No” was scored as “0”, “Sometimes” as “1” and “Yes” as “2”. There
were both positively and negatively phrased statements. For example, Statement 6 was,
“Math is useful”, and Statement 2 was, “Math is useless”. Negatively worded items
(statements) were reverse coded. If a participant responded “Yes” to “Math is useless,”
an original score of “2” was assigned to the response but then switched to a “0” for
further calculations.
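The scoring and reverse coding described above can be sketched as follows. The mapping of responses to 0, 1 and 2 is taken from the text; the function name and the boolean flag are illustrative and do not reflect how the data were actually entered for Winsteps.

```python
# Response scores as described in the text.
SCORES = {"No": 0, "Sometimes": 1, "Yes": 2}


def score_response(response, negatively_worded):
    """Score a response, reversing the scale for negatively worded statements."""
    raw = SCORES[response]
    # On a 0-2 scale, reverse coding maps 0 -> 2, 1 -> 1 and 2 -> 0.
    return 2 - raw if negatively_worded else raw


# "Yes" to "Math is useless" (negatively worded) is scored as 0.
print(score_response("Yes", negatively_worded=True))
# "Yes" to "Math is useful" (positively worded) is scored as 2.
print(score_response("Yes", negatively_worded=False))
```

After reverse coding, a higher score always indicates a more positive attitude toward mathematics, regardless of how the statement was phrased.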
Table 14 includes 9 columns of information from the MAA. The “Count” column
refers to the number of statements to which the participants responded. Unclear
responses were entered as missing data. The “Score” is the sum of the
responses, after re-framing scores based on negative language. For example, 12
responses of “Yes” would result in a score of 24. The Logit Measure is affected by both
the count and the score. The grey cells in Table 14 illustrate that if the scores are equal,
for example 11, the participant with the lower count received the higher Logit Measure
(more positive attitude toward mathematics). If the counts are equal, for example 12, the
participant with the higher score receives the higher Logit Measure (more positive
attitude toward mathematics).
The Model Standard Error column in Table 14 is the Model Standard Error as
described previously (Equation 9). Generally, the higher the SE the higher the logit
measure and vice versa. There are exceptions to this trend.


Table 14
Excerpt MAA Participant Measures

COUNT   SCORE   LOGIT     MODEL   INFIT   INFIT   OUTFIT   OUTFIT   CORR
                MEASURE   SE      MS      z       MS       z
12      24       4.48     1.84    1        0      1         0        0
12      24       4.48     1.84    1        0      1         0        0
12      24       4.48     1.84    1        0      1         0        0
12      23       3.24     1.03    1.01     0.35   0.73      0.22     0.22

12      21       2.02     0.63    0.96     0.13   0.67     -0.27     0.35
11      20       2.46     0.75    0.66    -0.31   1.09      0.39     0.38
12      20       1.67     0.56    2.35     2.13   1.97      1.44     0.11

12      19       1.38     0.52    0.68    -0.66   0.62     -0.66     0.71
11      19       1.91     0.64    0.78    -0.22   0.46     -0.68     0.81
11      19       1.91     0.64    0.53    -0.79   0.48     -0.64     0.67

12      17       0.9      0.46    1.29     0.82   1.36      0.9      0.05
12      17       0.9      0.46    0.66    -0.9    0.59     -1        0.65

12      16       0.7      0.44    0.46    -1.84   0.58     -1.14     0.68
11      16       1.04     0.49    1.32     0.85   1.96      1.75     0.34

11      12       0.27     0.43    0.69    -1.01   0.71     -0.84     0.52
10      11       0.2      0.46    1.22     0.72   1.24      0.7      0.1
11      11      -0.08     0.43    1.8      2.16   1.72      1.8      0.27

11       8      -0.63     0.45    1.22     0.71   1.09      0.34     0.7
10       7      -0.63     0.48    0.67    -0.89   0.62     -0.85     0.75
11       6      -1.06     0.48    0.99     0.09   1.29      0.7      0.31
12       5      -1.38     0.5     0.42    -1.59   0.43     -1.15     0.64
12       4      -1.65     0.55    0.71    -0.5    0.63     -0.47     0.39
10       2      -2.19     0.75    1.16     0.45   1.52      0.79    -0.16
11       1      -3.11     1.02    1.15     0.48   7.65      2.6     -0.75


Fit statistics in Table 14 vary, as they did with the DAWM data. If a participant
responded positively to all 12 items (statements), their infit and outfit were 1 with z = 0,
which indicates that the person is helpful to the measure. One student (the bottom row
in Table 14) has an extreme outfit value of 7.65 with z = 2.6. Any participants whose fit
statistics were outside the ideals were reviewed. The equations used for the Rasch-Andrich
rating scale are in Chapter 2, Equations 8, 9, 10, 11 and 12.
All participants’ individual scores were positively correlated to the group scores,
except for the two participants in the bottom two rows. For the three participants who
scored 24, correlation was zero, as expected when there is no variation in score.
Though not all participants fit the model ideally, all participants who wrote all
three MAA assessments (February, April and May) were retained for traditional statistical
analysis, to be consistent with the DAWM assessment analysis. Of the 163 students who
participated, 135 responded to all three assessments (February, April and May). A few of
these were different from the respondents in the DAWM data. This can be observed in
the gender breakdown. Of the 135 participants, 67 self-identified as female and 68
self-identified as male. This resulted in the same attrition rate as the DAWM, 17.1%.
Rasch Analysis of Items for February MAA. Rasch Analysis of the Math Attitude
Assessment results for the entire group was done to assess the MAA as an instrument.
The items showed a Rasch person reliability of .67 – indicating a need for more items in
the assessment. Recall that person reliability is test reliability, which is analogous to
Cronbach’s alpha, so the test reliability is close to the .7 guideline for a low-stakes test
(Wells & Wollack, 2003). A Person separation of 1.43 was reported – indicating that while
the assessment may be suitable for detecting group differences, it is not sensitive enough
for scoring individuals. The items in the MAA showed an item reliability of .96. This value
indicates a large enough sample size with enough variety in item difficulty. The item
separation was 4.80. This value indicates the items can be separated into 4 statistically
significantly different levels of difficulty.
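The reported separation and reliability values can be cross-checked against each other. The sketch below assumes the standard Rasch relationship between separation G and reliability R, R = G² / (1 + G²); the function names are illustrative and are not taken from the analysis software used in this study.

```python
import math


def reliability_from_separation(separation):
    """Reliability implied by a separation index G, assuming R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


def separation_from_reliability(reliability):
    """Inverse relationship: G = sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1 - reliability))


# Values reported for the February MAA data.
person_reliability = reliability_from_separation(1.43)
item_reliability = reliability_from_separation(4.80)
print(round(person_reliability, 2))  # person separation 1.43
print(round(item_reliability, 2))    # item separation 4.80
```

Under this assumption, the person separation of 1.43 and the item separation of 4.80 reproduce the reported reliabilities of .67 and .96 to two decimal places.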
Table 15 contains the Logit Measure, Fit information and Correlation for the 12
items in the MAA. The item number, in the first column, corresponds to the statements
listed in Table 16. The “Count” column indicates the number of responses to the
statement. The “Score” column indicates the sum of the response values (sum of zeroes,
ones and twos). Generally, as the “Score” increases the logit measure for that item
decreases.
Table 15
Item Difficulty Math Attitude Assessment (MAA)

Item  Count  Score  Logit    Model  Infit  Infit  Outfit  Outfit  Corr
No.                 Measure  SE     MS     z      MS      z
12    158    111    1.80     0.13   1.16   1.5    1.60    3.5     0.58
4     157    191    0.56     0.13   0.76   -2.7   0.75    -2.2    0.64
7     153    187    0.54     0.13   0.79   -2.3   0.85    -1.2    0.53
1     114    152    0.25     0.15   0.76   -2.1   0.88    -0.8    0.50
11    158    219    0.12     0.13   1.17   1.6    1.39    2.5     0.34
3     152    215    0.03     0.13   0.86   -1.3   0.72    -2      0.67
8     155    227    -0.10    0.14   0.96   -0.3   1.01    0.1     0.58
5     159    241    -0.27    0.14   0.92   -0.6   0.70    -1.9    0.64
9     150    232    -0.34    0.15   1.03   0.3    0.88    -0.6    0.49
6     156    248    -0.48    0.15   1.55   3.8    1.89    3.9     0.31
2     154    261    -0.93    0.17   1.17   1.1    1.16    0.7     0.44
10    151    264    -1.19    0.18   1.10   0.6    1.64    2.1     0.41
The statement order shown in Table 16 is based on the order indicated in Table
15. The statements are ordered from the item found most difficult to endorse at the top, to the
easiest to endorse at the bottom. The “N” beside several of the items in Table 16 indicates
the items whose scores were reversed due to negative language.

Table 16
Items from the MAA Ordered Hardest to Easiest by Logit Measure

Item  Statement
12    When I grow up I want a job that uses math.
4     Math makes me feel happy.
7     I find math hard. N
1     Math is easy.
11    I know I can get math questions right.
3     I want to do math next year.
8     I am sad when I do math. N
5     Next year I want to stay away from math. N
9     I can think of ways to use math.
6     Math is useful.
2     Math is useless. N
10    Math makes me scared. N

The “Logit Measure” column in Table 15 indicates the difficulty level of the item to
which the participants responded positively. The higher scored items have a lower logit
measure as the measure is ‘difficulty’ not ‘easiness’. For example, it was harder for a
participant to indicate they want a job that uses mathematics when they grow up than it
was to indicate mathematics does not scare them. The item logit measures ranged from
-1.19 to 1.80 and their Model Standard Errors from .13 to .18. The errors generally increased as
the logit measures decreased. Item 6 and Item 2 received similar Logit Measures. The
similar measures for Items 2 and 6 are further indication of reliability in participant
responses. This is not surprising due to the similarities in the items. The difference in the
Logit Measures for Items 4 and 10 may be an indication that the participants did not
interpret the opposite of “scared” as “happy”.
Infit ranged from 0.76 to 1.55 with z scores from -2.7 to 3.8. Outfit ranged from
0.7 to 1.89 with z scores from -2.5 to 3.9. All MS values are less than 2; however, some of the
items have z values outside the ideal range of -2 to +2. MS was given priority over z values
when determining item retention, as suggested by Bond and Fox (2012). The z = 3.8 and
3.9 for Item 6 and z = 3.5 for Item 12 are greater than 3 and of particular concern.
Exploration of the removal of Item 6 is discussed later in this chapter.
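The difference between the infit and outfit mean squares can be sketched as follows, assuming the usual definitions: outfit is the unweighted mean of the squared standardized residuals, and infit is the sum of squared residuals divided by the sum of the model variances. The observed, expected and variance values below are invented dichotomous examples, not data from this study.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit MS, outfit MS) from observed scores, model-expected scores
    and model variances for the same set of responses."""
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: average squared standardized residual, sensitive to outlying responses.
    outfit = sum(r / w for r, w in zip(squared_residuals, variance)) / len(observed)
    # Infit: information-weighted, so well-targeted responses dominate.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit


# Hypothetical responses where every residual matches its model variance.
as_expected = fit_mean_squares([1, 0, 1], [0.5, 0.5, 0.5], [0.25, 0.25, 0.25])

# Hypothetical responses with one very unexpected success (expected score 0.05).
one_surprise = fit_mean_squares(
    [1, 1, 0, 1], [0.5, 0.5, 0.5, 0.05], [0.25, 0.25, 0.25, 0.0475]
)
print(as_expected)
print(one_surprise)
```

In the second example a single surprising response raises the outfit mean square much more than the infit mean square, which is consistent with outfit being the more extreme statistic for the items and persons flagged in this chapter.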
The last column is correlation. Of the 12 statements, all items showed a
correlation greater than +.3. This indicates that all of the 12 statements showed a
moderate positive correlation between the items and the assessment. The largest
correlation of .67 was identified for Item 3, “I want to do math next year.” Both Item 12
and Item 5, “When I grow up I want a job that uses math,” and “Next year I want to stay
away from math,” respectively had a correlation of .64. The three items that were most
positively correlated with the assessment were the items that assessed
Motivation. The lowest correlations were identified for Item 6, “Math is useless,”
correlation .31, Item 10, “Math is useful,” correlation .34, and Item 11, “I know I can get
math questions right,” also with a correlation of .34. Items 6 and 10 assessed Value and
Item 11 assessed Self-Confidence.
The results from Table 16 are presented in a more visual representation in the
bubble plot in Figure 17. The Logit Measure is on the vertical axis. Overfit and underfit are
indicated in the horizontal axis. Items 6 and 12 are of concern, showing underfit
(unpredictable responses).

[Bubble plot: “MAA Item Difficulty versus Outfit z”. Vertical axis: Measures (Less to More); horizontal axis: outfit z.]

Figure 17. MAA items z scores of Outfit.
In Figure 17 the similarity of the Model Standard Errors is reflected in the similar
radii for all items. Items with outfit z < -2 or outfit z > +2 are generally of concern. Removal of an
item was not advised because the item reliability score indicated that more items would be
beneficial. However, item assessment was performed with Item 6 removed; Figure 18
shows how Items 11 and 12 have an increased z in the new item assessment. The outfit z
score for Item 11 went from 2.5 to 3.0 and the outfit z score for Item 12 went from 3.5 to
3.7 when Item 6 was removed.

[Bubble plot: “MAA Item Difficulty versus Outfit z Without Item 6”. Vertical axis: Measures (Less to More); horizontal axis: outfit z.]


Figure 18. MAA items z scores of Outfit with Item 6 removed.
Removal of Item 6 did not improve the assessment. The results from Figure 18
helped confirm that all items should be retained. Had removing an item with a high outfit z
score improved the z scores for the other items, or even left them unchanged, there
may have been a reason to remove it, but there is no evidence that removal would
be a benefit.
Each of the 12 statements belongs to one of four categories: Self-Confidence, Value,
Motivation or Enjoyment. When attempts were made to separate the 12 statements into
their respective categories for further analysis, there were too few statements in each
category to allow for meaningful results.
All items were retained for the MAA. The fit statistics did not indicate the need for
removal and more items were indicated as being beneficial. This was further evidence to
support full item retention. The logit measure for each statement was used as an anchor
for further Rasch Analysis.
Rasch Analysis of Scale for MAA. Comparison of the abilities of the participants to
the difficulty of the assessment was done using Figure 19, the “Wright” map (Bond & Fox,
2012). The “X” indicates a single participant on the left and “MA#” indicates the item. For
example, “MA1” is Item 1, “Math is easy.”
Figure 19 shows the item difficulties are contained within the person abilities. The
item separation of 4.80 indicates four statistically significantly different levels of
difficulty, and these are suggested in the map. Several participants’ abilities are higher
than the most difficult item (Item 12). The mean participant Logit Measure is higher than
the mean Item Logit Measure. Together these results suggested the assessment was
relatively easy for the participants.

Figure 19. MAA person ability and item difficulty map.

Figure 20 indicates the average person measure for the endorsement of each
statement or item and examines scale.

Figure 20. MAA item versus person measure for item endorsement.
In Figure 20 the average person measure for response to each item is shown. The
zero, one, two order that is expected for a scale is observed for all but one item. “MA6”,
“Math is useful.”, is the only item that does not follow the ideal of “0” then “1” then “2” as
the response order, indicating that the response of “Sometimes” to “Math is useful” was easier
than a “Yes” or “No” response. Note the “m” indicates missing data.

Anchored Logit Measures for February, April and May MAA. With person
participation confirmed, item retention confirmed and therefore anchors set (using the
February data), the Logit Measure for each participant was determined for each of the
three assessments (February, April and May). Samples of the values are given in Table 17.
Blank lines indicate where data were not reproduced in the table due to space.
Table 17
Measure, Gender and Group (Treatment or Control) for February, April and May MAA Excerpt

Person  Gender  FEB Logit  APR Logit  MAY Logit  Group
ID              Measure    Measure    Measure
91918   M       -2.23      -0.24      1.41       C
79510   F       -1.63      -1.01      -1.16      T

78618   M       0.16       0.77       0.49       C
66621   M       0.17       -0.19      -0.02      C

80220   M       0.55       -0.02      -0.6       C
38412   M       0.59       -0.02      -0.05      T
68421   F       0.67       0.33       -0.02      C
29511   F       1.36       1.8        1.75       T
44214   F       1.36       1.49       1          T
85414   F       1.36       1.49       2.2        T
321     M       1.41       0.15       1          C

77011   M       1.64       2.17       1.51       T
45918   F       1.7        2.43       1.82       C

36913   M       2.36       1.8        1.82       T
29610   M       2.43       3.44       2.08       T
26214   F       2.46       1.18       4.75       T

7520    M       4.46       3.29       1.76       C
23421   F       4.54       4.53       4.75       C

The first column in Table 17 indicates the student ID number and the second
column indicates self-identified gender. Table 17 was ordered based on the logit measure
for February from lowest to highest. The logit measure here is an indication of a
propensity to indicate a positive attitude toward mathematics as indicated by responses
to the MAA. The participant with a logit measure of 4.54 in February responded in a more
positive way to the assessment than the participant with a logit measure of -2.23 and
would be said to have a more positive attitude toward mathematics. Referenced values
are highlighted in Table 17. The -2.23 Logit Measure is the most negative response and
4.54 is the most positive response found in February. Overall the maximum Logit Measure
was 4.75 and the minimum measure was -4.62. Both the maximum and minimum
occurred in May.
The change in Logit Measures was not consistent. Some participants’ Logit
Measures increased in each MAA. An example of this is participant 85414. She started at
1.36 then increased to 1.49 and ended at 2.20. Others decreased, like participant 68421
(0.67, 0.33 and -0.02 respectively). And others were inconsistent in their change, showing
increases and decreases, like participant 29610.
A value of one was assigned to neutral responses. A value of zero was assigned to
negative responses and a value of two was assigned to positive responses. A higher
logit measure indicates the participant’s propensity to indicate a more positive attitude
toward mathematics. Participants with a score of 12 out of 12 responses received a Logit
Measure of approximately zero, -0.02. Participants with a score of 12 out of fewer than
12 responses received a positive Logit Measure. If the score divided by the number of
responses was greater than one, a positive Logit Measure resulted. Therefore, a negative
Logit Measure would indicate a negative attitude and a positive Logit Measure would
indicate a positive attitude. The mean anchored Logit Measure for February was 1.085.
This indicates an overall positive attitude toward mathematics using the MAA, which is
supported by a mean score of 15.8 (SD = 4.5) on the February data for all 163
participants. The average Logit Measures ranged from .0905 for the February control group to
1.501 for the May treatment group. Overall, the mean attitudes were slightly positive for
February, April and May.
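The scoring rule described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the response labels “Yes”, “Sometimes” and “No” are assumed to map to 2, 1 and 0, and the reverse-scored items are assumed to be those marked “N” in Table 16. The sign rule is the rule of thumb stated in the text; the Logit Measures themselves come from the Rasch analysis, not from this calculation.

```python
RESPONSE_VALUES = {"No": 0, "Sometimes": 1, "Yes": 2}
# Items marked "N" in Table 16 are negatively worded and reverse-scored.
REVERSED_ITEMS = frozenset({2, 5, 7, 8, 10})


def count_and_score(responses):
    """responses maps item number -> "Yes", "Sometimes" or "No" (None if unanswered).

    Returns (count, score): the number of answered items and the sum of their values.
    """
    count = 0
    score = 0
    for item, answer in responses.items():
        if answer is None:
            continue  # unanswered items add to neither the count nor the score
        value = RESPONSE_VALUES[answer]
        if item in REVERSED_ITEMS:
            value = 2 - value  # reverse-score negatively worded statements
        count += 1
        score += value
    return count, score


def expected_sign(count, score):
    """Rule of thumb from the text: score / count above one gives a positive measure."""
    if count == 0:
        raise ValueError("no responses to score")
    ratio = score / count
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "negative"
    return "near zero"


# A participant giving the most positive answer to every statement.
most_positive = {item: ("No" if item in REVERSED_ITEMS else "Yes") for item in range(1, 13)}
print(count_and_score(most_positive))
print(expected_sign(10, 11), expected_sign(11, 11), expected_sign(11, 8))
```

The three count and score pairs in the last line are taken from Table 14, where the corresponding Logit Measures are 0.2, -0.08 and -0.63.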
This concludes the Rasch Analysis portion of the results of the MAA. Each
participant now has a Logit Measure indicating their attitude toward mathematics as
measured by the MAA. An overall slightly positive attitude was found for the participants.
Next the Rasch Logit Measures were used in a traditional statistical analysis approach to
determine if any differences are present between the control and treatment group while
considering gender, with both the students as experimental units and then with schools
as blocks.
Traditional Statistical Analysis of the MAA Results
Justification of the Traditional Statistical Analysis method used for MAA results.
The logit measures from Table 13 were used in SPSS for analyses of treatment effect. To
be consistent in the analysis methods for the two assessments (DAWM and MAA),
difference scores were tested for use in a 2X2 Two-Factor (group and gender) ANOVA
with students as the experimental units, as well as in a Three-way (Treatment, Gender, Block)
ANOVA. The data from participants who wrote all three assessments were used. The final
count was 67 participants who self-identified as female and 68 who self-identified as
male. The control group had 51 participants and the treatment group had 84
participants.
Descriptive Statistics of the MAA Difference Scores. Tables 18, 19, and 20 contain
the descriptive statistics for the differences in Rasch Measures for the MAA data (April –
February, May – April, and May – February). The n represents the size of each sample.
Whether the differences between treatment groups and genders are
statistically significant is investigated later in this chapter. As with the DAWM results,
there are both positive and negative differences.
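As a concrete illustration of the three difference scores, the sketch below computes them for three participants whose anchored Logit Measures appear in the Table 17 excerpt. The function name is hypothetical; the analysis itself was carried out in SPSS.

```python
# Anchored logit measures (February, April, May) copied from the Table 17 excerpt.
MEASURES = {
    "85414": (1.36, 1.49, 2.20),
    "68421": (0.67, 0.33, -0.02),
    "29610": (2.43, 3.44, 2.08),
}


def difference_scores(feb, apr, may):
    """Return the three difference scores, rounded to two decimals."""
    return {
        "April - February": round(apr - feb, 2),
        "May - April": round(may - apr, 2),
        "May - February": round(may - feb, 2),
    }


for person_id, (feb, apr, may) in MEASURES.items():
    print(person_id, difference_scores(feb, apr, may))
```

Participant 85414 has three positive differences, participant 68421 has three negative differences, and participant 29610 has a mix of both.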
Table 18
MAA Descriptive Statistics by Group and Gender April – February Difference Measure

Group      Gender  Mean   SD    n
Control    Female  0.01   0.70  24
           Male    0.18   1.01  27
           Total   0.10   0.88  51
Treatment  Female  -0.02  1.31  43
           Male    0.05   1.14  41
           Total   0.02   1.23  84

Table 19
MAA Descriptive Statistics by Group and Gender May – April Difference Measure

Group      Gender  Mean   SD    n
Control    Female  0.30   1.18  24
           Male    -0.04  0.81  27
           Total   0.12   1.01  51
Treatment  Female  0.27   1.07  43
           Male    0.31   1.54  41
           Total   0.29   1.32  84

Table 20
MAA Descriptive Statistics by Group and Gender May – February Difference Measure

Group      Gender  Mean   SD    n
Control    Female  0.31   0.82  24
           Male    0.14   1.10  27
           Total   0.22   0.97  51
Treatment  Female  0.25   1.16  43
           Male    0.36   1.61  41
           Total   0.31   1.40  84

Checking Assumptions for analysis with students as the experimental unit. As
with the DAWM assessment, the MAA results are a continuous dependent variable with
matched pairs. The difference scores were examined for equality of variance, normality
and outliers to determine if the other assumptions for use of ANOVA were met. Due to
the nature of the assessment, an α = .01 was appropriate. With an alpha of .01 all three
sets of difference scores showed homogeneity of variance. The April – February data had a p
of .026 on the Levene’s Test. The April – February data had a p of .019. The May – April
data showed equality of variance with a p of .319. Homogeneity of variance for the 2X2
Two-Factor ANOVA was adequately met.
Normality was tested using the Shapiro-Wilk test. May – April and May –
February both report p = .000, so they are not normal. April – February is normal (p = .301).
Because the sample size is large (n > 30), it is advised to look beyond the test
statistics. Figures 21, 22, and 23 further the investigation into normality by showing
both the frequency distributions and the normal curves. The difference between the
normal April – February data and the May – February and May – April data is illustrated.

Figure 21. MAA frequency distribution and normal curve for April – February data.
Figure 21 supports the Shapiro-Wilk’s p = .301 to indicate normality. There is
evidence of outliers, but normality is evident in the shape of the frequency distribution
graph and its relationship with the normal curve.
Figure 22 shows the frequency distribution for May – April and the normal curve.
Outliers are evident. There is a concentration of responses near the mean of zero. There
is a loose approximation of the normal curve by the frequency distribution.

Figure 22. MAA frequency distribution and normal curve for May – April data.
Figure 23 shows the frequency distribution of the May – February data, which, though not
normal (Shapiro-Wilk p = .000), does show a pattern close to a normal curve. The
outliers close to -6 and a drop in frequency just above the mean of zero are
clearly visible.

Figure 23. MAA frequency distribution and normal curve for May – February data.
Skew and Kurtosis were examined to further explore normality. May – February
skew was -.484 with Kurtosis of 2.922. May – April skew was 1.014 and Kurtosis was
5.259. Skew standard error was .209 and Kurtosis standard error was .414. With the
guideline of ± 2 for Kurtosis, neither the May - April data nor the May – February data can
claim normality. ANOVA, however, is robust to non-normality so will still result in
meaningfully interpretable results.
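The two standard errors quoted above depend only on the sample size. The sketch below assumes the commonly used large-sample formulas for the standard errors of sample skewness and excess kurtosis; that the statistical software applied exactly these formulas is an assumption, although the results agree with the reported values for n = 135.

```python
import math


def skewness_standard_error(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def kurtosis_standard_error(n):
    """Standard error of sample excess kurtosis for a sample of size n."""
    return 2 * skewness_standard_error(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


n_participants = 135  # participants who wrote all three MAA assessments
print(round(skewness_standard_error(n_participants), 3))
print(round(kurtosis_standard_error(n_participants), 3))
```

With 135 participants these formulas give .209 for skew and .414 for kurtosis, matching the values reported for the MAA difference scores.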
The outliers suggested in the frequency distributions (Figures 21, 22, and 23) can
be more readily observed in the Box-Plots below (Figures 24, 25, and 26). For a sample
size of 135, seven outliers would be expected. The April – February data shows 3 outliers
(Figure 24). The May – April data shows 7 outliers (Figure 25). The May – February data
shows 5 outliers (Figure 26).

Figure 24. MAA box-plot for April – February data test for outliers.
Figure 24 shows 3 outliers. It also shows equal whiskers, minimal skewing and is
about centered at zero. A mean of zero is expected for Rasch logit measures.

Figure 25. MAA box-plot for May – April data test for outliers.

There were a total of 7 outliers reported for the May – April data seen in Figure
25. Again, there is a zero centre, minimal skewing and equal whiskers.

Figure 26. MAA box-plot for May – February data test for outliers.

There were a total of 5 outliers in the May – February data shown in
Figure 26. Though these data show an outlier between -4 and -6, the whiskers are equal,
the centre is zero and there is minimal skewing.
Summary of Assumptions for data to be used in 2X2 Two-Factor ANOVA. To use a
2X2 Two-Factor (Treatment and Gender) ANOVA, assumptions must be met. The
assumptions of a continuous variable, matched pairs and no significant outliers were met
for all three difference scores (April – February, May – April and May – February). The
assumption of homogeneity of variance (HOV) was also met (α = .01). Normality is
reasonable to assume in April – February (p = .301). There were minor indications of
deviation from normality in the May – April and May – February data. However, histograms
and boxplots suggest these variations are of minor importance and ANOVA is robust to
non-normal data. The assumptions to use a 2X2 Two-Factor ANOVA were met to a sufficient
level that the output from the 2X2 Two-Factor ANOVA can be interpreted in a meaningful way.
Traditional Statistical Analysis of the treatment on MAA with students as the
experimental unit. Three 2X2 Two-Factor (Treatment and Gender) ANOVA tests were
performed on the April – February, May – April and May – February difference scores.
The significance results for the tests of Between-Subject Effects are reported
in Table 21. No statistically significant differences were reported between the groups (all
p > .30, α = .05).
Table 21
MAA Between-Subject Effects Statistical Significance (p)

Source                April – Feb  May – April  May – Feb
                      p            p            p
Gender                .543         .479         .885
Treatment             .704         .469         .718
Gender and treatment  .800         .381         .535

The lack of a statistically significant difference is supported and represented
visually in Figures 27, 28, and 29. Tables 22, 23, and 24 summarize the values used to
generate the plots in the corresponding figures.

Table 22
MAA Estimated Marginal Means and Standard Error April – February

Group      Gender  Estimated Marginal Mean  2xSE
Control    Female  0.008                    0.456
           Male    -0.018                   0.428
Treatment  Female  0.179                    0.340
           Male    0.053                    0.348

The means vary from -0.018 to 0.179 with standard errors from 0.170 to 0.228.
Figure 27 is a graphical representation of the information in Table 22.

[Plot: “Estimated Marginal Means Versus Group April Feb Separated by Gender MAA”. Vertical axis: Logit Measure Estimated Marginal Means; horizontal axis: group (Control, Treatment); series: female, male.]

Figure 27. MAA estimated marginal means versus group April – February separated by
gender with 2XSE error bars.

Table 23
MAA Estimated Marginal Means and Standard Error May – April

Group      Gender  Estimated Marginal Mean  2xSE
Control    Female  0.304                    0.496
           Male    0.271                    0.468
Treatment  Female  -0.039                   0.370
           Male    0.308                    0.380

The graphical representation of the data from Table 23 can be seen in Figure 28.

[Plot: “Estimated Marginal Means Versus Group May April Separated by Gender MAA”. Vertical axis: Logit Measure Estimated Marginal Means; horizontal axis: group (Control, Treatment); series: female, male.]

Figure 28. MAA estimated marginal means versus group May – April separated by gender
with 2XSE error bars.
The MAA estimated marginal means are reported in Table 24 for May – February.
The means vary from 0.140 to 0.360 and the errors vary from 0.192 to 0.257.

Table 24
MAA Estimated Marginal Means and Standard Error May – February

Group      Gender  Estimated Marginal Mean  2xSE
Control    Female  0.312                    0.514
           Male    0.253                    0.484
Treatment  Female  0.140                    0.384
           Male    0.360                    0.394

A graphical representation of these data is shown in Figure 29.

[Plot: “Estimated Marginal Means Versus Group May Feb Separated by Gender MAA”. Vertical axis: Logit Measure Estimated Marginal Means; horizontal axis: group (Control, Treatment); series: female, male.]

Figure 29. MAA estimated marginal means versus group May – February separated by
gender with 2XSE error bars.
The power for all difference scores was low: April – February was .03, May – April
was .28 and May – February was .12. The effect size was small for all difference scores as
well: April – February Cohen’s d = .091, May – April Cohen’s d = .168 and May – February
Cohen’s d = .093.

Summary of analysis of MAA intervention with students as the experimental
unit. All participants’ results were retained for analyses if they wrote all three
assessments (February, April and May). The average Logit Measures were slightly positive
for all groups on each assessment, with a mean score supporting a positive overall
attitude. Though some participants did not fit the Rasch model, no justification was found
for removals. Sample size was determined to be large enough (item reliability .96). The
Person Reliability (test reliability) of .67 indicated the MAA was only able to separate
participants into two ability levels, so more items would have been beneficial. Item
Separation of 4.80 indicated the items could be grouped into 4 levels of difficulty. The
Person Separation of 1.43 indicated the MAA may not be sensitive enough to separate
participants indicating a positive attitude toward mathematics on the MAA from a
negative attitude. The MAA was also indicated as being too easy for the participants.
The Math Attitude Assessment results when reported as difference scores of
Rasch Logit measures met the assumptions necessary to obtain meaningful results from
difference scores analysed using the 2X2 Two-Factor (Treatment and Gender) ANOVA. No
statistically significant differences were reported in the 2X2 Two-Factor ANOVA. The
power for each data set was low (April – February .03, May – April .28 and May –
February .12).
Summary of MAA Interventions with Blocks. A Three-Factor (Treatment, Gender
and Block) ANOVA was conducted on the same five Block groupings as the DAWM data for
all three difference scores. Assumptions for ANOVA were explored previously except for
variance due to the change in experimental units. The April – February and May – April
data showed equality of error variance using Levene’s test (p = .057 and p = .154
respectively). The May – February data were not indicated as having equality of error
variance (p = .000), indicating that any significant results must be interpreted
conservatively. No significant differences were indicated for treatment, gender, or block.
No interactions were indicated and no differences between individual blocks were
indicated (all p > .05).
Overall Summary
In summary, for both the DAWM and the MAA the participants, items and scale
were analysed using Rasch analysis. All participants and all items were retained for both
assessments. DAWM assessment item difficulty did not match the participants’ abilities.
The items were more difficult than was well suited to the participants and the DAWM
could have benefitted from more items, but overall it was an effective assessment
instrument. The MAA difficulty level was appropriate for the students, if a bit easy, but
could have benefitted from more items. No correlation was found between the February
results for the DAWM and the MAA when comparing the individual students (r = .044).
Difference scores of Logit Measures were used in traditional statistical analysis to
test for a treatment effect. Two different approaches were used: each student as an
experimental unit and each school as a block with Treatment and Control within each
block as the experimental units.
When analysing the data with each student as an experimental unit, assumptions
for use of ANOVA were met to a satisfactory level for both assessments (DAWM α = .05,
MAA α = .01). For the DAWM difference score data, a statistically significant interaction
was identified between group and gender for April – February data (p = .017). For the
DAWM data, a statistically significant difference was identified between the treatment
and control group for the May – February data (p = .040, Cohen’s d = 0.337, power = .79).
When considering the MAA difference score data, no statistically significant interactions
or differences were identified and the power for all data sets was low (.03 to .28). Cohen’s
d values for April – February, May – April and May – February were all trivial.
When analysing the data in blocks, the assumption of equal variance (or equal
error in variance) was satisfied for the May – April and May – February data (p = .290 and p
= .406, respectively) but not for the April – February data (p = .000). The lack of equal
variance indicates that any significant findings for the April – February results must be
interpreted conservatively. For the April – February data a significant effect was found for
block (p = .025) and a significant interaction was identified for block and treatment (p =
.001). A significant difference was identified between Blocks 4 and 5 (p = .035). No statistically
significant differences or interactions were identified for the May – April data. The May –
February data indicated a statistically significant block effect (p = .004), as well as
statistically significant interactions for treatment and block (p = .001) and treatment,
gender and block (p = .016).

Chapter Five Discussion
Assessing and improving attitudes toward mathematics is a common area of
concern for educators, researchers and parents. Assessing where mathematics is
recognized in the world or what students recognize as being related to mathematics may
help us understand what impacts mathematics attitude. This study was designed to
assess and improve both late primary and early elementary school students’ propensity
to indicate words and/or concepts as being related to mathematics and their attitudes
toward mathematics. The assessment comprised two parts: the Difficulty
Associating Words with Mathematics (DAWM) and the Math Attitude Assessment (MAA).
The DAWM assessed children’s propensity to associate real world situations with
mathematics. The MAA was a modified version of the ATMI (Tapia & Marsh, 2004). The MAA
assessed children’s attitudes toward mathematics using 12 statements focused on four
areas: Self-Confidence, Motivation, Value and Enjoyment. The resulting three items per
area were insufficient to examine the separate factors, so only one global attitude score
was obtained for each participant.
The assessments were done three times: pre-intervention, post-intervention, and
one-month post intervention. The intervention, Math Play, is defined as play involving
math related toys, games and activities with the purpose of exploration, discovery and
problem solving. Math Play does not involve testing or criticisms. Analysis was done to
determine if Math Play effects existed and whether or not there were differences across
gender.

Conclusion
Math Play was able to increase propensity of primary students to indicate words
and/or concepts as being mathematics related from pre-intervention to one-month post
intervention (when each participant was an experimental unit) suggesting further
investigation into the effectiveness of Math Play is warranted. However, Math Play was
not able to change attitudes toward mathematics.
Is the Difficulty Associating Words with Mathematics assessment (DAWM) an
appropriate assessment instrument for the participants? The Difficulty Associating Words
with Mathematics (DAWM) assessment was found to be more difficult than would be
ideal for grade 2 and 3 students for assessing the propensity to endorse words as being
associated with mathematics (Figure 5). However, while this instrument had enough
adequately performing items to detect group differences, it may not have been sensitive
enough to separate high performing participants from low performing participants. The
DAWM would benefit from more items and also had items that did not add to the
assessment quality, such as the word “adding”. The common mathematics related words
(adding, subtracting and counting) were included in the assessment for face validity
purposes. Due to the age of the participants, it may have been confusing if there were
not any words in the assessment that were commonly related to mathematics.
Will Math Play change the propensity of participants’ endorsement of words
associated with mathematics? A statistically significant interaction between treatment
and gender from pre-intervention to post-intervention was identified (p = .017) (Table 9),
indicating that the male and female students responded differently to the intervention.
The lack of a difference, immediately following intervention, was consistent for both
methods of analysis.
If there is a difference in propensity, will it persist after Math Play has ended?
When each participant was an experimental unit, no difference in propensity was found
from pre-intervention to post-intervention. However, there was an increase over the
longer term (pre-treatment to one-month post-treatment) (Table 9). Math Play was found to
have a small to medium effect (partial eta = .032 and Cohen’s d = .337) on the grade 2
and 3 students’ propensity to indicate words and/or concepts as being related to
mathematics. The power of the test was .79, indicating a 79% probability of detecting
an effect in the May – February data if there was one to detect. Had there been a
change immediately after the intervention that did not persist, that would have indicated
the need to persist with Math Play.
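The "small to medium" description can be checked against Cohen's conventional benchmarks. The sketch below assumes that the reported partial eta value is the partial eta squared statistic (the ratio of the effect sum of squares to the effect plus error sums of squares) and uses the commonly cited cut-offs of .01, .06 and .14 for eta-squared-type measures and .20, .50 and .80 for Cohen's d. The function names are hypothetical and nothing here reproduces the study's own calculations.

```python
def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


def describe(value: float, small: float, medium: float, large: float) -> str:
    """Place an effect size relative to Cohen's conventional benchmarks."""
    if value < small:
        return "below small"
    if value < medium:
        return "small to medium"
    if value < large:
        return "medium to large"
    return "large"


# Values reported in the text; the benchmark cut-offs are Cohen's conventions.
print(describe(0.032, 0.01, 0.06, 0.14))  # eta-squared-type benchmarks
print(describe(0.337, 0.20, 0.50, 0.80))  # Cohen's d benchmarks
```

Under these assumed cut-offs, both reported values fall between the small and medium benchmarks, which agrees with the description given above.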
Is there a gender difference for propensity to associate words with
mathematics? No statistically significant differences were identified between the
participants who self-identify as male and those who self-identify as female (Table 9
and Figures 13, 14, and 15) for either experimental unit. The lack of a gender difference is
consistent with some work done on mathematics affect and inconsistent with other work.
Kogce et al. (2009), de Lourdes Meta et al. (2012) and Moenikia and Zahed-Babelan
(2010) all found no gender difference, whereas Watt (2004) did.

Is the Mathematics Attitude Assessment (MAA) an appropriate instrument to assess the
attitudes of the participants toward mathematics? The MAA was not as good a
measurement instrument as the DAWM. It matched the ability level of the participants,
but it may not have been sensitive enough to separate students with positive attitudes
from those with negative attitudes. The MAA was sensitive enough to detect group effects.
The assessment also did not have enough items to separate the statements according to
the factors of attitude that were being targeted. The MAA instrument would need
improvement for future use but was judged adequate for assessing treatment efficacy
and gender differences.
Will Math Play change the attitude of the participants toward mathematics?
Math Play did not have a measurable effect on the attitudes of grade 2 and 3 students
toward mathematics. Though no difference was reported, it was clear from observation
that the participants were having a positive experience. Some participants asked when
Math Play sessions would happen again and smiled and laughed when they were engaged
in the activities. Some participants asked where to get the games that were introduced or
if the same games could be brought back the next week.
Changes were observed from the first Math Play sessions to the next. When the
Math Play sessions first started, some participants appeared reluctant to take part, but by
the end of the first day everyone in every class appeared to be taking part
enthusiastically. Every child in every class participated. Many of the teachers commented
on the level of engagement and the positive feedback they were getting from the
participants.
Studies indicate that attitudes toward mathematics form early and are difficult to
change (as cited in Geist, 2010). The observations may be an indication of state versus
trait (Heffner, 2017). The sessions’ engagement level and responses may indicate that the
participants were in a state of enjoyment when they were engaged in Math Play. Their
trait may have been a more negative view of mathematics. With evidence suggesting that
changing attitudes toward mathematics is difficult (as cited in Geist, 2010), it is
reasonable to assume it would take more than five one-hour sessions to affect attitude
at a meaningful level and at a level associated with changing a trait.
Is there a difference in the change of MAA results between the genders? No
gender differences were found for the MAA data.
Delimitations, Limitations and Future Research
Choices and circumstances caused limitations to this quasi-experimental study
(Creswell, 2014; Hurlburt, 2006; Pagano, 1998). The volunteers who were facilitating the
Math Play had limited time available. An increase in treatment time would be advisable in
future research.
Random assignment was not possible for the schools, teachers, students nor
facilitators. The teachers were randomly assigned to the treatment or control group (with
one exception). Increased randomization could increase the power of the results of the
study (Creswell, 2014; Hurlburt, 2006; Pagano, 1998).
Having trained facilitators with a pre-measured attitude facilitate the
interventions would be advisable in future research. This may control for attitude
differences.
The age of the participants and the suggestion by the school district limited the
length of the MAA. As a result of the limited length, there were not enough items per
category to allow meaningful analysis for the separate factors involved in the MAA:
self-confidence, value, enjoyment and motivation. Identifying the factors that contribute to
the attitude of primary students and then creating an assessment instrument with at least
five items that address those factors would be advisable in future research.
The teachers’ attitudes were not assessed, nor were those of the parents or siblings of
the participants. Some studies suggest teacher attitude and parent attitude can impact
the students’ attitudes (Gunderson et al., 2012). Further studies examining the impact of
teacher, parent, sibling and overall classroom attitudes toward mathematics would be
advisable.
Another recommendation for future investigation is expanding the sample to
include more schools. This would allow more thorough investigation with schools as
blocks and the classes in the schools as experimental units. This would be advisable
because results at a classroom level, as opposed to the individual level, would better
motivate teachers to use Math Play as a resource for improvements in awareness of
mathematics in the world and attitudes toward mathematics.

Implications
Research implications. Further to the suggestions previously mentioned, research
needs to be undertaken to determine what can be done to create a concise definition of
attitudes toward mathematics. Various studies assess attitudes in various ways with
various instruments (Chamberlin, 2010; Chapman & Lim, 2012; Hannula, 2002; Tapia &
Marsh, 2004). It is hard to be consistent in the interpretation and study of something
that is not consistently defined. It also may be necessary to develop different assessment
instruments targeted to different age groups.
How incidents affect attitudes toward mathematics needs to be determined.
When discussing attitudes, Hannula (2002) referred to a particular incident contributing
to the formation of a negative attitude toward mathematics. Are attitudes toward
mathematics formed over long-term influences and then changed by an event? How
static are attitudes toward mathematics? The propensity to associate words with
mathematics can be influenced and increased in as little as five weeks with only an hour a
week. Would more exposure to Math Play or Math Play-like activities improve attitudes?
Practical implications. One teacher whose class was in the study said that Math
Play had changed the way she thought of mathematics. The results suggest support for
her statement with the increase in propensity to indicate words as being mathematics
related (p = .040). Several of the teachers purchased some of the games and activities that
were brought in (after the study was complete) and invited us back. These were the
responses of the people who work with the students every day, implying that the differences
they saw from Math Play were impactful and beneficial. An increase in the understanding
that mathematics is all around us may have other benefits.
Summary
Math Play was not shown to have an effect on attitudes toward mathematics held
by primary students. An attitude is a trait (Heffner, 2017) held by an individual, and hard
to change (Geist, 2010). An increase in intervention duration together with a refinement
of the Mathematics Attitude Assessment is recommended for future studies.
The findings of the study suggest that Math Play is able to increase the propensity
of primary children to indicate words as being related to mathematics (as shown by the
DAWM results). The significant results were found at an individual level. Refining the
Difficulty Associating Words with Mathematics assessment and an increased intervention
time are recommended for future studies looking for significant results at a classroom
level.

References
Andrich, D. (1978a). Application of a psychometric rating model to ordered categories
which are scored with successive integers. Applied Psychological Measurement, 2,
581-594.
Andrich, D. (1978b). A rating formulation for ordered response categories. Psychometrika,
43, 561-573.
Andrich, D. (1978c). Scaling attitude items constructed and scored in the Likert tradition.
Educational and Psychological Measurement, 38, 665-680.
Andrich, D. (1988). Rasch model for measurement. Sage.
Anttonen, R. G. (1969). A longitudinal study in mathematics attitude. The Journal of
Educational Research, 62(10), 467-471.
Bond, T. G., & Fox, C. M. (2012). Applying the Rasch model: Fundamental measurement in
the human sciences (2nd ed.). Mahwah, New Jersey: Routledge.
Brown, S. (2016). Measures of shape: Skewness and kurtosis. Retrieved from
https://brownmath.com/stat/shape.htm#Kurt_Infer
Capacity Building Series. (2011). Maximizing student mathematical learning in the early
years. Retrieved from
http://www.edu.gov.on.ca/eng/literacynumeracy/inspire/research/CBS_Maximize_
Math_Learning.pd
Chamberlin, S. (2010). A review of instruments created to assess affect in mathematics.
Journal of Mathematics Education. 3(1). 167-182.
Chapman, E., & Lim, S. Y. (2012). Development of a short form of the attitudes toward
mathematics inventory. Educational Studies in Mathematics, 82, 145-164.
CMEC statement on play-based learning. (2014). Council of Ministers of Education,
Canada. Retrieved from
https://www.cmec.ca/Publications/Lists/Publications/Attachments/282/playbased-learning_statement_EN.pdf
Crocker, L., & Algina, J. (2006). Introduction to classical and modern test theory. Cengage
learning.
Creswell, J. W. (2014). Research design: quantitative, qualitative, and mixed methods
approaches (4th ed.): Sage Publications, Inc.
de Lourdes Meta, M., Monteiro, V., & Peixoto, F. (2012). Attitudes towards
mathematics: effects of individual, motivational, and social support factors. Child
Development Research, article id 876028. doi:10.1155/2012/876028
Doepken, D., Lawsky, E., & Padwa, L. (1993). Modified fennema-sherman attitude
scales. ONLINE: www. woodrow. org/teachers/math/gender/08scale. html.
Dowker, A., Bennett, K., & Smith, L. (2012). Attitudes to mathematics in primary school
children. Child Development Research, 1-8.
Field, A. (2011). Discovering statistics using SPSS (3rd ed.). Thousand Oaks, CA: Sage
Publishing Ltd.
Geist, E. (2010). The anti-anxiety curriculum: combating math anxiety in the
classroom. Journal of Instructional Psychology, 37(1).
Granger, C. V., Niewczyk, P. (2010), What is measurement? Retrieved from
https://www.udsmr.org/Documents/UDSMR_What_Is_Measurement_Article.pdf
Gunderson, E., Ramirez, G., Levine, S., & Beilock, S. (2012). The role of parents and teachers
in the development of gender-related math attitudes. Sex Roles, 66(3-4), 153-166.
Hannula, M. S. (2002). Attitude towards mathematics: emotions, expectations and values.
Educational Studies in Mathematics, 49, 25-46.
Hannula, M. S. (2006). Motivation in mathematics: goals reflected in emotions.
Educational Studies in Mathematics, 63, 165-178.
Heffner, C. (2017). Chapter 12: Section 4: Personality trends. Psych Central’s Virtual
Classroom. Retrieved from https://allpsych.com/personalitysynopsis/trends/
Huang, W. (2015). Benjamin Wright, renowned psychometrician, 1926-2015.
UChicagoNews, https://news.uchicago.edu/article/2015/12/15/benjamin-wrightrenowned-psychometrician-1926-201
Humphries, M. (n.d.). Missing data & how to deal: An overview of missing data,
Population Research centre. Retrieved April 25, 2018 from
https://liberalarts.utexas.edu/prc/_files/cs/Missing-Data.pdf
Hurlburt, R. T. (2006). Comprehending behavioral statistics (4th ed.). Belmont, CA,
Thomson Wadsworth.

Kogce, D., Yildiz, C., Aydin, M., Altindag, R. (2009). Examining elementary school students’
attitudes towards mathematics in terms of some variables. Procedia Social and
Behavioral Sciences. 1. 291-295.
Kutner, M., Nachtsheim, C., Neter, J. & Li, W. (2004) Applied Linear Statistical Models (5th
ed.). McGraw-Hill Irwin.
Laerd Statistics. (n.d.). Retrieved April 25, 2018 from https://statistics.laerd.com
Levine, G. (1972). Attitudes of elementary school pupils and their parents toward
mathematics and other subjects of instruction. Journal for Research in Mathematics
Education. 3(1), 51-58.
Linacre, M. (2012a). Winsteps Tutorial 1, Retrieved from
http://www.winsteps.com/a/winsteps-tutorial-1.pdf
Linacre, M. (2012b). Winsteps Tutorial 2, Retrieved from
http://www.winsteps.com/a/winsteps-tutorial-2.pdf
Linacre, M. (2012c). Winsteps Tutorial 3, Retrieved from
http://www.winsteps.com/a/winsteps-tutorial-3.pdf
Linacre, M. (2012d). Winsteps Tutorial 4, Retrieved from
http://www.winsteps.com/a/winsteps-tutorial-4.pdf
Linacre, M. (2012e). Retrieved from http://www.winsteps.com
Linacre, M. (Retrieved on April 25, 2018). Institute for objective measurement. Institute
for Objective Measurement. Retrieved from www.rasch.org
Linacre, M. (2002). Judging Debacle in Pairs Figure Skating. Retrieved from
https://www.rasch.org/rmt/rmt154a.htm
Linacre, M. (2002b) Review of Reviews of Bond & Fox. Rasch Measurement, Transactions
of the Rasch Measurement SIG American Educational Research Association. 16(2).
Retrieved from https://www.rasch.org/rmt/rmt162.pdf
Lochhead, L., (2009). Assessment of perceived functional capacity: Using Rasch Analysis to
evaluate the measurement properties of four perceived pain & disability scales
(master’s thesis). University of British Columbia.
Ma, X., & Kishor, N. (1997) Assessing the relationship between attitude toward
mathematics and achievement in mathematics: A meta-analysis. Journal for
Research in Mathematics Education. 28(1). 26-47.
MacDonald, M. (2014). Professors debate the best way to teach math. University Affairs.
Retrieved from http://www.universityaffairs.ca/features/feature-article/how-toteach-math/
Marsh, H.W., Trautwein, U., Ludtke, O., Koller, O., & Baumert, J. (2005). Academic self-concept, interest, grades, and standardized test scores: reciprocal effects models of
causal ordering. Child Development, 76(2). 397-416.
Milner, S. (n.d.). Susan’s Math Games. Retrieved April 25, 2018 from
http://susansmathgamesca.ipage.com/puzzles-games/
Moenikia, M., & Zahed-Babelan, A. (2010). A study of simple and multiple relations
between mathematics attitude, academic motivation, and intelligence quotient with
mathematics achievement. Procedia Social and Behavioral Sciences. 2. 1537-1542.
Oswald, D., Sherratt, S., & Smith, S. (2014). Handling the Hawthorne effect: the challenges
surrounding a participant observer. Review of Social Studies, 1(1).
Pagano, R. R. (1998) Understanding statistics in the behavioral sciences (5th ed.). Pacific
Grove: CA, Brooks/Cole Publishing Company.
Plonsky, M. (1997), Analysis of variance – two way. Retrieved from
https://www4.uwsp.edu/psych/stat/index.htm
Quinn, B., & Jadav, A. D. (1987). Causal relationship between attitude and achievement
for elementary grade mathematics and reading. The Journal of Educational
Research. 80(6), 366-372.
Rasch, G. (1993). Probabilistic models for some intelligence and attainment tests. MESA
Press
Robinson, K. (2006). Do schools kill creativity? [Video file]. Retrieved from
https://www.ted.com/talks/ken_robinson_says_schools_kill_creativity
Sebok, S., (2011). “Pick Me, Pick Me, I Want to Be a Counsellor”: Assessment of a MedCounselling application selection process using Rasch analysis and generalizability
theory (master’s thesis). University of Victoria
Sick, J. (2009). Rasch analysis software programs. Shiken: JALT Testing & Evaluation SIG
Newsletter. 13, 13-16.
Smith, R. M., Schumacker, R. E., & Bush, M. J. (1998). Using item mean squares to evaluate
fit to the Rasch model. Journal of Outcome Measurement, 2, 66-78.

Smith, E., & Smith, R. (2004). Introduction to rasch measurement. Maple Grove,
Minnesota: JAM Press
Statistical sampling and regression: t-distribution, Retrieved from
http://ci.columbia.edu/ci/premba_test/c0331/s7/s7_4.html
Stodolsky, S., Salk, S., & Glaessner, B. (1991). Students views about learning math and
social studies. American Educational Research Journal, 28, 89-116.
Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (6th ed.). Pearson.
Tapia, M., & Marsh II, G.E. (2004). An instrument to measure mathematics attitudes.
Academic Exchange Quarterly, 8.
Tezer, M., & Karasel, N. (2010). Attitudes of primary school 2nd and 3rd grade students
towards mathematics course. Procedia Social and Behavioral Sciences, 2(2), 5805-5812.
van den Berg, R. G., (2017). SPSS Repeated Measures ANOVA Tutorial. Retrieved from
https://www.spss-tutorials.com/spss-repeated-measures-anova/#repeatedmeasures-anova-assumptions
Wagenmakers, E., Krypotos, A., Criss, A. H., and Iverson, G (2012) On the interpretation of
removable interactions: A survey of the field 33 years after Loftus, Memory and
Cognition, 40(20), 145-160
Watt, H. (2004). Development of adolescents’ self-perceptions, values, and task
perceptions across gender and domain in 7th- through 11th-grade Australian
students. Society for Research in Child Development, 75(5). 1556-1574.
Wells, C.S., & Wollack, J.A. (2003). https://testing.wisc.edu/Reliability.pdf

Appendix
Appendix A Forms and Consent
• UNBC REB approval
• Consent forms: Principals, Teachers, Participants
• Confidentiality form

Oct. 19, 2014
3333 University Way
Prince George BC
V2N 4Z9

Dear Principal,
My name is Jean Bowen. I am working on a Master’s in Mathematics at UNBC on math
education. I am seeking permission to conduct research in your school during the
2014-2015 school year. The focus of the study is attitudes towards mathematics in the
primary grades.
This project has been reviewed by Cindy Heitman, District Principal of Curriculum and
Instruction, my supervising committee and the University of Northern British Columbia’s
Research Ethics Board (REB).
Research Objective:
1) To assess attitudes toward mathematics of Grade 2 and/or 3 students.
2) To attempt to improve attitudes through the introduction of Math Play.
For the purpose of this study Math Play is defined as play involving math related toys,
games and activities with the purpose of exploration, discovery and problem solving.
Math Play does not involve testing or criticisms.

Proposed Plan:
I would like to enlist the help of eight (or 12) classes of Grade 2 and/or 3 students. In four
(or six) of these classes attitude assessment will occur. In the other four classes, attitude
assessment as well as Math Play sessions will take place.
Parents will receive information letters, be invited to information sessions and be invited
to email me with any questions. In classes where Math Play takes place the entire class
will take part in the play sessions. Consent forms will be required to be signed by parents
before their child(ren) may take part in the attitude assessment portion of the study.
In classes where Math Play takes place, attitudes would be assessed four times
throughout the year: October (or later due to the strike), end of January, beginning of
April and May/June. In the other four classes assessment will happen in October (or later)
and June. Attitude assessments will be administered by the classroom teacher and would
take approximately 15 minutes each time.
A volunteer student from the UNBC MATH 190 (Mathematics of Elementary Teachers)
would attend one hour a week for the first two weeks as in class support. After Spring
Break the intervention would begin.
The Intervention:
For an hour a week from March 2 to April 2 the MATH 190 student would return to lead
sessions in Math Play. Math Play is defined as play involving math related toys, games and
activities with the purpose of exploration, discovery and problem solving. Math Play does
not involve testing or criticisms.

Ethical Considerations
For the Participants:
Participation is voluntary for the attitude assessment portion. As such you or any
participant may withdraw from the study at anytime. Any participant who withdraws will
have all information that has been collected destroyed unless it has already been
reported. Due to the two fold nature of the study (attitude assessment and Math Play)
participation in the Math Play sessions without attitude assessment information being
collected is easy to facilitate. This is done to make the process easier for the classroom
teacher should a parent decide that they do not want attitude information collected on
their child(ren). The entire class will participate in the Math Play session. You or the
classroom teacher may stop participation in the study at any time. A child may also
choose to stop participation in the attitude assessment portion of the study at any time.
Privacy and Confidentiality:
There will be no identifiers for the school, teachers, or students in the report. Only
grouped data will be reported. Students will be assigned number identifiers. The only
copy of the document that connects the student and their numbers will be kept under
lock and key at UNBC. My supervisor and I will be the only people with access to the
information. Information will be retained for one year after the final paper has been
approved and then will be shredded.
Potential Benefits:
• Learning through play is supported in literature.
• Several of the activities will involve logic puzzles. Development of logic skills is
  academically beneficial.
• This study may improve our understanding of primary students’ attitudes toward
  mathematics and some of the factors that influence such attitudes.
• This study may give us a new way to improve primary students’ attitudes towards
  mathematics.

Potential Risks:
• Student participants will receive five hours less instruction time with their
  classroom teacher. This is not a significant amount of time in comparison to the
  total instruction time that takes place in a year. Student participants will not miss
  out on instruction time as the entire class will be participating.

Sharing Results:
• The final report will be provided to the principals, the teachers, School District 57
  and the parents of the children in the class.

Thank you for considering participating in this research project. I would appreciate the
opportunity to work with you and your staff. If you or any of your staff have any
questions please feel free to contact me or my supervisor, Dr. Jennifer Hyndman. I can be
contacted via email at bowenj@unbc.ca. Dr. Jennifer Hyndman can be contacted at
Jennifer.hyndman@unbc.ca.
Concerns or complaints should be directed to the REB (phone: 250-960-6735, email:
reb@unbc.ca).
Sincerely,

Jean Bowen

I, _______________________________________ principal of
______________________________ give permission to Jean Bowen, Masters student at
University of Northern British Columbia, to include the above mentioned school in the
described research project.
________________________________          ___________________
Signature                                 Date

Teacher Information
My name is Jean Bowen. I am working on a Master’s in Mathematics at UNBC on math
education. I am seeking permission to conduct research in your school during the
2014-2015 school year. The focus of the study is attitudes towards mathematics in the
primary grades.
This project has been reviewed by Cindy Heitman, District Principal of Curriculum and
Instruction, my supervising committee and the University of Northern British Columbia’s
Research Ethics Board (REB).
Research Objective:
1) To assess attitudes toward mathematics of Grade 2 and/or 3 students.
2) To attempt to improve attitudes through the introduction of Math Play.
For the purpose of this study Math Play is defined as play involving math related toys,
games and activities with the purpose of exploration, discovery and problem solving.
Math Play does not involve testing or criticisms.
Proposed Plan:
Your class would be one of 8 (or 12) classes of Grade 2 and/or 3 students that will be part
of the study.
The study consists of two parts. Part one is assessing attitudes of students toward
mathematics. Part two is introducing Math Play. Only half of the classes will take part in
the Math Play.

Part One
Attitudes would be assessed four times throughout the year: October, end of January,
beginning of April and May/June. This is a questionnaire comprising two sections: section
1 is circling words that the students associate with mathematics, and section 2 is 12
statements where the student circles yes, maybe or no. As the classroom teacher you
would be asked to administer the questionnaire to avoid the added disturbance of having
extra people in the classroom. The assessment would need to be read to the students.
The assessment would take approximately 15 minutes to administer each time.
Parents will receive information letters, be invited to information sessions and be invited
to email me with any questions. Consent forms will be required to be signed by parents
before their child(ren) may take part in the study. You would be asked to hand out the
permission forms and collect them to be held until I collect them. (This paragraph was
moved.)

Part Two
A volunteer student from the UNBC MATH 190 (Mathematics of Elementary Teachers)
would attend one hour a week for the first two weeks as in class support from February 2,
2015 to February 13, 2015. After Spring Break the intervention would begin.
The Intervention:
For an hour a week from March 2 to April 2 the MATH 190 student would return to lead
sessions in Math Play.
Four of the classes will participate in both Part 1 and Part 2. The other four classes will
participate in only Part 1- attitude assessments. I would like to invite your class to
participate in both Part 1 and 2 of the study.
Ethical Considerations
For the Participants:
Participation is voluntary. As such you or any participant may withdraw from the study at
anytime. Any participant who withdraws will have all information that has been collected
destroyed unless it has already been reported. Due to the two fold nature of the study
(attitude assessment and Math Play) participation in the Math Play sessions without
attitude assessment information being collected is easy to facilitate. This is done to make
the process easier for the classroom teacher should a parent decide that they do not
want attitude information collected on their child(ren).
Privacy and Confidentiality:
There will be no identifiers for the school, teachers, or students in the report. Students
will be assigned number identifiers. The only copy of the document that connects the
student and their numbers will be kept under lock and key at UNBC. My supervisor and I
will be the only people with access to the information. Information collected will be kept
for a year after the final paper has been completed; it will then be shredded.
Potential Benefits:
• Learning through play is supported by several studies.
• Several of the activities will involve logic puzzles. Development of logic skills is
  academically beneficial.
• This study may improve our understanding of primary students’ attitudes toward
  mathematics and some of the factors that influence such attitudes.
• This study may give us a new way to improve primary students’ attitudes towards
  mathematics.

Potential Risks:
• Student participants will receive five hours less instruction time with their
  classroom teacher. This is not a significant amount of time in comparison to the
  total instruction time that takes place in a year. Student participants will not miss
  out on instruction time as the entire class will be participating (except the
  students who do not have permission to do so).

Sharing Results:
• The final report will be provided to the principals, the teachers and the school
  district.

Thank you for considering participating in this research project. I would appreciate the
opportunity to work with you and your staff. If you or any of your staff have any
questions please feel free to contact me or my supervisor, Dr. Jennifer Hyndman. I can be
contacted via email at bowenj@unbc.ca. Dr. Jennifer Hyndman can be contacted at
Jennifer.hyndman@unbc.ca.
Concerns or complaints should be directed to the REB (phone: 250-960-6735, email:
reb@unbc.ca).

Sincerely,

Jean Bowen

If you agree to have your class be part of this study please fill out and return the portion
below to me.

I, _______________________________________ of ______________________________
give permission to Jean Bowen, Master’s Student at University of Northern British
Columbia, to include my class in the described research project. This class will take part in
both attitude assessment (Part One) and Math Play (Part Two).
________________________________          ___________________
Signature                                 Date

Dear Parents,
My name is Jean Bowen. I am working on a Master’s in Mathematics at UNBC on math
education. I am seeking permission to conduct research in your school during the
2014-2015 school year. The focus of the study is attitudes towards mathematics in the
primary grades.
Students from your child’s class will be participating in this study.
Research Objective:
1) To assess attitudes toward mathematics of Grade 2 and/or 3 students.
2) To attempt to improve attitudes through the introduction of Math Play.
For the purpose of this study Math Play is defined as play involving math related toys,
games and activities with the purpose of exploration, discovery and problem solving.
Math Play does not involve testing or criticisms.
The Plan:
The study consists of two parts. Part one is assessing attitudes of students toward
mathematics. Part two is introducing Math Play.
Part One
Attitudes would be assessed four times throughout the year: October, end of January,
beginning of April and May/June. This is a questionnaire comprising two sections: section
1 is circling words that the students associate with mathematics, and section 2 is 12
statements where the student circles yes, maybe or no. The classroom teacher
would be asked to administer the questionnaire to avoid the added disturbance of having
extra people in the classroom. The assessment would need to be read to the students.
The assessment would take approximately 15 minutes to administer each time.
Part Two
A volunteer student from the UNBC MATH 190 (Mathematics of Elementary Teachers)
will attend one hour a week from February 2, 2015 to February 13, 2015 as classroom
support.
For an hour a week from March 2 to April 2 the MATH 190 student will return to lead
sessions in Math Play.
Your child’s class has been selected to participate in Part 1 of the study.

Ethical Considerations
This study has been reviewed by your child’s principal, your child’s teacher, the school
district, my supervisory committee and the University of Northern British Columbia
Research Ethics Board (REB).
Concerns or complaints should be taken to the UNBC REB (phone: 250-960-6735, email:
reb@unbc.ca).
For the Participants:
Participation in the attitude assessment is voluntary. As such, participants may
withdraw from the study at any time. Any participant who withdraws will have all
information that has been collected destroyed unless it has already been reported.
Privacy and Confidentiality:
There will be no identifiers for the school, teachers, or students in the report. Students
will be assigned number identifiers. The only copy of the document that connects the
student and their numbers will be kept under lock and key at UNBC. My supervisor and I
will be the only people with access to the information. Information collected will be kept
for one year after the final paper has been approved and then will be shredded.
Potential Benefits:
• Learning through play is supported not only in literature but in SD57.
• Several of the activities will involve logic puzzles. Development of logic skills is
academically beneficial.
• This study may improve our understanding of primary students’ attitudes toward
mathematics and some of the factors that influence such attitudes.
• This study may give us a new way to improve primary students’ attitudes towards
mathematics.

Potential Risks:
• [This paragraph will be omitted as it does not apply to the classes where Math
Play is not taking place. Student participants will receive five hours less instruction
time with their classroom teacher. This is not a significant amount of time in
comparison to the total instruction time that takes place in a year. Student
participants will not miss out on instruction time as the entire class will be
participating (except the students who do not have permission to do so).]
• There are no notable risks for the students who will be taking part in the
assessments only.

Sharing Results:
• The final report will be provided to the principals, the teachers and the school
district.

Thank you for considering participating in this research project. I would appreciate the
opportunity to work with you and your staff. If you or any of your staff have any
questions please feel free to contact me or my supervisor, Dr. Jennifer Hyndman. I
can be contacted via email at bowenj@unbc.ca. Dr. Jennifer Hyndman can be contacted at
Jennifer.hyndman@unbc.ca.

Sincerely,
Jean Bowen

Dear Parents,
My name is Jean Bowen. I am working on a Master’s in Mathematics at UNBC on math
education. I am seeking permission to conduct research in your school during the
2014-2015 school year. The focus of the study is attitudes towards mathematics in the
primary grades.
Your child’s class will be participating in this study.
Research Objective:
1) To assess attitudes toward mathematics of Grade 2 and/or 3 students.
2) To attempt to improve attitudes through the introduction of Math Play.
For the purpose of this study Math Play is defined as play involving math related toys,
games and activities with the purpose of exploration, discovery and problem solving.
Math Play does not involve testing or criticisms.
The Plan:
The study consists of two parts. Part one is assessing attitudes of students toward
mathematics. Part two is introducing Math Play.
Part One
Attitudes would be assessed three times throughout the year: February, beginning of
April and May/June. The questionnaire comprises two sections: section one is circling
words that the students associate with mathematics, and section two is 12 statements
where the student circles yes, maybe or no. The assessment would take approximately 15
minutes to administer each time.
Part Two
Volunteer students from the UNBC MATH 190 (Mathematics of Elementary Teachers) will
attend one hour a week from February 2, 2015 to February 13, 2015 as classroom
support.
For an hour a week from March 2 to April 2 the MATH 190 student will return to lead
sessions in Math Play. The entire class will participate in the Math Play sessions. No
information will be collected from your child(ren) during these sessions.
I may be present as an observer during these times.
Your child’s class will be participating in both parts of the study. To preserve the integrity
of the study please do not discuss the connection between the two parts of the study (the
attitude assessment and the Math Play) with your child until after the study is complete.
Ethical Considerations
This study has been reviewed by your child’s principal, your child’s teacher, the school
district, my supervisory committee and the University of Northern British Columbia
Research Ethics Board (REB).
Concerns or complaints should be taken to the UNBC REB (phone: 250-960-6735, email:
reb@unbc.ca).

For the Participants:
Participation is voluntary. As such, participants may withdraw from the study at
any time. Any participant who withdraws will have all information that has been collected
destroyed unless it has already been reported.
Privacy and Confidentiality:
There will be no identifiers for the school, teachers, or students in the report. Students
will be assigned number identifiers. The only copy of the document that connects the
student and their numbers will be kept under lock and key at UNBC. My supervisor and I
will be the only people with access to the information. Information collected will be kept
for one year after the final paper has been approved and then will be shredded.
Potential Benefits:
• Learning through play is supported not only in literature but in SD57.
• Several of the activities will involve logic puzzles. Development of logic skills is
academically beneficial.
• This study may improve our understanding of primary students’ attitudes toward
mathematics and some of the factors that influence such attitudes.
• This study may give us a new way to improve primary students’ attitudes towards
mathematics.

Potential Risks:
• Student participants will receive five hours less instruction time with their
classroom teacher. This is not a significant amount of time in comparison to the
total instruction time that takes place in a year. Student participants will not miss
out on instruction time as the entire class will be participating (except the
students who do not have permission to do so).

Sharing Results:
• The final report will be provided to the principals, the teachers and the school
district.

Thank you for considering participating in this research project. I would appreciate the
opportunity to work with you and your staff. If you or any of your staff have any
questions please feel free to contact me or my supervisor, Dr. Jennifer Hyndman. I
can be contacted via email at bowenj@unbc.ca. Dr. Jennifer Hyndman can be contacted at
Jennifer.hyndman@unbc.ca.

Sincerely,
Jean Bowen

Child Information To be shared with the students.
Researcher: Jean Bowen, Masters Student in Mathematics, UNBC
Your class has been selected to participate in an attitude assessment portion of a study.
What the study is about:
Grade 2 and/or 3 students in Prince George will be participating in a research study.
Please check each box below to indicate that you and your child have read and
understood each statement. If you both agree that the student may participate in the
study, please complete and sign the form.
Privacy:
☐ The identity of your child will be kept private.
☐ Nowhere in the final report will specific schools, teachers or students be
identified.
☐ All information with ways to identify specific children and schools will be kept
locked at UNBC.
Participation:
☐ Participation in the study is voluntary.
☐ If you give permission now, but wish to withdraw your child from the study at any
time you may do so. Any information already collected will be destroyed and
withdrawn from the study as long as the request is before the report is written.
☐ The principal or teacher can withdraw the school or class from the study.
Questions:

If you have any questions or concerns please contact me, Jean Bowen, at
bowenj@unbc.ca, or my supervisor, Dr. Jennifer Hyndman at Jennifer.hyndman@unbc.ca.

To give permission for your child to participate in the study please fill in the second page
of this form and have your child return it to their teacher by
___________________________.

Attitudes Toward Mathematics and Math Play
Please fill in this form and have your child return it in the sealed envelope to their
classroom teacher.
Parent/Guardian
I, __________________________________________________ give permission for my
(print full name)
child, _______________________________________________ to participate in the
(print full name)
attitude assessment.

____________________________________   __________________________   ___________________
Signature of Parent/Guardian           Relationship to child        Date

Student
I, __________________________________________________, wish to participate in the
(print full name)
attitude assessment.

____________________________________   __________________________
Signature of Student                   Date

The Effects of Math Play on the Attitudes of
Primary Students Towards Mathematics

Oath of Confidentiality
The project named above is being undertaken by Jean Bowen at the University of
Northern British Columbia. The objectives of this work are to:
1. Assess attitudes of Grade 2/3 students.
2. Introduce Math Play.
For the purpose of this paper, Math Play is defined as play involving math related toys,
games and activities with the purpose of exploration, discovery and problem solving.
Math Play does not involve testing or criticisms.
The results of this project may be used to increase our understanding of what influences
the attitudes of primary students toward mathematics.
As an assistant working on this project I, (print name) _________________________,
agree to:
1. Treat as confidential all information learned through my interactions with the
students participating in this project. I will accomplish this by not discussing or
sharing confidential information learned in the context of this project in any form
or format with anyone other than Jean Bowen, the classroom teacher, and the
project supervisor, Dr. Jennifer Hyndman.
2. I will not be handling any personal information from the primary students.
I further understand and agree that this oath of confidentiality will continue in force
indefinitely, even after I cease being an assistant on this project.
Name of Assistant: (print): _________________________ (sign): __________________________
Date: ____________________

Researcher’s Name: (print): _______________________ (sign): ________________________
Date: ____________________
If you have any questions or concerns about this project, please contact:
Jean Bowen, bowenj@unbc.ca
This study has been reviewed by the Research Ethics Board at the University of Northern
British Columbia. For questions regarding participants’ rights and ethical conduct of
research, contact the Office of Research and Graduate Programs at (250) 960-6735.

Appendix B Attitude Assessment Instruments
• DAWM
• MAA
• ATMI

Appendix C Games and Instructions
• Towers
• Tri-Hex
• Tri-hexaflexagon
• Tantrix
• Rectangles
• Blink
• Magic Numbers
• Magic Squares
• Neighbours
• Qwirkle
• Set
• Suspend
• Telaga Buruk
• Q-Bitz
• Hidato

Towers
Before you start you will need.
• A floor version of towers. There are 16 towers: 4 of each height, 1 unit, 2 units, 3
units and 4 units tall.
• Brains

The goal of this game is to place a tower of each height in each row and column based
on the clues.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Place a tower of each height in each column and row based on the clues.
• The numbers on the outside of the grid tell you how many towers you can see in
front of you from that spot.
• If there is a number given in the grid, a tower of that height is placed in that spot.

What to do to teach the game.
• Place the towers in a stair-like fashion.
4 4 4 4
3 3 3 3
2 2 2 2
1 1 1 1
• Have the children come to the side by the 1’s and imagine they are something
small right in front of one of the rows of numbers, looking toward the 4 tower. Ask
them how many towers they can see? 4
• Have the children move to face the 4 towers. How many can they see now? 1
• Show them the board and tell them that the numbers on the outside tell you how
many towers you can see from that spot.
• Tell them that each row and each column must have one of each height.
• Assign each child a tower. That is the only tower that student may move.
• Have them work together to place the towers.
• Do this for at least two complete games.
• When a group of students is ready to tackle the game on their own, give them a
bag with the smaller towers to work with and they can try it on their own.
• If a student is able to they can try it without the towers and just do it on paper.
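
The "how many towers can you see" rule can be sketched in a few lines of Python. This is only an illustration for the person leading the game, not part of the game materials; the function name visible_towers is made up for this sketch. A tower is counted as seen when it is taller than every tower standing in front of it.

```python
def visible_towers(heights):
    """Count the towers seen when looking along a line of towers.

    heights lists the tower heights from nearest to farthest.
    A tower is visible only if it is taller than every tower before it.
    """
    count = 0
    tallest_so_far = 0
    for height in heights:
        if height > tallest_so_far:
            count += 1
            tallest_so_far = height
    return count


# Standing by the 1's in the stair layout: every tower can be seen.
print(visible_towers([1, 2, 3, 4]))  # 4
# Standing by the 4's: the tallest tower hides the rest.
print(visible_towers([4, 3, 2, 1]))  # 1
```

The same check can be used on any row or column of a finished board to confirm it agrees with the clue on the outside of the grid.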

What math is hidden?
• Problem solving
• Trying to switch to doing a three dimensional problem with only paper and pencil.

Ways to modify the game.
• If

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask:
o “Is that tower going to work there?”
o “Do you see a tower that is missing from that row/column?”


Tri-Hex
Before you start you will need.
• A tri-hex sheet
• Coloured discs

The goal of this game varies. See sheet.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• There are three versions.
• The instructions are on the sheet.
• Separate the children into pairs.
• Give each pair a sheet and the discs.

What to do to teach the game.
• Have the children gather with you and teach them the first two versions.

What math is hidden?
• Problem solving

Ways to modify the game.
• Should not need to be modified.
• Can just take turns placing discs.
• Can move the discs around following a path.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask:
o “Where can you place that disc?”
o “How can you block the other player?”

Tri-hexaflexagon
Before you start you will need crayons, felts or pencil crayons.
The goal of this activity is to see the magic.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY!
How to play and teach the game.
• Show the children the template.
• Ask how many sides does a piece of paper have? 2
• Show them a folded trihexaflexagon.
• Ask them how many sides it has? 2
• Give them each a trihexaflexagon and get them to go colour both sides. It is best if
each side is different: different pattern or different colours.
• When they are done show them how to fold them.
o “See how one side is pink and one side is green. If I fold it up like this, you
can only see pink, if I unfold it what should I see?”
o Now show them how to fold their own. Get them to repeat the folds until
they get their original colours back. How many folds does it take? Can you
fold it the other way?
o How many sides does it really have?

What math is hidden?
• Hexagon – a 6-edged shape made of straight lines
• This hexagon has all its edges the same length.
• Would it work if the edges were different lengths? NO
• Equilateral triangles – triangles with all edges equal in length
• 6 triangles in the hexagon

Ways to modify this activity.
• You may have to give a lot of help with getting the folds to work at first.
• Just colouring it and seeing the triangles.

Ways to correct without it feeling like criticism.
• You should not need to do much here.
• Avoid statements like, “That is wrong.”
• Ask leading questions.
o “Can we try folding it a different way?”
o “I like the colour you are using but could we use a different one for the
other side?” – if a child really wants both sides to be the same colour that
is fine too.

Tantrix
Before you start you will need.
• Tantrix sets. Half the class split into four groups.

The goal of this game is to complete as many loops as you can.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Place all the tiles face down, so you can see the numbers on the back.
• Take the first three tiles. Notice how the highest number you have is yellow. This
means when you flip those three tiles over you will make a yellow loop.
• When you have done that with three tiles add the fourth. The colour of the loop
you are trying to make is the colour of the highest number you are using to make
the loop.
• On the tenth tile you can make a yellow loop, a red loop and a blue loop but not
all at the same time.
• Any lines that touch that are not part of the loop must also be the same colour.

What to do to teach the game.
• Get the kids to sit with you on the floor.
• Put one set of Tantrix out and show them the numbers.
• Pick the first three tiles. Show them the highest number being yellow.
• Flip the tiles and make a yellow loop.
• Explain the rest of the rules.
• If need be make the loop with the fourth tile. Have them make suggestions on
where to place the tiles.
• Split them into groups. (It usually will work better if you split them than if you let
them do it themselves.)

What math is hidden?
• Problem solving
• Spatial awareness

Ways to modify the game.
• If making the other pathways one colour each is causing problems they can focus
on just the main colour.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask:
o “Is the whole path the same colour?”
o “Is the loop complete?”
o “Maybe that piece would work better in a different place?”

Rectangles
Before you start you will need.
• Pencil and eraser

The goal of this game is to complete as many rectangle grids as you can.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Place each number in a rectangle that is made up of as many individual squares as
the number indicates.
o Remember that a square is a rectangle.

What to do to teach the game.
• Put up a rectangle example grid.
• Explain the rules.
• Walk them through a few examples.
• Ask them where to place the rectangles.
• As with Hidato, let them make mistakes so they can see how to identify them and
fix them.
• As you are explaining the game point out the math (it got too busy the first week
to discuss the math with them while they were playing.)

What math is hidden?
• Squares are rectangles
• Prime numbers have to be done in a line (2, 3, 5, 7); composite numbers can be
made many ways (12 can be 1x12, 2x6, 3x4)
• Multiplication
• Logic
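
The point about prime and composite numbers can be checked with a short Python sketch. It lists every rectangle shape (width by height) that covers a given number of squares; the function name rectangle_shapes is made up for this illustration.

```python
def rectangle_shapes(number):
    """List the (width, height) pairs whose product is number.

    Only shapes with width <= height are listed, so 2x6 and 6x2
    count as the same rectangle turned on its side.
    """
    shapes = []
    width = 1
    while width * width <= number:
        if number % width == 0:
            shapes.append((width, number // width))
        width += 1
    return shapes


print(rectangle_shapes(12))  # [(1, 12), (2, 6), (3, 4)]
print(rectangle_shapes(7))   # [(1, 7)] : a prime can only be a line
print(rectangle_shapes(4))   # [(1, 4), (2, 2)] : the square counts as a rectangle
```

A number with only one shape in its list is a prime (or 1), which is why those numbers are often good places to start in a grid.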

Ways to modify the game.
• If a student needs extra help walk them through it or have them work with
someone who is getting it more easily.
• If need be they can just draw rectangles that work for most of the numbers; that is
great too.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask leading questions.
o “That rectangle seems to be in the way? Can we move one of those
rectangles?”
o “Starting with the larger numbers may help?”

Blink
Before you start you will need.
• Deck of Blink cards.

The goal of this game is to get rid of all your cards as fast as you can.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Split the deck(s) evenly between all the people. 6 per group works well.
• Lay three cards in the center.
• Each person puts three cards in their hand. As they play a card they replace it with
a card from their stack.
• A card can be played if it matches colour, number or shape.
• When you say go they do this as fast as they can.
• Make sure you lose.
• Help anyone that is falling too far behind.
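
The matching rule above can be written as a small Python check. This is a sketch only: the dictionary keys and the example cards are made up for the illustration and are not taken from the Blink deck itself.

```python
TRAITS = ("colour", "number", "shape")


def can_play(card, centre_card):
    """A card can be played if it matches the centre card on any one trait."""
    return any(card[trait] == centre_card[trait] for trait in TRAITS)


centre = {"colour": "red", "number": 3, "shape": "star"}
same_colour = {"colour": "red", "number": 1, "shape": "moon"}
no_match = {"colour": "blue", "number": 2, "shape": "moon"}

print(can_play(same_colour, centre))  # True
print(can_play(no_match, centre))     # False
```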

What to do to teach the game.
• Get the kids to sit with you on the floor.
• Show them the different colours, shapes and numbers.
• Explain the rules.
• Have everyone place a card or two from their hand to get the hang of it first if
needed.
• Smile as they laugh and play.

What math is hidden?
• Matching
• Patterns
• Numbers

Ways to modify the game.
• I have played this with very young kids and it has worked.
• If the group is having a hard time take turns instead of going super fast.
• They can help each other place cards.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• At most they will play a wrong card. Make sure they understand the rules and if
need be slow the game down by turn taking.

Instructions from Boardgames.about.com

Magic Numbers
Before you start you will need.
• Magic numbers grid sheets
• Pencil and scrap paper may be needed by some

The goal of this game is to learn a magic trick and practice adding. (shhh don’t focus on
the addition practice part!!)
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Have the other person pick any number that appears in the magic numbers
rectangles. They are NOT to tell you the number.
• They place a disc beside each rectangle that has the number in it.
• You tell them their magic number.
• To find the number you add the top left hand number in each rectangle that they
placed a disc beside.
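
For the adult leading the game, here is one way the trick could work, sketched in Python. The grid sheets themselves are not reproduced here, so the layout is an assumption: the sketch supposes the rectangles follow the usual binary design, where the rectangle whose top left number is 1, 2, 4, 8, 16 or 32 holds every number from 1 to 60 that needs that value in its sum. The function names are made up for this illustration.

```python
def build_rectangles(max_number=60):
    """Build one rectangle per power of two (an assumed layout).

    Each rectangle lists, in increasing order, the numbers whose binary
    form uses that power of two, so its first entry is the power itself.
    """
    rectangles = []
    power = 1
    while power <= max_number:
        rectangles.append([n for n in range(1, max_number + 1) if n & power])
        power *= 2
    return rectangles


def magic_number(marked_rectangles):
    """Add the top left hand number of each rectangle that has a disc."""
    return sum(rectangle[0] for rectangle in marked_rectangles)


rectangles = build_rectangles()
secret = 45  # the number the other person is thinking of
marked = [rectangle for rectangle in rectangles if secret in rectangle]
print(magic_number(marked))  # 45, because 1 + 4 + 8 + 32 = 45
```

Under this assumption the trick fails in exactly the two ways the teaching notes warn about: a missed rectangle drops one term from the sum, and an addition slip changes the total.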

What to do to teach the game.
• You will pick one student (or two) and find their magic number.
• Then you will teach them how the trick works.
• The student(s) will then go do the trick with a classmate and teach it to them.
• Repeat.
• You can go around and teach it to more students or supervise the kids as they do
it. This will need to be group dependent.
• If you are in a grade 2 class you may need to restrict the students to picking
numbers less than 20.
• If a rectangle with the number in it is missed then the trick will not work.
• If addition is not perfect then the trick won’t work.
• They may need help with the addition. You can pair students up.

What math is hidden?
• Magic
• Adding

Ways to modify the game.
• Restrict the numbers to lower numbers, not all 60.
• Help them with the addition.
• Have them work together to find the sums.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask:
o “Are you sure you placed a disc by every rectangle that has your number
in it?”
o “You may want to double check your addition?”

Magic Squares
Before you start you will need.
• 4 disks of each colour
• Game board
• Partner

The goal of this game is to place one piece of each colour in each row, each column and
each main diagonal.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Place one disk of each colour in each row and each column.
• Do the 3X3 then move on to the 4X4.
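
A finished board can be checked with a short Python sketch. The function name and the letter codes for colours are made up for this illustration. Because the goal mentions the main diagonals while the playing instructions mention only rows and columns, the diagonal check is left as an option.

```python
def magic_square_ok(board, check_diagonals=False):
    """Check that every row and column holds one disk of each colour.

    board is a square list of rows; with check_diagonals=True the two
    main diagonals must also hold one disk of each colour.
    """
    size = len(board)
    colours = set(board[0])
    if len(colours) != size:
        return False
    lines = [list(row) for row in board]
    lines += [[board[r][c] for r in range(size)] for c in range(size)]
    if check_diagonals:
        lines.append([board[i][i] for i in range(size)])
        lines.append([board[i][size - 1 - i] for i in range(size)])
    return all(set(line) == colours for line in lines)


four_by_four = [
    ["R", "G", "B", "Y"],
    ["B", "Y", "R", "G"],
    ["Y", "B", "G", "R"],
    ["G", "R", "Y", "B"],
]
print(magic_square_ok(four_by_four, check_diagonals=True))  # True
```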

What to do to teach the game.
• Show them the board, use the 3X3 one.
• Remind them that a row is horizontal and a column is vertical.
• Place a different colour in each of the top spots and let them do the rest on their
own at their places with a partner.

What math is hidden?
• Problem solving
• Strategies

Ways to modify the game.
• Each row can be the same colour.
• Each column can be the same colour.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask:
o “Do you see that colour in that row already?”
o “Would that colour be better in a different spot?”

Neighbours
Before you start you will need.
• Each child needs a neighbours sheet, pencil and eraser.
• You need the large white neighbours paper and white board markers.

The goal of this game is to place the numbers 1-4 (1-5) in the grid following the arrows
to indicate which are neighbours.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Any squares that have arrows between them are numbers that are neighbours.
Any squares without arrows between them may not be neighbouring numbers.
• 1 ↔ 2 ↔ 3 ↔ 4 so 1 is neighbours with only 2, 2 is neighbours with 1 and 3, etc.
• Only one of each number 1, 2, 3, 4, (5) may appear in each row and column.
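
The three rules above can be checked together with a short Python sketch. It is an illustration only: the function names and the way arrows are stored (as pairs of grid positions) are made up here and are not part of the game sheets. Two numbers count as neighbours when they differ by exactly 1.

```python
def arrows_for(grid):
    """Find every pair of side-by-side squares that should have an arrow."""
    size = len(grid)
    arrows = set()
    for r in range(size):
        for c in range(size):
            for dr, dc in ((0, 1), (1, 0)):  # look right and down only
                r2, c2 = r + dr, c + dc
                if r2 < size and c2 < size:
                    if abs(grid[r][c] - grid[r2][c2]) == 1:
                        arrows.add(((r, c), (r2, c2)))
    return arrows


def neighbours_solved(grid, arrows):
    """Check a filled grid against the arrows printed on the sheet."""
    size = len(grid)
    wanted = set(range(1, size + 1))
    for i in range(size):
        if set(grid[i]) != wanted:  # one of each number in the row
            return False
        if {grid[r][i] for r in range(size)} != wanted:  # and in the column
            return False
    # Arrows must sit between neighbouring numbers and nowhere else.
    return arrows_for(grid) == arrows


solution = [
    [1, 2, 3, 4],
    [3, 4, 1, 2],
    [2, 1, 4, 3],
    [4, 3, 2, 1],
]
sheet_arrows = arrows_for(solution)  # stands in for the arrows on a sheet
print(neighbours_solved(solution, sheet_arrows))  # True
```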

What to do to teach the game.
• Get the kids to sit with you on the floor.
• Draw 1 ↔ 2 ↔ 3 ↔ 4 and talk through who is neighbours with whom.
• Explain what the goal of the game is.
• Have them help you with the first one. As always let them make mistakes and help
them identify where the mistake is and help walk them through fixing it but let
them try to fix it on their own first. Ask leading questions.
• If some of them clearly get how to do it, they may go try on their own (and with
others sitting near them). One of you can circulate around the kids working on
their own while the other continues with the group.
• If there are some who need more help keep them with the other volunteer and
you can continue to work through more examples with them. As they understand
they can move to working on their own.
• Some may need to stay with one of you the whole time this game is played. If so
you may wish to trade roles part way through.
• Keep the 1 ↔ 2 ↔ 3 ↔ 4 with a note that 1 and 4 are not neighbours
somewhere where everyone will be able to see it.

What math is hidden?
• Problem solving.
• 4X4 =16, 5X5=25

Ways to modify the game.
• Some may only be able to place neighbouring numbers in the squares without
worrying about one of each number in each row and each column.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask:
o “Is that a neighbour?”
o “Does that number already appear in that row/column?”
o “Did you see the hint they gave us by telling us that that square has to be a
____________?”
o “What number that is a neighbour is missing from this row/column?”

Qwirkle
Before you start you will need.
• A smile

The goal of this game is to use up as many tiles as they can as a group.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.

How to play?
• Have a group of six or so (half of your group for each bag) work from each bag.
• Each student takes 7 tiles.
• You put out a tile.
• Pick a person to go first and then work around the circle.
• They take turns adding tiles to the collection. For each tile they place they take
another from the bag. This continues until time is up or the bag is empty.
• A row is built from tiles of the same colour or the same shape. There are 6 colours
and 6 shapes.
• On your turn you can add as many tiles to a single row as you can as long as they
follow the pattern.
• After a row has either all the colours or all the shapes it is full and cannot be built
on any more.
• You can build off of other rows as long as, for any tile you touch, you follow the
rule of that row.

What to do to teach the game.
• Have the students sit around you.
• Take some tiles out of the bag.
• Tell them the basic rules and then have them help you place several tiles until
most, if not all, of the students understand the rules.
• When everyone understands separate them into two groups and get them set up.
Do not separate them until they have all the instructions they need because once
you separate them it will be harder to get them to listen.

What math is hidden?
• Grids
• Patterns
• Matching

Ways to modify the game.
• If a student is not able to play the game the way it is designed they can be given a
collection of tiles to place on their own to make their own game.
• If matching the tiles in all directions is too much just make them match in one
direction.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask leading questions.
o “Is that the same shape or colour?”
o “Do you see one of the shape already in that row?”
o “Can you see a different place that may be better for that tile?”

Set
Before you start you will need.
• A Set deck of cards.

The goal of this game is to find out how to always win.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Do not tell the children the rules.
• Tell them that a set is a group of three cards that follows a set of rules.
• Start taking sets.
• Get them to try to find sets.
• If a set they try is not a Set tell them the card that is an issue and see if they can
tell you what card would make a set. The group can help with this.
• A set is a collection of three cards that is either the same or different across each
trait.
• There is colour (red, green or purple), number (1, 2 or 3), shading (empty, striped
or solid) and shape (diamond, squiggle or oval).
• The video may really help for this one.
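
For the adult checking sets, the "same or different across each trait" rule can be sketched in Python. The card format (a tuple of colour, number, shading, shape) and the function name are made up for this illustration. A trait breaks the rule exactly when two cards agree and the third does not.

```python
def is_set(cards):
    """Return True if three cards are all the same or all different on every trait.

    Each card is a tuple: (colour, number, shading, shape).
    """
    if len(cards) != 3:
        return False
    for trait in range(4):
        values = {card[trait] for card in cards}
        if len(values) == 2:  # two alike and one odd one out
            return False
    return True


good = [
    ("red", 1, "solid", "oval"),
    ("red", 2, "solid", "oval"),
    ("red", 3, "solid", "oval"),
]
bad = [
    ("red", 1, "solid", "oval"),
    ("red", 2, "solid", "oval"),
    ("green", 3, "solid", "oval"),
]
print(is_set(good))  # True: colour, shading and shape the same, number all different
print(is_set(bad))   # False: two red cards and one green card
```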

What to do to teach the game.
• As above.

What math is hidden?
• Problem solving
• Strategies
• Patterns
• Research

Ways to modify the game.
• Play it as a group.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask:
o “What would finish this set?”
o “Use the words Same and Different. Can we always say them?”

Suspend
Before you start you will need.
• Half the class broken into 3 groups

The goal of this game is to place all of the bars.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• Place all the bars with coloured ends in a pile.
• Place one orange ended bar on the suspend structure.
• Take turns placing a bar on the structure.
• If a piece falls add it to the pile people can take from.
• Try to place all the bars.

What to do to teach the game.
• Tell them the rules.

What math is hidden?
• Problem solving
• Strategies
• Balancing weights like you balance an equation.

Ways to modify the game.
• You should not need to but if they modify it that is fine.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Could say:
o “If it is too heavy on one side what can we do to balance it out?”
o “Is that piece going to be heavy enough/too heavy?”

Telaga Buruk
Before you start you will need.
• Game sheet
• 2 play pieces

The goal of this game is to block the other player.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.

How to play?
• Place your pieces as shown.
• Take turns moving to the open space but you must stay on the lines.
• You may not jump over another player or share a space.

What to do to teach the game.
• Show them the board set up.
• Show them a couple of example moves.
• Put them into pairs. Have them play a few times.
• Switch partners and try again.

What math is hidden?
• Problem solving
• Strategies

Ways to modify the game.
• There is not very much you can do to modify this game. It should not be needed.

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• You may only need to remind them what the instructions are for this game.

Yes there is an end to the game.

Q-Bitz
Before you start you will need.
• A smile

The goal of this game is to have the class complete as many of the pattern cards as
possible.
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to play?
• With about half the class (up to 16 children) have them work in pairs.
• Each pair gets one set of cubes and a base.
• Each pair gets a pattern card.
• Once the pattern card is complete they bring it to you, you check the pattern and
if it is correct the pair gets a new card.

What to do to teach the game.
• Have the group join you on the floor.
• Bring out a pattern card and a set of cubes and a base.
• Explain that the colour on the cube is equivalent to the black on the card.
• Go through an example of completing the pattern on the card with as much input
from the group as possible.

What math is hidden?
• Logic
• Problem solving
• Pattern recognition
• Rotation of shapes

Ways to modify the game.
• If a student is not able to follow the pattern you can complete the first row for
them.
• If a student is not able to follow the pattern and makes their own pattern you can
write the pattern that the student created.
• (If all a student is capable of is stacking the blocks or making them into a train that
can be their pattern.)

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask leading questions.
o “Would it help if we rotate that block?”
o “Is that pattern exactly the same as the one on the card?”
o “If you place the cubes directly on the pattern card would that help? Do
they match now?”
o “It may help to take all the cubes off the base and start fresh.”
o “Can you see where the card and your cubes are different?”

Hidato
Before you start you will need!
•
•

Pencil
Eraser

The goal of the game is to place all the numbers from 1-9 (or higher for bigger games).
REMEMBER ENJOYMENT AND SUCCESS ARE KEY! Help them achieve it.
How to Play?
• Numbers must make a chain that links side to side (horizontally), up and down (vertically) or diagonally.
• The starting number is bold and the last number is bold.
• If there is a square that has an X in it you cannot place a number in it and you cannot pass through it.
• Sometimes working backwards helps, especially for the bigger puzzles!

[Figure: example Hidato puzzle grid with the numbers 1, 3, 4, 6, 7 and 9 already filled in.]
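For helpers who want to check a finished puzzle quickly, the chain rule from How to Play can be sketched in a few lines of Python. This is only an illustrative sketch and is not part of the Math Play materials: the function name is_valid_hidato, the use of None for X squares, and the sample grids are all assumptions made for the example.

```python
def is_valid_hidato(grid):
    """Return True if the filled grid forms one unbroken chain 1..N.

    grid is a list of rows. Each cell holds an integer, or None for a
    blocked square (an X square). The representation is an assumption
    made for this sketch.
    """
    # Record where each number sits.
    position = {}
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is None:
                continue
            if value in position:
                return False  # a number was used twice
            position[value] = (r, c)

    # Every number from 1 up to the count of open squares must appear.
    n = len(position)
    if set(position) != set(range(1, n + 1)):
        return False

    # Consecutive numbers must touch side to side, up and down, or diagonally.
    for k in range(1, n):
        r1, c1 = position[k]
        r2, c2 = position[k + 1]
        if max(abs(r1 - r2), abs(c1 - c2)) != 1:
            return False
    return True


# Hypothetical 3 x 3 grids, not taken from the thesis puzzles.
snake = [[1, 2, 3],
         [6, 5, 4],
         [7, 8, 9]]
broken = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(is_valid_hidato(snake))   # chain is unbroken
print(is_valid_hidato(broken))  # 3 and 4 do not touch
```

In the second grid the chain breaks between 3 and 4, which sit at opposite ends of neighbouring rows. That is the same situation the leading question about whether a number will get you to the next number is meant to uncover.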


What to do to teach the game.
• Start by doing a few with them until they understand the game.
• Talk them through the reasons for placing the numbers where you do if need be, but see if they can place the numbers.
• Have them go to their places with the first puzzle.
• Have them work together to find the path (unless the teacher indicates that the student is to work alone).
• If need be, place students that are having success with others that are struggling. (Check with the teacher before making any suggestions like this.)
• Exchange solved puzzles for new ones.
• If a student is finding it very easy and wants a challenge, skip up to a harder version.

What math is hidden?
• Ordering numbers
• Writing the numbers properly
• Grid - multiplying: 3 × 3 means 9 squares, 4 × 4 means 16 squares, etc.
• What do horizontal and vertical mean?
• Ask them!!

Ways to modify the game.
• Write all the numbers in from 1 – 9 (or higher).

Ways to correct without it feeling like criticism.
• Avoid statements like, “That is wrong.”
• Ask leading questions that help them get to the answer while letting them feel like they have done it themselves.
  o “Are you sure that number goes there?”
  o “Can you think of another number that would work better there?”
  o “I think that number needs to be written a different way, what do you think?”
  o “If you put that number there will it get you to the next number?”
  o “Your path looks blocked by that number. Can you move it to a different place?”


Appendix D Rasch Material
• Rasch logit scale explanation
• Fit Statistics Infit and Outfit


Rasch Logits Scale Explanation.
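As a small numerical companion to the logit scale, the sketch below uses the standard dichotomous Rasch model, in which the probability of success depends only on the difference between person ability and item difficulty, both expressed in logits. The function name rasch_probability and the chosen gaps are assumptions for illustration only. The thesis instruments may have been analysed with a different Rasch variant.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of success under the dichotomous Rasch model.

    Both arguments are in logits. Only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Illustrative gaps between ability and difficulty, in logits.
for gap in (-2, -1, 0, 1, 2):
    p = rasch_probability(gap, 0.0)
    print(f"ability - difficulty = {gap:+d} logits, P(success) = {p:.2f}")
```

When ability equals difficulty the probability of success is 0.50. An advantage of one logit raises it to about 0.73, and an advantage of two logits to about 0.88.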


Examples of Outfit and Infit Mean Square values for various response patterns.

From https://www.rasch.org/rmt/rmt82a.htm
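To make the two statistics concrete, the sketch below computes Outfit and Infit mean squares for one person's right/wrong responses using the usual definitions for the dichotomous Rasch model: Outfit is the plain average of the squared standardized residuals, and Infit is the sum of squared residuals divided by the sum of the model variances. The function names, the item difficulties and the response patterns are hypothetical and are not taken from the thesis data. The person's ability is treated as known here, whereas in practice it is estimated.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of success under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def fit_mean_squares(responses, ability, difficulties):
    """Return (outfit, infit) mean squares for one person's 0/1 responses."""
    squared_z = []       # squared standardized residuals, one per item
    sum_sq_resid = 0.0   # numerator of the Infit mean square
    sum_var = 0.0        # denominator of the Infit mean square
    for x, d in zip(responses, difficulties):
        p = rasch_probability(ability, d)
        var = p * (1.0 - p)          # model variance of the response
        resid = x - p                # observed minus expected
        squared_z.append(resid ** 2 / var)
        sum_sq_resid += resid ** 2
        sum_var += var
    outfit = sum(squared_z) / len(squared_z)
    infit = sum_sq_resid / sum_var
    return outfit, infit


# Hypothetical item difficulties (logits), easiest first, and a person at 0.
difficulties = [-3.0, -1.0, 0.0, 1.0, 3.0]
expected_pattern = [1, 1, 1, 0, 0]   # succeeds on easy items, fails hard ones
careless_pattern = [0, 1, 1, 0, 0]   # unexpected miss on the easiest item

print(fit_mean_squares(expected_pattern, 0.0, difficulties))
print(fit_mean_squares(careless_pattern, 0.0, difficulties))
```

With these made-up values the orderly pattern gives both mean squares below 1, while the unexpected miss on the very easy item pushes both above 1 and inflates Outfit far more than Infit. This reflects Outfit's sensitivity to surprising responses on items far from the person's level.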