GENDER AND ABORIGINAL DIFFERENCES IN ELEMENTARY SCHOOL STUDENTS' CBM READING, WRITING, AND DIBELS SCORES

by

Shelley Wiltshire

B.H.E., University of British Columbia, 1984

THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF EDUCATION in EDUCATIONAL COUNSELLING

THE UNIVERSITY OF NORTHERN BRITISH COLUMBIA

July 2004

© Shelley Wiltshire, 2004

ABSTRACT

This study uses Curriculum Based Measurement data of students' reading and writing fluency and Dynamic Indicators of Basic Early Literacy Skills data to investigate the relationship between scores on these achievement measures, the gender of the students, and the aboriginal status of the students. The sample consists of 2272 elementary students randomly selected for the Prince George School District norming project. The measurements were collected by teachers and other school district staff in each elementary school during October, January, and April of the 2002/2003 school year. Scores were analyzed using a 2 × 2 analysis of variance (gender by aboriginal status). Gender, aboriginal status, and the dependent variables of reading and written expression scores were analyzed for each of Grades 1 through 7. Gender, aboriginal status, and the dependent variables of pre-reading and early reading skills scores were analyzed for Kindergarten and Grade 1. Repeated measures for October, January, and April were compared for trends in reading and written expression fluency and pre-literacy skills over the school year.
Although male students' mean scores in reading, writing, and early literacy skills were lower than female students' mean scores at every grade level and every testing period, the only consistent statistically significant gender effect was found in written expression fluency, and only for Grades 2 to 7. A consistent statistically significant aboriginal status effect was found only for reading fluency from Grades 1 through 7 and for early literacy skills for Kindergarten and Grade 1. Aboriginal students' mean scores in early literacy skills and in reading and writing fluency were lower than non-aboriginal students' mean scores at every grade level and testing period except the Grade 5 January testing for all variables.

TABLE OF CONTENTS

Abstract ii
Table of Contents iii
List of Tables v
List of Figures vii
Acknowledgement viii

CHAPTER ONE: INTRODUCTION 1
  Description of School District 2
  Research Questions 3
  Hypotheses 3

CHAPTER TWO: LITERATURE REVIEW 6
  Literacy as Measured by CBM 6
    Reasons for Using CBM 6
    Limitations of CBM 8
    Reliability and Validity of CBM as a Measure of Literacy 10
  Literacy as Measured by DIBELS 14
    What is DIBELS? 14
    Uses of DIBELS 15
    Reliability and Validity of DIBELS as a Measure of Literacy 15
    Modifications of DIBELS Measures 18
  Gender Studies Relating to Literacy 19
    Foundation Skills Assessment Results 19
    Studies of Specific Gender Differences 25
  Aboriginal Studies Relating to Literacy 28
    Foundation Skills Assessment Results 29
    Studies of Specific Aboriginal Differences 34

CHAPTER THREE: METHODS 36
  Participants 36
  Instruments 37
  Procedures 40

CHAPTER FOUR: RESULTS 42
  Results of the DIBELS Data Analysis 42
  Results of the CBM Data Analysis 49
  Effect Sizes and Trends 63
    Effect Sizes for the DIBELS Data 63
    Effect Sizes for the CBM Data 67
    DIBELS and CBM Data Trends 77

CHAPTER FIVE: DISCUSSION AND CONCLUSIONS 86
  Summary and Conclusions 86
  Limitations of the Study 88
  Implications for Further Research 89
  Implications for Practice 91

REFERENCES 93

Appendix: Approval Forms 96
  Ethics Approval Form 97

LIST OF TABLES

Table 1: Provincial FSA Trends of Percentage of Students Meeting or Exceeding Grade Expectations by Gender 21
Table 2: Prince George FSA Trends of Percentage of Students Meeting or Exceeding Grade Expectations by Gender 24
Table 3: Provincial FSA Trends of Percentage of Students Meeting or Exceeding Grade Expectations by Aboriginal Status 31
Table 4: Prince George FSA Trends of Percentage of Students Meeting or Exceeding Grade Expectations by Aboriginal Status 32
Table 5: Schedule of Testing Periods for DIBELS Measures 38
Table 6: Description of DIBELS and CBM Variables 39
Table 7: Descriptive Statistics for ISF, LNF, PSF, NWF, and ORF 43
Table 8: Analysis of Variance for Gender and Aboriginal Differences in ISF, LNF, PSF, NWF, and ORF 48
Table 9: Descriptive Statistics for WRC 52
Table 10: Descriptive Statistics for WSC 53
Table 11: Descriptive Statistics for TWW 54
Table 12: Analysis of Variance for Gender and Aboriginal Differences in WRC 60
Table 13: Analysis of Variance for Gender and Aboriginal Differences in WSC 61
Table 14: Analysis of Variance for Gender and Aboriginal Differences in TWW 62
Table 15: Effect Sizes for ISF, LNF, PSF, NWF, and ORF by Gender 64
Table 16: Effect Sizes for ISF, LNF, PSF, NWF, and ORF by Aboriginal Status 66
Table 17: Effect Sizes for WRC by Gender 69
Table 18: Effect Sizes for WSC by Gender 70
Table 19: Effect Sizes for TWW by Gender 71
Table 20: Effect Sizes for WRC by Aboriginal Status 74
Table 21: Effect Sizes for WSC by Aboriginal Status 75
Table 22: Effect Sizes for TWW by Aboriginal Status 76

LIST OF FIGURES

Figure 1: Histogram of Scores and their Frequency for the October Testing of LNF at the Kindergarten Level 45
Figure 2: Histogram of Scores and their Frequency for the April Testing of WSC at the Grade Four Level 50
Figure 3: Line Graph of Estimated Marginal Means for Male/Female and Aboriginal/Non-aboriginal for the Grade Two October Testing of WRC 56
Figure 4: Line Graph of Median Effect Sizes for Reading and Writing by Gender 78
Figure 5: Line Graph of Median Effect Sizes for Reading and Writing by Aboriginal Status 80
Figure 6: Line Graph of Median of the Three Means Within Grade Scores for WRC for Both Gender and Aboriginal Status Groups 81
Figure 7: Line Graph of Median of the Three Means Within Grade Scores for WSC for Both Gender and Aboriginal Status Groups 82
Figure 8: Line Graph of Median WRC Effect Sizes by Gender and Aboriginal Status 83
Figure 9: Line Graph of Median WSC Effect Sizes by Gender and Aboriginal Status 85

ACKNOWLEDGEMENT

I would like to thank Dr. Peter MacMillan, Associate Professor at UNBC, for his endless support and teaching throughout the thesis process. Without his help this thesis would definitely not have been possible. I would also like to thank Catherine McGregor, Lecturer at UNBC and PhD candidate, for her support and assistance with my defense and the final document. I would like to thank Dr. Robert Tait, Dean of Graduate Studies, and the other members of my committee, Dr. Carl Anserello, School Services Administrator for School District #57, and Paul Michel, Adjunct Professor in First Nations Studies at UNBC and Coordinator for the First Nations Centre, for their support and encouragement. I would also like to thank the Aboriginal Education Board of School District #57 for their support of this research.

I would like to dedicate this thesis to my husband Ed and my daughters Savannah and Sian. Without their constant patience, love, support, and encouragement I would never have been able to complete this thesis or the masters program.

CHAPTER ONE: INTRODUCTION

The issue of literacy and the factors that influence the success or failure of students is an increasingly examined and discussed topic. The relative importance of gender to levels of achievement is being discussed and debated at both the school and university level. A number of academic indicators point to differences between boys and girls with respect to literacy. In the Foundation Skills Assessment results and provincial examination results, girls are outscoring boys in numerous areas, including literacy.
Hedekar's (1997) study using Curriculum Based Measurement (CBM) also found definite gender differences in literacy. Another issue for educators is literacy among aboriginal students in British Columbia. There has been a long history of achievement differences between aboriginal and non-aboriginal students. Again, indicators such as the Foundation Skills Assessment and provincial examination results highlight the need to examine these differences so that they can be addressed.

The Prince George School District in British Columbia has identified a high number of students lacking in early literacy skills, particularly males and aboriginals, and has made improving student literacy, particularly in these two groups, a priority (School District No. 57, Prince George, 2003). Also of concern to the Prince George School District is the fact that the Foundation Skills Assessment results indicate the gender gap favouring females is larger in the Prince George School District than it is at the provincial level. The two test instruments the school district is using to assess these early literacy skills are Curriculum Based Measurement (CBM) and Dynamic Indicators of Basic Early Literacy Skills (DIBELS). In the Prince George School District, CBM and DIBELS are being used to measure the curriculum being taught and growth in student learning against previously established district norms. Based on the Foundation Skills Assessment results, provincial government exam results, Hedekar's (1997) previous results, and the fact that the Prince George School District has identified gender and aboriginal differences in literacy as a concern, my research study examines the CBM reading and writing scores and the DIBELS scores for approximately 2200 students in order to analyze the effects of gender and aboriginal status on the acquisition of early literacy skills in the Prince George School District.

Curriculum Based Measurement (CBM) is a series of short, informal achievement tests that are standardized yet based on the curriculum being used in the classroom (Scott & Weishaar, 2003). The CBM measures of literacy used in this study include Words Read Correctly (WRC), Words Spelled Correctly (WSC), and Total Words Written (TWW). Dynamic Indicators of Basic Early Literacy Skills (DIBELS) are a standardized, individually administered set of tests that measure pre-reading and early reading skills (University of Oregon (a), n.d.). The DIBELS measures used in this study include Initial Sound Fluency (ISF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), and Oral Reading Fluency (ORF).

Description of School District

The Prince George School District (SD #57) has been using CBM as an assessment tool since 1996. School District #57 is located in the central interior of British Columbia and covers an area of almost 52,000 square kilometres. The communities covered by this district include Prince George, Mackenzie, McBride, Valemount, and Hixon, as well as many small settlements in between. Because of the vast area covered by the school district, the schools are located in a variety of settings, including inner city, suburban, and rural. In 2003 there were approximately 16,400 students in the district, of whom approximately 18% were aboriginal. There are 37 elementary schools in the district.

Research Questions

1.
Is there a gender difference in reading or writing fluency of elementary school students based on CBM/DIBELS measures? Is this gender difference consistent throughout the grades?

2. Is there a difference in reading or writing fluency for aboriginal elementary school students versus non-aboriginal elementary school students based on CBM/DIBELS measures? Is this effect consistent across all grade levels?

3. Is there an interaction between gender and aboriginal status for elementary school students when examining gender and aboriginal status differences in reading or writing fluency, based on CBM/DIBELS measures?

Hypotheses

The following statistical hypotheses were generated by the research questions and tested during this study.

1. Within a given grade level the mean reading fluency (as measured by the variable Words Read Correctly) of male students equals that of female students.

a) H0: μ(r)gm − μ(r)gf = 0
   H1: μ(r)gm − μ(r)gf ≠ 0

where r refers to reading fluency as measured by Words Read Correctly, g refers to Grades 1 through 7, and m and f refer to male and female respectively. Writing fluency is measured by two highly correlated variables, Words Spelled Correctly (WSC) and Total Words Written (TWW).

b) H0: μ(w)gm − μ(w)gf = 0
   H1: μ(w)gm − μ(w)gf ≠ 0

where g, m, and f are defined as above and where w refers first to a test with the variable WSC and then with the variable TWW.

2. To investigate the second research question the means of the reading and writing fluency variables were compared for aboriginal elementary students and non-aboriginal elementary students.

a) H0: μ(r)gab − μ(r)gnab = 0
   H1: μ(r)gab − μ(r)gnab ≠ 0

where ab refers to aboriginal and nab refers to non-aboriginal and the other symbols are defined as previously stated.

b) H0: μ(w)gab − μ(w)gnab = 0
   H1: μ(w)gab − μ(w)gnab ≠ 0

where w refers first to a test with the variable WSC and then with the variable TWW. Other symbols are defined as previously stated.

3. Finally, to investigate whether there is any interaction between gender and aboriginal status, the means for reading and writing fluency for both gender groups and aboriginal/non-aboriginal groups were compared.

a) H0: μ(r)gab×gen − μ(r)gab − μ(r)ggen + μ(r)g = 0
   H1: μ(r)gab×gen − μ(r)gab − μ(r)ggen + μ(r)g ≠ 0

where gen refers to gender. Other symbols are defined as previously stated.

b) H0: μ(w)gab×gen − μ(w)gab − μ(w)ggen + μ(w)g = 0
   H1: μ(w)gab×gen − μ(w)gab − μ(w)ggen + μ(w)g ≠ 0

The symbols are defined as previously stated.

CHAPTER TWO: LITERATURE REVIEW

This chapter consists of four sections in which I discuss literature relevant to this study. In the first section I investigate literacy as measured by Curriculum Based Measurement (CBM). In the second section I discuss literacy as measured by Dynamic Indicators of Basic Early Literacy Skills (DIBELS). In the third section I review gender studies relating to reading and writing, and in the last section I examine aboriginal studies and issues relating to reading and writing.

Literacy as Measured by CBM

Curriculum Based Measurement initially developed in the area of special education. It was developed with the intention of testing a special education intervention model that would formatively evaluate teacher instruction in order to improve its effectiveness (Deno, 2003).
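As an aside before the review continues: each of the Chapter One hypotheses is tested with a two-way (gender by aboriginal status) analysis of variance on one CBM or DIBELS variable at one grade level. The following is a minimal computational sketch of that design in Python using the statsmodels library; the column names and the toy scores are hypothetical stand-ins, not the actual district data set.

```python
# Minimal sketch of the 2 x 2 (gender by aboriginal status) ANOVA used to
# test the Chapter One hypotheses. All column names and scores below are
# hypothetical illustrations, not the district data.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Toy words-read-correctly (WRC) scores for one grade and testing period.
data = pd.DataFrame({
    "wrc":        [62, 71, 55, 80, 45, 58, 40, 66],
    "gender":     ["m", "f", "m", "f", "m", "f", "m", "f"],
    "aboriginal": ["no", "no", "no", "no", "yes", "yes", "yes", "yes"],
})

# Main effects plus the gender-by-aboriginal-status interaction,
# mirroring hypotheses 1, 2, and 3 for the reading fluency variable.
model = smf.ols("wrc ~ C(gender) * C(aboriginal)", data=data).fit()
print(anova_lm(model, typ=2))  # one F test per effect
```

The resulting table reports an F test for each main effect and one for the interaction term, corresponding to hypotheses 1, 2, and 3 respectively.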
In the 1980s there was a need for an alternative measurement system to commercial standardized achievement tests and teacher observations. This alternative would provide a database to evaluate students' overall proficiency in basic skills and to assist teachers in their instructional planning, with the end goal of improving student achievement (Fuchs & Fuchs, 1991). CBM has emerged as a set of procedures used by teachers to evaluate student progress and instructional effectiveness (Deno, 1985). Although CBM was initially developed and tested for reliability and validity for testing reading skills, it is also used to reliably and validly test written expression and spelling skills (Deno, 1985).

Reasons for Using CBM

There are a variety of reasons for using CBM as an alternative measurement system, the first of which is the validity and reliability of CBM measures. Due to the standardized nature of CBM, a large number of reliability and validity studies have been conducted (Deno, 1992). Deno (1985) also reported that all CBM measures are highly correlated with performance on standardized, norm-referenced tests, with a particularly close relationship between reading aloud from text and comprehension scores.

A second reason for using CBM as an alternative measure is the improved communication of information that CBM can provide. The graphical images that can be produced using data collected by CBM procedures are clear and simple to interpret, making it easy for teachers, parents, and students to see individual levels of performance and rates of change or growth in achievement over time. These levels can then be referenced to the student's individual goals, to the instructional program, and to peers in the class, the school, or the district (Fuchs & Fuchs, 1991).

A third reason for using CBM is its flexibility. Although CBM procedures are standardized, teachers have the freedom to identify the curriculum materials to be used in the testing as well as the level within that curriculum that they want to be mastered by the end of the year (Fuchs & Fuchs, 1991). This allows the individual needs and interests of the teacher or school to be met.

With the current situation in education of cutbacks in funding and increased curricular demands, cost and time effectiveness is the fourth reason to use CBM. The fact that additional testing materials do not need to be purchased to use CBM is a cost saving. Commercial standardized tests carry the hidden expense of procedures that yield a norm-referenced score, which gives little information about the individual student's performance in the local curriculum (Deno, 1985). The time saving in administering CBM is crucial as well. Due to the multiple sampling approach of CBM, performance samples are generally 1 to 3 minutes long, whereas the time to administer standardized achievement tests is generally an hour or more (Deno, 2003). Another consideration for time and cost effectiveness is the amount of time and money required to train teachers or others to administer the CBM test samples. According to Deno (2003) it is easy for professionals, paraprofessionals, and parents to learn to use CBM and still obtain reliable data.

The final reason for using CBM is that research has shown that when CBM is used to monitor the effectiveness of an instructional program and formulate improvements, the quality of instruction as well as student achievement goes up (Fuchs & Fuchs, 1991).
The reasons why a school or district might choose to use CBM are thus twofold: it not only provides an assessment tool but can assist in improving the level of both instruction and student achievement. Despite the benefits of using CBM, I could not find any information in the literature to indicate that CBM is being widely used in school districts in British Columbia or Canada.

Limitations of CBM

While CBM may initially be viewed as an answer to achievement measurement concerns, there are some problematic issues that need to be identified. Although, as previously mentioned, there is a strong correlation between CBM measures for reading and reading comprehension scores, Deno (1985) cautions that reading aloud from text may be detached from comprehension, as in the case of "word callers", students who read fluently but do not understand what they read. A study by Hamilton and Shinn (2003) investigates the question of whether or not "word callers" read fluently but lack comprehension by comparing the oral reading and comprehension skills of teacher-identified "word callers" with those of peers who were identified by the teacher as fluent readers with good comprehension skills. The 66 students involved in the study were all in Grade 3 and were administered four reading tests: the Curriculum-Based Measurement of Reading (R-CBM), the Curriculum-Based Measurement-Maze (CBM-Maze), a comprehensive oral question answering test (CQT), and the Passage Comprehension subtest of the Woodcock Reading Mastery Test (WRMT-PC). The results of the study indicate that students in the "word caller" group not only comprehended significantly (p < .001) less well than their peers who read and comprehend well, but the "word callers" also had significantly (p < .001) lower oral reading fluency scores (Hamilton & Shinn, 2003). Another finding of Hamilton and Shinn's (2003) study was that teachers over-predicted the reading fluency scores of both groups of students, which brings into question the accuracy of teachers' judgements regarding students' reading fluency skills. This study seems to indicate that teachers' judgements about the reading fluency of the students they identify as "word callers" may not be accurate, which strengthens the argument that CBM measures of reading fluency are valid measures of reading comprehension.

Another problem arises in the area of training. Deno (1985) states that teachers must be carefully trained and extremely efficient in using CBM if it is to remain a time-effective approach to measurement of achievement. In another paper, Deno (2003) identifies time as the most important barrier to teachers in implementing the measurement procedures.

Lastly, the question of the most effective use of CBM needs to be addressed. As far as formative evaluation of individual students is concerned, CBM is most effective in settings where special education teachers have the time and skills to chart the progress of individual students and then adjust the student's program in response to the data these charts provide (Deno, 2003). With the inclusion of students with disabilities in regular classrooms and increases in class sizes, it is unlikely that CBM will be as effective at improving student achievement in these settings as in more individualized settings. However, CBM can still be used as an effective assessment tool to measure the progress of students and the curriculum being taught in the classroom.
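To make the charting-and-adjusting cycle described above concrete, one common way of summarizing repeated CBM probes is to fit a least-squares trend line to a student's weekly scores and compare the observed growth rate against the rate needed to reach a year-end goal. The sketch below illustrates this with entirely hypothetical numbers; the weekly scores, the 90-WRC goal, and the 30-week horizon are assumptions for the example only.

```python
# Sketch of CBM progress monitoring: summarize weekly words-read-correctly
# (WRC) probes with a fitted growth slope and compare it to an aimline.
# All numbers here are hypothetical.
import numpy as np

weeks = np.arange(1, 9)                           # eight weekly probes
wrc = np.array([41, 44, 43, 47, 50, 49, 53, 55])  # one student's scores

slope, intercept = np.polyfit(weeks, wrc, 1)      # observed growth per week
goal_slope = (90 - wrc[0]) / 30                   # growth needed to reach
                                                  # 90 WRC in 30 weeks

print(f"observed growth: {slope:.2f} WRC/week; needed: {goal_slope:.2f}")
if slope < goal_slope:
    print("trend is below the aimline -- consider adjusting instruction")
```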
Reliability and Validity of CBM as a Measure of Literacy

There are many aspects and modes of literacy; however, for the scope of this study, literacy is defined as reading and writing fluency. The reliability and validity of CBM as a measure of literacy, specifically reading, writing, and spelling, has been widely researched and will also be addressed here.

The criterion validity of performance on some of the CBM tasks, specifically cloze procedures (supplying words deleted from text), word meanings, and reading aloud, is examined by Deno (1985) with respect to commercial standardized norm-referenced tests. The results indicate that all CBM measures except for word meanings are highly correlated (.70 to .95) with standardized norm-referenced tests such as the Literal and Inferential subtests of the Stanford Achievement Test and the Woodcock Reading Mastery Test (Deno, 1985). Similar results for validity were found for the written expression and spelling measures of CBM.

In a review by Good and Jefferson (1998), criterion-related validity coefficients were examined for the CBM measures of oral reading fluency passages and correct writing sequences with story starters. The tests against which the CBM measures were validated were published, norm-referenced or criterion-referenced tests, or tests from a published basal reader series. The results indicate that the median validity coefficients for the CBM reading measure for Grades 2 to 6 range from .62 to .73, which is within the acceptable range of concurrent, criterion-related validity coefficients of .60 to .80 (Good & Jefferson, 1998). The results for the CBM writing measure are not quite as impressive, with median validity coefficients for Grades 2 to 11 ranging from .48 to .68 (Good & Jefferson, 1998). This provides less support for the construct validity of the CBM writing measure.

In another study, in 1992, Shinn, Good, Knutson, Tilly, and Collins (as cited in Good & Jefferson, 1998) used multiple reading measures to test the construct validity of these measures with respect to reading comprehension. This study involves Grade 3 students and Grade 5 students. For the Grade 3 students the construct examined is reading competence, and the CBM reading probes tested indicate construct validity coefficients of .88 to .90 (Good & Jefferson, 1998). For the Grade 5 students the constructs examined are decoding and comprehension, and the CBM reading probes tested indicate construct validity coefficients of .74 to .90 (Good & Jefferson, 1998).

In a study conducted in School District 57 (Prince George), Fewster and MacMillan (2002) found that school-based information, such as teacher-awarded grades, adds to the validity of CBM. Their study examined the validity of elementary school CBM scores to predict grades in future courses that are reading and writing intensive and to predict program placements. Their results indicate CBM measures of words read correctly and words spelled correctly are significant predictors of future grades, particularly for words read correctly and at the Grade 8 level. The same validity is not indicated for WSC as a measure of overall writing competency. A study by Gansle, Noell, VanDerHeyden, Naquin, and Slider (2003) looks at the need for a variety of other new writing measures beyond TWW or correct word sequences as an indication of students' written skill levels.
Their study includes third and fourth grade students from one school who completed two 3-minute writing probes on two consecutive days (Gansle et al., 2003). Students were also ranked in terms of their writing skills by their classroom teacher, and standardized criterion test scores were analyzed, specifically the Iowa Test of Basic Skills (ITBS) for the third grade students and the Louisiana Educational Assessment Program (LEAP) for the fourth grade students. The CBM measure of TWW is one of a number of predictor variables, including parts of speech, long words, words spelled correctly, total punctuation marks, correct punctuation marks, correct capitalization, complete sentences, words in complete sentences, words in correct sequence, sentence fragments, simple sentences, and computer-scored variables. These predictor variables are measured to determine the best predictor of the three criterion variable scores. The largest correlations in this study between predictor variables and the criterion variable of ITBS are for the variables of correct punctuation marks and words in correct sequence, which had correlation coefficients ranging from .35 to .44 (Gansle et al., 2003). For correlations between the predictor variables and the criterion variable of LEAP, the results are highest for number of verbs, .33, and the computer-scored variable of vocabulary complexity, .24. The largest correlations between the predictor variables and the criterion variable of classroom teacher rankings are for the variables of words in correct sequence, .37, and correct punctuation marks, .35. These results indicate that TWW is not the best predictor of written skills as measured by the criterion variables of teacher rankings, the Iowa Test of Basic Skills, and the Louisiana Educational Assessment Program.

The reliability of CBM measures is inherent in the very nature of the frequent collection of data to assess growth in the skills being measured. Traditional achievement tests that are norm-referenced or grade-equivalent scored do not reliably reveal an individual student's growth in reading proficiency (Deno, 1985). With CBM assessment it is possible to repeat data collection frequently with the same sample of students and with a larger number of students than would be possible with other, more traditional assessment tools. In addition, it was found that reading aloud from text reliably discriminated which students were in special education programs and which ones were not (Deno, 1985). Simple data such as words read correctly can reliably be used to monitor growth in reading. The study by Fewster and MacMillan (2002) also shows CBM reliably predicts program placements, especially for honours programs.

In a previous study done by Hedekar (1997) in the Prince George school district, reliability and validity coefficients were reported for the CBM measures of WRC, WSC, and TWW. For Hedekar's (1997) study the Pearson correlation coefficients indicate a high correlation between WSC and TWW, .91 < r < .99, and a low to medium correlation between WRC and TWW, .31 < r < .48. The reliability across the 6 month testing period for Hedekar's (1997) study also shows stability, with coefficients for WRC ranging from .77 to .86 and coefficients for TWW ranging from .48 to .62. The inter-rater reliability for Hedekar's (1997) study was also examined and was found to be very high, with correlations of .97 to .99 between the scores given by different raters on the same tests.
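The coefficients quoted above are ordinary Pearson correlations computed over paired scores. A minimal sketch of the two reliability checks, inter-rater agreement and stability across the testing period, is shown below; the scores are hypothetical stand-ins for the actual probe data.

```python
# Sketch of the two reliability checks reported above, computed as
# Pearson correlations with scipy. All scores are hypothetical.
from scipy.stats import pearsonr

# Inter-rater reliability: two raters scoring the same writing probes.
rater_a = [23, 31, 18, 40, 27, 35]
rater_b = [24, 30, 18, 41, 26, 35]
r_raters, _ = pearsonr(rater_a, rater_b)

# Stability: the same students' WRC scores in October and in April.
october = [45, 60, 38, 72, 55, 49]
april   = [58, 74, 46, 88, 70, 60]
r_stability, _ = pearsonr(october, april)

print(f"inter-rater r = {r_raters:.2f}, stability r = {r_stability:.2f}")
```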
In the norming project for the current study in the Prince George school district, the Pearson correlation coefficients also indicate a high correlation between WSC and TWW, .94 < r < .99, and a low to medium correlation between WRC and TWW, .21 < r < .49 (Fewster, Fortier, Foulds, MacMillan, Struthers, & Walraven, 2003). The reliability across the 6 month testing period for the norming project also shows stability, with coefficients for WRC ranging from .81 to .86 and coefficients for TWW ranging from .58 to .65 (Fewster et al., 2003).

Literacy as Measured by DIBELS

Measuring literacy at the Kindergarten and Grade 1 level is a difficult task. The challenge is to find measures that will assess students' literacy through reading and writing skills when students have not yet acquired these skills. DIBELS is a logical measurement system because it tests early literacy skills in the grades where pre-reading, pre-writing, early reading, and early writing skills are initially taught. Testing at the Kindergarten and Grade 1 level using DIBELS measures "provides a reliable and valid indicator of children's progress toward the acquisition of early literacy skills" (Elliot, Lee, & Tollefson, 2001, p. 35).

The DIBELS assessments are a standardized set of short, individually administered measures that assess three of the essential early literacy domains: phonological awareness, alphabetic principle, and fluency with connected text (University of Oregon (a), n.d.). The original DIBELS measures are a set of 10 that were initially designed as downward extensions of the CBM reading probes (Elliot et al., 2001). The DIBELS measures used in my research include: Letter Naming Fluency (LNF), an indicator of risk for difficulty in achieving early literacy benchmark goals; Initial Sound Fluency (ISF) and Phoneme Segmentation Fluency (PSF), used to assess phonological awareness; Nonsense Word Fluency (NWF), used to assess alphabetic principle; and Oral Reading Fluency (ORF), used to assess fluency with connected text (University of Oregon (b), n.d.). The measures are intended to be used together in order to be empirically valid and reliable, as supported by Good, Kaminski, Smith, Simmons, Kame'enui, and Wallin (2003), who outlined that at the Kindergarten level instruction on phonemic awareness, especially blending and segmentation, needs to be explicitly integrated with the sounds of letters to ensure reading development later on.

Uses of DIBELS

DIBELS is a standardized assessment system to test precursor skills for early literacy. The DIBELS assessment is administered three times a year and can be used with students from Kindergarten through to Grade 3. It provides a series of benchmarks for each measure at each grade level. The data produced by DIBELS measures have numerous uses. The data can be used to assess the quality of instruction and supplemental programs, school outcomes, professional development, curriculum and supplemental materials adequacy and appropriateness, and additional intervention, which are all elements of an effective beginning reading program (Good et al., 2003). Another benefit of using DIBELS as an assessment tool is the gains that children may make from being exposed to these skills (Elliot et al., 2001).

Reliability and Validity of DIBELS as a Measure of Literacy

The question of whether or not DIBELS measures emerging literacy skills needs to be addressed. The study by Elliot et al.
(2001) addresses this question by correlating average DIBELS scores with a variety of achievement-related criterion measures such as the Woodcock-Johnson Psycho-Educational Achievement Battery-Revised (WJ-R) Broad Reading and Skills clusters, the Test of Phonological Awareness (TOPA), the Teacher Rating Questionnaire (TRQ), the Developing Skills Checklist (DSC), and the Kaufman Brief Intelligence Test (K-BIT). The results generally support previous research on DIBELS showing that pre-literacy abilities in Kindergarten are associated with later reading fluency (Elliot et al., 2001). The use of DIBELS over the previously mentioned standardized tests is preferred because, in addition to demonstrated technical adequacy, the DIBELS measures are more practical: they are more easily administered and repeated, more easily adapted to curriculum, more easily scored, and can be used with minimal training and materials (Elliot et al., 2001).

Having established the effectiveness of DIBELS as an appropriate measure of pre-literacy skills, also of significant importance to this study is the compatibility of DIBELS with CBM measures. The DIBELS measures were originally developed as extensions of the CBM measures, and so a discussion of the correlation between the two systems of assessment is essential (Elliot et al., 2001). The test for LNF asks students to name as many letters as they can in one minute from a random presentation of upper- and lower-case letters. The LNF measure is a standardized measure used to assess the risk of not achieving early literacy benchmark goals in Kindergarten and has a predictive validity of .71 with the Grade 1 CBM Oral Reading Fluency (ORF) measure (Good, Wallin, Simmons, Kame'enui, & Kaminski, 2002). The ISF measure tests students' ability to identify and produce the beginning sound of an orally and pictorially presented word. The predictive validity of ISF with the CBM Oral Reading Fluency (ORF) measure taken in the spring of Grade 1 is .45 (Good et al., 2002). The PSF measure is used from the winter of Kindergarten through to the middle of Grade 1 and assesses the students' ability to fluently segment three- and four-phoneme words into their individual phonemes. The PSF assessed in the spring of Kindergarten has a predictive validity of .62 with the spring of Grade 1 CBM ORF (Good et al., 2002). The NWF measure uses a list of nonsense words for which the student has to either read each word or produce the letter sounds of each word in one minute. The predictive validity of NWF in January of Grade 1 with the CBM ORF in May of Grade 1 is .82, and with the CBM ORF in May of Grade 2 it is .66 (Good et al., 2002). The DIBELS Oral Reading Fluency (DORF) assessment is a set of passages used to assess oral reading fluency from Grade 1 through to Grade 3 and has a median concurrent validity of .95 with the Test of Reading Fluency (TORF), which is a version of the CBM ORF (Good et al., 2002).

The norming project for the current study in the Prince George school district examined correlations between the four variables (ISF, LNF, PSF, and NWF) tested at the Kindergarten level (Fewster et al., 2003).
The Pearson correlation coefficients for ISF with the other three variables range from .426 to .593, the coefficients for LNF with the other three variables range from .342 to .708, the coefficients for PSF with the other three variables range from .342 to .593, and the coefficients for NWF with the other three variables range from .446 to .708 (Fewster et al., 2003). The reliability across the 3 month testing period for the norming project also shows stability, with coefficients for three of the Kindergarten test variables being .687 for PSF, .695 for ISF, and .741 for NWF. The reliability across the 6 month testing period for the norming project shows stability with the coefficient for the fourth Kindergarten test variable, LNF, being .649.

The norming project also examined correlations between the seven variables (PSF, NWF, LNF, ORF, WRC, TWW, and WSC) at the Grade 1 level. The Pearson correlation coefficients for PSF with the other six variables range from .231 to .559, for NWF the coefficients range from .428 to .821, for LNF the coefficients range from .422 to .738, for ORF the coefficients range from .231 to .925, for WRC the coefficients range from .309 to .925, for TWW the coefficients range from .413 to .944, and for WSC the coefficients range from .385 to .944 (Fewster et al., 2003). The Grade 1 results for the norming project also indicate reliability across the testing period for the variables that were tested more than once, specifically PSF, NWF, and ORF. The Pearson correlation coefficients for these three variables over the testing period are .706 for PSF, .645 for NWF, and .903 for ORF (Fewster et al., 2003). In summary, the measures used in the Prince George norming project are good, reliable measures of early literacy skills in the Prince George school district. These measures comprise the data being analyzed in this study.

Modifications of DIBELS Measures

One study by Elliot et al. (2001) looked at modifying the DIBELS measures and investigating their technical adequacy for identifying Kindergarten children at risk for reading failure. The measures modified in their study are PSF and ISF, which are renamed Phoneme Segmentation Ability (PSA) and Initial Sound Ability (ISA), respectively, to differentiate these modified measures from the original DIBELS measures, because the modified measures stress the measurement of accuracy instead of the measurement of fluency. The experimental measure of Sound Naming Fluency (SNF) is also included in the study done by Elliot et al. (2001) because letter-sound connections have been taught and measured extensively with children in Kindergarten. The results of the study indicate that initial support for SNF is positive, but additional work is needed on instrumentation and on improved training and administration of the PSA and ISA measures (Elliot et al., 2001).

Gender Studies Relating to Literacy

The existence of a gender gap in literacy with respect to reading and writing skills is without question. Numerous examples of females outperforming males can be found in assessment results across Canada. In a survey of gender differences by Gambell and Hunter (2000), provincial exam results for Quebec, British Columbia, and Saskatchewan indicated females outperform males in all literacy-based courses such as English, French, Communications, and Literature.
While these data are interesting, for the purposes of this study a closer examination of assessments for younger students is more appropriate.

Foundation Skills Assessment Results

The Foundation Skills Assessment (FSA) is a vital source of information regarding basic literacy skills at the Grade 4, 7, and 10 levels. Every year in British Columbia over 140,000 students participate in the FSA, which assesses reading comprehension, writing, and numeracy in order to provide external information about performance levels in these basic skill areas and to evaluate how well these basic skills are being taught (British Columbia Ministry of Education, 2001). The British Columbia Ministry of Education cautions that the FSA results are just a snapshot of students' basic academic skills in relation to provincial standards and should be considered in conjunction with numerous other forms of information collected by schools and districts (British Columbia Ministry of Education, 2001). For the purpose of this study only the reading and writing assessments at the Grade 4 and 7 levels will be discussed.

The reading comprehension portion of the FSA assessment consists of multiple-choice and written-response questions, and the writing component consists of one longer, extended writing task and one shorter, focused writing task (British Columbia Ministry of Education, 2001). These questions are developed from the prescribed provincial learning outcomes that outline expectations of what students in British Columbia should know and be able to do. The results are reported by placing students at one of three levels: "exceeds expectations", which means the student has fully met or is beyond the expectations of the grade level on this test; "meets expectations", which means the student meets the widely held expectations of the grade level on this test; and "not yet within expectations", which means the student does not yet have the skills to meet expectations of the grade level on this test (British Columbia Ministry of Education, 2001). Results are reported as the percentage of students who fall into each expectation category, which allows comparisons from year to year and between district and provincial levels. For discussion purposes in this study, the percentages of students who "exceed expectations" and "meet expectations" will be combined.

The FSA results at the provincial level over the last 4 years for Grade 4 reading comprehension indicate the gender gap in favour of females has remained steady at about a 6% difference in the percentage of students meeting or exceeding expectations (British Columbia Ministry of Education, 2003a). The results are presented in Table 1. The provincial Grade 4 results for writing also indicate a gender gap in favour of females, ranging from a 5% to 8% difference in the percentage of students meeting or exceeding expectations. When examining the same provincial results for students at the Grade 7 level, gaps similar to Grade 4 are present, ranging from a 5% to 7% difference in favour of females for reading comprehension. The Grade 7 results for writing indicate the gap is over twice as large as that at Grade 4, with a 13% to 18% difference in favour of females in the percentage of students meeting or exceeding expectations.
Table 1

Provincial FSA Trends of Percentage of Students Meeting or Exceeding Grade Expectations by Gender

Grade Level & Year   Reading Female   Reading Male   Writing Female   Writing Male
Grade 4
  2000                     83               77               95               88
  2001                     81               75               95               87
  2002                     83               77               96               91
  2003                     80               75               97               91
Grade 7
  2000                     84               78               88               74
  2001                     78               73               90               72
  2002                     79               74               91               78
  2003                     80               73               87               72

Another aspect worth discussing is the trend in the FSA data over the last 4 years for both female and male results at the provincial level (see Table 1). The reading comprehension results for Grade 4 females in the province over the last 4 years show a slight downward trend, going from 83% meeting or exceeding expectations in 2000 to 80% in 2003. A similar slight downward trend is evident for Grade 4 males in reading comprehension, going from 77% meeting or exceeding expectations in 2000 to 75% in 2003. The trend in the provincial Grade 4 writing results is slightly upward for both males and females, going from 95% in 2000 to 97% in 2003 for females, and from 88% in 2000 to 91% in 2003 for males. The Grade 4 female and male trends follow the overall provincial trends. The provincial Grade 7 reading comprehension results also show a slight downward trend for both genders, going from 84% in 2000 to 80% in 2003 for females, and from 78% in 2000 to 73% in 2003 for males. The provincial Grade 7 writing results for both females and males indicate a small peak of 3% to 4% in 2002 over 2000, but then decline again in 2003. The Grade 7 female and male trends follow the overall provincial trends.

For this study it is also relevant to review the FSA results for the Prince George school district, which are recorded only for 2001 to 2003. The Prince George results over the past 3 years for Grade 4 reading comprehension indicate a gender gap favouring females by a 7% to 8% difference in the percentage of students meeting or exceeding expectations, until 2003, when the gap is a 1% difference in favour of males (British Columbia Ministry of Education, 2003b). Refer to Table 2 for percentages. This gender gap in reading for Prince George is 2% to 3% higher than the provincial gap, and the males catching up to and passing the females in the 2003 results for Prince George does not reflect the provincial results. The Prince George Grade 4 FSA results for 2003 match up with the Grade 4 participants in the current study of gender differences in Prince George. The Prince George Grade 4 writing results indicate a gender gap favouring females by a 7% to 11% difference over the past 3 years in the percentage of students meeting or exceeding expectations. Again, the gender gap in Prince George is 2% to 3% higher than the provincial gap for Grade 4 writing over the past 3 years. The Grade 7 reading comprehension results for Prince George indicate a gender gap favouring females by a difference of 6% to 13% over the past 3 years in the percentage of students meeting or exceeding expectations. The gap in Prince George at the Grade 7 level for reading is also larger than the provincial gap, by 1% to 6%. The Grade 7 writing results for Prince George indicate an even larger gender gap than for reading, with a difference favouring females of 21% to 25% over the past 3 years in the percentage of students meeting or exceeding expectations. Again, the Prince George gap is higher, by 7% to 8%, than the provincial gap for writing at the Grade 7 level.
The Grade 7 Prince George FSA results for 2003 match up with the Grade 7 participants in the current gender difference study for Prince George. With the exception of the Grade 4 reading results in 2003, the Foundation Skills Assessment results in Prince George indicate a larger gender gap favouring females in literacy than the overall provincial results. It is important to investigate these results because the Prince George School District has identified gender differences in literacy as an area of concern and made it a priority to address these gender differences in literacy in the district.

Table 2

Prince George FSA Trends of Percentage of Students Meeting or Exceeding Grade Expectations by Gender

Grade Level & Year   Reading Female   Reading Male   Writing Female   Writing Male
Grade 4
  2000                    n/a              n/a              n/a              n/a
  2001                     77               70               93               82
  2002                     77               69               97               90
  2003                     72               73               94               86
Grade 7
  2000                    n/a              n/a              n/a              n/a
  2001                     72               66               83               62
  2002                     78               73               88               74
  2003                     76               63               82               57

It is particularly worthwhile to examine the Prince George results for any trends over the past 3 years because the district has previously identified a concern for improving literacy levels, especially among male students (see Table 2). The Grade 4 Prince George results for reading comprehension indicate two different trends for females and males over the past 3 years. The percentage of females meeting or exceeding expectations for reading has declined from 77% in 2001 to 72% in 2003, while the percentage of males meeting or exceeding expectations for reading has risen from 70% in 2001 to 73% in 2003. The trend in reading for Grade 4 females in Prince George follows the provincial trend, but the trend in reading for Grade 4 males in Prince George is opposite to the provincial trend. This trend in reading in the Prince George School District could indicate that the district is beginning to address the gender gap in literacy levels for males.

The Grade 4 writing results indicate an altogether different trend from the reading results. For both females and males the writing results peak in 2002, with 97% of females and 90% of males meeting or exceeding expectations, and then decline in 2003 to 94% of females and 86% of males meeting or exceeding expectations. This Prince George Grade 4 trend for writing only partially follows the provincial trend, which does not experience a decline in 2003.

The Grade 7 results in Prince George for reading comprehension for females and males also indicate two different trends. The percentage of females meeting or exceeding expectations has risen over the past 3 years from 72% in 2001 to 76% in 2003, with a peak of 78% in 2002. This mirrors the provincial trend for Grade 7 females in reading comprehension. The results in reading for males during this time period also experienced a peak in 2002, of 73%, but overall from 2001 to 2003 the trend indicated a decline from 66% to 63% of students meeting or exceeding expectations. The provincial trend for Grade 7 males in reading comprehension remained stable during this time frame. The Prince George Grade 7 results for writing for both genders indicate a rise from 2001 to 2002, from 83% up to 88% for females and from 62% up to 74% for males, and then a decline in 2003, down to 82% for females and 57% for males. The Grade 7 writing results for Prince George follow a similar trend to the provincial results but to a larger extent.
The Prince George district results that peak in 2002, the Grade 4 writing results for both genders and all the Grade 7 results for both genders, indicate an anomaly.

Studies of Specific Gender Differences

Literacy comprises many component skills, so to say there are gender gaps in literacy is a very broad statement that needs to be more distinctly defined. The volume of studies and literature regarding gender differences in literacy will help with this task.

In Gambell and Hunter's (2000) survey of gender differences in Canada, a cross-Canada assessment of approximately 36,000 students aged 13 and 16 years was completed as part of the School Achievement Indicators Program (SAIP) Reading and Writing Assessment in English. One half of the sample completed a reading assessment with a follow-up questionnaire detailing characteristics regarding demographics, education, curriculum, home, self-evaluation, and reading practices. The other half of the sample completed a writing assessment followed by a questionnaire regarding characteristics of the students, curriculum, home, self-evaluation, and writing practices. Several gender gaps became evident in reading and writing preferences, practices, and attitudes (Gambell & Hunter, 2000). Among the differences, a greater percentage of females: spend time reading for enjoyment; use reading strategies; rate themselves as confident readers; report liking to write; edit their writing; write down ideas as they think about the assignment; and use the dictionary when writing. Gender gaps favouring males include greater amounts of time spent watching television and using the computer to complete assignments. Another gender gap is evident in the genre preferred by readers: females had much broader, more eclectic tastes in reading and were more aware of social issues than males. Some of these preferences, practices, and attitudes were found to predict reading and writing performances. Specifically, enjoyment of reading, self-confidence with respect to reading, and use of context as a reading strategy predicted 20% to 29% of the variation in reading test scores. Gambell and Hunter (2000) found that the results from the writing questionnaire did not have as much predictive power; only 10% to 20% of the variation in writing test scores could be predicted by editing practices, grammar handbook use, and self-confidence as a writer. Gambell and Hunter's (2000) study also lends some credence to the gender gap with respect to identification with genre and character-personification, which could lead to assessment design bias on tests such as the SAIP. More research is needed to understand how the gender differences come about.

In a study by Pomplun, Sundbye, and Kelley (1999) the Kansas Reading Assessment was used as a vehicle to examine the gender gap in performance on differing item formats, specifically constructed-response items. A total of 400 exam booklets were processed for female and male students at the Grade 7 and 10 levels. For the study done by Pomplun et al. (1999), students who had taken the regular assessment, a narrative passage accompanied by 8 to 12 objective items, were then asked to take the parallel assessment, which consisted of an expository passage accompanied by eight constructed-response questions.
The variables measured had the following rater reliabilities: .66 for handwriting, .76 for mechanics errors, .91 for number of correct answers, .97 for total number of words written, .99 for number of T-units written (a main clause plus any dependent structure), .86 for total number of reproductions, .51 for total number of transformations, and .89 for total number of unrelated clauses produced by the student (p. 59). The results indicate that gender differences favouring females were found in number of correct answers, reproductions, mechanics errors, handwriting, number of words written, T-unit length, and unrelated clauses, which may explain why females perform better than males on constructed-response items.

Another area of literacy to be examined for gender differences is spelling ability. In a study by Allred (1990), 3000 students from Grades 1 through 6 (approximately 250 of each gender at each grade level) were tested using the Comprehensive Tests of Basic Skills (CTBS) to assess proof-reading skills and a written spelling test (WST) using the same words as the CTBS. Data were analyzed in two ways: a count of females' and males' performances for each word on each test, and analyses of variance on the average differences across both tests by gender for each grade (Allred, 1990). The results indicate females in Grades 1 through 6 significantly outscored males on both the CTBS and the WST, with all p values < .001. Gender differences in spelling relate to gender differences in reading achievement, and Allred (1990) suggests that cultural expectations, specifically cross-cultural expectations placed on girls and boys with respect to sex roles, play a large role in gender differences in reading, though they are not the only cause.

In a prior study done by Hedekar (1997) in the Prince George school district, a gender difference favouring females was found in all the analyses of WSC and TWW for Grades 1 through 7. A gender difference favouring females was also found in 14 of the 19 analyses of WRC for Grades 1 through 7 in the same study. The effect sizes, Cohen's d, for all analyses in Hedekar's (1997) study range from .15 to .78.

Aboriginal Studies Relating to Literacy

The term aboriginal is used in this study because it is the term used by the British Columbia Ministry of Education, and it refers to anyone of aboriginal ancestry, which includes Status Indians, Non-Status Indians, Inuit, and Metis (British Columbia Ministry of Education, 2002). In British Columbia, students in the education system identify themselves as aboriginal on a voluntary, self-identifying basis in September of each year (British Columbia Ministry of Education, 2002).

The education system in British Columbia, and for that matter Canada, in both the public and private sectors, has a long and tragic history of failure with aboriginal peoples. This general failure continues today when graduation rates of aboriginal students in British Columbia are considered. Even though graduation rates have been increasing, only 46% of aboriginal students completed high school in 2003, as compared to 79% for the entire province (British Columbia Ministry of Education, 2004, p. 1). Following the progress of a Grade 8 cohort that started in the system in 1995: by Grade 9 about 5% of the aboriginal students, as compared to about 1% of non-aboriginal students, had left the system.
Between Grade 11 and 12 the percentage of aboriginal students lost increases to about 30% as compared to about 6% for non-aboriginal students (British Columbia Ministry of Education, 2002). At the end of the cohort period in 2000, of those aboriginal students remaining, only a little over 40% received their Dogwood graduation certificates as compared to a little over 70% of non-aboriginal students. This document shows that not only is there a large gap in graduation rates between aboriginal and non-aboriginal students, but there is also a large gap in dropout rates at a fairly early age. This is another indication of the failure of the education system with respect to aboriginal students.

Foundation Skills Assessment Results

In addition to the gap in graduation and dropout rates, there is vast documentation of the gap in achievement between aboriginal and non-aboriginal students. Some areas of achievement that have been documented in British Columbia are literacy and numeracy, under the auspices of the Foundation Skills Assessment (FSA) that is administered to Grade 4, 7, and 10 students each year. Because the scope of this study is Kindergarten to Grade 7 students, only literacy results for Grade 4 and 7 students will be discussed. The two aspects of literacy that are measured by the FSA are reading comprehension and writing. As previously mentioned, the British Columbia Ministry of Education cautions that the FSA results are just a snapshot of students' basic academic skills in relation to provincial standards and should be considered in conjunction with numerous other forms of information collected by schools and districts (British Columbia Ministry of Education, 2001). The FSA results at the provincial level over the last 4 years for Grade 4 reading comprehension indicate that the proportion of aboriginal students meeting or exceeding expectations is 21% to 24% less than for the province as a whole (British Columbia Ministry of Education, 2003a). The results are presented in Table 3. The gap between aboriginal and provincial FSA results for writing at the Grade 4 level over the last 4 years is smaller, with differences ranging from 9% to 13% (British Columbia Ministry of Education, 2003a). When examining the same provincial results for students at the Grade 7 level, similar gaps are present, ranging from 23% to 25% for reading comprehension and 18% to 21% for writing.

Table 3
Provincial FSA Trends of Percentages of Students Meeting or Exceeding Grade Expectations by Aboriginal Status

Grade level and year   Reading Aboriginal   Reading All   Writing Aboriginal   Writing All
Grade 4
  2000                 56                   79            78                   91
  2001                 55                   78            77                   91
  2002                 56                   80            84                   94
  2003                 56                   77            85                   94
Grade 7
  2000                 57                   81            60                   81
  2001                 51                   76            61                   81
  2002                 52                   76            66                   84
  2003                 53                   77            61                   79

Another issue worth mentioning is the trend in the FSA data over the last 4 years for both the provincial and aboriginal results (see Table 3). The Grade 4 reading comprehension data for the province indicate an insignificant increase in 2002 but then a decrease again in 2003, while the aboriginal results replicate the increase in 2002 but remain steady for 2003. The Grade 4 writing data for the province indicate a slight increase for 2002, but the aboriginal results for this measure indicate a larger increase of 7% in 2002, over twice the size of the increase for the province as a whole.
The trends for the Grade 7 reading comprehension measure for both the provincial and aboriginal results indicate a similarly significant decrease in 2001, and then both begin to increase slightly by 2003. The trends for the Grade 7 writing measure for both the provincial and aboriginal results indicate an increase in 2002, and then both decrease by 5% in 2003. Overall, when comparing the 2000 to 2003 results of the reading comprehension and writing measures for both grades, the trends for the aboriginal and the provincial data are very similar, with the exception of the aboriginal writing result in 2002, which had an increase twice that of the provincial increase. For this study it is relevant to review the FSA results for the Prince George school district as well (see Table 4). The Prince George results over the last 3 years for the Grade 4 reading comprehension measure indicate that the proportion of aboriginal students meeting or exceeding expectations is 14% to 17% less than for the district as a whole (British Columbia Ministry of Education, 2003b). The gap between aboriginal and district FSA results for writing at the Grade 4 level for the last 3 years is slightly smaller than for reading, with the exception of 2002, where the gap is only a 5% difference. The gap between aboriginal and district results for Grade 7 students ranges from 17% to 21% for the reading measure over the 3-year period, but for the writing measure the gap ranges from 4% to 20%.

Table 4
Prince George FSA Trends of Percentages of Students Meeting or Exceeding Grade Expectations by Aboriginal Status

Grade level and year   Reading Aboriginal   Reading All   Writing Aboriginal   Writing All
Grade 4
  2000                 n/a                  76            n/a                  n/a
  2001                 60                   74            77                   88
  2002                 57                   73            89                   94
  2003                 55                   72            78                   90
Grade 7
  2000                 n/a                  79            n/a                  n/a
  2001                 52                   69            53                   73
  2002                 54                   75            63                   81
  2003                 52                   69            65                   69

When examining the Prince George district data from 2001 to 2003, some trends are indicated (see Table 4). For the Grade 4 reading comprehension measure there is a slight downward trend for both the aboriginal and district results from 2001 to 2003. For the writing measure at Grade 4 there is a bit of an anomaly in 2002 for the aboriginal results, which increase significantly in that year alone. The district results for the Grade 4 writing measure also increase, but not as significantly. At the Grade 7 level for the reading comprehension measure, the aboriginal and district results have similar trends of a slight increase in 2002, with the 2003 results returning to the 2001 level. For the Grade 7 writing measure the aboriginal results increase significantly in 2002 and continue with a slight increase the next year. The Grade 7 writing results for the district show a similar significant increase in 2002 but then drop back toward the 2001 level the next year. In summary, the Prince George district FSA results for Grade 4 appear to have a slight downward trend in reading comprehension and a bit of an anomaly in 2002 for writing. The Grade 7 results have a somewhat level trend for reading and, like the Grade 4 results, indicate an anomaly for writing in 2002. To complete the review of FSA results for reading and writing it is necessary to compare the aboriginal gap at the district level to the aboriginal gap at the provincial level. The aboriginal gap at Grade 4 for reading comprehension is 7% less at the district level than at the provincial level. The aboriginal gap at Grade 4 for writing is similar at both the district and provincial levels.
For Grade 7 the aboriginal gap for reading comprehension is again smaller at the district level, by about 4% to 5% in this case. The Grade 7 aboriginal gap for writing is again similar at both district and provincial levels. When comparing provincial trends to district trends for aboriginal FSA results from 2000 to 2003, there are no similarities. At the Grade 4 level for reading, the provincial trend is stable whereas the district trend shows an overall decline of about 5%. For writing at the Grade 4 level, the provincial trend indicates an overall increase of 7%; the district trend indicates an anomaly in 2002, where the results increased by 12% and then dropped again by 11% in 2003. The trend in Grade 7 reading results for aboriginal students at the provincial level indicates a decline from 2000 to 2003, whereas the district results remain stable. The Grade 7 writing results provincially for aboriginal students indicate a small peak in 2002, whereas the district results indicate a steady rise over the same time period.

Studies of Specific Aboriginal Differences

In reviewing other literature regarding aboriginality and literacy, the differences between aboriginal and non-aboriginal students are not always quantifiable performance scores. There are many different types, modes, and uses of literacy. Curwen Doige (2001) points out that aboriginal literacy has been neither respected nor explicated throughout our history, nor has it been accepted as part of the definition of being aboriginal. She goes on to say that reading and writing are the narrowest definition of literacy, and that the language and symbols of aboriginal literacy communicate history, culture, knowledge, tradition, and systems of education and understanding: in other words, literacy is vitally connected to who we are. Gaikezehongai (2003) also addresses the important contributions to aboriginal literacy made by aboriginal prophecies, history, and traditional teachings being passed down. A similar point is made by Dunn (2001) with respect to the Australian aboriginal people when she discusses implementing a culturally responsive pedagogy that includes knowledge of Australian aboriginal social history, culturally appropriate literacy education, recognizing and addressing group and individual learning preferences, and accepting a child's primary discourse as legitimate. Differences between aboriginal and non-aboriginal students with respect to attitudes about literacy are addressed by Ward, Shook, and Marrion (1993) in their research regarding attitudes about writing in a cross-cultural setting. The study carried out by Ward et al. (1993) in Lytton, British Columbia surveyed students in Grades 1 and 2 about what they thought the purpose of writing was, their personal writing preferences, and their self-concept as writers. The results indicate that aboriginal students were not able to list as many forms of writing as the non-aboriginal students, that a higher proportion of aboriginal than non-aboriginal students enjoyed writing stories, and that a slightly higher percentage of aboriginal than non-aboriginal children saw themselves as good writers. This study in the Prince George school district examines aboriginal differences in reading and writing fluency as well as the previously mentioned gender differences.
The earlier study done in Prince George by Hedekar in 1997 does not examine aboriginal differences in literacy due to the political direction given at that time; the Aboriginal Education Board did not want a separate study undertaken on aboriginal students (P. D. MacMillan, personal communication, June 3, 2004). As well, in Hedekar's 1997 study, relative age differences were examined with respect to reading and writing fluency, but due to a lack of significant differences the variable of relative age was not included in this study.

CHAPTER THREE: METHODS

This chapter contains three sections. The first section describes the participants who were tested and how they were selected for the CBM/DIBELS norming project and this study. The second section explains the test instruments used for the CBM/DIBELS norming project and this study. The third section is a description of the procedures followed for my research.

Participants

This study uses the CBM/DIBELS norming data, which is an intact data set collected by teachers and district staff in School District #57 (Prince George) during the 2002-2003 school year. Therefore, this researcher did not select the participants or collect the data. The district (SD #57) deemed that no signed consent forms for student participation were required because the data consisted of measures routinely collected by the school district. See Foulds (2002) or Fewster and MacMillan (2002) for earlier instances of these procedures. Participants were selected using stratified random sampling of the elementary school population from Kindergarten to Grade 7. Participants in the study comprise approximately 20% of the total elementary student population. Each school provided approximately 20% of its total school population. In the Technical Report of the CBM Norming Project, Fewster et al. (2003) indicate there were a total of 2272 students in the norming sample from Kindergarten to Grade 7. The breakdown for each grade is as follows: 245 Kindergarten students, 248 Grade 1 students, 265 Grade 2 students, 281 Grade 3 students, 308 Grade 4 students, 277 Grade 5 students, 313 Grade 6 students, and 335 Grade 7 students. Students participating in the norming project were tested three times throughout the school year, once each in October, January, and April. Data for all three norming periods were cleaned and entered into SPSS 9 (Fewster et al., 2003). Therefore, no further cleaning of the data was required by this researcher.

Instruments

The Kindergarten and Grade 1 participants in the CBM/DIBELS norming project were given a different series of tests from their older counterparts. Both Kindergarten and Grade 1 students were tested on Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), and Letter Naming Fluency (LNF). Only Kindergarten students were tested on Initial Sound Fluency (ISF). Only Grade 1 students were tested on Oral Reading Fluency (ORF), and only in the January and April testing periods. The Kindergarten scores for PSF and NWF were recorded only for the January and April periods, whereas these scores for the Grade 1 participants were recorded for all three testing periods. Scores on LNF were recorded for Grade 1 students in October only, but were recorded for all three periods for the Kindergarten students. See Table 5 for a complete schedule of the testing times for the Kindergarten and Grade 1 DIBELS measures. See Table 6 for a complete description of the DIBELS variables.
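As a brief aside, the stratified sampling described in the Participants section above lends itself to a short illustration. The following is a minimal sketch under stated assumptions, not the district's actual procedure: the student table and its column names (school, grade, student_id) are hypothetical stand-ins, and the sketch simply draws roughly 20% of each school-by-grade stratum.

```python
# Illustrative sketch only: approximates stratified random sampling of about
# 20% of each school's enrolment, per stratum (school x grade).
# Column names ("school", "grade", "student_id") are hypothetical.
import pandas as pd

def draw_norming_sample(students: pd.DataFrame, fraction: float = 0.20,
                        seed: int = 2002) -> pd.DataFrame:
    """Sample roughly `fraction` of students within each school-by-grade stratum."""
    return (students
            .groupby(["school", "grade"], group_keys=False)
            .apply(lambda stratum: stratum.sample(frac=fraction, random_state=seed)))

# Example usage with a toy population of 1,000 students:
population = pd.DataFrame({
    "student_id": range(1, 1001),
    "school": ["A", "B", "C", "D"] * 250,
    "grade": list(range(0, 8)) * 125,  # 0 stands for Kindergarten
})
sample = draw_norming_sample(population)
print(len(sample))  # roughly 20% of 1,000
```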
Table 5
Schedule of Testing Periods for DIBELS Measures

DIBELS Measure   Fall (October)   Winter (January)   Spring (April)
Kindergarten
  ISF            X                X
  LNF            X                X                  X
  PSF                             X                  X
  NWF                             X                  X
  ORF
Grade 1
  ISF
  LNF            X
  PSF            X                X                  X
  NWF            X                X                  X
  ORF                             X                  X

Participants from Grade 2 through Grade 7 were tested on Total Words Written (TWW), Words Spelled Correctly (WSC), and Words Read Correctly (WRC). Grade 1 students were also tested on TWW, WSC, and WRC, but only for the April testing period. See Table 6 for variable descriptions. See Fewster et al. (2003) for further details about any aspect of the norming project.

Table 6
Description of DIBELS and CBM Variables

Variable   Description
ISF        Number of correctly identified and produced initial sounds (of an orally presented word) in 1 minute
LNF        Number of letters (upper and lower case) correctly named in 1 minute
PSF        Number of correct phonemes (in 3- and 4-phoneme words) produced in 1 minute
NWF        Number of correct letter-sounds produced or read from nonsense words in 1 minute
ORF        Number of words read correctly on a passage with 1 minute to read
WRC        Number of words read correctly on a passage with 1 minute to read
WSC        Number of words spelled correctly in a 3-minute written response to a verbal cue
TWW        Total number of words written in a 3-minute written response to a verbal cue (highly correlated with WSC)

An analysis performed by Fewster et al. (2003) in the Technical Report of the Curriculum Based Measurement Norming Project provides evidence that none of the probes used in the testing showed any significant difference in difficulty level from the others (p. 29). Therefore, for this study, the reading and writing probes used at each grade level will be considered equivalent. Also from Fewster et al.'s (2003) analysis is evidence that there is a high correlation between Total Words Written (TWW) and Words Spelled Correctly (WSC), .94 < r < .99, and a low to moderate correlation between TWW and Words Read Correctly (WRC), .21 < r < .49. Correlations across the 6-month norming period for both TWW and WRC show consistency and good stability, with coefficients ranging from .59 to .65 and from .81 to .86 respectively (Fewster et al., 2003). For the DIBELS data in Fewster et al.'s (2003) analysis at the Kindergarten level, there is a low to moderate correlation among the four variables tested (ISF, LNF, PSF, and NWF), .342 < r < .708. Correlations across the 3-month norming period for PSF, ISF, and NWF and the 6-month norming period for LNF show consistency and good stability, with coefficients of .687, .695, .741, and .649 respectively. The results for the Grade 1 DIBELS and CBM data indicate a low to high correlation among the seven variables tested (PSF, NWF, LNF, ORF, WRC, TWW, and WSC), .231 < r < .944. At the Grade 1 level, for the three DIBELS variables that were tested more than once (i.e., PSF, NWF, and ORF), the correlations across the 6-month (for PSF and NWF) and 3-month (for ORF) norming periods show consistency and good stability, with coefficients of .706, .645, and .903 respectively.

Procedures

The data that have been collected by School District #57 (Prince George) for the CBM/DIBELS Norming Study 2002/2003 will be used to investigate gender and aboriginal differences in Kindergarten to Grade 7 students with respect to their CBM reading, writing, and DIBELS scores. The data were collected by the school district during the 2002-2003 school year, after which John Cook prepared a technical report for the school district under the supervision of Dr. Peter MacMillan of the University of Northern British Columbia.
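Returning to the correlations reported in the Instruments section above: both the concurrent correlations (e.g., TWW with WSC at one testing period) and the stability coefficients across norming periods are ordinary Pearson correlations. The following is a minimal sketch under stated assumptions; the file name (cbm_norming.csv) and the column names (TWW_oct, WSC_oct, TWW_apr) are hypothetical, and the actual norming data set is not reproduced here.

```python
# Minimal sketch of the two kinds of correlations reported above.
# File and column names are hypothetical stand-ins for the norming data.
import pandas as pd

scores = pd.read_csv("cbm_norming.csv")  # hypothetical file

# Concurrent correlation between two measures at one testing period
# (Fewster et al. report r between .94 and .99 for TWW and WSC).
r_tww_wsc = scores["TWW_oct"].corr(scores["WSC_oct"])

# Test-retest stability of one measure across norming periods
# (the .59 to .65 coefficients for TWW are correlations of this kind).
r_stability = scores["TWW_oct"].corr(scores["TWW_apr"])

print(f"TWW-WSC (October): r = {r_tww_wsc:.2f}")
print(f"TWW October-April stability: r = {r_stability:.2f}")
```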
Because this study uses an intact data set, ethics approval was obtained from the University of Northern British Columbia prior to proposal approval. Relevant documentation is located in the Appendix. The DIBELS data for Kindergarten and Grade 1 will be analyzed with a series of 2 X 2 gender-by-aboriginal-status ANOVAs using the SPSS statistical program to determine whether there are any effects attributable to gender or aboriginal status for the variables of Initial Sound Fluency (ISF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), and Oral Reading Fluency (ORF). A total of 18 ANOVAs were performed in order to examine the five variables for Kindergarten and Grade 1 over the three testing periods. Descriptive statistics will be reported by grade and then by group differences. Data will be examined across test periods and grades for consistency, and then a Bonferroni correction will be applied by dividing alpha by the number of tests, α / n (e.g., .05 / 3 ≈ .016). No multivariate statistical testing will be applied. The CBM data sample of students in Grade 1 through 7 will also be analyzed with a series of 2 X 2 gender-by-aboriginal-status ANOVAs using the SPSS statistical program. A determination will be made as to whether or not there are any effects attributable to gender or aboriginal status for the variables of Words Read Correctly (WRC), Words Spelled Correctly (WSC), and Total Words Written (TWW). A total of 57 ANOVAs were performed in order to examine the three variables (WRC, WSC, and TWW) for each grade level over the three different testing periods. Descriptive statistics will be reported by grade and then by group differences. Data will be examined across test periods and grades for consistency, and then a Bonferroni correction will be applied, α / n (e.g., .05 / 3 ≈ .016). No multivariate statistical testing will be applied.

CHAPTER FOUR: RESULTS

The results of the data analysis will be discussed in three parts. Part one will discuss the results of the analysis of the DIBELS data. These data have been analyzed for differences in early literacy skills in Kindergarten and Grade 1 for gender, aboriginal status, and the interaction between these two independent variables. The second part will discuss the results of the analysis of the CBM data. These data have been analyzed for differences in reading and writing fluency from Grade 2 to 7 for gender, aboriginal status, and the interaction between these two variables. Part three will discuss effect sizes and trends for the analysis of both the DIBELS and CBM data.

Results of the DIBELS Data Analysis

The early literacy skills measured in this study include Initial Sound Fluency (ISF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), and Oral Reading Fluency (ORF). Kindergarten and Grade 1 students were tested using these DIBELS variables during the recommended testing periods (see Table 5 in Chapter 3). The sample sizes varied slightly from testing period to testing period and from grade to grade. The largest sample size was 252, for Grade 1 at the January testing of Phoneme Segmentation Fluency. The smallest sample size was 180, for Kindergarten at the January testing of Nonsense Word Fluency. The most common sample size was in the 240s. See Table 7 for sample sizes for all DIBELS results.
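Although the analyses described in the Procedures were run in SPSS, the same 2 X 2 gender-by-aboriginal-status ANOVA can be sketched in Python with statsmodels. This is an illustrative sketch only; the file name and the column names (gender, aboriginal, score) are hypothetical stand-ins, not the norming project's actual variables.

```python
# Sketch of one 2 X 2 gender-by-aboriginal-status ANOVA, statsmodels version.
# File and column names are hypothetical.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

data = pd.read_csv("dibels_norming.csv")  # hypothetical file

# Fit the two main effects plus the gender-by-aboriginal-status interaction.
model = ols("score ~ C(gender) * C(aboriginal)", data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)  # F and p for gender, aboriginal status, and G X Ab

# Bonferroni correction for the three testing periods in a school year:
alpha = 0.05
n_tests = 3
print(f"Corrected alpha: {alpha / n_tests:.4f}")  # .05 / 3 = .0167 (the text reports .016)
```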
Table 7
Descriptive Statistics for ISF, LNF, PSF, NWF, and ORF

Grade and testing period   N     M       SD      Skew   SE of Skew   Kurtosis   SE of Kurtosis
ISF Kindergarten
  October                  245   11.16    5.57   1.34   .16          2.24       .31
  January                  242   14.09   10.42   1.00   .16          1.05       .31
LNF Kindergarten
  October                  245   10.04   11.41   2.07   .16          7.05       .31
  January                  243   20.06   14.94    .91   .16          1.50       .31
  April                    241   29.85   15.78    .32   .16           .00       .31
LNF Grade 1
  October                  248   33.17   17.04    .25   .16          -.58       .31
PSF Kindergarten
  January                  242   14.31   15.06   1.26   .16          1.37       .31
  April                    240   20.65   16.41    .47   .16          -.84       .31
PSF Grade 1
  October                  248   24.50   19.05    .55   .16          -.67       .31
  January                  252   35.90   18.83    .06   .15          -.43       .31
  April                    231   41.07   16.44   -.42   .16          -.21       .32
NWF Kindergarten
  January                  180    7.01    9.08   1.63   .18          2.38       .36
  April                    239   14.89   13.87   1.93   .16          7.26       .31
NWF Grade 1
  October                  249   19.77   17.06   1.91   .15          7.70       .31
  January                  251   37.41   21.48    .65   .15           .77       .31
  April                    233   53.59   30.40    .89   .16           .54       .32
ORF Grade 1
  January                  250   19.73   20.79   1.93   .15          4.17       .31
  April                    232   39.24   28.29   1.03   .16           .73       .32

For each variable tested, the mean score increased over the testing periods for each grade and across grades. The standard deviation also increased from testing period to testing period for each variable, with the exception of the PSF testing for Grade 1, which shows a decrease in the standard deviation over the testing periods. The other statistic to note is the increase in the standard deviation for ISF from the October testing to the January testing at the Kindergarten level. This January standard deviation is almost twice that of the October standard deviation. For a number of testing results, the magnitude of skewness was six times the standard error. These cases include: the ISF testing for Kindergarten in October and January; the LNF testing for Kindergarten in October (see Figure 1); the PSF testing for Kindergarten in January; the NWF testing for Kindergarten in January and April and for Grade 1 in October; and the ORF testing for Grade 1 in both January and April. The skew in these cases would indicate that some students have acquired the skill being tested but most have not. One testing period is negatively skewed (the Grade 1 April testing of PSF). This raises little concern due to the assumption that, for equal and unequal n's, skewed populations have very little effect on the level of significance or power (Glass & Hopkins, 1996). In addition, the fact that a directional or one-tailed test is not being performed means the skew is of no consequence.

[Figure 1 appears here: histogram of October Kindergarten LNF scores (Std. Dev = 11.41, Mean = 10.0, N = 245).]

Figure 1. Histogram of scores and their frequency for the October testing of LNF at the Kindergarten level, showing a positively skewed, leptokurtic distribution.

A number of the testing results are leptokurtic, with a kurtosis of six times the standard error. These cases include: the ISF testing for Kindergarten in October; the LNF testing for Kindergarten in October (see Figure 1); the NWF testing for Kindergarten in both January and April and for Grade 1 in October; and the ORF testing for Grade 1 in January. A number of the testing results are also platykurtic (see Table 7). The kurtosis effects are slight, with the actual α being less than the nominal α in leptokurtic populations and the actual α exceeding the nominal α in platykurtic populations (Glass & Hopkins, 1996).
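The screening rule used above (flagging a distribution when the magnitude of its skewness or kurtosis is at least six times the corresponding standard error) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the large-sample approximations SE(skew) ≈ √(6/N) and SE(kurtosis) ≈ √(24/N) reproduce the .16 and .31 reported in Table 7 for samples of roughly 245 students.

```python
# Sketch of the skewness/kurtosis screen described in the text.
# SE formulas are large-sample approximations, not SPSS's exact formulas.
import numpy as np
from scipy import stats

def screen_distribution(scores: np.ndarray) -> dict:
    n = len(scores)
    skew = stats.skew(scores)
    kurt = stats.kurtosis(scores)       # excess kurtosis, as SPSS reports it
    se_skew = np.sqrt(6.0 / n)          # about .16 when n is near 245
    se_kurt = np.sqrt(24.0 / n)         # about .31 when n is near 245
    return {
        "skew": round(skew, 2), "se_skew": round(se_skew, 2),
        "flag_skewed": abs(skew) >= 6 * se_skew,
        "kurtosis": round(kurt, 2), "se_kurt": round(se_kurt, 2),
        "flag_leptokurtic": kurt >= 6 * se_kurt,
    }

# Example with a positively skewed sample, loosely similar to October LNF:
rng = np.random.default_rng(0)
print(screen_distribution(rng.exponential(scale=10.0, size=245)))
```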
At the Kindergarten level, a 2 X 2 between-groups ANOVA (gender by aboriginal status) was run wherever data existed for the three testing periods (October, January, and April). The four variables analyzed for Kindergarten are: Initial Sound Fluency (ISF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), and Nonsense Word Fluency (NWF). At the Grade 1 level, a 2 X 2 between-groups ANOVA (gender by aboriginal status) was run for the three testing periods (October, January, and April) where data existed. The four variables analyzed for Grade 1 are: Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), and Oral Reading Fluency (ORF). A total of 18 analyses of variance were calculated for Kindergarten and Grade 1 students. Values of F and p are reported in Table 8. The degrees of freedom between (J - 1) is always equal to 1 when there are two genders or two categories of aboriginal status. The degrees of freedom within (N - J) are always N - 2 for the main effects and N - JK for the interaction, so for all analyses of variance these will not be shown in the respective tables. Summaries of the DIBELS analyses of variance are found in Table 8 for gender, aboriginal status, and the interaction of gender and aboriginal status (G X Ab). Analyses of variance that are significant at p < .05 are marked with a single asterisk; analyses of variance that are significant at p < .01 are marked with a double asterisk. To examine the assumption of homogeneity of variances, Levene's test was run for all analyses of variance, and in all cases there was no violation of this assumption (all p > .05). Of the 18 analyses of variance performed, every calculation indicated there were no significant interactions between gender and aboriginal status for early literacy skills (all p > .10). Therefore all main effects can be interpreted without reference to any interaction. The results are found in the G X Ab rows of Table 8. There are five cases where gender differences are evident for early literacy skills. All three testing periods for Phoneme Segmentation Fluency (PSF) at the Grade 1 level had a significant gender difference: F(1, 248) = 5.107, p < .05; F(1, 252) = 10.343, p < .05; F(1, 231) = 10.198, p < .05. The other two cases where significant gender differences occurred were at the Grade 1 level for Nonsense Word Fluency (NWF) in October and for Oral Reading Fluency (ORF) in April: F(1, 249) = 6.713, p < .05; F(1, 232) = 4.334, p < .05. All other early literacy skills analyses did not indicate a significant gender difference. With only 5 of the 18 ANOVA results showing a significant gender difference, there is not consistent evidence of a gender difference across Kindergarten and Grade 1 for early literacy skills. If a modest Bonferroni correction for the two or three testing periods in a year is applied (e.g., α / 2 = .025, α / 3 ≈ .016), only 3 of the 18 ANOVA results show a significant gender difference. A significant difference (p < .05) between aboriginal students and non-aboriginal students was detected in 15 of the 18 analyses of variance for early literacy skills. The three cases where the results were non-significant all occurred at the Kindergarten level. The non-significant results occurred in the January and April testing of Nonsense Word Fluency (NWF) and in the April testing of Phoneme Segmentation Fluency (PSF): F(1, 180) = 2.824, p > .05; F(1, 239) = 1.119, p > .05; F(1, 240) = 2.677, p > .05.
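The homogeneity-of-variance check mentioned above (Levene's test) is also available in scipy. The following is a minimal sketch under the same hypothetical file and column names used earlier; it compares the four cells of the 2 X 2 design.

```python
# Sketch of the Levene's test check for homogeneity of variances.
# File and column names are hypothetical stand-ins.
import pandas as pd
from scipy import stats

data = pd.read_csv("dibels_norming.csv")  # hypothetical file

# One variance check for a 2 X 2 design: compare the four cell groups.
cells = [grp["score"].values
         for _, grp in data.groupby(["gender", "aboriginal"])]
stat, p = stats.levene(*cells)
print(f"Levene's test: W = {stat:.3f}, p = {p:.3f}")  # p > .05 -> no violation
```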
With 15 of the 18 ANOVA results showing a significant difference, there is consistent evidence of an aboriginal/non-aboriginal difference in early literacy skills across Kindergarten and Grade 1.

Table 8
Analysis of Variance for Gender and Aboriginal Differences in ISF, LNF, PSF, NWF, and ORF

                      October           January           April
Source                F        p        F        p        F        p
ISF Kindergarten
  Gender              0.447    .504     0.348    .556
  Aboriginal          5.278    .022*    5.654    .018*
  G X Ab              0.028    .867     0.195    .659
LNF Kindergarten
  Gender              2.339    .127     3.512    .062     3.119    .079
  Aboriginal          7.399    .006**   4.151    .043*    4.480    .035*
  G X Ab              0.213    .645     0.049    .826     0.054    .816
LNF Grade 1
  Gender              2.768    .097
  Aboriginal          9.865    .002**
  G X Ab              0.089    .765
PSF Kindergarten
  Gender                                0.374    .542     0.002    .963
  Aboriginal                            8.327    .003**   2.677    .103
  G X Ab                                1.256    .264     2.152    .144
PSF Grade 1
  Gender              5.107    .025*    10.343   .001**   10.198   .002**
  Aboriginal          11.788   .001**   14.087   .000**   6.276    .013*
  G X Ab              1.609    .206     0.172    .678     0.312    .577
NWF Kindergarten
  Gender                                1.675    .197     0.366    .546
  Aboriginal                            2.824    .095     1.779    .184
  G X Ab                                2.161    .143     1.075    .301
NWF Grade 1
  Gender              6.713    .010*    2.050    .154     1.744    .188
  Aboriginal          10.146   .002**   10.616   .001**   12.685   .000**
  G X Ab              0.080    .778     0.320    .572     0.393    .531
ORF Grade 1
  Gender                                1.241    .266     4.334    .038*
  Aboriginal                            10.874   .001**   15.529   .000**
  G X Ab                                0.019    .890     0.060    .807

Note: * p < .05, ** p < .01; p < .0005 is recorded as .000.

Results of the CBM Data Analysis

Grade 2 to 7 students were tested for reading and writing literacy using CBM measures of Words Read Correctly (WRC), Words Spelled Correctly (WSC), and Total Words Written (TWW). Grade 1 students were also tested using all three CBM measures, but only in the April testing period. The sample sizes for these variables varied from testing period to testing period and grade to grade. The largest sample size was 335 (Grade 7) and the smallest sample size was 247 (Grade 1). The average sample size was 284. For each of the three variables tested, the mean score increased over the testing periods for each grade. The standard deviation remained relatively constant for the WRC results, but for the WSC and TWW results the standard deviation doubled from the October testing in Grade 2 to the October testing in Grade 6. The majority of the testing results for WRC, WSC, and TWW are normally distributed, with a skew of two times the standard error or less (see Figure 2 for an example); see Tables 9, 10, and 11 for complete results. For a small number of test results the magnitude of skewness is six times the standard error; this occurs at the Grade 1 level for WRC in April and at the Grade 2 level, once for the October testing of WRC and again for the October and January testing of WSC. The highly positive skew for WSC at Grade 2 indicates that some students performed well at this skill but most students did not. There are two testing periods that are very slightly negatively skewed: the Grade 6 January and April testing of WRC. This small number of skewed results raises little concern due to the assumption that, for equal and unequal n's, skewed populations have very little effect on the level of significance or power (Glass & Hopkins, 1996). The fact that a directional or one-tailed test is not being performed means the skew is of no consequence.

[Figure 2 appears here: histogram of April Grade 4 WSC scores (Std. Dev = 14.84, Mean = 43.1, N = 309).]

Figure 2.
Histogram of scores and their frequency for the April testing of WSC at the Grade 4 level, showing a normal (non-skewed) and mesokurtic distribution.

The majority of testing results are also mesokurtic, with a kurtosis of two times the standard error or less. A very small number of the testing results are leptokurtic, with a kurtosis of six times the standard error. These three cases include: the Grade 2 October and April testing of WSC; and the Grade 3 October testing of TWW. A number of the testing results are also slightly platykurtic (see Tables 9, 10, and 11). The kurtosis effects are slight, with the actual α being less than the nominal α in leptokurtic populations and the actual α exceeding the nominal α in platykurtic populations (Glass & Hopkins, 1996).

Table 9
Descriptive Statistics for WRC

Grade and testing period   n     M        SD      Skew   SE of Skew   Kurtosis   SE of Kurtosis
Grade 1
  April                    247    36.02   29.60   1.15   .16           .80       .31
Grade 2
  October                  266    51.72   39.57   1.00   .15           .75       .30
  January                  264    67.65   39.80    .31   .15          -^3        .30
  April                    265    81.03   42.32    .33   .15          -.46       .30
Grade 3
  October                  281    88.65   40.26    .29   .15          -.34       .29
  January                  282   101.72   41.62    .29   .15          -.01       .29
  April                    281   110.31   39.47    .18   .15          -.03       .29
Grade 4
  October                  309   102.89   40.89    .10   .14          -.76       .28
  January                  309   114.07   40.13    .17   .14          -.37       .28
  April                    309   120.29   38.30    .07   .14          -.36       .28
Grade 5
  October                  278   115.05   36.08   -.02   .15          -.37       .29
  January                  277   121.50   37.33    .08   .15          -.27       .29
  April                    276   130.57   38.35   -.09   .15          -.32       .29
Grade 6
  October                  313   128.01   38.35   -.07   .14          -.32       .28
  January                  310   131.48   39.70   -.18   .14          -.18       .28
  April                    312   137.78   38.17   -.21   .14           .22       .28
Grade 7
  October                  334   135.32   40.49    .29   .13          -.27       .27
  January                  335   139.16   40.66    .18   .13          -.32       .27
  April                    335   143.93   40.18    .14   .13          -.07       .27

Table 10
Descriptive Statistics for WSC

Grade and testing period   n     M       SD      Skew   SE of Skew   Kurtosis   SE of Kurtosis
Grade 1
  April                    247    9.77    7.03    .92   .16           .50       .31
Grade 2
  October                  264   12.72    8.05   1.21   .15          1.87       .30
  January                  267   17.98    9.47   1.01   .15          1.59       .30
  April                    265   22.63   10.83    .97   .15          1.70       .30
Grade 3
  October                  281   23.00   10.94    .92   .15          1.52       .29
  January                  283   28.34   12.19    .39   .15           .01       .29
  April                    279   31.72   12.25    .32   .15           .34       .29
Grade 4
  October                  307   32.07   12.54    .30   .14           .11       .28
  January                  307   36.42   13.22    .12   .14          -.05       .28
  April                    309   43.12   14.84    .02   .14           .11       .28
Grade 5
  October                  278   40.84   14.21    .22   .15          -.20       .29
  January                  280   43.84   14.10    .03   .15           .07       .29
  April                    277   49.17   15.74    .23   .15           .79       .29
Grade 6
  October                  313   51.01   16.54    .20   .14           .04       .28
  January                  312   53.36   16.17    .26   .14          -.06       .28
  April                    311   56.96   17.33    .15   .14           .68       .28
Grade 7
  October                  335   59.40   16.43    .21   .13          -.12       .27
  January                  333   60.87   16.80    .36   .13           .37       .27
  April                    334   63.29   16.90    .41   .13          1.38       .27

Table 11
Descriptive Statistics for TWW

Grade and testing period   n     M       SD      Skew   SE of Skew   Kurtosis   SE of Kurtosis
Grade 1
  April                    247   13.45    8.28    .71   .16           .13       .31
Grade 2
  October                  264   16.80    8.84    .72   .15          1.04       .30
  January                  267   22.21    9.81    .66   .15           .88       .30
  April                    265   26.84   10.98    .73   .15          1.30       .30
Grade 3
  October                  281   28.59   11.06    .96   .15          2.10       .29
  January                  283   31.90   12.16    .34   .15           .05       .29
  April                    279   35.01   12.39    .23   .15           .61       .29
Grade 4
  October                  307   35.44   12.89    .36   .14           .22       .28
  January                  307   39.28   13.54    .10   .14          -.02       .28
  April                    309   46.03   15.00    .05   .14           .31       .28
Grade 5
  October                  278   43.73   14.37    .23   .15          -.17       .29
  January                  280   46.53   14.26   -.03   .15           .25       .29
  April                    277   51.64   15.77    .20   .15           .90       .29
Grade 6
  October                  313   53.75   16.38    .18   .14           .25       .28
  January                  312   55.71   15.87    .28   .14           .06       .28
  April                    311   59.14   17.19    .17   .14           .89       .28
Grade 7
  October                  335   61.82   16.62    .24   .13          -.12       .27
  January                  333   63.20   16.99    .36   .13           .36       .27
  April                    334   65.40   16.77    .39   .13          1.40       .27
For the CBM data, a 2 X 2 between-groups ANOVA (gender by aboriginal status) was run for each testing period for each of the three variables: WRC, WSC, and TWW. A total of 57 analyses of variance were calculated. Values of F and p are reported in Tables 12, 13, and 14. The degrees of freedom between (J - 1) is always equal to 1 when there are two genders or two categories of aboriginal status. The degrees of freedom within (N - J) are always N - 2 for the main effects and N - JK for the interaction, so for all analyses of variance these will not be shown in the respective tables. Summaries of the CBM analyses of variance are found in Tables 12, 13, and 14 for gender, aboriginal status, and the interaction of gender and aboriginal status (G X Ab). Analyses of variance that are significant at p < .05 are marked with a single asterisk; analyses of variance that are significant at p < .01 are marked with a double asterisk. As with the DIBELS data, in order to examine the assumption of homogeneity of variances, Levene's test was run for all analyses of variance for the CBM data, and in all cases there was no violation of this assumption (all p > .05). Of the 57 analyses of variance performed, every calculation indicated there were no significant interactions between gender and aboriginal status for reading and writing fluency at the .05 probability level, and only 3 of the 57 had p < .10. The results are found in the G X Ab rows of Tables 12, 13, and 14, and the lack of interaction between gender and aboriginal status is well illustrated by the parallel lines in Figure 3.

[Figure 3 appears here: line graph of estimated marginal means of October WRC for Grade 2, plotted by gender (F, M) against aboriginal status.]

Figure 3. Line graph of estimated marginal means for male/female and aboriginal/non-aboriginal for the Grade 2 October testing of WRC, showing there is no interaction between the variables of gender and aboriginal status.

As illustrated in Figure 3, when the end points are subtracted, the resulting gender gaps are approximately the same for both the non-aboriginal and aboriginal groups. When comparing the difference between the non-aboriginal and aboriginal end points, the amount is approximately the same for both genders. There are three cases where gender differences are evident for reading fluency (WRC). In 3 of the 19 ANOVAs for WRC, significant gender differences were found. A significant gender difference was found for the April Grade 2 reading analysis, F(1, 265) = 5.792, p < .05, and for the Grade 6 October and January reading analyses: F(1, 313) = 4.388, p < .05; F(1, 310) = 5.183, p < .05. All other reading analyses did not indicate a significant gender difference. If a Bonferroni correction of α / 3 ≈ .016 is applied, none of the 19 ANOVAs indicate a significant difference. In the case of writing fluency, there is evidence of a significant gender difference. In 17 of the 19 ANOVAs for WSC and in 18 of the 19 ANOVAs for TWW, significant gender differences were found (p < .05). The analyses which did not have a significant gender result were the Grade 2 October and Grade 3 October results for WSC: F(1, 264) = 3.769, p > .05; F(1, 281) = 3.813, p > .05. The other writing analysis that did not have a significant gender difference was the Grade 3 October result for TWW: F(1, 281) = 1.346, p > .05. The results for WSC and TWW are so similar because these two variables are very highly correlated.
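The "parallel lines" reading of Figure 3 amounts to comparing cell means: if the gender gap is about the same within the aboriginal and the non-aboriginal group, there is no interaction. The following is a minimal sketch of that comparison, again with hypothetical file and column names and assuming gender is coded F/M.

```python
# Sketch of the marginal-means comparison illustrated in Figure 3.
# File, column names, and the F/M coding are hypothetical assumptions.
import pandas as pd

data = pd.read_csv("cbm_norming.csv")  # hypothetical file

# Cell means for the 2 X 2 design on one variable (e.g., October WRC).
cell_means = data.groupby(["aboriginal", "gender"])["WRC_oct"].mean().unstack()
print(cell_means)

# If the interaction is negligible, these differences are about equal,
# which is what the parallel lines in Figure 3 show.
gender_gap = cell_means["F"] - cell_means["M"]
print(gender_gap)  # one gender gap per aboriginal-status group
```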
If a Bonferroni correction of α / 3 ≈ .016 is applied, 17 of the 19 ANOVAs for WSC are still statistically significant. With only 3 of the 19 ANOVA results showing a significant gender difference, there is not consistent evidence of a gender difference across all grades for reading fluency. In the case of writing fluency, with 17 of the 19 ANOVA results for WSC showing a significant gender difference, there is consistent evidence of a gender difference across all grades. The ANOVA results for aboriginal differences for WRC, WSC, and TWW are found in Tables 12, 13, and 14 respectively. As previously mentioned, a total of 57 analyses of variance were calculated. A significant difference between aboriginal students and non-aboriginal students was detected in 14 of the 19 ANOVAs for WRC. There were five cases where no significant differences in reading fluency between aboriginal and non-aboriginal students were found, all at the Grade 4 and 5 levels. These occurred in the January and April Grade 4 reading tests, F(1, 309) = .673, p > .05; F(1, 309) = 3.718, p > .05, and in the October, January, and April Grade 5 reading tests: F(1, 278) = .000, p > .05; F(1, 277) = .060, p > .05; F(1, 276) = .003, p > .05. With the lack of significant results showing an aboriginal/non-aboriginal difference at the Grade 4 and 5 levels for reading fluency, it is difficult to state that there is a difference across all grade levels, but there is a significant difference in reading fluency between aboriginal and non-aboriginal students at the Grade 2, 3, 6, and 7 levels. This also indicates a lack of an explainable trend. A significant difference between aboriginal and non-aboriginal students for writing fluency is not as evident. In 7 of the 19 ANOVAs calculated for WSC, a significant difference was detected. These differences occurred in the Grade 1 testing in April, F(1, 247) = 9.632, p < .05; the Grade 2 testing in January, F(1, 267) = 12.26, p < .05; the Grade 4 testing in October, F(1, 307) = 9.49, p < .05; the Grade 5 testing in April, F(1, 277) = 6.403, p < .05; the Grade 6 testing in January and April, F(1, 312) = 3.975, p < .05; F(1, 311) = 4.331, p < .05; and the Grade 7 testing in April, F(1, 334) = 5.531, p < .05. Due to the high correlation between WSC and TWW, the results for TWW were very similar to those for WSC. The effect sizes for six out of seven of these significant differences were small. With 12 of the 19 ANOVA results showing no significant difference between aboriginal and non-aboriginal students for WSC, there is not enough evidence of a significant difference in writing fluency between aboriginal and non-aboriginal students across the grade levels, which could be a sample size issue. Whether or not students are aboriginal does appear to have an impact on their reading fluency scores at Grades 1, 2, 3, 6, and 7, but does not appear to have an impact on their writing fluency scores across all grade levels.
Table 12
Analysis of Variance for Gender and Aboriginal Differences in WRC

                October            January            April
Source          F        p         F        p         F        p
Grade 1
  Gender                                               1.764    .185
  Aboriginal                                           15.224   .000**
  G X Ab                                               0.151    .698
Grade 2
  Gender        0.586    .445      2.238    .136      5.792    .017*
  Aboriginal    10.773   .001**    16.384   .000**    20.180   .000**
  G X Ab        0.015    .901      0.002    .965      0.000    .984
Grade 3
  Gender        0.991    .320      2.151    .144      1.595    .208
  Aboriginal    10.001   .002**    9.086    .003**    9.323    .003**
  G X Ab        0.022    .882      0.255    .614      0.000    .984
Grade 4
  Gender        0.170    .680      0.163    .687      0.048    .826
  Aboriginal    4.956    .027*     0.673    .413      3.718    .055
  G X Ab        0.048    .827      0.038    .846      0.177    .674
Grade 5
  Gender        0.014    .906      0.339    .561      0.144    .705
  Aboriginal    0.000    .997      0.060    .807      0.003    .960
  G X Ab        2.769    .097      3.346    .061      2.683    .103
Grade 6
  Gender        4.388    .037*     5.183    .024*     3.075    .080
  Aboriginal    11.506   .001**    16.869   .000**    16.425   .000**
  G X Ab        0.042    .837      0.211    .646      0.763    .383
Grade 7
  Gender        3.412    .066      3.108    .079      0.942    .333
  Aboriginal    4.746    .030*     6.061    .014*     4.445    .036*
  G X Ab        0.002    .963      0.353    .553      1.302    .255

Note: * p < .05, ** p < .01; p < .0005 is recorded as .000.

Table 13
Analysis of Variance for Gender and Aboriginal Differences in WSC

                October            January            April
Source          F        p         F        p         F        p
Grade 1
  Gender                                               5.168    .024*
  Aboriginal                                           9.632    .002**
  G X Ab                                               1.013    .315
Grade 2
  Gender        3.769    .053      9.983    .002**    8.080    .005**
  Aboriginal    3.214    .074      12.260   .001**    2.482    .116
  G X Ab        0.330    .566      0.093    .761      0.116    .734
Grade 3
  Gender        3.813    .052      9.579    .002**    17.654   .000**
  Aboriginal    2.928    .088      2.823    .094      0.471    .493
  G X Ab        1.127    .289      2.213    .138      1.082    .299
Grade 4
  Gender        20.716   .000**    15.025   .000**    7.870    .005**
  Aboriginal    9.490    .002**    2.751    .098      3.188    .075
  G X Ab        0.925    .337      0.420    .517      0.007    .933
Grade 5
  Gender        10.779   .001**    15.598   .000**    11.963   .001**
  Aboriginal    1.248    .265      0.200    .655      6.403    .012*
  G X Ab        1.781    .183      1.563    .212      1.686    .195
Grade 6
  Gender        18.686   .000**    13.428   .000**    12.629   .000**
  Aboriginal    2.319    .129      3.975    .047*     4.331    .038*
  G X Ab        0.041    .839      0.540    .463      0.408    .523
Grade 7
  Gender        19.773   .000**    20.017   .000**    27.485   .000**
  Aboriginal    0.561    .454      1.017    .314      5.531    .019*
  G X Ab        0.005    .944      0.100    .752      0.399    .528

Note: * p < .05, ** p < .01; p < .0005 is recorded as .000.

Table 14
Analysis of Variance for Gender and Aboriginal Differences in TWW

                October            January            April
Source          F        p         F        p         F        p
Grade 1
  Gender                                               3.898    .049*
  Aboriginal                                           8.689    .004**
  G X Ab                                               1.249    .265
Grade 2
  Gender        6.881    .009**    8.923    .003**    4.794    .029*
  Aboriginal    2.567    .110      10.617   .001**    1.691    .195
  G X Ab        0.005    .945      0.177    .674      0.189    .664
Grade 3
  Gender        1.346    .247      5.618    .018*     14.800   .000**
  Aboriginal    1.939    .165      1.402    .237      0.171    .680
  G X Ab        2.196    .140      3.619    .058      0.788    .375
Grade 4
  Gender        22.966   .000**    15.703   .000**    9.470    .002**
  Aboriginal    8.798    .003**    3.116    .079      2.840    .093
  G X Ab        1.613    .205      0.566    .452      0.217    .642
Grade 5
  Gender        10.412   .001**    17.012   .000**    10.717   .001**
  Aboriginal    1.507    .221      0.338    .561      6.253    .013*
  G X Ab        2.826    .094      3.314    .070      2.026    .156
Grade 6
  Gender        17.671   .000**    11.384   .001**    11.876   .001**
  Aboriginal    1.575    .210      2.894    .090      3.303    .070
  G X Ab        0.236    .628      0.584    .445      0.237    .627
Grade 7
  Gender        14.807   .000**    18.823   .000**    24.998   .000**
  Aboriginal    0.286    .593      0.785    .376      5.175    .024*
  G X Ab        0.052    .819      0.252    .616      0.519    .472

Note: * p < .05, ** p < .01; p < .0005 is recorded as .000.

Effect Sizes and Trends

Effect Sizes for the DIBELS Data

With this many analyses of variance being run, it is necessary to calculate Cohen's d for ISF, LNF, PSF, NWF, and ORF for the appropriate grade level(s) and testing periods. They are reported by gender first in Table 15 and then by aboriginal status in Table 16.
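The Cohen's d values in Tables 15 to 22 are standardized mean differences. The SD column in those tables appears to match the full-sample standard deviations of the descriptive tables, so that convention is used in the sketch below (the classic pooled-SD variant is noted in a comment); the descriptive labels approximate the trivial/small/medium bands used in the text, and the example values are taken from Table 15.

```python
# Sketch of the Cohen's d calculation and the descriptive labels used in
# the text. The divisor here is the full-sample SD, which appears to be
# the convention in Tables 15 to 22; this is an assumption, not confirmed.
def cohens_d(mean_1: float, mean_2: float, sd: float) -> float:
    """Standardized mean difference: (M1 - M2) / SD."""
    return (mean_1 - mean_2) / sd

def label(d: float) -> str:
    """Approximate the trivial/small/medium bands used in the text."""
    d = abs(d)
    if d < 0.20:
        return "Trivial"
    if d < 0.50:
        return "Small"
    if d < 0.80:
        return "Medium"
    return "Large"

# Example: Grade 1 April PSF by gender (female M, male M, SD from Table 15).
d = cohens_d(45.94, 36.79, 16.44)
print(f"d = {d:.2f} ({label(d)})")  # d = 0.56 (Medium)

# Pooled-SD variant, for comparison:
# sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
```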
Effect sizes where there was a significant difference of p < .05 are marked with a single asterisk. For early literacy skills, the statistically significant gender effect sizes range from small (.37 to .46) to medium (.56), with the median effect size at the upper end of small (.45). There are no statistically significant effects that are trivial in size. The analysis is sensitive enough to detect small effects based on sample size when the effect is below the upper end of small, yet not so sensitive as to detect statistically significant but trivial effects. The only test variables that indicate statistically significant effect sizes for gender are all at the Grade 1 level and include PSF, NWF, and ORF. The small number of statistically significant differences is due to a consistent lack of differences in performance on the early literacy skills test variables. The other Grade 1 results (LNF) and all of the Kindergarten results were non-significant, suggesting that no difference is detectable there. Therefore I do not believe there are gender differences in early literacy skills in the population.

Table 15
Effect Sizes for ISF, LNF, PSF, NWF, and ORF by Gender

Grade and testing period   SD      Female n   Female M   Male n   Male M   Effect Size
ISF Kindergarten
  October                   8.57   121        11.60      124      10.74    .10   Trivial
  January                  10.42   120        14.88      122      13.31    .15   Trivial
LNF Kindergarten
  October                  11.41   121        11.87      124       8.25    .32   Small
  January                  14.94   121        22.64      122      17.51    .34   Small
  April                    15.78   116        32.54      125      27.34    .33   Small
LNF Grade 1
  October                  17.04   117        35.78      131      30.84    .29   Small
PSF Kindergarten
  January                  15.06   120        16.05      122      12.60    .23   Small
  April                    16.41   116        22.03      124      19.53    .15   Trivial
PSF Grade 1
  October                  19.05   117        29.05      131      20.44    .45   Small*
  January                  18.83   118        40.50      134      31.86    .46   Small*
  April                    16.44   108        45.94      123      36.79    .56   Medium*
NWF Kindergarten
  January                   9.08    88         9.07       92       5.03    .45   Small
  April                    13.87   116        16.52      123      13.35    .23   Small
NWF Grade 1
  October                  17.06   118        23.16      131      16.72    .38   Small*
  January                  21.48   118        39.54      133      35.53    .19   Trivial
  April                    30.40   110        56.35      123      51.13    .17   Trivial
ORF Grade 1
  January                  20.79   118        21.91      132      17.78    .20   Trivial
  April                    28.29   110        44.73      122      34.30    .37   Small*

Note: * denotes cases where there was a significant gender difference (p < .05).

Also for early literacy skills, the statistically significant effect sizes for aboriginal status range from small (.35 to .48) to medium (.50 to .63), with the median effect size at the lower end of medium (.50). There are no statistically significant effects of trivial size. The analysis is sensitive enough to detect medium effects based on sample size when the effect is below the lower end of medium, yet not so sensitive as to detect statistically significant but trivial effects. The only test variable that does not indicate statistically significant effect sizes for aboriginal status is NWF at the Kindergarten level. The presence of statistically significant differences is due to consistent differences in performance on the early literacy skills test variables. Statistically significant differences in the sample indicate significant results; therefore I believe there are aboriginal differences in early literacy skills in the population. The statistically significant effect for aboriginal status is slightly greater than the statistically significant effect for gender when comparing the median statistically significant effect sizes for the two groups.
For gender the median effect size is .45 and for aboriginal status the median effect size is .50. This may merely be a sample size artifact, as aboriginal group sizes are approximately 40 to 55 whereas gender group sizes are approximately 90 to 130.

Table 16
Effect Sizes for ISF, LNF, PSF, NWF, and ORF by Aboriginal Status

Grade and testing period   SD      Non-Aboriginal n   Non-Aboriginal M   Aboriginal n   Aboriginal M   Effect Size
ISF Kindergarten
  October                   8.57   205                11.72              40              8.31          .40   Small*
  January                  10.42   202                14.79              40             10.55          .41   Small*
LNF Kindergarten
  October                  11.41   205                10.91              40              5.55          .47   Small*
  January                  14.94   203                20.91              40             15.75          .35   Small*
  April                    15.78   202                30.77              39             25.08          .36   Small*
LNF Grade 1
  October                  17.04   193                34.99              55             26.78          .48   Small*
PSF Kindergarten
  January                  15.06   202                15.56              40              8.00          .50   Medium*
  April                    16.41   201                21.39              39             16.85          .28   Small
PSF Grade 1
  October                  19.05   193                26.63              55             17.02          .50   Medium*
  January                  18.83   198                38.26              54             27.26          .58   Medium*
  April                    16.44   182                42.48              49             35.84          .40   Small*
NWF Kindergarten
  January                   9.08   151                 7.48              29              4.55          .32   Small
  April                    13.87   201                15.39              38             12.21          .23   Small
NWF Grade 1
  October                  17.06   194                21.64              55             13.18          .50   Medium*
  January                  21.48   197                39.79              54             28.74          .51   Medium*
  April                    30.40   184                57.32              49             39.61          .58   Medium*
ORF Grade 1
  January                  20.79   197                21.97              53             11.38          .51   Medium*
  April                    28.29   183                42.99              49             25.24          .63   Medium*

Note: * denotes cases where there was a significant difference (p < .05) between aboriginal and non-aboriginal students.

Effect Sizes for the CBM Data

As previously mentioned, with the number of analyses of variance being run, it is necessary to calculate Cohen's d for the WRC, WSC, and TWW analyses for each grade level and each testing period. These effect sizes are reported by gender first in Tables 17, 18, and 19 and then by aboriginal status in Tables 20, 21, and 22 respectively. Effect sizes where there was a significant difference of p < .05 are marked with a single asterisk. For WRC, the statistically significant gender effect sizes are small (.32 to .37), with the median effect size at the mid-range of small (.35). There are no statistically significant effects that are trivial in size. The analysis is sensitive enough to detect small effects based on sample size when the effect is below the mid-range of small, but not so sensitive as to declare trivial effects statistically significant. The lack of statistically significant differences is due to a consistent lack of differences in performance on the WRC test variable. The lack of statistically significant differences in the sample indicates non-significant results; therefore I do not believe there are gender differences in reading in the population. For WSC, the statistically significant gender effect sizes range from small (.38 to .44) to medium (.50 to .67), with the median effect size at the lower end of medium (.55). There are no statistically significant effects that are trivial in size. The analysis is sensitive enough to detect medium effects based on sample size when the effect is below the lower end of medium, but not so sensitive as to declare trivial effects statistically significant. The presence of statistically significant differences is due to consistent differences in performance on the WSC test variable. Statistically significant differences in the sample indicate significant results; therefore I believe there are gender differences in writing in the population.
For TWW, the statistically significant gender effect sizes also range from small (.33 to .49) to medium (.52 to .63), with the median effect size at the lower end of medium (.52). There are no statistically significant effects that are trivial in size. The analysis is also sensitive enough to detect medium effects based on sample size when the effect is below the lower end of medium, but not so sensitive as to declare trivial effects statistically significant. The presence of statistically significant differences is due to consistent differences in performance on the TWW test variable. As with WSC, the statistically significant differences in the sample lead me to believe there are gender differences in writing in the population.

Table 17
Effect Sizes for WRC by Gender

Grade and testing period   SD      Female n   Female M   Male n   Male M   Effect Size
Grade 1
  April                    29.60   115         40.03     132       32.53   .25   Small
Grade 2
  October                  39.57   120         54.77     146       49.21   .14   Trivial
  January                  39.80   115         73.21     149       63.36   .25   Small
  April                    42.32   118         89.70     147       74.07   .37   Small*
Grade 3
  October                  40.26   126         92.33     155       85.66   .17   Trivial
  January                  41.62   122        106.57     160       98.02   .21   Small
  April                    39.47   124        115.23     157      106.42   .22   Small
Grade 4
  October                  40.89   154        104.79     155      101.01   .09   Trivial
  January                  40.13   156        115.83     153      112.27   .09   Trivial
  April                    38.30   156        122.03     153      118.53   .09   Trivial
Grade 5
  October                  36.08   140        118.82     138      111.23   .21   Small
  January                  37.33   139        126.54     138      116.43   .27   Small
  April                    38.35   141        135.34     135      125.58   .25   Small
Grade 6
  October                  38.35   151        134.38     162      122.07   .32   Small*
  January                  39.70   152        138.65     158      124.58   .35   Small*
  April                    38.17   149        144.11     163      131.99   .32   Small
Grade 7
  October                  40.49   164        140.64     170      130.19   .26   Small
  January                  40.66   166        145.28     169      133.15   .30   Small
  April                    40.18   171        148.64     164      139.03   .24   Small

Note: * denotes cases where there was a significant gender difference (p < .05).

Table 18
Effect Sizes for WSC by Gender

Grade and testing period   SD      Female n   Female M   Male n   Male M   Effect Size
Grade 1
  April                     7.03   116        11.45      131       8.27    .45   Small*
Grade 2
  October                   8.05   119        14.31      145      11.42    .36   Small
  January                   9.47   118        20.65      149      15.87    .51   Medium*
  April                    10.83   118        25.08      147      20.66    .41   Small*
Grade 3
  October                  10.94   126        25.73      155      20.78    .45   Small
  January                  12.19   122        33.01      161      24.80    .67   Medium*
  April                    12.25   124        35.68      155      28.55    .58   Medium*
Grade 4
  October                  12.54   153        35.75      154      28.43    .58   Medium*
  January                  13.22   155        39.90      152      32.87    .53   Medium*
  April                    14.84   156        46.38      153      39.79    .44   Small*
Grade 5
  October                  14.21   139        43.54      139      38.14    .38   Small*
  January                  14.10   140        47.39      140      40.29    .50   Medium*
  April                    15.74   140        52.23      137      46.05    .39   Small*
Grade 6
  October                  16.54   151        55.96      162      46.40    .58   Medium*
  January                  16.17   152        57.97      160      48.98    .56   Medium*
  April                    17.33   149        61.84      162      52.48    .54   Medium*
Grade 7
  October                  16.43   165        64.67      170      54.28    .63   Medium*
  January                  16.80   166        65.99      167      55.78    .61   Medium*
  April                    16.90   170        68.86      164      57.51    .67   Medium*

Note: * denotes cases where there was a significant gender difference (p < .05).

Table 19
Effect Sizes for TWW by Gender

Grade and testing period   SD      Female n   Female M   Male n   Male M   Effect Size
Grade 1
  April                     8.28   116        15.28      131      11.83    .42   Small*
Grade 2
  October                   8.84   119        18.75      145      15.20    .40   Small*
  January                   9.81   118        24.91      149      20.08    .49   Small*
  April                    10.98   118        29.17      147      24.97    .38   Small*
Grade 3
  October                  11.06   126        28.86      155      24.75    .37   Small
  January                  12.16   122        36.11      161      28.71    .61   Medium*
  April                    12.39   124        38.76      155      32.01    .54   Medium*
Grade 4
  October                  12.89   153        39.24      154      31.67    .59   Medium*
  January                  13.54   155        42.86      152      35.63    .53   Medium*
  April                    15.00   156        49.32      153      42.67    .44   Small*
Grade 5
  October                  14.37   139        46.12      139      41.35    .33   Small*
  January                  14.26   140        49.84      140      43.21    .47   Small*
  April                    15.77   140        54.36      137      48.85    .35   Small*
Grade 6
  October                  16.38   151        58.34      162      49.48    .54   Medium*
  January                  15.87   152        59.95      160      51.68    .52   Medium*
  April                    17.19   149        63.76      162      54.90    .52   Medium*
Grade 7
  October                  16.62   165        66.62      170      57.15    .57   Medium*
  January                  16.99   166        68.10      167      58.34    .57   Medium*
  April                    16.77   170        70.56      164      60.05    .63   Medium*

Note: * denotes cases where there was a significant gender difference (p < .05).
For differences in aboriginal versus non-aboriginal status, the statistically significant effect sizes for WRC range from small (.29 to .48) to medium (.51 to .68), with the median effect size at the lower end of medium (.51). There are no statistically significant effects that are trivial in size. The analysis is sensitive enough to detect medium effects based on sample size when the effect is below the lower end of medium, but not so sensitive as to declare trivial effects statistically significant. The presence of statistically significant differences is due to consistent differences in performance on the WRC test variable. Statistically significant differences in the sample indicate significant results; therefore I believe there are aboriginal differences in reading in the population. For aboriginal status differences, the statistically significant effect sizes for WSC range from small (.24 to .48) to medium (.53), with the median effect size at the mid-range of small (.31). The statistically significant effect sizes go from medium to small in a progression from Grade 2 to Grade 7. There are no statistically significant effects that are trivial in size. The analysis is sensitive enough to detect small effects based on sample size when the effect is below the mid-range of small, but not so sensitive as to declare trivial effects statistically significant. The lack of significant differences is due to a consistent lack of difference in performance on the WSC test variable. The lack of statistically significant differences in the sample indicates non-significant results; therefore I do not believe there are aboriginal differences in writing in the population. For TWW, the statistically significant effect sizes are small (.23 to .49) for aboriginal status differences. The median effect size is at the mid-range of small (.39). There are no statistically significant effects that are trivial in size. The analysis is sensitive enough to detect small effects based on sample size when the effect is below the mid-range of small, but not so sensitive as to declare trivial effects statistically significant. The lack of significant differences is due to a consistent lack of difference in performance on the TWW test variable. The lack of statistically significant differences in the sample indicates non-significant results; therefore I do not believe there are aboriginal differences in writing in the population.
Table 20
Effect Sizes for WRC by Aboriginal Status

Grade and testing period    SD      Non-Aboriginal n   Non-Aboriginal M   Aboriginal n   Aboriginal M   Effect size
Grade 1   April            29.60         197               39.33               50            21.78        .60  Medium*
Grade 2   October          39.67         214               55.33               52            35.32        .51  Medium*
          January          39.40         212               72.54               52            47.71        .62  Medium*
          April            42.32         214               86.36               51            57.86        .68  Medium*
Grade 3   October          40.26         238               91.97               43            70.26        .54  Medium*
          January          41.62         239              105.14               43            82.74        .54  Medium*
          April            39.47         240              113.25               41            93.35        .51  Medium*
Grade 4   October          40.89         261              105.11               48            90.33        .35  Small*
          January          40.13         263              114.87               46           109.52        .13  Trivial
          April            38.30         263              122.05               46           110.24        .31  Small
Grade 5   October          36.08         235              115.12               43           114.67        .01  Trivial
          January          37.33         237              121.40               40           122.10       -.02  Trivial
          April            38.35         232              130.66               44           130.05        .02  Trivial
Grade 6   October          38.35         251              131.67               62           113.18        .48  Small*
          January          39.70         246              136.09               64           113.78        .56  Medium*
          April            38.17         250              141.97               62           120.85        .55  Medium*
Grade 7   October          40.49         279              137.27               55           125.44        .29  Small*
          January          40.66         281              141.40               54           127.54        .34  Small*
          April            40.18         280              146.00               55           133.44        .31  Small*

Note: * denotes cases where there was a significant difference (p < .05) between aboriginal and non-aboriginal students

Table 21
Effect Sizes for WSC by Aboriginal Status

Grade and testing period    SD      Non-Aboriginal n   Non-Aboriginal M   Aboriginal n   Aboriginal M   Effect size
Grade 1   April             7.03         197               10.44               50             7.10        .48  Small*
Grade 2   October           8.05         213               13.15               51            10.94        .28  Small
          January           9.47         214               18.97               53            13.98        .53  Medium*
          April            10.83         214               23.17               51            20.39        .26  Small
Grade 3   October          10.94         238               23.46               43            20.44        .28  Small
          January          12.19         240               28.32               43            25.65        .26  Small
          April            12.25         240               32.00               39            29.95        .17  Trivial
Grade 4   October          12.54         259               33.01               48            27.04        .48  Small*
          January          13.22         261               36.97               46            33.33        .28  Small
          April            14.84         264               43.80               45            39.13        .32  Small
Grade 5   October          14.21         235               41.14               43            39.21        .14  Trivial
          January          14.10         238               43.81               42            44.02       -.01  Trivial
          April            15.74         233               49.97               44            44.93        .32  Small*
Grade 6   October          16.54         251               51.77               62            47.94        .23  Small
          January          16.17         248               54.27               64            49.81        .28  Small*
          April            17.33         249               57.95               62            52.98        .29  Small*
Grade 7   October          16.43         280               59.50               55            58.85        .04  Trivial
          January          16.80         279               61.02               54            60.09        .06  Trivial
          April            16.90         280               63.94               54            59.94        .24  Small*

Note: * denotes cases where there was a significant difference (p < .05) between aboriginal and non-aboriginal students

Table 22
Effect Sizes for TWW by Aboriginal Status

Grade and testing period    SD      Non-Aboriginal n   Non-Aboriginal M   Aboriginal n   Aboriginal M   Effect size
Grade 1   April             8.28         197               14.21               50            10.48        .45  Small*
Grade 2   October           8.84         213               17.24               51            14.96        .26  Small
          January           9.81         214               23.17               53            18.36        .49  Small*
          April            10.98         214               27.26               51            25.08        .20  Small
Grade 3   October          11.06         238               26.94               43            24.65        .21  Small
          January          12.16         240               32.20               43            30.23        .16  Trivial
          April            12.39         240                 —                 39            33.77        .12  Trivial
Grade 4   October          12.89         259               36.37               48            30.44        .46  Small*
          January          13.54         261               39.37               46            35.91        .29  Small
          April            15.00         264               46.69               45            42.13        .30  Small
Grade 5   October          14.37         235               44.07               43            41.88        .15  Trivial
          January          14.26         238               46.52               42            46.57       -.00  Trivial
          April            15.77         233               52.43               44            47.41        .32  Small*
Grade 6   October          16.38         251               54.39               62            51.15        .20  Small
          January          15.87         248               56.48               64            52.72        .24  Small
          April            17.19         249               60.01               62            55.66        .25  Small
Grade 7   October          16.62         280               61.87               55            61.55        .02  Trivial
          January          16.99         279               63.31               54            62.67        .04  Trivial
          April            16.77         280               66.03               54            62.19        .23  Small*

Note: * denotes cases where there was a significant difference (p < .05) between aboriginal and non-aboriginal students

When comparing the median effect sizes for the variables WRC, WSC, and TWW, the significant difference effect for gender is greater than the significant difference effect for
aboriginal status for WSC and TWW only. For gender the WSC and TWW median effect sizes are .55 and .52 respectively; for aboriginal status the WSC and TWW median effect sizes are .31 and .39 respectively. For the variable WRC the significant difference effect for aboriginal status (median effect size of .51) is greater than the significant difference effect for gender (median effect size of .35). The non-significant results for aboriginal status could be due to the smaller sample size for this group.

DIBELS and CBM Data Trends

Some trends are evident in the data for both the DIBELS and CBM measures. One trend is that female participants outperform male participants in all measures of early literacy skills for Kindergarten and Grade 1, as well as in both the reading and writing measures for Grade 2 to 7 students. This is based on a comparison of the mean scores for the variables tested in the DIBELS and CBM studies. Despite this overall trend of females outperforming males, the only statistically significant gender differences detected are for writing for Grade 1 to 7 and for some of the early literacy skills for Grade 1. There is no noticeable trend in the gap between female and male performance: males are neither catching up nor falling further behind when the mean scores are compared from grade to grade and from testing period to testing period. For both genders from Kindergarten to Grade 7 there is an increase in mean scores from one testing period to the next for each and every test variable.

When comparing effect sizes by gender for reading and writing fluency, PSF and LNF were the measures used for Kindergarten, and WRC and WSC were the measures used for Grade 1 to 7. For ease of graphical comparison, PSF for Kindergarten is graphed along with WRC for Grade 1 to 7, and LNF for Kindergarten is graphed along with WSC for Grade 1 to 7 (see Figure 4). There is an increasing trend in the effect size for both WRC and WSC from Kindergarten to Grade 7, with effect sizes for both measures roughly doubling between Kindergarten and Grade 7.

[Figure omitted: effect size (0 to 0.7) by grade (K to 7); series for WRC and WSC effect sizes by gender.]
Figure 4. Line graph of median effect sizes for reading and writing by gender.

A trend that is evident for aboriginal status is that non-aboriginal students outperformed aboriginal students at every grade level and test variable except for the Grade 5 January testing, where aboriginal students scored higher on all three test variables. This is based on a comparison of mean scores for all testing variables and periods for both the DIBELS and CBM data. PSF and LNF were the measures used for comparison at the Kindergarten and Grade 1 levels for reading and writing fluency. For both non-aboriginal and aboriginal students from Kindergarten to Grade 7 there is an increase in mean scores from one testing period to the next for each and every test variable except at the Grade 7 level for the April testing of WSC. In this case the April mean test score for aboriginal students dropped from the January mean score. When comparing effect sizes by aboriginal status for reading and writing fluency, PSF and LNF were the measures used for Kindergarten, and WRC and WSC were the measures used for Grade 1 to 7. For ease of graphical comparison, PSF for Kindergarten is graphed along with WRC for Grade 1 to 7, and LNF for Kindergarten is graphed along with WSC for Grade 1 to 7 (see Figure 5).
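Each point in Figures 4 and 5 is one summary value per grade: the median of that grade's three testing-period effect sizes. The following is a brief sketch of that aggregation, using the WRC-by-gender effect sizes from Table 17 (the Kindergarten PSF value is reported with the DIBELS results and is omitted here).

from statistics import median

# WRC effect sizes by gender from Table 17 (October, January, April);
# Grade 1 was tested in April only.
wrc_gender = {
    1: [0.25],
    2: [0.14, 0.25, 0.37],
    3: [0.17, 0.21, 0.22],
    4: [0.09, 0.09, 0.09],
    5: [0.21, 0.27, 0.25],
    6: [0.32, 0.35, 0.32],
    7: [0.26, 0.30, 0.24],
}

# One median effect size per grade, as plotted in Figure 4
wrc_gender_medians = {grade: median(sizes) for grade, sizes in wrc_gender.items()}
print(wrc_gender_medians)
# {1: 0.25, 2: 0.25, 3: 0.21, 4: 0.09, 5: 0.25, 6: 0.32, 7: 0.26}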
Overall there is a decreasing trend in the effect sizes for both WRC and WSC from Grade 1 to Grade 7. This decrease indicates the difference between non-aboriginal students and aboriginal students is getting slightly smaller as the students reach the higher grade levels. There is, however, a notable increase in the effect size by aboriginal status from Kindergarten to Grade 1: for WRC the effect size roughly doubles between these two grade levels. This indicates the difference between non-aboriginal students and aboriginal students widens from Kindergarten to Grade 1.

[Figure omitted: effect size (0 to 0.7) by grade (K to 7); series for WRC and WSC effect sizes by aboriginal status.]
Figure 5. Line graph of median effect sizes for reading and writing by aboriginal status.

There is an anomaly at the Grade 5 level. Particularly for WRC, the effect size by aboriginal status for all three testing periods is smaller than the effect sizes for the other grade levels. The effect sizes for both WRC and WSC are trivial at the Grade 5 level. It is also at the Grade 5 level where aboriginal students outperformed non-aboriginal students in the January testing period for WRC, WSC, and TWW.

When the median of the three within-grade mean scores for WRC is plotted for both gender and aboriginal status, there is a steady increase from Grade 1 right through to Grade 7. This increase is evident for all four groups: female, male, non-aboriginal, and aboriginal students (see Figure 6). Results for Kindergarten are not included in this graph because there is no accurate Kindergarten measure for reading. There is a similar result for the median of the three within-grade mean scores for WSC by gender and aboriginal status (see Figure 7). The scores increase steadily from Grade 1 right through to Grade 7 for all four groups: female, male, non-aboriginal, and aboriginal students. Results for Kindergarten are not included in this graph because there is no accurate Kindergarten measure for words spelled or written correctly.

[Figure omitted: mean WRC score (40 to 160) by grade (1 to 7); series for median female, male, non-aboriginal, and aboriginal mean WRC scores.]
Figure 6. Line graph of the median of the three within-grade mean WRC scores for both gender and aboriginal status groups.

[Figure omitted: mean WSC score (0 to 70) by grade (1 to 7); series for median female, male, non-aboriginal, and aboriginal mean WSC scores.]
Figure 7. Line graph of the median of the three within-grade mean WSC scores for both gender and aboriginal status groups.

These two figures (6 and 7) show that the gap between males and females is beginning to widen at the upper grades for both WRC and WSC, while the gap between non-aboriginal and aboriginal students is beginning to narrow for both WRC and WSC. When comparing the median effect sizes by both gender and aboriginal status for WRC, it is evident that there is a downward trend in effect sizes for the aboriginal results and an upward trend in effect sizes for the gender results (see Figure 8). The trends are more apparent when plotted on a line graph. The linear regression equations and R² values for each trend line are as follows: y = 0.012x + 0.1689 with R² = 0.1647 for the WRC effect size by gender, and y = -0.025x + 0.515 with R² = 0.0838 for the WRC effect size by aboriginal status.
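These trend lines are ordinary least-squares fits of the median effect sizes against grade. The sketch below shows one way to obtain such a fit; it assumes grades are coded numerically (the coding used above is not stated, so Kindergarten = 0 through Grade 7 = 7 is an assumption), and the illustrative call uses only the Grade 1 to 7 medians computed earlier, so its coefficients will not exactly match the reported ones, which include Kindergarten.

import numpy as np

def fit_trend(grades, effect_sizes):
    """Least-squares line through (grade, effect size) points;
    returns (slope, intercept, r_squared)."""
    x = np.asarray(grades, dtype=float)
    y = np.asarray(effect_sizes, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r_squared = 1 - residuals.var() / y.var()
    return slope, intercept, r_squared

# Illustrative fit over the Grade 1 to 7 WRC-by-gender medians
slope, intercept, r2 = fit_trend(range(1, 8),
                                 [0.25, 0.25, 0.21, 0.09, 0.25, 0.32, 0.26])
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r2:.4f}")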
The slope for the gender results is positive, which is further evidence of an upward trend in effect sizes for gender and suggests females are continuing to pull ahead of males. The slope for aboriginal status is negative, which is further evidence of a downward trend in effect sizes for aboriginal status and suggests aboriginal students are getting closer to non-aboriginal students. The effect sizes for WRC by gender all fall in the range of just under 0.1 to just over 0.3, trivial to small by Cohen's conventions. There is also a large dip in the effect size for WRC by aboriginal status at the Grade 5 level, where Cohen's d drops essentially to zero. This appears to be an anomaly.

[Figure omitted: median WRC effect size (0 to 0.7) by grade (K to 7); series for gender and aboriginal status.]
Figure 8. Line graph of median WRC effect sizes by gender and aboriginal status.

When comparing the median effect sizes by both gender and aboriginal status for WSC, it is again evident that there is a downward trend in effect sizes for the aboriginal results and an upward trend in effect sizes for the gender results (see Figure 9). These trends become more apparent on the line graph. The linear regression equations and R² values for each trend line are as follows: y = 0.0302x + 0.3489 with R² = 0.4949 for the WSC effect size by gender, and y = -0.0412x + 0.4579 with R² = 0.6146 for the WSC effect size by aboriginal status. The slope for the gender results is positive, again evidence of an upward trend in effect sizes for gender, confirming that females are continuing to pull ahead of males. The slope for aboriginal status is negative, further evidence of a downward trend in effect sizes for aboriginal status, confirming indications that aboriginal students are getting closer to non-aboriginal students.

For both the gender and aboriginal comparisons, the slopes for WRC are roughly half the size of the corresponding slopes for WSC; in other words, the effect sizes for WSC are changing roughly twice as fast across grades as those for WRC. The R² values for WSC by gender and aboriginal status are very large, approximately 50% and 62% respectively, compared to the WRC values of approximately 17% and 8%. This indicates that the WSC data are much more tightly clustered around the regression line than the WRC data.

There are noticeable trends in both the gender and aboriginal effect sizes for WSC. For both gender and aboriginal status the WSC effect sizes are very similar at the Kindergarten and Grade 1 levels (small to medium). By Grade 2 the gender and aboriginal results begin to diverge sharply. For gender the effect sizes for WSC begin to climb, so that by Grade 6 the effect size is medium. For aboriginal status the effect sizes for WSC begin to fall, and by Grade 5 the effect size is trivial. For both groups there is a dip in effect sizes at the Grade 5 level, another anomaly.
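The WSC trend can be checked the same way. The following is a small self-contained sketch fitting the Grade 1 to 7 medians of the WSC-by-gender effect sizes from Table 18 (Kindergarten is again omitted, so the slope and R² will differ somewhat from the reported fit).

import numpy as np

# Grade 1 to 7 medians of the WSC-by-gender effect sizes in Table 18
grades = np.arange(1, 8, dtype=float)
medians = np.array([0.45, 0.41, 0.58, 0.53, 0.39, 0.56, 0.63])
slope, intercept = np.polyfit(grades, medians, 1)
residuals = medians - (slope * grades + intercept)
r2 = 1 - residuals.var() / medians.var()
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r2:.4f}")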