LEVERAGING MACHINE LEARNING TO DECODE INSURANCE PURCHASING DISPARITIES IN CANADIAN HOUSEHOLDS: A PCA APPROACH

by

Chongrui Zhou

BSc., University of Northern British Columbia, 2009

THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE (MSC) IN BUSINESS ADMINISTRATION

UNIVERSITY OF NORTHERN BRITISH COLUMBIA

August 2024

© Chongrui Zhou, 2024

Approval Page

Abstract

This thesis investigates the factors influencing insurance spending among Canadian households, employing advanced machine learning techniques and Principal Component Analysis (PCA). This research develops an integrated predictive model to forecast household expenditures on life, health, and auto insurance, incorporating a comprehensive range of determinants such as household characteristics, economic conditions, and regional differences. Utilizing a robust dataset from the Survey of Household Spending (SHS) for the years 2010 to 2017, with 2019 serving as the validation year, the study applies PCA to manage high-dimensional data effectively, thereby enhancing the predictive performance of the machine learning algorithms used. The results indicate that the model predicts insurance expenditures with notable accuracy; however, it slightly underestimates life insurance costs (an actual expenditure of $1,381 compared to the predicted $1,263) while providing highly accurate forecasts for health insurance. The predictions for car insurance expenditures exhibit larger variances. The findings highlight the substantial benefits of integrating PCA and machine learning to advance predictive analytics in the insurance industry. The study offers critical insights for insurance providers, policymakers, and consumers, laying a data-driven groundwork for strategic decision-making and policy development. Recommendations for future research include refining the predictive models and investigating additional variables that may influence insurance spending.
This thesis not only contributes to the academic discourse but also provides actionable strategies to enhance the accuracy and efficacy of forecasting models in the insurance sector.

Table of Contents

Approval Page
Abstract
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgment and Dedication
Chapter One
Introduction
Background
Problem Statement
Objectives
Chapter Two
Literature Review
Methodology
Hypothesis and Predictive Model
Data Collection
Chapter Three
Data Analysis Narrative/Empirical Results
Chapter Four
Conclusion
Future Direction
Bibliography
Appendix 1: Figures
Appendix 2: Tables

List of Tables

Table 1. Number of Observations and Corresponding Weighted Population Estimates
Table 2. Eigenvalue Distribution After PCA Analysis of Life Insurance Spending
Table 3. Eigenvalue Distribution After PCA Analysis of Private Health Insurance Spending
Table 4. Eigenvalue Distribution After PCA Analysis of Car Insurance Spending
Table 5. 2019 Canadian Households Insurance Spending
Table 6. Mean Life Insurance Spending for Canadian Households: Averages Across Ten Distinct Groups
Table 7. Mean Health Insurance Spending for Canadian Households: Averages Across Ten Distinct Groups
Table 8. Mean Car Insurance Spending for Canadian Households: Averages Across Ten Distinct Groups
Table 9. Canadian Household Data: Post-VariMax Rotation Analysis for Life Insurance Holders
Table 10. Canadian Household Spending on Private Health Insurance: A Post-VariMax Rotation Analysis
Table 11. Canadian Household Data: An Analysis Post-VariMax Rotation for Car Insurance Holders

List of Figures

Figure 1. Eigenvalue Distribution After PCA Analysis of Life Insurance Spending
Figure 2. Mean Predicted vs. Actual Life Insurance Spending Across 10 Groups in Canada, 2019
Figure 3. Eigenvalue Distribution After PCA Analysis of Private Health Insurance Spending
Figure 4. Mean Predicted vs. Actual Private Health Insurance Spending Across 10 Groups in Canada, 2019
Figure 5. Eigenvalue Distribution After PCA Analysis of Car Insurance Spending
Figure 6. Mean Predicted vs. Actual Car Insurance Spending Across 10 Groups in Canada, 2019

Glossary

Component: A derived variable created by combining the original variables of a dataset in a way that maximizes the variance captured, used especially in techniques like Principal Component Analysis to simplify data complexity.

Couple With Kid(S): A subcategory of household type, a classification that describes households based on the relationships between members present at the time of the interview. This classification helps in understanding the familial and social dynamics within different living arrangements.

Couple With Other: Couples with other related or unrelated persons. A subcategory of household type, a classification that describes households based on the relationships between members present at the time of the interview. This classification helps in understanding the familial and social dynamics within different living arrangements.

Female Cloth Spending: This category includes clothing, footwear, accessories, watches, and jewelry for women and girls aged four years and older.

Higher Level of Edu: Highest level of educational attainment of the reference person or the spouse.

Loading: From a numerical perspective, the loadings are equivalent to the coefficients of the variables and offer insights into which variables contribute most significantly to the components.

Lone Parent: Lone parent household with no additional persons. A subcategory of household type, a classification that describes households based on the relationships between members present at the time of the interview.
This classification helps in understanding the familial and social dynamics within different living arrangements.

Man Cloth Spending: This category includes clothing, footwear, accessories, watches, and jewelry for men and boys aged four years and older.

Other: Occupied rent-free by the household; a subcategory of Type of Tenure. This classification refers to dwellings that the household occupies without paying rent. It indicates the type of tenure of the dwelling at the time of the interview.

Other HH Type: Other households with related or unrelated persons. A subcategory of household type, a classification that describes households based on the relationships between members present at the time of the interview. This classification helps in understanding the familial and social dynamics within different living arrangements.

Own With Mortgage: Owned with a mortgage(s) by the household; a subcategory of Type of Tenure. This classification refers to dwellings that are owned by the occupants who are currently paying off one or more mortgages. It indicates the type of tenure of the dwelling at the time of the interview.

PCA Analysis: Principal Component Analysis (PCA) is a statistical method used to reduce the dimensionality of data sets, increasing interpretability while minimizing information loss. It achieves this by transforming a set of correlated variables into a smaller number of uncorrelated variables known as principal components.

Rec Car Insurance: Spending on insurance for recreational vehicles.

RDC: Research Data Center

SHS: Survey of Household Spending

VariMax Rotation: A statistical method used in principal components analysis to streamline solutions and improve the interpretation of outcomes.

1 Person Household: A subcategory of household type, a classification that describes households based on the relationships between members present at the time of the interview.
This classification helps in understanding the familial and social dynamics within different living arrangements.

Acknowledgment and Dedication

First and foremost, I extend my deepest gratitude to my supervisor, Dr. Chengbo Fu. His guidance and support were invaluable when I struggled to find a suitable topic. Dr. Fu helped me arrange the project with RDC and Statistics Canada, providing the necessary resources and connections. His patience and encouragement were instrumental in calming my restless heart and steering me toward this thesis topic. His insightful feedback and continuous support throughout the research process have been crucial in shaping this work. His dedication to my academic growth has not only helped me complete this project but also inspired me to pursue further research with confidence.

I am profoundly grateful to my committee members, Dr. Kafui Monu and Dr. Fan Jiang. Dr. Monu's unwavering support throughout this journey, never giving up on me, has been a source of great strength. Dr. Jiang provided critical guidance and pointed me in the right direction when I felt lost. Their collective expertise and constructive feedback have significantly contributed to the completion of this thesis.

Special thanks to Dr. Chen Jing, who helped me start this journey. During this time, I have explored various fields, read numerous books, and kept updated with the latest news. I also experienced one of the significant financial crises firsthand. Although I couldn't systematically document this major event, being at the forefront of it greatly enriched my life. As a result, I began investing with real money, which ultimately funded my school tuition. I also want to thank Dr. Chen Liang for his inspiration and encouragement, which led me to discover my path.

I extend my thanks to Statistics Canada and RDC for providing me with the opportunity to use the SHS database and for having staff on-site to support me and answer my questions.
I am deeply thankful to my family for their unwavering support throughout this process. Their encouragement has been my greatest strength.

Reflecting on this journey, I realize the profound changes within myself. This experience has transformed my expectations and understanding of education, shifting from merely teaching and learning to facilitating learning. It has broadened my perspective on life, moving from a black-and-white view to understanding that life is often a guessing game. While we may not know the absolute truth, there are ways to make better guesses and get closer to it.

Lastly, I want to thank the University of Northern British Columbia (UNBC) for providing me with all the support and opportunities to complete this project.

Chapter One

Introduction

Canadian consumers' insurance purchasing behaviours differ distinctly from those of their American counterparts (Cacace & Schmid, 2008). This difference creates potential market opportunities and reveals previously hidden aspects of consumer decision-making processes. Given the important role that insurance plays in protecting the financial stability of households and the wider economy, it is essential to conduct a thorough analysis of these behavioural patterns (Nam & Hanna, 2019; Browne & Kim, 1993; Ye et al., 2022).

This study utilizes advanced machine learning techniques and Principal Component Analysis (PCA) to examine insurance spending among Canadian households. The objective is to create a predictive model that illuminates the factors that impact insurance purchasing decisions, providing a useful tool for forecasting future household insurance expenses. This study is valuable for a range of stakeholders, such as insurance companies, policymakers, and consumers in Canada.

Previous research in this domain has mainly focused on describing insurance spending patterns (Mapharing et al., 2015; Browne & Kim, 1993; Bucciol & Miniaci, 2011; Campbell, 1980). However, these analyses have been limited due to the complex and multifaceted nature of the variables involved. Our research stands out by using machine learning and PCA to transform these intricate variables into a cohesive and predictive framework. This approach fosters a deeper understanding of the factors driving household insurance expenses, ultimately assisting in predicting insurance spending and enhancing decision-making within the industry.

The research holds significant implications for various stakeholders. For insurance providers, it offers valuable insights into consumer behaviour, enabling the customization of insurance products and the implementation of effective risk management strategies. Policymakers and regulatory bodies can utilize the critical data generated by this study to make informed decisions, thereby fostering a more resilient and adaptable insurance marketplace. Furthermore, Canadian consumers stand to benefit from a clearer understanding of the factors that influence their insurance choices, empowering them to make more informed financial decisions.

This study examines the insurance purchasing patterns of Canadian households across the life, health, and auto insurance sectors. By employing Principal Component Analysis (PCA) within a machine learning framework, we aim to accurately predict insurance expenditures and identify the underlying factors influencing these purchasing decisions. The prediction of 2019 data produced the following key findings. The average life insurance expenditure of Canadian households was $1,381, slightly exceeding the predicted figure of $1,263, indicating a minor underestimation by the model. The forecast for the average expenditure on health insurance by Canadian households was highly accurate, with a prediction of $1,608 closely aligning with the actual expenditure of $1,594.
For car insurance, the model estimated an expenditure of $2,048, while the actual spending was $1,720, an overestimate of $328 (about 19%). These findings illuminate the factors influencing insurance choices and underscore the potential of machine learning to enhance predictive analytics within the insurance sector. The research offers valuable insights to stakeholders in the insurance industry by outlining spending patterns and highlighting discrepancies between projected and actual expenditures. This paves the way for more precise and data-driven decision-making processes.

It is crucial to emphasize the financial impact of our predictive model. By leveraging PCA, we estimate Canadian household insurance spending with a high degree of accuracy. Precisely forecasting future insurance expenditures translates to significant economic benefits for various stakeholders. The financial implications of our study extend beyond academic inquiry, delivering tangible benefits for both insurance companies and policymakers. Our model fosters a deeper understanding of the dynamics at play within the insurance market, consequently strengthening the sector's economic resilience. It promotes data-driven and efficient financial practices, ultimately contributing to the sustained health and vitality of the insurance industry.

Background

The Canadian insurance industry is a vital pillar of the nation's economy, offering essential risk mitigation tools for individuals and families. It is a diverse sector encompassing various insurance types such as life, health, and auto insurance. This industry thrives on high competition and a wide range of players. From large, multinational corporations with comprehensive insurance portfolios to smaller, niche companies specializing in specific products or customer segments, Canadian insurance caters to a varied clientele.

Life insurance safeguards policyholders' dependents financially in the event of the policyholder's death.
Many insurers offer supplemental products like disability, critical illness, and long-term care insurance alongside basic life insurance policies. The life insurance sector is a blend of established national and international companies, alongside smaller players specializing in specific life insurance products or catering to particular demographics.

Private health insurance complements Canada's public healthcare system by covering services like prescription drugs, dental care, vision care, and allied health services that fall outside the public system's scope. A diverse group of insurers, including life and health insurance companies, property and casualty insurers, and specialized health insurers, offer private health insurance in Canada.

Research like this study, which explores key trends and patterns in life, private health, and car insurance spending, plays a critical role in understanding and anticipating changes in this market. This type of research is particularly significant given the crucial role insurance plays in individual and household financial security, and by extension, the health of the Canadian economy.

Problem Statement

Cacace and Schmid (2008) highlight that in 2005, private insurance financing accounted for 36.6% of total healthcare spending in the USA, compared to only 12.9% in Canada. Private insurance spending was consistently lower in Canada throughout the period from 1990 to 2005. Despite the close cultural and economic ties between the two nations, Canadian households exhibit differences in insurance ownership compared to those in the United States. This disparity suggests potential opportunities for growth in the Canadian insurance market. However, the reasons behind this gap remain unclear, particularly considering the various factors that may influence insurance decisions.
The role of housing characteristics is particularly noteworthy, as research suggests that factors such as suitability, income, education, and age can significantly influence spending (Mapharing et al., 2015). Browne and Kim (1993) analyze the relationships between various economic, social, and cultural factors and the demand for life insurance across countries, aiming to understand the determinants of life insurance consumption rather than predicting future spending. Bucciol and Miniaci (2011) analyze the distribution of risk tolerance among U.S. households, finding significant heterogeneity influenced primarily by age and wealth, while other factors such as education, gender, and race show no significant impact. Their study focuses on understanding these relationships rather than predicting insurance spending. Campbell (1980) analyzes the relationships between economic uncertainties, particularly labour income uncertainty, and the demand for life insurance, deriving optimal insurance demand equations and emphasizing the influence of risk aversion and perceived insurance costs on household decision-making. That work focuses primarily on comprehending these theoretical relationships rather than predicting insurance spending.

Our research addresses this gap by developing a machine learning-based model to predict insurance spending patterns among Canadian households. By incorporating a broad range of variables, this study seeks to comprehensively understand the factors shaping household insurance expenditure.

Objectives

This study aims to provide a comprehensive data-driven analysis of factors influencing insurance expenditure in Canadian households, focusing on life, health, and car insurance. These spending patterns are influenced by various factors such as household income, home value, urbanicity, and geographic location, creating a complex multidimensional space.
To handle this complexity, the research leverages machine learning, particularly Principal Component Analysis (PCA), for dimensionality reduction and feature extraction. This enhances the robustness and predictive accuracy of the model. The research aims to decipher and model the intricate web of factors influencing insurance spending through this synergistic approach, providing a robust analytical framework that combines data-driven insights with predictive capabilities.

Understanding how households spend on insurance has significant implications for various stakeholders. For insurance companies, this research offers valuable insights for developing effective strategies and personalized insurance plans, tailoring coverage options to different income brackets, and informing marketing strategies to target specific customer segments. Policymakers and regulators can benefit by using the data to identify demographics with low insurance coverage, leading to policies that promote competition and innovation in the insurance sector, increasing access to insurance and ensuring a more equitable market. Consumers can make more informed decisions about their insurance needs, potentially leading to lower costs and improved coverage, thereby enhancing their financial security and peace of mind.

Financial advisors and investors can leverage these insights to offer more accurate guidance and inform investment decisions. Advisors can help clients anticipate future insurance costs and identify potential savings opportunities, while investors can use spending trends and future predictions for strategic investment choices.

Finally, this research contributes to the broader academic field by providing a methodology and data foundation for further studies in insurance and household finance. It fosters innovation and deepens the understanding of these fields, potentially extending the findings to different contexts or countries.
The primary contribution of this research is to produce a predictive model practical for use by insurance companies and policymakers. This model aims to enhance strategic decision-making and policy development, ultimately benefiting the insurance market and its stakeholders.

Chapter Two

Literature Review

This study advances our understanding of Canadian household insurance spending by moving from merely identifying influential factors to proactively predicting future trends. This shift enables regulators, policymakers, and insurance companies to make data-driven, anticipatory decisions.

Prior research has predominantly focused on the American context. For instance, Bucciol and Miniaci (2011) analyzed the distribution of risk tolerance among U.S. households, finding significant heterogeneity influenced primarily by age and wealth. Similarly, Campbell (2006) explored the financial behaviours and decision-making processes of American households, examining empirical data and theoretical models to highlight the challenges and discrepancies between actual and ideal financial practices. Nam and Hanna (2019), using data on American households from the Survey of Consumer Finances (SCF), found that higher risk aversion decreases the likelihood of single-parent households owning term life insurance but increases the likelihood of owning cash-value life insurance, with smokers less likely to own term life insurance but more likely to own cash-value life insurance.

However, Canada's unique characteristics, such as its universal healthcare system and distinct approach to genetic information in life insurance (Knoppers & Joly, 2004), necessitate a dedicated examination. While historical Canadian studies (De Bromhead & Borowiecki, 2016) offer valuable insights, they may not accurately reflect the current landscape due to significant demographic and technological shifts. Mapharing et al. (2015) considered a broader range of factors in a more recent study but lacked a model for predicting future trends. Calvet, Campbell, and Sodini (2007) consider various factors such as financial sophistication (wealth, education, pension contributions, liabilities) and demographic characteristics (age, employment status, household size, entrepreneurship, immigration status).

Key factors influencing life insurance spending include higher income levels, increased education, and urbanization, all of which positively impact demand. Inflation can negatively affect demand unless mitigated by concurrent economic growth and reforms. Social and demographic changes, such as smaller family sizes and an aging population, further drive the need for life insurance as a source of financial security (Hwang & Gao, 2003).

Guiso, Haliassos, and Jappelli (2003) show that insurance spending, driven by insurance technical reserves, is a significant component of household financial portfolios in countries like the UK and the Netherlands. This trend is influenced by demographic shifts towards an aging population, institutional developments such as pension reforms, and government policies promoting private retirement savings through tax incentives. Collectively, these factors increase the reliance on insurance products for long-term financial planning.

According to Browne and Kim (1993), the primary factors influencing life insurance spending include the dependency ratio, national income, government social security spending, inflation, the price of insurance, predominant religion, education level, and life expectancy. Bucciol and Miniaci (2011) found that risk tolerance in U.S. households decreases with age and increases with wealth, while education, gender, race, and household size do not significantly influence risk attitudes. Their results are robust across different portfolio definitions and sample compositions.
Liu, Zhang, Chen, and Yang (2021) discovered that superstition, specifically the belief in the zodiac year, significantly influences economic behaviour among Chinese rural households. This belief leads to an 18.5% increase in life insurance spending during the household head's zodiac year, underscoring the role of cultural beliefs in shaping financial decisions. Utilizing data from the Chinese Household Finance Survey (CHFS) and the Peking University Digital Financial Inclusion Index (PKU-DFIIC), Ye, Pu, and Xiong (2022) found that digital finance significantly promotes household participation in risky financial markets. Digital finance achieves this by reducing investment barriers, enhancing access to financial information, and increasing risk appetite. Furthermore, it reduces wealth and cognitive thresholds, thereby reflecting the inclusive nature of digital finance.

While previous studies have explored insurance spending, they lack comprehensive predictive models that consider a diverse array of independent variables. Further complicating the picture, regional variations within Canada, such as urban size and provincial regulations, may also influence households' insurance decisions (De Bromhead & Borowiecki, 2016). However, the contribution of these regional differences to predicting spending remains understudied.

By creating a predictive model that utilizes data from the Survey of Household Spending (SHS) and incorporates factors like internet access and tenure length, this study aims to substantially advance the understanding of insurance spending among Canadian households. Analyzing these variables will enhance our comprehension of the determinants of insurance spending and provide valuable insights into future trends. This will ultimately facilitate better decision-making by all stakeholders, including households, insurers, and policymakers.
To provide a well-rounded perspective, this section explores previous research that sheds light on factors influencing insurance spending and related financial decisions, even if they are not directly relevant to our specific methodology. Di Matteo and Emery (2002) investigated the relationship between wealth and life insurance demand in Canada. Campbell (2006) highlighted the critical need for improved data on household finances and pointed out common financial mistakes made by households. Law et al. (2013) examined the rise of private healthcare payments in Canada. Research by Browne and Kim (1993), Hwang and Gao (2003), Outreville (2013), Merkoulova and Veld (2022), Bucciol and Miniaci (2011), and others explored the connection between income, education, risk aversion, and insurance demand across various countries. Mapharing et al. (2015) examined determinants of life insurance demand in Canada, while De Bromhead and Borowiecki (2016) analyzed the relationship between immigration and life insurance demand using historical census data.

Our model will assess the predictive power of a comprehensive range of factors categorized as demographic (province, household type, education level, urban size), housing-related (suitability, tenure length/type, house value, building age), financial (major income source, internet access), and lifestyle (spending on clothing/shoes, recreational car insurance).

This paper introduces a novel Principal Component Analysis (PCA)-based approach to predict future insurance spending. Unlike previous Canadian insurance demand studies (Mapharing et al., 2015) that focused on influencing factors, our work leverages PCA to construct a robust forecasting model. This represents a significant advancement from earlier studies solely focused on identifying influential factors.
This study's approach to predicting Canadian household insurance spending builds on PCA's well-documented effectiveness in constructing predictive models across diverse disciplines, including healthcare and insurance. Bro and Smilde (2003) provided clear quantitative and visual evidence of the impact of centring and scaling, two common preprocessing steps, on the interpretation of principal components and the variance. Kanchan and Kishor (2016) effectively utilized PCA to reduce the dimensionality of their dataset, mitigate noise, prevent overfitting, and improve the interpretability and performance of machine learning algorithms for predicting heart disease and diabetes. This led to the creation of more efficient, robust, and accurate predictive models. Similarly, Wang and Wang (2015) demonstrated PCA's ability to enhance the accuracy of supervised machine learning algorithms in predicting diseases. This approach holds particular significance for our research, providing a robust methodological framework for handling the complex, high-dimensional data inherent in analyzing Canadian household insurance expenditures. The successful application of PCA in a multitude of contexts underscores its adaptability and proficiency in deciphering multifaceted datasets, a crucial capability for understanding the intricate dynamics of insurance purchasing behaviour.

Methodology:

Research Philosophy

This study adheres to a positivist research philosophy, which emphasizes the acquisition of knowledge through objective, verifiable evidence. This aligns with our goal of developing a predictive model for Canadian household insurance spending. A positivist approach necessitates a rigorous and systematic data collection and analysis strategy. Quantitative methods are central to this research design.
We will utilize large datasets from reliable sources like Statistics Canada’s Survey of Household Spending (SHS) database. This allows for high levels of accuracy and precision in our analysis. Statistical analysis of this quantitative data will enable the construction of robust predictive models.

Our focus on quantitative methods is a deliberate choice. While qualitative research offers valuable insights, a quantitative approach is better suited to our goals. It allows us to generate generalizable results and facilitates in-depth statistical analysis and mathematical modelling, which are crucial for building a strong predictive model.

Predictive modelling is a hallmark of positivist research, and it aligns with our methodology. This approach thrives on the large datasets and advanced statistical techniques that are integral to this study. In essence, our research philosophy, grounded in positivism, emphasizes empirical, observable data. We leverage robust quantitative methods to achieve our objective of predicting Canadian household insurance spending with accuracy and generalizability.

Research Approaches and Strategies

The foundation of this research is a top-down deductive approach, which is fundamentally rooted in the tenets of existing theories. This approach begins with a general theory, formulating specific hypotheses. These hypotheses are then tested empirically, allowing the research to confirm, revise, reject, or refine the initial theory.

In this study, our deductive approach will start with the general theory of insurance spending, drawn from an extensive literature review and existing empirical studies. From this broad theory, we will develop specific hypotheses related to the predictors of Canadian household insurance spending. These hypotheses will then be tested using quantitative data, allowing us to examine outcomes and draw conclusions. The use of quantitative data is integral to our research strategy.
This method will enable us to test specific predictions and make generalizations based on the results. Quantitative data offers a degree of measurement precision and objectivity that aligns well with our deductive approach. The data will be collected and analyzed using robust statistical methods, which enable us to draw conclusions and make inferences about the relationships between the variables under investigation. The statistical analysis will include techniques such as regression analysis, time series analysis, and machine learning algorithms, depending on the nature and structure of the data.

These statistical methods not only provide a means for testing our hypotheses but also allow us to build a predictive model for Canadian household insurance spending. This model will enable us to make projections about future spending patterns based on current and past data. Similar to Kanchan and Kishor (2016), our study divides the dataset into two parts: the years 2010 to 2017 form the training dataset, and the year 2019 forms the test dataset. The training dataset is fed to the algorithms so that they can learn from it. After the learning phase, the test dataset is used to evaluate the performance of the algorithms.

Our research approach and strategy rest on the principles of deductive reasoning and empirical observation. This framework, grounded in the traditions of scientific research, provides a rigorous and systematic means for testing hypotheses and generating reliable, generalizable findings.

Hypothesis and Predictive Model:

Our research posits that applying Principal Component Analysis (PCA) to a 32-dimensional dataset representing Canadian household insurance expenditure will yield a condensed subset of principal components. These components are anticipated to capture a substantial portion of the dataset's variance.
The use of these principal components as input features is hypothesized to enhance the efficacy of our predictive model.

Our model utilizes PCA within a machine learning framework to predict household insurance spending. PCA is employed for dimensionality reduction, transforming the original high-dimensional data into a smaller set of uncorrelated components that capture the most significant variance. Data for this study is sourced from Statistics Canada's "Canadian Survey of Household Spending (SHS)" database spanning 2010 to 2017, with 2019 data used for validation. Variables considered include household income, education level, household size, location, home value, length of tenure, and expenditures on life, health, and car insurance.

In practice, the model collects and preprocesses data, adjusting for inflation using the Consumer Price Index (CPI). PCA is then applied to reduce data dimensionality, focusing on the most relevant features. The dataset is split into a training set (2010-2017) and a test set (2019) to develop and validate the predictive model. Principal components derived from PCA are used as input features for machine learning algorithms to train the model, which is subsequently validated using the test set to assess predictive accuracy. The trained model predicts insurance spending for Canadian households, allowing for an analysis of key factors influencing insurance purchasing decisions.

Similar models and methodologies have been discussed in the literature. For instance, Bro and Smilde (2003) highlight the importance of centring and scaling in component analysis, while Kanchan and Kishor (2016) demonstrate PCA's application in disease prediction, showcasing its role in reducing data dimensionality and enhancing model accuracy.
Studies such as Bucciol and Miniaci (2011) examine household portfolios and implicit risk preferences, which are crucial for understanding insurance demand factors. Moreover, Mapharing et al. (2015) investigate determinants of life insurance demand in Canada, emphasizing comprehensive variable analysis. The integration of PCA with machine learning is further illustrated by Wang and Wang (2015), who explore its use in stock market prediction, underscoring PCA's effectiveness in improving predictive accuracy. These references underscore the broad applicability of PCA in various fields, including insurance spending analysis and predictive modelling, and demonstrate its robustness in handling complex datasets and making accurate predictions.

The selection of PCA is not arbitrary but is underpinned by several critical considerations. PCA is well known for simplifying complex, high-dimensional datasets. This characteristic is particularly relevant to our study, given that insurance purchasing behaviour is embedded in a multitude of life factors. Our choice of PCA is further justified by the following rationale:

• Complexity of Insurance Purchasing Behavior: The decision-making process in insurance purchasing is complex and is influenced by an array of life factors. It requires an analytical tool that can sift through numerous variables to isolate those with the most significant impact on insurance purchasing decisions.
• Enhancing Model Robustness and Generalizability: To enhance the robustness and generalizability of predictive models, it is crucial to focus on the most relevant components within the dataset. PCA addresses the problem of numerous variables, so that the model remains applicable and reliable in practical scenarios while mitigating overfitting.

The application of PCA will simplify the dataset and produce a robust and generalizable predictive model.
This model will provide reliable insights into Canadian household insurance expenditure behaviours.

Data Collection:

The Canadian Survey of Household Spending (SHS) is available from 1997 to 2019. For this analysis, however, the dataset is restricted to the years 2010 to 2017 and 2019. The rationale behind this choice is twofold. Firstly, a significant redesign of the survey methodology in 2010 renders the pre-2010 data less comparable (Tremblay et al., 2010). Secondly, data for the year 2018 is not available, necessitating a focus on the period from 2010 to 2017, with 2019 serving as a benchmark for comparison with our predictions.

Although the SHS data does not constitute a panel dataset, it closely aligns with Campbell's (2006) five criteria for an ideal household dataset. It offers a representative sample of the population, provides highly accurate data, clearly distinguishes between asset classes, measures a comprehensive breakdown of wealth, and allows household spending to be tracked over time through repeated cross-sections.

For this study, the data will be adjusted to 2017 levels using the Consumer Price Index (CPI). This adjustment ensures that all values are comparable in real terms, eliminating the effects of inflation over the years under consideration.

We aim to apply Principal Component Analysis (PCA) to generate a predictive model, using variables such as household income, highest level of education, size of the population centre, location (province), household size, length of tenure, and the value of the house. The dependent variables will be life insurance, private health insurance, and car insurance spending. These variables were chosen for their potential impact on insurance spending, as indicated by previous research.

Because the distribution of insurance spending is skewed, with many Canadian households reporting no spending on one or more insurance types, the analysis will exclude non-insurance buyers.
This approach aims to obtain a more accurate picture of insurance spending patterns among households actively participating in the insurance market.

Software

This study will employ STATA, a widely used statistical software package, for data analysis. STATA offers robust capabilities for data manipulation, statistical modelling, and visualization, making it suitable for the planned analyses. Microsoft Excel is used for organizational tasks, such as formatting and preliminary data matching. However, all data will ultimately be converted to STATA format to ensure consistency and facilitate comprehensive statistical analysis.

Ethical Considerations

This research project strictly adheres to the Statistics Canada Research Data Centre (RDC) guidelines to protect the privacy and rights of individuals. These guidelines provide a comprehensive framework which guarantees that all research data used here has undergone rigorous ethical and legal review. Furthermore, transparency is paramount to our research approach. We will document our data collection, analysis, and reporting methodologies in detail. Open access to these documents will allow for thorough scrutiny and validation of our findings, fostering trust and confidence in our research. We are committed to protecting individual privacy and confidentiality, maintaining the integrity of the research process, and making meaningful and ethical contributions to the knowledge base on household insurance spending.

Data Preparation

To ensure the integrity and reliability of our study, careful data preparation was paramount. Our analysis utilized the Canadian Survey of Household Spending (SHS) data spanning the years 2010-2017 and 2019.
For consistency in comparing data across these years, we adjusted all monetary figures to 2017 levels using the Consumer Price Index (CPI) for data analysis, and after making the prediction, we adjusted the predicted figures back to the price level of the prediction year. Several steps were undertaken to maintain data consistency:

• Variable Consistency: Variable names were harmonized across datasets to ensure uniformity.
• Data Merging: The pooled SHS data were integrated with the bootstrap data, creating a comprehensive dataset.
• Data Transformation for STATA Compatibility: Text entries were converted to numeric values, formatting and labelling were checked, and missing values were coded.
• Appending Data: The pooled data from 2010 to 2013 was combined with data from 2014 to 2017. We adjusted the weight variable by 50% so that the combined 2010 to 2017 dataset continued to represent the entire population of Canadian households accurately.
• Intermediate Variable Creation: Variables such as pre-tax income, length of tenure, household size, urban size, insurance and clothing expenditure, house value, and recreational vehicle insurance underwent log transformations. Additionally, adjustments for CPI and data centring were essential for the subsequent PCA.
• Refinements: Non-insurance buyers were excluded to prevent potential data skewing. Several categorical variables, such as education level, building age, and urban size, were converted to continuous metrics for detailed analysis. Categories such as provinces and major income sources were transformed into dummy variables.

After these refinements, we examined the survey samples in detail, looking at variables such as log income before tax, log length of tenure, education level, and urban size. This revealed trends such as life insurance purchases rising with educational attainment.
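The preparation steps listed above can be illustrated with a minimal Python sketch. The study itself used STATA, so this is not the author's code. The CPI values, the function names, and the toy records are all hypothetical, and reading the 50% weight adjustment as halving the weights after appending two pooled files is an assumption.

```python
import numpy as np

# Hypothetical CPI index values for illustration only (not Statistics Canada figures).
CPI = {2010: 90.0, 2014: 96.0, 2017: 100.0}

def to_base_year_dollars(amount, year, cpi=CPI, base_year=2017):
    """Express a nominal dollar amount in base-year (here 2017) dollars."""
    return amount * cpi[base_year] / cpi[int(year)]

def halve_weights(weights):
    """Assumed reading of the 50% adjustment: if each pooled file is weighted
    to the full population, halving the weights after appending two files
    keeps the weighted total equal to one population."""
    return 0.5 * np.asarray(weights, dtype=float)

def dummy_columns(codes, categories):
    """One 0/1 column per category, dropping the first as the reference group."""
    codes = np.asarray(codes)
    return np.column_stack([(codes == c).astype(float) for c in categories[1:]])

# Toy records: nominal income, survey year, province code, survey weight.
income = np.array([45_000.0, 96_000.0, 60_000.0])
year = np.array([2010, 2014, 2017])
province = np.array(["BC", "ON", "QC"])
weight = np.array([300.0, 500.0, 200.0])

real_income = np.array([to_base_year_dollars(a, y) for a, y in zip(income, year)])
log_income = np.log(real_income)            # log transform applied after deflating
province_dummies = dummy_columns(province, ["BC", "ON", "QC"])
pooled_weight = halve_weights(weight)
```

Deflating before taking logs keeps the log-income variable comparable across survey years, which matters because PCA is sensitive to the scale of its inputs.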
Correlation and Variable Analysis

To construct a robust model, it was essential to understand the relationships between variables. We started with correlation tests, identifying and removing variables with high inter-correlations to mitigate multicollinearity and obtain a more reliable model. When analyzing the dependent variable, we observed that a large majority of the variables showed strong relationships with it, indicating their importance for the model. We also transformed categorical variables into dummy variables to facilitate analysis and complied with RDC vetting requests by excluding certain variables. In managing outliers, especially in the income variable, we addressed data skew by capping income between $5,000 and $500,000. This followed RDC guidelines, which recommended a considered data cut-off, and aimed to represent the majority of households accurately. Through this data preparation, we established a clean, consistent dataset as the foundation for our predictive model.

Advantages of PCA analysis

Principal Component Analysis (PCA) has become a pivotal tool in multivariate data analysis, mainly because of its capacity to manage and interpret high-dimensional data. This is crucial in many real-world contexts, such as finance and the social sciences, where data spans numerous dimensions or variables. PCA addresses this complexity by representing multivariate data in a concise, reduced form that preserves vital information. It transforms the original data dimensions into a new set of dimensions, termed 'principal components.' The orthogonality of these components is crucial: it makes them uncorrelated with one another, which is a significant advantage in avoiding multicollinearity within predictive modelling.
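The link between orthogonal components and the absence of multicollinearity can be checked numerically. The sketch below is illustrative only: it uses synthetic predictors (not SHS variables) and plain NumPy, and it shows that two highly correlated inputs become uncorrelated component scores whose variances equal the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic predictors: the first two share a common driver, the third is independent.
base = rng.normal(size=n)
X = np.column_stack([
    base + 0.1 * rng.normal(size=n),
    base + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
])

# Centre and scale, then take the eigen-decomposition of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]           # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component scores: their covariance matrix is diagonal.
scores = Z @ eigvecs
score_cov = np.cov(scores, rowvar=False)
max_offdiag = np.abs(score_cov - np.diag(np.diag(score_cov))).max()
```

Here `corr[0, 1]` is close to 1, while `max_offdiag` is zero up to rounding error, so a regression on the scores does not face the collinearity present in the original columns.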
In the current era of data-driven decision-making, Machine Learning (ML) has become indispensable, especially for identifying patterns within high-dimensional data, which is often difficult to navigate. Integrating PCA with ML can lead to more precise and computationally efficient predictive models, because reducing the dimensionality of the data fed into ML algorithms mitigates the risk of overfitting and enhances model interpretability.

PCA ranks its components by the variance each one captures from the original data. A hierarchy emerges in which the first principal component captures the most variance, followed by the second, and so forth. This ordering allows the number of dimensions used in subsequent analyses or modelling to be reduced, because typically only the leading components, those capturing the most variance, are retained.

PCA therefore allows researchers and analysts to distill their data structures, making them more navigable and interpretable and speeding up subsequent modelling. When PCA is combined with ML algorithms, models built on the reduced data are less susceptible to overfitting because redundant or insignificant dimensions have been removed. The resulting model is computationally efficient and robust while still capturing the essential structure of the original data and avoiding the problems associated with high dimensionality.
In this thesis, we explore the synergy between Principal Component Analysis (PCA) and Machine Learning (ML) in building robust and dependable predictive models for complex multivariate data, specifically focusing on Canadian household insurance expenditure behaviours. PCA transforms high-dimensional data into a set of uncorrelated "principal components," capturing the most significant variance while mitigating multicollinearity. This dimensionality reduction brings several benefits: enhanced computational efficiency, reduced overfitting risk, and improved model interpretability. By applying ML to these streamlined components, we build models that aim for high precision as well as robustness and generalizability. This advantage stems from PCA's ability to concentrate on key data features, leading to models that perform well in real-world scenarios. Our commitment to a quantitative analytical approach, supported by credible data sources such as the Survey of Household Spending, aligns with the data-driven philosophy of predictive modelling. This statistical and mathematical foundation supports objectivity and generalizability and allows for dependable models that illuminate Canadian household insurance expenditure patterns.

Chapter Three

Data Analysis Narrative/ Empirical Results

The general predictive model of Canadian households' insurance spending is as follows:

\[ PC_i = \sum_{j=1}^{p} \alpha_{ij} X_j \]

\[ \hat{Y}_{PCA} = f_{PCA}(PC) \]

• \(PC\) represents the principal components obtained from the original dataset.
• \(f_{PCA}\) represents the predictive model function that uses the principal components.
• \(\hat{Y}_{PCA}\) represents the predicted outcomes based on the predictive model.
• \(\alpha_{ij}\) are the loadings for the original variables \(X_j\).
• \(X_j\) are the original variables.
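The general model above can be sketched in Python as follows. This is a hedged illustration and not the STATA workflow used in the study: it assumes a linear \(f_{PCA}\) fitted by ordinary least squares, ignores survey weights, and uses synthetic data in place of the 2010-2017 training years and the 2019 test year. All function names are hypothetical.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Estimate centring, scaling, and loadings (alpha) on the training data only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, std, eigvecs[:, order]

def pc_scores(X, mean, std, loadings):
    """PC_i = sum_j alpha_ij * X_j, applied to standardized variables."""
    return ((X - mean) / std) @ loadings

def fit_linear(PC, y):
    """Least-squares fit of y on an intercept plus the component scores."""
    A = np.column_stack([np.ones(len(PC)), PC])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(PC, beta):
    return beta[0] + PC @ beta[1:]

# Synthetic stand-ins for the training years and the test year.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(400, 6))
X_test = rng.normal(size=(100, 6))
true_w = np.array([0.5, -0.3, 0.2, 0.0, 0.1, 0.4])
y_train = 7.0 + X_train @ true_w + 0.05 * rng.normal(size=400)  # e.g. log spending
y_test = 7.0 + X_test @ true_w + 0.05 * rng.normal(size=100)

mean, std, loadings = fit_pca(X_train, n_components=6)
beta = fit_linear(pc_scores(X_train, mean, std, loadings), y_train)
y_hat = predict(pc_scores(X_test, mean, std, loadings), beta)
rmse = float(np.sqrt(np.mean((y_hat - y_test) ** 2)))
```

The test-year data is projected with the mean, scale, and loadings estimated on the training years, so no information from the evaluation year leaks into the fitted model.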
Life Insurance Predictive Model and Discussion

The life insurance predictive model is as follows:

\[ \hat{Y}_{PCA,life} = \beta_0 + \beta_1 \cdot PC_1 + \beta_2 \cdot PC_2 + \beta_3 \cdot PC_3 + \cdots + \beta_n \cdot PC_n \]

• \(\beta_0\) is the intercept.
• \(\beta_i\) are the coefficients for the principal components \(PC_i\).

While exploring the life insurance portfolios of Canadian households, we encountered a statistical issue: high correlations (multicollinearity) between multiple independent variables. This phenomenon, in which two or more predictors in a multiple regression model are correlated, can distort the interpretation and reliability of statistical results. To counteract this, we used Principal Component Analysis (PCA), a statistical method often used to handle many variables, improve visualization, and reduce noise. This process transformed our primary set of 16 independent predictors into an expanded set of 33 distinct components, improving both the precision and robustness of our analysis.

At the outset, our approach followed Kaiser's (1960) widely acknowledged criterion, which adopts an eigenvalue threshold of 1 for component inclusion. However, as the investigation progressed, it became clear that the limited subset of components selected in this way did not adequately capture the complexity and variance of our data. By lowering the eigenvalue threshold to 0.5, we were able to explain nearly 95% of the variance. This choice still reduced the dimensionality by seven, from 33 to 26 components, streamlining our analytical processes.
(Table 2, Figure 1)

(insert Table 2, Figure 1 here)

The dataset encompassed 37,610 observations, which, when appropriately weighted, represent 5,610,380 Canadian households, providing a substantial sample for our analyses (Table 1).

(insert Table 1 here)

PCA was conducted to understand the underlying structure of the insurance-related variables. Specifically, the variable "couple with kid(s)" showed a strong loading of 0.6926 on Component 1, suggesting that it is a pivotal variable in explaining the variance in this component. "Size of household" also presented a substantial loading of 0.6261 on Component 1, indicating its influential role in this principal component. The variable "1-person household" is also associated with Component 1, with a loading of -0.3353, which implies a notable but inverse relationship with this component. Furthermore, "government payment" displayed a substantial positive loading of 0.858 on Component 2, highlighting its primary role in explaining the variance within this component.

While these variables and their loadings indicate the key factors shaping each component, it is also critical to consider the eigenvalues, which represent the variance explained by each component. An eigenvalue threshold of 0.5 was applied to decide which components to retain. Components with eigenvalues exceeding this threshold are deemed to capture a meaningful share of the variance within the dataset and therefore merit further exploration and interpretation.
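The effect of the retention threshold can be sketched in a few lines of Python. The eigenvalues below are hypothetical and are not those reported in Table 2; the function name is also an assumption made for illustration.

```python
import numpy as np

def retained_components(eigvals, threshold):
    """Count components whose eigenvalue exceeds the threshold and
    return that count with the share of total variance they explain."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    keep = int(np.sum(eigvals > threshold))
    share = float(eigvals[:keep].sum() / eigvals.sum())
    return keep, share

# Hypothetical eigenvalues for seven standardized variables (they sum to 7).
eigvals = [2.5, 1.8, 1.1, 0.8, 0.6, 0.15, 0.05]

kaiser = retained_components(eigvals, threshold=1.0)    # Kaiser (1960) criterion
relaxed = retained_components(eigvals, threshold=0.5)   # threshold used in this study
```

In this toy case the Kaiser criterion keeps three components (about 77% of the variance), whereas the 0.5 threshold keeps five (about 97%), which mirrors the trade-off described above between parsimony and variance explained.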
Examining the variables with substantial loadings alongside the eigenvalues of their components gives a more nuanced understanding of the predominant patterns in the data and can inform targeted strategies for life insurance among Canadian households. This understanding could translate into actionable insights for stakeholders in the insurance domain, helping them formulate strategies aligned with the observed patterns and trends.

The post-rotation phase of our analysis was particularly informative, and several patterns emerged.

The first component was defined by demographic characteristics. 'Couples with kid(s)' and 'Size of household' dominated, while 'Single-person households' registered a negative loading on this component, suggesting that insurance behaviour varies across household types.

The second component was multifaceted. Households that relied predominantly on government assistance as their primary income source registered a pronounced positive loading. At the same time, income displayed an inverse relationship with this component, a finding consistent with broader socioeconomic patterns. The implication is that higher income is associated with higher insurance spending, which aligns with existing literature. In contrast, households primarily sustained by government subsidies showed less interest in life insurance, potentially reflecting socio-economic challenges.

The third component, also of significant analytical interest, was shaped by variables often associated with stability and future financial planning.
'Length of tenure' and 'Anticipated property value' both had a positive influence on life insurance outlays, suggesting that long-term planning and property valuation play a role in insurance decisions.

To explore Canadian household life insurance expenditures, we regressed the logarithm of these expenditures on the first 26 principal components. This approach yielded significant associations between each component and the dependent variable. Although these associations were robust, the magnitudes of the corresponding coefficients were relatively small. This observation suggests that life insurance purchases within Canadian households are shaped by the incremental contributions of many determinants rather than by a few dominant factors.

Central to our findings is that the 26 principal components collectively capture 94.78% of the total variance. This coverage highlights the comprehensiveness of our PCA-based approach.

Our analysis further delineates the relationships of these components with life insurance spending. Component 1 exhibits a positive correlation with life insurance expenditures. This is particularly pronounced in households with children, as evidenced by the strong positive loading for 'couple with kid(s).' Such households are inclined to allocate more resources to life insurance, presumably to ensure financial security, especially for their children's future. In contrast, single-person households, as indicated by their negative loading, appear less inclined to invest significantly in life insurance, likely because they have fewer dependents.

Conversely, Component 2 displays an inverse relationship with life insurance spending.
Households predominantly reliant on government payments, as suggested by the strong positive loading on 'government payment' within this component, tend to invest less in life insurance. This trend could be attributed to financial constraints or differing priorities in financial planning. In contrast, the negative loadings for groups such as 'one-person household' and 'couple with kid(s)' in this component suggest a higher propensity to invest in life insurance among households that are not primarily dependent on government support, potentially reflecting better financial stability or a different approach to risk assessment.

Using datasets spanning 2010 to 2017, a period marked by significant economic and sociopolitical shifts, we fine-tuned our model within the PCA framework and then applied it to the 2019 household data. The model estimated the average 2019 Canadian household life insurance expenditure at $1,263 per insured household. The empirically observed weighted mean was $1,381, so the model underestimated actual spending by $118 (Table 5). Although the estimate is close to the actual figure, diverse elements, including external economic forces, regulatory shifts, and unforeseen variables, might have contributed to this divergence, and the model's efficacy should be judged within this broad range of potential determinants. Such deviations, while noteworthy, are not uncommon in predictive analytics.
A discrepancy of $118, set against the overall scale of household expenditures, indicates that the model is reasonably precise and may be applicable in future research.

(insert Table 5 here)

To compare predicted and actual life insurance spending in 2019 in more detail, we sorted the predictions of Canadian households' insurance spending and divided the weighted data into ten equal groups. We then calculated the mean predicted spending and the mean actual spending for each group (Table 6 and Figure 2). Groups 1, 2, 3, 5, and 8 showed minimal divergence between predicted and actual life insurance spending, with deviations within ±5%. This attests to the model's predictive strength in several segments. Groups 4, 6, 7, 9, and 10 showed somewhat larger variations, and these outcomes point to where the model can be refined. The differing magnitudes and percentage changes across groups indicate uneven predictive accuracy and suggest that optimization should target specific segments.
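The ten-group comparison described above can be sketched as follows. This is an illustrative Python version under stated assumptions: groups are formed by cumulative survey weight after sorting on the prediction, the percentage difference is measured as actual relative to predicted, and the data are synthetic. The function name and the sign convention are not taken from the study.

```python
import numpy as np

def weighted_group_comparison(predicted, actual, weights, n_groups=10):
    """Sort by prediction, split into groups of roughly equal total weight,
    and compare weighted mean predicted and actual spending per group."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    weights = np.asarray(weights, dtype=float)

    order = np.argsort(predicted, kind="stable")
    predicted, actual, weights = predicted[order], actual[order], weights[order]

    # Assign each record to a group by its cumulative share of total weight.
    cum_share = np.cumsum(weights) / weights.sum()
    group = np.ceil(cum_share * n_groups - 1e-9).astype(int) - 1
    group = np.clip(group, 0, n_groups - 1)

    rows = []
    for g in range(n_groups):
        m = group == g                      # assumes no group is empty
        mean_pred = np.average(predicted[m], weights=weights[m])
        mean_act = np.average(actual[m], weights=weights[m])
        rows.append({
            "group": g + 1,
            "mean_predicted": float(mean_pred),
            "mean_actual": float(mean_act),
            "pct_diff": float(100.0 * (mean_act - mean_pred) / mean_pred),
            "weight_share": float(weights[m].sum() / weights.sum()),
        })
    return rows

# Synthetic example: 500 households with survey weights.
rng = np.random.default_rng(2)
pred = rng.uniform(500.0, 3000.0, size=500)
act = pred * (1.0 + 0.05 * rng.normal(size=500))
w = rng.uniform(50.0, 250.0, size=500)
rows = weighted_group_comparison(pred, act, w)
within_5_pct = [r["group"] for r in rows if abs(r["pct_diff"]) <= 5.0]
```

Grouping by cumulative weight rather than by record count keeps each group representative of roughly one tenth of the weighted population, which is what a weighted decile table requires.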
While this analysis underscores the importance of continued scrutiny and refinement of the predictive model, it also highlights the model's successes and lays a foundation on which future predictive work can build, with the aim of a progressively closer alignment with actual life insurance spending. The model therefore demonstrates the value of combining data with predictive analytics and can serve as a starting point for improving predictive accuracy across demographic segments.

(insert Figure 2 and Table 6 here)

The model's accuracy shows a notable pattern in the higher insurance spending groups, where the means exhibit a larger spread. This could be tied to the financial behaviours and risk mitigation strategies of higher-income families, who often take a multifaceted approach to safeguarding their financial stability. For these families, life insurance may be only one part of a broader financial strategy that includes other risk-hedging mechanisms, which adds complexity and variability to their insurance spending patterns. This behaviour, particularly prevalent among higher-income households, makes spending in these groups harder to predict than in lower spending groups. Therefore, while the model predicts well across many segments, it faces greater challenges in the higher insurance spending strata.
This underscores the need for an adaptive predictive model that is attuned to the different financial circumstances across income brackets, and it opens avenues for exploring more tailored predictive models that account for the financial behaviours of specific demographic segments.

Private Health Insurance Predictive Model and Discussion

The predictive model of the Canadian households' private health insurance spending is as follows:

\[ \hat{Y}_{PCA,health} = 2 \cdot (\beta_0 + \beta_1 \cdot PC_1 + \beta_2 \cdot PC_2 + \beta_3 \cdot PC_3 + \cdots + \beta_n \cdot PC_n) \]

• \(\beta_0\) is the intercept.
• \(\beta_i\) are the coefficients for the principal components \(PC_i\).

Our dataset revealed that, out of 46,568 Canadian households, a significant number opted for private health insurance. After weighting, this number represents approximately 6,126,880 households across Canada. For comparison, the overall number of Canadian households is approximately 14 million (Table 1).

To keep the analysis rigorous and uniform, we applied a consistent methodological framework across the datasets, here focusing on Canadian households that had chosen private health insurance. The patterns in the eigenvalue chart were congruent with those from our prior analysis of life insurance. The first three components were notably influential, capturing a significant fraction of the overall variance. From the fourth to the twenty-sixth component, the explanatory contribution remained steady.

Following PCA and the subsequent VariMax rotation, we discerned distinct regional tendencies. Specifically, households in Quebec (QBC) and Newfoundland and Labrador (NEW) exhibited a stronger affinity for private health insurance than households in other provinces.
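For readers unfamiliar with the VariMax rotation used after PCA in this chapter, the following Python sketch implements a common SVD-based iteration for it. It is an illustration only: the study used STATA's rotation, the loading matrix below is hypothetical, and the function names are assumptions.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation via the standard SVD-based iteration.
    Returns the rotated loadings and the rotation matrix."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    crit_old = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        # Gradient-like target of the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(L.T @ target)
        R = u @ vt                          # nearest orthogonal matrix
        crit = s.sum()
        if crit_old != 0.0 and crit / crit_old < 1.0 + tol:
            break
        crit_old = crit
    return L @ R, R

def varimax_criterion(L):
    """Sum over components of the variance of squared loadings."""
    return float(np.var(np.asarray(L) ** 2, axis=0).sum())

# Hypothetical loadings: a clean two-component structure blurred by a 30-degree rotation.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
theta = np.pi / 6
blur = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ blur

rotated, R = varimax(unrotated)
```

Because the rotation is orthogonal, each variable's communality (its row sum of squared loadings) is unchanged; only the way variance is distributed across components changes, which is what makes loadings such as those in Tables 9 to 11 easier to interpret.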
A closer examination of the dataset identified certain household attributes as critical determinants. For example, the category "couple with children" had a pronounced positive loading of 0.7169 on the first component, and "size of the household" had a loading of 0.6216 on the same component. The second component highlighted "tenure duration" (0.7944) and "anticipated residential resale value" (0.4944), both with robust positive loadings (Table 10). While "government subsidies" and "gross income" both loaded prominently on the third component, that component was inversely associated with the dependent variable. This implies that an increase in gross income correlates positively with insurance outlay, whereas households primarily sustained by government subsidies tend to be reluctant to purchase private health insurance. A full account of these findings, with their respective loadings, is given in Table 5.

(Insert Table 5, Table 10 here)

Our predictive framework, which uses PCA to reduce dimensionality, estimated the average 2019 expenditure of Canadian households on private health insurance at $1,608. The observed average was $1,594, a difference of $14 (0.88%). This small discrepancy attests to the model's ability to approximate Canadian households' private health insurance expenditure. Given the complex dynamics and volatility characteristic of financial datasets, such a marginal divergence underscores the robustness and precision of our PCA-based predictive approach.
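The aggregate comparison can be checked with a short calculation. The sketch below takes the predicted and actual 2019 means reported in Table 5 and recomputes the difference and percentage columns. The text does not define the percentage explicitly; the sketch assumes it is the difference divided by the actual mean, which reproduces the reported figures. The function and variable names are illustrative only.

```python
# (predicted mean, actual mean) in dollars, as reported in Table 5
table5 = {
    "life": (1263, 1381),
    "health": (1608, 1594),
    "car": (2048, 1720),
}


def percent_error(predicted, actual):
    """Signed error as a percentage of the actual mean (assumed definition)."""
    return 100.0 * (predicted - actual) / actual


results = {
    name: (pred - act, round(percent_error(pred, act), 2))
    for name, (pred, act) in table5.items()
}

for name, (diff, pct) in results.items():
    print(f"{name:>6}: difference = {diff:+d}, percent = {pct:+.2f}%")
```

A positive value means the model overestimated spending; a negative value means it underestimated.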
Table 7 and Figure 4 reveal a complex pattern of prediction disparities among the ten groups. The model overestimated spending for Groups 1, 2, 4, 7, and 10 and underestimated it for Groups 3, 5, 6, 8, and 9. In particular, the prediction for Group 1 exceeded actual spending by $158.08 (11.45%), while the prediction for Group 8 fell short of actual spending by $217.12 (-15.72%). Group 6 showed a minimal disparity of $8.77 (-0.63%), a near-accurate forecast, whereas Group 8 showed a more substantial discrepancy, hinting at challenges in prediction or unforeseen spending events.

These divergences point to a multifaceted issue that may warrant a careful re-evaluation of the predictive analytics employed. The alternating pattern of over- and underprediction across these groups does not follow a uniform trend (Figure 4, Table 7), either in magnitude or in direction, suggesting that the inconsistencies do not stem from a single bias towards overestimation or underestimation. Furthermore, while Groups 7 and 10 displayed nearly identical absolute differences, the proportional disparities provide additional depth, underscoring the need to gauge variations in both absolute and percentage terms and to recalibrate the predictive mechanisms accordingly to improve future forecasting reliability.

(Insert Figure 4, Table 7 here)

Car Insurance Predictive Model and Discussion

The predictive model of Canadian households' car insurance spending is as follows:

$$\hat{Y}_{PCA,car} = 1 \cdot \left(\beta_0 + \beta_1 \cdot PC_1 + \beta_2 \cdot PC_2 + \beta_3 \cdot PC_3 + \cdots + \beta_n \cdot PC_n\right)^{2}$$

• $\beta_0$ is the intercept.
• $\beta_i$ are the coefficients for the principal components $PC_i$.

In our analysis of car insurance data using Principal Component Analysis (PCA), distinct patterns emerged that warrant further scrutiny. Two components, both with eigenvalues surpassing 1, were especially pronounced, signifying their substantial influence on the dataset's variance. After the fourth component there was a marked decrease in relevance, suggesting that the unaccounted variance in our model fluctuated between 10% and 30%.

Upon rotation, further insights surfaced. The first component showed significant loadings tied to household composition: -0.3503 for single-person households, 0.6963 for couples with children, and 0.613 for household size. Component 2 was characterized by strong loadings for the duration of homeownership (0.7921) and the anticipated property resale value (0.5135). Component 3 was predominantly influenced by income and education. The fourth component was driven almost entirely by urban considerations (Table 11).

(Insert Table 11 here)

A key finding was the pronounced role of geography in insurance spending. Households in British Columbia (BC), Manitoba (MA), Alberta (AB), and Newfoundland and Labrador (NEW) tended to allocate more to car insurance than households in other provinces. This regional distinction underscores the importance of understanding local dynamics when forecasting insurance expenditure patterns.

In analyzing Canadian car insurance spending, it is important to consider the potential reasons for the discrepancies between predicted and actual expenditures across provinces.
This includes acknowledging the diverse nature of insurance providers: public insurers in provinces such as British Columbia and Manitoba, private insurers in Ontario and Alberta, and hybrid systems in Quebec and Saskatchewan. These regional policy differences significantly influence household insurance expenses. Provinces with public insurance, particularly BC and Manitoba, show a forward movement in component weights for car insurance spending, suggesting a pattern consistent with the type of provincial insurer. This observation is key to understanding national insurance spending patterns.

However, car insurance was a different story. The model overshot the mark by a significant 19.07%. This indicates that there may be other variables affecting car insurance costs that the model does not capture.

Zooming in on the ten household groups, a pattern emerged. The model overestimated car insurance spending for all groups except the highest spenders, with overestimation ranging from 4.0% to 24.4%. This suggests a bias in the model, or perhaps missing variables, that particularly affects lower-spending households.

On the flip side, the model's performance for car insurance spending was not a complete miss. It predicted an average household expenditure of $2,048 for car insurance in 2019. The actual average was $1,720, meaning the model overestimated by $328. This variance highlights the inherent challenges of prediction and areas for improvement.

A critical evaluation of Figure 6 and Table 8, which present the predicted and actual mean car insurance spending across ten groups of Canadian households, reveals a pronounced discrepancy that merits closer discussion of the accuracy and reliability of the predictive models used.
Mean spending ranges from $591.71 to $1,695.93 in predicted values and from $255.07 to $1,719.81 in actual expenditures across Groups 1 through 10. In 9 of the 10 groups, predicted spending exceeds actual spending, with Group 1 showing the largest difference of $336.63, a 24.3758% overestimation. Group 10 is the exception: actual spending slightly exceeds the prediction, by $23.88 (-1.7292%).

(Insert Figure 6, Table 8 here)

These discrepancies, particularly the prevailing overestimation across most groups and the contrasting underestimation in Group 10, call for a closer look at the data, variables, and methodologies behind the predictive models. Given the dataset of 8,251 observations, representing a weighted population of 10,395,761, the findings point to a need to refine the predictive methodology. In particular, it should be examined whether the models were suitably calibrated to accommodate potential caps on insurance spending arising from vehicle values. The predictive models may need to align more closely with the variables and characteristics specific to each group in order to improve the precision and reliability of future forecasts of car insurance spending across segments of Canadian households. Such an approach would, in theory, improve the alignment between predicted and actual spending, supporting more informed and strategic decision-making within the insurance sector and more accurate financial planning and policymaking.
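The group-level pattern described above can be reproduced from the group means in Table 8. The following sketch recomputes the predicted-minus-actual difference for each group and counts the groups in which the model overestimated. Because it starts from rounded means, a recomputed difference can deviate from the table by about a cent. The percentage column is not recomputed, since the text does not state its denominator. Variable names are illustrative.

```python
# (predicted mean, actual mean) per group, copied from Table 8
table8 = [
    (591.71, 255.07),
    (696.04, 458.08),
    (771.25, 568.97),
    (847.95, 648.22),
    (929.57, 767.56),
    (1012.16, 907.78),
    (1106.72, 951.29),
    (1222.83, 1012.88),
    (1365.04, 1310.05),
    (1695.93, 1719.81),
]

# Positive difference: the model overestimated that group's spending.
differences = [round(pred - act, 2) for pred, act in table8]
overestimated = [group for group, d in enumerate(differences, start=1) if d > 0]

for group, d in enumerate(differences, start=1):
    print(f"group {group:>2}: predicted - actual = {d:+.2f}")
print("overestimated groups:", overestimated)
```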
Chapter Four

Conclusion:

Our study examined the complex factors that affect insurance decisions among Canadian households. By employing principal component analysis (PCA), we were able to refine our analysis and mitigate challenges such as multicollinearity, revealing significant insights into the determinants of insurance decisions across demographic groups, economic conditions, and regions.

The predictive model, developed using PCA and machine learning techniques, provided robust forecasts of household insurance spending in 2019. It predicted an average expenditure of $1,263 for life insurance, $1,608 for private health insurance, and $2,048 for car insurance. A comparison with actual spending revealed commendable accuracy for private health insurance, with a deviation of only $14 from the actual average of $1,594. The prediction for life insurance was also relatively accurate: the actual average was $1,381, an underestimation of $118. However, the model showed a larger variance for car insurance, overestimating actual spending by $328.

The 2019 insurance spending data demonstrated the efficacy of the prediction model, particularly in forecasting life and health insurance expenditures. These predictions were closely aligned with actual spending, with deviations of -8.54% and 0.88%, respectively. This indicates that the model effectively identified the primary factors influencing expenditure on these types of insurance.

Our research makes a significant theoretical contribution to the field of insurance by advancing the use of predictive analytics.
By integrating principal component analysis with machine learning, we demonstrated how high-dimensional data can be effectively reduced and analyzed to identify the most significant variables influencing insurance spending. This approach not only enhances understanding of the factors driving household insurance expenditures but also establishes a precedent for future studies employing similar methodologies in other domains.

Practically, the insights from our research offer valuable applications for multiple stakeholders. Insurance companies can leverage the predictive model to develop more effective and personalized insurance products tailored to different demographics. Policymakers can use the findings to design policies that address gaps in insurance coverage, promoting competition and innovation in the sector. Consumers benefit by making more informed financial decisions, potentially leading to better coverage and cost savings and thus greater financial security. The model also aids financial advisors and investors in providing accurate guidance and making strategic investment choices. Additionally, the study's methodology and data serve as a foundation for further studies, fostering innovation and a deeper understanding of insurance and household finance. Overall, the study enhances decision-making processes, contributing to more effective insurance products, informed policies, empowered consumers, and a more robust and equitable insurance market.

A significant finding was the high degree of accuracy of the predictive model for private health insurance expenditures, with only a minimal discrepancy between predicted and actual spending. However, the model showed certain limitations, particularly in the projections for life and car insurance, where larger discrepancies were identified, indicating potential areas for improvement.
The results underscored the pivotal role of factors such as household type, household size, income level, length of tenure, property valuation, and regional dynamics in influencing insurance behaviours. Predictive models must be refined and recalibrated continuously so that they remain attuned to evolving conditions and retain their precision and reliability in forecasting insurance expenditures. Future research should concentrate on integrating more sophisticated machine learning algorithms and investigating additional variables to further improve the model's predictive capacity and applicability across diverse scenarios.

In conclusion, our research offers substantial theoretical and practical contributions to the field of insurance analytics. It provides a robust framework for understanding and predicting household insurance spending. The predictive model developed in this study has significant potential for application in real-world scenarios, enhancing decision-making for various stakeholders in the insurance industry.

Future direction:

Future research should delve deeper into the factors shaping insurance spending across Canada by building on the initial insights from our principal component analysis (PCA). This exploration should consider shifting trends over time to provide historical context and should enhance predictive modelling through advanced machine learning algorithms and a broader range of variables, including macroeconomic indicators and societal trends. Additionally, qualitative research methods, such as surveys and interviews, can offer insights into the motivations and challenges influencing insurance spending decisions across different household types.
A critical analysis of public policy impacts is also necessary to understand the complex relationship between policymaking and consumer behaviour. Furthermore, a comparative study with global insurance spending trends can help identify unique Canadian phenomena and assess the applicability of international strategies within the domestic context. By pursuing these avenues, we aim to create a comprehensive and accurate model that reflects the complex reality of the Canadian insurance landscape.

Appendix 1: Figures

Figure 1. Eigenvalue Distribution After PCA Analysis of Life Insurance Spending

Note: Figure 1, generated from Table 2, illustrates the eigenvalues of each component after performing PCA analysis on the independent variables. The slope from components 4 to 26 is gradual. A noticeable dip after component 26 suggests that subsequent components contribute even less to the variance.

Figure 2. Mean Predicted vs. Actual Life Insurance Spending Across 10 Groups in Canada, 2019

[Line chart: spending in dollars by group, groups 1 to 10; series: Pred Life Insurance Spending 2019 (mean), Actual Life Insurance Spending 2019 (mean)]

Note: Figure 2, generated from Table 6, shows that the predicted life insurance spending aligns with the actual spending. However, at higher spending groups, the predictions tend to be lower than the actual values, causing the spread to increase.

Figure 3. Eigenvalue Distribution After PCA Analysis of Private Health Insurance Spending

Note: Figure 3.
This graph, generated from Table 3, illustrates the eigenvalues of each component after performing PCA analysis on the independent variables. The slope from components 4 to 26 is gradual. A noticeable dip after component 26 suggests that subsequent components contribute even less to the variance.

Figure 4. Mean Predicted vs. Actual Private Health Insurance Spending Across 10 Groups in Canada, 2019

[Line chart: spending in dollars by group, groups 1 to 10; series: Pred Health Insurance Spending 2019 (mean), Actual Health Insurance Spending 2019 (mean)]

Note: Figure 4. The line graph compares mean predicted private health insurance spending to actual spending across ten groups in Canada for 2019. Both predicted (blue line) and actual (orange line) spending show an upward trend across the groups. In the initial groups (1 to 5), actual spending is slightly lower than predicted spending, indicating minor overestimation. In the higher groups (6 to 10), actual spending intersects with and occasionally exceeds predicted spending, suggesting improved accuracy in predictions. Overall, the graph demonstrates that while predictions are generally close to actual spending, there are slight discrepancies, particularly in the lower groups, with better alignment in higher-spending groups.

Figure 5. Eigenvalue Distribution After PCA Analysis of Car Insurance Spending

Note: Figure 5, generated from Table 4, illustrates the eigenvalues of each component after performing PCA analysis on the independent variables. The slope from components 4 to 26 is gradual. A noticeable dip after component 26 suggests that subsequent components contribute even less to the variance.

Figure 6. Mean Predicted vs.
Actual Car Insurance Spending Across 10 Groups in Canada, 2019

[Line chart: spending in dollars by group, groups 1 to 10; series: Pred Car Insurance Spending 2019 (mean), Actual Car Insurance Spending 2019 (mean)]

Note: Figure 6. The line graph compares mean predicted car insurance spending to actual spending across ten groups in Canada for 2019. It shows an upward trend for both predicted and actual spending as group numbers increase. Initially, predicted spending (blue line) is higher than actual spending (orange line) for most groups, indicating overestimation. However, this gap narrows towards the higher groups, where actual spending meets and eventually exceeds predicted spending around groups 9 and 10. This suggests that while predictions were generally higher, actual spending aligns more closely with predictions in higher-spending groups, highlighting discrepancies primarily in lower groups.

Appendix 2: Tables

Table 1. Number of Observations and Corresponding Weighted Population Estimates.

          2010-2017                              2019
          Observations   Weighted population     Observations   Weighted population
life      37,610         5,610,380               5,315          6,822,934
health    46,568         6,126,880               6,425          7,610,624
car       75,451         11,153,340              8,251          10,395,761

Note: Table 1 shows the number of observations and corresponding weighted population estimates for life, health, and car insurance from 2010-2017 and in 2019. While the number of observations for life and car insurance decreased in 2019, the weighted population estimates for life and health insurance increased during this period.

Table 2.
Eigenvalue Distribution After PCA Analysis of Life Insurance Spending

Component   Eigenvalue   Difference   Proportion   Cumulative
Comp1       4.212        2.345        0.132        0.132
Comp2       1.867        0.231        0.058        0.190
Comp3       1.635        0.268        0.051        0.241
Comp4       1.367        0.040        0.043        0.284
Comp5       1.327        0.065        0.042        0.325
Comp6       1.261        0.057        0.039        0.365
Comp7       1.204        0.064        0.038        0.402
Comp8       1.141        0.010        0.036        0.438
Comp9       1.130        0.012        0.035        0.473
Comp10      1.119        0.005        0.035        0.508
Comp11      1.113        0.010        0.035        0.543
Comp12      1.104        0.035        0.035        0.578
Comp13      1.068        0.016        0.033        0.611
Comp14      1.053        0.017        0.033        0.644
Comp15      1.035        0.012        0.032        0.676
Comp16      1.024        0.074        0.032        0.708
Comp17      0.949        0.041        0.030        0.738
Comp18      0.908        0.043        0.028        0.766
Comp19      0.865        0.013        0.027        0.793
Comp20      0.853        0.037        0.027        0.820
Comp21      0.815        0.056        0.026        0.845
Comp22      0.759        0.066        0.024        0.869
Comp23      0.693        0.037        0.022        0.891
Comp24      0.656        0.066        0.021        0.911
Comp25      0.590        0.010        0.018        0.930
Comp26      0.580        0.138        0.018        0.948
Comp27      0.442        0.037        0.014        0.962
Comp28      0.404        0.107        0.013        0.974
Comp29      0.298        0.024        0.009        0.984
Comp30      0.274        0.144        0.009        0.992
Comp31      0.129        0.005        0.004        0.996
Comp32      0.125        .            0.004        1.000

Note: Table 2 shows the eigenvalue distribution from PCA analysis of life insurance spending, with the first component explaining 13.2% of the total variance and the first three components cumulatively explaining 24.1%. As the component number increases, the individual contributions to variance decrease, indicating that the leading components each account for a larger share of the variance than the later ones.

Table 3.
Eigenvalue Distribution After PCA Analysis of Private Health Insurance Spending

Component   Eigenvalue   Difference   Proportion   Cumulative
Comp1       3.978        2.037        0.124        0.124
Comp2       1.940        0.212        0.061        0.185
Comp3       1.728        0.374        0.054        0.239
Comp4       1.354        0.046        0.042        0.281
Comp5       1.308        0.033        0.041        0.322
Comp6       1.275        0.065        0.040        0.362
Comp7       1.210        0.055        0.038        0.400
Comp8       1.155        0.014        0.036        0.436
Comp9       1.140        0.019        0.036        0.472
Comp10      1.121        0.007        0.035        0.507
Comp11      1.114        0.009        0.035        0.541
Comp12      1.106        0.033        0.035        0.576
Comp13      1.073        0.019        0.034        0.609
Comp14      1.054        0.014        0.033        0.642
Comp15      1.040        0.009        0.033        0.675
Comp16      1.031        0.060        0.032        0.707
Comp17      0.971        0.058        0.030        0.737
Comp18      0.913        0.028        0.029        0.766
Comp19      0.885        0.031        0.028        0.794
Comp20      0.855        0.022        0.027        0.820
Comp21      0.833        0.051        0.026        0.846
Comp22      0.782        0.047        0.024        0.871
Comp23      0.735        0.058        0.023        0.894
Comp24      0.677        0.096        0.021        0.915
Comp25      0.580        0.007        0.018        0.933
Comp26      0.574        0.158        0.018        0.951
Comp27      0.415        0.030        0.013        0.964
Comp28      0.385        0.084        0.012        0.976
Comp29      0.301        0.043        0.009        0.985
Comp30      0.258        0.138        0.008        0.993
Comp31      0.120        0.029        0.004        0.997
Comp32      0.091        .            0.003        1.000

Note: Table 3 provides the eigenvalue distribution from a Principal Component Analysis (PCA) of health insurance spending, highlighting how each component contributes to explaining the total variance in the data. It shows that a few principal components account for a significant portion of the variance, while the contributions of subsequent components gradually diminish.

Table 4.
Eigenvalue Distribution After PCA Analysis of Car Insurance Spending

Component   Eigenvalue   Difference   Proportion   Cumulative
Comp1       4.013        2.109        0.125        0.125
Comp2       1.904        0.125        0.060        0.185
Comp3       1.778        0.412        0.056        0.241
Comp4       1.367        0.074        0.043        0.283
Comp5       1.293        0.036        0.040        0.324
Comp6       1.257        0.052        0.039        0.363
Comp7       1.205        0.048        0.038        0.401
Comp8       1.157        0.030        0.036        0.437
Comp9       1.128        0.007        0.035        0.472
Comp10      1.120        0.009        0.035        0.507
Comp11      1.112        0.007        0.035        0.542
Comp12      1.105        0.045        0.035        0.576
Comp13      1.060        0.008        0.033        0.609
Comp14      1.052        0.012        0.033        0.642
Comp15      1.040        0.012        0.033        0.675
Comp16      1.028        0.054        0.032        0.707
Comp17      0.974        0.052        0.030        0.737
Comp18      0.921        0.037        0.029        0.766
Comp19      0.884        0.023        0.028        0.794
Comp20      0.860        0.026        0.027        0.821
Comp21      0.834        0.059        0.026        0.847
Comp22      0.775        0.069        0.024        0.871
Comp23      0.706        0.056        0.022        0.893
Comp24      0.649        0.075        0.020        0.913
Comp25      0.575        0.003        0.018        0.931
Comp26      0.572        0.135        0.018        0.949
Comp27      0.437        0.050        0.014        0.963
Comp28      0.387        0.098        0.012        0.975
Comp29      0.289        0.031        0.009        0.984
Comp30      0.258        0.124        0.008        0.992
Comp31      0.134        0.008        0.004        0.996
Comp32      0.125        .            0.004        1.000

Note: Table 4 provides the eigenvalue distribution from a Principal Component Analysis (PCA) of car insurance spending, highlighting how each component contributes to explaining the total variance in the data. It shows that a few principal components account for a significant portion of the variance, while the contributions of subsequent components gradually diminish.

Table 5. 2019 Canadian Households Insurance Spending

Insurance Type     Predicted Spending (mean)   Actual Spending (mean)   Difference   Percentage
Life Insurance     1263                        1381                     -118         -8.54%
Health Insurance   1608                        1594                     14           0.88%
Car Insurance      2048                        1720                     328          19.07%

Note: Table 5 provides an analysis of Canadian households' insurance spending in 2019, comparing predicted spending to actual spending across different types of insurance.

Table 6.
Mean Life Insurance Spending for Canadian Households: Averages Across Ten Distinct Groups

groups   Pred Life Insurance     Actual Life Insurance    difference   Percent change
         Spending 2019 (mean)    Spending 2019 (mean)     in mean      of difference
1        $648.16                 $631.44                  $16.71       1.2%
2        $816.46                 $811.79                  $4.67        0.3%
3        $945.13                 $958.36                  -$13.24      -1.0%
4        $1,063.37               $1,227.18                -$163.81     -11.9%
5        $1,172.85               $1,221.17                -$48.32      -3.5%
6        $1,266.39               $1,456.21                -$189.82     -13.7%
7        $1,382.58               $1,724.51                -$341.94     -24.8%
8        $1,518.86               $1,457.97                $60.89       4.4%
9        $1,699.46               $1,891.80                -$192.35     -13.9%
10       $2,120.77               $2,424.50                -$303.74     -22.0%

Number of obs = 5,315
Population size = 6,822,934

Note: Table 6 presents a comparative analysis of predicted and actual mean life insurance spending for Canadian households across ten distinct groups for the year 2019.

Table 7. Mean Health Insurance Spending for Canadian Households: Averages Across Ten Distinct Groups

groups   Pred Health Insurance   Actual Health Insurance  difference   Percent change
         Spending 2019 (mean)    Spending 2019 (mean)     in mean      of difference
1        $970.06                 $811.98                  $158.08      11.4%
2        $1,170.10               $1,034.61                $135.49      9.8%
3        $1,295.81               $1,354.13                -$58.31      -4.2%
4        $1,408.65               $1,291.86                $116.79      8.5%
5        $1,516.55               $1,551.65                -$35.11      -2.5%
6        $1,618.15               $1,626.92                -$8.77       -0.6%
7        $1,725.50               $1,663.46                $62.04       4.5%
8        $1,868.42               $2,085.54                -$217.12     -15.7%
9        $2,032.18               $2,104.98                -$72.80      -5.3%
10       $2,481.41               $2,419.36                $62.05       4.5%

Number of obs = 6,425
Population size = 7,610,624

Note: Table 7 provides an analysis of mean health insurance spending for Canadian households in 2019, comparing predicted and actual spending across ten distinct groups.

Table 8.
Mean Car Insurance Spending for Canadian Households: Averages Across Ten Distinct Groups

groups   Pred Car Insurance      Actual Car Insurance     difference   Percent change
         Spending 2019 (mean)    Spending 2019 (mean)     in mean      of difference
1        $591.71                 $255.07                  $336.63      24.4%
2        $696.04                 $458.08                  $237.97      17.2%
3        $771.25                 $568.97                  $202.28      14.6%
4        $847.95                 $648.22                  $199.73      14.5%
5        $929.57                 $767.56                  $162.01      11.7%
6        $1,012.16               $907.78                  $104.38      7.6%
7        $1,106.72               $951.29                  $155.42      11.3%
8        $1,222.83               $1,012.88                $209.95      15.2%
9        $1,365.04               $1,310.05                $54.98       4.0%
10       $1,695.93               $1,719.81                -$23.88      -1.7%

Number of obs = 8,251
Population size = 10,395,761

Note: Table 8 provides an analysis of mean car insurance spending for Canadian households in 2019, comparing predicted spending to actual spending across ten distinct groups.

Table 9. Canadian Household Data: Post-VariMax Rotation Analysis for Life Insurance Holders

lg_EP007_raj Coef.            0.0923 -0.0397  0.1261 -0.0015  0.0137  0.0045  0.0310  0.0015 -0.0048  0.0710  0.0653 -0.0045  0.0444
Variable                       Comp1   Comp2   Comp3   Comp4   Comp5   Comp6   Comp7   Comp8   Comp9  Comp10  Comp11  Comp12  Comp13
Prv_BC                        -0.004   0.005  -0.015   0.000   0.057   0.053   0.060  -0.088   0.053  -0.005  -0.030  -0.035   0.914
Prv_PEI                        0.000  -0.007   0.007  -0.004   0.071   0.069   0.066  -0.011   0.063   0.003   0.021   0.941  -0.025
Prv_NS                        -0.002  -0.005   0.005  -0.001   0.097   0.093  -0.908  -0.041   0.086   0.002   0.013  -0.076  -0.055
Prv_NB                        -0.002  -0.011   0.009  -0.005  -0.876   0.118   0.115  -0.039   0.108   0.003   0.026  -0.097  -0.057
Prv_QBC                       -0.013   0.014  -0.013   0.029   0.305   0.260   0.282  -0.437   0.234  -0.008  -0.062  -0.162  -0.345
Prv_MA                        -0.004   0.004  -0.011  -0.002   0.067   0.064   0.067  -0.067   0.063  -0.005  -0.016  -0.047  -0.071
Prv_SA                        -0.004   0.002  -0.008  -0.011   0.086   0.084   0.081  -0.031  -0.920  -0.003   0.007  -0.069  -0.044
Prv_AB                        -0.005   0.017  -0.022   0.007   0.066   0.059   0.070   0.864   0.060  -0.007  -0.052  -0.034  -0.124
Prv_NEW                       -0.002  -0.002   0.004  -0.005   0.104  -0.899   0.097  -0.030   0.093   0.003   0.019  -0.084  -0.047
1 Person Household            -0.335  -0.094  -0.134   0.059   0.022   0.011   0.014   0.013   0.016  -0.074  -0.028  -0.009  -0.012
Couple With Kid(S)             0.693  -0.034  -0.042   0.010   0.004   0.000   0.003  -0.001   0.013  -0.244  -0.021  -0.002  -0.005
Couple With Other             -0.022   0.001  -0.017   0.003  -0.002  -0.002  -0.001  -0.005   0.003   0.929  -0.016   0.002  -0.004
Lone Parent                   -0.040  -0.023  -0.020   0.011   0.003   0.000   0.002   0.001   0.002  -0.042  -0.006  -0.001  -0.004
Other HH Type                 -0.040  -0.005  -0.030   0.010   0.000  -0.001   0.000  -0.003   0.004  -0.046  -0.019   0.002  -0.006
Own With Mortgage              0.015  -0.016  -0.129   0.846   0.003   0.004   0.000  -0.006   0.023   0.006  -0.022  -0.003  -0.011
Miscellaneous Income Source   -0.001   0.001   0.011   0.012   0.001   0.001   0.001   0.003  -0.001  -0.001   0.005  -0.002   0.003
Self Employee                  0.002  -0.015   0.004  -0.006   0.004   0.002   0.003   0.005   0.002   0.002   0.011  -0.004   0.002
Investment Income              0.000  -0.006  -0.003  -0.008   0.002   0.002   0.001   0.005   0.002  -0.001   0.005  -0.002   0.004
Government Payment             0.019   0.858   0.062  -0.019   0.015  -0.002   0.006   0.020  -0.011   0.018   0.081  -0.010   0.005
Other Major Source Income     -0.003  -0.034   0.003  -0.012   0.010   0.006   0.006   0.012   0.005  -0.003   0.022  -0.009   0.005
Suitable Place To Live         0.010  -0.002   0.015   0.002  -0.004  -0.005  -0.003  -0.007  -0.006  -0.005   0.004   0.004  -0.004
Have Internet                 -0.003  -0.011  -0.002  -0.003   0.001   0.000   0.001   0.003   0.000   0.000   0.005  -0.001   0.001
Size of Household              0.626   0.038  -0.017   0.022   0.003   0.008   0.004  -0.006  -0.003   0.242  -0.014   0.000  -0.007
Length of Tenure              -0.022   0.028   0.812  -0.145  -0.011  -0.004  -0.008  -0.033   0.024  -0.025  -0.081   0.011  -0.030
House Selling Value           -0.022   0.028   0.498   0.501   0.005   0.002   0.008   0.038  -0.043  -0.006   0.072  -0.006   0.042
Higher Level of Edu           -0.012   0.036  -0.040  -0.004  -0.019  -0.014  -0.013  -0.033  -0.006  -0.016   0.949   0.019  -0.024
Urban Size                     0.014  -0.039   0.058  -0.059   0.303   0.280   0.220   0.198   0.227   0.025   0.176  -0.236   0.100
Building Age                   0.005  -0.014   0.023   0.010   0.009   0.007   0.007   0.008   0.002   0.007   0.021  -0.008   0.005
Income                         0.095  -0.495   0.205  -0.044   0.023  -0.019   0.006   0.030  -0.041   0.081   0.194  -0.011   0.000
Rec Car Insurance             -0.005   0.000  -0.007  -0.008   0.013   0.014   0.009   0.021   0.013  -0.004   0.010  -0.014   0.016
Man Cloth Spending            -0.023   0.000  -0.020   0.004   0.002   0.002   0.001   0.005   0.004  -0.020  -0.010  -0.002   0.002
Female Cloth Spending         -0.022  -0.003  -0.020   0.000   0.005   0.005   0.003   0.009   0.007  -0.017  -0.005  -0.005   0.005

Table 9. Canadian Household Data: Post-VariMax Rotation Analysis for Life Insurance Holders (continued)

lg_EP007_raj Coef.            0.0399  0.0112  0.0485 -0.0190  0.0163  0.0310 -0.0600 -0.0097 -0.0136  0.0114  0.0366  0.0176  0.0530
Variable                      Comp14  Comp15  Comp16  Comp17  Comp18  Comp19  Comp20  Comp21  Comp22  Comp23  Comp24  Comp25  Comp26
Prv_BC                         0.004  -0.078   0.008  -0.004  -0.006   0.023   0.006  -0.006   0.007   0.004   0.003   0.001   0.006
Prv_PEI                       -0.002  -0.041  -0.006   0.000   0.002  -0.017  -0.011   0.005  -0.011  -0.002  -0.005  -0.001  -0.003
Prv_NS                        -0.001  -0.070  -0.004  -0.001   0.001  -0.011  -0.010   0.003  -0.008  -0.002  -0.004  -0.001  -0.001
Prv_NB                        -0.002  -0.081  -0.007  -0.002   0.002  -0.020  -0.015   0.006  -0.015  -0.003  -0.007  -0.002  -0.004
Prv_QBC                        0.021  -0.310   0.032  -0.003  -0.005   0.061  -0.002  -0.025   0.022   0.001   0.008   0.006   0.020
Prv_MA                         0.001   0.929   0.002  -0.004  -0.005   0.009   0.002  -0.002   0.003   0.003   0.001   0.001   0.003
Prv_SA                        -0.005  -0.061  -0.009  -0.002  -0.003  -0.017  -0.004   0.007  -0.007   0.001  -0.002   0.000  -0.003
Prv_AB                         0.009  -0.104   0.018   0.000  -0.006   0.040   0.012  -0.013   0.020   0.005   0.009   0.005   0.011
Prv_NEW                       -0.003  -0.068  -0.007   0.001   0.002  -0.020  -0.011   0.007  -0.009  -0.002  -0.004   0.000  -0.003
1 Person Household            -0.194  -0.013  -0.196  -0.308  -0.283  -0.051   0.016   0.133  -0.097   0.001  -0.013  -0.062  -0.011
Couple With Kid(S)            -0.072  -0.012  -0.066  -0.156  -0.160  -0.018   0.012  -0.024  -0.040  -0.004  -0.005  -0.009  -0.010
Couple With Other             -0.023  -0.005  -0.020  -0.042  -0.048  -0.005   0.008  -0.008  -0.005  -0.001   0.002   0.000  -0.002
Lone Parent                   -0.044  -0.004  -0.043   0.931  -0.066  -0.010   0.003   0.020  -0.026  -0.001  -0.005  -0.014  -0.004
Other HH Type                 -0.045  -0.005  -0.043  -0.063   0.934  -0.011   0.010   0.020  -0.014   0.000   0.002  -0.010  -0.002
Own With Mortgage              0.004  -0.012  -0.006   0.016   0.012  -0.018   0.032
0.010 -0.030 0.035 -0.015 -0.008 -0.022 Miscellaneous Income 0.001 0.002 0.001 -0.001 0.000 0.001 -0.005 -0.002 0.002 0.996 0.001 0.001 0.002 Self Employee 0.001 0.001 0.000 -0.003 0.002 -0.002 -0.005 -0.001 -0.012 0.001 0.994 -0.004 -0.003 Investment Income -0.001 0.002 -0.003 -0.003 -0.001 -0.004 0.000 0.002 -0.007 0.002 -0.003 -0.002 0.997 Government Payment 0.014 0.007 0.006 -0.030 0.010 0.009 -0.038 -0.012 -0.078 0.001 -0.039 -0.026 -0.015 Miscellaneous Income Source -0.007 0.002 -0.010 -0.020 -0.010 -0.007 -0.008 0.004 0.971 0.002 -0.013 -0.010 -0.008 Suitable Place To Live 0.017 -0.001 0.018 0.014 0.015 0.011 -0.005 0.982 0.004 -0.002 -0.001 0.004 0.002 Have Internet -0.005 0.001 -0.006 -0.010 -0.007 -0.002 -0.002 0.004 -0.009 0.001 -0.004 0.996 -0.002 Size of Household -0.009 0.002 -0.013 0.061 0.061 -0.003 0.005 0.091 0.027 0.003 0.012 -0.011 0.012 Length of Tenture -0.035 -0.025 -0.039 -0.020 -0.041 -0.020 0.060 0.034 -0.007 0.038 0.003 -0.006 -0.015 House Selling Value 0.003 0.035 0.021 -0.017 -0.006 0.027 -0.075 -0.028 0.056 -0.074 0.026 0.018 0.041 Higher Level of Edu -0.011 -0.013 -0.006 -0.004 -0.018 0.011 0.025 0.005 0.025 0.006 0.014 0.006 0.006 Urban Size -0.023 -0.019 -0.056 0.018 0.029 -0.155 -0.053 0.047 -0.068 -0.012 -0.029 -0.008 -0.034 Building Age 0.006 0.002 0.005 0.002 0.008 0.000 0.987 -0.005 -0.007 -0.005 -0.005 -0.002 0.000 Income 0.068 0.015 0.054 -0.024 0.073 0.045 -0.092 -0.047 -0.164 0.002 -0.086 -0.051 -0.028 Rec Car Insurance -0.011 0.005 -0.014 -0.007 -0.008 0.980 0.000 0.011 -0.008 0.000 -0.002 -0.002 -0.005 Man Cloth Spending 0.972 0.000 -0.028 -0.034 -0.037 -0.012 0.007 0.018 -0.008 0.001 0.001 -0.006 -0.002 Female Cloth Spending -0.028 0.001 0.971 -0.034 -0.035 -0.015 0.006 0.019 -0.011 0.001 0.000 -0.007 -0.003 Note: Table 9. includes various variables and their corresponding coefficients across ten components (Cmp1 to Cmp10). 
The first three components (Comp1, Comp2, Comp3) from the Post-VariMax rotation analysis highlight critical factors influencing life insurance holdings among Canadian households. Component 1 shows a positive relationship with life insurance spending (coefficient: 0.0923), indicating that households made up of couples with children (loading: 0.693) and larger households (loading: 0.626) tend to spend more on life insurance. Component 2 has a negative relationship with life insurance spending (coefficient: -0.0397); it loads positively on government payments (0.858) and negatively on income (-0.495), suggesting that households receiving government payments, typically with lower incomes, tend to spend less on life insurance. Component 3 also exhibits a positive relationship with life insurance spending (coefficient: 0.1261) and loads most heavily on length of tenure (0.812) and house selling value (0.498).

Table 10. Canadian Household Spending on Private Health Insurance: A Post-VariMax Rotation Analysis

Variable                        Comp1    Comp2    Comp3    Comp4    Comp5    Comp6
lg_EP007_raj Coef.            0.04173   0.0908  -0.1001  -0.0078  -0.0347  -0.0017
Prv_BC                         -0.002   -0.012    0.002    0.000    0.060    0.057
Prv_PEI                         0.000    0.006   -0.005   -0.004    0.073    0.071
Prv_NS                         -0.001    0.005   -0.004   -0.002    0.102    0.099
Prv_NB                          0.000    0.009   -0.009   -0.005   -0.881    0.115
Prv_QBC                        -0.002   -0.008    0.003    0.027    0.273    0.247
Prv_MA                         -0.003   -0.011    0.004    0.000    0.072    0.070
Prv_SA                         -0.003   -0.012    0.006   -0.012    0.099    0.097
Prv_AB                         -0.002   -0.019    0.011    0.010    0.080    0.075
Prv_NEW                         0.002    0.007   -0.002   -0.005    0.106   -0.896
1 Person Household             -0.293   -0.122   -0.083    0.069    0.024   -0.003
Couple With Kid(S)              0.717   -0.036   -0.025    0.016    0.002   -0.006
Couple With Other              -0.026   -0.016    0.003    0.004   -0.002   -0.004
Lone Parent                    -0.038   -0.015   -0.019    0.012    0.003   -0.002
Other HH Type                  -0.042   -0.028   -0.005    0.013    0.000   -0.004
Own With Mortgage               0.020   -0.123   -0.025    0.845    0.001    0.002
Miscellaneous Income Source    -0.001    0.009    0.001    0.010    0.002    0.002
Self Employee                   0.002    0.005   -0.015   -0.006    0.004    0.002
Investment Income               0.001   -0.002   -0.008   -0.009    0.002    0.001
Government Payment              0.014    0.076    0.865   -0.022    0.014    0.003
Other Major Source Income      -0.001    0.014   -0.046   -0.015    0.010    0.005
Suitable Place To Live          0.008    0.014   -0.003    0.001   -0.002   -0.001
Have Internet                  -0.002    0.001   -0.010   -0.002    0.002    0.000
Size of Household               0.622   -0.006    0.034    0.018    0.001    0.005
Length of Tenure               -0.020    0.794    0.042   -0.145   -0.014   -0.010
House Selling Value            -0.032    0.494    0.054    0.501    0.009    0.009
Higher Level of Edu            -0.011   -0.042    0.036   -0.002   -0.017   -0.014
Urban Size                      0.010    0.042   -0.013   -0.058    0.308    0.289
Building Age                    0.005    0.026   -0.017    0.010    0.007    0.006
Income                          0.077    0.281   -0.483   -0.050    0.029    0.002
Rec Car Insurance              -0.006   -0.009    0.004   -0.005    0.010    0.010
Man Cloth Spending             -0.029   -0.024    0.000    0.006    0.003    0.000
Female Cloth Spending          -0.026   -0.023   -0.001    0.002    0.004    0.001

Variable                        Comp7    Comp8    Comp9   Comp10   Comp11   Comp12   Comp13
lg_EP007_raj Coef.            0.08089  0.03466  -0.0784  0.04445  0.09534  0.04651  0.00119
Prv_BC                          0.062    0.065   -0.083    0.002   -0.027    0.004   -0.003
Prv_PEI                         0.068    0.069   -0.009   -0.003    0.019   -0.005    0.002
Prv_NS                          0.097   -0.901   -0.039   -0.002    0.013   -0.003    0.002
Prv_NB                          0.111    0.113   -0.030   -0.003    0.024   -0.006    0.003
Prv_QBC                         0.254    0.277   -0.494    0.016   -0.060    0.022   -0.004
Prv_MA                          0.074    0.075   -0.068   -0.002   -0.018   -0.001   -0.005
Prv_SA                         -0.901    0.096   -0.034   -0.008    0.004   -0.009   -0.004
Prv_AB                          0.084    0.090    0.836    0.008   -0.055    0.013   -0.006
Prv_NEW                         0.100    0.102   -0.026    0.000    0.020   -0.003    0.005
1 Person Household              0.019    0.016    0.011   -0.237   -0.034   -0.224   -0.065
Couple With Kid(S)              0.013    0.000    0.000   -0.079   -0.020   -0.070   -0.213
Couple With Other               0.004   -0.002   -0.003   -0.024   -0.015   -0.021    0.940
Lone Parent                     0.003    0.002    0.002   -0.046   -0.005   -0.044   -0.037
Other HH Type                   0.005    0.000   -0.001   -0.053   -0.018   -0.049   -0.043
Own With Mortgage               0.022    0.001   -0.003    0.005   -0.016   -0.003    0.008
Miscellaneous Income Source    -0.001    0.002    0.003    0.001    0.005    0.001    0.000
Self Employee                   0.001    0.003    0.003    0.001    0.011    0.001    0.002
Investment Income               0.001    0.001    0.004    0.000    0.005   -0.001    0.000
Government Payment             -0.016    0.007    0.015    0.012    0.078    0.009    0.018
Other Major Source Income       0.001    0.006    0.008   -0.004    0.029   -0.005    0.001
Suitable Place To Live         -0.004   -0.001   -0.005    0.017    0.006    0.017   -0.004
Have Internet                   0.000    0.001    0.002   -0.004    0.005   -0.004    0.001
Size of Household              -0.007    0.003   -0.001    0.005   -0.010    0.000    0.235
Length of Tenure                0.030   -0.009   -0.029   -0.044   -0.090   -0.047   -0.026
House Selling Value            -0.041    0.007    0.031    0.008    0.058    0.022   -0.009
Higher Level of Edu            -0.002   -0.011   -0.029   -0.014    0.950   -0.011   -0.015
Urban Size                      0.254    0.236    0.192   -0.020    0.154   -0.036    0.018
Building Age                   -0.002    0.006    0.005    0.009    0.022    0.009    0.008
Income                         -0.052    0.012    0.032    0.066    0.206    0.057    0.073
Rec Car Insurance               0.011    0.008    0.015   -0.011    0.004   -0.012   -0.004
Man Cloth Spending              0.006    0.002    0.003    0.961   -0.013   -0.038   -0.023
Female Cloth Spending           0.007    0.003    0.005   -0.037   -0.011    0.964   -0.020

Table 10. Canadian Household Spending on Private Health Insurance: A Post-VariMax Rotation Analysis (continued)

Variable                       Comp14   Comp15   Comp16   Comp17   Comp18   Comp19   Comp20   Comp21   Comp22   Comp23   Comp24   Comp25   Comp26
lg_EP007_raj Coef.            -0.0177   0.0632  -0.0478   0.0143  -0.0027  -0.0523   0.0560   0.0273  -0.0339  -0.0039   0.0325   0.0344   0.0070
Prv_BC                         -0.039    0.082    0.913   -0.003   -0.001    0.003    0.016    0.003   -0.005    0.004    0.002    0.003    0.001
Prv_PEI                         0.939    0.045   -0.029    0.001   -0.001   -0.007   -0.014   -0.009    0.003   -0.003   -0.004   -0.003   -0.002
Prv_NS                         -0.081    0.077   -0.061    0.001   -0.001   -0.008   -0.010   -0.008    0.001   -0.002   -0.004   -0.001   -0.001
Prv_NB                         -0.096    0.080   -0.058    0.001   -0.003   -0.011   -0.017   -0.013    0.003   -0.004   -0.006   -0.003   -0.003
Prv_QBC                        -0.154    0.320   -0.342    0.001    0.002   -0.007    0.047    0.009   -0.022    0.004    0.003    0.013    0.004
Prv_MA                         -0.052   -0.920   -0.078   -0.005   -0.003    0.002    0.006    0.003   -0.002    0.002    0.001    0.002    0.001
Prv_SA                         -0.082    0.075   -0.057   -0.004   -0.002    0.001   -0.016   -0.002    0.005    0.000   -0.001   -0.002    0.000
Prv_AB                         -0.044    0.131   -0.148   -0.002    0.002    0.007    0.035    0.014   -0.013    0.007    0.006    0.009    0.004
Prv_NEW                        -0.087    0.072   -0.051    0.005    0.003   -0.009   -0.016   -0.007    0.002   -0.003   -0.004   -0.002   -0.001
1 Person Household             -0.014    0.015   -0.009   -0.291   -0.284    0.021   -0.051   -0.083    0.122    0.001   -0.012   -0.004   -0.046
Couple With Kid(S)             -0.003    0.011   -0.004   -0.150   -0.140    0.012   -0.016   -0.031   -0.025   -0.002   -0.004   -0.007   -0.005
Couple With Other               0.002    0.005   -0.003   -0.042   -0.037    0.009   -0.005   -0.001   -0.006    0.000    0.002   -0.001    0.001
Lone Parent                    -0.002    0.003   -0.002   -0.060    0.940    0.003   -0.010   -0.021    0.016    0.000   -0.004   -0.002   -0.010
Other HH Type                   0.300    0.005   -0.003    0.933   -0.062    0.011   -0.012   -0.010    0.019    0.001    0.001    0.000   -0.007
Own With Mortgage              -0.002    0.007   -0.009    0.016    0.016    0.030   -0.012   -0.036    0.008    0.029   -0.017   -0.024   -0.007
Miscellaneous Income Source    -0.002   -0.002    0.003    0.000    0.000   -0.004    0.000    0.002   -0.001    0.997    0.001    0.002    0.001
Self Employee                  -0.003   -0.001    0.001    0.001   -0.003   -0.005    0.000   -0.014   -0.001    0.001    0.994   -0.004   -0.004
Investment Income              -0.002   -0.001    0.002    0.000   -0.002   -0.001   -0.002   -0.008    0.000    0.002   -0.004    0.997   -0.002
Government Payment             -0.008   -0.009    0.004    0.007   -0.026   -0.045    0.014   -0.094   -0.011    0.001   -0.038   -0.019   -0.023
Other Major Source Income      -0.008   -0.002    0.003   -0.006   -0.018   -0.014    0.000    0.960    0.000    0.002   -0.016   -0.010   -0.011
Suitable Place To Live          0.002    0.001   -0.003    0.014    0.012   -0.006    0.008    0.001    0.986   -0.001   -0.001    0.000    0.003
Have Internet                  -0.001   -0.001    0.001   -0.005   -0.007   -0.002   -0.001   -0.009    0.003    0.001   -0.003   -0.002    0.997
Size of Household               0.002   -0.005   -0.001    0.079    0.076    0.004   -0.004    0.032    0.080    0.001    0.011    0.014   -0.006
Length of Tenure                0.012    0.023   -0.027   -0.040   -0.016    0.072   -0.024    0.010    0.034    0.033    0.007   -0.013    0.000
House Selling Value            -0.009   -0.024    0.037   -0.009   -0.014   -0.071    0.020    0.074   -0.025   -0.062    0.031    0.047    0.017
Higher Level of Edu             0.017    0.014   -0.021   -0.017   -0.004    0.026    0.004    0.029    0.007    0.006    0.013    0.006    0.006
Urban Size                     -0.249    0.030    0.077    0.019    0.009   -0.026   -0.120   -0.041    0.031   -0.017   -0.019   -0.023   -0.006
Building Age                   -0.006   -0.002    0.003    0.009    0.002    0.985    0.004   -0.012   -0.006   -0.004   -0.006   -0.001   -0.003
Income                         -0.014   -0.026    0.007    0.062   -0.023   -0.119    0.048   -0.225   -0.041   -0.001   -0.087   -0.041   -0.049
Rec Car Insurance              -0.011   -0.003    0.011   -0.009   -0.007    0.004    0.987   -0.001    0.008    0.000    0.000   -0.002   -0.001
Man Cloth Spending             -0.003    0.002    0.001   -0.046   -0.041    0.010   -0.013   -0.005    0.019    0.001    0.001    0.000   -0.005
Female Cloth Spending          -0.004    0.002    0.002   -0.042   -0.038    0.010   -0.013   -0.007    0.019    0.001    0.001   -0.001   -0.005

Note: Table 10 presents the results of a VariMax rotation analysis for private health insurance spending. The table includes the variables and their loadings across 26 components (Comp1 to Comp26), with each component's coefficient (Coef.) listed in the first row. The first three components highlight key factors influencing private health insurance spending.
Component 1 indicates that larger households (loading: 0.622), especially couples with children (loading: 0.717), tend to spend more on private health insurance, whereas single-person households (loading: -0.293) spend less. Component 2 highlights that longer residence duration (loading: 0.794) and higher property values (loading: 0.494) are associated with increased spending, alongside a small positive loading on government payments (0.076). Component 3, however, shows a negative relationship with private health insurance spending (coefficient: -0.1001): it loads positively on government payments (0.865) and negatively on income (-0.483), so households that receive government payments and have lower incomes spend less on private health insurance.

Table 11. Canadian Household Data: An Analysis Post-VariMax Rotation for Car Insurance Holders

Variable                        Comp1    Comp2    Comp3    Comp4    Comp5
lg_EP007_raj Coef.             0.1057   0.0426  -0.1025   0.0214  -0.0022
Prv_BC                         -0.006   -0.015    0.003   -0.107    0.001
Prv_PEI                         0.001    0.007   -0.004   -0.022   -0.003
Prv_NS                         -0.002    0.005   -0.003   -0.056   -0.002
Prv_NB                         -0.001    0.009   -0.007   -0.055   -0.003
Prv_QBC                        -0.006   -0.004    0.004   -0.152    0.019
Prv_MA                         -0.005   -0.009    0.002   -0.072   -0.001
Prv_SA                         -0.016   -0.021    0.004   -0.373   -0.018
Prv_AB                         -0.011   -0.028    0.015    0.805    0.014
Prv_NEW                         0.000    0.006   -0.003   -0.050   -0.004
1 Person Household             -0.350   -0.131   -0.078    0.032    0.087
Couple With Kid(S)              0.696   -0.048   -0.032    0.002    0.026
Couple With Other              -0.025   -0.017    0.000   -0.004    0.007
Lone Parent                    -0.041   -0.019   -0.017    0.002    0.018
Other HH Type                  -0.045   -0.032   -0.007    0.001    0.017
Own With Mortgage               0.021   -0.109   -0.021    0.006    0.850
Miscellaneous Income Source     0.000    0.010    0.000    0.002    0.011
Self Employee                   0.003    0.008   -0.015    0.003   -0.007
Investment Income               0.001   -0.001   -0.008    0.005   -0.010
Government Payment              0.027    0.095    0.824    0.012   -0.032
Other Major Source Income       0.000    0.009   -0.032    0.008   -0.016
Suitable Place To Live          0.010    0.015   -0.001   -0.007    0.000
Have Internet                   0.000    0.004   -0.007    0.001   -0.001
Size of Household               0.613   -0.013    0.037   -0.008    0.021
Length of Tenure               -0.024    0.792    0.028   -0.019   -0.152
House Selling Value            -0.027    0.514    0.042    0.003    0.484
Higher Level of Edu            -0.012   -0.035    0.022   -0.029   -0.001
Urban Size                      0.035    0.070   -0.032    0.400   -0.065
Building Age                    0.006    0.023   -0.011    0.003    0.009
Income                          0.085    0.239   -0.553    0.007   -0.058
Rec Car Insurance              -0.005   -0.006    0.001    0.021   -0.006
Man Cloth Spending             -0.019   -0.015   -0.001    0.007    0.006
Female Cloth Spending          -0.023   -0.021   -0.003    0.010    0.003

Variable                        Comp6    Comp7    Comp8    Comp9   Comp10   Comp11   Comp12   Comp13
lg_EP007_raj Coef.            -0.1183   0.0613   0.0470   0.0387  -0.0236   0.0582   0.0171   0.0122
Prv_BC                         -0.114    0.886   -0.004    0.066    0.064    0.071   -0.032    0.010
Prv_PEI                        -0.031   -0.033    0.003    0.065    0.065    0.062    0.017   -0.005
Prv_NS                         -0.067   -0.068    0.002    0.092    0.091   -0.910    0.012   -0.004
Prv_NB                         -0.067   -0.068    0.003   -0.896    0.102    0.100    0.019   -0.005
Prv_QBC                         0.837   -0.152   -0.004    0.084    0.080    0.091   -0.037    0.019
Prv_MA                         -0.078   -0.080   -0.005    0.065    0.064    0.067   -0.014    0.002
Prv_SA                         -0.381   -0.310   -0.006    0.352    0.340    0.331    0.000   -0.008
Prv_AB                         -0.193   -0.178   -0.010    0.063    0.060    0.075   -0.067    0.024
Prv_NEW                        -0.061   -0.063    0.004    0.099   -0.902    0.096    0.017   -0.005
1 Person Household              0.023    0.005   -0.077    0.010    0.001    0.007   -0.047   -0.202
Couple With Kid(S)              0.003   -0.001   -0.232   -0.002   -0.007   -0.001   -0.026   -0.075
Couple With Other              -0.002   -0.003    0.934   -0.003   -0.004   -0.002   -0.015   -0.022
Lone Parent                     0.000   -0.002   -0.044    0.002   -0.001    0.001   -0.011   -0.044
Other HH Type                   0.002   -0.002   -0.050   -0.001   -0.003   -0.001   -0.021   -0.049
Own With Mortgage               0.027   -0.004    0.011   -0.004   -0.002   -0.004   -0.013   -0.001
Miscellaneous Income Source     0.000    0.003    0.000    0.002    0.001    0.001    0.005    0.001
Self Employee                   0.001    0.000    0.003    0.004    0.002    0.002    0.011    0.002
Investment Income               0.005    0.003    0.000    0.002    0.001    0.001    0.006   -0.002
Government Payment             -0.009   -0.003    0.022    0.019    0.005    0.010    0.075    0.013
Other Major Source Income       0.005    0.003   -0.001    0.009    0.006    0.006    0.019   -0.006
Suitable Place To Live         -0.008   -0.005   -0.004   -0.002   -0.002   -0.001    0.006    0.019
Have Internet                   0.000    0.001    0.002    0.000   -0.001   -0.001    0.004    0.000
Size of Household              -0.009   -0.010    0.234    0.005    0.009    0.006   -0.017   -0.007
Length of Tenure                0.018   -0.023   -0.027   -0.024   -0.017   -0.016   -0.082   -0.044
House Selling Value            -0.056    0.020   -0.009    0.030    0.024    0.024    0.060    0.023
Higher Level of Edu            -0.022   -0.024   -0.014   -0.016   -0.015   -0.010    0.958   -0.007
Urban Size                      0.276    0.201    0.030    0.164    0.171    0.121    0.180   -0.058
Building Age                   -0.003    0.001    0.007    0.009    0.007    0.007    0.018    0.007
Income                         -0.039   -0.020    0.069    0.033    0.006    0.016    0.142    0.052
Rec Car Insurance               0.021    0.018   -0.003    0.010    0.011    0.007    0.009   -0.014
Man Cloth Spending              0.008    0.005   -0.017    0.001    0.001    0.000   -0.007   -0.025
Female Cloth Spending           0.012    0.007   -0.019    0.004    0.004    0.003   -0.007    0.968

Table 11. Canadian Household Data: An Analysis Post-VariMax Rotation for Car Insurance Holders (continued)

Variable                       Comp14   Comp15   Comp16   Comp17   Comp18   Comp19   Comp20   Comp21   Comp22   Comp23   Comp24   Comp25   Comp26
lg_EP007_raj Coef.             0.0599   0.0428   0.0116  -0.0239   0.0356   0.0078  -0.0227  -0.0592   0.0050   0.0057  -0.0085  -0.0167   0.0190
Prv_BC                         -0.095   -0.003   -0.003   -0.045    0.008    0.025    0.002    0.004   -0.007    0.004    0.001    0.005    0.001
Prv_PEI                        -0.044    0.002    0.000    0.945   -0.001   -0.013   -0.009   -0.008    0.003   -0.002   -0.003   -0.003    0.001
Prv_NS                         -0.074    0.001   -0.002   -0.075    0.000   -0.009   -0.010   -0.008    0.002   -0.001   -0.003   -0.001    0.001
Prv_NB                         -0.078    0.001   -0.002   -0.085   -0.001   -0.013   -0.013   -0.012    0.003   -0.002   -0.005   -0.003    0.001
Prv_QBC                        -0.123    0.000   -0.002   -0.054    0.014    0.037   -0.004    0.009   -0.014    0.001    0.002    0.010    0.001
Prv_MA                          0.926   -0.004   -0.004   -0.049    0.002    0.008   -0.001    0.001   -0.002    0.002    0.000    0.002    0.002
Prv_SA                         -0.281   -0.005   -0.006   -0.228    0.000   -0.013   -0.013   -0.011    0.004    0.003   -0.002   -0.001    0.006
Prv_AB                         -0.131   -0.004   -0.002   -0.034    0.016    0.052    0.008    0.020   -0.016    0.006    0.006    0.013    0.002
Prv_NEW                        -0.074    0.003    0.001   -0.082   -0.001   -0.015   -0.011   -0.008    0.003   -0.002   -0.003   -0.002    0.002
1 Person Household             -0.003   -0.303   -0.297   -0.003   -0.163   -0.053    0.023   -0.069    0.143    0.001   -0.010   -0.007   -0.021
Couple With Kid(S)             -0.010   -0.173   -0.163    0.001   -0.068   -0.017    0.017   -0.030   -0.019   -0.001   -0.003   -0.006    0.002
Couple With Other              -0.004   -0.048   -0.043    0.003   -0.020   -0.004    0.009   -0.002   -0.007    0.000    0.003    0.000    0.003
Lone Parent                    -0.004   -0.068    0.932    0.000   -0.038   -0.010    0.005   -0.018    0.021   -0.001   -0.003   -0.002   -0.004
Other HH Type                  -0.004    0.926   -0.070    0.002   -0.042   -0.013    0.012   -0.010    0.024    0.001    0.002    0.000   -0.002
Own With Mortgage              -0.002    0.020    0.024    0.001    0.008   -0.012    0.027   -0.035    0.006    0.030   -0.015   -0.026   -0.001
Miscellaneous Income Source     0.002    0.001   -0.001   -0.002    0.000    0.000   -0.005    0.002   -0.002    0.996    0.001    0.002    0.000
Self Employee                   0.000    0.002   -0.002   -0.003    0.002    0.001   -0.006   -0.014   -0.002    0.001    0.992   -0.005   -0.004
Investment Income               0.002    0.000   -0.002   -0.002    0.000   -0.003   -0.001   -0.009    0.001    0.002   -0.005    0.996   -0.002
Government Payment              0.003    0.013   -0.025   -0.010    0.013    0.016   -0.046   -0.102   -0.015   -0.001   -0.057   -0.029   -0.027
Other Major Source Income       0.001   -0.006   -0.014   -0.007   -0.003   -0.003   -0.009    0.969    0.001    0.002   -0.016   -0.010   -0.007
Suitable Place To Live         -0.002    0.017    0.015    0.002    0.015    0.010   -0.006    0.002    0.981   -0.002   -0.002    0.001    0.001
Have Internet                   0.001   -0.001   -0.003    0.001    0.000    0.001   -0.003   -0.006    0.001    0.000   -0.004   -0.002    0.998
Size of Household               0.000    0.069    0.067    0.000    0.000   -0.003    0.003    0.029    0.095    0.001    0.013    0.013   -0.006
Length of Tenure               -0.015   -0.044   -0.019    0.018   -0.029   -0.019    0.066    0.000    0.037    0.038    0.011   -0.014    0.008
House Selling Value             0.007   -0.006   -0.022   -0.020    0.001    0.021   -0.071    0.068   -0.028   -0.067    0.026    0.050    0.003
Higher Level of Edu            -0.013   -0.018   -0.009    0.017   -0.007    0.010    0.020    0.020    0.006    0.006    0.012    0.007    0.005
Urban Size                      0.060    0.024    0.019   -0.162   -0.031   -0.148   -0.036   -0.055    0.038   -0.011   -0.016   -0.032    0.004
Building Age                    0.000    0.009    0.004   -0.008    0.006    0.002    0.988   -0.008   -0.006   -0.005   -0.006   -0.001   -0.003
Income                          0.001    0.062   -0.014   -0.013    0.047    0.046   -0.092   -0.181   -0.041   -0.004   -0.104   -0.048   -0.050
Rec Car Insurance               0.007   -0.009   -0.007   -0.011   -0.010    0.983    0.002   -0.004    0.010    0.000    0.001   -0.003    0.001
Man Cloth Spending              0.002   -0.032   -0.029   -0.001    0.980   -0.011    0.006   -0.004    0.015    0.000    0.002    0.000    0.000
Female Cloth Spending           0.002   -0.039   -0.036   -0.004   -0.026   -0.015    0.008   -0.007    0.020    0.001    0.001   -0.002   -0.001

Note: Table 11 presents a Post-VariMax rotation analysis of car insurance spending among Canadian households, focusing on the first three components. Component 1 reveals that households with couples and children (loading: 0.70) and larger households (loading: 0.61) are associated with higher car insurance spending, while single-person households (loading: -0.35) tend to spend less.
Component 2 indicates that households with longer residence durations (loading: 0.79) and higher property values (loading: 0.51) tend to spend more on car insurance. Component 3 shows a negative relationship with car insurance spending (coefficient: -0.1025): it loads positively on government payments (loading: 0.82) and negatively on household income (loading: -0.55), so households with lower incomes and those receiving government payments are associated with lower car insurance expenditures. These findings suggest that household composition, financial stability, and income levels significantly influence car insurance spending patterns among Canadian households.
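
The sign reasoning used in the table notes can be checked with a short calculation. The sketch below is illustrative only: it assumes that each component score is a linear combination of standardized variables weighted by the rotated loadings, and that the reported coefficient multiplies that score, which the tables suggest but do not state outright. The coefficients and loadings are copied from the first three components of Table 11; the helper names `implied_contribution` and `direction` are hypothetical.

```python
# Coefficients and loadings copied from Table 11 (Comp1 to Comp3 only).
# Assumption: spending moves with coefficient x loading for each
# variable/component pair; this is an interpretation, not the thesis's code.
COEF = {"Comp1": 0.1057, "Comp2": 0.0426, "Comp3": -0.1025}

LOADINGS = {
    "1 Person Household": {"Comp1": -0.350, "Comp2": -0.131, "Comp3": -0.078},
    "Couple With Kid(S)": {"Comp1": 0.696, "Comp2": -0.048, "Comp3": -0.032},
    "Government Payment": {"Comp1": 0.027, "Comp2": 0.095, "Comp3": 0.824},
    "Size of Household": {"Comp1": 0.613, "Comp2": -0.013, "Comp3": 0.037},
    "Length of Tenure": {"Comp1": -0.024, "Comp2": 0.792, "Comp3": 0.028},
    "House Selling Value": {"Comp1": -0.027, "Comp2": 0.514, "Comp3": 0.042},
    "Income": {"Comp1": 0.085, "Comp2": 0.239, "Comp3": -0.553},
}


def implied_contribution(variable: str, component: str) -> float:
    """Hypothetical helper: coefficient times loading for one pair."""
    return COEF[component] * LOADINGS[variable][component]


def direction(variable: str, component: str) -> str:
    """Translate the sign of the product into the wording used in the notes."""
    value = implied_contribution(variable, component)
    return "higher spending" if value > 0 else "lower spending"


if __name__ == "__main__":
    for variable in ("Income", "Government Payment"):
        product = implied_contribution(variable, "Comp3")
        print(f"{variable}: {product:+.4f} -> {direction(variable, 'Comp3')}")
```

Under these assumptions, the negative coefficient on Component 3 combined with the negative income loading gives a positive product, so higher income goes with higher car insurance spending, while the positive loading on government payments gives a negative product.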