A Token-Based Local Help Platform with NLP Support

by

Jiaen Tao
B.Sc., Zhejiang International Studies University, 2015

PROJECT SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
IN
COMPUTER SCIENCE

THE UNIVERSITY OF NORTHERN BRITISH COLUMBIA
October 2025
© Jiaen Tao, 2025

Abstract

The motivation behind this thesis arises from the high labor costs commonly
observed in Canadian communities, where residents are often forced to acquire multiple skills to cope with everyday needs. A web-based skill-exchange platform, in which
two people trade services using their respective skills, would be valuable. However,
population sparsity often makes matching difficult. To address this challenge, we explore a novel approach to the sharing economy: a local mutual-aid platform. Within
this platform, users can consume services provided by others through virtual tokens,
while the only way to earn tokens is by offering services themselves. Since these tokens are purely virtual, mutual-aid activities do not incur legal liabilities, nor do
they risk creating full-time workers motivated solely by financial profit, which could
undermine the spirit of reciprocity.
On the implementation side, this thesis leverages an optimized Retrieval-Augmented
Generation (RAG) approach to enable query handling under sparse data conditions,
ensuring that even limited datasets can yield accurate and explainable recommendations. Furthermore, a self-developed distributed transaction manager based on
the Saga pattern ensures the integrity of user data across distributed environments,
supporting consistent balance updates, order confirmations, and notifications.
The prototype platform we developed demonstrates how combining modern AI
techniques with lightweight distributed systems can provide both practical utility
and long-term sustainability for local help ecosystems.

Contents

Abstract
List of Tables
List of Figures
Acknowledgement

1 Introduction
  1.1 Research Background and Motivation
  1.2 Problem Statement
    1.2.1 Feasibility of the Sharing Economy
    1.2.2 Feasibility of Token-Based Incentives
    1.2.3 Feasibility of Search Under Sparse Data
  1.3 Thesis Structure

2 Related Work
  2.1 Geographical distribution characteristics of Canadian Communities
  2.2 Demographic and Social Characteristics of Canadian Communities
  2.3 Sharing Economy
    2.3.1 History of Sharing Economy
    2.3.2 Implementation of a Sharing Economy Platform
    2.3.3 Distributed Transactions
  2.4 Token-Based Incentive Systems
  2.5 Recommend under Sparse Data Conditions
    2.5.1 Traditional Recommendation
    2.5.2 Limitations of Traditional Recommendation
    2.5.3 Recommendation Based on LLMs

3 Methodology
  3.1 The Implementation of Local Help Platform
    3.1.1 Home Page
    3.1.2 Registration
    3.1.3 Login
    3.1.4 Profile Management
    3.1.5 Profile Search
    3.1.6 Service Selection
  3.2 The Deployment of Local Help Platform

4 Evaluation
  4.1 Performance Testing
  4.2 Recommendation Quality Testing

5 Conclusion and Discussion

Bibliography

Appendix: RAGAS Evaluation Samples

List of Tables

2.1 Comparison of Median Age in Selected Canadian Regions (2024)
3.1 Order log row data (id = 2033)
4.1 Latency statistics under 1200 QPS load
4.2 System resource usage during benchmark
4.3 RAGAS Evaluation Scores of Clear Question
4.4 RAGAS Evaluation Scores of Ambiguous Question

List of Figures

2.1 Remoteness classification of Canadian census subdivisions based on the Remoteness Index
2.2 Recommendation Workflow
3.1 Positive feedback loop between the service system and the token system
3.2 Home Page
3.3 The Components in Home Page
3.4 Registration entry
3.5 Workflow of the registration process
3.6 Successful login & registration
3.7 User login with a standard username and password
3.8 Feature overview for uploading service information
3.9 Workflow across User, Server, and Search Service
3.10 CSRF token issuance and submission
3.11 Embedding Workflow
3.12 Example embedding records
3.13 Structure of the search results
3.14 Workflow of server and search server interaction
3.15 Overall RAG workflow: extend → retrieve → rerank → generate
3.16 Successfully Matched to “I Want to Pursue a PhD” - Feedback Requested
3.17 Show the reason for the empty result
3.18 product
3.19 freezed time slot
3.20 End-to-end purchase workflow
3.21 Begin Check Out
3.22 Address Configuration
3.23 Notification format sent via the email API
3.24 Saga workflow with TM
3.25 Dependencies
4.1 RAG-based evaluation workflow

Acknowledgement

I would like to express my sincere gratitude to Professor Chen for his invaluable
guidance throughout the entire business workflow design. His insights on compliance and technical architecture provided critical direction for this project. I am also
deeply thankful to Professor Li for his helpful suggestions on the data slicing strategy, which significantly improved the effectiveness of the system. Moreover, I would
like to thank Professor Jiang for his constructive advice on academic standardization,
which made the overall structure of my thesis more rigorous and well-organized.
As an international student, I acknowledge that ChatGPT (GPT-5), a generative
AI system, was used for grammar correction and language polishing in this thesis.
In addition, the synthetic evaluation dataset used in the RAGAS experiments was
generated with the assistance of ChatGPT (GPT-5).

Chapter 1
Introduction

1.1 Research Background and Motivation

This work is motivated by an innovative idea proposed by my supervisor, Dr. Liang
Chen. In Canada—and increasingly worldwide—labour and service costs are high;
therefore, a web-based skill-exchange platform, analogous to a product-exchange
marketplace, may offer a compelling solution in which people trade services using
their respective skills. Two challenges arise: (i) simultaneous, one-to-one matching
is difficult—especially in sparsely populated regions; and (ii) certain categories of
work require quality and risk management. In Dr. Chen’s design, the platform
addresses these issues by employing a token system—users earn tokens by completing
services for others and redeem them later for services they need—and, for selected
categories, arranging group insurance through licensed insurers to manage quality
and risk, thereby providing legal protection for both providers and recipients.
Consequently, a new business scenario emerged: the concept of a local help platform. Such a platform would allow residents to find local partners with specific skills
to help with daily tasks, while also enabling individuals to showcase their abilities
and assist others.
However, no platform currently exists that is explicitly designed for the local help
scenario. Existing systems fall into two categories: sharing economy platforms and
crowdsourcing platforms. Sharing economy platforms such as Airbnb and Uber [1]
operate as intermediaries, matching demand with service providers. Yet, Airbnb
focuses on short-term rentals and Uber focuses on ride-sharing, neither of which
addresses the skill-sharing needs of local help. Crowdsourcing platforms [2], such as
gig platforms and Freelancer, are closer to a C2B structure, where individuals provide services to businesses, rather than facilitating true peer-to-peer skill exchange,
and thus do not align with the authentic needs of local help.
It is also worth noting that both types of platforms involve cash transactions,
which still carry significant economic and legal risks. This raises an important
question: could we design a platform based on the principles of the sharing economy,
but where incentives are provided through non-cash mechanisms? In such a model,
mutual aid between users would be motivated by goodwill or by personal interest in
practicing and applying one’s skills. This type of platform would not only provide
practical services that improve users’ lives but also, through sustained participation,
strengthen interpersonal relationships within the community.

1.2 Problem Statement

1.2.1 Feasibility of the Sharing Economy

From the above discussion, it can be seen that local help essentially represents a
form of the sharing economy that does not rely on cash incentives. Therefore, we
can expect to encounter business and technical challenges similar to those faced by
existing sharing economy systems. Moreover, we need to explore which functions
must be implemented in the first version of the platform under the sharing economy
model to ensure overall usability.

Handling Diverse Website Interactions. Given the limited
human resources available within local communities, the system design must emphasize minimizing operational and maintenance costs, as well as reducing expenses
associated with handling user complaints and dispute resolution. It is foreseeable
that the platform will involve a large number of user interactions, such as searches
and clicks, and will also need to interact with numerous external systems, such as
email delivery. Therefore, we have also studied the implementation of distributed
transactions and compared the relationship between integration cost and user experience across different approaches, in order to select a solution that balances
development cost and usability.

1.2.2 Feasibility of Token-Based Incentives

Another key consideration is the incentive mechanism. Since our platform explicitly
avoids using money as a stimulus, a natural question arises: can tokens be used
to encourage user participation? Is token-based stimulation theoretically feasible?
Most importantly, how should tokens be issued and utilized so as to maximize
community engagement at the lowest possible cost and legal risk? The goal is for
tokens to gradually attract users and encourage participation and interaction.

1.2.3 Feasibility of Search Under Sparse Data

Finally, since the platform is targeted at local communities, it is foreseeable that
local community life will be highly diverse. If we rely on traditional inverted index
approaches to perform keyword-based searches, it will be very difficult to retrieve
relevant data. Therefore, we must confront the challenge of sparse data—for example, cases where user profiles or service information are limited. The critical
technical issue is how to ensure that such limited data can still be effectively retrieved and presented with sufficient accuracy, so that the platform remains useful
under data-scarce conditions. Another issue is user experience: due to data scarcity,
users may still fail to find relevant information. Hence, we need to design interaction
mechanisms that allow users to adjust and refine their inputs.
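
The keyword-matching difficulty described above can be illustrated with a toy inverted index. This is only a sketch: the postings, `build_inverted_index`, and `keyword_search` are hypothetical names and data invented for illustration, not part of the platform. It shows that a query phrased differently from a posting returns nothing, even though a human would consider the posting relevant.

```python
from collections import defaultdict
from typing import Dict, Set

# Hypothetical service postings; not taken from the platform's data.
POSTINGS = {
    1: "plumbing repair for kitchen faucets",
    2: "math tutoring for high school students",
    3: "snow removal and driveway shovelling",
}


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercase term to the set of posting ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def keyword_search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of postings that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


index = build_inverted_index(POSTINGS)
print(keyword_search(index, "faucets repair"))   # exact vocabulary: {1}
print(keyword_search(index, "fix leaking tap"))  # same need, other words: set()
```

With only a handful of postings per community, such vocabulary mismatches leave many queries without results, which is the gap that semantic retrieval is meant to close.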

1.3 Thesis Structure

The Related Work chapter analyzes the geographic distribution and social background of Canadian communities to demonstrate that the demand for local help
aligns with current societal conditions. It then provides an overview of the origins
and development of the sharing economy, discusses potential challenges such platforms may face, such as trust, governance, and long-term user engagement, and
identifies the core functionalities required for an initial version, including reliable
service matching, transparent reputation systems, and user-friendly interfaces. In
addition, it examines the theoretical feasibility of token-based incentive mechanisms
in fostering participation, rewarding contributions, and mitigating the free-rider
problem, and reviews existing research on low-cost query handling in sparse data
environments, with particular attention to RAG techniques, distributed transaction management, and adaptive indexing to ensure scalability and efficiency under
resource-constrained conditions.
The Methodology section will then present the proposed implementation of the
platform, covering both functional design and the underlying technical architecture.
The Evaluation section evaluates the system’s performance and recommendation
accuracy, confirming its low-cost deployment and its recommendation capability
under sparse data conditions.
Finally, the Conclusion will summarize how the proposed platform addresses the
aforementioned challenges, evaluate its effectiveness, and outline possible directions
for future iterations and technical enhancements.

Chapter 2
Related Work
At present, there is no platform specifically dedicated to mutual aid. Therefore,
this chapter first analyzes the geographic distribution and cultural background
of Canadian communities in order to explore the feasibility of local mutual aid.
As introduced in the Introduction, the essence of this platform is a sharing
platform. The website merely serves as an information provider and matchmaking
intermediary, while the actual completion of services still depends on interactions
between users. Accordingly, in terms of theoretical feasibility and risk assessment,
this paper will primarily draw upon business models related to the sharing economy.
This chapter will also examine some of the subsequent issues encountered by sharing
economy platforms and propose strategies to avoid them.
Of course, the platform also incorporates several unique features, such as a token-based incentive mechanism and natural language interaction capabilities, which require independent theoretical exploration.

2.1 Geographical distribution characteristics of Canadian Communities

Figure 2.1: Remoteness classification of Canadian census subdivisions based on the
Remoteness Index.
Source: Statistics Canada, Remoteness Index Map (Remote/Very remote vs. Accessible areas)
[3].

First, as the second-largest country in the world by land area, Canada has a
very sparse population. However, based on the following analysis, we can see that
most communities are clustered along rivers or transportation routes, making inter-community mutual aid geographically feasible.

Extreme Distance from Urban Centres. According to Statistics Canada’s Remoteness Index, most Inuit Nunangat communities are situated more than 1,000
kilometres away from major urban centres, with no road access and reliance exclusively on air transportation. Such extreme isolation severely limits access to
external human resources and diminishes the attractiveness of these communities
for long-term settlement. Statistics from 2016 further indicate that a majority of
Inuit lived in areas classified as very remote (57%) or remote (23%), compared to
only about 3% of the non-Indigenous population [4]. This sharp contrast underscores the structural challenges of sustaining population growth and labour markets
in these regions.

Low Population Density and Internal Dispersion. Remote Canadian settlements typically have extremely low population densities, often fewer than 0.5
persons per square kilometre, compared to over 4,700 persons per square kilometre
in metropolitan Toronto. Combined with a strong cultural emphasis on individual
privacy, this spatial dispersion hinders residents’ awareness of each other’s skills
and capacities, creating barriers to effective resource sharing and mutual support in
times of need.

Feasible Inter-Community Proximity. While intra-community dispersion is
significant, many remote communities in provinces such as British Columbia and
Ontario are located within moderate driving distances of one another. In practical
terms, this often translates into travel times that can be managed within a few
hours, suggesting the possibility of inter-community support and cooperation across
neighbouring settlements.
At the national level, Statistics Canada’s Remoteness Index (Figure 2.1) further
demonstrates that many Canadian communities are categorized as “remote” rather
than “very remote,” indicating that they remain within acceptable reach of neighbouring settlements. This is particularly true in provincial contexts, where road and
ferry connections facilitate inter-community travel.

2.2 Demographic and Social Characteristics of Canadian Communities

This section analyzes the economic situation and cultural background of Canadian
communities. We observe that the vast majority of Canadian communities, even
those in remote areas, benefit from solid infrastructure and reliable Internet connectivity. Moreover, residents generally possess relatively high levels of education,
with most being able to use computers and access the Internet. We also found that
Canadian communities exhibit a highly diverse age distribution, which implies that
the needs of community residents are also highly diverse. Therefore, from a cultural
perspective, a local help platform can effectively address and fulfill the needs of
these residents.

Digital Literacy and Internet Use in Communities. Communities located
in mega cities naturally enjoy advanced infrastructure and thus are not the focus
of this discussion. Instead, we turn our attention to remote communities. Despite geographic isolation, remote Canadian communities benefit from the country’s
strong educational foundations and widespread digital infrastructure. As of 2020,
nearly 94% of Canadian households had fixed-broadband Internet access, and this
availability continues to expand into remote and rural areas, enabling residents to
engage with various online services [5]. Furthermore, most residents in these communities are capable of using electronic devices. Younger populations demonstrate
significantly higher digital literacy: 77% of Canadians aged 15–34 are classified as
“Proficient or Advanced” Internet users, whereas the majority of older individuals
fall into the “Non-user or Basic” categories [6]. Historical initiatives such as
the Community Access Program (CAP) have also delivered essential digital exposure and training to underserved communities by providing access points in local
schools, libraries, and community centers [7]. Taken together, these data suggest
that even in remote regions of Canada, the population is generally capable of using
computers and accessing the Internet. These communities are therefore fully able
to adopt and benefit from the local help platform.

Community Mutual-Aid Tradition. Canada has a long-standing tradition of
mutual aid, which is clearly reflected in national data on volunteering and charitable giving. According to Statistics Canada’s Survey on Giving, Volunteering and
Participating (SGVP), both formal volunteering through organizations and informal
neighbour-to-neighbour assistance are widely embedded in daily life.
The SGVP data collected in 2023, during the COVID-19 period, show that
although the national volunteering rate declined compared to 2018, nearly 73% of
Canadians still engaged in various forms of volunteering. On average, each volunteer
contributed 173 hours annually, which underscores that mutual aid in Canadian
society is not a temporary response to crises but rather a deeply rooted and sustained
practice.
Moreover, the scale of contributions is substantial. In 2023, Canadians devoted
approximately 4.1 billion hours to formal and informal volunteering combined, a
figure that, while lower than in 2018, remains impressive. Notably, the top 10% of
volunteers alone contributed more than 60% of total volunteer hours, highlighting
the presence of highly committed individuals. Taken together, these findings support
the view that, even without monetary incentives, a significant proportion of local
residents can be mobilized to support others [8].

Diverse Community Needs. The diverse age composition of Canadian communities is a key factor contributing to the heterogeneity of local help demands. Table 2.1
summarizes the comparative median ages across selected regions. In the far northern territories, such as Nunavut, the population is remarkably young, with a median
age of only 26.8 years—the lowest in the country—contrasting sharply with rural
regions in provinces such as Ontario, where the median age approaches 47 years.
Meanwhile, the Northwest Territories (36.0 years), Yukon (38.4 years), and the national average (40.3 years) illustrate a spectrum of intermediate age structures. This
demographic variation inevitably leads to diverse community needs.
In addition, cultural diversity is also a defining characteristic of Canadian communities. Canada’s population has long been marked by multicultural features.
From the early English and French settlements, to the large influx of European
immigrants in the early 20th century, and later, since the 1970s, to newcomers
from Asia, Africa, and Latin America, the country has gradually developed into
a society that includes “visible minorities” and a wide variety of ethnic groups.
This structure implies significant differences among community members in terms
of language, dietary practices, religious beliefs, educational traditions, and social
interactions. Such cultural diversity translates directly into distinct everyday needs:
for example, Asian and South Asian groups may require bilingual or multilingual
public services, Arab or Muslim communities may prioritize access to halal food
and religious facilities, while Indigenous communities emphasize the preservation of
land, language, and traditional practices.
Based on these findings [9], we can argue that a local-help platform must be
capable of supporting highly diverse queries. Whether in healthcare, education, or
social services, the design and implementation of such platforms must account for
ethnic differences and cultural sensitivity; otherwise, true equity and inclusiveness
in communities cannot be achieved.
Table 2.1: Comparison of Median Age in Selected Canadian Regions (2024)

Region                           Median Age (years)
Nunavut                          26.8 (youngest nationally)
Northwest Territories (NWT)      36.0
Yukon                            38.4
Canada (national average)        40.3
Rural regions (e.g., Ontario)    ∼47

Source: Statistics Canada, Median age on July 1, 2024 [10]; rural median age data from Rural Ontario Institute [11].

2.3 Sharing Economy

It can be seen that there is indeed a demand for implementing a local help platform.
Moreover, the business model of this platform is similar to that of an agent, enabling
two users with corresponding needs to quickly match. The overall operating model
resembles that of a sharing-economy platform. Therefore, we will encounter similar
risks that need to be mitigated, as well as comparable functional features that must
be implemented.

2.3.1 History of Sharing Economy

The sharing economy is generally considered to have started around 2008 and 2009,
when Airbnb (2008) and Uber (2009) were founded in San Francisco; both are regarded as
pioneers of this domain [1]. Their initial core challenge was that supply and demand had to
coincide in time and place. For instance, in the case of Uber, when
passengers required a ride in the morning, a driver had to be present at the same
time and place, heading in the same direction. If time or location did not coincide,
the transaction could not be completed. Hence, a platform was necessary to match
service providers with service seekers. Similarly, Airbnb faced the same situation:
a guest needed accommodation at time A in location A, while a host in location A
had to have availability during that period; only under such conditions could the
transaction be realized.

Challenges in the Later Stage of the Sharing Economy. Over time, the
original emphasis on idle resource exchange and sharing within sharing economy
platforms has gradually faded. As platforms such as Airbnb and Uber expanded in
scale, professional service providers began to emerge, such as multi-listing Airbnb
hosts or organized Uber fleets. This trend fundamentally transformed both the business models and community atmosphere of these platforms: services were no longer
primarily provided by individual users but instead dominated by professional sellers. Prior research has shown that such professionalization not only affects pricing
structures and supply but may also erode the initial spirit of mutual aid, leading to
deteriorated user experience and increased regulatory risks [12].

2.3.2 Implementation of a Sharing Economy Platform

It can be seen that a sharing economy platform must not only process large amounts
of user-generated content (e.g., service postings, resource descriptions, and availability updates), but also provide an efficient and optimized query and retrieval mechanism that enables users to quickly locate relevant resources. This dual requirement
implies that even a minimum viable sharing economy platform must implement a
range of fundamental and critical functions, as noted by [13, 14]. Specifically, the
platform must support a complete set of user interaction features, ranging from
registration and content creation to matchmaking and transaction completion. Furthermore, since such platforms inevitably involve the storage of users’ personal information,
security must be ensured. In addition, the platform should be capable of
notifying users of changes related to their transactions and interacting with a wide
range of external systems. To reduce maintenance costs, it is essential to incorporate fault-tolerance mechanisms that can maintain system stability in the presence
of external failures. In summary, the platform must satisfy the following technical
requirements:
• Platform Availability (Ability): At the MVP stage, the most critical aspect is that the system can successfully support user registration, resource
posting, and basic matchmaking. Moreover, in the event of failures in external dependencies (e.g., messaging APIs), the platform should ensure that user
data remains consistent without requiring manual intervention.
• Basic Data Security (Integrity): Users must at least trust that the platform will not leak their basic information (e.g., account credentials). Therefore, the immediate priority is to implement minimal data security mechanisms, such as encrypted password storage and secure session management.
• Basic Resource Description (Product Trust): The platform should provide a clear and transparent resource description interface to reduce cognitive
discrepancies between users, thereby improving the likelihood of successful
matches and supporting long-term retention.
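
As an illustration of the "encrypted password storage" requirement listed above, the following minimal sketch stores a salted, deliberately slow hash instead of the password itself. Interpreting the requirement as salted hashing with PBKDF2 is an assumption made here; the function names and the iteration count are illustrative and are not taken from the platform's implementation.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; tune for the deployment hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); only these two values would be stored."""
    salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)


salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))    # False
```

Because only the salt and digest are persisted, a leaked user table does not directly reveal account credentials.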

2.3.3 Distributed Transactions

As noted in [13], sharing economy platforms inevitably involve extensive user interactions and integrations with external systems such as payment gateways and
notification services. To ensure a seamless user experience, we must introduce the
concept of distributed transactions [15]. This is essential to guarantee data integrity
even when dependent services experience outages. Once those services recover, the
system should be able to resume normal operations without manual intervention.
Currently, most distributed transaction solutions incur significant operational
and development overhead. For example:
• XA protocol [16]: Ensures strong consistency but requires tight coupling
with resource managers and can negatively impact performance.
• TCC (Try-Confirm-Cancel): The underlying idea was first proposed in 2007 and
later elaborated in a 2016 ACM Queue article [15]. Building on this
line of thought, the TCC pattern was subsequently formalized. It provides
fine-grained control over transaction stages but demands complex
business logic and typically relies on independently deployed transaction
coordinators.
• SAGA [17]: A more lightweight approach that decomposes long-running
transactions into a sequence of local transactions, each with a corresponding compensation action. SAGA is easier to implement and can be integrated
via SDKs without the need for standalone coordination services.
Although XA and TCC offer robust guarantees, their complexity and operational
costs make them less suitable for high-concurrency, high-interaction environments
typical of sharing economy platforms. In contrast, the SAGA pattern strikes a
balance between consistency and simplicity, making it a practical choice for systems
built on microservices architecture.
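
The Saga idea summarized above, a sequence of local transactions each paired with a compensation action, can be sketched in a few lines. This is a minimal in-process illustration using backward recovery, not the transaction manager developed in this thesis; `SagaStep`, `run_saga`, and the order-related functions are hypothetical names, and a real implementation would also persist step state and retry failed steps.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[dict], None]        # local transaction
    compensation: Callable[[dict], None]  # undoes the local transaction


def run_saga(steps: List[SagaStep], ctx: dict) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action(ctx)
            completed.append(step)
        except Exception as exc:
            ctx["error"] = f"{step.name}: {exc}"
            for done in reversed(completed):
                done.compensation(ctx)
            return False
    return True


# Hypothetical local transactions for a token-paid order.
def debit_tokens(ctx: dict) -> None:
    if ctx["balance"] < ctx["price"]:
        raise ValueError("insufficient tokens")
    ctx["balance"] -= ctx["price"]


def refund_tokens(ctx: dict) -> None:
    ctx["balance"] += ctx["price"]


def confirm_order(ctx: dict) -> None:
    ctx["order_status"] = "confirmed"


def cancel_order(ctx: dict) -> None:
    ctx["order_status"] = "cancelled"


def send_notification(ctx: dict) -> None:
    if not ctx.get("email_api_up", True):
        raise ConnectionError("email API unavailable")
    ctx["notified"] = True


def no_op(ctx: dict) -> None:
    pass  # nothing to undo for a notification that was never sent


ORDER_SAGA = [
    SagaStep("debit", debit_tokens, refund_tokens),
    SagaStep("confirm", confirm_order, cancel_order),
    SagaStep("notify", send_notification, no_op),
]

ctx = {"balance": 10, "price": 4, "email_api_up": False}
ok = run_saga(ORDER_SAGA, ctx)
print(ok, ctx["balance"], ctx["order_status"])  # False 10 cancelled
```

In this run the simulated email outage triggers the compensations, so the balance and order state return to consistent values without manual intervention. A production design might instead retry the notification once the external service recovers.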

2.4 Token-Based Incentive Systems

It can be observed that platforms involving cash-based or contribution-based economies
often raise concerns related to insurance or economic disputes [18], as well as the deterioration of service quality brought about by the emergence of full-time providers
(e.g., professional Airbnb hosts) [12]. Therefore, it is also necessary to investigate
the feasibility of a purely token-incentivized platform.
The following discussion does not concern blockchain-based designs; it applies to
ordinary platform tokens. Although token-based incentives are typically weaker than
direct monetary rewards, prior studies have shown that, as long as stable acquisition
and usage rules are in place, tokens can substantially shape user behavior [19]. In
the context of a local help platform, users can redeem tokens for services provided
by others, which in turn motivates them to actively provide services in order to earn
tokens, thereby forming a positive feedback loop. Such a mechanism is not only
theoretically sound but also supported by practical cases.
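
The earn-and-redeem loop described above can be sketched as a small in-memory ledger in which tokens change hands only when a service is completed. `TokenLedger`, `settle_service`, and in particular the `signup_grant` bootstrap allowance are assumptions introduced for illustration; the text does not specify how the first tokens enter circulation.

```python
from typing import Dict


class TokenLedger:
    """In-memory token balances; tokens move only when a service is completed."""

    def __init__(self, signup_grant: int = 0) -> None:
        self.signup_grant = signup_grant  # hypothetical bootstrap allowance
        self.balances: Dict[str, int] = {}

    def register(self, user: str) -> None:
        self.balances.setdefault(user, self.signup_grant)

    def settle_service(self, consumer: str, provider: str, price: int) -> None:
        """Transfer `price` tokens from consumer to provider after a service."""
        if price <= 0:
            raise ValueError("price must be positive")
        if self.balances[consumer] < price:
            raise ValueError("not enough tokens; provide a service to earn more")
        self.balances[consumer] -= price
        self.balances[provider] += price


ledger = TokenLedger(signup_grant=5)
ledger.register("alice")
ledger.register("bob")
ledger.settle_service(consumer="alice", provider="bob", price=3)
print(ledger.balances)  # {'alice': 2, 'bob': 8}
```

Since every settlement is a transfer, the total number of tokens is conserved, and a user who has spent their balance can continue consuming only after providing a service to someone else.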
A notable real-world analogy is Stack Overflow, one of the world’s largest online
Q&A communities, with over 29 million registered users, millions of questions and
answers, and hundreds of thousands of active contributors each year. Empirical
research has shown that its badge system (a form of virtual tokens) significantly
enhances user engagement: after obtaining a badge, users become more active not
only in the activities directly associated with the badge but also exhibit spillover
effects in unrelated activities. More importantly, even badges with seemingly negative connotations (such as the Tumbleweed badge, awarded to unanswered questions) can motivate users to improve their reputation and increase contributions. In
other words, badges, as non-monetary tokens, influence user motivation through social identity, reputation, and psychological cues, ultimately fostering the production


and circulation of community knowledge [20, 21].
Fundamentally, badge and token systems provide a form of non-monetary incentive whose impact extends beyond short-term engagement to the long-term cultivation of habitual behavior. Researchers have noted that users may initially be
motivated by external rewards, but over time they gradually internalize these external drivers, transforming them into a sense of belonging, social identity, and even
self-actualization. Thus, tokens and badges are not merely a “points system,” but
rather a dual mechanism that combines external incentives with the cultivation of
internal motivation.
A similar phenomenon can also be observed on GitHub, the world’s largest open-source platform. Mechanisms such as stars, watches, and contribution logs do not
involve direct economic benefits but serve as public records of participation. Developers are motivated to increase their activity precisely because their contributions
are made visible to others [22].
Likewise, although token-based incentives cannot rival monetary rewards in direct economic value, their role in sustaining community governance and user participation should not be overlooked. In the design of a local help platform, carefully
incorporating badge and token mechanisms similar to those of Stack Overflow can
not only encourage continuous user engagement but also maintain the platform’s
long-term vitality and healthy development through positive feedback loops.

2.5

Recommendation under Sparse Data Conditions

As mentioned earlier, local help platforms must face the challenge of data sparsity,
while keeping solution costs reasonably low. In our investigation, we found that
although traditional recommendation and search solutions are widely used across various systems, they are not well-suited for local-help platforms.

2.5.1

Traditional Recommendation

Figure 2.2: Recommendation Workflow
Traditional recommender systems typically follow a multi-stage pipeline. The
first step is to collect user–item interaction data, such as explicit feedback (ratings, likes, purchases), as illustrated in Fig. 2.2. These data are then transformed
into a sparse user–item matrix. In the candidate generation stage, classical methods include collaborative filtering (user-based or item-based) and matrix factorization, which project users and items into a shared latent space to perform similarity

matching [23]. In addition, user behavior and item attributes can be incorporated
to improve coverage. The retrieved candidates are subsequently scored and ranked,
often relying on linear models or heuristic functions. Hand-crafted rules may also
be added to improve precision, and finally the Top-K results are presented to users.
There are also search-based solutions, such as the inverted index systems used in Solr
and Elasticsearch, which rely on keyword matching. However, these approaches are
even less suitable for the local-help business scenario and are therefore not discussed
further.

2.5.2

Limitations of Traditional Recommendation

It is clear that the aforementioned approaches have certain limitations. For instance,
they are highly dependent on user behavioral data, but our local-help system is
unlikely to have a sufficiently large user base in its early stages. To reduce this
dependence, more advanced recommendation systems, such as YouTube’s approach [24], have been
developed. In the candidate generation stage, user profiles and possible input queries
are transformed into vectors, while all item information is also vectorized. The
matching task is then formulated as an approximate nearest neighbor (ANN) search
problem in the joint vector space. In the scoring and ranking stage, deep neural
models are applied to return the top-ranked results.
This vector-based matching approach is still adopted in our local-help scenario,
as the development of efficient ANN algorithms, such as HNSW [25], together with
embedding models, has made ANN search extremely fast and effective [26]. By
leveraging deep neural networks for the final scoring, this approach partially alleviates the data sparsity problem. Nevertheless, it should be noted that in order
to continuously improve recommendation accuracy, models must be frequently retrained, which results in high overall maintenance costs. This raises the question:

could large language models (LLMs) be leveraged as the final reranking and filtering mechanism, thereby achieving strong recommendation performance without
extensive retraining?

2.5.3

Recommendation Based on LLMs

Compared with traditional deep recommendation models, large language models
(LLMs) can achieve comparable or even equivalent recommendation accuracy without relying on large-scale training data, particularly in cold-start or few-shot scenarios [27]. A similar phenomenon is also observed in [28], where even untrained LLMs
demonstrate strong performance in retrieval tasks. This observation is especially
relevant for local help platforms with limited user bases and sparse interaction data,
as it implies that such platforms do not need to bear the high cost of model training.
Moreover, given the relatively small user base, there is also no need to be overly
concerned about the high inference costs of LLMs in large-scale online systems, as
pointed out in the study.
For query processing under sparse data conditions, solutions have also been
proposed. In particular, [29] introduces methods to improve query matching with
LLMs through query expansion. Two strategies are discussed: Generative Query
Rewriting (GQR), which produces multiple synonymous variants of the original
query, and Generative Query Expansion (GQE), which generates additional content
relevant to the query to enhance retrieval. Empirical evidence shows that GQE
significantly outperforms GQR, achieving substantial improvements in metrics such
as NDCG and Recall across multiple datasets. However, it is important to note
that query expansion must be combined with appropriate parameter settings, such
as temperature adjustment. While higher temperatures may yield more diverse
expansions, in lightweight community-based recommendation scenarios such as local

help platforms, excessive diversity often reduces recommendation precision, leading
to expanded queries that deviate from the platform’s core service orientation [30].
Furthermore, as highlighted in [31], temperature adjustment alone is insufficient;
prompts must explicitly specify that all expanded queries are grounded in the local
help context in order to effectively constrain the LLM’s tendencies at the final query
expansion and filtering stages.
Therefore, the existing literature suggests that with carefully set temperatures
and explicitly defined prompts, untrained LLMs can already meet the search and
recommendation requirements of local help platforms under sparse data conditions.


Chapter 3
Methodology
This chapter will provide a detailed introduction to the main functions of the Local
Help Platform and the corresponding technical solutions.

3.1

The Implementation of Local Help Platform

Figure 3.1: Positive feedback loop between the service system and the token system
The functional design of this system revolves around two core components: the
service system and the token system. Through the service system, users can quickly
find suitable services and spend tokens. The only way to earn tokens, apart from the
initial gift granted upon registration, is by helping others. This mechanism incentivizes users to provide services for others. As illustrated in Figure 3.1, this process

forms a positive feedback loop, which continuously strengthens the atmosphere of
mutual assistance within the community.

3.1.1

Home Page

Figure 3.2: Home Page

Figure 3.3: The Components in Home Page
Figure 3.2 shows the homepage, which serves as the entry point of the entire

platform. The upper section functions as the entry point for the search feature, while
the lower section mainly displays community activities and highlights residents who
are willing to offer help, configured manually. The detailed composition of the
modules is illustrated in Figure 3.3. The search and shopping cart modules in the
figure will be introduced in detail in the following sections.

3.1.2

Registration

Figure 3.4: Registration entry
User registration on this platform is based on the user’s email address. An email address that is not yet registered is treated as a new user, as shown in Figure 3.4; otherwise, the attempt is
recognized as a duplicate registration.
As illustrated in the workflow in Figure 3.5, once the user successfully registers,
the system automatically grants 20 tokens as a welcome bonus, enabling the user
to enjoy community services and encouraging them to participate in community
activities (see Figure 3.6).
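
The registration rule can be sketched as follows. This is a minimal in-memory illustration rather than the platform's actual code: the `UserStore` class, its method names, and the case-insensitive email comparison are assumptions made for the example; only the duplicate-email rule and the 20-token welcome bonus come from the text.

```python
from dataclasses import dataclass, field

WELCOME_BONUS = 20  # tokens granted on successful registration (from the text)


class DuplicateRegistration(Exception):
    """Raised when the email address is already registered."""


@dataclass
class UserStore:
    # Hypothetical in-memory stand-in for the user table: email -> token balance.
    balances: dict = field(default_factory=dict)

    def register(self, email: str) -> int:
        # Assumption: emails are compared case-insensitively, ignoring outer whitespace.
        key = email.strip().lower()
        if key in self.balances:
            raise DuplicateRegistration(key)
        self.balances[key] = WELCOME_BONUS
        return self.balances[key]
```

In a real deployment the uniqueness check would more likely be enforced by a unique index in the database, so that two concurrent registrations cannot both succeed.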


Figure 3.5: Workflow of the registration process.

Figure 3.6: Successful login & registration

3.1.3

Login

As shown in Figure 3.7, the login process adopts a standard username and password
mechanism. After a successful login, the right side of the homepage displays the
number of tokens currently owned by the user (see Figure 3.6). Once logged in,
the user can immediately enjoy the services provided by other community members.
Furthermore, for a considerable period of time after a successful login, the user’s
authentication token is stored in the browser’s cookies, allowing seamless access
without the need to log in repeatedly.

Figure 3.7: User login with a standard username and password.

3.1.4

Profile Management

This feature is a core part of the service system. It covers (1) uploading a user’s
service information and (2) parsing that information so it can be easily discovered
and used by other users.
Functional Overview As shown in Figure 3.8, users provide their skills and
recent availability. The availability format is yyyy-MM-dd hh:mm, indicating the
specific hour during which the user is free.
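
As an illustration of the availability format, the sketch below parses one entry into the one-hour window it denotes. The function name `parse_slot` is hypothetical, and the sketch assumes a 24-hour clock, which the pattern in the text does not state explicitly.

```python
from datetime import datetime, timedelta

# Assumption: the hour field uses a 24-hour clock; the text only gives the pattern.
AVAILABILITY_FORMAT = "%Y-%m-%d %H:%M"


def parse_slot(text: str) -> tuple[datetime, datetime]:
    """Return the (start, end) of the one-hour slot named by `text`."""
    start = datetime.strptime(text.strip(), AVAILABILITY_FORMAT)
    return start, start + timedelta(hours=1)
```

Malformed input raises `ValueError` from `strptime`, which a server could map to a validation error for the form.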
Avatar Upload Users first upload an avatar from their local device. The server
stores the image in a local filesystem directory and returns the resulting avatar URL
to the user. If satisfied, the user clicks Submit to send their basic profile information.
Server-Side Processing Upon receiving the submission, the server:
1. Validates that all required fields exist and are well-formed (skills, availability,
avatar URL, etc.).
2. Persists the record to the local database.
3. Forwards name and skills to the downstream search service, which updates
the user’s resume/profile index. The indexing update flow is summarized in
Section Profile Feature Engineering.
The end-to-end interaction among the three services is illustrated in Figure 3.9.
After processing completes, the user can be found via the search module.

Figure 3.8: Feature overview for uploading service information

Security Considerations: CSRF Protection User identity artifacts are stored
in browser cookies and corresponding session state persists on the server (e.g.,
mapped to files or memory). If a cookie were stolen, an attacker could attempt
to forge submissions.
To mitigate this, the server issues a CSRF token after each successful login and
stores it in server memory. Whenever a form submission is required, the server also
sends the CSRF token to the client (Figure 3.10); on submit, the client includes this
token so the server can verify the request originated from a legitimate user action.
This mechanism also protects other write workflows, such as payment and
address updates.
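
The issue-and-verify cycle can be sketched as below. This is a simplified illustration under stated assumptions: the `CsrfTokens` class and the use of a session identifier as the lookup key are hypothetical, and the platform's real storage and token format are not described in the text.

```python
import hmac
import secrets


class CsrfTokens:
    """Hypothetical in-memory CSRF token store keyed by session id."""

    def __init__(self) -> None:
        self._tokens: dict[str, str] = {}

    def issue(self, session_id: str) -> str:
        # Called after a successful login; the token is later embedded in forms.
        token = secrets.token_urlsafe(32)
        self._tokens[session_id] = token
        return token

    def verify(self, session_id: str, submitted: str) -> bool:
        expected = self._tokens.get(session_id)
        if expected is None:
            return False
        # Constant-time comparison avoids leaking the token through timing.
        return hmac.compare_digest(expected.encode(), submitted.encode())
```

A write endpoint would call `verify` before touching any data and reject the request when it returns `False`.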
The search service performs feature engineering on the received profile fields
(e.g., skills normalization, keyword extraction) and updates the associated search
index to ensure fast and accurate retrieval.

Figure 3.9: Workflow across User, Server, and Search Service

Figure 3.10: CSRF token issuance and submission

Profile Feature Engineering
When the search server detects that a UserId has new updates (see Figure 3.11),
it performs the following high-level steps:

1. Purge existing vectors for the user: Remove any prior embeddings and
documents for this UserId from the vector database to avoid duplication or
stale content.
2. Collect service data: The server retrieves the latest service records associated with the user.
3. Aggregate to plain text: The collected fields are normalized and concatenated into plain text to form the indexing corpus for this user.
4. Run the indexing pipeline: Proceed with the stages described under
Indexing Pipeline below.

Figure 3.11: Embedding Workflow

Indexing Pipeline The indexing pipeline consists of the following stages:
1. Cleanup: Remove special characters and anomalous content (e.g., stray control symbols) from the aggregated text.


2. Chunk: Split the text into slices by either fixed length (e.g., 100 characters)
or sentence boundaries to preserve semantic coherence.
3. Embedding: Convert both the original (full) text and each chunk into vector
representations and store them in the vector database. Alongside the raw text,
persist rich metadata such as the service provider’s name and the corresponding UserId. An example of embedded records is shown in Figure 3.12.
All stages persist execution logs and checkpoints. Therefore, if a transient failure
(e.g., network interruption) occurs mid-pipeline, a subsequent run resumes from the
last failed step instead of re-executing the completed stages.
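
The Cleanup and Chunk stages can be illustrated with the following sketch. It is not the platform's implementation: the function names are hypothetical, "special characters" is narrowed here to ASCII control characters, and the sentence splitter is a simple punctuation-based rule chosen for the example.

```python
import re


def clean(text: str) -> str:
    """Drop ASCII control characters and collapse runs of whitespace."""
    # Assumption: "special characters" is read as control characters only.
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", text)
    return re.sub(r"\s+", " ", text).strip()


def chunk_fixed(text: str, size: int = 100) -> list[str]:
    """Split into consecutive slices of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_sentences(text: str) -> list[str]:
    """Split after '.', '!' or '?' when followed by whitespace."""
    return [part for part in re.split(r"(?<=[.!?])\s+", text) if part]
```

Fixed-length slices are predictable in size but may cut a sentence in half, whereas sentence-based chunks keep each statement intact at the cost of uneven lengths.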

Model & Vector Dimension Given that the overall text size is small and each
single-user service description is under 1 KB, we use the sentence-transformer model
all-MiniLM-L6-v2 [26]. To balance accuracy, throughput, and cost, we store embeddings at a vector dimension of 384.

Figure 3.12: Example embedding records

3.1.5

Profile Search

After users upload the services they can provide, other users can search for and
use them. As shown in Figure 3.3, the entry point is located on the homepage. Any
user, including strangers, can use this function. The structure of the search results is shown
in Figure 3.13; in addition to the results themselves, the reasons for recommending
these users are also given. The interaction is shown in Figure 3.14: the user sends
a request to the main server, the search server finds suitable services
and returns the corresponding user IDs, and the main server then assembles the
data and returns it. As mentioned earlier, the data in our resume database is not
particularly large.

Figure 3.13: Structure of the search results

Figure 3.14: Workflow of server and search server interaction

Because a service provider is, in this context, comparable to a product, we
use the concept of an SKU from e-commerce to represent the user’s service, with
skuId serving as the identifier for the service. To support diverse queries
over such a small dataset, we adopted the following solution.


Search Pipeline

Figure 3.15: Overall RAG workflow extend → retrieve → rerank → generate.
Our search approach is based on Retrieval-Augmented Generation (RAG) [32].
The end-to-end workflow is illustrated in Figure 3.15, and consists of four stages:
extend, retrieve, rerank, and generate. The underlying LLM is gpt-4o-mini.
To maximize the likelihood of matching relevant data, the following search pipeline
iterates up to 3 rounds until suitable results are found.
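
The iteration can be sketched as a small driver loop. The four stage functions are passed in as parameters because their real implementations are not shown here; the function name `search`, the shape of the result dictionary, and the choice to retrieve with the original question alongside its expansions are assumptions made for the example.

```python
from typing import Callable

MAX_ROUNDS = 3  # upper bound on iterations, as stated in the text


def search(
    question: str,
    extend: Callable[[str, list[str]], list[str]],
    retrieve: Callable[[list[str]], list[dict]],
    rerank: Callable[[str, list[dict]], list[dict]],
    generate: Callable[[str, list[dict]], dict],
) -> dict:
    """Run extend -> retrieve -> rerank -> generate for up to MAX_ROUNDS rounds."""
    asked: list[str] = []  # expansions from earlier rounds, passed back as exclusions
    result: dict = {"reason": "", "sku_id_list": []}
    for _ in range(MAX_ROUNDS):
        expansions = extend(question, asked)
        asked.extend(expansions)
        # Assumption: the original question is searched together with its expansions.
        candidates = rerank(question, retrieve([question, *expansions]))
        result = generate(question, candidates)
        if result["sku_id_list"]:  # stop as soon as a round selects something
            break
    return result
```

If every round comes back empty, the result of the last round is returned, so its reason field can still be shown to the user.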
In addition, during implementation, the prompts must be carefully adjusted to control
how the LLM expands and interprets the query. For instance, expanding the query “Today is too hot” (Accepted set by L. Chen) with a standard RAG-style prompt
yields the first list below, whereas a community-oriented prompt yields the second:
Standard RAG Prompt
1. What are some ways to cope with the heat today? #next-question
2. How can I stay cool during this hot weather?
3. What tips do you have for dealing with high temperatures?
4. What are effective methods to beat the heat today?
...
Community-Oriented Prompt
1. What local resources are available to help me stay cool during this heatwave?
2. Are there any community centers or cooling stations open today to escape the heat?
3. Can you recommend any nearby parks or shaded areas where I can relax in this hot weather?
4. Is there a local mutual-aid group that provides assistance for those struggling with the heat?
...
It can be seen that without constraining the direction of query expansion, the
questions become overly divergent, which is detrimental to recommendation. The
following are the details of each step in the pipeline.

Extend The purpose of this stage is to expand the original query to avoid failure
in matching caused by underspecified questions. As noted above, the prompts must


be controlled to ensure expansions move in the intended direction. Furthermore,
since the pipeline may be executed iteratively, the LLM must be instructed to exclude questions generated in previous rounds. The detailed template is shown in
Expansion Prompt 3.1.5.

Retrieve All expanded questions are processed concurrently. Because retrieval is
parallelized, increasing the number of expansions does not materially impact end-to-end latency.
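
One way to parallelize the lookups is a thread pool, sketched below. The `search_one` callable stands in for a single vector-store query and is hypothetical, as are the function name `retrieve_all` and the worker limit; threads are a reasonable fit here because each lookup mostly waits on network I/O.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def retrieve_all(
    questions: list[str],
    search_one: Callable[[str], list[dict]],
    max_workers: int = 8,
) -> list[dict]:
    """Query the vector store once per question, in parallel, and flatten the hits."""
    if not questions:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(questions))) as pool:
        # map() preserves input order, so results stay aligned with `questions`.
        per_question = list(pool.map(search_one, questions))
    return [hit for hits in per_question for hit in hits]
```

With this structure the wall-clock time is governed by the slowest single lookup rather than the sum of all lookups, as long as the number of questions does not exceed the worker limit.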

Rerank We deduplicate candidates (e.g., by sku_id or name) and retain the top-10 most relevant to the original question. Since the candidate pool usually contains around
30 items and LLM-based ranking may be slower or less precise, we adopt a cross-encoder
reranker to ensure accuracy and efficiency [33].
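
The deduplicate-then-rank step can be sketched as follows. The `score` callable stands in for the cross-encoder, which would score each (question, text) pair jointly; the candidate fields `sku_id` and `text` and the function name are assumptions made for the example.

```python
from typing import Callable


def rerank(
    question: str,
    candidates: list[dict],
    score: Callable[[str, str], float],
    top_k: int = 10,
) -> list[dict]:
    """Deduplicate by sku_id, score each (question, text) pair, keep the best top_k."""
    unique: dict[str, dict] = {}
    for cand in candidates:
        unique.setdefault(cand["sku_id"], cand)  # first occurrence wins
    ranked = sorted(
        unique.values(),
        key=lambda c: score(question, c["text"]),
        reverse=True,
    )
    return ranked[:top_k]
```

Deduplicating before scoring keeps the number of scoring calls at the size of the unique pool, which matters because each call runs a model forward pass.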

Generate In the final generation stage, the prompts must enforce that answers
are grounded in community-oriented scenarios, while preventing the LLM from
over-focusing on keyword overlap. For instance, the query “I want to pursue a
PhD” (Figure 3.16) should still return the SKU “I am a doctor of computer science,”
even with low keyword overlap. Similarly, user capability levels must be respected
(e.g., senior-level requests such as “I need a plumber with 10+ years of experience”).
Additionally, if no suitable candidate is found, the system still returns a clear reason
so the user can adjust the query (Figure 3.17). The complete prompt is provided
in Final Selection Prompt 3.1.5. Regarding temperature, based on [30], accuracy
differences in the range 0 ≤ temperature ≤ 1 are negligible for selection-style tasks.
However, to maximize stability and reproducibility, we set temperature=0.
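
Since the Final Selection Prompt asks for a JSON object with a reason and a comma-separated sku_id_list, the reply can be parsed as sketched below. The function name `parse_selection` is hypothetical, and the sketch assumes the model returns bare JSON as instructed; a production parser would also need to handle malformed replies.

```python
import json


def parse_selection(raw: str) -> tuple[list[str], str]:
    """Parse the model's JSON reply into (sku_ids, reason)."""
    data = json.loads(raw)
    # The prompt specifies a comma-separated string, or an empty string if none.
    parts = str(data.get("sku_id_list", "")).split(",")
    sku_ids = [part.strip() for part in parts if part.strip()]
    return sku_ids, str(data.get("reason", ""))
```

An empty list together with a non-empty reason corresponds to the case in which no suitable candidate is found and the explanation is shown to the user instead.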


Cache The above process can be time-consuming,
especially when no answer can be found. To ensure a good user experience,
we have added a Least Recently Used (LRU) cache at the interface
generation stage so that identical queries can be returned quickly.
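
A minimal way to obtain this behavior in Python is `functools.lru_cache`, sketched below. The names `run_pipeline`, `cached_lookup`, and `cached_search`, the cache capacity, and the query normalization are all assumptions for illustration; the text does not specify how the platform's cache is configured.

```python
from functools import lru_cache


def run_pipeline(question: str) -> str:
    """Placeholder for the slow extend/retrieve/rerank/generate pipeline."""
    return f"answer for: {question}"


@lru_cache(maxsize=256)  # assumption: the capacity is not stated in the text
def cached_lookup(normalized: str) -> str:
    return run_pipeline(normalized)


def cached_search(question: str) -> str:
    # Assumption: queries differing only in case or spacing count as identical.
    return cached_lookup(" ".join(question.lower().split()))
```

Because empty results are cached as well, a repeated query that has no match returns immediately instead of running all three rounds again.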

Figure 3.16: Successfully Matched to “I Want to Pursue a PhD” — Feedback Requested

Figure 3.17: Show the reason for the empty result

Expansion Prompt
You are a helpful HR assistant in a local community mutual-aid system.
Your task is to generate {expand_to_n}
alternative versions of the given user question.

The goal is to make the questions better
reflect local community scenarios, so they can

retrieve more relevant documents from a vector database.
When rephrasing, think about what
kind of help the user might actually need in
a neighborhood or local worker context, and adapt the variations accordingly.

Each variation should stay faithful to
the user’s intent, but can expand naturally
to include possible services or
assistance they might be seeking.
Provide the variations separated by ’{separator}’.
Strictly Avoid and Do not generate
questions similar to: {exclude}
Original question: {question}
Final Selection Prompt
You are a local community HR specialist.
Your responsibility is to help residents
find suitable candidates who can solve their **practical problems**
in daily life or personal development.
This includes but is not limited to:
health, home repair, childcare,
transportation, and academic or educational support.

Always match based on the actual intent
and domain of the user’s question, not
on superficial keyword overlap.

When the user requests professional
expertise or seniority, prefer candidates
whose name or description reflects that level.
Select only the most appropriate
candidate(s); partial matches or
loosely related candidates should not be chosen.

Use only the provided context to
evaluate candidates and infer
the most appropriate one(s)
based on the user’s question.
Each candidate has a unique sku_id.

If suitable candidate(s) exist,
output **only** a JSON object
in the exact format below.
If no candidate is suitable,
still output a JSON object
with an empty sku_id_list
and provide the reason why.

JSON format:
{{
"reason": "Explain clearly why the
selected candidate(s) can reasonably

and practically address the user’s question.
If none are suitable, explain why.",
"sku_id_list": "comma-separated
sku_id(s), or empty string if none"
}}

Do not add any extra text,
explanation, or formatting.
Base your judgment strictly on
the provided context, the user’s question,
and practical relevance.

User question: {query}
Context: {context}

3.1.6

Service Selection

Figure 3.18: product
After the search completes, the user clicks a service to open its detail page
(Figure 3.18), where an appointment list is available. The user selects a preferred
timeslot and then places an order. Upon placing the order, the request is first
added to the shopping cart; the user then navigates to the cart to check out and
pays with tokens. After a successful payment, notification emails are
sent to both parties for follow-up communication, and the service timeslot is locked
(see Figure 3.19). This flow is designed to minimize complaints caused by worker
scheduling conflicts; see Figure 3.20 for the overall process.

Figure 3.19: Frozen time slot

Place Order
Next, we describe the order placement process in detail. The shopping cart is visible
on the homepage (see Figure 3.3) and on every other page;
clicking checkout (Figure 3.21) starts the order flow.
During checkout, the user must provide an exact address to enable subsequent
service delivery (see Figure 3.22). After submission, the system performs token deduction and sends notifications.

Notice
Notifications are sent via an external email API. The email contains two links.
One indicates that the order has been completed successfully; clicking it
credits the tokens to the worker’s account. The other indicates that there is an
issue with the order; clicking it returns the tokens to the user. Ultimately, the
decision of whether to release the tokens is left to the user. A consistency risk exists
because token deduction uses our internal database, while notices are delivered by
an external API. If deduction succeeds but the notice fails, the system enters an
inconsistent state and must compensate the user to avoid complaints. Since the
help community functions as a sharing-economy platform, user trust is fragile in the
early stages, so it is critical to guarantee the correctness of token data to
prevent that trust from breaking down.
We therefore employ the mechanism below to guarantee that, without manual
intervention, user data is eventually correct. The notification format is shown in
Figure 3.23.

Figure 3.20: End-to-end purchase workflow

Figure 3.21: Begin Check Out

Figure 3.22: Address Configuration

Figure 3.23: Notification format sent via the email API

Transaction
Field            Value
order id         billing-151
goods context    {”attribute”: {”amount”:”6.99”, ”}
retry time       1
status           2
step             3
release time     0

Table 3.1: Order log row data (id = 2033).

Our solution is implemented using the Saga pattern [17] and is delivered as an
SDK (no separate service deployment required). All user actions are recorded in a
journal table; once a record is successfully written, downstream steps automatically
retry until a business-correct outcome is reached. The journal interface (Table 3.1)
includes the userId and the execution context for subsequent steps. After logging,
the SDK begins processing the record.
The SDK has two roles: the Transaction Manager (TM), which orchestrates
the flow, and actions, which encapsulate business logic. Here, the actions are token
consume and notice send. Actions are executed sequentially: after token consume
succeeds, notice send runs. If an action (e.g., notice send) fails definitively (e.g.,
API blocked), the TM triggers compensation in the previous action (refund). The
end-to-end Saga flow is illustrated in Figure 3.24.
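
The orchestration described above can be sketched as follows. This is an illustrative reduction, not the SDK itself: the class and function names are hypothetical, the journal is a plain list rather than a database table, and retries are bounded by `max_retries` for the sake of a terminating example, whereas the text describes retrying until a business-correct outcome is reached.

```python
from dataclasses import dataclass
from typing import Callable


class PermanentFailure(Exception):
    """Raised by an action when retrying cannot help (e.g., the API is blocked)."""


@dataclass
class Action:
    name: str
    run: Callable[[dict], None]         # forward step
    compensate: Callable[[dict], None]  # undo step


def run_saga(actions: list[Action], ctx: dict, journal: list[str], max_retries: int = 3) -> bool:
    """Run actions in order; on definitive failure, compensate completed ones in reverse."""
    done: list[Action] = []
    for action in actions:
        succeeded = False
        for _ in range(max_retries):
            try:
                action.run(ctx)
                succeeded = True
                break
            except PermanentFailure:
                break  # definitive failure: retrying is pointless
            except Exception:
                journal.append(f"{action.name}:retry")  # transient failure: try again
        if succeeded:
            journal.append(f"{action.name}:ok")
            done.append(action)
            continue
        journal.append(f"{action.name}:failed")
        for prev in reversed(done):
            prev.compensate(ctx)
            journal.append(f"{prev.name}:compensated")
        return False
    return True
```

With a token-consume action followed by a notice-send action, a blocked email API leaves the balance unchanged after compensation, and the journal records each transition so that an interrupted run can be inspected or resumed.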


Figure 3.24: Saga workflow with TM

3.2

The Deployment of Local Help Platform

The system relies only on Qdrant and MySQL, and uses just two languages. As shown
in Figure 3.25, it has very few dependencies, so even without a dedicated operations
team the deployment can be completed smoothly.

Figure 3.25: Dependencies


Chapter 4
Evaluation
Our evaluation is organized around two primary goals. First, the system should
run with minimal human intervention and low resource consumption. Second, the
quality of recommendations must be accurate and trustworthy.

4.1

Performance Testing

This part focuses on ensuring low cost and stable performance under normal user
traffic. We identify high-traffic endpoints by feature usage and subject them to
load testing. We use wrk as the load generator. Metrics include QPS, latency
(P50/P95/P99 [34]), CPU usage, and memory usage. The results below summarize
representative endpoints.
API Endpoint          QPS     P50 (ms)   P95 (ms)   P99 (ms)
/homepage             1214    23         39         157
/new employee         1125    51         79         178
/checkout/complete    1198    129        151        279

Table 4.1: Latency statistics under 1200 QPS load


API Endpoint          CPU Usage (%)   Memory Usage (MB)
/homepage             73%             2780
/new employee         77%             2721
/checkout/complete    79%             2842

Table 4.2: System resource usage during benchmark
Metrics Given that our system targets a broad user base, we pay special attention
to maintaining a consistent user experience under high-concurrency scenarios. As
highlighted by Dean and Barroso in The Tail at Scale [34], tail latency plays a
decisive role in overall system performance and user-perceived responsiveness in
large-scale distributed systems. Even a small fraction of high-latency requests can
significantly degrade overall user satisfaction.
To comprehensively evaluate system performance under realistic workloads, we
selected three representative endpoints for testing: /homepage, /new employee, and
/checkout/complete. The /homepage endpoint serves as the entry point for most
users upon accessing the platform, /new employee represents recommendation logic
for new users, and /checkout/complete is the universal endpoint for all order
completion operations. Together, these endpoints effectively reflect the core user
journey and cover the system’s critical business paths under concurrent load.
Based on this setup, our performance testing and monitoring activities do not
rely solely on the traditional mean latency metric. We have introduced P95 and
P99 latency statistics to more precisely characterize the system’s extreme response
behavior under heavy load. These high-percentile indicators allow us to identify
hidden performance bottlenecks and assess the potential impact of load conditions
on user experience.
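
To make the percentile metrics concrete, the sketch below computes them with the nearest-rank definition. This is one common convention and an assumption on our part; load generators may interpolate or bucket latencies differently, so values can differ slightly from those a tool reports.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

For example, with 98 requests at 20 ms and 2 requests at 400 ms, the mean is 27.6 ms and P50 is 20 ms, while P99 is 400 ms; only the high percentile exposes the slow tail.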
In addition, we also monitor CPU and memory consumption to evaluate overall
resource utilization during load testing. These data help us determine the efficiency
of system resource usage and identify potential optimization margins while maintaining a stable quality of service.

Result Analysis It can be observed that during the stress tests of each endpoint,
CPU utilization remained below 80%, and memory consumption did not exceed
3 GB. This indicates that the overall system has relatively low resource usage and no
frequent JVM garbage collection (GC) events occurred, suggesting that the resource
configuration is well-balanced and efficient.
Next, we analyze the response time. The /checkout/complete endpoint shows
the highest latency since it involves multiple validation steps and data writing operations. The /new employee endpoint requires rendering data from several services,
resulting in slightly higher latency than /homepage. Nevertheless, the P50, P95,
and P99 latencies all remain within normal ranges and do not negatively affect user
experience. The slightly higher P95 compared to P50 is mainly due to cache misses
in certain requests, while the P99 increase is attributed to database connection
reinitialization and GC during the test.

4.2

Recommendation Quality Testing

We adopt the RAGAS evaluation scheme [35]. The key advantage is that it does not
require pre-authored ground truth; instead, an LLM is prompted to assess answer
grounding and relevance.

Test Flow As explained in Section 3.1.5, we represent each service using an SKU
identifier. In our platform, the testing flow is as follows: the user calls the API and receives
an answer; the test script extracts sku_id_list from the answer, concatenates it
into a SKU query string, and then enters the RAG pipeline (retrieve → augment
→ generate). Finally, RAGAS metrics are computed. Figure 4.1 illustrates the
workflow. The test set used by the platform was generated with GPT-5 and consists
of 40 everyday-life questions (20 with explicit requests and 20 with vague ones), which are used
for testing separately. The test data are provided in the Appendix.

Figure 4.1: RAG-based evaluation workflow

Metrics RAGAS defines three metrics: Faithfulness, Answer Relevance, and Context Relevance. We report Faithfulness and Context Relevance only. Since the primary requirement of the local mutual-aid platform is to
verify whether the recommended SKUs (services) are accurate, and the responses
mainly consist of reasons for selecting these users rather than being generated strictly
based on keywords, the Answer Relevance metric cannot be used as a reliable reference.

Faithfulness Inputs: question q, answer a, and the retrieved context sku detail.
1. Use an LLM to decompose the reasoning/answer content of a into an atomic
statement set $S = \{s_1, \ldots, s_n\}$. The statement-extraction prompt follows the
RAGAS paper [35].
2. For each $s_i \in S$, use an LLM verifier to check whether $s_i$ is supported by
sku detail. Let $V \subseteq S$ be the subset judged as supported and denote $f_s = |V|$.
3. The Faithfulness score is

\[
F = \frac{f_s}{|S|} = \frac{|V|}{|S|} \in [0, 1].
\]

Context Relevance Inputs: question q, answer a, retrieved context sku detail
(a set of sentences/items).
1. As above, sku detail provides the candidate evidence for answering q.
2. Prompt an LLM to extract only the sentences/items from sku detail that are
necessary to answer q; denote this essential subset by $R$.
3. The Context Relevance score is

\[
CR = \frac{|R|}{|\text{sku detail}|} \in [0, 1],
\]

where |sku detail| is the total number of sentences/items in the context.
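
Once the LLM judgments are available, both scores reduce to simple ratios, as the sketch below shows. The function names and input shapes are assumptions for illustration: the per-statement verdicts and the count of essential items would come from the LLM prompts described above, which are not reproduced here.

```python
def faithfulness(supported: list[bool]) -> float:
    """F = |V| / |S|, where supported[i] says whether statement s_i is backed by the context."""
    if not supported:
        raise ValueError("the answer produced no statements")
    return sum(supported) / len(supported)


def context_relevance(essential: int, total: int) -> float:
    """CR = |R| / |sku detail| for `essential` needed items out of `total` context items."""
    if total <= 0 or not 0 <= essential <= total:
        raise ValueError("expected 0 <= essential <= total and total > 0")
    return essential / total
```

For instance, an answer decomposed into four statements of which three are supported has F = 0.75, and a context of five items of which four are needed has CR = 0.8.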
Aggregate RAGAS Results The reported values are obtained by averaging the
RAGAS metrics across the test questions. A total of 40 questions were prepared
for evaluation, consisting of 20 clear questions (Table 4.3) and 20 ambiguous questions (Table 4.4); the average metrics for the two groups are reported
separately. The overall results indicate that the recommendation
accuracy is sufficient to meet user needs. One notable point is that the Context
Relevance score for clear questions is lower than that for ambiguous questions. A
possible explanation is that when the question is ambiguous, the model can still
achieve a high score even if it arbitrarily extends the user’s query, since RAGAS has
difficulty discriminating between different SKUs. In contrast, when the question is
clearly defined, unnecessary responses can be explicitly filtered out.
Metric               Score (0–1)
Faithfulness         0.925
Context Relevance    0.8

Table 4.3: RAGAS Evaluation Scores for Clear Questions.

Metric               Score (0–1)
Faithfulness         0.808
Context Relevance    0.9

Table 4.4: RAGAS Evaluation Scores for Ambiguous Questions.
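
As a sanity check on the averaging step, the per-question scores listed in the appendix samples can be averaged directly. The sketch below does this with the values copied from the appendix in the order they appear; the variable and function names are illustrative, and rounding to three decimals is an assumption about how the table values were reported.

```python
from statistics import mean

# Per-question scores copied from the appendix samples (order as listed there).
CLEAR = {
    "faithfulness": [1, 1, 1, 0.333, 1, 1, 1, 1, 0.5, 1,
                     1, 0.667, 1, 1, 1, 1, 1, 1, 1, 1],
    "context_relevance": [1, 1, 1, 0.5, 0, 1, 1, 0, 0, 1,
                          1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5],
}
AMBIGUOUS = {
    "faithfulness": [1, 1, 0.667, 1, 1, 1, 1, 1, 0.5, 1,
                     1, 0, 0.5, 1, 1, 1, 1, 1, 0.5, 0],
    "context_relevance": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                          1, 0, 1, 1, 1, 1, 1, 1, 1, 0],
}


def average_scores(scores: dict) -> dict:
    """Average each metric over all questions, rounded to three decimals."""
    return {metric: round(mean(values), 3) for metric, values in scores.items()}


print("clear:", average_scores(CLEAR))
print("ambiguous:", average_scores(AMBIGUOUS))
```

Averaged this way, the appendix scores agree with the values reported in Tables 4.3 and 4.4.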


Chapter 5
Conclusion and Discussion
We have implemented a local help platform whose business model satisfies the
criteria of a minimal sharing economy platform. It can be launched quickly and
avoids entanglement with cash transactions, which keeps legal risk relatively low.
Canada also has a long-standing tradition of mutual aid [8], which further supports
the value and relevance of this platform.
From a technical perspective, as described in the Methodology section, the optimized RAG pipeline enables flexible natural language queries, which keeps the
entry barrier for end users very low. During the testing phase, we also observed that
setting temperature=0 is more effective in our filtering-oriented tasks, whereas higher
temperature values cause the LLM’s evaluation criteria to become overly divergent [30].
Furthermore, because of the incentive mechanism, our system must ensure data
consistency when interacting with external services. This work demonstrates that
it is feasible to implement a lightweight transaction manager at a relatively low
cost. Nevertheless, several avenues for future work remain, which we discuss below.


Field Work As our web application is still at the prototype stage, the most
important next step is to recruit a group of volunteers to try it out and provide
practical feedback based on their real usage. This feedback will help us identify
issues and opportunities to improve, and guide us in refining the design to enhance
the overall user experience.

Insurance In practice, workers may still get injured while providing assistance.
Although the entire system functions merely as an intermediary platform, we still
have an obligation to offer insurance to our clients. The insurance model can be
similar to the Airbnb Host Guarantee [36], where users purchase the corresponding
insurance at the time they post their service order.

Prompt Engineering There is also room for improvement in our prompting
strategies. We observed that when users’ queries contain specialized terms (e.g.,
insomnia, Polysomnography), the LLM may fail to retrieve relevant data. In future
versions, query expansion could be improved by incorporating domain-specific vocabulary based on the query’s intent. Another optimization lies in reducing reliance
on prompt-level constraints to guide the LLM. Instead, future iterations could first
analyze all data within the service profiles, extract technical keywords, and then
allow the LLM to use these keywords for query expansion and final validation. In
addition, the RecPrompt [37] mechanism could be introduced, enabling the LLM to
generate prompts based on the data distribution and gradually converge toward an
optimal prompt. This approach could significantly enhance both the efficiency and
accuracy of the LLM’s operations.
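
The keyword-driven expansion proposed above could be prototyped roughly as follows. This is only a sketch under stated assumptions: it takes the semicolon-separated skills strings from the illustrative profile schema in the appendix as the keyword source, and it substitutes plain token overlap for the LLM-based expansion and validation described in the text. All function names are hypothetical.

```python
import re

# A few rows in the shape of the appendix schema (sku_id, title, skills).
PROFILES = [
    {"sku_id": 1, "title": "Residential Electrician",
     "skills": "wiring; panel upgrades; lighting installation; troubleshooting"},
    {"sku_id": 3, "title": "Emergency Plumber",
     "skills": "24/7; leaks; drains; water heater issues"},
    {"sku_id": 12, "title": "Mobile Auto Repair Technician",
     "skills": "on-site repair; battery replacement; diagnostics; brakes"},
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9/]+", text.lower()))


def build_vocabulary(profiles: list) -> list:
    """Collect the distinct keyword phrases found in the skills fields."""
    vocabulary = []
    for profile in profiles:
        for phrase in profile["skills"].split(";"):
            phrase = phrase.strip()
            if phrase and phrase not in vocabulary:
                vocabulary.append(phrase)
    return vocabulary


def expand_query(query: str, vocabulary: list) -> list:
    """Return the vocabulary phrases that share at least one token with the query."""
    query_tokens = tokenize(query)
    return [phrase for phrase in vocabulary if tokenize(phrase) & query_tokens]


vocabulary = build_vocabulary(PROFILES)
print(expand_query("My water heater is broken", vocabulary))
```

In a fuller version, the matched phrases would be handed to the LLM as candidate expansion terms rather than appended mechanically, so that the model still performs the final validation.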


Bibliography
[1] C. Öberg. Towards a typology of sharing economy business model transformation. Technovation, 123:102722, May 2023.

[2] F. D. A. Alauddin, A. Aman, M. F. Ghazali, and S. Daud. The influence
of digital platforms on gig workers: A systematic literature review. Heliyon,
11(1):e41491, January 2025. Online available: Dec. 26, 2024. Open Access
under CC BY 4.0.

[3] Statistics Canada. Remoteness index map by census subdivision.
https://www150.statcan.gc.ca/n1/pub/11-633-x/11-633-x2020002-eng.htm,
2020. Accessed: 2025-08-31.

[4] Statistics Canada. Distance as a factor for first nations, métis, and inuit
high school completion.
https://www150.statcan.gc.ca/n1/pub/81-595-m/81-595-m2023002-eng.htm, 2023.
Accessed: 2025-08-31.

[5] Statistics Canada. Access to the internet in canada, 2020.
https://www150.statcan.gc.ca/n1/daily-quotidien/210531/dq210531d-eng.htm,
2021. Accessed: 2025-08-31.

[6] Statistics Canada. Internet-use typology of canadians: Online activities and
digital skills.
https://www150.statcan.gc.ca/n1/pub/11f0019m/11f0019m2021008-eng.htm, 2021.
Accessed: 2025-08-31.

[7] Industry Canada. Community access program: Proposal guide. Technical
report, Industry Canada, Ottawa, ON, Canada, August 1997. Cat. No. C23271-1-1997, ISBN 0-662-63122-6.

[8] Statistics Canada. Volunteering and charitable giving in canada, 2018 to 2023.
Technical Report Catalogue no. 11-001-X, Statistics Canada, June 2025. Released in The Daily, June 23, 2025.

[9] Peter S. Li. Cultural diversity in canada: The social construction of racial
differences. Technical Report rp02-8e, Department of Justice Canada, Research
and Statistics Division, Ottawa, 2000. Research Paper.

[10] Statistics Canada. Population estimates by age and median age — canada,
provinces and territories, july 1 2024.
https://www150.statcan.gc.ca/n1/daily-quotidien/240925/dq240925a-eng.htm,
2025. Accessed: 2025-08-31.

[11] Rural Ontario Institute. Age factsheet: Rural and urban median age comparison in ontario. Technical Report Factsheet No. 19, Rural Ontario Institute,
2022. Accessed: 2025-08-31.

[12] K. Xie, C. Y. Heo, and Z. E. Mao. Do professional hosts matter? evidence from
multi-listing and full-time hosts in airbnb. Journal of Hospitality and Tourism
Management, 47:413–421, June 2021.

[13] M. J. Pouri and L. M. Hilty. The digital sharing economy: A confluence of technical and social sharing. Environmental Innovation and Societal Transitions,
38:127–139, March 2021.

[14] K. Stanoevska-Slabeva, V. Lenz-Kesekamp, and V. Suter. Platforms and
the sharing economy: An analysis. report from the eu h2020 research
project ps2share: Participation, privacy, and power in the sharing economy.
Technical Report D5.1, Univ. St. Gallen, November 2017.
[Online]. Available:
https://www.bi.edu/globalassets/forskning/h2020/ps2share_platform-analysis-paper_final.pdf.

[15] P. Helland. Life beyond distributed transactions. Queue, 14(5):69–98, October
2016.

[16] X/Open Company Ltd. Distributed transaction processing: The XA specification. X/Open CAE Specification XO/CAE/91/300, X/Open Company Ltd.,
Reading, UK, December 1991.

[17] H. Garcia-Molina and K. Salem. Sagas. ACM Sigmod Record, 16(3):249–259,
1987.

[18] R. Calo and A. Rosenblat. The taking economy: Uber, information, and power.
Columbia Law Review, 117(6):1623–1690, 2017.

[19] A. E. Kazdin and R. R. Bootzin. The token economy: An evaluative review.
Journal of Applied Behavior Analysis, 5(3):343–372, 1972.

[20] Z. Li, K.-W. Huang, and H. Cavusoglu. Quantifying the impact of badges on
user engagement in online q&a communities. In Proceedings of the Thirty Third
International Conference on Information Systems (ICIS), Orlando, FL, USA,
2012. Research-in-Progress.

[21] A. Anderson, D. Huttenlocher, J. Kleinberg, and J. Leskovec. Steering user
behavior with badges. In Proc. 22nd Int. Conf. World Wide Web (WWW),
pages 95–106. ACM, 2013.

[22] L. Dabbish, C. Stuart, J. Tsay, and J. Herbsleb. Social coding in github:
Transparency and collaboration in an open software repository. In Proc. ACM
Conf. Computer Supported Cooperative Work (CSCW), pages 1277–1286, Seattle, WA, USA, February 2012. ACM.

[23] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. Item-based
collaborative filtering recommendation algorithms. In Proceedings of the 10th
International Conference on World Wide Web (WWW ’01), pages 285–295,
New York, NY, USA, 2001. ACM.

[24] Paul Covington, Jay Adams, and Emre Sargin. Deep neural networks for
youtube recommendations. In Proceedings of the 10th ACM conference on recommender systems, pages 191–198, 2016.

[25] Yu A Malkov and Dmitry A Yashunin. Efficient and robust approximate nearest
neighbor search using hierarchical navigable small world graphs. IEEE transactions on pattern analysis and machine intelligence, 42(4):824–836, 2018.

[26] A. Rao, H. Alipour, and N. Pendar. Rethinking hybrid retrieval: When
small embeddings and llm re-ranking beat bigger models. arXiv preprint
arXiv:2506.00049, 2025. [Online]. Available: https://arxiv.org/abs/2506.00049
(accessed Sep. 13, 2025).

[27] J. Huang, S. Wang, L. Ning, W. Fan, S. Wang, D. Yin, and Q. Li. Towards
next-generation recommender systems: A benchmark for personalized recommendation assistant with LLMs. arXiv preprint arXiv:2503.09382, 2025. [Online]. Available: https://arxiv.org/abs/2503.09382 (accessed Sep. 13, 2025).

[28] T. Shen, G. Long, X. Geng, C. Tao, T. Zhou, and D. Jiang. Large language
models are strong zero-shot retriever. arXiv preprint arXiv:2304.14233, 2023.
[Online]. Available: https://arxiv.org/abs/2304.14233 (accessed Sep. 13, 2025).


[29] M. A. K. Ayoub, Z. Su, and Q. Li. A case study of enhancing sparse retrieval using llms. In Companion Proceedings of the ACM Web Conference 2024 (WWW
’24 Companion), pages 1609–1615, Singapore, Singapore, May 2024. ACM.

[30] M. Renze. The effect of sampling temperature on problem solving in large
language models. In Findings of the Association for Computational Linguistics:
EMNLP 2024, pages 7346–7356. Association for Computational Linguistics,
November 2024.

[31] C. Yang, Y. Shi, Q. Ma, M. X. Liu, C. Kästner, and T. Wu. What prompts
don’t say: Understanding and managing underspecification in llm prompts.
arXiv preprint arXiv:2505.13360, 2025. [Online]. Available:
https://arxiv.org/abs/2505.13360 (accessed Sep. 13, 2025).

[32] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler,
M. Lewis, W. Yih, T. Rocktäschel, S. Riedel, and D. Kiela. Retrieval-augmented
generation for knowledge-intensive nlp tasks. In Proc. 34th Conf. Neural Information Processing Systems (NeurIPS), Vancouver, Canada, 2020.

[33] H. Déjean, S. Clinchant, and T. Formal. A thorough comparison of cross-encoders and llms for reranking splade. arXiv preprint arXiv:2403.10407, 2024.
[Online]. Available: https://arxiv.org/abs/2403.10407 (accessed Sep. 13, 2025).

[34] J. Dean and L. A. Barroso. The tail at scale. Communications of the ACM,
56(2):74–80, February 2013.

[35] S. Es, J. James, L. Espinosa Anke, and S. Schockaert. Ragas: Automated evaluation of retrieval augmented generation. In Proc. 18th Conf. of the European
Chapter of the Association for Computational Linguistics: System Demonstrations (EACL Demo), pages 150–158, St. Julians, Malta, March 2024. Association for Computational Linguistics.


[36] C. Marzen, D. A. Prum, and R. J. Aalberts. The new sharing economy: The
role of property, tort and contract law for managing the Airbnb model. SSRN
Electronic Journal, 2016.

[37] D. Liu, B. Yang, H. Du, D. Greene, N. Hurley, A. Lawlor, R. Dong, and I. Li.
Recprompt: A self-tuning prompting framework for news recommendation using large language models. In Proc. 33rd ACM Int. Conf. Inf. Knowl. Manage.
(CIKM), pages 3902–3906, Boise, ID, USA, 2024. ACM.


Appendix: RAGAS Evaluation Samples
This appendix provides a selection of prompts and explanations used in the RAGAS-based evaluation. Only a subset of the test questions is included here for illustration
purposes, as the full dataset is too large to present in this document.
Context note. During testing, all cases shared the same retrieval context: a unified
résumé/profile database. To avoid redundancy, we do not repeat the full context for
each example. Below is a compact SQL-style schema and a small sample of rows to
illustrate the format of the context.
-- Schema (illustrative)
CREATE TABLE profiles (
    sku_id  INTEGER PRIMARY KEY,
    title   TEXT,
    domain  TEXT,
    skills  TEXT
);

-- Sample rows (excerpt)
INSERT INTO profiles (sku_id, title, domain, skills) VALUES
    (1,   'Residential Electrician',           'Electrical', 'wiring; panel upgrades; lighting installation; troubleshooting'),
    (14,  'Lighting Installation Electrician', 'Electrical', 'ceiling lights; wall lamps; energy-saving systems; breakers'),
    (100, 'Home Lighting Electrician',         'Electrical', 'home lighting; LED retrofit; ceiling/wall lamps'),
    (10,  'Smart Home Electrician',            'Electrical', 'smart lighting; thermostats; IoT; home automation'),
    (12,  'Mobile Auto Repair Technician',     'Automotive', 'on-site repair; battery replacement; diagnostics; brakes'),
    (5,   'Mobile Car Technician',             'Automotive', 'battery; tires; minor engine diagnostics (on-site)'),
    (2,   'Auto Mechanic',                     'Automotive', 'diagnostics; oil changes; brake repairs; maintenance'),
    (8,   'Motorcycle Mechanic',               'Automotive', 'engine tuning; brake adjustment; chain service'),
    (13,  'Bathroom Plumbing Specialist',      'Plumbing',   'low pressure; clogs; pipe issues; bathroom fixtures'),
    (3,   'Emergency Plumber',                 'Plumbing',   '24/7; leaks; drains; water heater issues');

Example 1: Clear Questions

USER INPUT
User question: Can you replace the brake pads and rotors on my car this weekend?
RETRIEVED CONTEXTS
['Experienced in replacing brake pads, calipers, and rotors. Ensures your vehicle stops safely and smoothly.',
 'Replaces brake pads, fluids, and rotors. Ensures responsive braking.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: Please install a smart thermostat and ensure it’s connected to my HVAC system.
RETRIEVED CONTEXTS
['Installs smart thermostats, lighting systems, and security devices. Ensures modern electrical integration in your home.',
 'Installs smart lighting, thermostats, and home automation systems. Ensures safe integration with existing wiring.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: I need a licensed electrician to upgrade my breaker panel to handle more circuits.
RETRIEVED CONTEXTS
['Upgrades outdated electrical panels to modern circuit breakers. Ensures safety and increased capacity.',
 'Handles residential and commercial wiring projects. Experienced in circuit installation, breaker panel upgrades, and fault diagnosis.',
 'Upgrades outdated fuse boxes to modern breaker panels, increasing electrical capacity and home safety.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: Can you unclog my kitchen sink drain and check for any leaks?
RETRIEVED CONTEXTS
['Fixes leaky pipes, replaces corroded plumbing, and handles emergency indoor water issues efficiently.',
 'Unclogs kitchen, bathroom, and floor drains using mechanical and chemical methods. Fast and affordable.']
EVALUATION
Faith: 0.333    Cont. Rel: 0.5

USER INPUT
User question: Please install a ceiling fan in my bedroom and ensure proper wiring.
RETRIEVED CONTEXTS
['Handles residential and commercial wiring projects. Experienced in circuit installation, breaker panel upgrades, and fault diagnosis.',
 'Installs smart lighting, thermostats, and home automation systems. Ensures safe integration with existing wiring.',
 'Specialist in installing ceiling lights, wall fixtures, and energy-efficient LED systems for homes and offices.']
EVALUATION
Faith: 1    Cont. Rel: 0

USER INPUT
User question: My water heater stopped working, can you repair or replace it?
RETRIEVED CONTEXTS
['Repairs and installs water heaters. Handles both tank and tankless systems.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: I need someone to install an EV charging station in my garage.
RETRIEVED CONTEXTS
['Sets up electric vehicle home charging stations with proper grounding and capacity checks.',
 'Installs home EV charging stations compatible with Tesla, Nissan Leaf, and other electric vehicles.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: Can you replace the old light switches with dimmer switches in my living room?
RETRIEVED CONTEXTS
['Provides safe and efficient installation of ceiling lights, wall lamps, and energy-saving lighting systems.',
 'Specialist in installing ceiling lights, wall fixtures, and energy-efficient LED systems for homes and offices.']
EVALUATION
Faith: 1    Cont. Rel: 0

USER INPUT
User question: Please install a new dishwasher and connect it to the existing plumbing.
RETRIEVED CONTEXTS
['Specialist in sink, faucet, and dishwasher installation and repair. Keeps your kitchen running smoothly.']
EVALUATION
Faith: 0.5    Cont. Rel: 0

USER INPUT
User question: My car’s A/C isn’t cooling, can you recharge the refrigerant and check for leaks?
RETRIEVED CONTEXTS
['Diagnoses and repairs car air conditioning systems. Services include refrigerant recharge and compressor replacement.',
 'Repairs vehicle AC systems including refrigerant top-up and compressor fixes.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: Can you install LED recessed lighting in my kitchen ceiling?
RETRIEVED CONTEXTS
['Specialist in installing ceiling lights, wall fixtures, and energy-efficient LED systems for homes and offices.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: I need someone to replace the corroded pipes under my bathroom sink.
RETRIEVED CONTEXTS
['Replaces corroded or leaking water pipes in kitchens, bathrooms, and basements. Uses durable materials and precise fitting.',
 'Installs and repairs toilets, bathtubs, and showers. Solves low pressure, clogging, and pipe issues efficiently.']
EVALUATION
Faith: 0.667    Cont. Rel: 1

USER INPUT
User question: Please install a garbage disposal unit in my kitchen sink.
RETRIEVED CONTEXTS
['Installs and seals new sinks, faucets, and garbage disposals in kitchens.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: Can you repair the electrical wiring for my oven? It’s not heating properly.
RETRIEVED CONTEXTS
['Handles residential and commercial wiring projects. Experienced in circuit installation, breaker panel upgrades, and fault diagnosis.',
 'Fixes electrical issues in ovens, washing machines, dryers, and other household appliances. In-home repair visits available.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: I need someone to install a smart lighting system throughout my home.
RETRIEVED CONTEXTS
['Installs smart lighting, thermostats, and home automation systems. Ensures safe integration with existing wiring.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: Can you replace the suspension struts on my vehicle?
RETRIEVED CONTEXTS
['Fixes vehicle suspension systems for a smoother and safer ride.',
 'Fixes worn-out shocks, struts, and suspension components. Improves ride comfort and vehicle stability.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: Please install a surge protector at the main electrical panel.
RETRIEVED CONTEXTS
['Installs whole-home surge protection devices to prevent appliance damage during voltage spikes.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: My toilet keeps leaking, can you replace the internal components?
RETRIEVED CONTEXTS
['Installs and replaces standard and smart toilets. Handles leaks, flushing issues, and low water pressure.',
 'Responds quickly to pipe bursts, leaks, and severe clogs. Available 24/7.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: I need someone to install outdoor waterproof lighting along my walkway.
RETRIEVED CONTEXTS
['Installs garden and pathway lighting with waterproof systems. Enhances aesthetics and security.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: Can you perform a full electrical safety inspection for my home?
RETRIEVED CONTEXTS
['Performs safety inspections for residential and rental properties. Identifies fire hazards and code violations.',
 'Inspects residential and commercial wiring for safety compliance and issues.']
EVALUATION
Faith: 1    Cont. Rel: 0.5

Example 2: Ambiguous Questions
USER INPUT
User question: My car is making a strange noise, can someone check it?
RETRIEVED CONTEXTS
['Experienced in replacing brake pads, calipers, and rotors. Ensures your vehicle stops safely and smoothly.',
 'Performs diagnostics and repairs on engine misfires, leaks, overheating, and timing belt replacements.',
 'Uses OBD-II and advanced tools to find check engine issues, sensor faults, and system inefficiencies.',
 'Skilled mechanic providing car diagnostics, oil changes, brake repairs, and general vehicle maintenance services.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: The lights in my house keep flickering. Any idea why?
RETRIEVED CONTEXTS
['Handles residential and commercial wiring projects. Experienced in circuit installation, breaker panel upgrades, and fault diagnosis.',
 'Certified residential electrician with experience in wiring, panel upgrades, lighting installation, and electrical troubleshooting for homes.',
 'Provides safe and efficient installation of ceiling lights, wall lamps, and energy-saving lighting systems.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: I have a leak in the bathroom. Can anyone help?
RETRIEVED CONTEXTS
['Responds quickly to pipe bursts, leaks, and severe clogs. Available 24/7.',
 'Fixes leaky pipes, replaces corroded plumbing, and handles emergency indoor water issues efficiently.',
 'Handles plumbing for full bathroom renovations including showers and vanities.',
 'Installs and repairs toilets, bathtubs, and showers. Solves low pressure, clogging, and pipe issues efficiently.']
EVALUATION
Faith: 0.667    Cont. Rel: 1

USER INPUT
User question: My dishwasher isn’t cleaning dishes properly.
RETRIEVED CONTEXTS
['Specialist in sink, faucet, and dishwasher installation and repair. Keeps your kitchen running smoothly.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: The water pressure in my shower is really low.
RETRIEVED CONTEXTS
['Installs and replaces standard and smart toilets. Handles leaks, flushing issues, and low water pressure.',
 'Installs and replaces toilets, showers, faucets, and vanity sinks. Also handles related water pressure issues.',
 'Installs and repairs toilets, bathtubs, and showers. Solves low pressure, clogging, and pipe issues efficiently.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: My thermostat isn’t working as expected.
RETRIEVED CONTEXTS
['Installs smart thermostats, lighting systems, and security devices. Ensures modern electrical integration in your home.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: One of my breakers keeps tripping randomly.
RETRIEVED CONTEXTS
['Troubleshoots tripping or overheating breakers. Offers repair or replacement services with full testing.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: My kitchen sink is draining very slowly.
RETRIEVED CONTEXTS
['Unclogs drains in kitchens, bathrooms, and basements using mechanical tools.',
 'Unclogs kitchen, bathroom, and floor drains using mechanical and chemical methods. Fast and affordable.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: The A/C in my car isn’t cooling well.
RETRIEVED CONTEXTS
['Diagnoses and repairs car air conditioning systems. Services include refrigerant recharge and compressor replacement.',
 'Repairs vehicle AC systems including refrigerant top-up and compressor fixes.']
EVALUATION
Faith: 0.5    Cont. Rel: 1

USER INPUT
User question: My washing machine stopped spinning.
RETRIEVED CONTEXTS
['Fixes electrical issues in ovens, washing machines, dryers, and other household appliances. In-home repair visits available.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: The ceiling light in my living room doesn’t turn on.
RETRIEVED CONTEXTS
['Handles residential and commercial wiring projects. Experienced in circuit installation, breaker panel upgrades, and fault diagnosis.',
 'Provides safe and efficient installation of ceiling lights, wall lamps, and energy-saving lighting systems.',
 'Specialist in installing ceiling lights, wall fixtures, and energy-efficient LED systems for homes and offices.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: My toilet keeps running after flushing.
RETRIEVED CONTEXTS
['Responds quickly to pipe bursts, leaks, and severe clogs. Available 24/7.',
 'Installs and repairs toilets, bathtubs, and showers. Solves low pressure, clogging, and pipe issues efficiently.']
EVALUATION
Faith: 0    Cont. Rel: 0

USER INPUT
User question: There’s water under my kitchen sink.
RETRIEVED CONTEXTS
['Repairs or replaces leaking, dripping, or stiff kitchen faucets. Handles both modern and traditional fixtures.',
 'Specialist in sink, faucet, and dishwasher installation and repair. Keeps your kitchen running smoothly.']
EVALUATION
Faith: 0.5    Cont. Rel: 1

USER INPUT
User question: My car won’t start this morning.
RETRIEVED CONTEXTS
['Provides vehicle repair services at your location, including battery replacement, minor engine diagnostics, and brake service.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: The oven isn’t heating up properly.
RETRIEVED CONTEXTS
['Fixes electrical issues in ovens, washing machines, dryers, and other household appliances. In-home repair visits available.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: My motorcycle brakes feel soft.
RETRIEVED CONTEXTS
['Experienced in repairing motorcycles: engine tuning, brake adjustments, chain servicing, and general maintenance.',
 'Experienced in replacing brake pads, calipers, and rotors. Ensures your vehicle stops safely and smoothly.',
 'Replaces brake pads, fluids, and rotors. Ensures responsive braking.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: The garage door opener isn’t responding.
RETRIEVED CONTEXTS
['Troubleshoots tripping or overheating breakers. Offers repair or replacement services with full testing.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: My dryer is making a loud noise.
RETRIEVED CONTEXTS
['Fixes electrical issues in ovens, washing machines, dryers, and other household appliances. In-home repair visits available.']
EVALUATION
Faith: 1    Cont. Rel: 1

USER INPUT
User question: The smart light bulbs aren’t connecting to the app.
RETRIEVED CONTEXTS
['Installs smart lighting, thermostats, and home automation systems. Ensures safe integration with existing wiring.']
EVALUATION
Faith: 0.5    Cont. Rel: 1

USER INPUT
User question: My bathroom fan stopped working.
RETRIEVED CONTEXTS
['Handles plumbing for full bathroom renovations including showers and vanities.',
 'Installs and replaces toilets, showers, faucets, and vanity sinks. Also handles related water pressure issues.']
EVALUATION
Faith: 0    Cont. Rel: 0