A New Chapter in Smart Governance: AI Empowering Innovation in Government Services

In the digital age, traditional public services face significant challenges, including inefficiency and outdated information. Large language models (LLMs), while impressive, struggle with “hallucinations” (generating fluent but incorrect responses) and a lack of domain-specific data, making them inadequate for high-accuracy demands. To address these issues, the Retrieval-Augmented Generation (RAG) framework has emerged as a transformative solution.


Hailiang Chen, Miao Yu: When handling enquiries for government services, a Retrieval-Augmented Generation (RAG) framework can resolve more complex issues while still providing accurate answers, and the time, computing power and costs required are far lower than with Incremental Pre-training and Fine-tuning methods.

Can general LLMs be directly applied to government services?

In recent years, Large Language Models (LLMs) like ChatGPT have rapidly gained worldwide popularity and shown great potential. Trained on vast amounts of data, these models can produce coherent and semantically sound text, giving them excellent question-and-answer capabilities. When it comes to government services, members of the public generally use government websites or mobile applications to enquire about policies, regulations and procedures. People may also seek support from physical help centres.

Traditional government services are primarily delivered over a physical counter or by phone, so service efficiency and response speed often face limitations. This is especially true when complicated questions come up, which can lengthen queuing times and lower public satisfaction. As technology develops, especially with the emergence of LLMs, government services are gradually moving toward more intelligent automation. By combining LLMs with conversational AI, government services become more efficient and can significantly enhance the user experience.

Although LLMs perform well in handling common enquiries, they still face many challenges in the more specialised fields of government policies and regulations. LLM training relies primarily on publicly available online content, and the lack of in-depth, professional knowledge can lead to inaccurate or contradictory responses. In certain scenarios, the models may even “hallucinate”, providing false information with total confidence. These issues can be difficult for the average person to detect without professional knowledge of the subject.

Moreover, government service policies are constantly changing. If LLMs fail to incorporate the latest policy updates, their answers could be at odds with current policies. It is therefore not enough to ensure the accuracy of answers from government service AI systems; boosting the systems’ interpretability is also essential, helping the public understand the basis for each answer and improving the systems’ transparency and credibility.

Application characteristics of RAG in government services

The RAG framework emerged to make better use of LLMs in government service scenarios. RAG optimises answers in two steps: first, it retrieves documents and text passages relevant to the user’s question; second, an LLM generates the reply based on these retrieved results. The introduction of RAG has resolved some existing issues in government services and has shown important advantages (Note 1).
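The two steps above can be sketched in a few lines of code. This is a minimal, illustrative toy, not the authors’ actual system: the corpus, the word-overlap retrieval scoring and the prompt template are all hypothetical simplifications, and the final call to an LLM is left out.

```python
import re

def _words(text):
    """Lower-case word set, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, documents, top_k=2):
    """Step 1: rank documents by word overlap with the question."""
    q = _words(question)
    ranked = sorted(documents, key=lambda d: len(q & _words(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, passages):
    """Step 2: assemble the retrieved passages into a grounded LLM prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the policy excerpts below, and cite the "
        "excerpt you relied on.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {question}"
    )

# Hypothetical two-document policy corpus for illustration.
corpus = [
    "Business registration must be renewed annually at the registry office.",
    "Parking permits are issued by the transport department within five days.",
]
question = "How do I renew my business registration?"
passages = retrieve(question, corpus, top_k=1)
prompt = build_prompt(question, passages)
```

A production system would replace the word-overlap ranking with embedding-based similarity search and pass `prompt` to an LLM, but the division of labour — retrieve first, then generate from the retrieved evidence — is the same.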

RAG drastically increases the accuracy of answers by integrating an external knowledge base that references the latest authoritative policies and regulations. RAG also supports more complex queries, as it can handle multi-level, multi-dimensional problems instead of relying solely on keyword matches. By combining retrieval results with generated responses, RAG not only improves the accuracy of its replies but also strengthens users’ trust in AI systems: people can see the justification for each answer, which reduces concerns caused by LLM hallucinations.

Furthermore, RAG technology greatly cuts costs and computing power consumption. Compared with Incremental Pre-training and Fine-tuning approaches, RAG does not need to retrain large models; instead, it improves the quality of its answers by supplementing them with retrieval results from external data sources. As a result, the RAG framework requires far less time, computing power and running cost than the Incremental Pre-training and Fine-tuning methods.

Foundational model, Fine-tuning model and RAG framework: Which one is best?

To further evaluate the RAG framework’s performance in a government legal service setting, the authors conducted a comparative analysis using semantic scores and factual-consistency scores. This was done to identify differences in response quality and accuracy between the Base model, the Fine-tuned model and the RAG framework.
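The two kinds of score can be illustrated with toy versions. These word-overlap metrics are stand-in assumptions for illustration only; the study itself would have used more sophisticated measures, and the reference answer and source passage below are invented examples.

```python
import re

def _words(text):
    """Lower-case word set, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def semantic_score(answer, reference):
    """Toy semantic score: Jaccard word overlap with a model answer."""
    a, r = _words(answer), _words(reference)
    return len(a & r) / len(a | r) if a | r else 0.0

def factual_consistency(answer, source_passages):
    """Toy factual-consistency score: fraction of answer words grounded
    in the retrieved source passages."""
    a = _words(answer)
    grounded = _words(" ".join(source_passages))
    return len(a & grounded) / len(a) if a else 0.0

# Invented example: a reference answer, a source passage, and a reply.
reference = "renew the registration annually at the registry office"
source = ["Business registration must be renewed annually at the registry office."]
answer = "renew it annually at the registry office"

sem = semantic_score(answer, reference)
fact = factual_consistency(answer, source)
```

Under metrics like these, a fine-tuned model that mimics the reference wording scores high semantically even when its content is ungrounded, while a RAG answer built from retrieved passages scores high on factual consistency — the pattern the study reports.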

Research results showed that the Fine-tuned model performed best on semantic scores, as its linguistic style and wording were closest to the model answers. However, it had a glaring problem: its hallucinations were more severe than the Base model’s, and its answers often included information inconsistent with the facts, affecting its reliability in practical applications.

By comparison, the Base model gave relatively concise answers with less information. It had limited ability to handle complicated questions and performed below par on high-precision tasks.

As for the RAG framework, it effectively reduced hallucinations: its responses contained noticeably fewer contradictions than those of the Fine-tuned model, and it provided more accurate information with strong factual consistency. While ensuring accuracy and consistency, the RAG framework can reduce hallucinations, making it especially suitable for complicated tasks that demand external knowledge support, such as consultations and analyses on government services.

In terms of cost, a comprehensive analysis of performance and resource consumption found that the Fine-tuned model’s cost was five times that of the RAG framework, whereas the RAG framework’s cost was similar to the Base model’s. On top of ensuring generation quality, the RAG framework also significantly cuts costs, making it an ideal choice for striking a balance between efficiency and cost. Crucially, data security is of critical importance to governments and enterprises. The RAG framework can be deployed fully privately, so sensitive internal data need not be uploaded to third-party platforms, effectively reducing the risk of data leakage (Note 2).

The future of government services: From automation to personalisation

At present, government service chatbots supported by RAG technology can effectively address shortfalls in conventional service delivery, such as outdated information, inefficient retrieval and reliance on manual intervention. As the technology advances, RAG systems are expected to extend into wider domains and provide more personalised government services, such as smart content recommendations and ‘digital human’ assistants. Not only will this help realise low-latency voice interaction, it can also customise the government service experience to each user’s needs, allowing for more personalised and targeted service.

Note 1: Special Report: “A New Chapter in Smart Governance: AI Empowering Innovation in Government Services”
https://fwik3jehaxr.feishu.cn/file/FFCjbsGLzoiHuOxKcTPcNqn1nMb

Note 2: Special Report: “A New Chapter in Smart Governance: AI Empowering Innovation in Government Services”
https://fwik3jehaxr.feishu.cn/file/FFCjbsGLzoiHuOxKcTPcNqn1nMb

Professor Hailiang Chen
Assistant Dean (Taught Postgraduate)
Director of AI Research Institute

Professor in Innovation and Information Management

Miao Yu
Research Postgraduate Student, HKU Business School

This article was also published on January 3, 2025 on the Financial Times’ Chinese website
