Abstract
This paper investigates the application of artificial intelligence (AI) language models to the writing of policy briefing notes in public administration, comparing their performance with the traditional reliance on human expertise. Briefing notes are pivotal in informing government decision-making and must therefore meet high standards of accuracy, clarity, and issue relevance. Given the increasing integration of AI across sectors, this study evaluates the effectiveness and acceptability of AI-generated policy briefing notes. Using a structured expert-evaluation methodology, the research scrutinizes and compares the output of three leading AI language models (OpenAI’s ChatGPT, Google’s Gemini, and Mistral’s Le Chat) across ten critical dimensions of policy briefing note quality. These dimensions encompass both structural and content-related aspects, ranging from linguistic precision to the depth and applicability of the generated information. The discussion is anchored in the technology acceptance model (TAM) and its extensions, which offer a theoretical framework for understanding the factors that influence the adoption and usefulness of technology in public administration. Our comparative analysis reveals that while the AI models meet several structural and linguistic benchmarks, they fail to adequately capture the nuance and depth that policy experts require for informed decision-making. This discrepancy underscores the enduring value of human expertise in synthesizing complex information and navigating ethical considerations, even as AI improves the efficiency of certain aspects of drafting policy briefing notes.