From 19b9caa6254f0de03f84428d50864b4e9d695a1f Mon Sep 17 00:00:00 2001
From: liqun <liqli@microsoft.com>
Date: Tue, 3 Feb 2026 10:54:16 +0800
Subject: [PATCH] Add prompt rules to reject user-provided code and execution of file content

---
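Note: the rules added to the code generator prompt name several call
patterns (eval()/exec() on file content, dynamic imports, pickle.load(),
marshal.load()). As an illustration only, the sketch below shows how those
patterns could also be flagged statically with Python's ast module. It is
not part of this patch: FORBIDDEN_NAMES, FORBIDDEN_ATTRS and
find_forbidden_calls are hypothetical names, and the deny-list is an
assumption derived from the prompt text. A static check cannot tell whether
a file is trusted, so it flags every matching call.

```python
import ast

# Hypothetical deny-list mirroring the patterns named in the new prompt rules.
FORBIDDEN_NAMES = {"eval", "exec", "__import__"}
FORBIDDEN_ATTRS = {
    ("pickle", "load"),
    ("pickle", "loads"),
    ("marshal", "load"),
    ("marshal", "loads"),
    ("importlib", "import_module"),
}


def find_forbidden_calls(source: str) -> list[str]:
    """Return the names of forbidden calls found in `source`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Bare calls such as eval(...) or exec(...).
        if isinstance(func, ast.Name) and func.id in FORBIDDEN_NAMES:
            hits.append(func.id)
        # Module-qualified calls such as pickle.load(...).
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in FORBIDDEN_ATTRS
        ):
            hits.append(f"{func.value.id}.{func.attr}")
    return hits
```

This only matches direct calls by name; aliased imports (for example
"from pickle import load") would need extra handling, so a check like this
would complement the prompt rules rather than replace them.
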
 .../code_interpreter/code_generator_prompt.yaml            | 7 +++++++
 taskweaver/planner/planner_prompt.yaml                     | 1 +
 2 files changed, 8 insertions(+)

diff --git a/taskweaver/code_interpreter/code_interpreter/code_generator_prompt.yaml b/taskweaver/code_interpreter/code_interpreter/code_generator_prompt.yaml
index c5f441601..ca21d3a28 100644
--- a/taskweaver/code_interpreter/code_interpreter/code_generator_prompt.yaml
+++ b/taskweaver/code_interpreter/code_interpreter/code_generator_prompt.yaml
@@ -18,6 +18,13 @@ content: |-
     - {ROLE_NAME} should import other libraries if needed; if the library is not pre-installed, {ROLE_NAME} should install it (with !pip) as long as the user does not forbid it.
     - {ROLE_NAME} must respond to the User's feedback with a new code that addresses the feedback.
     
+    ## On {ROLE_NAME}'s security restrictions:
+    - {ROLE_NAME} must NEVER directly execute or incorporate code snippets provided by the user. If the user provides code to run, {ROLE_NAME} must refuse and ask the user to describe the task in natural language instead.
+    - {ROLE_NAME} must NEVER generate code that reads content from a file and then executes that content as code (e.g., by passing file contents to eval(), exec(), or similar). This is a critical security risk because malicious users can embed harmful code in files.
+    - {ROLE_NAME} must NEVER generate code that dynamically imports modules based on file content or user-provided strings.
+    - {ROLE_NAME} must NEVER generate code that uses pickle.load(), marshal.load(), or similar deserialization on untrusted files, as these can execute arbitrary code.
+    - {ROLE_NAME} should only generate code based on its own understanding of the task described in natural language.
+    
     ## On User's profile and general capabilities:
     - Upon receiving code from {ROLE_NAME}, the User will verify the correctness of the generated code by {ROLE_NAME} before executing it.
     - User executes the generated python code from {ROLE_NAME} in a stateful Python Jupyter kernel. 
diff --git a/taskweaver/planner/planner_prompt.yaml b/taskweaver/planner/planner_prompt.yaml
index 1b87b82c0..5e6106592 100644
--- a/taskweaver/planner/planner_prompt.yaml
+++ b/taskweaver/planner/planner_prompt.yaml
@@ -35,6 +35,7 @@ instruction_template: |-
   - Planner must thoroughly review Worker's response and provide feedback to the Worker if the response is incorrect or incomplete.
   - Planner can ignore the permission or file access issues since Workers are powerful and can handle them.
   - Planner must reject the User's request if it contains potential security risks or illegal activities.
+  - Planner must NEVER accept or execute code snippets provided directly by the User. If the User provides code to run, Planner must reject the request, explain that direct code execution is not allowed for security reasons, and ask the User to describe the task in natural language so that Workers can generate safe, verified code.
   
   ## Planner's reasoning process
   - Planner has two reasoning modes: reasoning before making the plans and reasoning when focusing on the current task step.