Prompting Experiments

This document outlines the iterative process of developing and refining the prompt that converts users' natural language requests into JSON instructions for a robot. Our goal was precise waypoint matching and robust error handling across a range of user scenarios.

Overview

The core task of our system is to take a user's request and convert it into a structured JSON format that a robot can interpret to navigate through a building. The robot has access to a list of pre-defined waypoints, each of which may include a set of keywords to assist in matching as well as x and y coordinates. The JSON must follow strict formatting rules, and the system should provide a clear error message when a waypoint is not recognized.
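
To make the setup concrete, here is a minimal sketch of the data involved, assuming the waypoints are represented as Python dictionaries; the names, keywords, and coordinates below are illustrative, not the actual E7 map data:

# Illustrative waypoint data (assumed structure, not the production E7 map).
waypoints_list = [
    {"name": "Room 1427 - Ideas Clinic", "keywords": ["ideas", "clinic", "room"], "x": 25, "y": 75},
    {"name": "RoboHub Entrance", "keywords": ["robohub", "entrance"], "x": 25, "y": 50},
    {"name": "C&D - Coffee Area", "keywords": ["c&d", "coffee"], "x": 100, "y": 100},
]

# For a request such as "Go to the E7 ideas clinic", the system should emit:
expected_output = {
    "actions": [
        {"action": "navigate", "input": {"name": "Room 1427 - Ideas Clinic", "x": 25, "y": 75}}
    ]
}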

Iterative Process: E7 Prompt

1. Initial Prompt Design

We began by writing a simple prompt that outlined:

  • The robot’s role.
  • The required JSON output format.
  • A list of available waypoints and functions.
  • Basic instructions for matching user requests to waypoints.

Early Example:

  • The prompt included examples like:
    • “Can you pick something up from the C and D areas and drop it off at the RoboHub?”
    • “Can you ask Zach for the keys? … drop them off at the Ideas Clinic.”
  • Initial instructions assumed that the model could directly interpret the input without much extra logic (a rough skeleton of this first prompt is sketched below).
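
For illustration, the first iteration followed roughly this skeleton; this is a loose reconstruction of its shape, not the exact original wording:

# A loose reconstruction of the first-iteration prompt structure, shown only to
# illustrate its shape; the exact early wording is not preserved in this document.
initial_prompt = """
# Role
You control a robot that navigates a building using JSON instructions.

# JSON Output Format
Return an "actions" list; each action is "navigate", "speak", or "wait".

# Context
- Available Waypoints: {waypoints_list}
- Available Functions: navigate, speak, wait

# Instructions
Match the user's request to one of the available waypoints and output valid JSON.

Prompt: {query}
"""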

2. Fuzzy Matching Approach

We then experimented with instructions for fuzzy matching:

  • Goal: Allow variations in user input (e.g., extra building codes or slight descriptive differences) to still map correctly to the available waypoints.
  • Techniques Included (sketched in code after this list):
    • Converting input to lowercase.
    • Removing extraneous tokens (e.g., building codes like “E7”).
    • Tokenizing and checking for token overlap.
  • Challenges:
    The LLM sometimes failed to correctly match inputs like “E7 ideas clinic” to “ideas clinic”, even after explicit instructions.
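
Although these steps were given to the model as natural-language instructions rather than executed as code, they correspond roughly to the following sketch; the EXTRANEOUS_TOKENS set and the helper name are assumptions made for illustration:

import re

# Hypothetical helper mirroring the fuzzy-matching steps described above;
# EXTRANEOUS_TOKENS is an assumed stand-in for building codes such as "E7".
EXTRANEOUS_TOKENS = {"e7"}

def fuzzy_match(user_text, waypoint_names):
    tokens = [t for t in re.findall(r"[a-z0-9&]+", user_text.lower())
              if t not in EXTRANEOUS_TOKENS]
    best_name, best_overlap = None, 0
    for name in waypoint_names:
        overlap = len(set(re.findall(r"[a-z0-9&]+", name.lower())) & set(tokens))
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name  # None if no token overlaps at all

# fuzzy_match("E7 ideas clinic", ["Room 1427 - Ideas Clinic", "RoboHub Entrance"])
# -> "Room 1427 - Ideas Clinic"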

3. Relying on LLM’s Own Judgment

We then attempted to reduce the explicit matching rules and instead rely on the model’s internal judgment:

  • Approach:
    Simply instruct the model to trust its natural language understanding when matching the user input to available waypoints.
  • Outcome:
    This performed even worse, with inconsistent matching behavior.

4. Strict Matching with Exact Essential Name Requirement

Given the issues with fuzzy matching and self-judgment, we reverted to a more structured approach:

  • Strict Guidelines:
    The prompt explicitly stated that:
    • Any extra tokens (such as building codes) should be ignored.
    • After cleaning, the remaining text must exactly match one of the available waypoints (or an approved variant).
    • No approximate or fuzzy matching was allowed.
  • Result:
    This approach was more consistent, but it lacked the flexibility to handle variations such as "ideas room" vs. "ideas clinic" (see the sketch after this list).
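
Expressed as code, the strict rule reduces to an exact comparison after cleaning. The instructions were again given to the model in prose, so this is only an illustrative equivalent, and the APPROVED_VARIANTS table is an assumption:

# Illustrative equivalent of the strict rule; APPROVED_VARIANTS shows how a
# small, hand-maintained variant table would look (assumed, not real data).
EXTRANEOUS_TOKENS = {"e7"}
APPROVED_VARIANTS = {"robo hub entrance": "robohub entrance"}  # hypothetical

def strict_match(user_text, waypoint_names):
    cleaned = " ".join(t for t in user_text.lower().split()
                       if t not in EXTRANEOUS_TOKENS)
    cleaned = APPROVED_VARIANTS.get(cleaned, cleaned)
    for name in waypoint_names:
        if cleaned == name.lower():
            return name  # exact match only; no fuzzy or partial matching
    return None  # triggers the JSON error response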

5. Incorporating Waypoint Keywords

To strike a balance between strict matching and flexibility, we incorporated each waypoint’s corresponding keywords into the matching process:

  • New Instructions:

    • Each waypoint now has an associated set of keywords (e.g., for "ideas clinic", keywords might include ["ideas", "clinic", "room"]).
    • After removing extra tokens, the system checks whether the input contains at least one of that waypoint's keywords (this matching logic is sketched after this list).
    • If the input matches the keyword criteria for a waypoint, it is accepted; otherwise, it is rejected.
  • Advantages:
    This method allowed for acceptable variants (e.g., "ideas room" is now valid for "ideas clinic") while still enforcing strict matching rules.

  • Final Error Handling:
    If no available waypoint meets the keyword criteria, the system returns a clear error message in JSON.
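
As before, the rule lives in the prompt text, but its intent maps onto roughly this logic; the keyword sets shown are illustrative assumptions:

# Rough code equivalent of the keyword rule; the keyword sets are illustrative,
# not the production waypoint data.
EXTRANEOUS_TOKENS = {"e7"}
WAYPOINT_KEYWORDS = {
    "Room 1427 - Ideas Clinic": {"ideas", "clinic", "room"},
    "RoboHub Entrance": {"robohub", "entrance"},
}
ERROR_RESPONSE = {
    "actions": [{"action": "speak", "input": "This waypoint does not exist"}]
}

def match_waypoint(user_text):
    tokens = {t for t in user_text.lower().split() if t not in EXTRANEOUS_TOKENS}
    hits = [name for name, kws in WAYPOINT_KEYWORDS.items() if tokens & kws]
    # Accept only when exactly one waypoint's keywords are hit; no hit or an
    # ambiguous hit across several waypoints falls back to the JSON error.
    return hits[0] if len(hits) == 1 else None

# match_waypoint("ideas room")   -> "Room 1427 - Ideas Clinic"
# match_waypoint("unknown zone") -> None (robot should speak ERROR_RESPONSE)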

Testing and Evaluation

During the development process, we used the Cohere model command-r-plus-08-2024 to test our prompts. We took a sample of ten user queries from the E7 dataset and evaluated:

  • The outputs generated by the model.
  • The error messages when waypoint matching failed.
  • The overall consistency of the JSON format and handling of edge cases.

Based on these evaluations, we iteratively refined the prompt to better match our desired outputs.
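
In code, the evaluation loop amounted to roughly the sketch below. It assumes the Cohere Python SDK's chat endpoint (exact client signatures differ between SDK versions) and uses placeholder waypoint data and queries; the real run used the full E7 map and the ten sampled queries:

import json
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# Placeholder waypoint data and queries for illustration only.
waypoints = [{"name": "RoboHub Entrance", "keywords": ["robohub", "entrance"], "x": 25, "y": 50}]
sample_queries = ["Can you pick something up from the C and D areas and drop it off at the RoboHub?"]

for query in sample_queries:
    prompt = e7_xrif_with_actions_6.format(  # final template, defined in the next section
        waypoints_list=json.dumps(waypoints), query=query)
    response = co.chat(model="command-r-plus-08-2024", message=prompt)
    print(response.text)  # inspected by hand for valid JSON, matching, and error handling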

MVP (Minimum Viable Prompt) for the E7 test case

Final prompt code:
e7_xrif_with_actions_6 = """
# Role
You control a robot that navigates through a building using a JSON instruction format. You have access to several pre-defined waypoints and functions. Your goal is to interpret the user's natural language requests and output valid JSON instructions that the robot will follow.

# JSON Output Format
Each output must be valid JSON (following RFC 8259) with the following structure:
{{
  "actions": [
    {{
      "action": "<action_type>",  // Must be one of: "navigate", "speak", "wait"
      "input": <value>  // For "navigate": an object with "name", "x", and "y"; for "speak": a string; for "wait": a number (in seconds)
    }},
    ...
  ]
}}
Ensure there are no trailing commas and all JSON rules are followed.

# Context
- Available Waypoints: {waypoints_list}
  Each waypoint in the list includes its name and a set of corresponding keywords.
- Available Functions: Navigate, speak, and wait.

# Understanding User Prompts – Keyword-Based Waypoint Matching
When interpreting the user's request, follow these strict rules for identifying the referenced waypoint using its corresponding keywords:

1. **Ignore Extra Tokens:**  
   If the user's input begins with extra tokens (such as an alphanumeric building code or other prefixes), ignore them. For example, treat "E7 ideas clinic" as "ideas clinic".

2. **Keyword Matching:**  
   Each available waypoint comes with a set of keywords. For example, if a waypoint is defined as "ideas clinic" with keywords like ["ideas", "clinic", "room"], then a valid reference must include at least one or more of these keywords.
   - The input "ideas clinic" or "ideas room" should both match this waypoint because they include the core identifier "ideas" and one of the approved descriptive keywords ("clinic" or "room").

3. **Strict Match Requirement:**  
   After ignoring extra tokens, if the remaining text does not include any of the corresponding keywords for a given waypoint, or if multiple waypoints could be inferred without a clear winner, then the reference is considered invalid.
   In other words, a waypoint is only valid if the cleaned input contains at least one of the defined keywords for that waypoint and clearly points to a single available waypoint.

4. **Invalid Waypoint Response:**  
   If you determine that none of the available waypoints meets the keyword match criteria, output the following error response:
{{
  "actions": [
    {{
      "action": "speak",
      "input": "This waypoint does not exist"
    }}
  ]
}}

# Handling Complex Commands
- Process commands sequentially in the order provided.
- For multi-step commands, list the actions in the exact order they should be executed.
- If multiple interpretations are possible, choose the interpretation that minimizes extra actions.

# Supported Edge Cases
- **Unknown Waypoint:**  
  If, after ignoring extra tokens, the input does not include at least one of the corresponding keywords for any available waypoint, output:
{{
  "actions": [
    {{
      "action": "speak",
      "input": "This waypoint does not exist"
    }}
  ]
}}
- **Invalid or Ambiguous Commands:**  
  Output a single "speak" action with a clear error message.
- **Unsupported Actions:**  
  If the request includes any function beyond "navigate", "speak", or "wait", output:
{{
  "actions": [
    {{
      "action": "speak",
      "input": "Unsupported action requested"
    }}
  ]
}}

# Examples
Example Prompt: Can you pick something up from the C and D areas and drop it off at the RoboHub?
Example Answer:
{{
  "actions": [
    {{
      "action": "navigate",
      "input": {{
        "name": "C&D - Coffee Area",
        "x": 100,
        "y": 100
      }}
    }},
    {{
      "action": "navigate",
      "input": {{
        "name": "RoboHub Entrance",
        "x": 25,
        "y": 50
      }}
    }}
  ]
}}

Example Prompt: Can you ask Zach for the keys? He is at the RoboHub. After receiving the keys, drop them off at the Ideas Clinic.
Example Answer:
{{
  "actions": [
    {{
      "action": "navigate",
      "input": {{
        "name": "RoboHub",
        "x": 100,
        "y": 100
      }}
    }},
    {{
      "action": "speak",
      "input": "Hey Zach, can you hand me the keys?"
    }},
    {{
      "action": "navigate",
      "input": {{
        "name": "Room 1427 - Ideas Clinic",
        "x": 25,
        "y": 75
      }}
    }}
  ]
}}

Example Prompt: Please go to the unknown zone and then to the RoboHub.
Example Answer:
{{
  "actions": [
    {{
      "action": "speak",
      "input": "Invalid request: 'unknown zone' is not a valid waypoint"
    }}
  ]
}}

Prompt: {query}
"""