Could not reproduce the reported result with official models #7

@zyzhang1130


Hi, can you confirm that the models available at https://huggingface.co/shenwzh3/alpha-umi-planner-13b, etc. are the same models used to produce the results in your paper? I ran the examples given in Fig. 6, 7 and Table 5, 6 with greedy decoding, but both the 13b and 7b models give quite bad results, unlike the reported ones. I am not sure what is going on.

Example:

User: I want to explore the catalog of products available in the ’Electronics’ category. Please provide me with the list of products along with their details.

planner_agent: Based on the previous action, it seems that the function "catalog_details_for_product_catalog" is not valid. To explore the catalog of products available in the 'Electronics' category, I can try using the function "catalog_details_for_ " +
    "category_name = 'Electronics' +
    "limit = 10 +
    "fields = 'product_name, product_details, product_price, product_description, product_image, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image

caller_agent: I will call the function 'catalog_details_for_' next step.
Action: catalog_details_for_
Action Input: {
  "category_name": "Electronics",
  "limit": 10,
  "fields": "product_name, product_details, product_price, product_description, product_image, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_image_url, product_

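Both outputs above end in a long run of the same field name until the generation is cut off. A minimal sketch for quantifying that kind of loop when comparing runs, assuming the output is a comma-separated list; the helper names `longest_run` and `looks_degenerate` and the threshold of 5 are hypothetical choices for illustration, not part of the repository:

```python
from itertools import groupby


def longest_run(text, sep=","):
    """Return (item, length) for the longest run of identical adjacent items."""
    # Split on the separator and drop empty fragments and surrounding spaces.
    items = [part.strip() for part in text.split(sep) if part.strip()]
    best_item, best_len = "", 0
    # groupby groups consecutive equal items, which is exactly a "run".
    for item, group in groupby(items):
        run_len = sum(1 for _ in group)
        if run_len > best_len:
            best_item, best_len = item, run_len
    return best_item, best_len


def looks_degenerate(text, sep=",", threshold=5):
    """Flag text whose longest run reaches the (assumed) threshold."""
    return longest_run(text, sep)[1] >= threshold


# Shortened stand-in for the "fields" value shown in the example above.
sample = "product_name, product_details, product_price, " + ", ".join(
    ["product_image_url"] * 12
)
print(longest_run(sample))
print(looks_degenerate(sample))
```

Running this over the planner and caller outputs for each checkpoint would give a rough count of how many generations collapse into repetition, which may be easier to compare than reading the raw text.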