Caching not working for claude-haiku-4-5 #3396

Description

@R-Inbarasu

Describe the bug

Prompt caching does not seem to work when using the claude-haiku-4-5 model through Bedrock. I'm using the Converse API through the Aws::BedrockRuntime::Client module. My configuration appears to be correct: I can see cache write and read tokens when I switch the model to claude-sonnet-4-5.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

The first request should at least write to the cache, and subsequent requests should read from it.
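This expectation can be written as a small check over the `usage` blocks of two consecutive, identical requests. It is a minimal sketch, assuming each usage block is available as a Hash with the symbol keys that appear in the response in this report. `cache_behaves_as_expected?` is a hypothetical helper name, not part of the SDK.

```ruby
# Hypothetical helper (not part of the AWS SDK): returns true when the first
# request wrote to the cache and the follow-up request read from it.
# Both arguments are usage hashes with symbol keys.
def cache_behaves_as_expected?(first_usage, second_usage)
  wrote = first_usage.fetch(:cache_write_input_tokens, 0).to_i.positive?
  read  = second_usage.fetch(:cache_read_input_tokens, 0).to_i.positive?
  wrote && read
end
```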

Current Behavior

It does not read/write to cache. I've also tested with longer prompts and more tools.

Reproduction Steps

Example request and response (message content replaced):

Request:

{
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "text": "<my-content>"
        },
        {
          "cache_point": {
            "ttl": "5m",
            "type": "default"
          }
        }
      ]
    }
  ],
  "model_id": "arn:aws:bedrock:us-east-1:<aws-account-id>:inference-profile/global.anthropic.claude-haiku-4-5-20251001-v1:0",
  "tool_config": {
    "tools": [
      {
        "tool_spec": {
          "name": "tool_1",
          "description": "tool_1 description",
          "input_schema": {
            "json": {
              "type": "object",
              "required": [
                "query"
              ],
              "properties": {
                "limit": {
                  "type": "integer",
                  "description": "blah blah blah"
                },
                "query": {
                  "type": "string",
                  "description": "blah blah blah."
                }
              }
            }
          }
        }
      },
      {
        "tool_spec": {
          "name": "tool_2",
          "description": "tool_2 description",
          "input_schema": {
            "json": {
              "type": "object",
              "required": [
                "phone_number"
              ],
              "properties": {
                "phone_number": {
                  "type": "string",
                  "description": "blah blah blah."
                }
              }
            }
          }
        }
      },
      {
        "cache_point": {
          "ttl": "5m",
          "type": "default"
        }
      }
    ]
  },
  "inference_config": {}
}
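For reference, a request like the one above might be assembled in Ruby roughly as follows. This is a sketch rather than the exact code used in the application: `build_converse_params` is a hypothetical helper, the tool definition is abbreviated, and the `ttl` field is copied from the request above and may not be accepted by older SDK versions. The actual client call is left as a comment because it needs the aws-sdk-bedrockruntime gem and AWS credentials.

```ruby
# Hypothetical helper: builds Converse request parameters with two cache
# checkpoints, one after the user message text and one after the tools.
def build_converse_params(model_id:, user_text:, tool_specs:, ttl: "5m")
  cache_point = { cache_point: { type: "default", ttl: ttl } }

  {
    model_id: model_id,
    messages: [
      { role: "user", content: [{ text: user_text }, cache_point] }
    ],
    tool_config: {
      # Each tool is wrapped in a tool_spec entry; the checkpoint comes last.
      tools: tool_specs.map { |spec| { tool_spec: spec } } + [cache_point]
    }
  }
end

params = build_converse_params(
  model_id: "<inference-profile-arn>", # placeholder, as in the report
  user_text: "<my-content>",
  tool_specs: [
    {
      name: "tool_1",
      description: "tool_1 description",
      input_schema: {
        json: {
          "type" => "object",
          "required" => ["query"],
          "properties" => { "query" => { "type" => "string" } }
        }
      }
    }
  ]
)

# With the gem installed and credentials configured, the call would look
# roughly like this:
#   client = Aws::BedrockRuntime::Client.new(region: "us-east-1")
#   response = client.converse(params)
#   puts response.usage.cache_write_input_tokens
```

Building the parameters separately from the call makes it easy to confirm that both checkpoints are present before the request is sent.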

Response:

{
  "usage": {
    "input_tokens": 1084,
    "total_tokens": 1218,
    "output_tokens": 134,
    "cache_read_input_tokens": 0,
    "cache_write_input_tokens": 0
  },
  "output": {
    "message": {
      "role": "assistant",
      "content": [
        {
          "tool_use": {
            "name": "tool_1",
            "type": "tool_use",
            "input": {
              "limit": 30,
              "query": "<tool_input>"
            },
            "tool_use_id": "tooluse_WapxgbAhnK3tv4PsCPQHQl"
          }
        },
        {
          "tool_use": {
            "name": "tool_1",
            "type": "tool_use",
            "input": {
              "limit": 30,
              "query": "<tool_input>"
            },
            "tool_use_id": "tooluse_MgkFx5D7VycpozoL7dIf5U"
          }
        }
      ]
    }
  },
  "metrics": {
    "latency_ms": 1521
  },
  "stop_reason": "tool_use"
}

As you can see, the response reports no cache read or write tokens.
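One thing that may be worth ruling out is the minimum cacheable prompt length, which Anthropic's prompt caching documentation lists per model. The sketch below classifies a `usage` block like the one above against a caller-supplied threshold. `cache_status` is a hypothetical helper, and the thresholds 1024 and 4096 are illustrative assumptions only; the real minimum for each model should be taken from the provider documentation.

```ruby
# Hypothetical diagnostic (not part of the AWS SDK). `min_cacheable_tokens`
# is an assumption supplied by the caller, not a value from this report.
def cache_status(usage, min_cacheable_tokens:)
  read  = usage.fetch(:cache_read_input_tokens, 0).to_i
  write = usage.fetch(:cache_write_input_tokens, 0).to_i
  return :cache_read if read.positive?
  return :cache_write if write.positive?

  # Nothing was cached. A prompt shorter than the model's minimum
  # cacheable length is one possible reason.
  prompt_tokens = usage.fetch(:input_tokens, 0).to_i
  prompt_tokens < min_cacheable_tokens ? :below_minimum : :not_cached
end

# Usage block copied from the response above.
usage = {
  input_tokens: 1084,
  output_tokens: 134,
  total_tokens: 1218,
  cache_read_input_tokens: 0,
  cache_write_input_tokens: 0
}

puts cache_status(usage, min_cacheable_tokens: 1024) # prints "not_cached"
puts cache_status(usage, min_cacheable_tokens: 4096) # prints "below_minimum"
```

If the prompt turns out to be below the minimum for claude-haiku-4-5 but above it for claude-sonnet-4-5, that would be consistent with the difference described in this report. This is a possibility to check, not a confirmed cause.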

Possible Solution

No response

Additional Information/Context

No response

Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version

aws-sdk-bedrock (1.69.0), aws-sdk-bedrockruntime (1.81.0)

Environment details (Version of Ruby, OS environment)

Ruby 3.4.4 & Rails 8.1.1

Metadata

Labels: service-api (General API label for AWS Services.)
