From 126c00d78e042218b986174b00492c331700ad3e Mon Sep 17 00:00:00 2001
From: charliesheh <59520443+charliesheh@users.noreply.github.com>
Date: Sun, 17 May 2026 15:45:11 -0700
Subject: [PATCH] Add ImageContent documentation page

---
 docs-website/docs/concepts/data-classes.mdx   |   6 +
 .../concepts/data-classes/imagecontent.mdx    | 213 ++++++++++++++++++
 2 files changed, 219 insertions(+)
 create mode 100644 docs-website/docs/concepts/data-classes/imagecontent.mdx

diff --git a/docs-website/docs/concepts/data-classes.mdx b/docs-website/docs/concepts/data-classes.mdx
index 91efbc3674..e07ec4b6f2 100644
--- a/docs-website/docs/concepts/data-classes.mdx
+++ b/docs-website/docs/concepts/data-classes.mdx
@@ -120,6 +120,12 @@ image = ByteStream.from_file_path("dog.jpg")
 
 Read the detailed documentation for the `ChatMessage` data class on a dedicated [ChatMessage](data-classes/chatmessage.mdx) page.
 
+### ImageContent
+
+`ImageContent` represents image-based content used in multimodal chat messages and vision-language pipelines.
+
+Read the detailed documentation for the `ImageContent` data class on a dedicated [ImageContent](data-classes/imagecontent.mdx) page.
+
 ### Document
 
 #### Overview
diff --git a/docs-website/docs/concepts/data-classes/imagecontent.mdx b/docs-website/docs/concepts/data-classes/imagecontent.mdx
new file mode 100644
index 0000000000..6c2302751e
--- /dev/null
+++ b/docs-website/docs/concepts/data-classes/imagecontent.mdx
@@ -0,0 +1,213 @@
+---
+
+title: "ImageContent"
+id: imagecontent
+slug: "/imagecontent"
+description: "`ImageContent` represents image-based content in Haystack chat messages and multimodal pipelines."
+---
+
+# ImageContent
+
+`ImageContent` is a Haystack data class used to represent image-based content in chat messages and multimodal AI pipelines.
+
+It is commonly used with:
+
+* multimodal LLMs
+* vision-language models
+* image-aware chat applications
+* document/image processing workflows
+
+`ImageContent` stores images as base64-encoded strings together with metadata such as MIME type and image detail level.
+
+If you are looking for the full API reference, see the [API documentation](/reference/data-classes-api#imagecontent).
+
+---
+
+# Creating ImageContent
+
+You can create an `ImageContent` object directly from a base64 string:
+
+```python
+from haystack.dataclasses import ImageContent
+
+image = ImageContent(
+    base64_image="your_base64_encoded_image",  # placeholder: replace with real base64 data
+    mime_type="image/png"
+)
+
+print(image)
+```
+
+---
+
+# Loading Images from a File Path
+
+The `from_file_path()` class method provides a convenient way to load local image files.
+
+```python
+from haystack.dataclasses import ImageContent
+
+image = ImageContent.from_file_path(
+    "sample.png",
+    detail="low"
+)
+
+print(image)
+```
+
+The optional `detail` parameter is currently supported by OpenAI vision models and accepts:
+
+* `"auto"`
+* `"high"`
+* `"low"`
+
+You can also resize images while loading:
+
+```python
+image = ImageContent.from_file_path(
+    "sample.png",
+    size=(512, 512)
+)
+```
+
+This helps reduce:
+
+* memory usage
+* processing time
+* payload size
+
+when working with multimodal LLM APIs.
+
+---
+
+# Loading Images from a URL
+
+You can also create an `ImageContent` object directly from an image URL:
+
+```python
+from haystack.dataclasses import ImageContent
+
+image = ImageContent.from_url(
+    "https://images.unsplash.com/photo-1546182990-dffeafbe841d",
+    detail="low"
+)
+
+print(image)
+```
+
+Internally, Haystack downloads the image and converts it into a base64 representation.
+
+---
+
+# Using ImageContent with ChatMessage
+
+`ImageContent` is commonly used together with `ChatMessage` for multimodal conversations.
+
+```python
+from haystack.dataclasses import ChatMessage, ImageContent
+
+image = ImageContent.from_url(
+    "https://images.unsplash.com/photo-1546182990-dffeafbe841d",
+    detail="low"
+)
+
+message = ChatMessage.from_user(
+    content_parts=[
+        "What does this image show?",
+        image
+    ]
+)
+
+print(message)
+```
+
+This allows multimodal LLMs to process both:
+
+* textual prompts
+* image inputs
+
+within the same message.
+
+---
+
+# Metadata
+
+The optional `meta` parameter allows you to attach custom metadata to the image.
+
+```python
+image = ImageContent.from_url(
+    "https://example.com/image.png",
+    meta={"source": "example-dataset"}
+)
+```
+
+This can be useful for:
+
+* tracing
+* dataset tracking
+* workflow metadata
+* custom application logic
+
+---
+
+# Validation
+
+By default, `ImageContent` performs these checks when it is created:
+
+* the base64 string is valid
+* the MIME type is guessed if `mime_type` is not provided
+* the MIME type is a valid image MIME type
+
+Validation can be disabled to speed up initialization:
+
+```python
+image = ImageContent(
+    base64_image="your_base64_encoded_image",
+    mime_type="image/png",
+    validation=False
+)
+```
+
+---
+
+# Serialization
+
+`ImageContent` supports dictionary serialization.
+
+```python
+image_dict = image.to_dict()
+
+restored_image = ImageContent.from_dict(image_dict)
+```
+
+---
+
+# Displaying Images
+
+The `show()` method can display images directly in:
+
+* Jupyter notebooks
+* local desktop environments
+
+```python
+image.show()
+```
+
+This requires the `Pillow` package:
+
+```bash
+pip install pillow
+```
+
+---
+
+# Related Components
+
+`ImageContent` is frequently used with:
+
+* `ChatMessage`
+* multimodal chat generators
+* image converters
+* vision-language pipelines
+* `PDFToImageContent`
+* `ImageFileToImageContent`
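
The page's constructor examples pass `base64_image` as a literal placeholder string. The sketch below uses only the Python standard library to show how such a value could be produced from raw bytes, and what a strict base64 check looks like. The helper names `encode_image_bytes` and `is_strict_base64` are hypothetical. That Haystack's validation behaves like strict decoding is an assumption, not something this patch confirms.

```python
import base64
import binascii
import mimetypes

# The 8-byte PNG file signature, used here only as stand-in image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def encode_image_bytes(data: bytes) -> str:
    """Return the base64 text form of raw image bytes."""
    return base64.b64encode(data).decode("ascii")


def is_strict_base64(text: str) -> bool:
    """Hypothetical stand-in for a base64 check: strict decoding must succeed."""
    try:
        base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


encoded = encode_image_bytes(PNG_SIGNATURE)

# Guess the MIME type from a file name; "sample.png" is an illustrative name.
mime_type, _ = mimetypes.guess_type("sample.png")

print(encoded, mime_type)
print(is_strict_base64(encoded))
print(is_strict_base64("your_base64_encoded_image"))
```

Under that assumption, the placeholder string `"your_base64_encoded_image"` would be rejected while validation is enabled, because `_` is outside the standard base64 alphabet. A string produced the way `encoded` is would pass the base64 check.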