From 864be0e5f1b48cd56fc9c5ce6e56a9e5b45e1712 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Olender?= <92638966+TC-MO@users.noreply.github.com>
Date: Tue, 26 May 2026 15:42:03 +0200
Subject: [PATCH 1/4] docs: explain how to export non-default datasets

---
 .../dataset_schema/multiple_datasets.mdx | 23 ++++++++++++++-----
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx b/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
index 670a13f32d..a1fb825895 100644
--- a/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
+++ b/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
@@ -103,17 +103,28 @@ echo $ACTOR_STORAGES_JSON | jq '.datasets.categories'
 ```
 
-## Configure the output schema
-
-### Storage tab
+## View and export datasets
 
 The **Storage** tab in the Actor run view displays all datasets defined by the Actor and used by the run (up to 10).
 
-The Storage tab shows data but doesn't surface it clearly to end users. To present datasets more clearly, define an [output schema](../../actor_definition/output_schema/index.md).
+To export a non-default dataset:
+
+1. Open the Actor run.
+1. Select the **Storage** tab.
+1. Select the dataset you want to export.
+1. Select **Export** and choose a format: JSON, CSV, XML, Excel, HTML, RSS, or JSONL.
+
+:::caution Run page Export button
+
+The **Export** button on the Run page exports only the `default` dataset. To export other datasets, use the **Storage** tab as described above.
 
-### Output schema
+:::
+
+To access non-default datasets programmatically, use the [Dataset API](/api/v2/dataset-items-get) with the dataset ID from `ACTOR_STORAGES_JSON` or the SDK reference returned by `openDataset` / `open_dataset`. See [Datasets](../../../../storage/dataset.md) for export formats and query parameters.
+
+## Configure the output schema
 
-Actors with output schemas can reference datasets through variables using aliases:
+The Storage tab shows data but doesn't surface it clearly to end users. To present datasets more prominently on the run page, define an [output schema](../../actor_definition/output_schema/index.md) that references each dataset by alias:
 
 ```json
 {

From fb662aaae2d96f69b3fdf5135eae0e14081d29c5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Olender?= <92638966+TC-MO@users.noreply.github.com>
Date: Tue, 26 May 2026 22:52:02 +0200
Subject: [PATCH 2/4] docs: link SDK references and refine multi-dataset page

---
 .../dataset_schema/multiple_datasets.mdx | 24 ++++++++++---------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx b/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
index a1fb825895..cdccb563e5 100644
--- a/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
+++ b/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
@@ -39,29 +39,31 @@ Provide schemas for individual datasets as file references or inline. Schemas fo
 
 The keys of the `datasets` object are aliases that refer to specific datasets. The previous example defines two datasets aliased as `default` and `categories`.
 
-:::info Alias versus named dataset
-
-Aliases and names are different. Named datasets have specific behavior on the Apify platform (the automatic data retention policy doesn't apply to them). Aliased datasets follow the data retention of their run. Aliases only have meaning within a specific run.
-
-:::
-
 Requirements:
 
 - The `datasets` object must contain the `default` alias
 - The `datasets` and `dataset` objects are mutually exclusive (use one or the other)
 
+:::info Alias versus named dataset
+
+On the Apify platform, aliases and names behave differently. Named datasets are persistent. The automatic data retention policy doesn't apply to them. Aliased datasets follow the data retention of their run, and aliases only have meaning within a specific run.
+
+Behavior differs when an SDK runs outside the platform. See the SDK notes below.
+
+:::
+
 See the full [Actor schema reference](../actor_json.md#reference).
 
 ## Access datasets in Actor code
 
-Access aliased datasets: using the Apify SDK, or reading the `ACTOR_STORAGES_JSON` environment variable directly.
+Access aliased datasets through the Apify SDK or by reading the `ACTOR_STORAGES_JSON` environment variable directly.
 
 ### Apify SDK
 
-In the JavaScript/TypeScript SDK `>=3.7.0`, use `openDataset` with `alias` option:
+In the JavaScript/TypeScript SDK `>=3.7.0`, use [`Actor.openDataset`](https://docs.apify.com/sdk/js/reference/class/Actor#openDataset) with the `alias` option:
 
 ```js
 const categoriesDataset = await Actor.openDataset({alias: 'categories'});
@@ -76,7 +78,7 @@ When the JavaScript SDK runs outside the Apify platform, aliases fall back to na
-In the Python SDK `>=3.3.0`, use `open_dataset` with `alias` parameter:
+In the Python SDK `>=3.3.0`, use [`Actor.open_dataset`](https://docs.apify.com/sdk/python/reference/class/Actor#open_dataset) with the `alias` parameter:
 
 ```py
 categories_dataset = await Actor.open_dataset(alias='categories')
@@ -120,9 +122,9 @@ The **Export** button on the Run page exports only the `default` dataset. To exp
 
 :::
 
-To access non-default datasets programmatically, use the [Dataset API](/api/v2/dataset-items-get) with the dataset ID from `ACTOR_STORAGES_JSON` or the SDK reference returned by `openDataset` / `open_dataset`. See [Datasets](../../../../storage/dataset.md) for export formats and query parameters.
+To export programmatically, call the [Dataset API](/api/v2/dataset-items-get) with the dataset ID from `ACTOR_STORAGES_JSON`, or use the SDK methods shown in [Access datasets in Actor code](#access-datasets-in-actor-code). See [Datasets](../../../../storage/dataset.md) for export formats and query parameters.
 
-## Configure the output schema
+## Surface datasets on the run page
 
 The Storage tab shows data but doesn't surface it clearly to end users. To present datasets more prominently on the run page, define an [output schema](../../actor_definition/output_schema/index.md) that references each dataset by alias:
 

From 44f3f9273a37fc13864fbc1b1f80532f700b906c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Olender?= <92638966+TC-MO@users.noreply.github.com>
Date: Tue, 26 May 2026 23:33:46 +0200
Subject: [PATCH 3/4] docs: align multi-dataset export procedure with Console UI

---
 .../dataset_schema/multiple_datasets.mdx | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx b/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
index cdccb563e5..9032fca5aa 100644
--- a/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
+++ b/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
@@ -111,10 +111,10 @@ The **Storage** tab in the Actor run view displays all datasets defined by the A
 
 To export a non-default dataset:
 
-1. Open the Actor run.
-1. Select the **Storage** tab.
-1. Select the dataset you want to export.
-1. Select **Export** and choose a format: JSON, CSV, XML, Excel, HTML, RSS, or JSONL.
+1. On the Actor run page, select the **Storage** tab.
+1. Open the **Dataset** dropdown and select the dataset you want to export.
+1. Under **Export dataset**, choose a format: JSON, CSV, XML, Excel, HTML Table, RSS, or JSONL.
+1. Select **Download**.
 
 :::caution Run page Export button
 
@@ -122,7 +122,12 @@ The **Export** button on the Run page exports only the `default` dataset. To exp
 
 :::
 
-To export programmatically, call the [Dataset API](/api/v2/dataset-items-get) with the dataset ID from `ACTOR_STORAGES_JSON`, or use the SDK methods shown in [Access datasets in Actor code](#access-datasets-in-actor-code). See [Datasets](../../../../storage/dataset.md) for export formats and query parameters.
+To export programmatically:
+
+- Call the [Dataset API](/api/v2/dataset-items-get) with the dataset ID from `ACTOR_STORAGES_JSON`. The API returns items in any supported format via query parameters.
+- From inside an Actor, open the dataset (see [Access datasets in Actor code](#access-datasets-in-actor-code)), then call `getData` / `get_data` to read items into memory, or `exportTo` / `export_to` to write a JSON or CSV file to the key-value store.
+
+See [Datasets](../../../../storage/dataset.md) for formats and query parameters.
 
 ## Surface datasets on the run page
 

From f85cf2a51c1fb54d5f86606ed3b982af588e3e45 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Micha=C5=82=20Olender?= <92638966+TC-MO@users.noreply.github.com>
Date: Wed, 27 May 2026 00:07:38 +0200
Subject: [PATCH 4/4] docs: trim redundant sentence from Run page Export caution

---
 .../actor_definition/dataset_schema/multiple_datasets.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx b/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
index 9032fca5aa..4396ccd437 100644
--- a/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
+++ b/sources/platform/actors/development/actor_definition/dataset_schema/multiple_datasets.mdx
@@ -118,7 +118,7 @@ To export a non-default dataset:
 
 :::caution Run page Export button
 
-The **Export** button on the Run page exports only the `default` dataset. To export other datasets, use the **Storage** tab as described above.
+The **Export** button on the Run page exports only the `default` dataset.
 
 :::
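
Reviewer note on the programmatic export path added in this series (Dataset API called with the dataset ID taken from `ACTOR_STORAGES_JSON`): a minimal sketch of how that lookup might work. It assumes the variable holds a JSON object whose `datasets` key maps aliases to dataset IDs, as the `jq '.datasets.categories'` example in the page suggests. The helper name `dataset_items_url`, the sample IDs, and the `API_BASE` constant are hypothetical and not part of the patch.

```python
import json
import os
from urllib.parse import urlencode

# Assumed public API base; adjust if your deployment differs.
API_BASE = "https://api.apify.com/v2"


def dataset_items_url(alias, fmt="json", storages_json=None):
    """Build a dataset-items URL for an aliased dataset (hypothetical helper).

    `storages_json` defaults to the ACTOR_STORAGES_JSON environment variable,
    assumed to look like {"datasets": {"<alias>": "<dataset ID>", ...}}.
    """
    raw = storages_json if storages_json is not None else os.environ["ACTOR_STORAGES_JSON"]
    storages = json.loads(raw)
    dataset_id = storages["datasets"][alias]  # KeyError if the alias is not defined
    query = urlencode({"format": fmt})  # export format is passed as a query parameter
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


if __name__ == "__main__":
    # Made-up IDs, used only to show the shape of the result.
    sample = json.dumps({"datasets": {"default": "abc123", "categories": "def456"}})
    print(dataset_items_url("categories", "csv", sample))
    # prints https://api.apify.com/v2/datasets/def456/items?format=csv
```

The sketch only builds the URL. Fetching it, and any authentication that a non-public dataset requires, is left to the caller.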