
[FSTORE-2021] Add support for SAP Hana as a Data Source#571

Open
jimdowling wants to merge 3 commits into logicalclocks:main from jimdowling:sap_hana

Conversation

@jimdowling
Contributor

Summary

  • Document SAP HANA data source.
  • Document default SAP HANA port as 39015.

Test plan

  • Build passes
  • Manual test: create SAP HANA storage connector via UI
  • Manual test: read external feature group sourced from SAP HANA
  • Manual test: DLT ingestion from SAP HANA into managed FG

🤖 Generated with Claude Code

jimdowling and others added 3 commits May 4, 2026 11:04
Adds a SAP HANA creation page modeled on the Snowflake guide,
registers it under Cloud Agnostic in the data source index, links it
into mkdocs.yml under Configuration and Creation, and extends the
dltHub ingestion page's supported-source list to mention SAP HANA in
the SQL family.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Match the new Java/Python/UI default. 39015 is the SQL port for the
first tenant DB on a default multi-tenant or HANA Express install
(instance 90); 30015 is documented as the alternative for a non-tenant
single-host install (instance 00).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a Type mapping table that lists each source HANA type and the Hopsworks
offline feature type it now maps to, so users can predict the shape of their
feature group up front (DECIMAL(p,s) preserves precision/scale; SMALLINT and
TINYINT stay distinct from INT; BOOLEAN and REAL are mapped; etc.).

Add a Known limitations section calling out the two practical traps
discovered while bringing up the integration:

- Tables under the SYSTEM schema do not reflect cleanly through the
  sqlalchemy-hana DLT path. Recommend creating a regular user schema
  (HOPSDEMO etc.) and renaming source tables there.
- DLT online ingestion validates non-null primary keys. If the source
  can hold NULLs in the chosen PK column, filter them out, pick a
  different PK, or disable online serving for the feature group.

Move the existing Authentication admonition from the prerequisites
list into Known limitations so all caveats live together.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
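The non-null primary-key trap above can be worked around by filtering at the source. A minimal sketch of building such a query (all identifiers here are placeholders, not names from the PR):

```python
def non_null_pk_query(schema, table, pk):
    """SELECT that drops rows whose chosen primary-key column is NULL,
    so DLT online ingestion's non-null PK validation passes.
    Schema/table/column names are hypothetical placeholders."""
    return (f'SELECT * FROM "{schema}"."{table}" '
            f'WHERE "{pk}" IS NOT NULL')

# e.g. non_null_pk_query("HOPSDEMO", "ORDERS", "ORDER_ID")
```

Alternatively, as the commit notes, pick a different PK column or disable online serving for the feature group.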

Copilot AI left a comment


Pull request overview

This PR extends the Hopsworks documentation site with a new SAP HANA Data Source guide and wires it into the existing navigation and ingestion-source documentation.

Changes:

  • Added a new “SAP HANA” Data Source creation guide.
  • Updated Data Source index and mkdocs navigation to include the new SAP HANA page.
  • Updated the dltHub ingestion guide to list SAP HANA among supported SQL-family sources.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| mkdocs.yml | Adds the SAP HANA page to the Data Source creation nav. |
| docs/user_guides/fs/feature_group/ingest_with_dlthub.md | Documents SAP HANA as part of the supported SQL-family ingestion sources. |
| docs/user_guides/fs/data_source/index.md | Adds SAP HANA to the “Cloud Agnostic” Data Source list. |
| docs/user_guides/fs/data_source/creation/sap_hana.md | New how-to page describing SAP HANA Data Source prerequisites, UI setup, and limitations. |

Comment on lines +1 to +7
# How-To set up a SAP HANA Data Source

## Introduction

SAP HANA is an in-memory relational database used by many enterprises as the system of record for ERP, CRM, and analytics workloads.

A SAP HANA Data Source in Hopsworks stores the connection details required to read tables and views from a HANA tenant database.

Comment on lines +23 to +27
The default is `39015`, the SQL port for the first tenant database on a default
multi-tenant or HANA Express (HXE) install (instance number 90).
For a non-tenant single-host install (instance 00) use `30015`.
SAP HANA Cloud typically uses `443`.
Consult your DBA if you are unsure.
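The two defaults follow SAP's standard port layout: the SQL port of an on-premise instance is 3&lt;NN&gt;15, where NN is the two-digit instance number. A minimal sketch of that convention (assuming the standard layout with no custom port configuration):

```python
def hana_sql_port(instance_number):
    """SQL port for a single-container system or the first tenant DB on
    a default install: 3<NN>15, where NN is the two-digit instance
    number. Assumes the standard port layout; custom configurations
    and SAP HANA Cloud (which fronts everything on 443) differ."""
    if not 0 <= instance_number <= 97:
        raise ValueError("SAP instance numbers range from 00 to 97")
    return 30000 + instance_number * 100 + 15

# instance 90 (HANA Express default) -> 39015
# instance 00 (single-host default)  -> 30015
```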
Comment on lines +34 to +36
Use this when your SAP HANA system hosts more than one tenant database and you need to target a specific one.
- **Schema**: The default schema applied to unqualified queries on the connection.
If you leave this empty, queries must fully qualify table names with the schema prefix.
- **Table**: The default table the connector points at when no SQL query is provided.
- **Application**: A short identifier surfaced in HANA's session tracing (`APPLICATION` session variable).
This makes it easier to attribute load to Hopsworks in HANA monitoring tools.
05. Provide the **Password** for that user.
06. Optionally fill in **Database**, **Schema**, **Table**, and **Application**.
07. Optionally add additional key/value arguments.
These are forwarded both to the Python driver used by the on-demand read path and to the Spark JDBC reader used by notebook jobs.
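To make the Database/Schema fields concrete, here is a hedged sketch of the JDBC URL the Spark read path would end up with; the connector assembles this internally, so the helper below is illustrative only. `jdbc:sap://` plus the `databaseName` and `currentschema` properties are standard SAP HANA JDBC (ngdbc) conventions; the host and schema values are hypothetical:

```python
def hana_jdbc_url(host, port=39015, database=None, schema=None):
    """Sketch of the JDBC URL a Spark JDBC reader would use.
    `databaseName` selects a tenant database, `currentschema` sets the
    default schema for unqualified table names."""
    props = {}
    if database:
        props["databaseName"] = database
    if schema:
        props["currentschema"] = schema
    query = "&".join(f"{k}={v}" for k, v in props.items())
    return f"jdbc:sap://{host}:{port}/" + (f"?{query}" if query else "")

# hana_jdbc_url("hxehost", 39015, database="HXE", schema="HOPSDEMO")
```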

## Use it as an ingestion source

Once the SAP HANA data source exists, you can also use it with the dltHub-based ingestion workflow described in [Ingest Data with dltHub](../../feature_group/ingest_with_dlthub.md).
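As a rough sketch of what that dltHub path looks like under the hood, the snippet below builds a sqlalchemy-hana connection URL and wires it into dlt's `sql_database` source. The pipeline name, schema, table, and destination are hypothetical, and actually running `ingest_orders` requires `dlt`, `sqlalchemy-hana`, and `hdbcli` to be installed; Hopsworks drives this for you from the registered data source.

```python
from urllib.parse import quote_plus

def hana_sqlalchemy_url(user, password, host, port=39015):
    """SQLAlchemy URL for the sqlalchemy-hana dialect, which dlt's
    sql_database source uses to reflect tables."""
    return f"hana://{user}:{quote_plus(password)}@{host}:{port}"

def ingest_orders(credentials_url):
    """Hypothetical standalone pipeline: load HOPSDEMO.ORDERS into a
    local destination. Imports are deferred because dlt and the HANA
    dialect are optional dependencies."""
    import dlt
    from dlt.sources.sql_database import sql_database

    source = sql_database(credentials_url, schema="HOPSDEMO",
                          table_names=["ORDERS"])
    pipeline = dlt.pipeline(pipeline_name="hana_orders",
                            destination="duckdb")
    return pipeline.run(source)
```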

## Next Steps

Move on to the [usage guide for data sources](../usage.md) to see how you can use your newly created SAP HANA connector.
4. [HopsFS](creation/hopsfs.md): Easily connect and read from directories of Hopsworks' internal File System.
5. [CRM, Sales & Analytics](creation/crm_sales_analytics.md): Connect to supported CRM, sales, and analytics platforms.
6. [REST API](creation/rest_api.md): Connect to external HTTP APIs with configurable headers and authentication.
7. [SAP HANA](creation/sap_hana.md): Query SAP HANA tenant databases using SQL.
Comment on lines +103 to +104
Place tables you intend to ingest or expose as feature groups in a regular user schema (for example a project-specific `MYAPP` or `HOPSDEMO`).
Tables created under the system-owned `SYSTEM` schema do not reflect cleanly through the SQLAlchemy HANA dialect that powers DLT ingestion.
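One way to act on that advice is to copy the data into a user schema before registering the connector. The sketch below just generates the SQL; schema and table names are placeholders, and `CREATE COLUMN TABLE ... AS (...)` is standard HANA SQL you would run via hdbcli or any HANA SQL console:

```python
def relocate_statements(table, target_schema="HOPSDEMO"):
    """SQL to copy a table out of SYSTEM into a user schema so the
    sqlalchemy-hana DLT path can reflect it. Names are hypothetical
    placeholders; execute the statements with a HANA client."""
    return [
        f'CREATE SCHEMA "{target_schema}"',
        # CREATE COLUMN TABLE ... AS (...) copies structure and data.
        f'CREATE COLUMN TABLE "{target_schema}"."{table}" '
        f'AS (SELECT * FROM "SYSTEM"."{table}")',
    ]
```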