diff --git a/README.md b/README.md
index 2aec098..0eaca05 100644
--- a/README.md
+++ b/README.md
@@ -14,7 +14,7 @@ Find the docs at https://disyinformationssysteme.github.io/cadenza-analytics-pyt
* Flask
* Pandas
* requests-toolbelt
-
+* chardet
## Example:
Example extensions can be found in [examples](https://github.com/DisyInformationssysteme/cadenza-analytics-python/tree/main/examples).
diff --git a/docs/intro.md b/docs/intro.md
index 93699b5..750b63d 100644
--- a/docs/intro.md
+++ b/docs/intro.md
@@ -1,4 +1,6 @@
-
+This is `cadenzaanalytics` version {{version}}.
+
+
!! This module is currently in beta status !!
It can be used for testing, but there may be breaking changes before a full release.
@@ -8,13 +10,13 @@
# disy Cadenza Analytics Extensions
-An Analytics Extension extends the functional spectrum of [disy Cadenza](https://www.disy.net/en/products/disy-cadenza/) with an analysis function or a visualisation type.
-An Analytics Extension is a web service that exchanges structured data with disy Cadenza via the Cadenza API.
+An Analytics Extension extends the functional spectrum of [disy Cadenza](https://www.disy.net/en/products/disy-cadenza/) with an analysis function or a visualisation type.
+An Analytics Extension is a web service that exchanges structured data with disy Cadenza via the Cadenza API.
A user can integrate an analysis extension into disy Cadenza via the Management Center and manage it there (if they have the appropriate rights).
As of disy Cadenza Autumn 2023 (9.3), the following types and capabilities of analysis extensions are officially supported:
-- **Visualization**
+- **Visualization**
The Analytics Extension type `visualization` provides a new visualization type for displaying a bitmap image (PNG).
- **Data enrichment**
@@ -27,24 +29,25 @@ As of disy Cadenza Autumn 2023 (9.3), the following types and capabilities of an
An Analytics Extension defines one endpoint that, depending on the HTTP method of the request, is used to supply the Extension's configuration to disy Cadenza, or exchange data and results with Cadenza respectively.
-
When receiving an `HTTP(S) GET` request, the endpoint returns a JSON representation of the extention's configuration.
This step is executed once when registering the Analytics Extension from the disy Cadenza Management Center GUI and does not need to be repeated unless the extension's configuration changes.
-By sending an `HTTP(S) POST` request to the same endpoint and including the data, metadata and parameters as specified in the extension's configuration as payload, the extension is executed.
+By sending an `HTTP(S) POST` request to the same endpoint and including the data, metadata and parameters as specified in the extension's configuration as payload, the extension is executed.
This step is executed each time that the Analytics Extension is invoked from the disy Cadenza GUI and Cadenza takes care of properly formatting the payload.
-The `cadenzaanalytics` module provides the functionality to abstract the required communication and easily configure the Analytics Extension's responses to the above requests.
+The `cadenzaanalytics` module provides the functionality to abstract the required communication and easily configure the Analytics Extension's responses to the above requests.
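The single-endpoint pattern described above can be sketched in plain Python. This is only an illustration of dispatching on the HTTP method, not the implementation used by `cadenzaanalytics`: the function name `handle_request`, the configuration fields, and the echo behaviour are assumptions made for this example, and the real payload format is handled by the library.

```python
import json

# Hypothetical configuration; the real field names are defined by cadenzaanalytics.
EXTENSION_CONFIG = {"printName": "Echo", "extensionType": "calculation"}


def handle_request(method, payload=None):
    """Dispatch on the HTTP method, as a single extension endpoint would."""
    if method == "GET":
        # Registration step: return the configuration as JSON.
        return 200, json.dumps(EXTENSION_CONFIG)
    if method == "POST":
        # Execution step: a real extension would run its analytics code here.
        if payload is None:
            return 400, "missing payload"
        return 200, payload  # this sketch simply echoes the payload
    return 405, "method not allowed"
```

Calling `handle_request("GET")` corresponds to the registration step, while `handle_request("POST", payload)` corresponds to an invocation from the disy Cadenza GUI.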
# Installation
-As long as this package is in beta, it is only available on GitHub, and an installation via source is necessary. In the near future this package will also be made available via the Python Package Index (PyPI).
+As long as this package is in beta, it is only available on GitHub, and an installation via source is necessary.
+In the near future this package will also be made available via the Python Package Index (PyPI).
Furthermore, a corresponding version will be packaged as source code with each release of disy Cadenza.
@@ -56,17 +59,18 @@ The `cadenzaanalytics` package has the following dependencies:
* [Flask](https://flask.palletsprojects.com/en/3.0.x/)
* [Pandas](https://pandas.pydata.org/)
* requests-toolbelt
+* chardet
For each disy Cadenza version, the correct corresponding library version needs to be used.
The disy Cadenza main version is reflected in the corresponding major and minor version of `cadenzaanalytics` (e.g. 10.4.0 for Cadenza 10.4), while the last version segment is increased for both bugfixes and functional changes.
-For Cadenza 10.2 and earlier versions, `cadenzaanalytics` used a semantic versioning scheme.
-The first version of disy Cadenza that supported Analytics Extensions is disy Cadenza Autumn 2023 (9.3).
+For Cadenza 10.2 and earlier versions, `cadenzaanalytics` used a semantic versioning scheme.
+The first version of disy Cadenza that supported Analytics Extensions is disy Cadenza Autumn 2023 (9.3).
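The version rule above can be expressed as a small check. The helper names `required_version_prefix` and `is_compatible` are hypothetical and not part of the package; the sketch assumes the scheme in which the library's major and minor version mirror the disy Cadenza version, so it does not apply to the semantically versioned releases for Cadenza 10.2 and earlier.

```python
def required_version_prefix(cadenza_version):
    """Return the 'major.minor' prefix a matching cadenzaanalytics release carries."""
    major, minor = cadenza_version.split(".")[:2]
    return f"{int(major)}.{int(minor)}"


def is_compatible(library_version, cadenza_version):
    """Compare only major and minor; the last segment covers bugfixes and features."""
    lib_major, lib_minor = library_version.split(".")[:2]
    library_prefix = f"{int(lib_major)}.{int(lib_minor)}"
    return library_prefix == required_version_prefix(cadenza_version)
```

For example, `is_compatible("10.4.0", "10.4")` holds, whereas a `10.3.x` release of the library would not match Cadenza 10.4.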
-
-
## Installation from Source
-The source of the package can be obtained from the project's public [GitHub repository](https://github.com/DisyInformationssysteme/cadenza-analytics-python).
-Alternatively with each release of disy Cadenza, the offline source code of the matching version of `cadenzaanalytics` is packaged in the distributions `developer.zip`.
+The source of the package can be obtained from the project's public [GitHub repository](https://github.com/DisyInformationssysteme/cadenza-analytics-python).
-Once the repository is locally available, the package can be installed using the package installer [`pip`](https://pypi.org/project/pip/).
+Once the repository is locally available, the package can be installed using the package installer [`pip`](https://pypi.org/project/pip/).
To install the package from source, navigate to the root folder of the project and run:
```console
@@ -112,7 +113,7 @@ We specify what data can be passed from disy Cadenza to the Anylytics Extension
my_attribute_group = ca.AttributeGroup(
name='my_data',
print_name='Any numeric attribute',
- data_types=[ca.DataType.INT64,
+ data_types=[ca.DataType.INT64,
ca.DataType.FLOAT64],
min_attributes=1,
max_attributes=1
@@ -142,7 +143,7 @@ my_param = ca.Parameter(
```
This object again requires a `name` and a `print_name`, as well as a [`ParameterType`](cadenzaanalytics/data/parameter_type.html).
Optionally, we can specify whether a parameter is mandatory and/or a default value for it.
-Multiple parameters can be defined.
+Multiple parameters can be defined.
As an alternative to requesting input of a parameter in one of the standard data types, a list from which a user selects a value can be defined via the `SELECT` type:
@@ -175,7 +176,7 @@ my_extension = ca.CadenzaAnalyticsExtension(
```
The `relative_path` defines the endpoint, i.e. the subdirectory of the URL under wich the extension will be available after deployment.
-Further parameters include the `print_name` shown in Cadenza, and the attribute groups and parameters defined above.
+Further parameters include the `print_name` shown in Cadenza, and the attribute groups and parameters defined above.
Additionally, the appropriate [`ExtensionType`](cadenzaanalytics/data/extension_type.html) (visualization, enrichment, or calculation) must be specified.
The `analytics_function` is the name of the Python method that should be invoked (see next section).
@@ -183,7 +184,7 @@ The `analytics_function` is the name of the Python method that should be invoked
## Including Custom Analytics Code
The analysis function `my_analytics_function` (or whatever you choose to name it) is the method that contains the specific functionality for the extension.
-It implements what the extension should be doing when being invoked from disy Cadenza.
+It implements what the extension should be doing when being invoked from disy Cadenza.
This method takes two arguments, `metadata` and `data`, which both will be passed to it automatically when the extension is invoked from Cadenza.
```python
@@ -192,14 +193,14 @@ def my_analytics_function (metadata: ca.RequestMetadata, data: pd.DataFrame):
return #something
```
-The actual content and return type of this function will depend both on the extension type (visualization, enrichment, or calculation) and naturally the actual analytics code that the extension should execute.
+The actual content and return type of this function will depend both on the extension type (visualization, enrichment, or calculation) and naturally the actual analytics code that the extension should execute.
### Reading Data, Metadata and Parameters
Accessing the data that is transferred from Cadenza is simple.
Within the defined analytics function, a [Pandas DataFrame](https://pandas.pydata.org/) `data` is automatically available, which holds all the data passed from Cadenza.
-Same as the `data` object, a [`RequestMetadata`](cadenzaanalytics/request/request_metadata.html) object is also automatically available in the analysis function as `metadata`.
+Same as the `data` object, a [`RequestMetadata`](cadenzaanalytics/request/request_metadata.html) object is also automatically available in the analysis function as `metadata`.
The `metadata` object contains information on the columns in the `data` DataFrame, such as their print name and type in disy Cadenza, their column name in the pandas DataFrame, or additional information like a `geometry_type`, where applicable.
@@ -213,7 +214,8 @@ if 'my_data' in columns_by_attribute_group:
my_data = data[column.name]
```
-While it is also possible to directly access the columns of `data` by name or by index, this is less robust, since the actual column names of the dataframe depend on their configuration in disy Cadenza and changing them there might lead to the extension not functioning properly anymore. However it is possible to get the metadata to a specific colum of the `data` DataFrame.
+While it is also possible to directly access the columns of `data` by name or by index, this is less robust, since the actual column names of the dataframe depend on their configuration in disy Cadenza and changing them there might lead to the extension not functioning properly anymore.
+However, it is possible to look up the metadata for a specific column of the `data` DataFrame.
```python
for column_name, column_data in data.items():
@@ -231,7 +233,7 @@ The table shows the mapping to Pyton data types:
| Number (Long) | pandas.Long64Dtype | `1` | |
| Floating point number (Double) | pandas.Float64Dtype | `1.23` | |
| Date | string | `"2022-11-12T12:34:56+13:45[Pacific/Chatham]"` | A date is represented as an [ISO string with time zone offset from UTC](https://en.wikipedia.org/wiki/ISO_8601#Coordinated_Universal_Time_(UTC)) (UTC) and additional time zone identifier in brackets. |
-| Geometry | string | `"POINT(8.41594949941623 49.0048124984033)"` | A geometry is represented as a [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry) string.
*Note:* By default, coordinates use the WGS84 projection. |
+| Geometry | string | `"POINT(8.41594949941623 49.0048124984033)"` | A geometry is represented as a [WKT](https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry) string.
*Note:* By default, coordinates use the WGS84 projection. |
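
Since dates and geometries arrive as strings, they usually need to be parsed before use. The following is a minimal sketch using only the standard library, assuming exactly the formats shown in the table; the helper names `parse_cadenza_date` and `parse_wkt_point` are hypothetical, and a dedicated WKT library would be a better choice for geometries other than simple points.

```python
import re
from datetime import datetime, timedelta, timezone

# ISO timestamp with UTC offset, optionally followed by "[Zone/Name]".
_DATE_PATTERN = re.compile(r"^(?P<iso>[^\[]+)(?:\[(?P<zone>[^\]]+)\])?$")
# Two-dimensional WKT point such as "POINT(8.4 49.0)".
_POINT_PATTERN = re.compile(r"^POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)$")


def parse_cadenza_date(value):
    """Return (timezone-aware datetime, zone identifier or None)."""
    match = _DATE_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unexpected date format: {value!r}")
    return datetime.fromisoformat(match.group("iso")), match.group("zone")


def parse_wkt_point(value):
    """Return the two coordinates of a WKT point as floats, in the order given."""
    match = _POINT_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not a 2D WKT point: {value!r}")
    return float(match.group(1)), float(match.group(2))
```

The zone identifier is returned separately because the offset in the ISO part is already sufficient to obtain an unambiguous point in time.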
Parameters are stored in `metadata` as well.
@@ -251,6 +253,7 @@ A [`CsvResponse`](cadenzaanalytics/response/csv_response.html) is used for calcu
The response must include the data and the proper metadata.
The following minimal example echos the data received from disy Cadenza as part of an `AttributeGroup` named `'any_data'` back to it without modification.
+Because the data is returned unmodified, the original metadata is simply forwarded as the metadata of the response.
```python
def echo_analytics_function(metadata: ca.RequestMetadata, data: pd.DataFrame):
@@ -283,7 +286,7 @@ response_columns = [
### Data Enrichment
A [`CsvResponse`](cadenzaanalytics/response/csv_response.html) is used for enrichments as well.
-The response must be in the format of a text, a CSV file or a DataFrame so that it fits.
+The response must be in the format of a text, a CSV file or a DataFrame so that it fits.
TODO
@@ -307,7 +310,7 @@ return ca.ImageResponse(image)
### Returning an Error
-In order to abort the execution of the function with an error and pass an according message to disy Cadenza, a [`ErrorResponse`](cadenzaanalytics/response/error_response.html) can be returned.
+To abort the execution of the function with an error and pass a corresponding message to disy Cadenza, an [`ErrorResponse`](cadenzaanalytics/response/error_response.html) can be returned.
```python
if my_data is None:
@@ -316,20 +319,26 @@ if my_data is None:
## Registering the Extension
-TBD
+Finally, the extension needs to be registered with a [`CadenzaAnalyticsExtensionService`](cadenzaanalytics/cadenza_analytics_extension_service.html).
+This makes the service available at the configured endpoint.
+
```python
analytics_service = ca.CadenzaAnalyticsExtensionService()
analytics_service.add_analytics_extension(my_extension)
```
-TODO "directory" service multiple extensions
+
-# Deployment
+# Deployment
+Since `cadenzaanalytics` is built on the [Flask framework](https://flask.palletsprojects.com/en/stable), the deployment options for a Cadenza Analytics Extension are basically the same as for any Flask application.
+Below, we present a few options; a more comprehensive overview can be found in the [Deploying to Production](https://flask.palletsprojects.com/en/stable/deploying/index.html) section of the official Flask documentation.
-Since `cadenzaanalytics` is built on the [Flask framework](https://flask.palletsprojects.com/en/3.0.x/), ...
+## Local Execution (development only)
+For development purposes, using the built-in development server, debugger, and reloader is the most convenient.
+However, it should not be used in production, as it has not been designed for security, stability, or efficiency.
-## Local Execution
+The development server can either be invoked from within the Python code...
```python
if __name__ == '__main__':
@@ -338,3 +347,7 @@ if __name__ == '__main__':
```
## WSGI Deployment
+
+## Docker
+
+
diff --git a/pyproject.toml b/pyproject.toml
index cef35fa..73dd9f1 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -10,7 +10,7 @@ authors = [
"Daniel Dittmar ",
"Matthias Budde "
]
-version="10.4.0a0.dev"
+version="10.3.0a3.dev"
description = "Official Python Package for creation of disy Cadenza analytics extensions"
readme = "README.md"
license = "Apache-2.0"
@@ -30,6 +30,7 @@ Werkzeug = "3.0.4"
Flask-Cors = "3.0.10"
requests-toolbelt = "1.0.0"
pandas = " ^2.0.2"
+chardet = "5.2.0"
[project]
name = "cadenzaanalytics"
diff --git a/src/cadenzaanalytics/__init__.py b/src/cadenzaanalytics/__init__.py
index 519925f..d8d1fe5 100644
--- a/src/cadenzaanalytics/__init__.py
+++ b/src/cadenzaanalytics/__init__.py
@@ -5,8 +5,6 @@
The purpose of this module is to encapsulate the communication via the Cadenza API.
-This is `cadenzaanalytics` version {{version}}.
-
.. include:: ../../docs/intro.md
"""
from cadenzaanalytics.cadenza_analytics_extension import CadenzaAnalyticsExtension