The required tenant API key is only available for selected partner companies. Please contact your Wacom representative for more information.
- Installation
- Quick Start
- Introduction
- Technology Stack
- Functionality
- Choosing Between Sync and Async Clients
- Samples
- Development
- Documentation
- Contributing
- License
Install the library using pip:
pip install personal-knowledge-library
This library requires Python 3.10 or higher (supports Python 3.10, 3.11, 3.12, and 3.13).
To install development dependencies for testing and code quality tools:
pip install personal-knowledge-library[dev]
Here's a minimal example to get you started with the Wacom Knowledge Service:
from knowledge.services.graph import WacomKnowledgeService
from knowledge.base.ontology import OntologyClassReference, ThingObject
from knowledge.base.entity import Label
from knowledge.base.language import EN_US
# Initialize the client
client = WacomKnowledgeService(
    service_url="https://private-knowledge.wacom.com",
    application_name="My Application"
)
# Login with your credentials
client.login(tenant_api_key="<your-tenant-key>", external_user_id="<your-user-id>")
# Search for entities
results, _ = client.search_labels(search_term="Leonardo da Vinci", language_code=EN_US)
for entity in results:
    print(f"{entity.uri}: {[l.content for l in entity.label]}")
Note: You need a valid tenant API key from Wacom to use this library.
In knowledge management there is a distinction between data, information, and knowledge. In the domain of digital ink this means:
- Data: the ink strokes themselves.
- Information: after handwriting, shape, math, or other recognition processes, ink strokes are converted into machine-readable content such as text, shapes, math representations, or other digital content.
- Knowledge / Semantics: beyond recognition, content needs to be semantically analyzed so that it is understood based on shared common knowledge.
The following illustration shows the different layers of knowledge:

For handling semantics, Wacom introduced the Wacom Private Knowledge System (PKS) cloud service to manage personal ontologies and their associated personal knowledge graphs.
This library provides simplified access to Wacom's personal knowledge cloud service. It contains:
- Basic data structures for an Ontology object and entities from the knowledge graph
- Clients for the REST APIs
- Connector for Wikidata public knowledge graph
Ontology service:
- List all Ontology structures
- Modify Ontology structures
- Delete Ontology structures
Entity service:
- List all entities
- Add entities to the knowledge graph
- Access object properties
Search service:
- Search for entities by label or description in a given language
- Search for literals (data properties)
- Search for relations (object properties)
Group service:
- List all groups
- Add groups, modify groups, delete groups
- Add users and entities to groups
Named Entity Linking service:
- Linking words to knowledge entities from the graph in a given text (Ontology-based Named Entity Linking)
Wikidata connector:
- Import entities from Wikidata
- Mapping Wikidata entities to WPK entities
The task of the ontology within Wacom's private knowledge system is to formalize the domain the technology is used in, such as the education, smart home, or creative domain. The domain model is the foundation for the entities collected within the knowledge graph, describing real-world concepts in a formal language understood by an artificial intelligence system:
- Foundation for structured data and knowledge representation as concepts and relations among concepts
- Explicit definitions of shared vocabularies for interoperability
- Actionable fragments of explicit knowledge that engines can use for inferencing (reasoning)
- Usable for problem-solving
An ontology defines (specifies) the concepts, relationships, and other distinctions that are relevant for modeling a domain.
- Knowledge graph is generated from unstructured and structured knowledge sources
- Contains all structured knowledge gathered from all sources
- Foundation for all semantic algorithms
- Extract knowledge from various sources (Connectors)
- Linking words to knowledge entities from the graph in a given text (Ontology-based Named Entity Linking)
- Enables a smart search functionality which understands the context and finds related documents (Semantic Search)
For importing entities into the knowledge graph, the samples/import_entities.py script can be used.
The ThingObject supports an NDJSON-based import format, where each JSON record can contain the following structure.
| Field name | Subfield name | Data Structure | Description |
|---|---|---|---|
| source_reference_id | | str | A unique identifier for the entity used in the source system. |
| source_system | | str | The source system describes the original source of the entity, such as wikidata, youtube, ... |
| image | | str | A string representing the URL of the entity's icon. |
| labels | | array | An array of label objects, where each object has the following fields: |
| | value | str | A string representing the label text in the specified locale. |
| | locale | str | A string combining the ISO-639 language code and the ISO-3166 country code (e.g., "en-US"). |
| | isMain | bool | A boolean flag indicating if this label is the main label for the entity (true) or an alias (false). |
| descriptions | | array | An array of description objects, where each object has the following fields: |
| | description | str | A string representing the description text in the specified locale. |
| | locale | str | A string combining the ISO-639 language code and the ISO-3166 country code (e.g., "en-US"). |
| type | | str | A string representing the IRI of the ontology class for this entity. |
| literals | | array[map] | An array of data property objects, where each object has the following fields: |
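As a rough illustration of the format above, the following sketch builds one record and serializes it as a single NDJSON line using only the standard library. All values are illustrative, the helper names `to_ndjson` and `from_ndjson` are hypothetical (they are not part of this library), and `literals` is left empty because its sub-fields are not listed in the table.

```python
import json
from typing import Any, Dict, List

# One hypothetical entity record; every value here is illustrative only.
record: Dict[str, Any] = {
    "source_reference_id": "example-0001",
    "source_system": "wikidata",
    "image": "https://example.com/leonardo.jpg",
    "labels": [
        {"value": "Leonardo da Vinci", "locale": "en-US", "isMain": True},
        {"value": "Leonardo di ser Piero da Vinci", "locale": "en-US", "isMain": False},
    ],
    "descriptions": [
        {"description": "Italian Renaissance polymath", "locale": "en-US"},
    ],
    "type": "wacom:core#Person",
    "literals": [],  # sub-fields are not specified in the table, so left empty
}


def to_ndjson(records: List[Dict[str, Any]]) -> str:
    """Serialize records as NDJSON: one compact JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def from_ndjson(text: str) -> List[Dict[str, Any]]:
    """Parse NDJSON text back into a list of dictionaries, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


ndjson_text = to_ndjson([record])
parsed = from_ndjson(ndjson_text)
```

Keeping one entity per line means a large import file can be streamed line by line instead of being loaded as a single JSON document.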
The personal knowledge graph backend is implemented as a multi-tenancy system. Thus, several tenants can be logically separated from each other, and different organizations can build their own knowledge graphs.
In general, each tenant's users, groups, and entities are logically separated from those of other tenants. Physically, the entities are stored in the same instance of the Wacom Private Knowledge (WPK) backend database system.
User management is rather limited: each organization must provide its own authentication service and user management. The backend only holds a reference to the user (“shadow user”) by an external user ID.
The management of tenants is limited to the system owner, Wacom, as it requires a tenant management API key. Users for each tenant, in contrast, can be created by the owner of the Tenant API Key. You will receive this key from the system owner after the creation of the tenant.
⚠️ Store the Tenant API Key in a secure key store, as attackers can use the key to harm your system.
The Tenant API Key should only be used by your authentication service to create shadow users and to log your users in to the WPK backend. After a successful user login, you will receive a token which the user can use to create, update, or delete entities and relations.
The following illustration summarizes the flows for creation of tenant and users:
The organization itself needs to implement its own authentication service, which:
- handles the users and their passwords,
- controls the personal data of the users,
- connects the users with the WPK backend and shares the user token with them.
The WPK backend only manages the access levels of the entities and the group management for users. The illustration shows how the access token is received from the WPK endpoint:
The entities used within the knowledge graph and the relationship among them are defined within an ontology managed with Wacom Ontology Management System (WOMS).
An entity within the personal knowledge graphs consists of these major parts:
- Icon: a visual representation of the entity, for instance, a portrait of a person.
- URI: a unique resource identifier of an entity in the graph.
- Type: the type links to the defined concept class in the ontology.
- Labels: labels are the word(s) used in a language for the concept.
- Description: a short abstract that describes the entity.
- Literals: literals are properties of an entity, such as the first name of a person. The ontology defines all literals of the concept class as well as their data types.
- Relations: the relationship among different entities is described using relations.
The following illustration provides an example of an entity:
Entities in general are language-independent as across nationalities or cultures we only use different scripts and words for a shared instance of a concept.
Let's take Leonardo da Vinci as an example. The ontology defines the concept of a Person, a human being. Now, in English its label would be Leonardo da Vinci, while in Japanese レオナルド・ダ・ヴィンチ. Moreover, he is also known as Leonardo di ser Piero da Vinci or ダ・ビンチ.
Now, in the given example all words that are assigned to the concept are labels. The label Leonardo da Vinci is stored in the backend with an additional language code, e.g. en.
There is always a main label, which refers to the most common or official name of an entity. Another example would be Wacom, where Wacom Co., Ltd. is the official name, while Wacom is commonly used and is considered an alias.
📌 Language codes follow ISO 639-1:2002, Codes for the representation of names of languages, Part 1: Alpha-2 code.
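To make the main-label and alias distinction concrete, here is a minimal sketch using plain dataclasses. `LabelRecord`, `main_label`, and `aliases` are hypothetical names introduced only for this illustration; they are not the library's `Label` class or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LabelRecord:
    """Hypothetical stand-in for a label; not the library's Label class."""
    content: str
    language_code: str
    main: bool = False  # True for the main label, False for an alias


def main_label(labels: List[LabelRecord], language_code: str) -> Optional[LabelRecord]:
    """Return the main label for a language, or None if there is none."""
    for label in labels:
        if label.main and label.language_code == language_code:
            return label
    return None


def aliases(labels: List[LabelRecord], language_code: str) -> List[str]:
    """Return the text of all alias labels for a language."""
    return [la.content for la in labels if not la.main and la.language_code == language_code]


# One language-independent entity, labelled in two languages.
labels = [
    LabelRecord("Leonardo da Vinci", "en", main=True),
    LabelRecord("Leonardo di ser Piero da Vinci", "en"),
    LabelRecord("レオナルド・ダ・ヴィンチ", "ja", main=True),
    LabelRecord("ダ・ビンチ", "ja"),
]
```

The entity itself stays the same; only the set of labels attached to it grows as more languages are covered.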
Every user-facing service client ships in two flavours: the synchronous client in knowledge.services.* and the async counterpart in knowledge.services.asyncio.*.
Both expose the same methods with the same parameters and return types — switching only changes the call style.
Use the sync client for:
- Scripts, CLIs, one-off tools, and notebooks. No event loop required, easier to reason about, and pdb works as expected. All samples in this repository use the sync client.
- Single-shot calls embedded in otherwise synchronous code. Mixing asyncio.run(...) into a sync codebase just to make one request is rarely worth it.
- Callers that want transparent recovery from transient failures. Sync clients install a urllib3.Retry(total=3, backoff_factor=0.1, status_forcelist=[502, 503, 504]) at the transport layer, so transient 502, 503, and 504 responses are retried for you.
Use the async client for:
- Backend services that hold many concurrent connections, such as FastAPI, aiohttp, or Starlette applications. Blocking I/O inside an async event loop stalls every other request; use the async client to keep the loop free.
- High-throughput batch jobs that benefit from issuing many requests in parallel via asyncio.gather(...).
- Callers that already own a retry / circuit-breaker / idempotency layer. The async clients deliberately ship no transport-level retry: backend callers typically have their own policies (back-pressure, circuit breakers, idempotency keys), and a hidden retry layer would interfere with them.
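The batch-job case can be sketched with the standard library alone. In the example below, `fetch_entity` is a hypothetical stand-in for a single async client call (it only sleeps), and `fetch_all` and `max_concurrency` are names assumed for illustration; the point is the `asyncio.gather` plus `asyncio.Semaphore` pattern, not any specific SDK method.

```python
import asyncio
from typing import List


async def fetch_entity(uri: str) -> str:
    """Hypothetical stand-in for one async client call; it only simulates latency."""
    await asyncio.sleep(0.01)
    return f"entity:{uri}"


async def fetch_all(uris: List[str], max_concurrency: int = 5) -> List[str]:
    """Issue all requests concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(uri: str) -> str:
        # The semaphore caps the number of simultaneously running requests.
        async with semaphore:
            return await fetch_entity(uri)

    # gather returns results in the same order as the input awaitables.
    return await asyncio.gather(*(bounded(u) for u in uris))


uris = [f"wacom:entity:{i}" for i in range(20)]
results = asyncio.run(fetch_all(uris))
```

Bounding concurrency is a design choice rather than a requirement: it keeps a large batch from opening hundreds of connections at once.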
📌 The retry asymmetry between the sync and async clients is intentional. If your async caller does not already implement retries, wrap the failing call yourself rather than asking for retries inside the SDK.
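One way to wrap a failing async call yourself is a small retry helper with exponential backoff. This is a sketch under stated assumptions: `TransientError` and `with_retries` are hypothetical names, and deciding which real exceptions count as transient (for example, responses with status 502, 503, or 504) is left to the caller.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


async def with_retries(call: Callable[[], Awaitable[T]], attempts: int = 3,
                       backoff_factor: float = 0.1) -> T:
    """Await call(), retrying on TransientError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # retries exhausted, surface the last error
            # Wait 0.1 s, 0.2 s, 0.4 s, ... before the next attempt.
            await asyncio.sleep(backoff_factor * (2 ** attempt))
    raise RuntimeError("attempts must be at least 1")


# Demo: a call that fails twice before succeeding.
calls = {"count": 0}


async def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary failure")
    return "ok"


result = asyncio.run(with_retries(flaky))
```

Only retry operations that are safe to repeat; a create call without an idempotency key may produce duplicates if the first attempt actually succeeded on the server.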
Most service clients exist in both forms, with one exception:
| Client | Sync | Async |
|---|---|---|
| WacomKnowledgeService | ✅ | ✅ |
| OntologyService | ✅ | ❌ |
| UserManagementServiceAPI | ✅ | ✅ |
| GroupManagementService | ✅ | ✅ |
| SemanticSearchClient | ✅ | ✅ |
| IndexManagementClient | ✅ | ✅ |
| QueueManagementClient | ✅ | ✅ |
| InkServices | ✅ | ✅ |
| ContentClient | ✅ | ✅ |
| WacomEntityLinkingEngine | ✅ | ✅ |
Ontology management is sync-only; everything else is available as both Foo and AsyncFoo.
The infrastructure modules session.py, tenant.py, and helper.py are also sync-only by design — they are not service clients.
Each client owns its own TokenManager; there is no global session registry, and use_session(session_id) only resolves IDs that live in that client's manager.
A session created by knowledge_client.login(...) is therefore not visible to content_client — calling content_client.use_session(session.id) raises WacomServiceException("Unknown session id:= …").
To avoid a second login round-trip when multiple clients work against the same user, use one of these two patterns instead.
Pattern 1: per-call auth_key= override. Most service methods accept an optional auth_key= parameter that bypasses the bound session for a single call:
session = knowledge_client.login(tenant_api_key, external_user_id)
# Use the same token in another client without registering a session there
items = content_client.list_content(uri=entity_uri, auth_key=session.auth_token)
Pattern 2: register_token() on the second client. Reuse the auth token (and refresh token, if any) obtained from the first login to register a RefreshableSession in the second client's token manager. No second network call to /user/login is made:
session = knowledge_client.login(tenant_api_key, external_user_id)
content_client.register_token(
    auth_key=session.auth_token,
    refresh_token=session.refresh_token,
)
# content_client now uses that session for subsequent calls
content_client.list_content(uri=entity_uri)
📌 register_token produces a RefreshableSession, not a PermanentSession. The two clients refresh independently after this point: refreshing on one does not propagate the new token to the other. If long-lived auto re-login (the PermanentSession behavior) is required on the second client too, call client.login(tenant_api_key, external_user_id) on it directly.
Async clients hold an aiohttp.ClientSession. Always close it before the program exits:
async_client: AsyncWacomKnowledgeService = AsyncWacomKnowledgeService(...)
await async_client.login(tenant_api_key, external_user_id)
try:
    ...
finally:
    await async_client.close_all_sessions()
This sample shows how to work with the graph service.
import argparse
from typing import Optional, Dict, List
from knowledge.base.entity import Description, Label
from knowledge.base.language import LocaleCode, EN_US, DE_DE
from knowledge.base.ontology import OntologyClassReference, OntologyPropertyReference, ThingObject, ObjectProperty
from knowledge.services.graph import WacomKnowledgeService
# ------------------------------- Knowledge entities -------------------------------------------------------------------
LEONARDO_DA_VINCI: str = 'Leonardo da Vinci'
SELF_PORTRAIT_STYLE: str = 'self-portrait'
ICON: str = "https://upload.wikimedia.org/wikipedia/commons/thumb/8/87/Mona_Lisa_%28copy%2C_Thalwil%2C_Switzerland%29."\
            "JPG/1024px-Mona_Lisa_%28copy%2C_Thalwil%2C_Switzerland%29.JPG"
# ------------------------------- Ontology class names -----------------------------------------------------------------
THING_OBJECT: OntologyClassReference = OntologyClassReference('wacom', 'core', 'Thing')
"""
The ontology contains a Thing class, which is the root class in the hierarchy.
"""
ARTWORK_CLASS: OntologyClassReference = OntologyClassReference('wacom', 'creative', 'VisualArtwork')
PERSON_CLASS: OntologyClassReference = OntologyClassReference('wacom', 'core', 'Person')
ART_STYLE_CLASS: OntologyClassReference = OntologyClassReference.parse('wacom:creative#ArtStyle')
IS_CREATOR: OntologyPropertyReference = OntologyPropertyReference('wacom', 'core', 'created')
HAS_TOPIC: OntologyPropertyReference = OntologyPropertyReference.parse('wacom:core#hasTopic')
CREATED: OntologyPropertyReference = OntologyPropertyReference.parse('wacom:core#created')
HAS_ART_STYLE: OntologyPropertyReference = OntologyPropertyReference.parse('wacom:creative#hasArtstyle')
def print_entity(display_entity: ThingObject, list_idx: int, client: WacomKnowledgeService,
                 short: bool = False):
    """
    Printing entity details.
    Parameters
    ----------
    display_entity: ThingObject
        Entity with properties
    list_idx: int
        Index within a list
    client: WacomKnowledgeService
        Knowledge graph client
    short: bool
        Short summary
    """
    print(f'[{list_idx}] : {display_entity.uri} <{display_entity.concept_type.iri}>')
    if len(display_entity.label) > 0:
        print(' | [Labels]')
        for la in display_entity.label:
            print(f' | |- "{la.content}"@{la.language_code}')
        print(' |')
    if not short:
        if len(display_entity.alias) > 0:
            print(' | [Alias]')
            for la in display_entity.alias:
                print(f' | |- "{la.content}"@{la.language_code}')
            print(' |')
        if len(display_entity.data_properties) > 0:
            print(' | [Attributes]')
            for data_property, labels in display_entity.data_properties.items():
                print(f' | |- {data_property.iri}:')
                for li in labels:
                    print(f' | |-- "{li.value}"@{li.language_code}')
            print(' |')
        relations_obj: Dict[OntologyPropertyReference, ObjectProperty] = client.relations(uri=display_entity.uri)
        if len(relations_obj) > 0:
            print(' | [Relations]')
            for r_idx, re in enumerate(relations_obj.values()):
                last: bool = r_idx == len(relations_obj) - 1
                print(f' |--- {re.relation.iri}: ')
                print(f' {"|" if not last else " "} |- [Incoming]: {re.incoming_relations} ')
                print(f' {"|" if not last else " "} |- [Outgoing]: {re.outgoing_relations}')
    print()
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("-u", "--user", help="External Id of the shadow user within the Wacom Personal Knowledge.",
                        required=True)
    parser.add_argument("-t", "--tenant", help="Tenant Id of the shadow user within the Wacom Personal Knowledge.",
                        required=True)
    parser.add_argument("-i", "--instance", default='https://private-knowledge.wacom.com',
                        help="URL of instance")
    args = parser.parse_args()
    TENANT_KEY: str = args.tenant
    EXTERNAL_USER_ID: str = args.user
    # Wacom personal knowledge REST API Client
    knowledge_client: WacomKnowledgeService = WacomKnowledgeService(service_url=args.instance,
                                                                    application_name="Wacom Knowledge Listing")
    knowledge_client.login(TENANT_KEY, EXTERNAL_USER_ID)
    page_id: Optional[str] = None
    page_number: int = 1
    entity_count: int = 0
    print('-----------------------------------------------------------------------------------------------------------')
    print(' First step: Find Leonardo da Vinci in the knowledge graph.')
    print('-----------------------------------------------------------------------------------------------------------')
    res_entities, next_search_page = knowledge_client.search_labels(search_term=LEONARDO_DA_VINCI,
                                                                    language_code=LocaleCode('en_US'), limit=1000)
    leo: Optional[ThingObject] = None
    for res_entity in res_entities:
        # Entity must be a person and one of its labels must match the full string
        if res_entity.concept_type == PERSON_CLASS and LEONARDO_DA_VINCI in [la.content for la in res_entity.label]:
            leo = res_entity
            break
    # The remaining steps need the entity, so stop early if it was not found
    if leo is None:
        raise SystemExit(f'No person labelled "{LEONARDO_DA_VINCI}" found in the knowledge graph.')
    print('-----------------------------------------------------------------------------------------------------------')
    print(' What artwork exists in the knowledge graph.')
    print('-----------------------------------------------------------------------------------------------------------')
    relations_dict: Dict[OntologyPropertyReference, ObjectProperty] = knowledge_client.relations(uri=leo.uri)
    print(f' Artwork of {leo.label}')
    print('-----------------------------------------------------------------------------------------------------------')
    idx: int = 1
    if CREATED in relations_dict:
        for e in relations_dict[CREATED].outgoing_relations:
            print(f' [{idx}] {e.uri}: {e.label}')
            idx += 1
    print('-----------------------------------------------------------------------------------------------------------')
    print(' Let us create a new piece of artwork.')
    print('-----------------------------------------------------------------------------------------------------------')
    # Main labels for entity
    artwork_labels: List[Label] = [
        Label('Ginevra Gherardini', EN_US),
        Label('Ginevra Gherardini', DE_DE)
    ]
    # Alias labels for entity
    artwork_alias: List[Label] = [
        Label("Ginevra", EN_US),
        Label("Ginevra", DE_DE)
    ]
    # Topic description
    artwork_description: List[Description] = [
        Description("Oil painting of Mona Lisa's sister", EN_US),
        Description("Ölgemälde von Mona Lisas Schwester", DE_DE)
    ]
    # Topic
    artwork_object: ThingObject = ThingObject(label=artwork_labels, concept_type=ARTWORK_CLASS,
                                              description=artwork_description,
                                              icon=ICON)
    artwork_object.alias = artwork_alias
    print(f' Create: {artwork_object}')
    # Create artwork
    artwork_entity_uri: str = knowledge_client.create_entity(artwork_object)
    print(f' Entity URI: {artwork_entity_uri}')
    # Create relation between Leonardo da Vinci and artwork
    knowledge_client.create_relation(source=leo.uri, relation=IS_CREATOR, target=artwork_entity_uri)
    relations_dict = knowledge_client.relations(uri=artwork_entity_uri)
    for ontology_property, object_property in relations_dict.items():
        print(f' {object_property}')
    # You will see that wacom:core#isCreatedBy is automatically inferred as a relation as it is the inverse property of
    # wacom:core#created.
    # Now, more search options
    res_entities, next_search_page = knowledge_client.search_description('Michelangelo\'s Sistine Chapel',
                                                                         EN_US, limit=1000)
    print('-----------------------------------------------------------------------------------------------------------')
    print(' Search results. Description: "Michelangelo\'s Sistine Chapel"')
    print('-----------------------------------------------------------------------------------------------------------')
    s_idx: int = 1
    for e in res_entities:
        print_entity(e, s_idx, knowledge_client)
        s_idx += 1
    # Now, let's search all artwork that has the art style self-portrait
    res_entities, next_search_page = knowledge_client.search_labels(search_term=SELF_PORTRAIT_STYLE,
                                                                    language_code=EN_US, limit=1000)
    art_style: Optional[ThingObject] = None
    for entity in res_entities:
        # Entity must be an art style and one of its labels must match the full string
        if entity.concept_type == ART_STYLE_CLASS and SELF_PORTRAIT_STYLE in [la.content for la in entity.label]:
            art_style = entity
            break
    if art_style is None:
        raise SystemExit(f'No art style labelled "{SELF_PORTRAIT_STYLE}" found in the knowledge graph.')
    res_entities, next_search_page = knowledge_client.search_relation(subject_uri=None,
                                                                      relation=HAS_ART_STYLE,
                                                                      object_uri=art_style.uri,
                                                                      language_code=EN_US)
    print('-----------------------------------------------------------------------------------------------------------')
    print(f' Search results. Relation: relation:={HAS_ART_STYLE.iri} object_uri:={art_style.uri}')
    print('-----------------------------------------------------------------------------------------------------------')
    s_idx = 1
    for e in res_entities:
        print_entity(e, s_idx, knowledge_client, short=True)
        s_idx += 1
    # Finally, the activation function retrieves the related entities up to a pre-defined depth.
    entities, relations = knowledge_client.activations(uris=[leo.uri], depth=1)
    print('-----------------------------------------------------------------------------------------------------------')
    print(f'Activation. URI: {leo.uri}')
    print('-----------------------------------------------------------------------------------------------------------')
    s_idx = 1
    # Iterate over the activation result, not the previous search result
    for e in entities:
        print_entity(e, s_idx, knowledge_client)
        s_idx += 1
    # All relations
    print('-----------------------------------------------------------------------------------------------------------')
    for r in relations:
        print(f'Subject: {r[0]} Predicate: {r[1]} Object: {r[2]}')
    print('-----------------------------------------------------------------------------------------------------------')
    page_id = None
    # Listing all entities that have the type
    idx = 1
    while True:
        # pull
        entities, total_number, next_page_id = knowledge_client.listing(ART_STYLE_CLASS, page_id=page_id, limit=100)
        pulled_entities: int = len(entities)
        entity_count += pulled_entities
        print('-------------------------------------------------------------------------------------------------------')
        print(f' Page: {page_number} Number of entities: {len(entities)} ({entity_count}/{total_number}) '
              f'Next page id: {next_page_id}')
        print('-------------------------------------------------------------------------------------------------------')
        for e in entities:
            print_entity(e, idx, knowledge_client)
            idx += 1
        if pulled_entities == 0:
            break
        page_number += 1
        page_id = next_page_id
    print()
    # Delete all personal entities for this user; restart paging from the first page
    page_id = None
    while True:
        # pull
        entities, total_number, next_page_id = knowledge_client.listing(THING_OBJECT, page_id=page_id,
                                                                        limit=100)
        pulled_entities = len(entities)
        if pulled_entities == 0:
            break
        delete_uris: List[str] = [e.uri for e in entities]
        print(f'Cleanup. Delete entities: {delete_uris}')
        knowledge_client.delete_entities(uris=delete_uris, force=True)
        page_number += 1
        page_id = next_page_id
    print('-----------------------------------------------------------------------------------------------------------')
Performing Named Entity Linking (NEL) on text and Universal Ink Model.
import argparse
from typing import List, Dict
import urllib3
from knowledge.base.language import EN_US
from knowledge.base.ontology import OntologyPropertyReference, ThingObject, ObjectProperty
from knowledge.nel.base import KnowledgeGraphEntity
from knowledge.nel.engine import WacomEntityLinkingEngine
from knowledge.services.graph import WacomKnowledgeService
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
TEXT: str = "Leonardo da Vinci painted the Mona Lisa."
def print_entity(entity: KnowledgeGraphEntity, list_idx: int, auth_key: str, client: WacomKnowledgeService):
    """
    Printing entity details.
    Parameters
    ----------
    entity: KnowledgeGraphEntity
        Named entity
    list_idx: int
        Index within a list
    auth_key: str
        Authorization key
    client: WacomKnowledgeService
        Knowledge graph client
    """
    # Use the client and key passed in rather than module-level globals
    thing: ThingObject = client.entity(auth_key=auth_key, uri=entity.entity_source.uri)
    print(f'[{list_idx}] - {entity.ref_text} [{entity.start_idx}-{entity.end_idx}] : {thing.uri}'
          f' <{thing.concept_type.iri}>')
    if len(thing.label) > 0:
        print(' | [Labels]')
        for la in thing.label:
            print(f' | |- "{la.content}"@{la.language_code}')
        print(' |')
    # Check the alias list (not the label list) before printing aliases
    if len(thing.alias) > 0:
        print(' | [Alias]')
        for la in thing.alias:
            print(f' | |- "{la.content}"@{la.language_code}')
        print(' |')
    relations: Dict[OntologyPropertyReference, ObjectProperty] = client.relations(auth_key=auth_key, uri=thing.uri)
    if len(thing.data_properties) > 0:
        print(' | [Attributes]')
        for data_property, labels in thing.data_properties.items():
            print(f' | |- {data_property.iri}:')
            for li in labels:
                print(f' | |-- "{li.value}"@{li.language_code}')
        print(' |')
    if len(relations) > 0:
        print(' | [Relations]')
        for re in relations.values():
            print(f' |--- {re.relation.iri}: ')
            print(f' |- [Incoming]: {re.incoming_relations} ')
            print(f' |- [Outgoing]: {re.outgoing_relations}')
    print()
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("-u", "--user", help="External Id of the shadow user within the Wacom Personal Knowledge.",
                        required=True)
    parser.add_argument("-t", "--tenant", help="Tenant Id of the shadow user within the Wacom Personal Knowledge.",
                        required=True)
    parser.add_argument("-i", "--instance", default="https://private-knowledge.wacom.com", help="URL of instance")
    args = parser.parse_args()
    TENANT_KEY: str = args.tenant
    EXTERNAL_USER_ID: str = args.user
    # Wacom personal knowledge REST API Client
    knowledge_client: WacomKnowledgeService = WacomKnowledgeService(
        application_name="Named Entity Linking Knowledge access",
        service_url=args.instance)
    # Wacom Named Entity Linking
    nel_client: WacomEntityLinkingEngine = WacomEntityLinkingEngine(
        service_url=args.instance,
        service_endpoint=WacomEntityLinkingEngine.SERVICE_ENDPOINT
    )
    # Use special tenant for testing: Unit-test tenant
    user_token, refresh_token, expiration_time = nel_client.request_user_token(TENANT_KEY, EXTERNAL_USER_ID)
    entities: List[KnowledgeGraphEntity] = nel_client.\
        link_personal_entities(text=TEXT, language_code=EN_US, auth_key=user_token)
    idx: int = 1
    print('-----------------------------------------------------------------------------------------------------------')
    print(f'Text: "{TEXT}"@{EN_US}')
    print('-----------------------------------------------------------------------------------------------------------')
    for e in entities:
        print_entity(e, idx, user_token, knowledge_client)
        idx += 1
The sample shows how access to entities can be shared with a group of users or the tenant.
import argparse
from typing import List
from knowledge.base.entity import Label, Description
from knowledge.base.language import EN_US, DE_DE, JA_JP
from knowledge.base.ontology import OntologyClassReference, ThingObject
from knowledge.services.base import WacomServiceException
from knowledge.services.graph import WacomKnowledgeService
from knowledge.services.group import GroupManagementService, Group
from knowledge.services.users import UserManagementServiceAPI
# ------------------------------- User credential ----------------------------------------------------------------------
TOPIC_CLASS: OntologyClassReference = OntologyClassReference('wacom', 'core', 'Topic')
def create_entity() -> ThingObject:
"""Create a new entity.
Returns
-------
entity: ThingObject
Entity object
"""
# Main labels for entity
topic_labels: List[Label] = [
Label('Hidden', EN_US),
Label('Versteckt', DE_DE),
Label('隠れた', JA_JP),
]
# Topic description
topic_description: List[Description] = [
Description('Hidden entity to explain access management.', EN_US),
Description('Verstecke Entität, um die Zugriffsteuerung zu erklären.', DE_DE)
]
# Topic
topic_object: ThingObject = ThingObject(label=topic_labels, concept_type=TOPIC_CLASS, description=topic_description)
return topic_object
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("-u", "--user", help="External Id of the shadow user within the Wacom Personal Knowledge.",
required=True)
parser.add_argument("-t", "--tenant", help="Tenant Id of the shadow user within the Wacom Personal Knowledge.",
required=True)
parser.add_argument("-i", "--instance", default='https://private-knowledge.wacom.com',
help="URL of instance")
args = parser.parse_args()
TENANT_KEY: str = args.tenant
EXTERNAL_USER_ID: str = args.user
# Wacom personal knowledge REST API Client
knowledge_client: WacomKnowledgeService = WacomKnowledgeService(application_name="Wacom Knowledge Listing",
service_url=args.instance)
# User Management
user_management: UserManagementServiceAPI = UserManagementServiceAPI(service_url=args.instance)
# Group Management
group_management: GroupManagementService = GroupManagementService(service_url=args.instance)
admin_token, refresh_token, expiration_time = user_management.request_user_token(TENANT_KEY, EXTERNAL_USER_ID)
# Now, we create a user
u1, u1_token, _, _ = user_management.create_user(TENANT_KEY, "u1")
u2, u2_token, _, _ = user_management.create_user(TENANT_KEY, "u2")
u3, u3_token, _, _ = user_management.create_user(TENANT_KEY, "u3")
# Now, let's create an entity
thing: ThingObject = create_entity()
entity_uri: str = knowledge_client.create_entity(thing, auth_key=u1_token)
# Only user 1 can access the entity from cloud storage
my_thing: ThingObject = knowledge_client.entity(entity_uri, auth_key=u1_token)
print(f'User is the owner of {my_thing.owner}')
# Now only user 1 has access to the personal entity
knowledge_client.entity(entity_uri, auth_key=u1_token)
# Try to access the entity
try:
knowledge_client.entity(entity_uri, auth_key=u2_token)
except WacomServiceException as we:
print(f"Expected exception as user 2 has no access to the personal entity of user 1. Exception: {we}")
print(f"Status code: {we.status_code}")
print(f"Response text: {we.service_response}")
# Try to access the entity
try:
knowledge_client.entity(entity_uri, auth_key=u3_token)
except WacomServiceException as we:
print(f"Expected exception as user 3 has no access to the personal entity of user 1. Exception: {we}")
# Now, user 1 creates a group
g: Group = group_management.create_group("test-group", auth_key=u1_token)
# Shares the join key with user 2 and user 2 joins
group_management.join_group(g.id, g.join_key, auth_key=u2_token)
# Share entity with a group
group_management.add_entity_to_group(g.id, entity_uri, auth_key=u1_token)
# Now, user 2 should have access
other_thing: ThingObject = knowledge_client.entity(entity_uri, auth_key=u2_token)
print(f'User 2 is the owner of the thing: {other_thing.owner}')
# Try to access the entity
try:
knowledge_client.entity(entity_uri, auth_key=u3_token)
except WacomServiceException as we:
print(f"Expected exception as user 3 still has no access to the personal entity of user 1. Exception: {we}")
print(f"URL: {we.url}, method: {we.method}")
print(f"Status code: {we.status_code}")
print(f"Response text: {we.service_response}")
print(f"Message: {we.message}")
# Un-share the entity
group_management.remove_entity_to_group(g.id, entity_uri, auth_key=u1_token)
# Now, again no access
try:
knowledge_client.entity(entity_uri, auth_key=u2_token)
except WacomServiceException as we:
print(f"Expected exception as user 2 has no access to the personal entity of user 1. Exception: {we}")
print(f"URL: {we.url}, method: {we.method}")
print(f"Status code: {we.status_code}")
print(f"Response text: {we.service_response}")
print(f"Message: {we.message}")
group_management.leave_group(group_id=g.id, auth_key=u2_token)
# Now, share the entity with the whole tenant
my_thing.tenant_access_right.read = True
knowledge_client.update_entity(my_thing, auth_key=u1_token)
# Now, all users can access the entity
knowledge_client.entity(entity_uri, auth_key=u2_token)
knowledge_client.entity(entity_uri, auth_key=u3_token)
# Finally, clean up
knowledge_client.delete_entity(entity_uri, force=True, auth_key=u1_token)
# Remove users
user_management.delete_user(TENANT_KEY, u1.external_user_id, u1.id, force=True)
user_management.delete_user(TENANT_KEY, u2.external_user_id, u2.id, force=True)
user_management.delete_user(TENANT_KEY, u3.external_user_id, u3.id, force=True)
The samples show how the ontology can be extended and new entities can be added using the added classes.
import argparse
import sys
from typing import Optional, List
from knowledge.base.entity import Label, Description
from knowledge.base.language import EN_US, DE_DE
from knowledge.base.ontology import DataPropertyType, OntologyClassReference, OntologyPropertyReference, ThingObject, \
DataProperty, OntologyContext
from knowledge.services.graph import WacomKnowledgeService
from knowledge.services.ontology import OntologyService
from knowledge.services.session import PermanentSession
# ------------------------------- Constants ----------------------------------------------------------------------------
LEONARDO_DA_VINCI: str = 'Leonardo da Vinci'
CONTEXT_NAME: str = 'core'
# Wacom Base Ontology Types
PERSON_TYPE: OntologyClassReference = OntologyClassReference.parse("wacom:core#Person")
# Demo Class
ARTIST_TYPE: OntologyClassReference = OntologyClassReference.parse("demo:creative#Artist")
# Demo Object property
IS_INSPIRED_BY: OntologyPropertyReference = OntologyPropertyReference.parse("demo:creative#isInspiredBy")
# Demo Data property
STAGE_NAME: OntologyPropertyReference = OntologyPropertyReference.parse("demo:creative#stageName")
def create_artist() -> ThingObject:
"""
Create a new artist entity.
Returns
-------
instance: ThingObject
Artist entity
"""
# Main labels for entity
topic_labels: List[Label] = [
Label('Gian Giacomo Caprotti', EN_US),
]
# Topic description
topic_description: List[Description] = [
Description('Hidden entity to explain access management.', EN_US),
Description('Versteckte Entität, um die Zugriffssteuerung zu erklären.', DE_DE)
]
data_property: DataProperty = DataProperty(content='Salaj',
property_ref=STAGE_NAME,
language_code=EN_US)
# Topic
artist: ThingObject = ThingObject(label=topic_labels, concept_type=ARTIST_TYPE, description=topic_description)
artist.add_data_property(data_property)
return artist
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("-u", "--user", help="External Id of the shadow user within the Wacom Personal Knowledge.",
required=True)
parser.add_argument("-t", "--tenant", help="Tenant API key for the Wacom Personal Knowledge.",
required=True)
parser.add_argument("-i", "--instance", default="https://private-knowledge.wacom.com", help="URL of instance")
args = parser.parse_args()
TENANT_KEY: str = args.tenant
EXTERNAL_USER_ID: str = args.user
# Wacom Ontology REST API Client
ontology_client: OntologyService = OntologyService(service_url=args.instance)
knowledge_client: WacomKnowledgeService = WacomKnowledgeService(
application_name="Ontology Creation Demo",
service_url=args.instance)
# Login as admin user
session: PermanentSession = ontology_client.login(TENANT_KEY, EXTERNAL_USER_ID)
if session.roles != "TenantAdmin":
print(f'User {EXTERNAL_USER_ID} is not an admin user.')
sys.exit(1)
knowledge_client.use_session(session.id)
knowledge_client.ontology_update()
context: Optional[OntologyContext] = ontology_client.context()
if context is None:
# First, create a context for the ontology
ontology_client.create_context(name=CONTEXT_NAME, base_uri=f'demo:{CONTEXT_NAME}')
context_name: str = CONTEXT_NAME
else:
context_name: str = context.context
# Creating a class which is a subclass of a person
ontology_client.create_concept(context_name, reference=ARTIST_TYPE, subclass_of=PERSON_TYPE)
# Object properties
ontology_client.create_object_property(context=context_name, reference=IS_INSPIRED_BY, domains_cls=[ARTIST_TYPE],
ranges_cls=[PERSON_TYPE], inverse_of=None, subproperty_of=None)
# Data properties
ontology_client.create_data_property(context=context_name, reference=STAGE_NAME,
domains_cls=[ARTIST_TYPE],
ranges_cls=[DataPropertyType.STRING],
subproperty_of=None)
# Commit the changes of the ontology. This is very important to confirm changes.
ontology_client.commit(context=context_name)
# Trigger graph service. After the update the ontology is available and the new entities can be created
knowledge_client.ontology_update()
res_entities, next_search_page = knowledge_client.search_labels(search_term=LEONARDO_DA_VINCI,
language_code=EN_US, limit=1000)
leo: Optional[ThingObject] = None
for entity in res_entities:
# Entity must be a person and the label matches with full string
if entity.concept_type == PERSON_TYPE and LEONARDO_DA_VINCI in [la.content for la in entity.label]:
leo = entity
break
artist_student: ThingObject = create_artist()
artist_student_uri: str = knowledge_client.create_entity(artist_student)
# Only link the new artist if the search actually found Leonardo da Vinci
if leo is not None:
    knowledge_client.create_relation(artist_student_uri, IS_INSPIRED_BY, leo.uri)
The sample shows how to use the asynchronous client. Most methods are also available in the asynchronous clients; only ontology management has no asynchronous client.
import argparse
import asyncio
import uuid
from pathlib import Path
from typing import Tuple, List, Dict, Any, Optional
from knowledge.base.entity import Label
from knowledge.base.language import LanguageCode, EN, SUPPORTED_LOCALES, EN_US
from knowledge.base.ontology import ThingObject
from knowledge.ontomapping import load_configuration
from knowledge.ontomapping.manager import wikidata_to_thing
from knowledge.public.relations import wikidata_relations_extractor
from knowledge.public.wikidata import WikidataSearchResult, WikidataThing
from knowledge.public.client import WikiDataAPIClient
from knowledge.services.asyncio.graph import AsyncWacomKnowledgeService
from knowledge.services.asyncio.group import AsyncGroupManagementService
from knowledge.services.asyncio.users import AsyncUserManagementService
from knowledge.services.base import WacomServiceException, format_exception
from knowledge.services.group import Group
from knowledge.services.session import PermanentSession, RefreshableSession
from knowledge.services.users import UserRole, User
def import_entity_from_wikidata(search_term: str, locale: LanguageCode) -> Dict[str, ThingObject]:
"""
Import entity from Wikidata.
Parameters
----------
search_term: str
Search term
locale: LanguageCode
Language code
Returns
-------
things: Dict[str, ThingObject]
Mapping qid to a thing object
"""
search_results: List[WikidataSearchResult] = WikiDataAPIClient.search_term(search_term, locale)
# Load mapping configuration
load_configuration(Path(__file__).parent.parent / 'pkl-cache' / 'ontology_mapping.json')
# Search wikidata for entities
qid_entities: List[WikidataThing] = WikiDataAPIClient.retrieve_entities([sr.qid for sr in search_results])
qid_things: Dict[str, WikidataThing] = {qt.qid: qt for qt in qid_entities}
relations: Dict[str, List[Dict[str, Any]]] = wikidata_relations_extractor(qid_things)
# Now, let's create the things
things: Dict[str, ThingObject] = {}
for res in qid_entities:
wikidata_thing, import_warnings = wikidata_to_thing(res, all_relations=relations,
supported_locales=SUPPORTED_LOCALES,
pull_wikipedia=True,
all_wikidata_objects=qid_things)
things[res.qid] = wikidata_thing
return things
async def user_management_sample(tenant_api_key: str, instance: str) -> Tuple[User, str, str]:
"""
User management sample.
Parameters
----------
tenant_api_key: str
Tenant API key
instance: str
Instance URL
Returns
-------
user: User
User object
user_token: str
User token
refresh_token: str
Refresh token
"""
user_management: AsyncUserManagementService = AsyncUserManagementService(
application_name="Async user management sample",
service_url=instance)
meta_data: dict = {'user-type': 'demo'}
user, user_token, refresh_token, _ = await user_management.create_user(tenant_key=tenant_api_key,
external_id=uuid.uuid4().hex,
meta_data=meta_data,
roles=[UserRole.USER])
return user, user_token, refresh_token
async def clean_up(instance: str, tenant_api_key: str):
"""
Cleanup sample.
Parameters
----------
instance: str
Instance URL
tenant_api_key: str
Tenant API key
"""
user_management: AsyncUserManagementService = AsyncUserManagementService(
application_name="Async user management sample",
service_url=instance)
users: List[User] = await user_management.listing_users(tenant_api_key)
for user in users:
if 'user-type' in user.meta_data and user.meta_data['user-type'] == 'demo':
await user_management.delete_user(tenant_key=tenant_api_key, external_id=user.external_user_id,
internal_id=user.id, force=True)
async def main(external_user_id: str, tenant_api_key: str, instance: str):
"""
Main function for the async sample.
Parameters
----------
external_user_id: str
External id of the shadow user within the Wacom Personal Knowledge.
tenant_api_key: str
Tenant api key of the shadow user within the Wacom Personal Knowledge.
instance: str
URL of instance
"""
async_client: AsyncWacomKnowledgeService = AsyncWacomKnowledgeService(application_name="Async sample",
service_url=instance)
permanent_session: PermanentSession = await async_client.login(tenant_api_key=tenant_api_key,
external_user_id=external_user_id)
"""
The permanent session contains the external user id and the tenant id; thus, it is able to refresh the token and
re-login if needed. The functions check if the token is expired and refresh it if needed. Internally, the token
manager handles the session. There are three different session types:
- Permanent session: The session is refreshed automatically if needed.
- Refreshable session: The session is not refreshed automatically using the refresh token,
but if the session is not used for a day the refresh token is invalidated.
- Timed session: The session only has the authentication token and no refresh token. Thus, it times out after
one hour.
"""
print(f'Service instance: {async_client.service_url}')
print('-' * 100)
print(f'Logged in as {permanent_session.external_user_id} (tenant id: {permanent_session.tenant_id}) ')
is_ten_admin: bool = permanent_session.roles == "TenantAdmin"
print(f'Is tenant admin: {is_ten_admin}')
print('-' * 100)
print(f'Token information')
print('-' * 100)
print(f'Refreshable: {permanent_session.refreshable}')
print(f'Token must be refreshed before: {permanent_session.expiration} UTC')
print(f'Token expires in {permanent_session.expires_in} seconds')
print('-' * 100)
print(f'Creating two users')
print('-' * 100)
# User management sample
user_1, user_token_1, refresh_token_1 = await user_management_sample(tenant_api_key, instance)
print(f'User: {user_1}')
user_2, user_token_2, refresh_token_2 = await user_management_sample(tenant_api_key, instance)
print(f'User: {user_2}')
print('-' * 100)
async_client_user_1: AsyncWacomKnowledgeService = AsyncWacomKnowledgeService(application_name="Async user 1",
service_url=instance)
refresh_session_1: RefreshableSession = await async_client_user_1.register_token(auth_key=user_token_1,
refresh_token=refresh_token_1)
async_client_user_2: AsyncWacomKnowledgeService = AsyncWacomKnowledgeService(application_name="Async sample",
service_url=instance)
await async_client_user_2.register_token(auth_key=user_token_2, refresh_token=refresh_token_2)
"""
Now, let's create some entities.
"""
print('Creation of entities')
print('-' * 100)
things_objects: Dict[str, ThingObject] = import_entity_from_wikidata('Leonardo da Vinci', EN)
created: List[ThingObject] = await async_client_user_1.create_entity_bulk(list(things_objects.values()))
for thing in created:
try:
await async_client_user_2.entity(thing.uri)
except WacomServiceException as we:
print(f'User 2 cannot see entity {thing.uri}.\n{format_exception(we)}')
# Now using the group management service
group_management: AsyncGroupManagementService = AsyncGroupManagementService(application_name="Group management",
service_url=instance)
await group_management.use_session(refresh_session_1.id)
# User 1 creates a group
new_group: Group = await group_management.create_group("sample-group")
for thing in created:
try:
await group_management.add_entity_to_group(new_group.id, thing.uri)
except WacomServiceException as we:
print(f'User 1 cannot add entity {thing.uri} to the group.\n{format_exception(we)}')
await group_management.add_user_to_group(new_group.id, user_2.id)
print(f'User 2 can see the entities now. Let us check with async client 2. '
f'Id of the user: {async_client_user_2.current_session.external_user_id}')
for thing in created:
iter_thing: ThingObject = await async_client_user_2.entity(thing.uri)
label: Optional[Label] = iter_thing.label_lang(EN_US)
print(f'User 2 can see entity {label.content if label else "UNKNOWN"} {iter_thing.uri}.'
f'Ownership: owner flag:={iter_thing.owner}, owner is {iter_thing.owner_id}.')
print('-' * 100)
await clean_up(instance=instance, tenant_api_key=tenant_api_key)
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("-u", "--user", help="External Id of the shadow user within the Wacom Personal Knowledge.",
required=True)
parser.add_argument("-t", "--tenant", help="Tenant API key for the Wacom Personal Knowledge.",
required=True)
parser.add_argument("-i", "--instance", default='https://private-knowledge.wacom.com',
help="URL of instance")
args = parser.parse_args()
asyncio.run(main(args.user, args.tenant, args.instance))
The sample shows how to use the semantic search. There are two types of search:
- Label search
- Document search
The label search is used to find entities based on the label. The document search is used to find documents based on the content.
import argparse
import re
import time
from typing import List, Dict, Any
from knowledge.base.language import EN_US
from knowledge.base.search import LabelMatchingResponse, DocumentSearchResponse, VectorDBDocument
from knowledge.services.search import SemanticSearchClient
def clean_text(text: str, max_length: int = -1) -> str:
"""
Clean text from new lines and multiple spaces.
Parameters
----------
text: str
Text to clean.
max_length: int [default=-1]
Maximum length of the cleaned text. If the length is -1, then the text is not truncated.
Returns
-------
str
Cleaned text.
"""
# First, remove new lines
text = text.strip().replace('\n', ' ')
# Then remove multiple spaces
text = re.sub(r'\s+', ' ', text)
if 0 < max_length < len(text):
return text[:max_length] + '...'
return text
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("-u", "--user", help="External Id of the shadow user within the Wacom Personal Knowledge.",
required=True)
parser.add_argument("-t", "--tenant", help="Tenant API key for the Wacom Personal Knowledge.",
required=True)
parser.add_argument("-i", "--instance", default="https://private-knowledge.wacom.com", help="URL of instance")
args = parser.parse_args()
client: SemanticSearchClient = SemanticSearchClient(service_url=args.instance)
session = client.login(args.tenant, args.user)
max_results: int = 10
labels_count: int = client.count_documents(locale=EN_US)
print(f"Tenant ID: {client.current_session.tenant_id} | Labels count: {labels_count} for [locale:={EN_US}]")
t0: float = time.time()
results: LabelMatchingResponse = client.labels_search(query="Leonardo Da Vinci", locale=EN_US,
max_results=max_results)
t1: float = time.time()
if len(results.results) > 0:
print("=" * 120)
for idx, res in enumerate(results.results):
print(f"{idx + 1}. {res.label} | Relevance: ({res.score:.2f}) | URI: {res.entity_uri}")
all_labels: List[VectorDBDocument] = client.retrieve_labels(EN_US, results.results[0].entity_uri)
print("=" * 120)
print(f"Labels for best match: {results.results[0].entity_uri}")
for idx, label in enumerate(all_labels):
print(f"{idx + 1}. {label.content}")
print("=" * 120)
print(f"Time: {(t1 - t0) * 1000:.2f} ms")
print("=" * 120)
document_count: int = client.count_documents(locale=EN_US)
print(f"Document count: {document_count} for [locale:={EN_US}]")
t2: float = time.time()
document_results: DocumentSearchResponse = client.document_search(query="Leonardo Da Vinci artwork", locale=EN_US,
max_results=max_results)
t3: float = time.time()
print("=" * 120)
if len(document_results.results) > 0:
for idx, res in enumerate(document_results.results):
print(f"{idx + 1}. URI: {res.content_uri} | Relevance: {res.score:.2f} | Chunk:"
f"\n\t{clean_text(res.content_chunk, max_length=100)}")
print(f"\n All document chunks for best match: {document_results.results[0].content_uri}")
print("=" * 120)
# If you need all document chunks, you can retrieve them using the content_uri.
best_match_uri: str = document_results.results[0].content_uri
chunks: List[VectorDBDocument] = client.retrieve_documents_chunks(locale=EN_US, uri=best_match_uri)
metadata: Dict[str, Any] = document_results.results[0].metadata
for idx, chunk in enumerate(chunks):
print(f"{idx + 1}. {clean_text(chunk.content)}")
print("\n\tMetadata:\n\t---------")
for key, value in metadata.items():
print(f"\t- {key}: {clean_text(value, max_length=100) if isinstance(value, str) else value }")
print("=" * 120)
print(f"Time: {(t3 - t2) * 1000:.2f} ms")
print("=" * 120)
The InkServices client provides access to Wacom's ink processing pipeline, covering handwriting recognition (HWR),
math recognition, Named Entity Linking on ink content, and format conversion.
All operations accept a Universal Ink Model (UIM) binary file as input.
perform_ink_to_text enriches the UIM with recognition results embedded in the model itself.
perform_ink_to_text_plain is a convenience wrapper that returns only the recognized plain text string.
from pathlib import Path
from knowledge.base.ink import HWRMode, Priority, Provider, Schema
from knowledge.base.language import EN_US
from knowledge.services.ink import InkServices
client: InkServices = InkServices(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")
uim_content: bytes = Path("uims/text/en_US/text.uim").read_bytes()
# Enriched UIM with recognition results
enriched_uim: bytes = client.perform_ink_to_text(
content=uim_content,
locale=EN_US,
hwr_mode=HWRMode.TEXT_MODE,
priority=Priority.LOWEST,
provider=Provider.MYSCRIPT,
schema=Schema.SEGMENTATION_V03,
)
# Plain recognized text
text: str = client.perform_ink_to_text_plain(
content=uim_content,
locale=EN_US,
hwr_mode=HWRMode.TEXT_MODE,
priority=Priority.LOWEST,
provider=Provider.MYSCRIPT,
schema=Schema.SEGMENTATION_V03,
)
print(f"Recognized text: {text!r}")
perform_ink_to_math runs math recognition on a UIM containing handwritten mathematical expressions
and returns an enriched UIM with the recognition results.
from pathlib import Path
from knowledge.base.ink import Priority, Provider, Schema
from knowledge.services.ink import InkServices
client: InkServices = InkServices(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")
math_uim: bytes = Path("uims/math/en_US/math.uim").read_bytes()
math_enriched: bytes = client.perform_ink_to_math(
content=math_uim,
schema=Schema.MATH_V06,
provider=Provider.MYSCRIPT,
priority=Priority.LOWEST,
)
print(f"Math-enriched UIM: {len(math_enriched):,} bytes")
perform_named_entity_linking links recognized text spans in an already HWR-enriched UIM to entities
in the personal knowledge graph.
Pass the output of perform_ink_to_text as input.
from knowledge.base.language import EN_US
from knowledge.services.ink import InkServices
client: InkServices = InkServices(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")
# enriched_uim is the output of perform_ink_to_text (see the handwriting recognition example)
nel_uim: bytes = client.perform_named_entity_linking(content=enriched_uim, locale=EN_US)
print(f"NEL-enriched UIM: {len(nel_uim):,} bytes")
convert_to exports a UIM to PNG, JPG, or SVG. convert_to_pdf exports to PDF in either vector
or raster mode.
from pathlib import Path
from knowledge.base.ink import ExportFormat, PDFType
from knowledge.services.ink import InkServices
client: InkServices = InkServices(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")
uim_content: bytes = Path("uims/text/en_US/text.uim").read_bytes()
# Raster formats
png_bytes: bytes = client.convert_to(uim_content, ExportFormat.PNG)
jpg_bytes: bytes = client.convert_to(uim_content, ExportFormat.JPG)
# Vector format
svg_bytes: bytes = client.convert_to(uim_content, ExportFormat.SVG)
# PDF — vector or raster rendering
pdf_vector: bytes = client.convert_to_pdf(uim_content, PDFType.VECTOR)
pdf_raster: bytes = client.convert_to_pdf(uim_content, PDFType.RASTER)
Path("output.png").write_bytes(png_bytes)
Path("output.svg").write_bytes(svg_bytes)
Path("output.pdf").write_bytes(pdf_vector)
Run the full ink services sample:
python samples/ink_services.py --user <user-id> --tenant <tenant-key>
The ContentClient (sync, in knowledge.services.content) and AsyncContentClient (async, in knowledge.services.asyncio.content) provide access to the Wacom Content API.
Content items are binary blobs — images, PDFs, ink files, audio, and so on — attached to an entity in the knowledge graph by its URI.
The Content API enforces only the mechanical rules (access rights on the owning entity, MIME-type integrity on file replacement, and soft/hard delete primitives); tenant- and product-specific policy belongs in the business layer above (see Business Logic Recommendations).
from pathlib import Path
from knowledge.services.content import ContentClient
client: ContentClient = ContentClient(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")
file_bytes: bytes = Path("report.pdf").read_bytes()
content_id: str = client.upload_content(
uri="wacom:entity:abc-123",
file_content=file_bytes,
filename="report.pdf",
mimetype="application/pdf",
)
# All content items attached to an entity
items = client.list_content(uri="wacom:entity:abc-123")
for item in items:
print(f"{item.id} ({item.mime_type}) tags={item.tags} deleted={item.is_deleted}")
# Download the raw file
file_bytes_back: bytes = client.download_content(content_id)
# Metadata only (no blob)
info = client.get_content_info(content_id)
print(info.date_added, info.date_modified, info.metadata)
# Patch tags and metadata in a single call
client.update_content(
content_id=content_id,
tags=["report", "Q4-2026"],
metadata={"author": "ada.lovelace", "status": "reviewed"},
)
# Replace just the metadata
client.update_content_metadata(content_id, metadata={"status": "archived"})
# Replace just the tags
client.update_content_tags(content_id, tags=["report", "archived"])
# Replace the stored file. The replacement must have the same MIME type;
# otherwise the service returns 409 Conflict.
client.update_content_file(
content_id=content_id,
file_content=Path("report-v2.pdf").read_bytes(),
filename="report-v2.pdf",
)
force=False (the default) performs a soft delete: the item is flagged with isDeleted=true but the blob and metadata are kept.
force=True performs a hard delete, removing the blob from premium storage irreversibly.
Soft-deleted items are returned by list_content(..., show_deleted=True) only when called by a tenant admin.
# Soft delete (reversible)
client.delete_content(content_id)
# Hard delete (irreversible; gate this in the business layer — see below)
client.delete_content(content_id, force=True)
# Cascade: delete every content item attached to an entity
client.delete_all_content(uri="wacom:entity:abc-123")
import asyncio
from pathlib import Path
from knowledge.services.asyncio.content import AsyncContentClient
async def main() -> None:
client: AsyncContentClient = AsyncContentClient(
service_url="https://private-knowledge.wacom.com",
)
await client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")
try:
content_id: str = await client.upload_content(
uri="wacom:entity:abc-123",
file_content=Path("report.pdf").read_bytes(),
filename="report.pdf",
)
items = await client.list_content(uri="wacom:entity:abc-123")
for item in items:
print(item.id, item.mime_type)
finally:
await client.close_all_sessions()
asyncio.run(main())
list_content, get_content_info, and update_content return ContentObject instances (knowledge.base.content):
| Field | Type | Description |
|---|---|---|
| id | str | Unique identifier returned at upload time. |
| mime_type | str | MIME type of the stored file. |
| tags | List[str] | Tags attached to the content item. |
| metadata | Dict[str, str] | Key-value metadata. |
| date_added | datetime | UTC creation timestamp. |
| date_modified | datetime | UTC last-modified timestamp. |
| is_deleted | bool | True for soft-deleted items returned via show_deleted=True. |
Run the full content API sample:
python samples/content_handling.py \
--tenant <TENANT_API_KEY> \
--user <EXTERNAL_USER_ID> \
--file /path/to/original.png \
--update-file /path/to/replacement.png
This section captures guidance for the business-level REST API that sits on top of the Content API.
The Content API deliberately exposes broad primitives (Read / Write / Delete rights, force, showDeleted, MIME-type integrity) so that tenant- and product-specific policy can be implemented once, in the business layer, without changing the core.
The recommendations below are non-normative defaults. A given deployment may choose a stricter or looser policy; the Content API will honor whatever the business layer forwards.
The two layers have different jobs:
| Layer | Responsibility |
|---|---|
| Content API (this SDK) | Mechanical correctness: storage, rights enforcement on the owning entity, soft/hard delete primitives, MIME-type integrity, audit timestamps. |
| Business REST API (upstream) | Product and tenant policy: who may hard-delete, restore flow, trash UX, retention windows, GDPR erasure, quotas, rate limits, virus scanning, derivative generation (thumbnails, text extraction), notifications. |
Keeping policy out of the Content API means a tenant can change its rules (for example, "Delete right means soft delete only, hard delete is admin-only") by changing the business layer without touching the data plane.
The Content API grants force=true to any caller holding the Delete right on the owning entity.
The business API should typically not expose this directly.
Recommended default:
| Caller | Soft delete (force=false) | Hard delete (force=true) |
|---|---|---|
| Content uploader | ✅ | ✅ |
| Entity owner | ✅ | ✅ |
| Group member with Delete right on a Shared entity | ✅ | ❌ (defer to owner/admin) |
| Tenant user with Delete right on a Public entity | ✅ | ❌ (defer to owner/admin) |
| TenantAdmin | ✅ | ✅ |
Rationale: soft delete is reversible and its blast radius is bounded; hard delete destroys the blob and its history.
A careless or hostile collaborator should not be able to permanently erase someone else's uploaded work.
The business layer should therefore translate an ordinary "Delete" action into DELETE /content/{id} (no force), and only forward force=true when the caller is the uploader, the entity owner, or a TenantAdmin.
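The gating rule above could be expressed in the business layer roughly as follows. This is a minimal sketch and not part of the SDK: the Caller dataclass, its fields, and the helper functions are hypothetical names chosen for illustration, and how a deployment resolves uploader or owner status is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    """Hypothetical summary of the calling user, resolved by the business layer."""
    is_uploader: bool = False
    is_entity_owner: bool = False
    is_tenant_admin: bool = False


def may_hard_delete(caller: Caller) -> bool:
    """Only the uploader, the entity owner, or a TenantAdmin may hard-delete."""
    return caller.is_uploader or caller.is_entity_owner or caller.is_tenant_admin


def resolve_force(caller: Caller, requested_force: bool) -> bool:
    """Return the force value to forward to DELETE /content/{id}.

    A hard-delete request from any other caller is downgraded to a soft delete.
    """
    return requested_force and may_hard_delete(caller)
```

Downgrading to a soft delete is one possible design; rejecting the request with an error is an equally valid choice for a stricter deployment.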
showDeleted=true is honored only for TenantAdmins at the core. To build a self-service "Trash" feature, the business layer should:
- Call GET /content?uri=…&showDeleted=true with an admin or service token.
- Filter the result to items whose uploader (or entity owner) matches the calling user.
- Return that filtered list as the user's trash.
- Offer a Restore action that flips isDeleted back to false. A dedicated POST /content/{id}/restore primitive in the Content API is recommended; until it exists, the business layer has no lossless way to restore a soft-deleted item.
- Offer a Delete permanently action that issues DELETE /content/{id}?force=true, subject to the gating above.
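The filtering step of the trash flow might look like the sketch below. It assumes the business layer keeps its own mapping from content id to uploader (uploader_by_content_id is a hypothetical structure, since the listing itself is not documented to carry an uploader field) and that the admin-token listing has already been decoded into dictionaries with id and isDeleted keys.

```python
from typing import Dict, Iterable, List


def build_trash(items: Iterable[dict], uploader_by_content_id: Dict[str, str], user_id: str) -> List[dict]:
    """Keep only soft-deleted items that were uploaded by the given user."""
    return [
        item for item in items
        if item.get("isDeleted") and uploader_by_content_id.get(item["id"]) == user_id
    ]


# Example data standing in for the decoded admin-token listing response.
listing = [
    {"id": "c1", "isDeleted": True},
    {"id": "c2", "isDeleted": True},
    {"id": "c3", "isDeleted": False},
]
uploaders = {"c1": "alice", "c2": "bob", "c3": "alice"}
trash = build_trash(listing, uploaders, "alice")
print([item["id"] for item in trash])
```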
Soft-deleted items still occupy premium blob storage. The business layer should enforce a retention policy — for example, automatically hard-deleting soft-deleted items after N days — by running a scheduled job that:
- Lists soft-deleted items per tenant via GET /content?…&showDeleted=true.
- Selects items whose dateModified is older than the retention window.
- Issues DELETE /content/{id}?force=true for each.
Recommended defaults: 30 days for user-initiated soft deletes, 7 days for cascaded deletes originating from an entity removal.
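The selection step of such a retention job can be sketched as follows. The item dictionaries and the cascaded flag are assumptions made for illustration; in particular, how a deployment records that a soft delete originated from an entity removal is not defined by the Content API.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional

USER_RETENTION = timedelta(days=30)    # user-initiated soft deletes
CASCADE_RETENTION = timedelta(days=7)  # cascaded deletes from an entity removal


def select_expired(items: Iterable[dict], now: Optional[datetime] = None) -> List[str]:
    """Return the ids of soft-deleted items whose retention window has passed."""
    now = now or datetime.now(timezone.utc)
    expired: List[str] = []
    for item in items:
        if not item["isDeleted"]:
            continue
        # "cascaded" is a hypothetical marker maintained by the business layer.
        window = CASCADE_RETENTION if item.get("cascaded") else USER_RETENTION
        if now - item["dateModified"] > window:
            expired.append(item["id"])
    return expired
```

Each returned id would then be passed to DELETE /content/{id}?force=true by the scheduled job.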
When a user exercises a right-to-erasure request, soft delete is insufficient — the content must actually leave premium storage. The business layer should:
- Enumerate all entities owned by the subject.
- For each entity, call DELETE /content?uri={entityUri} (cascading to every attached content item).
- Follow up with DELETE /content/{id}?force=true on any remaining items returned under showDeleted=true to guarantee hard deletion.
- Record the operation in an auditable log kept outside the knowledge graph.
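The erasure steps above can be sketched as a small orchestration function. The four callables are hypothetical hooks that a business layer would wire to its own HTTP client and audit store; they are injected here so that the control flow can be shown without assuming any particular client API.

```python
from typing import Callable, Iterable, List


def erase_content(
    entity_uris: Iterable[str],
    delete_all_content: Callable[[str], None],
    list_remaining_ids: Callable[[str], List[str]],
    hard_delete: Callable[[str], None],
    audit: Callable[[str, str], None],
) -> int:
    """Run the erasure steps for every entity and return the number of hard deletes issued."""
    hard_deleted = 0
    for uri in entity_uris:
        # Cascade over every content item attached to the entity.
        delete_all_content(uri)
        # Anything still listed with showDeleted=true is removed with force=true.
        for content_id in list_remaining_ids(uri):
            hard_delete(content_id)
            hard_deleted += 1
            audit("hard-delete", content_id)
        audit("erase-entity-content", uri)
    return hard_deleted
```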
PUT /content/{id}/file returns 409 Conflict when the replacement file's MIME type differs from the stored one.
The business layer should surface this to the user as "upload a file of the same type, or create a new content item instead" rather than retrying blindly.
If a true type change is intended, the correct pattern is: upload a new content item via POST /content/{uri}, copy over the tags and metadata, then delete the old item.
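A business layer could avoid a predictable 409 by comparing MIME types before forwarding the replacement. The sketch below guesses the type from the file name with the standard library, which is an assumption: the service's own check remains authoritative, and replacement_action is a hypothetical helper name.

```python
import mimetypes


def replacement_action(stored_mime_type: str, replacement_filename: str) -> str:
    """Decide how to handle a replacement file before calling the Content API.

    Returns "replace" when the guessed type matches the stored one,
    "create-new" when it differs, and "unknown" when no type can be guessed.
    """
    guessed, _ = mimetypes.guess_type(replacement_filename)
    if guessed is None:
        return "unknown"
    if guessed == stored_mime_type:
        return "replace"
    return "create-new"
```

For "create-new", the flow described above applies: upload a new content item, copy over tags and metadata, then delete the old item.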
The Content API does not enforce per-tenant storage quotas, upload rate limits, virus scanning, or content-type whitelists. These belong to the business layer and should run before the request is forwarded to the Content API, so that rejected uploads never touch premium storage.
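A pre-upload check of this kind might be sketched as follows. UploadPolicy, its fields, and reject_reason are hypothetical names, and the limits are placeholders that a deployment would choose itself; virus scanning and rate limiting are omitted because they depend on external services.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class UploadPolicy:
    """Hypothetical per-tenant limits held by the business layer."""
    max_file_bytes: int
    tenant_quota_bytes: int
    allowed_mime_types: FrozenSet[str]


def reject_reason(policy: UploadPolicy, used_bytes: int, file_size: int, mime_type: str) -> Optional[str]:
    """Return None if the upload may be forwarded, otherwise a short rejection reason."""
    if mime_type not in policy.allowed_mime_types:
        return "content type not allowed"
    if file_size > policy.max_file_bytes:
        return "file too large"
    if used_bytes + file_size > policy.tenant_quota_bytes:
        return "tenant quota exceeded"
    return None
```

Only requests for which reject_reason returns None would be forwarded to the Content API.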
Thumbnails, extracted text for full-text search, embeddings for vector search, and similar derivatives should be produced by the business layer (or a downstream worker triggered by it) rather than being stored as first-class content items — unless they are themselves user-facing.
When derivatives are stored via this API, tag them (e.g. derivative:thumbnail) so that lifecycle operations can cascade cleanly.
dateAdded, dateModified, and isDeleted provide a minimal audit surface.
For a full audit trail (who uploaded, who deleted, who restored, from which IP, under which business action), the business layer should emit audit events to a separate store at the moment it calls the Content API, rather than relying on the core timestamps alone.
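An audit event emitted alongside a Content API call could be as simple as the JSON record sketched below. The field names and build_audit_event are assumptions for illustration; the actual schema and the store that receives the event are up to the deployment.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_audit_event(action: str, content_id: str, actor_id: str, source_ip: str,
                      business_action: str, timestamp: Optional[datetime] = None) -> str:
    """Serialize one audit record; the caller writes it to a separate audit store."""
    moment = timestamp or datetime.now(timezone.utc)
    return json.dumps({
        "action": action,                    # e.g. upload, soft-delete, hard-delete, restore
        "contentId": content_id,
        "actorId": actor_id,
        "sourceIp": source_ip,
        "businessAction": business_action,   # the user-facing operation that triggered the call
        "timestamp": moment.isoformat(),
    }, sort_keys=True)
```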
The IndexManagementClient extends SemanticSearchClient with administrative operations for the
vector search index. It allows operators to inspect index health, stream all indexed documents,
refresh or optimize the index, and delete individual documents by ID.
```python
from knowledge.base.index import HealthResponse
from knowledge.base.language import EN_US
from knowledge.services.index_management import IndexManagementClient

client: IndexManagementClient = IndexManagementClient(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")

health: HealthResponse = client.index_health(index_mode="document", locale=EN_US)
print(f"Healthy: {health.healthy}")
print(f"Cluster status: {health.condition.cluster.status} | Nodes: {health.condition.cluster.number_of_nodes}")
for shard in health.condition.shards:
    print(f"  Shard [{shard.shard_id}] state={shard.shard_state} docs={shard.num_docs} size={shard.store_size}")
```

`iterate_documents` streams all indexed documents as NDJSON without loading everything into memory,
making it suitable for large indices.
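To illustrate what streaming NDJSON means in general, here is a minimal sketch of a line-by-line parser. It is not the library's implementation; `iter_ndjson` and the sample field names are made up for this example.

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line, holding only one line in memory at a time."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# An in-memory stream stands in for the HTTP response body.
sample = io.StringIO(
    '{"id": "a", "content": "first chunk"}\n'
    '\n'
    '{"id": "b", "content": "second chunk"}\n'
)
ids = [doc["id"] for doc in iter_ndjson(sample)]
```

Because each line is a complete JSON document, the consumer can start processing before the response has finished. With the library, this pattern sits behind `iterate_documents`, as in the following example.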
```python
from knowledge.base.index import IndexDocument
from knowledge.base.language import EN_US
from knowledge.services.index_management import IndexManagementClient

client: IndexManagementClient = IndexManagementClient(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")

for doc in client.iterate_documents(index_mode="document", locale=EN_US):
    doc: IndexDocument
    print(f"ID: {doc.id} | URI: {doc.content_uri} | Locale: {doc.meta.locale}")
    print(f"  Created: {doc.meta.creation} | Chunk: {doc.meta.chunk_index}")
    print(f"  Preview: {doc.content[:100].strip()}...")
```

```python
from knowledge.base.language import EN_US
from knowledge.services.index_management import IndexManagementClient

client: IndexManagementClient = IndexManagementClient(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")

# Make recent writes searchable immediately
client.refresh_index(index_mode="document", locale=EN_US)

# Remove a specific document from the index
client.delete_document_by_id(index_mode="document", locale=EN_US, document_ids=["<doc-id>"])

# Optimise storage after bulk deletions
client.force_merge_index(index_mode="document", locale=EN_US)
```

Run the full index management sample:
```bash
python samples/index_management.py --user <user-id> --tenant <tenant-key>
```

The `QueueManagementClient` exposes monitoring information for the message queues that back the
asynchronous processing pipeline of the semantic search service.
It is a read-only observability client: it does not enqueue or dequeue messages.
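A common use of a read-only queue client is to wait until the pipeline has drained before running a search. The following sketch assumes only a zero-argument callable that reports whether the queue is empty; `wait_until_empty` is a hypothetical helper, and the timeout and polling values are arbitrary.

```python
import time
from typing import Callable


def wait_until_empty(
    is_empty: Callable[[], bool],
    timeout_s: float = 60.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until is_empty() returns True; return False if the timeout elapses first."""
    deadline = clock() + timeout_s
    while True:
        if is_empty():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


# Simulated queue that reports empty on the third poll.
answers = iter([False, False, True])
drained = wait_until_empty(lambda: next(answers), timeout_s=5.0, poll_interval_s=0.0)
```

With the client from the samples that follow, the callable could be `lambda: client.queue_is_empty("my-queue")`. The `sleep` and `clock` parameters exist only to make the helper testable without real waiting.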
```python
from typing import List

from knowledge.base.queue import QueueMonitor, QueueNames
from knowledge.services.queue_management import QueueManagementClient

client: QueueManagementClient = QueueManagementClient(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")

# Names only
queue_names: QueueNames = client.list_queue_names()
print(queue_names.names)

# Full monitoring information for every queue
queues: List[QueueMonitor] = client.list_queues()
for queue in queues:
    print(f"{queue.name} | state={queue.state} | messages={queue.messages} | consumers={queue.consumers}")
```

```python
from knowledge.base.queue import QueueCount, QueueMonitor
from knowledge.services.queue_management import QueueManagementClient

client: QueueManagementClient = QueueManagementClient(service_url="https://private-knowledge.wacom.com")
client.login(tenant_api_key="<tenant-key>", external_user_id="<user-id>")

queue_name: str = "my-queue"
is_empty: bool = client.queue_is_empty(queue_name)
size: QueueCount = client.queue_size(queue_name)
monitor: QueueMonitor = client.queue_monitor_information(queue_name)

print(f"Empty: {is_empty}")
print(f"Size : {size.count} messages")
print(f"State: {monitor.state} | Ready: {monitor.messages_ready} | Unacknowledged: {monitor.messages_unacknowledged}")
if monitor.message_stats:
    print(f"Stats: publish={monitor.message_stats.publish} deliver={monitor.message_stats.deliver} "
          f"ack={monitor.message_stats.ack}")
```

Run the full queue management sample:
```bash
python samples/queue_management.py --user <user-id> --tenant <tenant-key>
```

| Package | Version | Description |
|---|---|---|
| `aiohttp` | Latest | Async HTTP client/server |
| `requests` | >=2.32.0, <3.0.0 | HTTP library |
| `PyJWT` | >=2.10.1, <3.0.0 | JSON Web Token |
| `rdflib` | >=7.1.0 | RDF library |
| `orjson` | >=3.10.0 | Fast JSON library |
| `cachetools` | >=5.3.0 | Caching utilities |
| `loguru` | 0.7.3 | Logging |
| `tqdm` | >=4.65.0 | Progress bars |
Install with: pip install personal-knowledge-library[dev]
| Package | Purpose |
|---|---|
| `pytest` | Testing framework |
| `pytest-asyncio` | Async test support |
| `pytest-cov` | Coverage reporting |
| `mypy` | Type checking |
| `pylint` | Code analysis |
| `black` | Code formatting |
| `flake8` | Linting |
- Clone the repository:

  ```bash
  git clone https://github.com/Wacom-Developer/personal-knowledge-library.git
  cd personal-knowledge-library
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```

- Install the package with development dependencies:

  ```bash
  pip install -e ".[dev]"
  ```

Run the full test suite:

```bash
pytest
```

Run tests with coverage report:

```bash
pytest --cov=knowledge --cov-report=term-missing
```

Run specific test files:

```bash
pytest tests/test_ontology_unit.py -v
```

Run the type checker, linters, and formatter:

```bash
mypy knowledge --ignore-missing-imports
pylint knowledge
black knowledge tests
flake8 knowledge
```

You can find more detailed technical documentation here.
API documentation is available in the docs/knowledge directory.
Contribution guidelines are still a work in progress.



