Skip to content
Patrick Latimer edited this page Jan 14, 2026 · 1 revision

Patrick's thoughts on Dataverse APIs

In theme with all the architecture discussions happening recently, I think it's worth formalizing the (technical and organizational) plan for Dataverse integration, as it is important for controlling lab operations, and involves many teams, stakeholders, and moving parts.

In setting up the waterlog dataverse infrastructure we've learned a lot about the power platform, and I think we now have sufficient fluency to do a solid architecture. We should clearly define system requirements and design considerations and generate an architecture diagram, defined processes for usage/maintenance/additions, and a project management plan.

Considerations:

  • Enable flexibility for individual use cases while remaining simple and maintainable.
  • Don't want to have to update metadata-service every time a table schema changes
  • Users should be able to contribute to the system
  • End-user code should know about the schema of dataverse data.
  • Can/should we integrate aind-metadata-schema prevalidation somehow?
  • Authentication should be easy (open reads, writes only with svc accounts, different roles)

Objects (in the abstract):

  • Database Tables
  • API Logic (Action <-> database, e.g. update multiple tables to change mouse baseline weight)
  • Data models (what fields are in each table)
  • API Models (what inputs are required to perform operations)
  • API Client (python library facilitating access)

Definitions

  • PowerPlatform: Microsoft low-code infrastructure framework encompassing the following:
    • Dataverse: SQL database with excel-esque web UI, an OData REST API, as well as
    • PowerApps: Web UI construction framework (built on dataverse tables)
    • PowerBI: Web UI framework for reports and graphs
    • PowerAutomate: uses Flows to perform actions
      • Flows can be triggered by various events such as "new row added to table", a scheduled time, or a HTTP post to the flows' URL
      • Flows can take many actions, such as sending emails or updating dataverse tables
      • A Custom Connector can generate a OpenApi specification for a collection of API flows.
  • Microsoft Entra (fka MS Azure AD): Identity and access management service
  • REST: (Representational State Transfer), ubiquitous style for building APIs using HTTP methods (GET/POST...)
  • CRUD (Create/Read/Update/Delete): common acronym for database actions often enabled by an API
  • OData (Open Data Protocol): A protocol for building REST APIs to perform CRUD operations.
  • OpenApi: Standard, language-agnostic specification for HTTP APIs.
    • Many languages have tools for autogenerating clients based on openapi spec files
  • FastApi: A python framework for making REST APIs. aind-metadata-service is built on FastAPI

API Examples

  • aind-slims-api (client with hand-written data models to match slims tables)
  • lims2 (Some REST API (example), mostly just direct table CRUD)
  • PowerAutomate Waterlog API Flows (one flow per endpoint, flows always run with flow-owner's permissions, models in flows, autogenerated client)
    • Individual flows for each write operation (Add weight record, Add water record, Change mouse to water restricted, ...)
    • Flows always run with the permissions of the flow-owner.
    • A python client is automatically generated from the custom connector openapi spec
  • Bruno's Session dataverse tables (lab SW uses odata client to CRUD straight to dataverse)

Examples in detail

For API endpoints that are more complicated than just a simple read or write (for instance those that need to collate data from multiple tables), that logic needs to be written somewhere. It can either be written as a PowerAutomate Flow or in a Python function.

Flow for posting a weight record

This link should take you to the flow, it's named "API - Post weight record"

Here's a screenshot of the whole flow (sorry it's so zoomed out): powerautomate_flow_example

Python function for posting a weight record

Here is the equivalent python function performing the same logic as the flow

def post_weight_record(
    mouse_id: str,
    weight: float,
    date_time: str,
    user_email: str,
    workstation: str,
    software_source: str,
    software_version: str,
    operator_notes: Optional[str] = None
) -> None:
    mouse = db.fetch(table="mice", mouse_id=mouse_id)

    restriction = db.fetch(table="water_restrictions", mouse_id=mouse_id)

    user = db.fetch(table='users', email=user_email)

    data = {
        "date_time": date_time,
        "mouse_id": mouse_id,
        "operator": user.name,
        "water_restriction_record": restriction.id,
        "water_restriction_status": restriction.status,
        "weight": weight,
        "operator_notes": operator_notes,
        "software_source": software_source,
        "software_version": software_version,
        "workstation": workstation,
    }

    new_record = db.add(table='water_weight_records', data=data)

    return new_record