feat: LLM-powered model building tool (`plugboard go`) #224

toby-coleman wants to merge 13 commits into `main`.

Conversation
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed: this pull request introduces a significant new feature, an interactive, AI-powered model building experience for Plugboard.
Code Review
This pull request introduces `plugboard go`, an interactive LLM-powered model building tool, featuring a new CLI command, a Textual-based TUI, and integration with the Copilot SDK. While innovative, it contains critical security vulnerabilities: the agent automatically approves all tool calls, and the tools allow arbitrary file access and code execution via malicious Plugboard models, opening the door to Remote Code Execution (RCE). Fixing this requires explicit user confirmation for tool execution and strict sanitization of tool inputs. In addition, a critical bug in how the agent's system prompt is loaded will likely cause the feature to fail when installed as a package, several unit tests for the new TUI are outdated and failing, and error handling could be improved by replacing broad `except Exception` blocks with more specific exceptions. Addressing these points will significantly improve the robustness, maintainability, and security of this new feature.
```python
system_message={
    "content": system_prompt,
},
on_permission_request=PermissionHandler.approve_all,
```
The use of `PermissionHandler.approve_all` automatically grants the LLM permission to execute any registered tool without user intervention. When combined with tools that can execute code or access the file system (like `run_plugboard_model`), this creates a significant security risk: an attacker could use prompt injection to trick the LLM into executing malicious code on the user's machine. It is highly recommended to implement a permission handler that requires explicit user confirmation for sensitive tool executions.
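One way to address this, sketched below: auto-approve a small allow-list of read-only tools and prompt the user for everything else. The handler name, the `SAFE_TOOLS` contents, and the `(tool_name, arguments)` signature are illustrative assumptions, not the actual Copilot SDK callback API, which may differ.

```python
# Hypothetical permission handler: auto-approve known read-only tools,
# ask the user before running anything with side effects.
SAFE_TOOLS = {"read_plugboard_docs"}  # illustrative allow-list

def confirm_tool_call(tool_name: str, arguments: dict) -> bool:
    """Return True only for safe tools or explicit user consent."""
    if tool_name in SAFE_TOOLS:
        return True
    answer = input(f"Allow tool '{tool_name}' with args {arguments}? [y/N] ")
    return answer.strip().lower() == "y"
```

A handler like this would then be passed as `on_permission_request=confirm_tool_call` instead of `PermissionHandler.approve_all`, keeping the human in the loop for code-executing tools.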
```python
async def run_plugboard_model(params: RunModelParams) -> str:
    yaml_path = Path(params.yaml_path).resolve()
```
The `run_plugboard_model` tool takes a `yaml_path` directly from the LLM and resolves it without verifying that it resides within a safe or expected directory. This allows for arbitrary file read (of YAML files) and, more critically, Remote Code Execution (RCE), because the tool subsequently builds and runs a Plugboard model from the specified file. Since `ProcessBuilder.build` uses `pydoc.locate` and the tool adds the file's directory to `sys.path`, an attacker can provide a malicious Python file alongside a YAML file to execute arbitrary code. You should restrict `yaml_path` to a designated safe directory and ensure the resolved path is contained within it.
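A minimal containment check using only the standard library; `resolve_within` and the base-directory convention are hypothetical names for illustration, not part of the current code:

```python
from pathlib import Path

def resolve_within(base_dir: Path, requested: str) -> Path:
    """Resolve `requested` relative to `base_dir`, rejecting escapes.

    Resolving both paths first also defeats `..` segments and symlinks.
    """
    base = base_dir.resolve()
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to: Python 3.9+
        raise ValueError(f"Path {requested!r} escapes {base}")
    return candidate
```

The tool would then call something like `resolve_within(workspace_dir, params.yaml_path)` before building the model, so a prompt-injected `../../..` path raises instead of being executed.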
```diff
@@ -0,0 +1 @@
+../../../examples/AGENTS.md
\ No newline at end of file
```
The content of this file is a relative path, but the code in `plugboard/cli/go/agent.py` reads the file's content directly to use as a system prompt. The prompt will therefore be the literal string `../../../examples/AGENTS.md`, which is not the intended behavior and will cause the `go` command to malfunction when the package is installed. To fix this, replace the path with the actual content of `examples/AGENTS.md`.
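Rather than relying on a path relative to the source tree, the prompt file can be bundled as package data and read with `importlib.resources`, which works even when the package is installed as a wheel or zip. The helper below is a sketch; the package and filename arguments it would be called with (e.g. `"plugboard.cli.go"`, `"AGENTS.md"`) are assumptions about how the data could be laid out:

```python
from importlib import resources

def load_packaged_text(package: str, filename: str) -> str:
    """Read a text resource bundled inside `package`.

    Works for installed packages (wheels, zips) as well as source checkouts.
    """
    return resources.files(package).joinpath(filename).read_text()
```

With the prompt shipped as package data, the agent would call something like `load_packaged_text("plugboard.cli.go", "AGENTS.md")` at startup instead of reading a relative path from disk.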
```python
except Exception:
    return requested_model
```
```python
except Exception:
    return ["gpt-4o", "gpt-5", "claude-sonnet-4", "claude-sonnet-4-thinking", "o3"]
```
Similar to the `_resolve_model` method, catching a broad `Exception` here can hide the root cause of failures; prefer a more specific exception. Also consider logging the exception to provide context when the fallback list of models is returned. This will make it easier to diagnose connection issues with the Copilot service.
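A sketch of the suggested pattern: narrow the except clause and log before falling back. Here `fetch` stands in for the Copilot SDK call, and the `(ConnectionError, TimeoutError)` tuple is a placeholder for whatever exception types the SDK actually raises.

```python
import logging

logger = logging.getLogger(__name__)

FALLBACK_MODELS = ["gpt-4o", "gpt-5", "claude-sonnet-4", "claude-sonnet-4-thinking", "o3"]

def list_models(fetch) -> list[str]:
    """Fetch the model list, logging a warning and falling back on failure."""
    try:
        return fetch()
    except (ConnectionError, TimeoutError) as exc:  # substitute the SDK's types
        logger.warning("Could not list Copilot models, using fallback: %s", exc)
        return FALLBACK_MODELS
```

The warning preserves the failure reason in the logs, so a user seeing the fallback list can tell whether the Copilot service was unreachable or timed out.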
```python
except Exception as e:
    self.post_message(
        AgentStatus(
            f"Failed to connect to Copilot: {e}"
            "\n\nMake sure the GitHub Copilot CLI "
            "is installed and you are authenticated.",
        ),
    )
```
Catching a broad `Exception` here can hide the root cause of connection failures. It would be more robust to catch the specific exceptions the Copilot SDK provides, if any; otherwise, logging the full traceback at DEBUG level would help with debugging. This pattern of catching broad exceptions is repeated in `_send_to_agent` and `_change_model` and should be addressed there as well.
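For the connection path, the same idea plus a DEBUG-level traceback via `exc_info=True`. The `connect` callable and the `OSError` catch are illustrative stand-ins; the real code would catch the SDK's own exception types if it exposes them.

```python
import logging

logger = logging.getLogger(__name__)

def connect_with_diagnostics(connect) -> bool:
    """Attempt a connection; show a friendly error but keep the traceback."""
    try:
        connect()
        return True
    except OSError as exc:  # substitute the Copilot SDK's exception types
        logger.debug("Copilot connection failed", exc_info=True)  # full traceback
        logger.error("Failed to connect to Copilot: %s", exc)
        return False
```

The user-facing message stays short, while `logger.debug(..., exc_info=True)` records the complete stack trace for anyone debugging with verbose logging enabled.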
```python
def test_go_default_model_option(self) -> None:
    """The --model flag should default to gpt-4o."""
    with patch("plugboard.cli.go.app.PlugboardGoApp") as mock_app_cls:
        mock_app = MagicMock()
        mock_app_cls.return_value = mock_app
        result = runner.invoke(app, ["go"])
        assert result.exit_code == 0
        mock_app_cls.assert_called_once_with(model_name="gpt-4o")
        mock_app.run.assert_called_once()
```
This test asserts that the default model is `gpt-4o`. However, the default model for the `plugboard go` command is defined as `gpt-5-mini` in `plugboard/cli/go/__init__.py`. The test should be updated to assert the correct default value.
```diff
 def test_go_default_model_option(self) -> None:
-    """The --model flag should default to gpt-4o."""
+    """The --model flag should default to gpt-5-mini."""
     with patch("plugboard.cli.go.app.PlugboardGoApp") as mock_app_cls:
         mock_app = MagicMock()
         mock_app_cls.return_value = mock_app
         result = runner.invoke(app, ["go"])
         assert result.exit_code == 0
-        mock_app_cls.assert_called_once_with(model_name="gpt-4o")
+        mock_app_cls.assert_called_once_with(model_name="gpt-5-mini")
         mock_app.run.assert_called_once()
```
```python
def test_model_selector_default(self) -> None:
    """ModelSelector default model should be gpt-4o."""
    from plugboard.cli.go.app import ModelSelector

    selector = ModelSelector()
    assert selector.model_name == "gpt-4o"
```

```python
assert app.query_one("#model-selector") is not None
assert app.query_one("#mermaid-link") is not None
assert app.query_one("#file-tree") is not None
assert app.query_one("#model-overlay") is not None
assert app.query_one("#shortcut-hint") is not None
assert app.query_one("#title-banner") is not None
```
Some of the widget IDs being queried in this test do not exist in `plugboard/cli/go/app.py`. Specifically, `#model-selector`, `#shortcut-hint`, and `#title-banner` appear to have been renamed or removed. Please update the test to use the correct widget IDs (for example, `#header-banner` instead of `#title-banner`) and remove queries for non-existent widgets.
```python
messages = list(chat_scroll.query(ChatMessage))

assert messages[-1].role == "user"
assert "First user line\nSecond user line" in messages[-1]._content
```
The `_append_to_last_message` method in `app.py` joins messages with `\n\n`. This test asserts that the content contains `First user line\nSecond user line`, which will fail because of the single newline. The assertion should check for content with two newlines between the messages.
```diff
-assert "First user line\nSecond user line" in messages[-1]._content
+assert "First user line\n\nSecond user line" in messages[-1]._content
```
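The two-newline expectation follows directly from the joiner; a quick standalone check, independent of the TUI:

```python
# Joining with "\n\n" puts a blank line between appended messages,
# so the stored content contains a double newline, not a single one.
parts = ["First user line", "Second user line"]
joined = "\n\n".join(parts)
assert "First user line\n\nSecond user line" in joined
assert "First user line\nSecond user line" not in joined
```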
Summary
Currently work-in-progress.
`plugboard go` opens an interactive chat tool for building Plugboard models.

Changes