XKCD: Create XKCD comic generator #1404

christolis · 2026-01-31T21:02:57Z

What

When a user runs /xkcd relevant [n], the integrated ChatGPT reads the last $n$ messages and tries to post an XKCD comic that is relevant to the conversation at the moment the command was executed.

If the user supplies no $n$, a default of $n = 100$ is used.

With /xkcd custom <id> (where $id$ is the XKCD comic number), the bot sends that specific XKCD comic in the chat.

It's as simple as that.

How

Every time the bot launches, if no xkcd.generated.json file is found and no vector store of the XKCD comics has been uploaded to OpenAI, the bot spends some time downloading all available XKCD comics and creates that file. It then attempts to upload it as a vector store.

If the file is not found but the vector store already exists on OpenAI, the bot will still fetch the comics and recreate the file, because it needs the file to retrieve individual XKCDs and because we should not hit the XKCD API every time. It's only a 3 MB file anyway.
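The two checks above can be summarised as a small decision function. This is only a sketch: `XkcdStartupPlan`, `Action` and `decide` are hypothetical names, and the "file exists but vector store is missing" case is an assumption based on the "if not already present" wording in the XkcdRetriever JavaDoc, not something the description spells out.

```java
/** Hypothetical illustration of the startup decision; not the PR's actual code. */
final class XkcdStartupPlan {

    /** What the bot has to do on launch. */
    enum Action { NOTHING, FETCH_FILE, UPLOAD_FILE, FETCH_AND_UPLOAD }

    static Action decide(boolean localFileExists, boolean vectorStoreExists) {
        if (!localFileExists && !vectorStoreExists) {
            // Nothing cached anywhere: download all comics, write the file, upload it.
            return Action.FETCH_AND_UPLOAD;
        }
        if (!localFileExists) {
            // The store exists remotely, but the local file is still needed for lookups.
            return Action.FETCH_FILE;
        }
        if (!vectorStoreExists) {
            // Assumption: an existing local file is simply uploaded.
            return Action.UPLOAD_FILE;
        }
        return Action.NOTHING;
    }
}
```

Keeping the decision separate from the I/O makes each of the four cases easy to unit test.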

Why

It's a fun feature that people can get a laugh out of. It's harmless and a fun use of ChatGPT, so I don't see why it shouldn't be included.

Preview

Use RAG with ChatGPT to store all XKCD comics and display the correct one based on the chat history. There is also the possibility to specify your own XKCD if you wish.

Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>

Zabuzard

the chatgptservice class should not deal with xkcd stuff. refactor the code so that anything xkcd is done in the xkcd classes instead. the responsibilities should be correct.

why does it need to do file io? i would like to avoid that if possible.

christolis · 2026-02-01T09:55:53Z

the chatgptservice class should not deal with xkcd stuff

@Zabuzard I agree with you on that. The createOrGetXkcdVectorStore method you saw introduced should really have been named createOrGetVectorStore, as it does what it describes for any situation; the XKCD feature just happens to be the only one using it. Remember that as of writing this, it's still a draft pull request for a good reason. The code does what it's supposed to do, but I am not 100% satisfied with how it looks, and further refactors are on the way. 👍

why does it need to do file io? i would like to avoid that if possible.

Could you clarify what you mean by that? Are you trying to avoid storing files completely? Or just using file.io in particular?

Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>

- ChatGptService: Refrain from polluting it with XKCD-related calls
- ChatGptService: provide JavaDocs to the methods that don't have one
- ChatGptService: remove unused sendWebPrompt method
- XkcdCommand and XkcdRetriever: Refactor code into functions for readability

Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>

christolis · 2026-02-01T20:21:27Z

@Zabuzard ChatGptService is now handling what it's responsible for and file.io is not used anymore. PTAL when you have time :)

application/src/main/java/org/togetherjava/tjbot/features/xkcd/XkcdCommand.java

application/src/main/java/org/togetherjava/tjbot/features/xkcd/XkcdRetriever.java

tj-wazei · 2026-02-01T20:43:44Z

application/src/main/java/org/togetherjava/tjbot/features/xkcd/XkcdCommand.java

+    private static String getChatgptRelevantPrompt(String discordChat) {
+        return """
+                <discord-chat>
+                %s


A single Discord message can be up to 2000 characters long. This will blow the context way out. Please handle this edge case.

This will blow the context way out.

At what limit would the context be blown way out? 2000 characters is a limit imposed by Discord for chat messages, but I suspect that it's different when using OpenAI.

What I mean is: let's say chitchat had 5 messages and all 5 were 2000 characters long. You're sending 10K characters to the ChatGPT API. Since there's no validation happening, the total could end up being 2000 * MAXIMUM_MESSAGE_HISTORY, and since that's set to 100, 200,000 characters are added to your AI prompt.

This has 2 issues:

1. The ChatGPT API won't accept this and will error out
2. It could blow up billing

I understand this is an unlikely scenario but in general, your code needs absolute certainty.
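One way to rule out the worst case (100 messages of 2000 characters each, so 200,000 characters) is to cap the history at a fixed character budget before building the prompt, keeping the newest messages. A minimal sketch, assuming a hypothetical helper `PromptBudget.fitToBudget` and an arbitrary budget of 8,000 characters; model limits are counted in tokens, so characters are only a rough proxy:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper; the names and the budget are illustrative assumptions. */
final class PromptBudget {

    /** Arbitrary example cap; pick it from the model's context window and cost limits. */
    static final int MAX_PROMPT_CHARS = 8_000;

    /**
     * Keeps the newest messages whose combined length fits into maxChars.
     * The input is ordered oldest to newest. The oldest message that still
     * fits partially is truncated so the total never exceeds the budget.
     */
    static List<String> fitToBudget(List<String> messages, int maxChars) {
        Deque<String> kept = new ArrayDeque<>();
        int remaining = maxChars;
        // Walk backwards so the most recent messages are preferred.
        for (int i = messages.size() - 1; i >= 0 && remaining > 0; i--) {
            String message = messages.get(i);
            if (message.length() > remaining) {
                message = message.substring(0, remaining);
            }
            kept.addFirst(message);
            remaining -= message.length();
        }
        return List.copyOf(kept);
    }
}
```

With this in place, `String.join("\n", PromptBudget.fitToBudget(history, PromptBudget.MAX_PROMPT_CHARS))` is bounded no matter how long the individual messages are (the separators add at most one character per kept message).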

Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>

Zabuzard · 2026-02-01T20:57:44Z

could u explain quickly why it does need to do file stuff in general? id like to avoid reading/writing files if possible.

config, database and in-memory cache should ideally be enough for everything we do.

this would be a first-time for this bot, making the setup more complex overall. so id like to avoid it but first id like to understand why we need files for this feature, cheers

christolis · 2026-02-01T21:01:15Z

could u explain quickly why it does need to do file stuff in general?

@Zabuzard We need to be able to locally reference information about XKCD comics in order to not oversaturate the XKCD API endpoint.

id like to avoid reading/writing files if possible.

One way we can avoid it would be to use the SQLite database we have and store all of the XKCD comics there. Would that be a solution that could work?

Zabuzard · 2026-02-01T21:08:51Z

Why do we need to store the comics instead of just posting a URL or downloading the content from the URL adhoc if really needed?

Like, pick the comic u want and then post the URL to it in the embed / download the content from the URL, attach it to the embed and that's it.

I don't see why we need to store all comics on our side.

christolis · 2026-02-01T21:13:31Z

Why do we need to store the comics instead of just posting a URL or downloading the content from the URL adhoc if really needed?

Sure, for just displaying a comic in an embed, it's easy to use the URL straight from the XKCD API. But OpenAI needs a vector store in order to be aware of all the posts, so that it knows which one is most relevant when it gets asked. That's the reason this file is made in the first place: it's uploaded as a vector store for RAG purposes.

christolis · 2026-02-01T21:22:14Z

(Comment deleted due to duplicate)

Zabuzard · 2026-02-02T07:37:41Z

application/src/main/java/org/togetherjava/tjbot/features/chatgpt/ChatGptService.java

+     * Creates a new vector store with the given file ID if none exists or returns the ID of the
+     * existing vector store with that name.
+     * <p>
+     * You can use this for RAG purposes, it is an effective way to give ChatGPT extra information
+     * from what it has been trained.


what is RAG, what is a vector store? these terms arent generally known to people. maybe drop a sentence or two somewhere to elaborate
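For context: RAG (retrieval-augmented generation) means looking up relevant documents first and handing them to the model together with the prompt, instead of relying only on what the model was trained on. A vector store keeps each document as a numeric vector (an embedding) so that the lookup can be done by similarity. The toy below illustrates only the lookup idea, using plain word counts and cosine similarity. `ToyVectorStore` is a hypothetical name, the comic descriptions in the usage note are made up, and a hosted vector store uses learned embeddings rather than word counts.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Toy in-memory vector store; illustrative only, not how OpenAI implements it. */
final class ToyVectorStore {
    private final Map<String, Map<String, Integer>> vectorsByTitle = new LinkedHashMap<>();

    void add(String title, String text) {
        vectorsByTitle.put(title, embed(text));
    }

    /** Returns the title of the most similar stored document, or null if the store is empty. */
    String mostSimilar(String query) {
        Map<String, Integer> queryVector = embed(query);
        String best = null;
        double bestScore = -1;
        for (var entry : vectorsByTitle.entrySet()) {
            double score = cosine(queryVector, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    /** Stand-in for a real embedding: counts how often each lowercase word occurs. */
    private static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    private static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (var e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double norm = Math.sqrt(squaredLength(a)) * Math.sqrt(squaredLength(b));
        return norm == 0 ? 0 : dot / norm;
    }

    private static double squaredLength(Map<String, Integer> vector) {
        double sum = 0;
        for (int count : vector.values()) {
            sum += count * count;
        }
        return sum;
    }
}
```

For example, after `add("Compiling", "my code is compiling so we sword fight on office chairs")` and `add("Sandwich", "sudo make me a sandwich")`, the query `"waiting for the code to finish compiling"` shares the words "code" and "compiling" only with the first entry, so `mostSimilar` picks "Compiling".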

tj-wazei · 2026-02-02T21:53:24Z

application/src/main/java/org/togetherjava/tjbot/features/xkcd/XkcdRetriever.java

+ * Posts are cached locally in {@value #SAVED_XKCD_PATH} as JSON and uploaded to OpenAI using the
+ * provided {@link ChatGptService} if not already present.
+ */
+public class XkcdRetriever {


Not really a "retriever" if it's uploading and saving to disk...

tj-wazei · 2026-02-02T21:54:33Z

application/src/main/java/org/togetherjava/tjbot/features/xkcd/XkcdRetriever.java

+    private final ChatGptService chatGptService;
+    private String xkcdUploadedFileId;
+
+    public XkcdRetriever(ChatGptService chatGptService) {


This constructor is doing more than just constructing. I feel like there's too much business logic in here and you should split it out.

A constructor should not be interacting with ChatGpt or uploading files. Should just construct.
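A sketch of the split being asked for, with hypothetical names (`LazyXkcdIndex`, `initialize`) and a `Supplier<String>` standing in for the expensive upload call. The constructor only stores its dependency, and the slow work happens in an explicit step that the caller controls:

```java
import java.util.function.Supplier;

/** Hypothetical illustration; the real XkcdRetriever may be structured differently. */
final class LazyXkcdIndex {
    private final Supplier<String> uploader; // stands in for the upload to OpenAI
    private String uploadedFileId;           // stays null until initialize() has run

    LazyXkcdIndex(Supplier<String> uploader) {
        // Only stores the dependency: no network calls and no file access here.
        this.uploader = uploader;
    }

    /** Performs the expensive work explicitly; repeated calls are no-ops. */
    synchronized void initialize() {
        if (uploadedFileId == null) {
            uploadedFileId = uploader.get();
        }
    }

    synchronized String uploadedFileId() {
        if (uploadedFileId == null) {
            throw new IllegalStateException("initialize() has not been called");
        }
        return uploadedFileId;
    }
}
```

This also helps testing: a test can pass a fake supplier and check that constructing the object triggers nothing.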

tj-wazei · 2026-02-02T22:01:12Z

application/src/main/java/org/togetherjava/tjbot/features/xkcd/XkcdRetriever.java

+        return xkcdPosts;
+    }
+
+    private void fetchAllXkcdPosts(Path savedXckdsPath) {


I've been thinking about this function and honestly, it's so complicated for no reason.

You’ve got three different concurrency controls layered on top of each other:

- newFixedThreadPool(FETCH_XCKD_POSTS_POOL_SIZE)
- A Semaphore(FETCH_XKCD_POSTS_SEMAPHORE_SIZE)
- Thread.sleep(...) for throttling/rate limiting

That's in addition to CompletableFuture.runAsync and the blocking calls join() and sleep().

Platform threads are so expensive.

So let's use Project Loom here and switch this entire thing to virtual threads. That way, you don’t need the extra logic. Only keep the Semaphore if we're actually going to get rate limited.

This could all just be:

    try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
        var futures = IntegerRange.of(1, XKCD_POSTS_AMOUNT)
            .toIntStream()
            .filter(id -> id != 404)
            .mapToObj(id -> executor.submit(() -> {
                retrieveXkcdPost(id).join().ifPresent(p -> xkcdPosts.put(id, p));
                Thread.sleep(FETCH_XKCD_POSTS_THREAD_SLEEP_MS);
                return null; // makes this a Callable, so the checked InterruptedException is allowed
            }))
            .toList();
        // get() surfaces failures from the tasks (ExecutionException, InterruptedException)
        for (Future<?> f : futures) {
            f.get();
        }
    }

and if you really need the semaphore, add it to the mapToObj
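For reference, a self-contained sketch of the "virtual threads plus semaphore" variant. It requires Java 21. `BoundedVirtualFetch`, `fetchAll` and the `IntFunction<String>` fetcher are hypothetical stand-ins for the real retrieval code, and the result map is a `ConcurrentHashMap` because many tasks write to it at once:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.function.IntFunction;
import java.util.stream.IntStream;

/** Hypothetical illustration of bounded fetching on virtual threads (Java 21+). */
final class BoundedVirtualFetch {

    /** Fetches ids 1..lastId (skipping 404) with at most maxConcurrent fetches in flight. */
    static Map<Integer, String> fetchAll(int lastId, int maxConcurrent,
            IntFunction<String> fetcher) {
        Map<Integer, String> results = new ConcurrentHashMap<>();
        Semaphore permits = new Semaphore(maxConcurrent);
        // close() at the end of the try block waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.rangeClosed(1, lastId)
                .filter(id -> id != 404)
                .forEach(id -> executor.submit(() -> {
                    permits.acquire(); // blocks only this cheap virtual thread
                    try {
                        // The fetcher must return non-null values for ConcurrentHashMap.
                        results.put(id, fetcher.apply(id));
                    } finally {
                        permits.release();
                    }
                    return null; // Callable, so acquire() may throw InterruptedException
                }));
        }
        return results;
    }
}
```

Note that this sketch ignores the returned futures, so an exception inside a task is silently dropped; real code would collect the futures and call `get()` on them, as in the snippet above.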

xkcd: create XKCD comic generator

aaf564a

christolis added the enhancement, new command, and java labels Jan 31, 2026

Zabuzard requested changes Jan 31, 2026

christolis added 4 commits February 1, 2026 21:07

Rename to createOrGetVectorStore

c522fdc

Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>

refactor(ChatGptService): use java.nio library for file checking

1ac4e67

Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>

XkcdRetriever: make more methods private and add JavaDocs

dd4c251

Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>

christolis marked this pull request as ready for review February 1, 2026 20:21

christolis requested a review from a team as a code owner February 1, 2026 20:21

tj-wazei requested changes Feb 1, 2026

tj-wazei reviewed Feb 1, 2026

Reduce amount of public declarations and organize them

d4323eb

Signed-off-by: Chris Sdogkos <work@chris-sdogkos.com>

Together-Java deleted a comment from surajkumar Feb 1, 2026

Zabuzard requested changes Feb 2, 2026

tj-wazei requested changes Feb 2, 2026
