Sunday, September 1, 2024

Project: JenAI

Description

JenAI is a command-line tool, written in Java, for chatting with a local LLM.

I wrote a sketch version of it around May 2024, and released an official version on August 19th, 2024.

The application assumes you have an LLM server running on your machine (a deliberate choice), by default on port 8080, though that is configurable. Once started, the application displays a short introductory message in the terminal and runs a conversation loop between the user and the LLM until the user chooses to exit. The personality the LLM is asked to assume is simply that of a generative AI chatbot whose name nods to its interface being implemented in Java, which should make it generic enough to be adapted to any purpose (I am strongly inclined to eventually make the starting personality configurable as well).
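
To make the shape of the application concrete, here is a minimal sketch of such a conversation loop; the LlmClient interface is a hypothetical stand-in for the class that talks to the local server, not JenAI's actual code.

```java
import java.util.Scanner;

public class ChatLoop {

    // Hypothetical stand-in for the class that calls the local LLM server.
    interface LlmClient {
        String chat(String userMessage);
    }

    public static void run(LlmClient client) {
        Scanner scanner = new Scanner(System.in);
        System.out.println("Hi, I am JenAI. Type 'exit' to quit.");
        while (true) {
            System.out.print("> ");
            String userMessage = scanner.nextLine();
            if (userMessage.equalsIgnoreCase("exit")) {
                break;
            }
            // The client is expected to keep the conversation history
            // and send it along with each new message.
            System.out.println(client.chat(userMessage));
        }
    }
}
```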

As there are plenty of alternatives that are much more advanced and reliable (such as Simon Willison's llm), I intended JenAI mostly for my own use, as a learning tool.


Context

I developed JenAI as I was exploring, for the first time, how to build systems using generative AI technologies. I blogged about the set of four projects that came out of this exercise in my post entitled "First Steps Into AI Engineering". JenAI was the first of the four, and as a consequence several of the things I learned and developed while creating it went on to play a part in the following ones.

Within this context, I wanted JenAI to involve as much hands-on coding as possible, while still staying strictly away from training or fine-tuning any model myself. I was, and still am for the foreseeable future, far more interested in the area of AI Engineering (using AI tools as a component to be incorporated in a software system) than in the core Machine Learning task of actually developing these models. As the canonical advanced way to consume the frontier models is currently through their APIs, adopting the same architecture and relying on an LLM server offering a reachable API seemed like a logical choice. The focus on models running locally is a manifestation of my current interest: I want to know what we can achieve with models that are open, private and personal - this technology is way too powerful to be monopolized by any centralized entity.

As a consequence of wanting to do as much relevant hands-on coding as possible, in JenAI I avoided frameworks such as Spring AI and LangChain4J, so that I had full control to experiment, try things out, and understand the details of interacting programmatically with generative AI models.
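
As a rough illustration of what avoiding frameworks means in practice, a request to a local server can be made with nothing but the JDK's built-in HttpClient. The sketch below assumes the server exposes an OpenAI-compatible chat completions endpoint (as llama.cpp's server does, for example); the exact endpoint path and JSON shape depend on the server being used.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RawLlmRequest {
    public static void main(String[] args) throws Exception {
        // Assumed request body for an OpenAI-compatible chat endpoint.
        String body = """
            {
              "messages": [
                {"role": "system", "content": "You are JenAI, a helpful generative AI chatbot."},
                {"role": "user", "content": "Hello!"}
              ]
            }
            """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // raw JSON; parsing is left to the caller
    }
}
```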


Highlights

While this approach has given me a lot of insights, it also came with some downsides. The client class responsible for making the request to the server and parsing the response, for instance, ended up with too much low-level wrangling, making it far too prone to encountering problems and crashing. I even had to write a separate utility method to sanitize any special character that could break serialization - which means that conversations always reach the LLM with altered text ("n" instead of "ñ", "c" instead of "ç", "ss" instead of "ß", and so on), which can degrade the quality of the answers. As this is not really a core part of the system, I intend to eventually replace it with a third-party library that handles the request/response communication.
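
For reference, a sanitizer with that effect can be sketched with java.text.Normalizer: decomposing accented characters and stripping the combining marks turns "ñ" into "n" and "ç" into "c", while "ß" needs its own replacement since it is not an accented character. This is a sketch of the general approach, not JenAI's actual utility method.

```java
import java.text.Normalizer;

public final class TextSanitizer {

    // Strips accents and special characters so the payload serializes safely,
    // at the cost of altering the original text.
    public static String sanitize(String input) {
        String replaced = input.replace("ß", "ss"); // no decomposed form exists
        String decomposed = Normalizer.normalize(replaced, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", ""); // drop combining marks: ñ -> n, ç -> c
    }
}
```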

Another issue caused by the design decisions I made for the project is that the simplest way I could get user input through the terminal in Java was to use a Scanner. This fits perfectly well with my initial intentions for the project, but it leads to a horrible user experience. For instance, pasting text only works for single-line fragments - pasting anything with a line break makes the application trigger multiple requests, each using only part of the total fragment. Another problem is that the arrow keys do not work for moving the cursor when writing a message, forcing the user to use the backspace key and delete everything that comes after the part they want to change. Now that the project is functional and its initial goal has been achieved, I plan to investigate better alternatives (as someone who rarely writes CLI programs in Java, I am not familiar with the best practices for handling complex user input from the terminal), probably by choosing a third-party library to handle user input as well.
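
One candidate I am aware of is JLine, which provides a line reader with cursor movement and history out of the box. Below is a minimal sketch of what the input side could look like with it - I have not adopted this in JenAI yet, so treat it as a possibility rather than the plan.

```java
import org.jline.reader.LineReader;
import org.jline.reader.LineReaderBuilder;
import org.jline.terminal.Terminal;
import org.jline.terminal.TerminalBuilder;

public class JLineInput {
    public static void main(String[] args) throws Exception {
        // A system terminal gives proper line editing: arrow keys, history, etc.
        Terminal terminal = TerminalBuilder.builder().system(true).build();
        LineReader reader = LineReaderBuilder.builder().terminal(terminal).build();

        while (true) {
            String userMessage = reader.readLine("> ");
            if (userMessage.equalsIgnoreCase("exit")) {
                break;
            }
            System.out.println("(would send to the LLM) " + userMessage);
        }
    }
}
```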

On the bright side, several of the classes I developed in JenAI proved capable enough to be reused in the other projects I did. Among others, this includes both the LLM client, with all the issues I mentioned above, and the classes that model the conversation itself, which caused no issues whatsoever. So the project really worked as a stepping stone for more complex ones.

I have found JenAI surprisingly useful since I finished its first version. Because I knew it inside out and had total control over how to customize it, I was able to incorporate it into my local setup in such a way that I am always only a keyboard shortcut away from launching it and having a helpful LLM assistant to discuss anything. And this led me to significantly increase the amount of time I spend interacting with LLMs, as the friction of having to launch a browser tab and navigate to a page to get started was usually enough to drive me away from doing it. I have been using JenAI to, among several other things, get very specific song recommendations ("You know that song Ma Baker, by Boney M? What other songs like that would you recommend? [...] No, not just any disco songs, I mean songs roughly from the 70s with a good groove, great narrative lyrics and a deep personal story with few cliches and a powerful plot"), cheer me up with a grumpy-developer joke on Monday mornings, and get tips on how to better monitor a Linux system's resource usage. One concrete example: the first two issues I created on JenAI's own Github repository (Issue #1 and Issue #2), which I got nicely formatted in markdown and with good descriptive text simply by passing a loose ten-word summary of the problem and asking JenAI to format it using the best practices for Github Issues. Could I have done all of this using a production-grade alternative? Of course! But using a tool I created myself gave me an amazing sense of purpose and accomplishment.


Future Expansions

I already mentioned some of the things I intend to improve in the future, such as a better experience for user input, a more robust mechanism for the API client, and making more of the parameters customizable (the personality, for instance).

Another user experience improvement I plan to make is streaming the LLM's answer as it is generated - if the model being used runs slowly for any reason, it is really frustrating to spend a long time staring at a terminal window with no indication whatsoever of what is happening. I don't expect this to be an easy change, though, so it might take me quite a while to figure out how to do it well. Ideally, I want both streaming text and the ability to stop the answer midway with some keyboard shortcut, to avoid being stuck waiting for an answer that has already started wrong.
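
For the record, assuming an OpenAI-compatible server, streaming usually means sending "stream": true and reading the response as server-sent events. A sketch of how that could look with the JDK's HttpClient follows; extracting the text delta from each JSON chunk is omitted, so it just prints the raw chunks as they arrive.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.stream.Stream;

public class StreamingSketch {
    public static void main(String[] args) throws Exception {
        // Assumes an OpenAI-compatible endpoint that supports "stream": true,
        // answering with server-sent events ("data: {json chunk}" lines).
        String body = """
            {
              "stream": true,
              "messages": [{"role": "user", "content": "Hello!"}]
            }
            """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // BodyHandlers.ofLines() exposes the response as a lazy Stream<String>,
        // so each chunk can be printed as soon as it arrives.
        HttpResponse<Stream<String>> response =
                client.send(request, HttpResponse.BodyHandlers.ofLines());
        response.body()
                .filter(line -> line.startsWith("data: ") && !line.endsWith("[DONE]"))
                .forEach(line -> System.out.println(line.substring("data: ".length())));
    }
}
```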

Yet another improvement I am planning is the ability to save and load conversations, so that a chat can continue over multiple sessions. I expect this to be fairly easy to achieve, as the conversation itself is modeled in a way that should make it easy to serialize and deserialize.
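
A minimal sketch of what such persistence could look like, assuming a conversation is a list of role/content messages (JenAI's real conversation model may differ):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ConversationStore {

    // Hypothetical message type standing in for JenAI's conversation model.
    public record Message(String role, String content) {}

    // Naive format: one message per line as "role<TAB>content", newlines escaped.
    public static void save(List<Message> conversation, Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Message m : conversation) {
            lines.add(m.role() + "\t" + m.content().replace("\n", "\\n"));
        }
        Files.write(file, lines);
    }

    public static List<Message> load(Path file) throws IOException {
        List<Message> conversation = new ArrayList<>();
        for (String line : Files.readAllLines(file)) {
            String[] parts = line.split("\t", 2);
            conversation.add(new Message(parts[0], parts[1].replace("\\n", "\n")));
        }
        return conversation;
    }
}
```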


Setup

Although JenAI is not one of my portfolio projects (which have a fixed set of quality standards I expect to maintain through their entire lifecycle), I did configure most of the foundations I use for those.

I use Github Actions to generate a new release for JenAI whenever new code pushed to the main branch alters relevant core files of the project.

As documentation, I have both a changelog file and an architecture.md file (an idea I adapted from this great article), in addition to the usual readme file.

The only major thing missing in comparison with the standards of my portfolio projects is automated tests. Usually, I find these indispensable: Michael Feathers said in Working Effectively With Legacy Code that his definition of "legacy code" is code without automated tests, and I consider automated tests the difference between amateur code and professional code. However, for the projects I am doing while first exploring AI Engineering, I have chosen not to implement tests when I first develop them. I chose to do so because I feel that, in this exploratory mode, automated tests become a burden. The ability to test something comes from your knowledge about it - about what it should and should not do, in a very precise and unambiguous manner, for the relevant scenarios. I simply do not yet know enough about the subject to write effective tests; in fact, I am developing these projects exactly to learn more about it. So, while I am in this exploratory mode, I will restrain myself from writing tests and just explore.


Links

Source code: Github

Executable: Releases

