Monday, June 30, 2025

Monthly Recap - 2025-05 May

May was a very intense and productive, if not always positive, month. I finished a series of projects that took me almost a year, restarted some of my habits and had some vacations (with mixed results). Here's a summary of what happened.


Achievements

LicLacMoe

LicLacMoe is a desktop application to play tic-tac-toe against a local LLM model. It is the fourth and final project in my First Steps Into AI Engineering series. I have written a blog post about it, Project: LicLacMoe. This was a much more laid-back and casual project than the one before it (Local Language Practice, LLP), and I both had fun and gained some cool insights while developing it. These were its releases in May:

  • 1.0.0: Initial version, containing the basics to play matches.
  • 1.0.1: AI player chooses move asynchronously.
  • 1.1.0: Support for reasoning models and verbose mode to log full response from model.


Finished First Steps Into AI Engineering series

With the release of LicLacMoe, I finish the scope I had in mind for my First Steps Into AI Engineering series. I started planning and implementing this series way back in August of 2024, so it took me almost a year to finish it. That was somewhat surprising; I initially thought it would take only a few months, around 4 or so. Despite taking longer than expected, it was a thoroughly enlightening project, and I enjoyed each part of it. I feel like it gave me a much stronger grounding in working with generative AI models, and paved the way for much more ambitious projects in the future.

In this series, I focused on writing most of the code myself, while avoiding popular frameworks focused on interacting with generative AI models. Now that the goal is accomplished, I feel comfortable starting to use frameworks without feeling too dependent on them, or treating them as a magic black box.


Started studying SpringAI

As a result of what I just mentioned, I started exploring one of the popular frameworks for generative AI development: SpringAI. It is part of an ecosystem that I am very familiar with, so it seems like a logical next step. I will experiment and try to create a few projects with it, before also trying out other Java frameworks and the ecosystem of other programming languages.

Going forward, I expect to shift my time balance to once again invest more time in studying courses and less in building projects. In the past, I leaned completely towards studying courses and almost never built anything myself; since around the first months of 2024, I shifted to investing my time exclusively in building things. Now it is time to balance both at the same time. I know for sure I will not stop building new projects, as I have a huge backlog of ideas (even if I get no more inspiration for the next 5 years, I think I have enough to keep working on them).


Restarted personal studies

Another nice point of the month was getting back on track with my personal studies habit. I had paused my career studies in April, and had not done any hobby studies since December of last year. In May, I was able to pick both back up.

I picked some very short books and papers, which allowed me to move fast and achieve some accomplishments quickly. This was a very good morale boost!

For hobby studies, I started and finished the paper Future Brains (purely for intellectual curiosity), while for career studies I started the book Building Successful Communities Of Practice, which was useful for my current job.


Down points

I had almost no down points in my personal space. Everything went smoothly: I was able both to finish long-standing projects and to restart some of the things I enjoy.

However, professionally it was a challenging time. Even though I took some days off, I ended up having to work on a few of them, and in general several changes that happened were done in a way that I disagree with, which caused me some frustration. Just a reminder that not every day is a perfect day.


Plans for next month

TDC

June will bring the second edition of The Developers Conference in 2025. While I have been less motivated about the event since its change to the "AI Summit" format, this one should be good, as it will be the first of the year in the full 3-day-long, multi-track format. It will happen in Florianópolis, and so I plan to attend it remotely.


Deeper exploration of AI Engineering development

With the First Steps Into AI Engineering series complete, in June I plan to start going deeper into some AI Engineering projects and ideas. I don't have anything I can share right now, but I have plenty that I expect to get done along the year, and I will write about it as I finish each project.


Project: LicLacMoe

Description

LicLacMoe is a desktop application that allows you to play tic-tac-toe against local Large Language Models (LLMs).

I released the first official version on May 4th, 2025. The name, of course, is a wordplay on tic-tac-toe, replacing each first letter with the initials of "large language model".

The application assumes you have an LLM server running on your machine (this is a deliberate choice), by default on port 8080, though that is configurable. It presents the player with a visual tic-tac-toe grid used for playing - once the player makes their move, a call to the local LLM server is made with the current state of the match, so that the LLM can pick the AI opponent's next move. The entire interaction with the LLM happens through playing, with no conversational interface.
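To make the flow concrete, here is a minimal sketch of how a board state might be serialized into such a prompt. The function name, cell numbering and prompt wording are all illustrative assumptions, not LicLacMoe's actual code:

```python
# Illustrative sketch (not LicLacMoe's actual code): serializing a
# tic-tac-toe board into a prompt that asks the model for a single move.

def board_to_prompt(board):
    """Render a 3x3 board (a list of 9 cells: 'X', 'O' or None) as a prompt."""
    rows = []
    for r in range(3):
        # Empty cells are shown as underscores to keep the grid readable.
        rows.append(" ".join(board[3 * r + c] or "_" for c in range(3)))
    grid = "\n".join(rows)
    return (
        "You are playing tic-tac-toe as O. The board is:\n"
        f"{grid}\n"
        "Cells are numbered 0-8, left to right, top to bottom.\n"
        "Reply with only the number of the empty cell you choose."
    )
```

A prompt like this would then be sent to the local server, with the player's latest move already applied to the board.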

I developed LicLacMoe as a way to explore using LLMs in a way that does not involve any conversation between the user and the AI system. Chatbots have become almost synonymous with LLMs, in large part due to how they were popularized, so it was an interesting experiment to use them in a completely different manner.


Context

I developed LicLacMoe as I was exploring how to build systems using generative AI technologies for the first time. I blogged about the set of 4 projects that came out of this exercise in my post entitled "First Steps Into AI Engineering". LicLacMoe was the fourth and last of these, and the most purely exploratory one.

As described in the post just mentioned, for LicLacMoe I wanted to write most of the code myself, without relying too much on frameworks, so as to get a better feeling for working with these models. I also intentionally explore only open and local AI models, as I want to find out how far we can go working only with models that can be fully personal and owned by their users.


Highlights

Interacting with LLMs without chatting

LLMs have caught our full attention due to their uncanny ability to behave like a human being in a conversation. However, the big question on everyone's minds has been whether the models actually have some degree of reasoning intelligence, or whether they are just really good at reproducing our patterns of communication (of course, the obvious philosophical question must be mentioned: "could it be that there is no difference?"). To a certain degree, the appearance of reasoning models, and the current trend of agentic AI, have shown that LLMs can definitely be exploited for some amount of reasoning intelligence, but at large scales the question still remains. My first intent with creating an application that uses LLMs without any chat interface was to see how it would feel to use LLMs purely as a source of thinking, without any verbal communication. While the long time it takes to generate an answer can be a bit frustrating, overall it was a positive experience - it is a really, really weird way of interacting with a computer system.

My second intent was to just get used to incorporating LLM answers into a bigger system, as part of the user interface. I honestly think chatting (especially when you have to type long messages) is not the best interface for any complex computer system, far from it. In order to make full use of the potential that generative AI systems have, we must learn to incorporate them seamlessly into our flows, and that includes our computerized applications. This was just a first step in this direction; I have several other ideas I want to explore further in this regard.


Not needing to code game rules and strategy

As mentioned previously, using LLMs for pure intelligence is a really weird experience. One of the weirdest parts was that, in order to implement LicLacMoe, I did not have to implement a strategy that knew the rules of tic-tac-toe at all. I still implemented the logic of the game to verify the end result of matches, but I think with a little more development time I could have replaced even that with well-crafted prompts.

I am sure this was in large part due to tic-tac-toe being an extremely simple and popular game. It is reasonable to assume that most (if not all) models have seen enough examples of matches and descriptions of the game to have memorized a pretty good understanding of how to play it. The same would most likely not be the case for more complex games - I find it very interesting to think about how complex a game it is possible to teach an LLM simply by feeding it enough examples.

Regardless, it felt very odd to rely on a system that "just knew" the rules, and to which I could simply feed the current state of the match and it would produce a next move. Of course, it would not always be a valid move (error handling and retry policies were more essential here than in any other LLM-based system I have implemented so far), nor a particularly brilliant one. But even small models would consistently give something workable in a reasonable amount of time (and retries).
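The validate-and-retry policy described above can be sketched like this; `ask_model` and `parse_move` are hypothetical names standing in for the real call to the local LLM server and its answer parser:

```python
import random

# Illustrative sketch of a validate-and-retry loop for LLM-chosen moves.
# `ask_model` is a hypothetical stand-in for the call to the local server.

def parse_move(text):
    """Extract a cell index 0-8 from the model's reply, or None."""
    for token in text.split():
        token = token.strip(".,")
        if token.isdigit():
            move = int(token)
            if 0 <= move <= 8:
                return move
    return None

def choose_move(board, ask_model, max_retries=3):
    """Ask the model for a move, retrying on unparsable or occupied cells."""
    for _ in range(max_retries):
        move = parse_move(ask_model(board))
        if move is not None and board[move] is None:
            return move
    # Fall back to a random empty cell rather than failing the match.
    empty = [i for i, cell in enumerate(board) if cell is None]
    return random.choice(empty)
```

The fallback at the end is one possible design choice: it keeps the match going even if the model never produces a valid move within the retry budget.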


Reasoning vs non-reasoning models

This leads into the final interesting note. While testing the application, I found that non-reasoning models would mostly generate moves that looked a bit random, and could very easily be defeated. I had to make some changes to the logic of parsing the answer from the LLM to support reasoning models - however, switching to these models drastically improved the performance of the AI player. I tested it with Qwen 3, 8B parameters, 8-bit quantization - a rather small model as far as LLMs go. In comparison, the non-reasoning model I used was Gemma 3, 27B parameters, 8-bit quantization, a model more than 3 times the size. While I have never been a huge fan of reasoning models (for my common use cases they usually don't offer much improvement, and are considerably slower), in this particular case it was easy to see the value that such models bring.
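As an illustration of the kind of parsing change involved (an assumption about the project's approach, not its actual code): reasoning models such as Qwen 3 typically emit their deliberation inside `<think>...</think>` tags, which need to be stripped before looking for the chosen move:

```python
import re

# Sketch of handling reasoning-model output: drop the <think>...</think>
# deliberation block and keep only the final answer text for move parsing.

def strip_reasoning(text):
    """Remove <think>...</think> blocks, returning the remaining answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
```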


Future Expansions

Benchmark of performances

As mentioned before, while testing the application I used a non-reasoning 27B parameter model (which had bad performance) and an 8B reasoning model (with significantly better performance). One thing I would like to do, if I ever have the time, is compile a more comprehensive comparison of the performance of several models of different families and sizes. I would be especially interested in seeing how small we could go with a reasoning model and still have it able to avoid defeat in most matches. I would be pleasantly surprised if this is possible with a model smaller than 4B.


Induce reasoning for non-reasoning models

Another interesting exploration would be to craft the base prompt so that even non-reasoning models think about the current state before choosing a move. This is easily done with very popular chain-of-thought techniques that force step-by-step thinking. It would involve changing the base prompt and possibly the parsing of the response as well. The result could then be compared with the improvement gained when switching to a reasoning model, to see whether the reasoning training these models receive actually gives them an advantage.
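A sketch of what such a prompt change might look like; the suffix wording and the `MOVE:` answer format are illustrative assumptions:

```python
# Illustrative chain-of-thought prompt suffix for non-reasoning models:
# ask for an analysis first, then a final answer line in a fixed format.

COT_SUFFIX = (
    "Before answering, analyze the board step by step: list the empty "
    "cells, check for immediate wins and threats, then decide.\n"
    "End your reply with a single line in the form: MOVE: <cell number>"
)

def parse_cot_move(text):
    """Read the move from the final 'MOVE: n' line of the reply, or None."""
    for line in reversed(text.strip().splitlines()):
        if line.strip().upper().startswith("MOVE:"):
            return int(line.split(":", 1)[1].strip())
    return None
```

Parsing only the final `MOVE:` line lets the model ramble through its analysis without confusing the move extraction.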


Model vs model

Finally, the last extension I might make is to change the game to support an AI vs AI mode, with two LLMs playing against each other. This would allow tournaments to be played, and metrics to be gathered on which models perform better against each other. It would be a nice and fun addition, but it probably won't be a priority for me any time soon.


Setup

Although LicLacMoe is not one of my portfolio projects (which have a fixed set of quality standards I expect to maintain through their entire lifecycle), I did configure most of the foundations I use for those.

I use GitHub Actions to generate a new release for LicLacMoe whenever new code is pushed into the main branch and alters relevant core files of the project.

I have a changelog file, a file with guidelines about contributing and an architecture.md file (an idea I adapted from this great article), beyond the usual readme file as documentation.

As this project was done in an exploratory and proof-of-concept approach, I did not include automated tests. This is the main point of departure with regards to the quality standards I expect from my portfolio projects.


Links

Source code: GitHub

Executable: Releases

