So after a lot of research and many painful hours of coding, here's the new update!
What's new:
- Fixed previous bugs.
- Switched to a much lighter model (llama3.2) for conversations, RAG, and function calling.
- A whole lot of new commands.
- Reinforcement learning (Q-learning).
- Theoretically multiplayer compatible (just install the dependencies on the server side), since the carpet mod can run on servers too, but I have not tested it yet. Feedback from testers is welcome.
- Theoretically the mod should not require everyone to install it in multiplayer; it should work as a server-side mod. This is also untested, so feedback from testers is welcome.
The bot can now interact with its environment based on "triggers", then learn about its situation and try to adapt.
The learning process is not short; don't expect the bot to learn how to deal with a situation quickly. In fact, if you want intelligent results, you may need hours of training (something I will focus on once I fix some more bugs, add some more triggers, and get this version out of the alpha stage).
To start the learning process:
/bot spawn <botName> training
Right now the bot only reacts to hostile mobs around it. I will add more "triggers" in upcoming updates so that the bot responds to more scenarios and learns how to deal with them.
A video on how the bot learns and what's new in this patch
# New commands:
The spawn command has changed:
/bot spawn <bot> <mode: training or play>
If you type anything else for the mode parameter, you will get a chat message showing the correct usage of this command.
/bot use-key <W, S, A, D, LSHIFT, SPRINT, UNSNEAK, UNSPRINT> <bot>
/bot release-all-keys <botName>
/bot look <north, south, east, west>
/bot detectDangerZone
// Detects lava pools and cliffs nearby
/bot getHotBarItems
// Returns a list of the items in the bot's hotbar
/bot getSelectedItem
// Gets the currently selected item
/bot getHungerLevel
// Gets the bot's hunger level
/bot getOxygenLevel
// Gets the oxygen level of the bot
/bot equipArmor
// Gets the bot to put on the best armor in its inventory
/bot removeArmor
// Work in progress.
Current bugs in this version:
- If the bot dies or is killed while engaged in an action, the code might throw an error when the bot respawns. The temporary fix is a game restart (or server restart). This will be fixed in an upcoming patch.
- The removeArmor sub-command doesn't work (yet).
What to do to set up this version before playing:
1. Make sure you still have ollama installed.
2. In cmd or a terminal, type `ollama pull nomic-embed-text` (if not already done).
3. Type `ollama pull llama3.2`.
4. Type `ollama rm gemma2` (if you still have it installed).
5. Type `ollama rm llama2` (if you still have it installed).
6. If you have run the mod before, go to your .minecraft folder, navigate to the config folder, and delete the file called settings.json5.
Then make sure you have started the ollama server. After that, launch the game.
Type /configMan
in chat and select llama3.2 as the language model, then hit save and exit.
Then type /bot spawn <yourBotName> <mode>, where mode is training (this mode won't connect to the language model) or play (for normal usage).
For the nerds: How does the bot learn?
It uses an algorithm called Q-learning, which is part of reinforcement learning.
A very good video explanation of what Q-learning is:
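The core of Q-learning is a simple table update. Here is a minimal, self-contained sketch of the idea in Python; the state ("mob_nearby"), actions, and rewards are made-up illustrations, not the mod's actual ones:

```python
import random

ALPHA = 0.1   # learning rate: how fast new experience overwrites old estimates
GAMMA = 0.9   # discount factor: how much future reward matters

q_table = {}  # (state, action) -> estimated long-term value

def update(state, action, reward, next_state, actions):
    """One Q-learning step:
    Q(s,a) += alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a))."""
    old = q_table.get((state, action), 0.0)
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    q_table[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

def choose(state, actions, epsilon=0.2):
    """Epsilon-greedy: usually pick the best-known action, sometimes explore."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_table.get((state, a), 0.0))

actions = ["attack", "flee"]
# Simulate experience: attacking the nearby hostile mob pays off, fleeing doesn't.
for _ in range(100):
    update("mob_nearby", "attack", reward=1.0, next_state="safe", actions=actions)
    update("mob_nearby", "flee", reward=-0.5, next_state="mob_nearby", actions=actions)

print(choose("mob_nearby", actions, epsilon=0.0))  # the learned preference
```

This is also why training takes a while: the table only converges after many repetitions of each situation.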
Changelog description
This is the final hotfix for version 1.0.2; it addresses a lot of broken code and changes the mod's workflow at its core.
In this hotfix, once again, a lot of the code has changed.
I will keep the changelog simple this time.
Fixed
- The config manager GUI did not show the currently selected language model.
- The bot did not respond to normal conversations.
Also, accuracy with the gemma2 model has improved further.
For this patch, users will need to:
1. Go to your game folder, (.minecraft)/config, where you will find a settings.json5 file. Delete it.
2. (If you have already run the previous 1.0.2 version) go back to your .minecraft folder; you will find a folder called "sqlite_databases" containing a file called memory_agent.db.
3. Delete that as well.
4. Install the gemma2 model (ollama pull gemma2). [Required]
Users can choose to keep the llama2 or mistral models and compare their efficiency.
However, the mod has a 100% chance of breaking when llama2 or mistral is used, because of the responses they generate.
From a technical perspective
The mod breaks because llama2 and mistral generate very large initial responses whose vector embeddings cannot be generated, so the mod throws an exception.
Gemma2, however, keeps it precise and to the point, resulting in much shorter responses.
Gemma2 also has much more accurate information about Minecraft data, like crafting recipes, biomes, mobs, etc. (presumably up to 1.20.5, as the test results indicate), compared to llama2 or mistral.
A small update to version 1.0.2.
Thanks to Mr. Álvaro Carvalho, who suggested a few enhancements to the mod's command system and some internal code changes.
Also, this version's icon is fixed and will be recognized by modmanager.
Once again, the setup instructions for existing users:
1. Go to your game folder, (.minecraft)/config, where you will find a settings.json5 file. Delete it.
2. (If you have already run the previous 1.0.2 version) go back to your .minecraft folder; you will find a folder called "sqlite_databases" containing a file called memory_agent.db.
3. Delete that as well.
4. Install the models: mistral, llama2, and nomic-embed-text.
5. Then run the game.
6. In game, run /configMan to set the language model to llama2.
Then spawn the bot and start talking!
Patch 1.0.3 will include backwards compatibility for versions 1.20, 1.20.1, and 1.20.3, plus other features as mentioned earlier!
For the general player base.
This patch fixed a major issue where the llama2 model badly misclassified a certain type of user prompt. That part is now handled by mistral instead.
Aaaand after 2 weeks of a lot of coffee, back pain and brain ache, here's the new update.
At first it may seem that this version doesn't do much, but keep talking with the bot and you will find out :)
To-do: Install three models in ollama. [VERY IMPORTANT]
- nomic-embed-text (ollama pull nomic-embed-text)
- llama2 (ollama pull llama2)
- mistral (ollama pull mistral)
And go to your game folder (.minecraft/config) and delete the old settings.json5 file.
In game, change your selected language model to llama2 for the best performance.
TLDR: The bot is way more intelligent now. It can remember past conversations, store current conversations and use that data for long-term memory, and has gotten better at function calling, like movement and block detection. There are no fixed responses and no fixed prompt checking; it all happens dynamically, hehe.
Also, the connection to the language model is only made when the bot is spawned in game, saving a lot of memory when the bot is not in the game.
Eventually I intend to bring support for custom models from the ollama database.
Next patch will include more Minecraft interaction, such as mining, improved movement, collision/obstacle detection, etc.
Accuracy for correctly detecting what the player means: somewhere between 95-99%.
For the nerds.
So, for the tech savvy people, I have implemented the following features.
LONG TERM MEMORY: This mod now features concepts used in the field of AI, like Natural Language Processing (much better now) and something called Retrieval Augmented Generation (RAG).
How does it work?
Well:
We convert the user input to a set of vector embeddings, which is a list of numbers.
Then physics 101!
A vector, in the familiar 3D case, is a set of coordinates in XYZ space. It has two parts: a direction and a magnitude. Embedding vectors work the same way, just with many more dimensions.
If you have two vectors, you can check their similarity by checking the angle between them.
The closer the vectors are to each other, the more similar they are!
Now if you have two sentences, converted to vectors, you can find out whether they are similar to each other using this process.
In this particular instance I have used a method called cosine similarity, where you find the similarity using the formula

cos(θ) = (x · y) / (|x| |y|)

where |x| and |y| are the magnitudes of the vectors.
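In code, that formula can be sketched like this (a standalone illustration, not the mod's actual implementation):

```python
import math

def cosine_similarity(x, y):
    """cos(theta) = (x . y) / (|x| * |y|), ranging from -1 to 1."""
    dot = sum(a * b for a, b in zip(x, y))
    mag_x = math.sqrt(sum(a * a for a in x))
    mag_y = math.sqrt(sum(b * b for b in y))
    return dot / (mag_x * mag_y)

# Parallel vectors score ~1, orthogonal vectors score 0, opposite vectors -1.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # ~1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # 0.0
```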
So we use this technique: fetch a bunch of stored conversation and event data from an SQL database, generate their vector embeddings, and score them against the user's prompt. We then sort further, for example by timestamp, and get the most relevant conversation for what the player said.
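A toy version of that retrieval step might look like this. The rows and their two-number "embeddings" are made up for illustration; the real mod stores its rows in an SQLite database and embeds text with nomic-embed-text:

```python
import math

def cosine(x, y):
    # Same cosine-similarity formula as described above.
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

# Hypothetical stored conversation/event rows with precomputed embeddings.
rows = [
    {"text": "player asked me to mine iron", "embedding": [0.9, 0.1], "timestamp": 100},
    {"text": "we fought a zombie at night",  "embedding": [0.1, 0.9], "timestamp": 200},
    {"text": "player asked me to mine gold", "embedding": [0.8, 0.2], "timestamp": 300},
]

def most_relevant(prompt_embedding, rows, top_k=2):
    """Rank rows by similarity to the prompt (newer rows win ties)."""
    ranked = sorted(
        rows,
        key=lambda r: (cosine(prompt_embedding, r["embedding"]), r["timestamp"]),
        reverse=True,
    )
    return ranked[:top_k]

# A prompt about mining should pull back the two mining-related memories.
hits = most_relevant([1.0, 0.0], rows)
print([r["text"] for r in hits])
```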
Pair this with function calling, which uses natural language processing to understand what the player wants the bot to do and then calls a pre-coded method, for example movement or a block check, to get the bot to do the task.
Save this data, i.e. what the bot just did, to the database, and you get even better memory!
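A minimal sketch of that dispatch idea follows; the intent names and handler methods here are hypothetical stand-ins, not the mod's real ones:

```python
# Hypothetical intent -> handler mapping; in the mod, the NLP step would
# classify the player's message into one of these intents.
def move_forward(bot):
    return f"{bot} moved forward"

def check_block(bot):
    return f"{bot} checked the block ahead"

HANDLERS = {
    "move": move_forward,
    "check_block": check_block,
}

def handle_intent(bot, intent, history):
    """Call the pre-coded method for the intent and log what the bot did."""
    result = HANDLERS[intent](bot)
    history.append(result)  # stored so later retrieval can recall past actions
    return result

history = []
print(handle_intent("Steve", "move", history))  # Steve moved forward
print(handle_intent("Steve", "check_block", history))
```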
To top it all off, llama2 is the best-performing model for this mod right now, so I suggest y'all use llama2.
In fact, some of the methods, like the RAG, won't even run without llama2, so it's a must.
For the general player base.
Aaaand after 2 weeks of a lot of coffee, back pain and brain ache, here's the new update.
At first it may seem that this version doesn't do much, but keep talking with the bot and you will find out :)
To-do: Install two models in ollama. [VERY IMPORTANT]
- nomic-embed-text (ollama pull nomic-embed-text)
- llama2 (ollama pull llama2)
In game, change your selected language model to llama2 for the best performance.
TLDR: The bot is way more intelligent now. It can remember past conversations, store current conversations and use that data for long-term memory, and has gotten better at function calling, like movement and block detection. There are no fixed responses and no fixed prompt checking; it all happens dynamically, hehe.
Also, the connection to the language model is only made when the bot is spawned in game, saving a lot of memory when the bot is not in the game.
Eventually I intend to bring support for custom models from the ollama database.
Next patch will include more Minecraft interaction, such as mining, improved movement, collision/obstacle detection, etc.
For the nerds.
So, for the tech savvy people, I have implemented the following features.
LONG TERM MEMORY: This mod now features concepts used in the field of AI, like Natural Language Processing (much better now) and something called Retrieval Augmented Generation (RAG).
How does it work?
Well:
We convert the user input to a set of vector embeddings, which is a list of numbers.
Then physics 101!
A vector, in the familiar 3D case, is a set of coordinates in XYZ space. It has two parts: a direction and a magnitude. Embedding vectors work the same way, just with many more dimensions.
If you have two vectors, you can check their similarity by checking the angle between them.
The closer the vectors are to each other, the more similar they are!
Now if you have two sentences, converted to vectors, you can find out whether they are similar to each other using this process.
In this particular instance I have used a method called cosine similarity, where you find the similarity using the formula

cos(θ) = (x · y) / (|x| |y|)

where |x| and |y| are the magnitudes of the vectors.
So we use this technique: fetch a bunch of stored conversation and event data from an SQL database, generate their vector embeddings, and score them against the user's prompt. We then sort further, for example by timestamp, and get the most relevant conversation for what the player said.
Pair this with function calling, which uses natural language processing to understand what the player wants the bot to do and then calls a pre-coded method, for example movement or a block check, to get the bot to do the task.
Save this data, i.e. what the bot just did, to the database, and you get even better memory!
To top it all off, llama2 is the best-performing model for this mod right now, so I suggest y'all use llama2.
In fact, some of the methods, like the RAG, won't even run without llama2, so it's a must.
This alpha version of version 1.0.1 includes a lot of content. Well, at least in terms of code it does.
Steve (or the bot) can now understand what users are saying a lot better by using a process called "Natural Language Processing"!
This natural language processing has been implemented to get the in-game bot to execute actions based on what the user says; it is separate from the language model. However, I intend to achieve a sort of "synchronisation" which will keep the language model informed of what is going on in-game.
Also I have added block and nearby entity detection.
The bot can now detect if there is a block in front of it; simply ask it using the sendAMessage command!
There's a 99% chance that the bot will understand the intention and context of your message. I hope to achieve 100% soon.
Initially, when you spawn the bot, it takes a minute or so for it to face you correctly, since its spawn position will be fixed by the game teleporting it somewhere near you after spawning is done (this is not the same as the instant teleport you see in game when you spawn the bot).
After that, the bot will accurately face the nearest entity in front of it. The range is 5 blocks on the X, Y, and Z axes. To make it more realistic, I have made it so that the bot can't detect entities behind it.
Check GitHub for the sample footage of this version.
Click on the view source button and keep scrolling till you find the XZ pathfinding video.
This version of the mod includes a pathfinding algorithm on the XZ axes for the bot.
It is recommended to test the pathfinding in a superflat world with no mobs spawning.
Try to keep the destination coordinates at or under 50 on the XZ axes; anything more will take some time to calculate and the game might appear frozen, hence the alpha tag on this version.
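To give a feel for what XZ pathfinding involves, here is a plain breadth-first-search sketch on a capped grid. It illustrates the general technique only; the mod's actual algorithm may differ:

```python
from collections import deque

def find_path_xz(start, goal, blocked=frozenset(), limit=60):
    """Breadth-first search on the XZ plane.

    Returns the shortest list of (x, z) cells from start to goal, or None.
    The search area is capped (echoing the ~50-block advice above) so a
    walled-off goal can't make the search run forever.
    """
    queue = deque([start])
    came_from = {start: None}
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the parent links back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        x, z = current
        for nxt in ((x + 1, z), (x - 1, z), (x, z + 1), (x, z - 1)):
            if (nxt not in came_from and nxt not in blocked
                    and abs(nxt[0]) <= limit and abs(nxt[1]) <= limit):
                came_from[nxt] = current
                queue.append(nxt)
    return None  # goal unreachable within the search area

print(find_path_xz((0, 0), (2, 1)))
```

Capping the search area like this is one simple way to avoid the long freezes the warning above describes.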
Also, the command system has been fully revamped.
I uploaded this alpha version just to prove that I did not give up on this project lol.
Expect periodic releases.
Main command
/bot
Sub commands:
spawn <botName>
This command is used to spawn a bot with the desired name. ==For testing purposes, please keep the bot name Steve==.
walk <botName> <till>
This command will make the bot walk forward for a specific number of seconds.
goTo <botName> <x> <y> <z>
This command is supposed to make the bot go to the specified coordinates by finding the shortest path to it. It is still a work in progress at the moment.
sendAMessage <botName> <message>
This command lets you talk to the bot.
teleportForward <botName>
This command will teleport the bot forward by 1 block.
testChatMessage <botName>
A test command to make sure the bot can send messages.
Example Usage:
/bot spawn Steve