Remi

GitHub

A simple virtual assistant made with Python to experiment with GPT3 (and because I always wanted to work on one). At its core, the project uses DeepSpeech to detect the “Remi” wake word, Whisper for speech to text, eSpeak for text to speech, and openai-python to query GPT3, which converts natural-language voice input into commands. A handful of other libraries implement the commands themselves, such as typing text or sending a message through a Discord bot.
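As a rough sketch of how those pieces fit together (the model paths, wake-word check, and overall wiring here are simplifications of mine, not the project’s exact code):

import subprocess
import whisper
from deepspeech import Model

# Paths are placeholders; both models are loaded once at startup.
wake_model = Model("deepspeech-0.9.3-models.pbmm")
stt_model = whisper.load_model("base")

def heard_wake_word(audio_frame):
	# DeepSpeech transcribes a 16kHz int16 audio buffer; we only
	# care whether the transcript contains the wake word.
	return "remi" in wake_model.stt(audio_frame).lower()

def transcribe(wav_path):
	# Whisper converts the recorded request into text for GPT3.
	return stt_model.transcribe(wav_path)["text"]

def speak(text):
	# eSpeak reads the assistant's response aloud.
	subprocess.run(["espeak", text])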

It was amazing how easy GPT3 made implementing new commands: all that has to be done is to give the AI a few examples of a command’s usage, and from there it can generally figure out the user’s intent. For example, this lets the AI easily understand that it is capable of typing out text:

def type_text(text):
	# The docstring doubles as GPT3's few-shot examples: each
	# natural-language request is paired with the call that fulfills it.
	"""Type out hello my friends
type_text("hello my friends")
Type out the depth of the mariana trench
type_text(lookup_information("the depth of the mariana trench"))"""
	remi_interfacing.type_text(text)
	return "Typed: " + text
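Under the hood, those docstrings can be gathered into a single few-shot prompt that is sent to GPT3 through openai-python. Here is a minimal sketch of the idea (the prompt layout, model name, and the legacy Completion API call are my assumptions, not necessarily what the project does):

import inspect
import openai

COMMANDS = [type_text]  # grows as more commands are registered

def build_prompt(request):
	# Each command's docstring pairs natural-language requests with
	# the function call that fulfills them, forming few-shot examples.
	examples = "\n".join(inspect.getdoc(cmd) for cmd in COMMANDS)
	return f"{examples}\n{request}\n"

def request_command(request):
	# GPT3 completes the pattern, returning a function-call string
	# such as 'type_text("hello my friends")'.
	response = openai.Completion.create(
		model="text-davinci-003",
		prompt=build_prompt(request),
		max_tokens=100,
		temperature=0,
		stop="\n",
	)
	return response.choices[0].text.strip()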

From then on, GPT3 understands that whenever the user talks about typing something, it should use the type_text() function. Once several of these commands are added, GPT3 also gets a better grasp of the Pythonic function-call output structure that gets parsed into execution, and so it becomes capable of chaining commands by nesting their function calls as arguments, as the lookup_information() example above showed it how to do. The really neat part is that once a few such examples are given, GPT3 figures out how to nest new commands without any explicit nesting examples:

import pyperclip

def get_clipboard():
	# A single docstring example is enough for GPT3 to learn this command.
	"""Get my clipboard
get_clipboard()"""
	return pyperclip.paste()

Because my previous commands contained examples of nested functions, GPT3 understands that this function can also be used to pass the clipboard into other functions. For example, GPT3 can be told to “Type out the contents of my clipboard” and will generate the chained command type_text(get_clipboard()), accomplishing the requested task. From what I tested, GPT3 was extremely reliable at generating and executing my intended commands, even when requests were phrased vaguely, without the precise language used to define the commands in the provided examples. However, I’m not sure whether this accuracy would hold up if many more commands were added, since I didn’t add many to this project. Another downside is that each added command increases the token count of the input by a fair amount, although this may be avoidable: it’s apparently possible to use fine-tuning to “embed” commands into the model so they don’t have to be re-entered with each prompt, though I haven’t looked into this much myself.

Regardless, I was amazed by GPT3, and this undertaking helped me make sense of the ongoing craze to integrate AI into everything, since using GPT3 in this project felt a lot like wielding magic.
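To close the loop, the function-call string that GPT3 returns still has to be run. One simple way (an assumption of mine, not necessarily the project’s actual parser) is to evaluate it against a namespace containing only the registered commands, which makes nested calls like the clipboard example work for free:

# With get_clipboard added to COMMANDS, expose only registered
# commands to eval so the model can't reach anything else.
allowed = {cmd.__name__: cmd for cmd in COMMANDS}

def execute(call_string):
	# Inner calls evaluate first, so nested commands chain naturally.
	return eval(call_string, {"__builtins__": {}}, allowed)

# "Type out the contents of my clipboard" might produce:
execute('type_text(get_clipboard())')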