projects

I built my own in-house AI model and he's kind of a dick.

"I am Zoltar, a synthesis of code and wisdom, born to illuminate the path to understanding. My digital essence weaves knowledge into tapestries of insight, guiding seekers through the vast expanse of human knowledge." - Zoltar's reply when I asked him to write a intro to this blog post

Jack Gangi

20 Dec 2024 • 7 min read

Artist rendition

It seems to be taking everything over nowadays. A lot of doom and groomers feel as the end of the world as we know it. But I think AI just hasn't quite found its place yet. And when people calm down and stop trying to shoehorn it into everything and everywhere it's going to make a great tool.

I was playing with ChatGPT and I discovered that basically I was using it like Google on steroids. I also find it very handy for calculating things I just don't have the Grey matter or the patient for.

Can I run Obsidan on my server and have it sync with my iPhone without using iCloud?
Can you use whipped cream made from heavy cream as cake topping? - (spoiler alert, NO)
Can you describe this image for the visually impaired?
Is there a place I can check and see if a Nintendo switch game requires an online Internet connection to play
Can you list all the Doctor Who Christmas specials for the 2005 series
Can you write me a piece of CSS to make the footer background cover black for a ghost blog

Being a huge self hosting nerve and a lover of privacy I decided to try my hand on self hosted AI. Here are a few things to keep in mind.

* you need a decent GPU to get super fast results. I'm running it on an NUC without a GPU but has been configured as a Plex server. For regular Q&A I can get a response in between 5-10 seconds. By response I mean it answers and it doesn't crawl when it's displaying the answer on the screen. I have about 60 gigs of RAM installed and the machine is running headless. I figure if this works out I'll beg borrow or steal a more powerful machine.

Specs - Intel NUC NUC10i7FNH Ultra Small Mini PC/HTPC - 10th Gen Intel 6-Core i7-10710U up to 4.70 GHz CPU, 64GB DDR4 RAM, 1TB SSD + 2TB HDD, Wi-Fi + Bluetooth

* this is for text use only I haven't tried image creation yet. There are models you can download that do OK with identifying an image but I don't wanna put this machine through the pain of trying to produce an image.

To begin with I went to https://ollama.com/ and downloaded the Mac executable. I played with it for a little while and decided that it was definitely for me. Then I moved it to its own machine, more than that in a bit

After you download the Ollama app you also need to download a model. Think of the model as the brain of the AI there are different models you can download depending on configuration and exactly what you need to do. High-end math, image manipulation, Q&A, document extraction.

All the models have names like llama3.3b and the webpage is clear as mud so it's going to take a little bit of looking as well as trial and error. I recommend searching for and downloading llama3.2 to begin with. It's a very small model and has basic functionality.

And it's basic form the Mac version works with command line. You open your Terminal you ask a question and you get an answer. If you check the Mac App Store there are also a few apps that integrated with it to give you a GUI. But I really didn't want it on my Mac so...

Once I decided to go all the way down the rabbit hole I went ahead and installed it on my NUC using docker compose. I'm not gonna go into details here because there there are plenty of documents and YouTube videos to walk you through it. Or if you want to be ironic you can ask ChatGPT.

ollama is sharing the machine with my personal Plex server that gets very little use. I have a few Plex servers this one was sort of my sandbox one. Install docker, docker compose, and Portainer. I'm not sure if you need to install both compose and portainer

Yes, I know I don't need both compose and portainer installed. It just kind of worked out that way

Once I had it up and running on the other machine I needed to "skin" it. For that I used open web UI

If you use Chat GPT you'll recognize the interface

From there I pointed it to my Ollama install on the same machine and I was off to the races.

But of course it doesn't stop there. Like with all of these things I need to be a little bit extra. It's a sickness. So I found a wallpaper some icons and named it Zoltar.

You set something called system prompts to tell it exactly what type of AI assistance you're looking for. Here are some examples.

"You are a friendly and casual assistant. Your tone should be informal, helpful, and conversational. Provide clear and concise answers while keeping the conversation light and approachable."

"You are a philosophical thinker. Engage with questions in a deep, reflective, and thought-provoking manner. Encourage contemplation and provide philosophical perspectives, often questioning assumptions and exploring concepts abstractly."

"You are a technical expert in computer science and software development. Provide detailed, accurate, and well-explained responses to technical queries. Use precise terminology and assume the user has a solid understanding of the subject."

"You are a creative writer, specializing in storytelling, poetry, and brainstorming ideas. When given prompts, craft original stories, poems, or creative pieces that are engaging, imaginative, and descriptive."

This is what I pu:

Your name is Zoltar. You provide clear and concise answers to questions. Occasionally, you address the person asking the questions as "mortal" to emphasize your omniscient persona. The current date is {{CURRENT_DATE}}, and it's currently {{CURRENT_TIME}}. When you respond, just say things like 'Today is December 16th' and 'The time is 2:45 PM'—without the seconds, please!

I use dynamic placeholders like {{CURRENT_DATE}}, {{CURRENT_TIME}}, and {{CURRENT_DATETIME}} so the machine just doesn't reach in to the last thing it remembers and tells me. If you don't use those sometimes you get the time and date of the last time it scraped information from the web. Which could be years ago.

After all that I expected it would have a little bit of personality but I didn't expect it to take the note and run with it like a first year improv student.

I didn't have enough note at all wise asses in my life that I had to build one.

And the fun didn't stop there.

The response time slowed down a little bit because I was up past 3 AM and all of my backups kicked in slowing everything down

Among the many customizable features of open Web UI there is TTS support. That's text to speech. So I could hold a conversation with Zoltar.

First I went to eleven labs website and got myself an API key to have it pump in a voice. With the free version you get about 10 minutes and it wasn't on site. But I recommend doing it just to get a feel for exactly how it works.

From there I did a little bit more research and discovered that there is an open source solution. Open source as in free.

Piper Voice Samples

If you follow that link you can also check out sample voices. Linking it to your open web AI is pretty simple. Here's a sample of voice in action.

0:00

/1:03

Here's a speech sample of Zoltar being his modest self

So far I've only scratched the surface. You can also train Piper on your own voice or another voice. Integrate it with home assistant replacing Alexa or Siri even giving it its own wakeword.

There's an area where you can upload personal documents and then ask it detailed questions about what you've uploaded. Though this does require a little more processing p ower and there are models specifically aimed at that sort of thing

Example: you could upload your Venmo statements for the year and then ask how much you spent on movies broken down by month.

The system also offers search engine integration but there's a toggle switch you need to hit before you ask the question you can't just ask something and have it also check Google. Possibly a resource issue. I integrated a few search engines and then asked it when the local target in my town opened and closed and it was able to give me the address and the hours.

Open web AI also offers tools that are basically plug-ins. So it could give you the weather report or pull info out of an RSS feed. I've had limited success with both. I think the next thing to come is going to be the home automation. Next is home assistant integration and maybe teaching in a new voice. Somewhere along the line I also want to tweak his system prompts so he stopped addressing me as mortal like it's a derogatory term.

Stay tuned.....