Local LLM and HomeAssistant

When dealing with voice assistants, I got sick of Amazon shilling everything under the sun on the Alexa device that I’ve used, quite literally, since they launched the service. It was one of the original Echo devices, the tall tower one with a speaker that was…*chef’s kiss* Really an outstanding device for a little over a decade.

This last Christmas, my family got me a couple of voice assistants and a Raspberry Pi 5 (because I begged for them (tbh I put them on my list (tbh my wife was probably sick of me complaining about the alexa))). I asked for them because we have added quite a few smart plugs and a few smart lights in the house over the last few years, and we’ve really gotten used to using Alexa to set alarms and timers and to play some Pandora stations and some Christmas music. The voice assistants and the Raspberry Pi 5 were to replace the Amazon Echo and the Echo Dot (the Echo was in the house, the dot in my office).

It’s been a really interesting journey here, adding devices to the network, getting it to interface with my TrueNAS storage (and my music), getting it to work with Pandora (thank you kind Music Assistant programmers for the nightly releases that integrated it), and getting all the plugs moved over to open source to avoid having to log in through half a dozen manufacturer sites/APIs/Phone Apps for it to work. It’s been fun and educational. If I had recorded myself it’d be some edutainment, I tell you what.

That aside, though, I finally cracked the last bit of a hassle that I’ve had for the previous few months. My job ended due to budget cuts a week ago last Friday, and so I spent some time finally getting this hassle taken care of.

Y’see, I love the concept of LLMs. I like the back and forth one can have with them, the troubleshooting nature of them, and the way they operate. I REALLY hate the fact that giant companies like Google, Microsoft, OpenAI, Anthropic, etc. are forcing them down everyone’s throats while stealing your personal data from the chats. So I looked into throwing a local LLM on my TrueNAS server.

Suffice to say, it worked pretty well. The models were small, but that’s ok, I don’t need complicated stuff. A basic coding model to flesh out ideas, and a chatty model that will work with Home Assistant were what I wanted. I got them to work on my NAS using my ancient 1080 GTX gpu (with 8gb of RAM!), and they did what I wanted…on the NAS.

Hooking them into the Home Assistant system, though, was a pain point. I wanted them to connect, run the models, contexts, and information on my GPU, and spit out whatever response necessary via the speaker hooked into my voice assistant. That’s where there were issues from almost the get-go.

Ollama as a community app on TrueNAS Scale defaults to the most recent CUDA drivers (something like 570 something or another). 550 drivers, though, were the last ones to work with my 1080. So that was a problem, which interestingly enough, Open WebUI skipped over and used the GPU specifically with the Ollama installed models. I tried a few different things suggested by both Gemini and ChatGPT (because I figured why not get their help to replace themselves), and to no one’s great surprise, they sent me down a few rabbit holes where it didn’t work.

The biggest one was with ChatGPT…they had me make a custom Ollama docker image telling me that this would allow me to use an old version that used the driver. The problem, of course, was that AFTER jumping through that hoop, and then doing a bunch of tests that they assured me would work without a problem, they then told me it was impossible to actually use an older version that used that driver and that I was an utter fool for even thinking that would work, offering such clarifying statements as “I apologize, I didn’t clarify enough”…honestly, it was one step away from Obi Wan’s “from a certain point of view…”

Then I started up the ol’ Gemini, and they gave me the solution to pin to an older version of Ollama. That got installed without a problem with the LLM assuring me without a doubt that each step would fix the issue of using 100% CPU by Ollama. Needless to say eventually they too were like “Hey! Y’know what would fix this? Going back to the official Ollama app!”

To no one’s surprise, it didn’t work.

The LLMs finally admitted there was nothing wrong with my GPU (after telling me repeatedly the GPU was a problem and so was Ollama, and me just as often repeatedly telling them that Open WebUI used the GPU), and so sent me to the Home Assistant settings. None of them worked. They sent me to an eXtended OpenAI conversation community app on Home Assistant. It didn’t work.

They sent me to the official OpenAI conversation app…it didn’t work because I don’t have an API key for ChatGPT and there was no way to point it to my local LLM. They finally said, y’know what, the official Ollama home assistant app should work! This after me repeatedly telling them it processed everything on the CPU.

Finally I asked if there was a way we could just use the Open WebUI app on my server to process the GPU calls…and then things started working thankfully. Now the LLM uses the GPU instead of the CPU, the response time is almost instantaneously compared to the CPU processing.

I’m, in the words of the British, chuffed.

Of course, this would have worked flawlessly if I had a newer GPU, but that’ll come in time.

Big Cell Phone Won This Round

I hate to say it, but I need to go back to a stock ROM for my cell phone. I had rooted it last August to put LineageOS on it, because then I was able to get access to Android 15 and all the whiz-bang security updates.

Y’see, Google stopped updating my Pixel 4a in September of 2023. The phone is still good, though, unlike my wife’s Pixel 4a which they effectively killed with the battery issues (which started this whole NAS situation). Mine was still good, and last summer I tried a couple custom OS’s, one was GrapheneOS and the other was LineageOS.

GrapheneOS was ok, but it was REALLY locked down, so much so that few of my apps were working, and it tended to be a pain. So I switched to LineageOS, and I loved it. Weekly security patches, Android 15, pretty much similar to the new Android devices.

The problem, however, was that the boot loader needed to be unlocked. Not a big deal, says I, as only one app wouldn’t install and that was Coinbase. Oh well, that’s what they make the browser for, amirite? So, for the last year it’s been great.

Until.

End of June. Mint Mobile starts ramping up to get RCS on all devices (RCS was working on my device)…but to get RCS working, they need to work with the great Google. Well, the great Google doesn’t like devices that they can’t control…and since I don’t have a boot locker on my device, well it flagged…and google messaging just auto-kills RCS messages for devices like mine. They do this update a couple times a year, from what I understand, just to kill this option.

You must understand: they’re doing it for us because so many spammers are using unlocked devices. And spammers are bad, see, because they’re not paying google the advertising fee that good spammers are paying…spammers like facebook or amazon or other advertisers. They’re good, because google gets money. They’re not bad like people like me, who didn’t want to pay hundreds of dollars to update a perfectly good phone.

I was ok with losing some access to apps, but I was starting to not get messages from my wife, from mom and family, from other important entities that I needed to communicate with. That won’t fly, and dagnabbit, I’m not dropping hundreds of dollars for a new phone when this one is working just fine…and I’ll just deal with the security hassles.

This is what planned obsolescence looks like folks. It’s not your fridge stopping two months before the warranty is out. It’s the massive amount of control these companies have over our technology and data…and the way they can manipulate us away from making our own choices. Choices that work for us and make our lives better, but may cut out some sliver of profits.

I used to support capitalism. Capitalism used to be about making money. But I really think American Capitalism is starting to be more about control…control over everything.

So I’m back on my pixel 4a running android 13. I’ll look into updating security measures as I go, but if I’m careful, then I should be relatively safe. I hope. Who knows though.

EDITED ABOUT AN HOUR LATER: Some of you may think it wasn’t Google that was blocking RCS messages, even though they’ve been doing it since 2021 according to some sources. Honestly I don’t care if you believe me…my proof came in via RCS chats on Google Messages as soon as I signed into the reinstalled stock version of my Pixel 4a. Every message since about June 13th that they’ve blocked.

Google you’re chasing me into the arms of my NAS

Google destroying workable options in their apps for zero reason like the music controls in maps is yet another reason I’m working on moving away from corporate resources.

Honestly, there’s no customer-based reason for them to have limited it only to a couple apps. It was working fine for Musicolet (which is what I use for my cell phone based music), but I suppose they’re not getting any cash or advertising from a free service instead of a subscription service. I guess it still works for Spotify. Back room deals anyone?

In the TrueNAS Scale department of divorcing myself from corporate overlords, I’ve been using some TailScale for away-from-home access to my storage (mainly for Immich, since my wife and I take a lot of pictures). I mean, accessing my home library of movies is great too, but MUCH less of a concern. The photo gallery is the big one. I’d set up proxy and protective services myself, but Paulbunyan.net uses CNAT addressing, and don’t allow incoming connections. So Tailscale it is, for the time being.

In the coming months (once I sell enough old comp equipment to afford the new hard drives), I’ll be also installing and working with Home Assistant instead of the Amazon Alexa ecosystem. I’ve loved using the Alexa for things like playing Pandora, asking about the weather forecast(hello middle age), controlling wifi plugs/lights, and mainly setting kitchen timers…but similar to Google, I’m getting leery of allowing corpoorate entities that much of a peek into my day-to-day.

And yes, yes, yes, I know I KNOW about the dangers of cell phones and apps constantly listening (*cough cough* FACEBOOK *cough*), and I’m running almost everything through Brave browser on the phone to try and cut that down. I still have to be part of the Googly Moogly system, though, due to work and access (though much of my private day-to-day email is still very much handled via Froyd.net servers). Heck, I use Duckduckgo for 99% of my internet searching now.

I just want to take back control of what I can, using open source programs created by like minded individuals as opposed to corprations who answer to stockholders. It’s the independent Norwegian in me…to do it all myself, I don’t need help. That and my disgust with things changing ALL THE TIME without even a notice or a how-do-you-do.