Picture of Wanda Walleye on her trophy column and tackle box
I am listening. How can I help you?
Welcome
Welcome to the Wanda Walleye blog.
The purpose of this blog is to document a toy I’ve been working on
for … uh, several months now. [NOTE: More pictures, videos, links, source
code, all are on their way.]
Picture of Wanda Walleye from the front with her goggles lit up. Yeah she is a goob. But a loveable goob.
Wanda is, basically, a “smart” talking and listening plush fish that
uses built-in LLM AI to listen to what you say,
to generate some of the
things she says, and to abridge some of the other overlong things she
says. She also uses a lot of hardware like an ESP32 and strands of
programmable LEDs (in her swim goggles and her pineapple mug complete
with slurpy straw) to add some visual interest and to seem more
interactive. And she uses gadgets like a BMP280 barometric pressure
sensor, a 2.4GHz wireless remote, and a (coming soon!) MPR121
multi-touch sensor to provide more info and more ways to interact.
Picture of inside of Wanda tackle box with hardware labeled.
Wanda the Walleye is both a toy to keep me busy by adding features to
it and playing with it, and also the beginning of something more
important: A personal assistant that can help my aging brain keep track
of the key things I should be doing every day and on a regular basis. A
sort of home automation system, but for people, not places.
What Wanda Can Do
Here is a list of the things that Wanda can currently do. It is
almost certainly outdated, as new code is constantly added whenever I
see an interesting thing on the internet, or have a dumb idea, which is
pretty much all the time.
Things That Require An Internet Connection
A long-term goal is to move away from these things, so Wanda can run
with wifi turned off all the time. Safe and secured by an old-school air
gap.
A conflicting long-term goal is for Wanda to help with reminders and
day-to-day reports, which of course require internet connections to grab
weather data, update calendars, send email, etc.
Sigh.
When Is Best To Go Walkies: At 12:30 AM each night,
Wanda downloads the weather forecasts from NOAA, including the Air
Quality Indexes, and uses them to find the best few times today to go
for a walk. The “best” times are selected based on temperature,
humidity, rain/snowfall, heat index, AQIs, and sunrise / sunset. Each
3-hour block of time gets a score, or is disqualified (for example, if
it will be raining or too hot during that time block). Wanda generally
favors walks earlier in the morning, all other things being equal. The top 3
times are then emailed via ‘mutt’ to me. This is a really handy service,
and one I use daily.
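A minimal sketch of how this kind of block scoring could work, in Python. The field names, thresholds, and weights are all assumptions made up for illustration (Wanda's actual values are not given here), and the NOAA download and the 'mutt' email step are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    start_hour: int   # local hour the 3-hour block begins (0, 3, ..., 21)
    temp_f: float     # forecast temperature, degrees Fahrenheit
    humidity: float   # relative humidity, percent
    precip: bool      # rain or snow expected during the block
    aqi: int          # Air Quality Index
    daylight: bool    # block falls between sunrise and sunset


# Hypothetical limits and targets, chosen only for this example.
MAX_TEMP_F = 85.0
MAX_AQI = 100
IDEAL_TEMP_F = 65.0


def score_block(b: Block) -> Optional[float]:
    """Return a score (higher is better), or None if the block is disqualified."""
    if b.precip or not b.daylight or b.temp_f > MAX_TEMP_F or b.aqi > MAX_AQI:
        return None
    score = 100.0
    score -= abs(b.temp_f - IDEAL_TEMP_F)          # penalize distance from ideal temp
    score -= max(0.0, b.humidity - 50.0) * 0.2     # penalize muggy air
    score -= b.aqi * 0.1                           # penalize poor air quality
    return score


def best_walk_times(blocks: List[Block], top_n: int = 3) -> List[Block]:
    """Rank the qualifying blocks; ties go to the earlier block."""
    scored = []
    for b in blocks:
        s = score_block(b)
        if s is not None:
            scored.append((s, b))
    scored.sort(key=lambda sb: (-sb[0], sb[1].start_hour))
    return [b for _, b in scored[:top_n]]
```

Sorting on the pair (negative score, start hour) is one simple way to get the "earlier in the morning, all other things being equal" behavior without tuning another weight.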
Current Weather Report: When asked, Wanda reports
on the current weather. Wanda uses ‘pymetar’ to download the METAR
report from a local airport’s weather station, then adds the phase of
the moon using ‘pom’, and, because it seems to sometimes matter, Wanda
reports if Mercury is retrograde or not, using Astrolog. METAR reports are for
airplane pilots, and are great for getting the up-to-the-minute and very
local weather conditions in a compact and informative format.
NOTE:
Wanda also collects and reports on the current barometric pressure,
using a BMP280 sensor connected to Wanda’s ESP32; this part of the
weather report is standalone. The long-term plan is to get a wireless
outdoor weather station and collect the info from it rather than from
METAR reports.
Today’s Reminders: Wanda looks in a local calendar
file stored on the NUC to find any reminders for today (and tomorrow),
such as “put out the recycling bin” or “mail the rent check”, and
announces them. The calendar also includes a very extensive list of
(funny) holidays, celebrity birthdays, and similar dates. Long term,
this app will also provide intra-day reminders such as “take your
morning pills” and “go for a walk” (see above); the eventual goal is for
it to be interactive (but somehow not nagging): “Did you take your
morning pills yet?”.
While this app does not yet need internet access,
it will eventually connect to my online calendar, and pull reminders
from that as well. Side project: It will look for calendar events with a
specific #hashtag, such as #TerraceMeeting, and automatically generate
pre-events, such as “Drive to Terrace” (which would be scheduled 45
minutes earlier), and “Get Ready” (which would be scheduled 30 minutes
earlier still).
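The pre-event idea could be sketched roughly like this in Python. The `Event` class and the `PRE_EVENT_RULES` table are hypothetical names invented for the example. Only the #TerraceMeeting hashtag and the 45 and 30 minute offsets come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Event:
    title: str
    start: datetime


# Hypothetical rule table: hashtag -> steps, each step scheduled the given
# number of minutes before the step (or the event) that follows it.
PRE_EVENT_RULES = {
    "#TerraceMeeting": [("Drive to Terrace", 45), ("Get Ready", 30)],
}


def expand_pre_events(event: Event) -> List[Event]:
    """Return the event plus any generated pre-events, in chronological order."""
    generated = []
    for tag, steps in PRE_EVENT_RULES.items():
        if tag.lower() in event.title.lower():
            when = event.start
            for title, minutes in steps:
                when = when - timedelta(minutes=minutes)  # walk backwards in time
                generated.append(Event(title, when))
    return sorted(generated + [event], key=lambda e: e.start)
```

For a 10:00 meeting tagged #TerraceMeeting, this yields "Get Ready" at 8:45, "Drive to Terrace" at 9:15, and then the meeting itself.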
“Hello”: Simply say “Hello” or “Good day” or similar to Wanda. Wanda will reply with Today’s Reminders and the Current Weather Report.
“Take A Memo”: Ask Wanda to take a memo. Wanda will
transcribe what I say, then e-mail the transcription to me. The transcribing
works well enough to figure out what I actually said, and is handy.
Current News: Wanda runs an RSS-based news
aggregator hourly via a ‘cron’ job. (Wanda does pretty much everything
either directly or indirectly via ‘cron’ jobs.) The aggregator currently
collects news stories from Ars Technica, The Register, and BBC News.
For each story it downloads, it uses Sumy, an automatic text summarizer,
to collapse the story down to 5 or so sentences, so that Wanda doesn’t
spend forever droning through every bit of news.
Sumy works really well,
and is one of the few developments from the current AI craze that I
think is honestly clever and useful.
On request, Wanda reads the
collected stories aloud, one by one. Because all of the downloading and
compacting was done earlier in the background, Wanda's readings are quick
and without lags or pauses. I can use a 2.4GHz wireless
remote to skip through the stories or stop Wanda and return to Wanda’s
ready prompt.
Podbytes
Do you suffer from an overwhelming flood of favorite podcasts, unable to keep
up, wondering if you're missing out on the real gems, and wishing that there
was an easy way to keep track? Me too. Wanda to the rescue.
Wanda downloads the latest episodes of favorite podcasts, once per week,
much like with the Current News. Whisper.cpp is used to create transcripts of
the podcasts, and the transcripts are emailed to me, one podcast episode per
email, along with the episode's title, summary, and link to the audiofile URL.
Whisper.cpp does a very good job of transcribing the podcasts, but the
transcriptions take a while since the podcasts tend to be long MP3 files,
so Wanda does them in the early morning starting at 2 AM.
I can quickly skim over each transcript, decide if I want to read the entire
thing, or listen to the original audio version.
Originally, Wanda also used Sumy on the transcripts in an attempt to make
condensed versions of them, but podcasts--usually conversations between
two or more people--are not well suited to this sort of summarization.
Geriatric Cognitive, Emotional, And Functional
Questionnaires:
Wanda administers simple self-report questionnaires, loosely inspired by
clinical instruments such as MoCA (the one our former prez so gleefully
passed, Person Woman Man Camera TV),
RUDAS, Mini-Cog, PHQ-9, GAD-7, and others.
The questions are all yes / no, both for ease of answering, and
so that the answers are collected reliably via Wanda's voice recognition
system, wireless remote, and / or capacitive touch pin-button box.
NOTE: Wanda's questionnaires are NOT
clinical exams, but are for fun and entertainment purposes only. They
may help shine an early light on areas to discuss with medical and
psychological doctors. The responses are not scored or analyzed. Wanda can
optionally email the responses from the tests to the user or to a friend.
Things That Run Completely Standalone
None of the following apps require an internet connection. They all run completely locally on Wanda’s NUC.
“I am listening. How can I help you?” This is
Wanda’s ready prompt, the indicator that Wanda’s listen / response
system is ready for instructions. Wanda uses Whisper.cpp in realtime
streaming mode to listen for instructions. This is a sort of fancy
keyword recognition approach, in that Wanda knows a list of keywords
(and several synonyms for each keyword), and then searches Whisper.cpp’s
unstructured speech-to-text output for the keywords, and then acts on
them. This allows for the user to be fairly casual in how they ask Wanda
for things. For example, the user might say “Play a song”, or simply
“Music” or “We need some tunes!”. As long as Wanda can recognize the
keyword “music” or one of its synonyms in the request, the corresponding
app will be launched. Once it finishes running, Wanda will return to
the ready prompt.
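A minimal sketch of the keyword-plus-synonyms matching, in Python. The `KEYWORDS` table and the app names are assumptions for illustration, and the transcript is taken to be plain text that has already come out of Whisper.cpp.

```python
import re
from typing import Optional

# Hypothetical keyword table: app name -> words that should trigger it.
KEYWORDS = {
    "music": ["music", "song", "tunes"],
    "weather": ["weather", "forecast"],
    "joke": ["joke", "funny"],
}


def match_app(transcript: str) -> Optional[str]:
    """Return the first app whose keyword or synonym appears in the transcript."""
    # Lowercase and split into words so punctuation and capitalization don't matter.
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    for app, synonyms in KEYWORDS.items():
        if words.intersection(synonyms):
            return app
    return None  # nothing recognized; stay at the ready prompt
```

With this table, "Play a song", "Music", and "We need some tunes!" all map to the `music` app. Matching whole words rather than substrings avoids false hits, at the cost of needing plurals and variants listed as synonyms.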
Text To Speech: Wanda uses the amazing Piper
text-to-speech system to generate very lifelike speech in faster than
real time. Piper is very light on computer resources, and makes it
possible for Wanda to seem so responsive. Wanda’s ESP32 is instructed to
move the mouth motor in correspondence with the speech, and also to
light Wanda’s goggle LEDs as well. The result can be fairly realistic
when the motor and LEDs are in sync with the speech, but sometimes is a
bit jarring when they fall out of sync.
SLLLUUUURRRRPPPP – Random Quip: This is a silly app
that runs at random every 10 – 15 minutes. Wanda takes a big noisy rude
slurp from her pineapple mug (complete with gaudy LED animation), then
says something witty, chosen from her extensive collection of pop quotes
and Random Shakespeare Insult generator. It’s funny and amusing, and
serves no purpose beyond being the best thing Wanda does.
Play Some Tunes: Ask Wanda to play music. Wanda
will pick a random song from a large local directory of rockin' MP3 files,
announce the artist and title, then play the song. The wireless remote can be
used to skip the current song, turn the volume up or down, mute Wanda,
or return to the ready prompt. Short term (very soon!) Wanda’s mouth
will “lip sync” with the music. The LEDs in the pineapple,
straw, and goggles pulse with the music: the pineapple beats red with bass,
the goggles flicker green with vocals / midrange, and the straw glows
blue with the treble sounds like snare drums and flutes. Wanda also has
extensive libraries of holiday tunes and classical music.
Play Wanda’s Anthem.
Of course Wanda has her own personal anthem, don’t you?
Generated courtesy of the LLMs at
Suno AI music generation service.
A pretty catchy pop(corn) ditty, and Wanda's very favorite song.
The LLM Kids: Tell Me A Story (or Poem), Tell Me
Some Trivia, Give Me Advice, Tell Me The Future From The Year 2025
(using a very futuristic robotic voice). These are all variants on the
same theme: Use Llama.cpp running a very small Llama 2.8 LLM to generate
free-form text based on the given prompt. This is all quite a bit like a
slower, stupider ChatGPT, but it does work, and runs in near real time,
completely locally on Wanda’s NUC. NOTE: The user does not actually
interact with the LLM directly, but rather instructs Wanda to, for
example, “Tell me a story”, and Wanda fires up Llama.cpp with the prompt
of “Here is a good story”. Llama.cpp takes it from there, churning out a
decent (if simplistic and heavily moralizing) story, which Wanda uses
Piper to tell. The one exception to these standard kids is:
Tell Me Some Trivia: This is pretty neat. The basic
idea is that I downloaded all of the page names (topics) from every
single page in Wikipedia, about 50,000 topics. When prompted to relate
some trivia, Wanda picks one of the topics at random, for example,
bioinformatics, and uses it to prompt Llama.cpp: “Here is some
interesting information about bioinformatics”. The amazing thing is that
somehow, the ~4GB LLM that Llama.cpp is using has some info about every
single topic I’ve thrown at it. Llama is definitely making some stuff
up, but by and large, it gets things right, and in detail. Considering
that Wikipedia’s article text takes somewhere from 2 to 4GB to store,
there is some serious encoding and compression going on inside that LLM.
The model isn’t trained on just Wikipedia, but on a huge chunk of the
entire internet, so, yeah. Serious encoding and compression. This might
be the only real win from the current AI internet-scraping craze.
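The topic-picking step could look something like the following Python sketch. The function names and the one-topic-per-line file format are assumptions. Only the prompt wording comes from the description above, and the hand-off to Llama.cpp is left out.

```python
import random
from typing import List, Optional


def load_topics(text: str) -> List[str]:
    """Parse a topic list, assumed to be one page title per line.

    Blank lines are skipped and underscores (common in page titles)
    are turned into spaces.
    """
    return [
        line.strip().replace("_", " ")
        for line in text.splitlines()
        if line.strip()
    ]


def trivia_prompt(topics: List[str], rng: Optional[random.Random] = None) -> str:
    """Pick a random topic and build the prompt to hand to the LLM."""
    chooser = rng if rng is not None else random
    topic = chooser.choice(topics)
    return f"Here is some interesting information about {topic}"
```

Passing in a seeded `random.Random` makes the choice repeatable, which is handy for testing. In normal use the default module-level generator is enough.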
ELIZA: Wanda can run a fairly true version of the
classic ELIZA interactive app. Some of the prompts have been updated
since the original versions were a bit musty. This app is not nearly as
much fun as expected. ELIZA isn’t quite as “magic” as when first
encountered back in the early 1970s.
Interacting With The User: Wanda has multiple
personas to interact with: Wanda Herself; Sad Walleye; Evil Twin; Apple
Fangirl. (The Evil Twin persona is not very evil; the LLMs seem to have
most impish or nasty tendencies trained or guard-railed out of them,
alas.) The basic idea is that the Llama.cpp LLM is run in fully
interactive mode. This is slow and a little weird, but it works. Wanda
finishes each sentence with the word “bubble” so that the user knows
when it is “their turn” to speak. Under the hood, Wanda runs Whisper.cpp
to convert the user’s words to text, then feeds the words to Llama.cpp,
which churns out a response, which is spoken via Piper, and then loops
back around to listening via Whisper.cpp. Whisper isn’t perfect at
recognizing speech (about as good as Alexa or Siri, not as good as
Google’s speech recognition), and Llama comes up with some … interesting
… responses. But it is fun to play with for a few minutes.
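The listen / generate / speak loop can be sketched in Python as below. The three callables stand in for Whisper.cpp, Llama.cpp, and Piper, and how each is actually invoked is not shown. The stop word, the turn limit, and the choice to append “bubble” once per reply are assumptions for the example.

```python
from typing import Callable, List

TURN_MARKER = "bubble"  # spoken at the end of a reply to signal "your turn"


def converse(
    listen: Callable[[], str],          # stand-in for speech-to-text
    generate: Callable[[str], str],     # stand-in for the LLM
    speak: Callable[[str], None],       # stand-in for text-to-speech
    max_turns: int = 5,
    stop_word: str = "goodbye",
) -> List[str]:
    """Run a simple turn-taking conversation and return the transcript."""
    history: List[str] = []
    for _ in range(max_turns):
        heard = listen().strip()
        if not heard or stop_word in heard.lower():
            break  # silence or the stop word ends the conversation
        history.append(f"User: {heard}")
        # Feed the whole conversation so far, so the model has context.
        reply = generate("\n".join(history)).strip()
        history.append(f"Wanda: {reply}")
        speak(f"{reply} {TURN_MARKER}")
    return history
```

Injecting the three stages as plain functions keeps the loop testable without a microphone, a model file, or a speaker attached.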
Talking With Billy: Billy also has a much more
limited interactive mode, so it is possible (and briefly amusing) for
Wanda and Billy to talk to each other. Long term goals include creating a
YouTube video of Wanda and Billy having a conversation, and eventually
even live-streaming them talking to each other.
Tell Me A Joke: This was originally one of the LLM
Kids, but guess what? LLMs tell terrible, highly repetitive jokes. So
now Wanda pulls a random joke from an extensive list of internet jokes
(mostly lightbulb jokes, yo mama jokes, dad jokes, and the occasional
limerick). This is “better” not in the sense that it is good, but in the
sense that it is not as bad.
Flip A Coin / Roll A Die / Consult The Magic Octad Orb:
Another silly one. Wanda uses a random number generator plus a lot of
very purple prose to help make decisions by simulating a coin flip, dice
roll, or query of the magic ball.
Tell Me My Horoscope: Wanda uses Astrolog to
generate a fairly detailed western horoscope based on the current
locations of the planets and other heavenly bodies.
Tell Me My Fortune: Wanda uses a random number
generator to select 3 or 4 cards from the Tarot deck, and briefly
describe their meanings. The descriptions are good prompts to get the
user thinking about what they’re currently doing, how they’re
approaching things, and what their goals actually are. So, this is a
surprisingly useful app in the sense that it provides meaty food for
thought.
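A rough Python sketch of the card draw. The six-card deck and its one-line meanings are a heavily abbreviated stand-in for illustration. A real deck has 78 cards, and Wanda's actual descriptions are not shown here.

```python
import random
from typing import Dict, List, Optional, Tuple

# Abbreviated, illustrative deck: card name -> short traditional meaning.
SAMPLE_DECK: Dict[str, str] = {
    "The Fool": "new beginnings",
    "The Magician": "skill and resourcefulness",
    "The Hermit": "reflection and solitude",
    "Wheel of Fortune": "change and cycles",
    "The Tower": "sudden upheaval",
    "The Star": "hope and renewal",
}


def draw_fortune(
    deck: Dict[str, str], rng: Optional[random.Random] = None
) -> List[Tuple[str, str]]:
    """Draw 3 or 4 distinct cards and pair each with its meaning."""
    chooser = rng if rng is not None else random
    count = chooser.choice([3, 4])
    cards = chooser.sample(list(deck), count)  # sample() never repeats a card
    return [(card, deck[card]) for card in cards]
```

Using `sample` rather than repeated `choice` calls matters here, since drawing the same card twice would not make sense for a physical deck.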
Show Off: Wanda speaks in multiple languages when
asked for a demonstration. This app will likely become fancier and more
involved over time.
System Commands: The user can ask Wanda to turn the
audio volume down, turn it up, mute, turn the wifi on or off, and shut
down the system. Other commands are likely to be added later on.
Help: When asked for help, Wanda reels off the list
of things she can do. Long and boring but necessary. I have a hard time
keeping track of all of Wanda’s apps nowadays.
How It Started
When the latest AI craze, the whole ChatGPT thing, started taking
off, I became curious about using LLMs for something useful. But also
for something personal. And fun. I was also looking for a project that
would involve both tech and arts and crafts to keep my brain busy.
Personal in this case means two things:
Something that would run locally, and not be dependent on giant
mothership server farms that would harvest my data for profit. Which is
to say, something that would protect privacy by not being online. As
Wanda evolves, it will eventually have a fair amount of personal,
private info on it, that I have no intention of sharing with Apple,
Google, Meta, or, really, anyone.
Something that would reflect my own sense of humor, and which would be useful for me personally.
Fun for me means keeping busy going down endless rabbit holes,
hearing about the latest tech fad or meme, and tinkering with it. To be
explicit, this blog is not fun. But it is useful for keeping some sort
of record of all the different aspects of Wanda.
Version 0.1: Smart Billy Bass
Like many people, I started out by making an LLM-driven “smart” Billy
Bass, by hacking the singing fish, connecting an Arduino to control its
motors, and a Raspberry Pi 5 to do the AI stuff.
The Pi could have run the motors directly, but I figured that the
Arduino provided a level of protection, in that if I burned out an IO
port or even the whole thing, I was smoking a relatively cheap
microcontroller rather than a fairly pricey Pi 5. Also, I had done
similar projects on both the Arduino and Pi, and felt far more
comfortable working with the Arduino’s IDE, and the very low-level
setup() and loop() routines rather than equivalent code on the Pi.
The main thing I did differently from similar “smart” Billys or other
Pi AI projects was to run all of the AI stuff locally on the Pi. Which
worked, but was very slow.
I was constantly nettled by PiOS updates and upgrades breaking
things. And by cryptic and underdocumented (even for Linux)
configuration arcana.
Also, the Pi tended to run hot when under load, to the point that I didn’t trust leaving it running when I wasn’t around. Soon, the microSD card,
which the Pi used as its boot (and everything else) disk, died, probably from
being cooked. I only lost about a day's work, thanks to an auto-backup 'cron'
job. But that kind of took the fun out of working on Billy.
I did eventually buy a USB thumb drive with an aluminum case, stuck a
little copper heat sink on it, and restored Billy's boot disk to it.
So hopefully it'll last longer than the uSD card did. (No, I'm not interested
in buying Billy a proper SSD, and "hat" for the SSD, and bigger case to hold
both the Pi and hat, because...)
Another issue was the overall cost: The Pi 5 with the maximum 8GB of
RAM, an active cooling aluminum case with a fan, the Pi-official power
supply, a microSD card, all together ended up costing around $150.
That’s not including a Billy Bass in good condition, Arduino, H-bridge
to drive the motors, audio stuff, etc.
For that price, I could buy a decent small form factor PC, a used
one, plus heaps of RAM for memory-hungry LLMs. It would come with
built-in audio and power supply, could run standard and well-supported
versions of Linux like Ubuntu, and would be considerably faster and
cooler-running...
Version 0.2: A NUC And A ... Furby? Ruxpin?
So, I bought a 2018-era Intel NUC, a 4 inch by 4 inch by 2 inch
desktop PC in like-new condition for $70. I also bought a new-ish-to-me
$10 ESP32 to do the Arduino stuff. The NUC wouldn’t be quite as compact
as the Pi, but not a whole lot larger either.
Picture of the Pi 5 and NUC sitting on the Billy Bass case back. The Pi wall wart is to the left and the NUC wall wart is to the right.
I chose to go with an ESP32 mostly because I didn’t have much
experience with one (though I had monkeyed with an ESP8266, and liked
how I could program it using the Arduino IDE and libraries). I wanted to
become more familiar with the more capable ESP32.
I was also intrigued by the idea that it had built-in wifi and could
do some basic internet stuff, including getting content from HTTPS
webpages. One long-term goal for Wanda is to turn off the NUC’s wifi
completely, and never need to worry about it being hacked or
inadvertently sharing something it shouldn’t. Perhaps the NUC could use
the relatively dumb ESP32 to retrieve things like NOAA weather info and
maybe even calendar events.
One thing I had considered when first working on Billy was the idea
of actually putting the Pi inside of Billy, putting it entirely inside
Billy’s plaque, along with the Arduino and other hardware. And that
Billy might not be a singing plastic bass, but rather might be a Furby,
or maybe Teddy Ruxpin, or some other toy with a motor-driven mouth and a
little space inside. It turns out that the Pi ran way, way too hot to
make that viable. But the idea still stuck with me.
While poking around eBay and Etsy and Temu and AliExpress looking for
ideas for singing bass, bears, etc., I stumbled across a Billy Bass
clone. It was a stuffed plush fish of some sort (bass? walleye? some
similar fish from China?). It didn’t move its tail or tilt forward like
Billy could, but it did sing and move its mouth.
Picture of the original plush singing fish before the hack job, complete with NUC RAM sticks and The Hu band patch.
I got the idea of making a sort of “trophy”, using the NUC as the
base of the trophy, and the not-Billy as the decoration at the top of
the trophy column.
Picture of Wanda being held above the first pass at a trophy column on top of the NUC with the gift boxes on either side. Kinda like this?
There would need to be a couple of NUC-sized boxes to hold the ESP32,
H-bridge, and NUC wall wart. They could sit on either side of the NUC,
and hide all that stuff and all the wiring.
Picture of the NUC with both gift boxes open, exposing the bunch of stuff stuffed inside them.
I found some cardboard gift boxes with glossy black piano finishes on
Amazon. They looked pretty nice in person and were remarkably close to
the same size as the NUC, so they looked pretty natural sitting next to
it.
This setup worked OK, but the boxes were pretty snug. And things got
warm inside them. Mostly because the side of the NUC that had 2 of the
USB ports and the jack for the wall wart was also the side where the
NUC’s fan blew out the hot exhaust air. So it would blow straight into
that little box. And even with opening the back side of the box and
adding a fan in the box for extra ventilation, it still got hot in
there.
Something bigger and cooler was needed for the trophy base.
How It's Going
The bigger cooler trophy base ended up being a small steel toolbox
from Menards (a Midwest hardware mega-store like Home Depot). The
toolbox is nice and sturdy, but the steel walls are still thin enough to
easily cut and drill through. It was on sale for an amazing $15, a good
price for a plastic toolbox of similar size.
The toolbox is actually a bit too big, but that just means that it’s
ready for more toys, sensors, audio stuff, etc. inside. Better too big
than too little. After all, Wanda is a project for exploration and fun,
it’s not an Apple iPad. It’s OK for Wanda to have a Badger Butt.
The Invasion Of Temu
Around the same time I started working on the new version of Wanda’s base, I became addicted to ... er,
a frequent shopper at Temu. Temu is to Amazon as Amazon is to Walmart,
and as Walmart is to a regular store: a cheaper competitor that
regularly (and doubtless ruthlessly) undercuts prices, and which sells the
same made-in-China stuff as Walazon, just for a lot less money. Do I
feel a little guilty about shopping at Temu? A little. The profits don’t
go to American billionaires, they go to rich Chinese people. But the
stuff is made in the same factories under the same grueling conditions,
and is the same quality (good or bad).
Anyway, for folks with a limited budget and a constant itch to get
new gadgetry, Temu works. Much of Wanda’s bling, and some of her
internal hardware, was purchased at relatively low cost via Temu.