Hi, I'm Anders

Hello makers!

I’m an artificial intelligence engineer by trade, but I spent my youth in my dad’s workshop doing all sorts of wood- and metalworking, as well as some electronics and chemistry. My interest in miniatures as a hobby often leads me to miss the workshop equipment I had access to growing up. I hope joining this space will let me take some of my hobby projects off ice and meet some fellow makers. I’ve also considered exploring the use of machine learning in manufacturing, and this would be a good place to experiment.

4 Likes

Hi Anders,

Welcome to our community! Great that you found us! Sounds like you’ll fit right in. If you’re available this coming Wednesday, 12 July, come and visit the Makerspace for a tour and a chat - we’ve got an Open Evening lined up.

See you at the Makerspace,
Mark

1 Like

Hi @Dapperwocky,

Welcome to the space. If you haven’t seen it yet, @Geraetefreund and I are hosting an open evening this Wednesday.

Glenn

Hello,

Nice to e-meet you both! I’ll try to make the Wednesday open evening :smiley:

Anders

Hi and welcome!

Do you have any experience with AI using Raspberry Pi?
I’m looking for someone to discuss it with.

Thanks
Brendon

Happy to discuss :slight_smile:

Most of my experience deploying AI models is on very large and powerful machines in the cloud (or clusters of them). I’ve used Raspberry Pis in the past, but not for ML. The same principles should apply, though, and I see lots of people making it work. What are you trying to accomplish?

Anders

Thanks for the reply.

I’m creating an open build project to teach kids about IoT, physical computing, home automation, computer vision, edge computing with API calls etc.

My current build uses a Raspberry Pi Zero 2 with Node-RED as the primary interface and a USB webcam.
I can’t even get the TensorFlow node to install.

I then tried a Raspberry Pi 3B+ which installs the TensorFlow node but is verrrrry slooooow.
I’m now considering a Raspberry Pi 4 but don’t have one to test (and they are rare and expensive).

I’m also tripped up by my fundamentally poor understanding of the AI landscape, the options for hardware and software stacks etc. I’m also wondering if I should be using the Raspberry Pi as an edge device with all computations done in the cloud.

So many questions!

Just a quick question: Yesterday, I’d sent you an invitation to join the Makerspace. Did you receive this e-mail from info@southlondonmakerspace.org?

No rush to join - the invitation has no expiry date - I just wanted to check that it hasn’t ended up in spam; that’s happened several times over the past months for other members. Just trying to sort this out. Thanks!

Sadly, I don’t think the Zero 2 will be able to handle much machine learning on its own - it just doesn’t have the computational horsepower.

Have you heard of Google Coral ML accelerators? They have a USB plug-in board that works with the Pi, and it sounds like this would be an ideal use case for what you’re trying to do.

Also - welcome @Dapperwocky!

I can lend you a Pi 4 (without SD card) for tests if it’s not for too long - I have one running my NAS (which is currently disassembled anyway) and Pi-hole (which I can live without for a week or two).

Yes, I got it, thanks! All signed up :slight_smile:

Awesome! :partying_face: You’re all set - all members’ areas in Discourse are yours to explore.

Come and collect your keyfob so you can open the door and start getting your hands dirty!

1 Like

It sounds like you’re trying to run a typically very resource-hungry ML model (computer vision) on a very resource-constrained device (any RPi). This has been done plenty, but the performance you’ve observed is expected. How resource-hungry a model is varies wildly by architecture, but broadly speaking anything computer vision related sits near the top, just under LLMs. This is compounded by the fact that most modern computer vision models are optimised for dedicated, power-hungry CUDA GPUs (read: NVIDIA) or TPUs (tensor processing units), not the tiny, low-power-draw CPUs in RPis. The RAM on an RPi is also significantly smaller and slower than the VRAM on a dedicated GPU, and the underlying Python library NumPy that a lot of ML code relies on was not well optimised for Raspberry Pis last I checked.

TensorFlow is an infamously lousy installation experience, so I’m not surprised you’ve had issues. There are great alternatives to TensorFlow, like PyTorch, that can be easier to wrangle, but I think your bottleneck is still the hardware:

We usually talk about CPUs as compressible resources and RAM as non-compressible resources. What we mean by this is that if you are maxing out your CPU throughput for a task, it just takes longer (doesn’t crash) whereas if you try to use more RAM than is available (including page file, etc) you’ll instantly get an Out Of Memory error (crash, though sometimes hidden from the end user).
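To make the compressible/non-compressible distinction concrete, here’s a toy sketch (the exact failure behaviour depends on the OS’s memory overcommit policy and whether Python is a 32-bit build, so treat it as illustrative):

```python
# CPU is compressible: under contention this loop just takes longer.
total = sum(i * i for i in range(10_000_000))
print(f"done counting: {total}")

# RAM is not compressible: asking for far more memory than the machine has
# fails outright instead of running slowly. Depending on the OS's overcommit
# policy this may fail at allocation time or at first use; on 32-bit builds
# the size won't even fit in an index, hence the OverflowError case.
try:
    hog = bytearray(1 << 40)  # attempt to grab 1 TiB in one go
except (MemoryError, OverflowError):
    print("out of memory - the task crashes rather than slowing down")
```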

When you try to install and load TensorFlow, you need enough RAM for Python itself + TensorFlow + all the memory objects created by the packages TensorFlow depends on + the size of the rehydrated model artifact. This may be a tall order for an RPi Zero and could explain why it failed for you. That said, I believe TensorFlow dropped support for ARMv6, so that could also explain it. If it’s the latter, you might be able to get around it by compiling TensorFlow yourself.
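If you want to see how much of that budget the import alone eats, here’s a rough stdlib-only sketch (the resource module is Unix-only; on Linux, including Raspberry Pi OS, ru_maxrss is reported in kilobytes):

```python
import resource

def peak_rss_mb():
    # Peak resident set size of this process. On Linux (including
    # Raspberry Pi OS) ru_maxrss is in kilobytes; on macOS it's bytes.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

print(f"before import: {peak_rss_mb():.0f} MB")
import torch  # or tensorflow - imported late on purpose to measure its cost
print(f"after import:  {peak_rss_mb():.0f} MB")
```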

In your RPi 3B+ example, the fact that it works, albeit very slowly, indicates that there is at least sufficient RAM, but some of the slowness could come from which memory is being used: if the model needs more than the physical RAM, it will spill into the page file, which is orders of magnitude slower than regular RAM. That being said, I suspect most of the slowness comes from the throughput of the CPU.
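You can put a number on that before spending money on new hardware. A minimal timing sketch, assuming PyTorch is installed; the little conv stack is just a stand-in for whatever vision model you actually run:

```python
import time
import torch

# Placeholder model: a small stack of conv layers standing in for a real
# vision model. Swap in whichever architecture you actually use.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, stride=2),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 32, 3, stride=2),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
    torch.nn.Linear(32, 10),
)
model.eval()

# One 224x224 RGB frame, a typical input size for vision models.
frame = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    model(frame)  # warm-up pass so one-off setup cost isn't measured
    start = time.perf_counter()
    for _ in range(10):
        model(frame)
    elapsed = (time.perf_counter() - start) / 10

print(f"average forward pass: {elapsed * 1000:.1f} ms")
```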

Some options:

  • Optimise the performance of the existing hardware
    • Check that the CPU is fully (100%) utilised under load and isn’t being thermally throttled (on a Pi, vcgencmd get_throttled reports throttling flags); add fans, etc., to maximise throughput
    • See if you can get away with disabling the page file.
    • Add an external dedicated GPU (difficult and very expensive)
  • Optimise the architecture of the model
    • Check that you are using the latest, production-ready version of the model. The ML landscape moves fast, and many optimisations are baked into new model releases, so there are usually performance improvements from updating to the latest release. Just make sure it’s not a ‘beta’ or ‘experimental’ release - those often have performance regressions that aren’t fixed until they’re promoted to ‘production ready’.
    • Some models can be retrained or tuned to trade off precision and accuracy (result quality) for speed and a smaller memory footprint. Without knowing your model, I can’t say if this is the case. It usually revolves around dropping floating-point precision and quantization (there’s a sketch of this after the list), and can be quite scientifically involved. This may not be an option if you’re just downloading a pre-trained model from somewhere and don’t have the access or knowledge to retrain, tune or modify it. The quality-to-performance trade-off may also not be to your liking (i.e. results become too inaccurate for the use case)
    • Try an equivalent model based on PyTorch. No guarantees this will be better or even supported on RPi.
  • Use specialised edge hardware
    • @potatoman mentioned Google’s Coral. There is also NVIDIA’s Jetson. These are typically slightly more expensive than regular RPis, but the price-performance trade-off is VERY worthwhile.
  • Use remote compute
    • As you mentioned, running inference in the cloud would obviate the need for a specialised, potentially expensive edge device. You could still do all the non-ML/computer vision work on a tiny, inexpensive device like an RPi Zero and just call out to a cloud endpoint for the vision tasks (there’s a rough sketch of the edge side below). You could even use the phones people already have as edge devices. This is a very common pattern in industry. Depending on usage (how many API calls for vision tasks per second, etc.) this could be the most cost-effective option, as a single cloud node can serve many edge devices, scale automatically to meet spikes in demand and only charge you for what you use. The main providers would be Microsoft Azure, Google Cloud Platform (GCP) and Amazon Web Services (AWS). Pricing is similar enough between them that usability and familiarity are typically the selection criteria. They also all have ‘free tiers’ which, again, depending on usage, could make it completely free. There are smaller players, but they all require more work on your part to get going in exchange for slightly cheaper prices. I can vouch for Vertex AI on GCP being a good, friendly experience.
    • Consider a decent gaming PC somehow securely reachable via the internet - anything with an NVIDIA RTX-series card and 12+ GB of VRAM.
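On the quantization point above: PyTorch’s dynamic quantization is the lowest-effort variant, needing no retraining or calibration data. A minimal sketch with a toy model (note that dynamic quantization mainly speeds up Linear/LSTM layers; conv-heavy vision models usually need static quantization instead):

```python
import torch

# Dynamic quantization converts the weights of selected layer types to int8
# and quantizes activations on the fly - no retraining or calibration data
# needed. The model below is a toy placeholder, not a real vision model.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(3 * 224 * 224, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)
model.eval()

quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# The quantized model is a drop-in replacement for inference.
with torch.no_grad():
    out = quantized(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 10])
```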

Trying to optimise the hardware and model you’ve already got will likely be the cheapest option in the short term, but the speed gains might not even be noticeable; most edge devices just aren’t built for this. If you’re set on having the model run on the edge device, I would switch to the ML-accelerated boards mentioned above. I still think running the model remotely is the best option: a cloud node is more flexible, and you can switch or modify model architectures in one place without having to deploy new models to all your edge devices. With vision, you’d be transmitting a lot of images, so keep an eye (or a cap) on bandwidth costs.
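To make the remote-compute pattern concrete, here’s a rough sketch of the edge side. The endpoint URL and response format are made-up placeholders; real cloud vision APIs each have their own client libraries and authentication:

```python
import requests

# Hypothetical endpoint: stands in for a cloud vision API or a model you
# host yourself (e.g. on a gaming PC's GPU). Auth omitted for brevity.
ENDPOINT = "https://example.com/api/v1/classify"

def classify_remote(image_path):
    # Send a saved webcam frame as the request body; the edge device does
    # no ML work at all, just capture and I/O.
    with open(image_path, "rb") as f:
        response = requests.post(
            ENDPOINT,
            data=f.read(),
            headers={"Content-Type": "image/jpeg"},
            timeout=10,
        )
    response.raise_for_status()
    return response.json()  # assumed JSON payload with the predictions

print(classify_remote("frame.jpg"))
```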

Is there anything you’d like me to go further into?

1 Like

Wow! Thanks for the really detailed, very structured feedback. You have given me a lot to work with.
Just a note… the Raspberry Pi Zero 2 is ARMv7; the first version was ARMv6 (which I discovered when I went to install InfluxDB).

Thanks again.

1 Like

I ran your scenario past some colleagues and https://modzy.github.io/chassis/ came up as a recommendation. I haven’t used it myself, so I can’t comment beyond several colleagues recommending it for AI at the edge. They specifically call out Raspberry Pis.

Thanks for pointing me at PyTorch.
I managed to get it installed easily on both a Raspberry Pi Zero 2 and a Raspberry Pi 3B+ running 64-bit Raspberry Pi OS.

I then tried to set up a quick demo to test performance and soon realised that I knew way too little about ML and the Python ecosystem to do that.

Do you have a suggestion for a “cut & paste with no understanding whatsoever required” example that I could use?

Thanks

Happy I could be of help.

It’s rare to find “cut & paste with no understanding whatsoever required” in the ML space, especially without ending up paying a service to do it for you. It’s becoming commoditised rapidly, but here be dragons and all that. I might be better able to help if I know a little more about what you’re trying to do.

You mentioned a USB webcam. Are you trying to apply ML to video or still images? Is it a live video stream, or are the video/image files saved somewhere where they can be processed at your leisure? How were you grabbing these and sending them to the TensorFlow model you were using before? Was it all supposed to be handled by Node-RED? What are you hoping for the model to do? Identify birds? Motion? Video segmentation? Do you have a link to any tutorials you’ve followed for the ML parts, including for the previous setup? Which steps did you follow to install PyTorch?

An in-person chat, pairing session or video call might be a better format for this discussion, as I’m likely to have lots of follow up questions.

LOL. All good points.
I’m away for a week, but will be keen to pick this up again then.
Thanks