Deep learning with Raspberry Pi and alternatives in 2024 - Q-engineering (2024)


Last updated: February 23, 2024

Introduction.

This page helps you build your deep learning model on a Raspberry Pi or an alternative like the Google Coral or Jetson Nano. For more general information about deep learning and its limitations, please see deep learning. This page deals more with the general principles, so you get a good idea of how it works and on which board your network can run. Detailed step-by-step recipes for installing the software can be found at deep learning software for Raspberry Pi 4 and alternatives.

Tensor.

A widely used software package for deep learning is TensorFlow. Let's start with the name. What is a tensor?

You can have a list of numbers. This is called a vector in mathematics.


If you add a dimension to this list, you get a matrix.


This way you can, for example, display a black and white image. Each value represents a pixel value. The number of rows equals the height, the number of columns matches the width of the image. If you add yet again an extra dimension to the matrix, you get a tensor.


A stack of 2D matrices on top of each other. Or, to put it another way, a matrix in which the individual numbers are replaced by a vector, a list of numbers. An example is an RGB picture. Each individual pixel (element in the matrix) consists of three elements: an R, G and B component. This is the most simplified definition of a tensor: an n-dimensional array of numbers.
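
As a quick illustration, the three shapes are easy to create with NumPy (a minimal sketch; the sizes and values are arbitrary):

```python
import numpy as np

vector = np.array([7, 2, 9])                       # 1-D: a plain list of numbers
matrix = np.zeros((480, 640))                      # 2-D: e.g. a 480x640 black-and-white image
tensor = np.zeros((480, 640, 3), dtype=np.uint8)   # 3-D: an RGB image, one (R, G, B) vector per pixel

print(vector.ndim, matrix.ndim, tensor.ndim)       # 1 2 3
print(tensor.shape)                                # (480, 640, 3)
```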

There is a subtle difference in the definition between tensors in TensorFlow and mathematics.

In mathematics, a tensor is not just a collection of numbers in a matrix. Here a tensor must obey certain transformation rules. These rules have to do with altering the coordinate system in which the tensor lives without altering its outcome. Most of these tensors are 3D and have the same number of elements as a Rubik's cube. Each individual cube predicts how a physical object will deform under stress (the stress tensor) along a set of orthogonal vectors.

If the observer takes another position in the real world, the deformations of the object itself don't change; obviously, it is still the same object. However, all your vectors or formulas will change given this new position. They will change in such a way that the result of a deformation still remains the same. Think of it as the distance between the tops of two towers. Wherever you stand, that distance will not change. The vectors drawn from your position to those tops will, however, shift according to your position, your origin.


There is even a third meaning of a tensor in this context: the neural tensor network.

The tensor in this special neural network establishes a relationship between two entities. A dog has a tail, a dog is a mammal, a mammal needs oxygen, etc.


The last two definitions are only given for completeness. Many people think that TensorFlow has something to do with one of these interpretations. This is not the case.

Weight matrix

The most important building block of TensorFlow and other deep-learning software is the n-dimensional array. This section explains the use of these arrays. Every deep learning application consists of a given topology of neural nodes. Each neural node is usually constructed as shown below.

(Figure: a single neural node - its inputs, weights, bias and activation function φ.)

Each input is multiplied by a weight and added together. Together with a bias, the result goes to an activation function φ. This can be a simple step operation or a more complex function such as a hyperbolic tangent.

output = φ( w1·x1 + w2·x2 + ... + wn·xn + bias )

The output is the input for the next layer in the network. A network can be made of many layers, each with thousands of individual neurons.

If you look at one layer, the same input array can be applied to different weight arrays, each with a different result, so that various features can be extracted from a single input.

(Figure: a fully connected network with four inputs in yellow, inner layers of four, five, six, four and three neurons in blue, and three outputs in orange.)

In the network above, four inputs (yellow) are all fully connected to the four neurons (blue) of the first layer. These are wired to the five neurons of the next layer, followed by another inner layer of six neurons. After two consecutive layers of four and three neurons, the output (orange) with three channels is reached.

Such a scheme results in a vector-matrix multiplication.

x' = a·x + b·y + c·z + d·w
y' = e·x + f·y + g·z + h·w
z' = i·x + j·y + k·z + l·w

Here the input layer of four values (x,y,z,w) is multiplied by the weight matrix. The weights a, b, c and d applied to the inputs produce x' at the output, the weights e, f, g and h produce y', and so on. There are other ways to write this multiplication, such as

v' = W·v

Where v is the input vector (x,y,z,w), W the weight matrix and v' the output (x',y',z'). The vector-matrix multiplication is one of the most frequently performed operations in TensorFlow, hence the name.
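
In code, one fully connected layer is nothing more than this vector-matrix multiplication, followed by the bias and the activation function φ. A minimal NumPy sketch (the random weights and the choice of tanh as activation are just an example):

```python
import numpy as np

def dense_layer(v, W, b, phi=np.tanh):
    """One fully connected layer: v' = phi(W·v + b)."""
    return phi(W @ v + b)

v = np.array([1.0, 2.0, 3.0, 4.0])   # input vector (x, y, z, w)
W = np.random.randn(3, 4)            # weight matrix: 4 inputs -> 3 outputs
b = np.zeros(3)                      # bias

v_out = dense_layer(v, W, b)         # output vector (x', y', z')
print(v_out)
```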

GPU

Before all the dots are connected, first a little detour into GPU hardware. GPU stands for Graphical Processing Unit, a device initially designed to relieve the CPU from the dreary screen-rendering task. Over the years GPUs have become much more powerful. Nowadays they have over 21 billion transistors and are capable of performing massive parallel computations. Especially in games, where every pixel on the screen is calculated, these calculation capabilities are needed. When the position of the viewer moves, for example when the hero starts running, all vertices must be recalculated. And this 25 times per second to get smooth transitions. Each vertex needs a rotation and a translation. The formula is:

x' = a·x + b·y + c·z + d·w
y' = e·x + f·y + g·z + h·w
z' = i·x + j·y + k·z + l·w
w' = m·x + n·y + o·z + p·w

Here (x,y,z,w) is the initial vertex position in 3D and (x',y',z',w') the new position after the matrix operation. As you can see, this type of arithmetic is the same as for a neural network. There is another point of interest. When you look at x', it is the summation of four products (ax+by+cz+dw). y', on the other hand, is also a summation (ex+fy+gz+hw). But to calculate y' one does not need to know the values that determine x' (a, b, c and d). They have no bearing on each other. You can calculate x' at the same time as y'. And z' and w' for that matter too. In theory, every calculation that does not depend on other outcomes can be performed at the same time. Hence the massively parallel architecture of a GPU. The fastest GPUs today (2024) are capable of a whopping 125 TFLOPS.

This is the whole idea behind GPU acceleration. Transfer all tensors to the GPU memory and have the device perform all vector-matrix calculations in a fraction of the time it would cost the CPU. Without the impressive GPU calculation power, deep learning would hardly be possible.
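
With PyTorch, for example, this boils down to moving the tensors to GPU memory before the multiplication. A sketch that assumes a CUDA-capable device such as a Jetson Nano:

```python
import torch

# fall back to the CPU when no CUDA GPU is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

W = torch.randn(4096, 4096, device=device)   # weight matrix in GPU memory
v = torch.randn(4096, device=device)         # input vector in GPU memory

v_out = W @ v                                 # the vector-matrix product runs on the GPU
v_out = v_out.cpu()                           # copy the result back only when needed
```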

TPU

Driven by the huge market potential of deep learning, some manufacturers replaced the GPU with a TPU, a Tensor Processing Unit. In addition to the vector-matrix multiplication, a GPU also has other tasks to do, such as vertex interpolation and shading, H264 compression, driving HDMI monitors, etc. By using all transistors solely for tensor dot products, the throughput increases while the power consumption decreases. The first generation only works with 8-bit integers, later ones also with floating points. The accelerators on the embedded boards below are all integer-based, except the one on the Jetson Nano. Read an in-depth article here.

GPU Pitfalls

There are a few points about GPU arithmetic that must be taken into account.

To begin, stick to the matrices. The GPU architecture is designed for that kind of operation. Writing an extensive if-else structure is disastrous for a GPU and the overall performance.

Another point is that memory swaps cost a lot of efficiency. More and more, the transfer of data between the CPU memory (where the images are usually located) and the GPU memory is becoming a serious bottleneck. You read the same over and over again in every NVIDIA document: the larger the vector-matrix dot product, the faster it will be executed.

In this regard, keep in mind that the Raspberry and its alternatives usually have one large RAM for both the CPU and the GPU. They simply share the same DDR4 chip(s). Your neural network must not only fit in the program memory, but it must also leave space in the RAM so that the CPU kernel can run. This can sometimes impose restrictions on the network or the number of objects to be recognized. Choosing another board with more RAM may be the only solution in that case. All this contrasts with the graphics card in a PC, where the GPU has its own memory bank.

Another distinction is that the GPU on a video card works with floats or half floats, sometimes also called small floats. The embedded GPU on the Raspberry or the TPU on the alternative boards works with 8 or 16-bit integers. Your neural network must be adapted to these formats. If this is not possible, choose another board with floating-point arithmetic, like the Jetson Nano.

A last piece of advice: don't overclock the GPU too much. GPUs normally work at a lower frequency than the CPU. Some Mali GPUs in ARM cores run as low as 400 MHz. Overclocking can work in the winter, but the application may falter mid-summer. Remember, it's your vision application at your client's site that suddenly crashes, not a game you simply restart.

And of course, the comments on the page about computer vision on the Raspberry also apply here.

Showstopper.

You cannot train a deep learning model on a Raspberry Pi or an alternative. Not unless you have planned a trip around the world in the meantime. The boards lack the computing capacity to perform the huge amount of floating-point multiply-adds required during training. Even a Google Coral cannot train a network, because the TPU on this board works only with special pre-compiled TensorFlow networks; only the last layer in a network can be changed slightly. And although the Jetson Nano has floating-point CUDA cores, it is still not really capable of training a network in an acceptable time. NVIDIA's advice here is to do it overnight. So, in the end, you can only import and run an already trained model on these boards.
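
Running such an already trained model is the easy part. A minimal sketch with the TensorFlow Lite runtime; the model file name is a placeholder and the input is a dummy frame:

```python
import numpy as np
from tflite_runtime.interpreter import Interpreter   # pip3 install tflite-runtime

interpreter = Interpreter(model_path="detect.tflite")   # any converted, pre-trained model
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# dummy input with the shape and type the model expects (e.g. 1x300x300x3 uint8)
frame = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()
result = interpreter.get_tensor(output_details[0]["index"])
print(result.shape)
```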

Cloud services.

As mentioned before, training is not an option on a Raspberry Pi, nor on any other small SBC. However, there is an escape route. All major technology companies have cloud services. Many of them also include the option to run a Linux virtual machine equipped with a GPU. Now you have a state-of-the-art GPU with CUDA acceleration at your fingertips. Google has one of the best free services, with a free 15 GB on GDrive and a minimum of 12 hours of free compute time per day. Now it is possible to train your deep learning models to a certain extent with just a simple Raspberry Pi. Transfer learning (partially adjusting your weights without changing the topology) is doable, because it is a relatively easy task that you can complete in a few hours. Training a complex GAN, on the other hand, takes more resources. It will likely force you to buy additional computing power.
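
Transfer learning itself is only a few lines of Keras: load a trained backbone, freeze its weights, and train a new classification head on top. A sketch that assumes a cloud GPU instance and your own dataset of images (the five classes and train_ds are placeholders):

```python
import tensorflow as tf

# pre-trained MobileNetV2 backbone without its original classifier
base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                         include_top=False,
                                         weights="imagenet",
                                         pooling="avg")
base.trainable = False                                   # keep topology and learned weights

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax")       # new head for 5 example classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# train_ds is assumed to be a tf.data.Dataset of (image, label) batches
# model.fit(train_ds, epochs=10)
```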

Practice.

The first step is to install an operating system, usually a Linux derivative such as Ubuntu or Debian. That is the easy part.

The hard part is installing your deep learning model. You have to figure out if any additional libraries (OpenCV) or drivers (GPU support) are needed. Please note that only the Jetson Nano supports CUDA, the package most deep learning software on a PC uses. All other boards need different GPU support if you want to accelerate the neural network. The development of GPU drivers for the Raspberry Pi or the alternatives is an ongoing process. Check the communities on the net.

The last step is reducing the neural network to acceptable proportions. The famous AlexNet in its original form needs 2.3 billion floating-point operations per frame. That will never run fast on a simple single ARM computer or mobile device. Most models have some sort of reduction strategy. YOLO has Tiny YOLO, Caffe has Caffe2 and TensorFlow has TensorFlow Lite. They all use one or more of the following techniques.

  • Reduce the input size. Smaller images save a lot of computations in the first layers.

  • Decrease the number of objects to classify; it trims the sizes of many internal layers.

  • Port the neural network from floats to bytes where possible. This also lowers the memory load considerably (a quantization sketch follows below).
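
As a sketch of the last technique: TensorFlow Lite can quantize a trained model to 8-bit integers after training. The directory name saved_model and the random calibration data below are placeholders for your own model and typical input images:

```python
import numpy as np
import tensorflow as tf

def representative_data():
    # a few hundred typical input images for calibration; random data as a stand-in
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())                # bytes of the quantized model
```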

Another strategy is the reduction of the floats to single bits, an XNOR network. This fascinating idea is discussed here.

Comparison of Raspberry Pi and alternatives.

Jetson Nano vs Google Coral vs Intel Neural Stick: here is the comparison. The three odd ones out in the list are the JeVois, the Intel Neural Stick, and the Google Coral USB accelerator. The first has a camera onboard and can do a lot, as you can read here.

The Intel Neural Stick and the Google Coral accelerator are USB dongles with a special TPU chip performing all tensor calculations. The Intel Neural Stick comes with a toolset to migrate a TensorFlow, Caffe or MXNet model into a working Intermediate Representation (IR) image for the Neural Stick.

The Google Coral works with special pre-compiled TensorFlow Lite networks. If the topology of the neural network and its required operations can be described in TensorFlow, it may work well on the Google Coral. However, with its scant 1 GB of RAM, memory shortage can still be an issue.

The Google USB accelerator has its own special back-end compiler that converts a TensorFlow Lite file into an executable model for the dongle's TPU.
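
On a Raspberry Pi, such a compiled model is then run through the Edge TPU delegate of the TensorFlow Lite runtime. A sketch assuming the Coral runtime (libedgetpu) is installed; the model file name is a placeholder:

```python
from tflite_runtime.interpreter import Interpreter, load_delegate

interpreter = Interpreter(
    model_path="mobilenet_ssd_edgetpu.tflite",                 # output of the Edge TPU compiler
    experimental_delegates=[load_delegate("libedgetpu.so.1")])
interpreter.allocate_tensors()
# from here on, set_tensor / invoke / get_tensor work exactly as on the CPU
```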

The Jetson Nano is the only single-board computer with floating-point GPU acceleration. It supports most models, because all frameworks such as TensorFlow, Caffe, PyTorch, YOLO, MXNet and others support the CUDA GPU library at some point. The price is also very competitive. This has everything to do with the booming deep learning market, in which NVIDIA does not want to lose its prominent role.

Not all models can run on every device, most of the time due to memory shortage or incompatibility in hardware and/or software. In these scenarios, several solutions are possible. However, they will be time-consuming to develop and often the results will be disappointing.

Benchmarks are always subject to discussion. Some may measure a different FPS using the same models. It all has to do with the method used. We used Python, NVIDIA used C++, and Google used their TensorFlow and TensorFlow Lite. The Raspberry Pi 3 B+ has a USB 2.0 interface onboard. Both neural sticks can handle USB 3.0, which means that they could perform faster. The new Raspberry Pi 4 B, on the other hand, has USB 3.0, which will result in a higher FPS compared to its predecessor.

The numbers shown in the table reflect purely the execution time from input to output. Other processes, like capturing and scaling images, are not taken into account. No overclocking is used, by the way.

Benchmark results (all numbers in frames per second):

| Model | Framework | Raspberry Pi (TF-Lite) | Raspberry Pi (ncnn) | Raspberry Pi + Intel Neural Stick 2 | Raspberry Pi + Google Coral USB | JeVois | Jetson Nano | Google Coral |
|---|---|---|---|---|---|---|---|---|
| (224x224) | TensorFlow | 14.6 (Pi 3) / 25.8 (Pi 4) | - | 95 (Pi 3) / 180 (Pi 4) | 105 (Pi 3) / 200 (Pi 4) | - | 216 | 200 |
| (244x244) | TensorFlow | 2.4 (Pi 3) / 4.3 (Pi 4) | 1.7 (Pi 3) / 3 (Pi 4) | 16 (Pi 3) / 60 (Pi 4) | 10 (Pi 3) / 18.8 (Pi 4) | - | 36 | 18.8 |
| MobileNet-V2 (300x300) | TensorFlow | 8.5 (Pi 3) / 15.3 (Pi 4) | 8 (Pi 3) / 8.9 (Pi 4) | 30 (Pi 3) | 46 (Pi 3) | 30 | 64 | 130 |
| SSD MobileNet-V2 (300x300) | TensorFlow | 7.3 (Pi 3) / 13 (Pi 4) | 3.7 (Pi 3) / 5.8 (Pi 4) | 11 (Pi 3) / 41 (Pi 4) | 17 (Pi 3) / 55 (Pi 4) | - | 39 | 48 |
| Binary model (300x300) | XNOR | 6.8 (Pi 3) / 12.5 (Pi 4) | - | - | - | - | - | - |
| Inception V4 (299x299) | PyTorch | - | - | - | 3 (Pi 3) | - | 11 | 9 |
| Tiny YOLO V3 (416x416) | Darknet | 0.5 (Pi 3) / 1 (Pi 4) | 1.1 (Pi 3) / 1.9 (Pi 4) | - | - | 2.2 | 25 | - |
| OpenPose (256x256) | Caffe | 4.3 (Pi 3) / 10.3 (Pi 4) | - | 5 (Pi 3) | - | - | 14 | - |
| Super Resolution (481x321) | PyTorch | - | - | 0.6 (Pi 3) | - | - | 15 | - |
| VGG-19 (224x224) | MXNet | 0.5 (Pi 3) / 1 (Pi 4) | - | 5 | - | - | 10 | - |
| Unet (1x512x512) | Caffe | - | - | 5 | - | - | 18 | - |
| Unet (3x257x257) | TensorFlow | 2.0 (Pi 3) / 3.6 (Pi 4) | - | - | - | - | - | - |






Raspberry Pi and deep learning.

We have placed a deep learning library and several deep learning networks on GitHub. Together with the simple C++ example code, you can build your deep learning application on a bare Raspberry Pi. It is extremely user-friendly. More information on this page.


As an impression: a TensorFlow Lite model (MobileNetV1_SSD 300x300 with the COCO training set) running on a bare Raspberry Pi reaches the following frame rates.

With a 64-bit operating system like Ubuntu, you get 24 FPS if you overclock to 1925 MHz.

With a regular 32-bit system like Raspbian, you get 17 FPS, once overclocked to 2000 MHz.

Raspberry Pi and recent alternatives.

Below is a selection of the Raspberry Pi and recent alternatives suitable for implementing deep learning models. Most have extensive GPU or TPU hardware on the chip. Please note that the listed prices can fluctuate a lot. The prices shown are from just after the worldwide severe chip shortage. The GPU speed is given in TOPS, which stands for Tera Operations Per Second. The highest score is, of course, reached when using 8-bit integers. Most suppliers give this 8-bit score. If you want an impression in TFLOPS (Tera Floating-point Operations Per Second), divide the number by four; 4 TOPS of 8-bit integer work corresponds to roughly 1 TFLOPS. Although some GPUs, like the Jetson Nano's, aren't capable of processing single 8-bit integers, the score is still given in TOPS, just for comparison.


Raspberry Pi Pico

2x Cortex-M0+ CPU

-

-

133 MHz - 264 KB

€ 5

Just some I/O, an RP2040 MCU and 2 MB of flash. Can it be used for deep learning? Barely. However, TensorFlow TinyML has some examples here.


Raspberry Pi Zero 2 W

4x Cortex-A53 CPU

VideoCore IV GPU

24 GFLOPS

1.0 GHz - 512 MB

€ 15

The Raspberry Pi 3B+ on the RPi Zero footprint. Ideal for low cost, small deep learning applications in tiny housing.


Raspberry Pi 3 B+

4x Cortex-A53 CPU

VideoCore IV GPU

24 GFLOPS

1.2 GHz - 1 GB

€ 40

Parent of all boards. Still one of the most sold. Lots of code and support available.


Raspberry Pi 4 B

4x Cortex-A72 CPU

VideoCore VI GPU

32 GFLOPS

1.5 GHz - 1/2/4/8 GB


€ 50/€ 50/€ 60/€ 90

The successor to the Raspberry Pi 3 with a slightly faster processor, USB 3.0 and Gigabit Ethernet.


Raspberry Pi 5

4x Cortex-A76 CPU

VideoCore VII GPU

52 GFLOPS

2.4 GHz - 4/8 GB

€ 70/€ 93

The successor to the Raspberry Pi 4 with a (much) faster processor, two camera ports, PCIe 2.0, USB 3.0 and Gigabit Ethernet.


Jetson Nano B01

4x Cortex-A57 CPU

128x CUDA

1.88 TOPS

1.43 GHz - 4 GB


€ 216

Identical to the Jetson Nano A02 board, except it has two camera ports, which makes it ready for binocular applications like stereo recording, depth sensing, 3D object tracking and image stitching. More NVIDIA boards.


Jetson Orin Nano 4GB

6x Cortex-A78 CPU

512x CUDA

20 TOPS

1.43 GHz - 4/8 GB


€ 242 / € 450

The successor to the Jetson Nano, except this board has a lot more AI power. If you want to start deep learning at the edge, here's your board. Note that you also need a carrier board, which makes a total of € 450,= for a development kit. More NVIDIA boards.


Radxa Zero 3W

4x Cortex-A55

Mali-G52 GPU

0.6 TOPS NPU

1.4 GHz - 1/2/4/8 GB


€ 16/€ 21/€ 30/€ 44

With the form factor of the Raspberry Pi Zero, this little board beats all its competitors. The RK3566 has an NPU for deep learning acceleration.

For € 21,= you get a board identical to the Raspberry Pi 4, with an additional 0.6 TOPS NPU.

Note: the NPU needs 2 GB or more RAM. Due to the poor support, drivers and software may be a problem.


Rock 3C

4x Cortex-A55

Mali-G52 GPU

1 TOPS NPU

1.6 GHz - 1/2 GB


€ 40/€ 50

We mention the Rock 3C only because it is the cheapest fully stacked board with an RK3566.

For € 40,= you get a board almost identical to the Raspberry Pi 4, with an additional 1 TOPS NPU. As usual, drivers and software are the bottleneck.


Rock 5A

4x Cortex-A76 + 4x Cortex-A55 CPU

Mali-G610 MP4 GPU

6 TOPS NPU

2.5 GHz - 4/8/16 GB


€ 100/€ 120/€ 160

The Rock 5A is targeted as the next-generation Raspberry Pi 4. Built with the Rockchip RK3588, it gives you better and faster CPUs. The NPU (neural processing unit) supports INT4/INT8/INT16/FP16 mixed operations. It runs Android, Debian, Ubuntu, etc. Due to the weak support, software drivers can be a problem.


Rock 5B

4x Cortex-A76 + 4x Cortex-A55 CPU

Mali-G610 MP4 GPU

6 TOPS NPU

2.5 GHz - 4 GB


€ 150

A slightly larger board than the Rock 5A, with the same Rockchip RK3588. More I/O is available compared to the Rock 5A. Thanks to the metal housing, the cooling of the SoC is no problem.


Orange Pi 5

4x Cortex-A76 + 4x Cortex-A55 CPU

Mali-G610 MP4 GPU

6 TOPS NPU

2.5 GHz - 4/8/16 GB


€ 77/€ 100/€ 127

Almost identical to the Rock 5 board, but with the Rockchip RK3588S. The price is slightly lower compared to the Rock 5. It runs Android, Debian, Ubuntu, etc. Due to the weak support, software drivers can be a problem. Don't forget to cool your RK3588.


Google Coral

4x Cortex-A53 + 1x Cortex M4 CPU

GC7000 Lite 3D GPU

4.0 TOPS NPU

1.5 + 1.0 GHz - 1/4 GB



€ 125

Raspberry-inspired board with the Edge TPU accelerator. Note the limited RAM (1 GB), while deep learning is memory hungry.


Google Coral Mini

4x Cortex-A35 CPU

IMG PowerVR GE8300 GPU

4.0 TOPS NPU

1.3 GHz - 2 GB



€ 90

A smaller, simpler and cheaper board. Less CPU performance gives lower power consumption. Usual I/O and the original Edge TPU accelerator.


Google Coral Micro

Cortex-M7 + Cortex-M4

4.0 TOPS NPU

1.0 GHz - 512 MB



€ 85

A simple microcontroller with the original Edge TPU accelerator. Onboard is a 324x324 color camera. WiFi and Ethernet are separate add-on boards. With only 512 MB of RAM, it's like a Ferrari with a 5-gallon gas tank. It's best to run only tiny quantized (int8) TensorFlow Lite models.


Khadas VIM3

4x Cortex-A73 + 2x Cortex-A53 CPU

ARM G52 MP4 GPU

5 TOPS NPU

2.2 GHz - 2/4 GB



€ 110

Superior Raspberry replacement with an 8 and 16-bit neural network processing unit.

Update Oct 2022: Sadly, at the moment, there's only one outdated framework (2020) available for the NPU, which isn't capable of running modern Yolo models.


OAK-1

Myriad X

16 SHAVE cores

4 TOPS



€ 85

OpenCV AI Kit, with an integrated Sony 12 MPixel IMX378 camera and a Myriad X VPU. Suitable for most vision tasks, such as simple deep learning.


OAK-D

Myriad X

16 SHAVE cores

4 TOPS



€ 125

OpenCV AI Kit with an integrated Sony 12 MPixel IMX378 camera and a Myriad X VPU. Compared to the OAK-1 it has two additional OV9282 cameras with global shutter, making it ready for depth sensing, 3D object tracking and image stitching.


OAK-D-lite

Myriad X

16 SHAVE cores

4 TOPS



€ 170

The OAK-D, but now in a beautiful housing. The Myriad X VPU is still used as a workhorse. Only the cameras have been upgraded to the Sony 13 MPixel IMX214 and two OV7251 for depth measurement.


Intel Neural Stick 2

Intel Movidius Myriad X

16 SHAVE cores

1 TOPS


€ 81

Special Intel neural network USB 3 dongle for PC and single boards like Raspberry Pi. Accelerates tensor arithmetic enormously. Fully supported by OpenCV.


Google Coral USB

Edge TPU

4.0 TOPS



€ 82

Only the bare Google Coral Edge TPU with a USB 3.0 interface. Capable of the same as the Coral board.


Orange Pi AI Stick Lite

Lightspeeur NPU

2.8 TOPS



€ 22

The Lightspeeur neural processing unit in a USB 3 dongle. It supports various deep learning models, such as VGG and SSD, via an Orange Pi converter tool.


RK1808 NPU

Rockchip AI core

3.0 TOPS



€ 78 (Sold out)

The Rockchip RK1808 neural processing unit in a USB 3 dongle. It also has 1 GB RAM and 8 GB eMMC storage on board.


Sipeed Maix Go

2x RISC-V 64-bit CPU

0.5 TOPS

800 MHz - 8 MB



€ 35 (Sold out)

A very cute board with a camera, microphone, speaker, I/O, USB and, on top, an NPU accelerator. Not an RPi or a Nano, but still perfect for simple deep learning tasks. It works with MicroPython. Most interesting is the low power consumption of 300 mW.


JeVois

4x Cortex-A7 CPU

2x Mali-400 GPU

10 GFLOPS

1.35 GHz - 256 MB

€ 70

A complete 32-bit single-board computer with an integrated 1.3 MP camera. Thanks to the GPU, it is capable of deep learning and other machine vision tasks. It's also very small (32x40 mm). Read more here.


Sophon BM1880

2x Cortex-A53 + RISCV CPU

-

1.0 TOPS AI Core

1.5 + 1.0 GHz - 1 GB


€ 116


8-bit neural network processing unit

Google SOM

4x Cortex-A53 + 1x Cortex M4 CPU

GC7000 Lite 3D GPU

4.0 TOPS NPU

1.5 + 1.0 GHz - 1 GB



€ 90

Single tiny (40x48 mm) pluggable module with full I/O and the Edge TPU accelerator.

Deep learning algorithms for Raspberry Pi

Deep learning software for Raspberry Pi


