Introduction
I've been having some intrusive thoughts: while async embedded Rust is great, it could also be better and more transparent, and best practices should be documented.
This book serves two main purposes:
- To demystify some parts of the current embedded Rust ecosystem and provide example solutions to some pain points that exist today.
- To serve as a notebook for my ideas. Note that these are just ideas, not a definitive source of truth. These ideas may be presented in a very raw form and important parts may be missing.
My intrusive thoughts revolve around the following ideas (in no particular order):
- Tooling improvements - making common tasks easy: measuring binary size and BSS usage, inspecting crashes, inspecting logs.
- Explanation of async on embedded by developing a simple async executor.
- Exploration of intrusive linked lists as an alternative to static or fixed-size allocation.
- Tracing for embedded async.
- Standardization of reading and writing of firmware metadata.
- Developing best practices for panic/hardfault handling and post-mortem debugging.
- Developing a limited example RP2350 HAL with primitives for more low level DMA and drivers (something like lilos's Notify).
Some of the aforementioned rough edges IMHO are:
- It is unclear how to do some common things (e.g. static mut handling, especially in the context of the 2024 edition changes).
- Writing hardware-independent/HAL-independent drivers requires a lot of "infectious" generics.
- HALs lock users into specific ways of using peripherals, because it is often impractical to implement every feature of the peripheral IP. As a result, doing highly specialized things is hard - one example is abusing double-buffered DMA to read from the DCMI peripheral on STM32, allowing DMA reads consisting of more than 65535 transfers.
- Debugging why things don't work (for example, even before defmt is available) is not well documented.
liltcp
liltcp is a demo project concerned with developing a basic glue library for connecting together smoltcp, a HAL, and an async runtime.
The name is sort of a pun on both smoltcp and Cliff L. Biffle's lilos,
because both of these are used as a basis for the glue.
The goal of the project is to produce a working yet very basic alternative to embassy-net, thereby documenting how it works and how to use smoltcp's async capabilities.
To avoid a dependency on embassy itself, stm32h7xx-hal is used as the HAL.
The demo project is developed for the STM32H743ZI Nucleo devkit, but it should work with any other H7 board, provided the pin mappings are adjusted.
Getting started
Before diving into developing the networking code,
let's first make an LED blinking smoke test.
This is just to make sure that the environment is set up correctly
and that nothing is broken (devkit, cables, etc.).
The smoke test also makes sure that we have lilos working together with the HAL.
The code below implements such a smoke test.
#![no_main]
#![no_std]

use liltcp as _;

#[cortex_m_rt::entry]
fn main() -> ! {
    let mut cp = cortex_m::Peripherals::take().unwrap();
    let dp = stm32h7xx_hal::pac::Peripherals::take().unwrap();

    let ccdr = liltcp::initialize_clock(dp.PWR, dp.RCC, &dp.SYSCFG);

    let gpio = liltcp::init_gpio(
        dp.GPIOA,
        ccdr.peripheral.GPIOA,
        dp.GPIOB,
        ccdr.peripheral.GPIOB,
        dp.GPIOC,
        ccdr.peripheral.GPIOC,
        dp.GPIOE,
        ccdr.peripheral.GPIOE,
        dp.GPIOG,
        ccdr.peripheral.GPIOG,
    );

    lilos::time::initialize_sys_tick(&mut cp.SYST, ccdr.clocks.sysclk().to_Hz());

    lilos::exec::run_tasks(
        &mut [core::pin::pin!(liltcp::led_task(gpio.led))],
        lilos::exec::ALL_TASKS,
    )
}
First, it initializes the clocks, then the GPIOs. These are initialized with helper functions created to make code sharing easier, so they include more code than strictly necessary. Next, we initialize the SysTick timer and spawn an LED blinking task.
The LED blinking task itself is pretty bare:
pub async fn led_task(mut led: ErasedPin<Output>) -> Infallible {
    let mut gate = PeriodicGate::from(lilos::time::Millis(500));

    loop {
        led.toggle();
        gate.next_time().await;
    }
}
If everything went well you should see a blinking LED (amber on the Nucleo devkit). We can now move to initializing the Ethernet peripheral to do some basic link state polling.
Initializing and polling the Ethernet peripheral
At this point, we know that the devkit is able to run our code, but it doesn't yet do anything network related, so let's change that.
First, we need to initialize the Ethernet peripheral driver from the HAL.
let (_eth_dma, eth_mac) = ethernet::new(
    dp.ETHERNET_MAC,
    dp.ETHERNET_MTL,
    dp.ETHERNET_DMA,
    gpio.eth_pins,
    unsafe { liltcp::take_des_ring() },
    liltcp::MAC,
    ccdr.peripheral.ETH1MAC,
    &ccdr.clocks,
);

let mut lan8742a = ethernet::phy::LAN8742A::new(eth_mac.set_phy_addr(0));
lan8742a.phy_reset();
lan8742a.phy_init();
The initialization itself is pretty bare; the only remotely interesting part is the initialization of the PHY at address 0.
The Ethernet peripheral internally sets up DMA for receiving and transmitting data and lets the user know that something happened via an interrupt handler.
#[cortex_m_rt::interrupt]
fn ETH() {
    unsafe {
        ethernet::interrupt_handler();
    }
}
The interrupt must also be enabled in the NVIC, which is done using the following function, called just before lilos spawns the tasks.
pub unsafe fn enable_eth_interrupt(nvic: &mut pac::NVIC) {
    ethernet::enable_interrupt();
    nvic.set_priority(stm32h7xx_hal::stm32::Interrupt::ETH, NVIC_BASEPRI - 1);
    cortex_m::peripheral::NVIC::unmask(stm32h7xx_hal::stm32::Interrupt::ETH);
}
Once this is done, the peripheral is ready to send and receive data. That, however, is a topic for the next chapter. For now, we only want to check whether the link is up, which is done by polling the PHY. Let's add a new async task that periodically polls the PHY and prints the link state on change. To also see the link state on the devkit itself, we turn the LED on when the link is UP.
// Periodically poll if the link is up or down
async fn poll_link<MAC: StationManagement>(
    mut phy: LAN8742A<MAC>,
    mut link_led: ErasedPin<Output>,
) -> Infallible {
    let mut gate = PeriodicGate::from(Millis(1000));

    let mut eth_up = false;
    loop {
        gate.next_time().await;

        let eth_last = eth_up;
        eth_up = phy.poll_link();

        link_led.set_state(eth_up.into());

        if eth_up != eth_last {
            if eth_up {
                defmt::info!("UP");
            } else {
                defmt::info!("DOWN");
            }
        }
    }
}
The final thing left to do is to spawn the task and run the binary on our devkit.
unsafe {
    liltcp::enable_eth_interrupt(&mut cp.NVIC);

    lilos::exec::run_tasks_with_preemption(
        &mut [
            core::pin::pin!(liltcp::led_task(gpio.led)),
            core::pin::pin!(poll_link(lan8742a, gpio.link_led)),
        ],
        lilos::exec::ALL_TASKS,
        Interrupts::Filtered(liltcp::NVIC_BASEPRI),
    );
}
When you plug in an Ethernet cable, there should be a log message visible in the terminal and an LED should light up.
We are now ready to move on to actually receiving and transmitting data via the Ethernet.
Polled TCP
When developing a classic embedded Rust application that uses smoltcp for networking (either using RTIC or no executor at all), a common approach is to handle networking as part of the Ethernet interrupt. This has a few problems:
- Dependencies to the interrupt have to be declared as global statics.
- The IRQ must never block.
- It is harder to add other sources that force the stack to be polled.
- It is up to the developer to handle the state machine properly. (This will be solved in the next chapter with async.)
Let's try to solve the first two problems by adding a simple async task, which will periodically poll the smoltcp interface and handle a TCP client.
For reference, an RTIC-based example can be found here.
Configuring the IP address
At this point, we will be using the network layer, so the first thing we need to do is to configure an IP address for our smoltcp interface.
let config = smoltcp::iface::Config::new(liltcp::MAC.into());
let mut interface = Interface::new(config, &mut eth_dma, liltcp::smoltcp_lilos::smol_now());
interface.update_ip_addrs(|addrs| {
let _ = addrs.push(IpCidr::new(
liltcp::IP_ADDR.into_address(),
liltcp::PREFIX_LEN,
));
});
let mut storage = [SocketStorage::EMPTY; 1];
let mut sockets = SocketSet::new(&mut storage[..]);
The IP address and PREFIX_LEN are defined in lib.rs as follows:
pub const IP_ADDR: Ipv4Address = Ipv4Address::new(10, 106, 0, 251);
pub const PREFIX_LEN: u8 = 24;
In theory, it should be possible to initialize the whole CIDR address in a single constant, but the patch has only landed recently and is not released yet.
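For reference, here is a sketch of what that could look like once released (this assumes the Ipv4Cidr constructor becomes usable in const context, which is not the case in the version used here):
use smoltcp::wire::Ipv4Cidr;

// Hypothetical: requires the not-yet-released const constructor.
pub const IP_CIDR: Ipv4Cidr = Ipv4Cidr::new(IP_ADDR, PREFIX_LEN);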
Another thing included in the configuration snippet above is the allocation of a SocketStorage and a SocketSet, which is smoltcp's way of storing active sockets. In this case, we will add only one socket, so the storage array length will be 1.
Network task
Now that the preparations are out of the way, we can define our net_task. This task will handle both polling the stack and handling TCP (albeit in a simplified form).
async fn net_task(
mut interface: Interface,
mut dev: ethernet::EthernetDMA<4, 4>,
sockets: &mut SocketSet<'_>,
mut phy: LAN8742A<impl StationManagement>,
mut link_led: ErasedPin<Output>,
) -> Infallible {
static mut RX: [u8; 1024] = [0u8; 1024];
static mut TX: [u8; 1024] = [0u8; 1024];
let rx_buffer = unsafe { RingBuffer::new(&mut RX[..]) };
let tx_buffer = unsafe { RingBuffer::new(&mut TX[..]) };
let client = smoltcp::socket::tcp::Socket::new(rx_buffer, tx_buffer);
let handle = sockets.add(client);
let mut eth_up = false;
loop {
'worker: {
let eth_last = eth_up;
eth_up = phy.poll_link();
link_led.set_state(eth_up.into());
if eth_up != eth_last {
if eth_up {
defmt::info!("UP");
} else {
defmt::info!("DOWN");
}
}
if !eth_up {
break 'worker;
}
let ready = interface.poll(liltcp::smoltcp_lilos::smol_now(), &mut dev, sockets);
if !ready {
break 'worker;
}
let socket = sockets.get_mut::<smoltcp::socket::tcp::Socket>(handle);
if !socket.is_open() {
defmt::info!("not open, issuing connect");
defmt::unwrap!(socket.connect(
interface.context(),
liltcp::REMOTE_ENDPOINT,
liltcp::LOCAL_ENDPOINT,
));
break 'worker;
}
let mut buffer = [0u8; 10];
if socket.can_recv() {
let len = defmt::unwrap!(socket.recv_slice(&mut buffer));
defmt::info!("recvd: {} bytes {}", len, buffer[..len]);
}
if socket.can_send() {
defmt::unwrap!(socket.send_slice(b"world"));
}
}
// NOTE: Not performant, doesn't handle interrupt signal, cancel the wait on IRQ, etc.
// NOTE: In async code, this will be replaced with a more elaborate calling of poll_at.
lilos::time::sleep_for(lilos::time::Millis(1)).await;
}
}
First, we define buffers that the TCP socket will use internally.
These are defined as mutable statics, because they need to have the same
lifetime as, or outlive, the 'a lifetime defined for the SocketSet.
Next, we create a TCP socket and add it to our SocketSet.
This call gives us a handle that can later be used to access the socket through
the SocketSet.
Now the polling itself takes place.
This is done in a loop, inside a labeled block called 'worker.
First, we check that the link is UP; if it is not, we just break out of the 'worker block.
If the link is UP, we poll the interface to check whether there are any new data
to be processed by our socket.
When there are, we can access our socket using the aforementioned handle and operate on it.
In this case, we check whether it is open; if it is not, we attempt to connect to a
remote endpoint and break out of the 'worker block to let the interface be polled again.
On subsequent polls, if the socket is open, we attempt to do a read and subsequently
a write.
When the 'worker block completes, or is interrupted by a break 'worker,
the task sleeps for a millisecond.
Another big problem here is performance: the polling loop runs with a fixed period of 1 ms.
Spawning the network task
Now we can simply spawn our task and let it do the polling and TCP handling.
lilos::exec::run_tasks_with_preemption(
&mut [
core::pin::pin!(liltcp::led_task(gpio.led)),
core::pin::pin!(net_task(
interface,
eth_dma,
&mut sockets,
lan8742a,
gpio.link_led
)),
],
lilos::exec::ALL_TASKS,
Interrupts::Filtered(liltcp::NVIC_BASEPRI),
);
Conclusions
This solution is probably good enough for simple tests, but apart from not being async, it has one big problem - the TCP handling will quickly become a hassle with every addition.
This is caused by these factors:
- It is tightly coupled with smoltcp stack polls.
- Adding more sockets will clutter the code even more.
- Adding any kind of timeout would either block the entire task, or you'd need to implement some sort of state machine to handle it - but that is exactly what we want async for.
Let's now have a quick intermezzo about decoupling polling from socket handling by sharing the smoltcp stack across tasks.
Intermezzo - sharing smoltcp stack between tasks
Sharing data between tasks usually depends on the executor and the rest of the environment.
For example, in embassy, sharing can be done with references with a 'static
lifetime, since tasks are allocated in statics.
In a std environment, you'd typically use something like an Arc.
In our environment (the lilos executor), tasks are allocated on the stack.
This means that for sharing data we don't need references with a 'static
lifetime; a generic lifetime is enough.
This is important, as we don't have to deal with either static muts or
with initializing statics from local data.
A simple example of this can be seen in the following snippet.
fn main() -> ! {
    let shared_resource = 0;

    lilos::exec::run_tasks(
        &mut [
            pin!(task_a(&shared_resource)),
            pin!(task_b(&shared_resource)),
        ],
        lilos::exec::ALL_TASKS,
    )
}

async fn task_a(res: &i32) -> Infallible { .. }
async fn task_b(res: &i32) -> Infallible { .. }
Mutating the shared resources
This basically solves the problem of sharing data between tasks,
but one problem still remains - how can we mutate the shared data?
We can't have multiple mutable references at the same time, so we need to
utilize some kind of interior mutability pattern.
This is usually done with the Cell or RefCell types.
Cell is not very useful for our use case, since it provides mutability only by
moving values in and out of it.
RefCell is much more interesting, because it allows us to obtain mutable and
immutable references to our data.
Without going into much detail, RefCell basically implements the borrow checker
and its rules at runtime instead of compile time.
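As a quick standalone illustration of these runtime rules (a minimal sketch, not part of the project):
use core::cell::RefCell;

fn main() {
    let shared = RefCell::new(0i32);

    {
        let mut r = shared.borrow_mut(); // exclusive borrow taken at runtime
        *r += 1;
    } // borrow released here

    let _read = shared.borrow(); // fine: the previous borrow is gone
    // shared.borrow_mut();      // would panic while `_read` is alive
}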
When we wrap our shared resource in a RefCell, our example code will look
like the following snippet.
fn main() -> ! {
    let shared_resource = RefCell::new(0i32);

    lilos::exec::run_tasks(
        &mut [
            pin!(task_a(&shared_resource)),
            pin!(task_b(&shared_resource)),
        ],
        lilos::exec::ALL_TASKS,
    )
}

async fn task_a(res: &RefCell<i32>) -> Infallible { .. }
async fn task_b(res: &RefCell<i32>) -> Infallible { .. }
Now, when we want to access some data in a task we can do:
async fn task_a(res: &RefCell<i32>) -> Infallible {
    loop {
        {
            let mut r = res.borrow_mut();
            *r += 1;
        }
        yield_cpu().await;
    }
}
Notice that the shared resource access is done in a block.
That is to ensure that r, which is actually a "smart" pointer
to the underlying data, is dropped before we yield control to the executor.
If it weren't dropped before the yield (or any await point, in fact),
the code would panic upon obtaining another mutable borrow from the RefCell.
Hiding the implementation details and providing a nice API
This code works quite well until there are more shared resources, or until the need arises to implement methods on the shared resource. Ideally, we'd like to wrap the shared state in a structure and not expose the implementation details of the shared reference and interior mutability.
The approach I have chosen for this is to create a wrapper around the shared reference.
Until we add more fields to the wrapper, it will be trivially copyable,
meaning it can be passed into as many tasks as required, and with it we can
build a nice API that hides the aforementioned implementation detail.
This pattern is widely used; embassy-net, which this tutorial is based on,
also uses it.
Let's implement it:
We'll define our shared state as an InnerStack struct.
pub struct InnerStack {
// stack fields
}
Now, let's create a wrapper struct that we'll implement our API on.
pub struct Stack<'a> {
pub inner: &'a RefCell<InnerStack>,
}
We want to avoid handling the RefCell in every function call, so let's create an accessor function.
impl<'a> Stack<'a> {
pub fn with<F, U>(&mut self, f: F) -> U
where
F: FnOnce(&mut InnerStack) -> U,
{
f(&mut self.inner.borrow_mut())
}
}
Now we can implement methods on the Stack that look like this:
impl<'a> Stack<'a> {
pub fn poll(&mut self) -> bool {
self.with(|stack| stack.poll())
}
}
This is much more readable, hides the RefCell, and most importantly limits
the scope of the RefCell borrows.
Sharing a smoltcp stack
This implementation works for the simpler cases, but there is a problem with
smoltcp: for some calls, you need mutable references to two fields of
the InnerStack at once - the SocketSet and the Interface.
This seems simple at first, but it is a bit involved, as it goes against the borrow checker's rules on mutable borrows. Trying it out is left as an exercise for the reader.
The solution to this is to use the RefMut::map_split function to effectively split one RefMut into two RefMuts.
Combining all of the above and modifying it to fit the needs of a smoltcp wrapper, we get the following code.
use core::cell::{RefCell, RefMut};
use smoltcp::iface::{Interface, SocketSet, SocketStorage};
pub struct InnerStack<'a> {
sockets: SocketSet<'a>,
interface: Interface,
}
impl<'a> InnerStack<'a> {
pub fn new(storage: &'a mut [SocketStorage<'a>], interface: Interface) -> Self {
Self {
sockets: SocketSet::new(storage),
interface,
}
}
}
#[derive(Clone, Copy)]
pub struct Stack<'a> {
inner: &'a RefCell<InnerStack<'a>>,
}
impl<'a> Stack<'a> {
pub fn new(inner: &'a RefCell<InnerStack<'a>>) -> Self {
Self { inner }
}
pub fn with<F, U>(&mut self, f: F) -> U
where
F: FnOnce((&mut SocketSet<'a>, &mut Interface)) -> U,
{
let (mut interface, mut sockets) = RefMut::map_split(self.inner.borrow_mut(), |r| {
(&mut r.interface, &mut r.sockets)
});
f((&mut sockets, &mut interface))
}
}
Cleaning up the API
The code now implements everything we need from it, but it still has a problem:
we are leaking the existence of the RefCell to the creator of the stack,
which in turn requires us to make InnerStack public.
A possible solution to this is the following:
use core::{cell::RefCell, mem::MaybeUninit};
pub struct StackResources {
inner: MaybeUninit<RefCell<InnerStack>>,
}
struct InnerStack {
resource_a: i32,
}
struct Stack<'a> {
inner: &'a RefCell<InnerStack>,
}
impl<'a> Stack<'a> {
fn new(resources: &'a mut StackResources) -> Self {
let inner = resources
.inner
.write(RefCell::new(InnerStack { resource_a: 42 }));
Self { inner }
}
}
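A usage sketch for the distilled version above (embassy-net provides a proper constructor such as StackResources::new(); here the resources are constructed literally to stay within the snippet):
fn main() {
    let mut resources = StackResources {
        inner: MaybeUninit::uninit(),
    };
    // The caller only allocates StackResources; the RefCell and InnerStack
    // never appear in its code.
    let _stack = Stack::new(&mut resources);
}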
The code above is a heavily distilled version of how embassy-net does this.
You can find the original solution here.
Having this out of the way, we can now finally go and implement an asynchronous TCP socket.
Fully asynchronous TCP client
In the previous chapter, we managed to share a wrapper around smoltcp between tasks.
That means we are now ready to separate polling the stack from handling sockets.
Polling the stack
Let's start by implementing the stack polling. There are two signals that should trigger polling:
- The Ethernet interrupt
- smoltcp's internal timers
As for signaling from the Ethernet interrupt, we can use lilos's Notify synchronization primitive.
static IRQ_NOTIFY: lilos::exec::Notify = lilos::exec::Notify::new();
We must declare it as a static so that it can be accessed from the interrupt handler.
Luckily, it has a const new() function, so nothing special needs to be done
to initialize it.
Now, whenever the interrupt handler is called, we can notify that something happened.
#[cortex_m_rt::interrupt]
fn ETH() {
unsafe {
ethernet::interrupt_handler();
}
// NOTE: embassy_net wakes polling task any time RX or TX tokens are consumed, resulting in 3x
// throughput
IRQ_NOTIFY.notify();
}
We can wait for the signal in our polling task using the Notify::until_next method.
Now, let's go back to the polling signaled by the smoltcp internal timers.
smoltcp's Interface contains a mechanism for letting the polling code know
when, or after how much time, it should be polled next.
For delaying the poll, we can use the lilos::time::sleep_for async function.
So we now have two futures that we need to combine, and whenever one of them
completes, we can poll the interface.
For this we can use the select(A, B) async function from embassy-futures,
which does exactly what we need: it takes two futures and returns whenever one of them resolves.
The whole polling task is in the following snippet.
async fn net_task(
mut stack: Stack<'_>,
mut dev: ethernet::EthernetDMA<4, 4>,
mut phy: LAN8742A<impl StationManagement>,
mut link_led: ErasedPin<Output>,
) -> Infallible {
let mut eth_up = false;
loop {
let poll_delay = stack.with(|(sockets, interface)| {
interface
.poll_delay(smol_now(), sockets)
.unwrap_or(Duration::from_millis(1))
});
match embassy_futures::select::select(
lilos::time::sleep_for(lilos::time::Millis(poll_delay.millis())),
IRQ_NOTIFY.until_next(),
)
.await
{
select::Either::First(_) => {}
select::Either::Second(_) => {}
}
let eth_last = eth_up;
eth_up = phy.poll_link();
link_led.set_state(eth_up.into());
if eth_up != eth_last {
if eth_up {
defmt::info!("UP");
} else {
defmt::info!("DOWN");
}
}
if !eth_up {
continue;
}
stack.with(|(sockets, interface)| interface.poll(smol_now(), &mut dev, sockets));
}
}
Apart from just polling, it also handles the link state.
Adding a TCP client socket
With polling out of the way, we can now focus on adding a task that will handle a TCP connection. What we want is to connect to a TCP server and loop back the data the server sends us. This time, let's take a top-down approach and write the body of the task first, without worrying about the implementation.
async fn tcp_client_task(stack: Stack<'_>) -> Infallible {
static mut TX: [u8; 1024] = [0u8; 1024];
static mut RX: [u8; 1024] = [0u8; 1024];
let mut client = TcpClient::new(stack, unsafe { &mut RX[..] }, unsafe { &mut TX[..] });
client
.connect(liltcp::REMOTE_ENDPOINT, liltcp::LOCAL_ENDPOINT)
.await
.unwrap();
defmt::info!("Connected.");
// loopback
loop {
let mut buffer = [0u8; 5];
let len = defmt::unwrap!(client.recv(&mut buffer).await);
// Let's not care about the number of sent bytes,
// with the current buffer settings, it should always write full buffer.
defmt::unwrap!(client.send(&buffer[..len]).await);
}
}
We can see that we first initialize the transmit and receive buffers.
Then we create a new socket on our stack and pass it the buffers.
The unsafe here is unavoidable without a lot of extra code, because static muts are
inherently unsafe and will not even be possible in the future.
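As an aside, one common workaround (a sketch with a hypothetical helper, not the project's actual code) is to hand out the buffer through a raw pointer to the static rather than a direct reference, which sidesteps the static_mut_refs lint while keeping the same call-it-only-once safety contract:
static mut RX: [u8; 1024] = [0u8; 1024];

/// SAFETY: must be called at most once, otherwise two exclusive references
/// to the same buffer would exist.
unsafe fn take_rx() -> &'static mut [u8] {
    unsafe { &mut (*core::ptr::addr_of_mut!(RX))[..] }
}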
Socket definition and initialization
Let's have a look at the socket definition and initialization.
pub struct TcpClient<'a> {
pub stack: Stack<'a>,
pub handle: SocketHandle,
}
Here, the TcpClient struct contains the wrapper around our Stack and a
handle pointing into the Stack's SocketSet.
pub fn new(mut stack: Stack<'a>, rx_buffer: &'a mut [u8], tx_buffer: &'a mut [u8]) -> Self {
let rx_buffer = RingBuffer::new(rx_buffer);
let tx_buffer = RingBuffer::new(tx_buffer);
let socket = smoltcp::socket::tcp::Socket::new(rx_buffer, tx_buffer);
let handle = stack.with(|(sockets, _interface)| sockets.add(socket));
Self { stack, handle }
}
What happens here is that the raw buffers are wrapped into smoltcp's ring buffers.
Then a new socket is initialized with them and added to the Stack's SocketSet.
The SocketSet::add call returns a SocketHandle, which we can later use
to access the socket.
Accessing the socket
The TcpClient is basically a wrapper around the Stack plus a SocketHandle,
together forming a "wrapper" around smoltcp::socket::tcp::Socket,
which can be accessed indirectly through these two values.
That means that whenever we want to do something with the raw TCP socket, we need to obtain a reference to it via the handle.
To do this, we can use a similar pattern as in the previous chapter with
the Stack.
fn with<F, U>(&mut self, f: F) -> U
where
F: FnOnce(&mut tcp::Socket, &mut Context) -> U,
{
self.stack.with(|(sockets, interface)| {
let socket = sockets.get_mut(self.handle);
f(socket, interface.context())
})
}
This way, when doing anything with the socket, we don't need to write
the boilerplate needed to access it via the Stack and SocketHandle combo.
Connecting
Let's now connect to the server.
This will be the first async function utilizing smoltcp's async support.
pub async fn connect(
&mut self,
remote_endpoint: impl Into<IpEndpoint>,
local_endpoint: impl Into<IpListenEndpoint>,
) -> Result<(), ConnectError> {
self.with(|socket, context| socket.connect(context, remote_endpoint, local_endpoint))?;
poll_fn(|cx| {
self.with(|socket, _context| {
// shamelessly copied from embassy
match socket.state() {
tcp::State::Closed | tcp::State::TimeWait => {
Poll::Ready(Err(ConnectError::InvalidState))
}
tcp::State::Listen => unreachable!(), // marks invalid state
tcp::State::SynSent | tcp::State::SynReceived => {
socket.register_send_waker(cx.waker());
socket.register_recv_waker(cx.waker());
Poll::Pending
}
_ => Poll::Ready(Ok(())),
}
})
})
.await
}
Here, we first initiate the connection process and then create a future using poll_fn.
poll_fn creates a future that, upon being polled, calls a closure returning core::task::Poll.
The closure also has access to the future's Context, meaning we can register its Waker with the socket.
That means that after the connection process is initiated, the closure is
called once, and then again whenever it is woken by smoltcp.
In the body of the closure, the state of the socket is checked for possible failure
or success.
If there is nothing to be done yet, the closure registers its waker with
the socket (this is done every time, because some executors may change
the waker over time).
This is the working principle of all the async smoltcp glue code.
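To make the principle concrete, here is a minimal, self-contained sketch of the poll_fn pattern (hypothetical code, not part of liltcp): the closure runs on every poll and either completes or arranges a wake-up and yields.
use core::future::poll_fn;
use core::task::Poll;

// Hypothetical example: completes on the second poll. A real socket future
// would hand cx.waker() to smoltcp instead of waking itself immediately.
async fn ready_on_second_poll() {
    let mut polled_once = false;
    poll_fn(|cx| {
        if polled_once {
            Poll::Ready(())
        } else {
            polled_once = true;
            // smoltcp would store this waker and call it on a socket state
            // change; waking ourselves here just forces another poll.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    })
    .await
}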
Sending data
Sending data utilizes the same working principle as connecting. When polled, it attempts to write as much data to the socket buffers as possible and postpones its execution if the buffers are full.
pub async fn send(&mut self, buf: &[u8]) -> Result<usize, SendError> {
poll_fn(|cx| {
self.with(|socket, _context| match socket.send_slice(buf) {
Ok(0) => {
socket.register_send_waker(cx.waker());
Poll::Pending
}
Ok(n) => Poll::Ready(Ok(n)),
Err(e) => Poll::Ready(Err(e)),
})
})
.await
}
Receiving data
Receiving data is similar to sending data. When polled, it attempts to read some bytes, and when no data is available, it waits for the next poll.
pub async fn recv(&mut self, buf: &mut [u8]) -> Result<usize, RecvError> {
poll_fn(|cx| {
self.with(|socket, _context| match socket.recv_slice(buf) {
// return 0 doesn't mean EOF when buf is empty
Ok(0) if buf.is_empty() => Poll::Ready(Ok(0)),
Ok(0) => {
socket.register_recv_waker(cx.waker());
Poll::Pending
}
Ok(n) => Poll::Ready(Ok(n)),
// EOF
Err(RecvError::Finished) => Poll::Ready(Ok(0)),
Err(RecvError::InvalidState) => Poll::Ready(Err(RecvError::InvalidState)),
})
})
.await
}
Conclusion
And that is all there is to it. We now have a working async networking stack with quite a nice API.
The TCP socket is by no means complete, but adding more functionality to it should not be much of a problem.
Conclusion
The goal of this tutorial was to explore one way to implement an
asynchronous networking stack and to show how embassy-net works under the hood.
Huge kudos to @dirbaio for all the work he did to make this possible.
The tutorial went from strictly blocking code all the way to a fully asynchronous TCP
client socket.
I did some measurements of its throughput: the maximum on the Nucleo devkit
was around 8 Mbit/s, while embassy-net achieves 24 Mbit/s.
The difference is likely because embassy-net polls the stack each time
a buffer is dispatched through the peripheral.
Adding support for this would require significant changes to
the stm32h7xx-hal crate.
The whole source code for this tutorial is available in the intrusive-thoughts repo. Don't hesitate to open issues or pull requests with improvements.
It should be possible to make these wrappers HAL-agnostic and have an async stack that can be shared across many HALs, but that is out of scope for this tutorial.
Sharing resources in no-std environments
Work in Progress
This article describes ways to share some resources across multiple tasks.
// hides interior mutability implementation detail (creation of the inner state), infects code with
// references
// Can't be copy
// PROs
// - hides interior mutability primitive (is this desired though? - embassy-mutex flexibility)
// CONs
// - can't have &mut self receiver
mod reference_outside {
use std::cell::RefCell;
struct Inner {
a: i32,
}
struct Outer(RefCell<Inner>);
impl Outer {
fn new() -> Self {
Self(RefCell::new(Inner { a: 0 }))
}
fn describe(&self) {
println!("a: {}", self.0.borrow().a)
}
// Can't pass &mut
fn modify(&self, a: i32) {
self.0.borrow_mut().a = a;
}
}
fn main() {
    let outer = Outer::new();
    a(&outer);
    b(&outer);
}
fn a(outer: &Outer) {
outer.describe();
}
fn b(outer: &Outer) {
outer.modify(1);
}
}
// shows interior mutability implementation detail (creation of the inner state), infects code with
// lifetimes
// Is meant to be copied
// PROs
// - can have &mut self receiver - API shows intent better
// CONs
// - internal implementation detail is shown
mod reference_inside {
use std::cell::RefCell;
struct Inner {
a: i32,
}
#[derive(Clone, Copy)]
struct Outer<'a>(&'a RefCell<Inner>);
impl<'a> Outer<'a> {
fn new(inner: &'a RefCell<Inner>) -> Self {
Self(inner)
}
fn describe(&self) {
println!("a: {}", self.0.borrow().a)
}
fn modify(&mut self, a: i32) {
self.0.borrow_mut().a = a;
}
}
fn a(outer: Outer) {
outer.describe();
}
fn b(mut outer: Outer) {
outer.modify(1);
}
fn main() {
let inner = RefCell::new(Inner { a: 0 });
let outer = Outer::new(&inner);
a(outer);
b(outer);
}
}
// Hides interior mutability implementation detail (creation of the inner state), infects code with
// lifetimes
//
// Is meant to be copied
//
// Makes need to allocate resources still visible
//
// PROs
// - can have &mut self receiver - API shows intent better
// - internal implementation detail is hidden
// - handling of init with multiple resources is easier
// - still shows that there is some shared state
// CONs
// - Boilerplate, that should be removable with a macro
mod reference_inside_hide_state {
use std::{cell::RefCell, mem::MaybeUninit};
struct Inner {
a: i32,
}
struct OuterAllocations {
inner: MaybeUninit<RefCell<Inner>>,
}
impl Default for OuterAllocations {
fn default() -> Self {
OuterAllocations {
inner: MaybeUninit::uninit(),
}
}
}
#[derive(Clone, Copy)]
struct Outer<'a> {
inner: &'a RefCell<Inner>,
}
impl<'a> Outer<'a> {
// &'a mut here makes sure that allocations is not used multiple times
fn new(allocations: &'a mut OuterAllocations) -> Self {
let inner = &*allocations.inner.write(RefCell::new(Inner { a: 0 }));
Self { inner }
}
fn describe(&self) {
println!("a: {}", self.inner.borrow().a)
}
fn modify(&mut self, a: i32) {
self.inner.borrow_mut().a = a;
}
}
fn a(outer: Outer) {
outer.describe();
}
fn b(mut outer: Outer) {
outer.modify(1);
}
fn main() {
let mut allocations = OuterAllocations::default();
let outer = Outer::new(&mut allocations);
a(outer);
b(outer);
}
}