FutureSDR MobiCom Demo Accepted


Our FutureSDR + IPEC demo is accepted at ACM MobiCom 2023. Yay!

In the demo, we show the same FutureSDR receiver running on three very different platforms:

  • a normal laptop, interfacing an Aaronia Spectran v6 SDR
  • a web browser, compiled to WebAssembly and interfacing a HackRF SDR through WebUSB
  • an AMD/Xilinx RFSoC ZCU111 evaluation board

On the ZCU111, the same decoder is implemented both in software (using FutureSDR) and in hardware (using IPEC). Since both implementations have the same structure, we can configure at runtime after which decoding stage processing is handed over from the FPGA to the CPU.

With regard to FutureSDR, this highlights two important features:

  • It demonstrates the portability of FutureSDR, with the same receiver running on three very different platforms.
  • It shows that the software implementation can dynamically offload different parts of the decoding at runtime.

Please check out the paper for further information or visit our booth at the conference.

  1. David Volz, Andreas Koch, and Bastian Bloessl, "Software-Defined Wireless Communication Systems for Heterogeneous Architectures," Proceedings of the 29th Annual International Conference on Mobile Computing and Networking (MobiCom 2023), Demo Session, Madrid, Spain, October 2023. [DOI, BibTeX, PDF and Details...]

WebAssembly Tutorial


With all the cool Software Defined Radio (SDR) WebAssembly projects popping up, it seems like 2023 is the year of SDR in the browser :-)

We recently worked on improving the WebAssembly support for FutureSDR and are pretty happy with the result. It no longer requires a lot of tricks to cross-compile the native driver with Emscripten; instead, it enables a complete Rust workflow.

The current user experience is shown in a tutorial-style live coding video, where we port a native FutureSDR ZigBee receiver to WebAssembly.

FutureSDR Demo at 6G Platform Germany


David Volz and I presented our demo FutureSDR meets IPEC at the Berlin 6G Conference, a meeting of all 6G-related projects, funded by the Federal Ministry of Education and Research (BMBF). Our demo showed how FutureSDR can be used to implement platform-independent real-time signal processing applications that can be reconfigured during runtime.

We had the same FutureSDR receiver running on a Xilinx RFSoC FPGA board, on a normal laptop with an Aaronia Spectran V6 SDR, and in the web browser, using a HackRF. Furthermore, we had the same receiver implemented on the FPGA of the RFSoC, using David’s IPEC framework for Inter-Processing Element Communication. Since the FPGA and CPU implementations had the same structure, we could dynamically decide where to make the cut between FPGA and CPU processing, which was reflected in the CPU load of the RFSoC’s ARM processor.



Proc Macros


We finally entered the proc_macro game for some advanced syntactic sugaring :-) The macros are still experimental but already fun to use.

At the moment, we support two macros: one for connecting blocks in a flowgraph and one for implementing message handlers.

Connect Macro

The connect! macro serves two purposes: it adds blocks to the flowgraph and it connects them.

This makes the code quite a bit cleaner. Compare, for example:

fn main() -> Result<()> {
    let mut fg = Flowgraph::new();

    let src = NullSource::<u8>::new();
    let head = Head::<u8>::new(123);
    let snk = NullSink::<u8>::new();

    let src = fg.add_block(src);
    let head = fg.add_block(head);
    let snk = fg.add_block(snk);

    fg.connect_stream(src, "out", head, "in")?;
    fg.connect_stream(head, "out", snk, "in")?;

    Runtime::new().run(fg)?;
    Ok(())
}

with the version using the connect! macro:

fn main() -> Result<()> {
    let mut fg = Flowgraph::new();

    let src = NullSource::<u8>::new();
    let head = Head::<u8>::new(123);
    let snk = NullSink::<u8>::new();

    connect!(fg, src > head > snk);

    Runtime::new().run(fg)?;
    Ok(())
}


The macro uses > to indicate stream connections and | to indicate message connections.

connect!(fg, stream_source > stream_sink);
connect!(fg, message_source | message_sink);

If the port connections are not the default "in" and "out", the port names can be given explicitly.

connect!(fg, src.out > snk.in);

While it is uncommon, a port name might contain a space. This can be handled by quoting the port name.

connect!(fg, src."output port" > snk.in);

If a block has no input or output ports that need to be connected, it can just be put on a line of its own, which simply adds it to the flowgraph.

connect!(fg, dummy_block);

As shown in the example, blocks can also be chained.

connect!(fg, src > head > snk);

And, finally, more complex topologies can be set up with multiple lines.

connect!(fg,
    src > fwd;
    fwd > snk;
    msg_src | msg_snk;
);

The idea and initial implementation of the connect! macro was by Loïc Fejoz. Thank you!

Message Handlers

Handlers for message ports are, at the moment, quite ugly. This is mainly due to current limitations of Rust’s async functions that will hopefully be overcome in the future.

Assume you want to implement a block with a message handler my_handler that is registered when the block is constructed.

pub fn new() -> Block {
    // ... in the MessageIoBuilder chain:
            .add_input("handler", Self::my_handler)
    // ...
}

Using the #[message_handler] attribute macro, one can implement the handler with:

async fn my_handler(
    &mut self,
    _mio: &mut MessageIo<Self>,
    _meta: &mut BlockMeta,
    _p: Pmt,
) -> Result<Pmt> {
    // ... handler logic ...
    Ok(Pmt::Null)
}

This is much closer to what one would expect, compared to the dynamically generated, boxed-up async block that it actually expands to under the hood.

All of this is still experimental, but it’s amazing what is possible with proc macros.

If you want to see a complete example, we added one to the project.

Please give it a try and let us know what you think :-)

Better Flowgraph Interaction


A FlowgraphHandle is the main struct to interact with a flowgraph, once it is started and ownership is passed to the runtime. In essence, the handle wraps the sending part of a multi-producer-single-consumer channel to send FlowgraphMessages, which define the protocol between the flowgraph and the outside world.

pub struct FlowgraphHandle {
    inbox: Sender<FlowgraphMessage>,
}

Recently, we extended the interface, allowing the user to get a FlowgraphDescription or BlockDescription from the handle. This information can be used, for example, with GUI components to plot the flowgraph topology.

pub struct FlowgraphDescription {
    pub blocks: Vec<BlockDescription>,
    pub stream_edges: Vec<(usize, usize, usize, usize)>,
    pub message_edges: Vec<(usize, usize, usize, usize)>,
}

The interaction with the flowgraph is pretty elegant. We send it a message asking for a FlowgraphDescription and provide a oneshot channel on which we await the result. This is sometimes referred to as the Actor Pattern.

pub async fn description(&mut self) -> Result<FlowgraphDescription> {
    let (tx, rx) = oneshot::channel::<FlowgraphDescription>();
    self.inbox
        .send(FlowgraphMessage::FlowgraphDescription { tx })
        .await?;
    let d = rx.await?;
    Ok(d)
}

Control Port

Apart from extending the flowgraph handle, we refactored the control port interface of the flowgraph (i.e., the REST API) to use the FlowgraphHandle. Prior to that, there was a disconnect, which resulted in code duplication. Now, the web application server just uses the handle and exposes a web interface for it.

Both FlowgraphDescription and BlockDescription are serializable structs that are exposed at localhost:1337/api/fg/ and localhost:1337/api/block/<n>/.


Using the FlowgraphHandle with control port, the whole web server endpoint is just:

async fn flowgraph_description(
    Extension(mut flowgraph): Extension<FlowgraphHandle>,
) -> Result<Json<FlowgraphDescription>, StatusCode> {
    if let Ok(d) = flowgraph.description().await {
        Ok(Json(d))
    } else {
        Err(StatusCode::INTERNAL_SERVER_ERROR)
    }
}

API endpoints like these can be used easily with tools like curl or any web library.

Querying the API with Curl.

To demonstrate how this can be used, we made the control port front page display a Mermaid diagram of the flowgraph.

New control port frontpage.

These concepts, in particular the Mermaid representation of the flowgraph, were thought of and pushed forward by Loïc Fejoz. Thank you!

Monomorphized Apply-/Functional-Style Blocks


We already discussed blocks that are generic over a function and showed how they are handy for rapid prototyping. These apply- or functional-style blocks proved incredibly useful and are utilized in many examples.

For example, a block that doubles every float is just Apply::new(|i: &f32| i * 2.0).

However, until now, these blocks had an inherent drawback, since the function was dispatched dynamically during runtime. To be precise, the function was a closure allocated on the heap, which resulted in a function call per item, without any chance for the compiler to optimize things.

From FutureSDR v0.0.22, we avoid this overhead, making the blocks generic over the function, instead of a heap-allocated closure. This means, we move from

pub struct Apply<A, B>
where
    A: 'static,
    B: 'static,
{
    f: Box<dyn FnMut(&A) -> B + Send + 'static>,
}

to

pub struct Apply<F, A, B>
where
    F: FnMut(&A) -> B + Send + 'static,
    A: Send + 'static,
    B: Send + 'static,
{
    f: F,
}
With this, the compiler generates a separate implementation for each apply-style block, i.e., it is monomorphized. This is in contrast to the old version, which was polymorphic in the sense that there was one Apply implementation that could handle any function closure with a given I/O signature.

Switching to monomorphized blocks, the compiler can inline the function and apply optimizations. There is, for example, no reason the compiler couldn’t vectorize the function to benefit from SIMD instructions.
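To make the difference concrete, here is a stripped-down, self-contained sketch of the two designs. DynApply and GenApply are made-up names for illustration, not the actual FutureSDR code:

```rust
use std::marker::PhantomData;

// Dynamic dispatch: one type handles every closure with this signature,
// but each item costs an indirect call through the boxed closure.
pub struct DynApply<A, B> {
    f: Box<dyn FnMut(&A) -> B>,
}

impl<A, B> DynApply<A, B> {
    pub fn run(&mut self, input: &[A], out: &mut Vec<B>) {
        for i in input {
            out.push((self.f)(i));
        }
    }
}

// Monomorphized: the compiler emits a specialized type per closure,
// so the call can be inlined and potentially vectorized.
pub struct GenApply<F, A, B>
where
    F: FnMut(&A) -> B,
{
    f: F,
    _p: PhantomData<(A, B)>,
}

impl<F, A, B> GenApply<F, A, B>
where
    F: FnMut(&A) -> B,
{
    pub fn new(f: F) -> Self {
        Self { f, _p: PhantomData }
    }

    pub fn run(&mut self, input: &[A], out: &mut Vec<B>) {
        for i in input {
            out.push((self.f)(i));
        }
    }
}

fn main() {
    let input = [1.0f32, 2.0, 3.0];

    let mut dyn_out = Vec::new();
    DynApply { f: Box::new(|x: &f32| x * 2.0) }.run(&input, &mut dyn_out);

    let mut gen_out = Vec::new();
    GenApply::new(|x: &f32| x * 2.0).run(&input, &mut gen_out);

    // Both compute the same result; only the dispatch mechanism differs.
    assert_eq!(dyn_out, gen_out);
    println!("{:?}", gen_out);
}
```

Both variants behave identically; the benchmark numbers above come purely from the dispatch mechanism.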

We did a quick performance comparison for a simple operation (|x: &u8| x.wrapping_add(1)). To this end, we mocked the Apply block and processed 100M samples. On my machine, this took ~24ms for the monomorphized version vs ~127ms for the old, dynamically dispatched version.

Sync vs Async Blocks


FutureSDR supports both sync and async blocks. Their only difference is the work() function, which is either a normal or an async function. Overall, a Block is an enum containing a sync/async block with a sync/async kernel, implementing work(). This type structure already suggests that supporting both implementations leads to complexity, bloat, and code duplication.

pub trait AsyncKernel: Send {
    async fn work(&mut self, ...) -> Result<()> { ... }
}

pub trait SyncKernel: Send {
    fn work(&mut self, ...) -> Result<()> { ... }
}

Obviously, it would be possible to use only async blocks, since one is not forced to .await anything inside an async function, i.e., any sync function could just be made async. The reason both implementations exist is that async blocks implement the AsyncKernel trait, which defines async functions. This is an area where Rust is still in active development. Out of the box, it does not support async trait functions, which is why everybody resorts to the async_trait crate that enables them. The popularity of this crate shows how desperately people want this language feature.

The reason that it is not mainline is – according to my understanding – that there is a performance penalty to using async trait functions. In short, one can think of an async function as a state machine with some local variables. If the compiler knows the concrete realization, it can build complex, nested state machines during compilation. If the compiler doesn’t know the concrete realization due to dynamic dispatch of trait functions, it has to allocate the state machine during runtime for every function call.

This sounded like a complete performance disaster – at least for work(), which is called over and over again. Therefore, we added support for sync blocks, which avoid this overhead.
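The boxing can be sketched in a self-contained way. The names below are hypothetical, and the mini executor exists only to drive the future; the point is what an async trait function desugars to with the async_trait approach: a regular function returning a boxed, dynamically dispatched future that is heap-allocated on every call.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// What #[async_trait] effectively expands to: the async fn becomes a
// regular fn returning a boxed, dynamically dispatched future.
trait AsyncKernel {
    fn work<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>>;
}

struct CopyBlock(u32);

impl AsyncKernel for CopyBlock {
    fn work<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = u32> + Send + 'a>> {
        // A fresh allocation on every call to work().
        Box::pin(async move {
            self.0 += 1;
            self.0
        })
    }
}

// Minimal busy-poll executor, just enough to drive the future here.
fn block_on<F: Future>(fut: F) -> F::Output {
    fn noop(_: *const ()) {}
    fn clone(p: *const ()) -> RawWaker {
        RawWaker::new(p, &VTABLE)
    }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn main() {
    let mut kernel = CopyBlock(0);
    let v = block_on(kernel.work());
    println!("work() returned {v}");
}
```

For a work() function that is called millions of times, this per-call allocation was the suspected overhead.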


Already back then, some quick tests suggested that the performance difference might not be that big. So the question is whether it is really worth having both block types. Today, we conducted some experiments to take a closer look.

The code and all scripts are available here. In short, the measurements consider three schedulers: a single-threaded scheduler (Smol-1), a multi-threaded scheduler (Smol-N), and an optimized, multi-threaded scheduler (Flow) that polls blocks in their natural order (upstream to downstream). We make six CPU cores available to the process and use six worker threads for the multi-threaded schedulers. The flowgraph consists of six independent subflows, each with a source that streams 200e6 32-bit floats into the flowgraph and #Stages (x-axis) copy blocks, each copying a random number of samples (uniformly distributed in [1; 512]) in each call to work. We create a sync and an async version of the otherwise identical copy blocks.

Execution Time of Flowgraphs

The blocks do not do any DSP and only copy small chunks of samples. The performance is, therefore, mainly determined by the overhead of the runtime and the potential overhead of the async block. Yet, the differences are minor.


This suggests that it is not worth supporting sync implementations, at least not for now. And in the future, I expect things to get better: there are ongoing discussions about how Rust should handle async trait functions. Maybe a more efficient approach will be found, which would further improve the situation.

In retrospect, one could see this as a premature optimization that did not pay off. But it is also interesting to see the effect and quantify its impact on performance. As time permits, I will go ahead and remove sync blocks, so we get back to a more minimal runtime.

Full ZigBee SDR Receiver in the Browser


Some months ago, we showed a complete SDR waterfall plot running in the browser. It interfaced an RTL-SDR from within the browser, using cross-compiled drivers. In short, this requires compiling the driver to WebAssembly (Wasm) using Emscripten and a shim that maps libusb to WebUSB calls.

Signal processing was implemented with FutureSDR. It even supported wgpu custom buffers for platform-independent GPU acceleration. Wgpu is really awesome. It supports all major platforms, using their native backends: Linux/Android (→ Vulkan), Windows (→ DX12), macOS/iOS (→ Metal), and Wasm (→ WebGPU).

What was missing was a real, non-trivial SDR application. We were curious about what is possible, so we developed a ZigBee receiver and cross-compiled it to Wasm. Furthermore, since the RTL-SDR doesn’t work in the 2.4GHz band and doesn’t provide the required bandwidth, we cross-compiled the driver of the HackRF in a similar fashion.

To test the receiver, we generated ZigBee frames with Scapy and sent them using an ATUSB IEEE 802.15.4 USB Adapter.

import time
from scapy.all import *

# linux: include/uapi/linux/if_ether.h
ETH_P_IEEE802154 = 0x00f6

i = 0
while True:
    fcf = Dot15d4FCS()
    data = Dot15d4Data(dest_panid=0x47d0, dest_addr=0x0000, src_panid=0x47d0, src_addr=0xee64)
    frame_data = fcf/data/f"FutureSDR {i}"
    sendp(frame_data, iface='monitor0', type=ETH_P_IEEE802154)
    time.sleep(1)  # pace the transmissions
    i += 1

Turns out, this actually works, and the 4Msps of the ZigBee receiver can be processed in real-time in the browser. We host a demo on our website, which is hard-coded to ZigBee channel 26 @ 2.48GHz. The receiver is, however, also part of the examples. It works just as well as a native binary outside the browser, using SoapySDR to interface the hardware.

At the moment, the FutureSDR receiver uses only one thread, which is, however, separate from the HackRF RX thread spawned by the driver. Compiled in release mode, FutureSDR uses around 20% CPU on an Intel i7-8700K. See the demo video here:

And as usual, everything just works on the phone. This is not an Android application with cross-compiled drivers. It runs the whole ZigBee receiver in the Google Chrome browser that is shipped with my phone. Really fascinating what is possible in the browser these days…

Phone Setup

If you have ideas for cool applications, feel free to reply to one of the Twitter threads :-)

Slab Buffers


Buffers are at the heart of every SDR runtime. GNU Radio, for example, is famous for its double-mapped circular buffers. In short, they use the MMU to map the same physical memory twice, back-to-back, in the virtual address space of the process. This arrangement allows implementing a ring buffer on top that always presents the available read/write space as consecutive memory, similar to a C array. The figure below shows how a buffer, consisting of physical memory areas A and B, would be mapped.

Double-Mapped Circular Buffer

Using these buffers, blocks can assume that data is always in linear, consecutive memory. In contrast to normal circular buffers, they do not have to care about wrapping. This simplifies DSP implementations, in particular, for algorithms that consider multiple samples to produce output (e.g., a FIR filter). Furthermore, samples in linear memory allow using vectorized instructions (provided by SIMD extensions), which can make a big difference [1, 2].
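The trick can be illustrated without an MMU. The sketch below (hypothetical names, not the FutureSDR or GNU Radio implementation) emulates the second mapping by mirroring every write into a second copy of the buffer, so the readable region is always one contiguous slice, even when it wraps:

```rust
// Emulated double-mapped ring buffer: buf holds the payload twice,
// mimicking the same physical memory mapped back-to-back.
struct DoubleMapped {
    buf: Vec<u8>, // 2 * capacity bytes; buf[i] == buf[i + capacity]
    capacity: usize,
    read: usize,  // absolute read offset
    write: usize, // absolute write offset
}

impl DoubleMapped {
    fn new(capacity: usize) -> Self {
        Self {
            buf: vec![0; 2 * capacity],
            capacity,
            read: 0,
            write: 0,
        }
    }

    fn produce(&mut self, data: &[u8]) {
        assert!(self.write - self.read + data.len() <= self.capacity);
        for &b in data {
            let w = self.write % self.capacity;
            self.buf[w] = b;
            self.buf[w + self.capacity] = b; // mirror: emulates the second mapping
            self.write += 1;
        }
    }

    // All readable data as ONE contiguous slice, even across the wrap.
    fn readable(&self) -> &[u8] {
        let r = self.read % self.capacity;
        &self.buf[r..r + (self.write - self.read)]
    }

    fn consume(&mut self, n: usize) {
        assert!(n <= self.write - self.read);
        self.read += n;
    }
}

fn main() {
    let mut rb = DoubleMapped::new(4);
    rb.produce(&[1, 2, 3]);
    rb.consume(2);
    rb.produce(&[4, 5, 6]); // physically wraps around...
    // ...but a block still sees plain consecutive memory:
    assert_eq!(rb.readable(), &[3, 4, 5, 6]);
    println!("{:?}", rb.readable());
}
```

The real double-mapped buffer gets the mirroring for free from the MMU; the emulation only shows why a block never has to handle the wrap-around case.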

Given these advantages, double-mapped circular buffers were also adopted by FutureSDR. (There is now also a separate crate for them, in case you want to roll your own SDR application without a framework or runtime.) These buffers work well on Linux, Android, Windows, and macOS. FutureSDR, however, also targets platforms that do not allow memory mapping (WebAssembly/Wasm) or do not have an MMU in the first place.

read more

Benchmarking FutureSDR


The introductory video of FutureSDR already showed some quick benchmarks for the throughput of message- and stream-based flowgraphs. What was missing (not only for FutureSDR but for SDRs in general) were latency measurements. I therefore took a closer look at the issue.

While throughput can be measured rather easily (by piping a given amount of data through a flowgraph and measuring its execution time), latency is more tricky. The state of the art is to do I/O measurements, where the flowgraph reads samples from an SDR, processes them, and loops them back. Using external hardware (i.e., a signal generator and an oscilloscope), one can measure the latency.

The drawback of this approach is obvious. It requires hardware and a non-trivial setup, and it is hard to automate and integrate into CI/CD.

An alternative is measuring latency by logging when a sample is produced in a source and received in a sink. The main requirement for this measurement is that the overhead must be minimal. Otherwise, one easily measures the performance of the logging itself or impacts the flowgraph in a way that its behavior is no longer representative of normal execution.
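One way to keep that overhead small is to preallocate the log and record only a monotonic timestamp per item. A minimal sketch, with a hypothetical Probe type rather than FutureSDR's actual instrumentation:

```rust
use std::time::Instant;

// Preallocated timestamp log: recording is a Vec push with reserved
// capacity, so the hot path neither allocates nor does I/O.
struct Probe {
    t0: Instant,
    log: Vec<(u64, u64)>, // (item index, nanoseconds since t0)
}

impl Probe {
    fn new(t0: Instant, capacity: usize) -> Self {
        Self {
            t0,
            log: Vec::with_capacity(capacity),
        }
    }

    #[inline]
    fn record(&mut self, item: u64) {
        self.log.push((item, self.t0.elapsed().as_nanos() as u64));
    }
}

fn main() {
    // Source and sink share one reference time, so timestamps are comparable.
    let t0 = Instant::now();
    let mut src = Probe::new(t0, 1024);
    let mut snk = Probe::new(t0, 1024);

    for i in 0..10 {
        src.record(i);
        // ... flowgraph processing would happen here ...
        snk.record(i);
    }

    // Per-item latency is the difference between the two logs.
    let worst = src
        .log
        .iter()
        .zip(&snk.log)
        .map(|(s, k)| k.1 - s.1)
        .max()
        .unwrap();
    println!("worst-case latency: {worst} ns");
}
```

The logs are only evaluated after the flowgraph has terminated, so the analysis cost never perturbs the measurement.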

read more

Generic Blocks for Rapid Prototyping


FutureSDR is still missing basically all standard blocks at this stage. Fortunately, people started contributing some of them, including blocks to add or multiply a stream with a constant. These blocks were implemented generically over the arithmetic operation. Thinking a bit further about the concept, we realized that it can be extended to arbitrary operations, creating blocks that are generic over function closures.

Meet our new blocks: Source, FiniteSource, Apply, Combine, Split, and Filter, all of which are generic over mutable closures. This can come in handy to quickly hack something together. Let me give you some examples.


Need a constant source that produces 123 as u32?

use futuresdr::blocks::Source;

let _ = Source::new(|| 123u32);

The Source block is generic over FnMut() -> A. It infers the output type (in this case u32) and creates the appropriate stream output.

Need a source that iterates again and again over a range or vector?

let mut v = (0..10).cycle();
let _ = Source::new(move || v.next().unwrap());
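The mechanics behind this can be sketched in plain, self-contained Rust. ClosureSource is a made-up name for illustration, not the actual FutureSDR implementation; the point is that a struct generic over FnMut() -> A picks up the item type from the closure's return type:

```rust
use std::marker::PhantomData;

// A source generic over a closure: the item type A is inferred from
// the closure's return type, mirroring the idea behind Source.
struct ClosureSource<F, A>
where
    F: FnMut() -> A,
{
    f: F,
    _p: PhantomData<A>,
}

impl<F, A> ClosureSource<F, A>
where
    F: FnMut() -> A,
{
    fn new(f: F) -> Self {
        Self { f, _p: PhantomData }
    }

    // Produce n items by calling the closure repeatedly.
    fn produce(&mut self, n: usize) -> Vec<A> {
        (0..n).map(|_| (self.f)()).collect()
    }
}

fn main() {
    // Constant source: item type inferred as u32.
    let mut constant = ClosureSource::new(|| 123u32);
    assert_eq!(constant.produce(2), vec![123, 123]);

    // Cycling source, as in the example above.
    let mut v = (0..3).cycle();
    let mut cycling = ClosureSource::new(move || v.next().unwrap());
    println!("{:?}", cycling.produce(5)); // [0, 1, 2, 0, 1]
}
```

Since the closure is FnMut, it can carry mutable state (like the cycling iterator) across calls.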
read more

Hello World!


We are not perfect, but we are here :-) Yay! A lot of stuff is still very much in flux, but we think the project has reached a state where it might be interesting for some.

After a lot of refactoring, we believe that the main components are in place and development is more fun: we can now work on bugs and issues that are more local, i.e., one doesn’t have to change bits across the whole code base to fix something :-)

FutureSDR implements several new concepts. We hope you have some fun playing around with them. So happy hacking and please get in touch with us on GitHub or Discord, if you have questions, comments, or feedback.