Devik's blog

RPMsg in STM32MP1 performance

by Martin Devera, filed under kernel

Testing with an STM32MP1 setup: a 100 MHz MCU (the Cortex-M4) and a 500 MHz MPU (the Cortex-A7), sending a 100 B query to the MCU, which responds with 5 x 400 B packets.

A run of 1000 queries took 150 ms, which equals an MCU read rate of 13.3 MB/s. Here is top:

Mem: 21040K used, 100804K free, 10868K shrd, 0K buff, 11632K cached
CPU:  0.0% usr  100% sys  0.0% nic  0.0% idle  0.0% io  0.0% irq  0.0% sirq
Load average: 0.46 0.12 0.04 2/47 363
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   25     2 root     RW       0  0.0   0 30.7 [kworker/0:1+m4@]
  362   147 root     S     1268  1.0   0 23.0 ./rpmsg_tst.elf
  100     2 root     SW       0  0.0   0 15.3 [irq/44-4c001000]
   99     2 root     SW       0  0.0   0 15.3 [irq/43-4c001000]
  363   147 root     R     2572  2.1   0  7.6 top
   10     2 root     IW       0  0.0   0  7.6 [rcu_sched]
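
For reference, the rate follows directly from the payload: 1000 queries x 5 x 400 B = 2 MB received in 150 ms, i.e. roughly 13.3 MB/s.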

Another setup is a VirtIO network device with the same HW config: the M4 sends 990 B Ethernet packets to the A7 at a rate of 20 MB/s. Here is the load:

CPU:  0.1% usr 41.0% sys  0.0% nic 40.2% idle  0.0% io  0.0% irq 18.5% sirq
Load average: 0.66 0.56 0.37 1/48 748
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   99     2 root     SW       0  0.0   0 43.4 [irq/43-4c001000]
   25     2 root     RW       0  0.0   0 15.8 [kworker/0:1-m4@]

A variant where the VirtIO IRQ is sent only for every 8th packet:

CPU:  0.0% usr  4.1% sys  0.0% nic 81.9% idle  0.0% io  0.0% irq 13.9% sirq
Load average: 0.25 0.11 0.18 1/48 759
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   99     2 root     SW       0  0.0   0 12.9 [irq/43-4c001000]
   25     2 root     RW       0  0.0   0  2.3 [kworker/0:1-m4@]
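
The difference between the two VirtIO runs is simply kick/IRQ coalescing on the M4 side. A minimal sketch of the idea, with hypothetical enqueue_packet() and notify_host() helpers standing in for the real virtqueue/mailbox glue:

    /* Sketch of IRQ coalescing on the M4 side (helper names are made up).
     * Instead of kicking the A7 for every packet pushed into the TX
     * virtqueue, kick only for every Nth packet; the A7 then drains the
     * whole available ring per interrupt. */
    #include <stddef.h>

    #define KICK_EVERY 8                   /* IRQ for each 8th packet */

    extern void enqueue_packet(const void *buf, size_t len); /* put buffer into the TX virtqueue */
    extern void notify_host(void);                           /* raise the mailbox IPI to the A7  */

    static unsigned int pkts_since_kick;

    void send_eth_packet(const void *buf, size_t len)
    {
        enqueue_packet(buf, len);

        if (++pkts_since_kick >= KICK_EVERY) {
            pkts_since_kick = 0;
            notify_host();
        }
    }

In practice the sender also needs a timeout or an explicit flush so that the last few packets of a burst are not left waiting for the 8th one.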

RPMsg thoughts

by Martin Devera, filed under kernel

RPMsg is a communication channel between two CPUs in a system. The main reason to use it is its master-side support in the Linux kernel.
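
On the master side the API is the usual Linux driver model: you register an rpmsg driver, it gets probed when the remote firmware announces a matching channel, and incoming messages arrive through a callback. A minimal sketch (the channel name "my-channel" is made up; the rest is the standard rpmsg driver API):

    #include <linux/module.h>
    #include <linux/rpmsg.h>

    /* Called for every message the remote side sends on this channel. */
    static int my_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len,
                           void *priv, u32 src)
    {
        dev_info(&rpdev->dev, "received %d bytes from 0x%x\n", len, src);
        return 0;
    }

    static int my_rpmsg_probe(struct rpmsg_device *rpdev)
    {
        static const char hello[] = "hello";

        /* Send a message over the channel's default endpoint. */
        return rpmsg_send(rpdev->ept, (void *)hello, sizeof(hello));
    }

    /* Bind to channels the remote firmware announces under this name. */
    static const struct rpmsg_device_id my_rpmsg_id_table[] = {
        { .name = "my-channel" },
        { },
    };
    MODULE_DEVICE_TABLE(rpmsg, my_rpmsg_id_table);

    static struct rpmsg_driver my_rpmsg_driver = {
        .id_table = my_rpmsg_id_table,
        .probe    = my_rpmsg_probe,
        .callback = my_rpmsg_cb,
        .drv      = { .name = "my_rpmsg_client" },
    };
    module_rpmsg_driver(my_rpmsg_driver);

    MODULE_LICENSE("GPL");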

It is a quite natural reuse of the VirtIO subsystem; however, I don't understand some of the design decisions yet.

  • Both i.MX and STM use a fixed-size MAX_RPMSG_BUF_SIZE=512, which is a compile-time constant (!). A regular question on support forums is how to enlarge it. Why not take it from DM?
  • Usage from userspace is not easy. There is the drivers/rpmsg/rpmsg_char.c driver, but it doesn't bind to the rpmsg bus automatically. There is /sys/bus/rpmsg/devices/virtio0.chrdev.-1.1/driver_override, which can be used to bind rpmsg_char to the channel. I used the patch below to bind all channels named chrdev to it (a userspace sketch follows after this list).

    @@ -532,9 +533,11 @@ static void rpmsg_chrdev_remove(struct rpmsg_device *rpdev)
         put_device(&ctrldev->dev);
     }
     
    +static const struct rpmsg_device_id id_table[2] = { { "chrdev" }, { "" } };
     static struct rpmsg_driver rpmsg_chrdev_driver = {
         .probe = rpmsg_chrdev_probe,
         .remove = rpmsg_chrdev_remove,
    +    .id_table = id_table,
         .drv = {
             .name = "rpmsg_chrdev",
         },
  • Is there any reason not to use a VirtIO device of type network? It is well optimized, can use MAC addresses for addressing, has no size restrictions, and allows userspace control via sockets. Probably the whole reason RPMsg exists is that its buffers are allocated from a well-defined DMA-like memory region (often SRAM), whereas SKBs on the other side typically live in main DRAM.
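
And the userspace sketch promised above. With the patch applied, rpmsg_char binds to the channel and registers a control device; an endpoint created through it then shows up as a character device that moves one rpmsg message per read()/write(). The device names, addresses, and sizes below are assumptions for illustration:

    /* Userspace use of rpmsg_char (device names and addresses are assumptions).
     * The control device is used to create an endpoint, which then appears
     * as /dev/rpmsgN; read()/write() on it exchange whole rpmsg messages. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/rpmsg.h>   /* struct rpmsg_endpoint_info, RPMSG_CREATE_EPT_IOCTL */

    int main(void)
    {
        struct rpmsg_endpoint_info ept = {
            .name = "chrdev",
            .src  = 0xffffffff,            /* any local address */
            .dst  = 1,                     /* remote address, guessed from the channel name above */
        };
        char query[100] = "ping", reply[512];
        int ctrl, fd;
        ssize_t n;

        ctrl = open("/dev/rpmsg_ctrl0", O_RDWR);
        if (ctrl < 0 || ioctl(ctrl, RPMSG_CREATE_EPT_IOCTL, &ept) < 0) {
            perror("rpmsg_ctrl");
            return 1;
        }

        fd = open("/dev/rpmsg0", O_RDWR);  /* endpoint created above */
        if (fd < 0) {
            perror("open /dev/rpmsg0");
            return 1;
        }

        write(fd, query, sizeof(query));   /* one write = one rpmsg message */
        n = read(fd, reply, sizeof(reply));
        if (n > 0)
            printf("got %zd bytes\n", n);
        return 0;
    }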