[Dprglist] PID-tuned Clock in Python?

Murray Altheim murray18 at altheim.com
Sat Feb 20 12:31:22 PST 2021


On 21/02/21 6:43 am, Chris N wrote:
> Lots of good advice in here!

Hi Chris,

Yes, agreed. An interesting conversation. I'm learning a lot.

> The recent Cortex-A + Cortex-M combo SOCs are certainly interesting. 
> Hopefully the boards come down in price a little. I have seen one
> from NXP with a i.MX8 + Cortex M7.   I.e. Quad-core Cortex-A53 at 
> 1.5Ghz, i.e. same caliber as Pi but with built-in Teensy 4.0-caliber
> coprocessor.

I'm following the discussion here as well as continuing to keep my eyes
open for similar developments. As I mentioned in my previous message,
when we're talking about more seriously complicated robots, the real
question *may* be not so much whether we can get two CPUs to talk to
each other, but whether an array of, say, six can do so with the same
performance we *might* see between two on the same board. It's probably
better to optimise (increase the speed and shrink the size of) the
communications between them than to rely on the availability of
specific dual-CPU boards, but these are still interesting developments.

> How is this different from connecting a STM32G4xx Nucleo board or 
> a Teensy to the Pi?  I think the main advantage of having the 
> Cortex-M4 (or M7 in some cases) as a co-processor on the same 
> silicon is that you get a low-latency high-bandwidth connection
> between the two, compared to I2C, SPI or UART. Probably (hopefully)
> there are libraries available to take full advantage of this
> internal interconnect.

Agreed. I2C, SPI, UART, or CAN all have inherent bandwidth limitations,
and quirks of their own. I think David is using primarily UARTs on his
robot (I think he needs something like six).

> What it DOESN’T get you automagically is rugged, 5V-tolerant I/O 
> that is exposed via a prototyping-friendly connector that carries
> Signal+GND+PWR  (that last bit is of course a board-level thing,
> not chip level)

Well, not quite. But frankly I haven't had much to complain about
using the Pimoroni Breakout Garden series of I2C boards. They seem
to work pretty flawlessly. I've had about nine or ten of them on
my Pi working without issue, even running a Pi camera in HD video
while driving the robot around in telerobotic mode (as seen in the
YouTube video I posted*).

> Murray – could you please state what exactly the timing requirements
> of your control logic are?  What would you consider “good enough”?

I'm running a 20Hz/50ms timing loop that sends out TICK messages on
each loop and TOCK messages every 20th (i.e., once per second). These
messages go onto a message bus to which most of the robot's other
"features" subscribe. This is a somewhat more complicated version of
what David is doing in his control loop; I'm just using a
publish-subscribe model built over PyMessageBus. So long as the
messages are sent and received, being off by a little bit isn't
generally a big deal. But I'm also using that loop for my more
time-critical things like PID and encoder timing, and while variances
between 49.0 and 51.0ms aren't going to cause any real problems, when
the system load of the Pi goes way up (intermittently and rather
unpredictably, from the perspective of the Python program), that loop
can skip a beat (100ms) or jump up to 79ms or down to 43ms. And when
it recovers it doesn't recover back to the original beep-beep-beep of
the clock; it recovers relative to wherever the clock's next TICK
happens to be.
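
For reference, the loop I'm describing looks roughly like this (a
minimal sketch, not my actual code; the publish callback stands in for
the PyMessageBus publisher, and all names here are hypothetical):

```python
import time

class Clock:
    '''A 20Hz clock publishing TICK on each loop and TOCK every 20th.
    Deadlines are absolute (next_time += period), so one late
    iteration doesn't shift the cadence of all subsequent ticks.'''

    def __init__(self, publish, period_s=0.05, ticks_per_tock=20):
        self._publish = publish          # stand-in for the message bus
        self._period = period_s
        self._per_tock = ticks_per_tock

    def run(self, n_ticks):
        next_time = time.monotonic()
        for count in range(1, n_ticks + 1):
            next_time += self._period
            self._publish('TOCK' if count % self._per_tock == 0 else 'TICK')
            # sleep until the absolute deadline; don't sleep if already late
            delay = next_time - time.monotonic()
            if delay > 0.0:
                time.sleep(delay)
```

The absolute-deadline pattern is the part that matters: sleeping a
fixed 50ms each pass would accumulate drift, whereas this recovers
toward the original cadence after a slow iteration.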

> And what exactly happens when occasionally you don’t meet your 
> requirement?   I think Karim made some good points in that regard. 
> Maybe you can just let it go?  Maybe its one of those times where
> you are trying to improve or optimize something but it is actually 
> already “good enough”?

Well, the clock instability means that the overall timing curve of the
robot is pretty steady until it isn't, and then there's a big lag and
a recovery. The robot for a moment gets a bit drunk, and then recovers.
Not pretty. Worst case is that the PID loop surges for maybe even a
half second, which of course has an effect on the robot, a visible one.

Now, my admittedly Rube Goldberg solution of using the external clock
means that when the surge happens the Python loop still falters, but
when it comes back it comes back right on the sync of the external
clock, which hasn't drifted from its 0.1ms maximum error. I'm not sure
(I haven't tested this in the field yet), but the recovery seems to
happen within one 20Hz clock cycle, not over three or four or more.

But I admit that part of this exploration is for the journey's sake. I'm
trying to find out how I can get the combination of Linux+Python to work
better. I do understand (from reading extensively on the web and from
being reminded here on the DPRG list) that this is an utterly futile
task, that Linux+Python is not an RTOS because of all the things we've
talked about previously. I don't disagree with *any* of that, it's just
that I am by nature one of the stubborn kids on David's lawn. I don't
give up easily, especially when issued a challenge (even to myself).
And I don't mind a small bit of Rube Goldberg if that actually works. I'm
not sure it does -- I'm still experimenting.

> Frankly I don’t understand all that talk about connecting an external
> device to the Pi for the purpose of generating a “timer interrupt”, 
> or even reaching down into the low levels of the Pi hardware and 
> tapping into the hardware timers. Linux does use the hardware timers
> internally.
> 
> Now if somebody is bit-banging something and needs a delay of exactly
> 2.567 microseconds, then one definitely needs to use the hardware
> timer directly, and there are libraries available to do so.  (But if
> one is bit-banging stuff on the Pi then one is probably using the
> wrong tool….)

Well, when I was only using the Pi my timing loops were within about
0.5-1.0ms of 50ms, and things worked fine until the Pi hit a big surge;
then that error would go up to as much as 25-30ms until things settled
down again. That's the extreme end of it; usually a surge would slow
it by more like 3-5ms, but it still happened. Doing the external clock
thing means my clock loop still gets hit but recovers almost
immediately, and my error stays sub-millisecond (usually 0.7-0.9ms).
That's certainly fine for my usage.
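
For anyone curious, the kind of measurement I'm quoting can be
sketched like this (hypothetical names, not my robot code; it just
times a sleep-based loop against its absolute deadlines):

```python
import statistics
import time

def measure_jitter(period_s=0.05, n=100):
    '''Run n iterations of a sleep-based loop with absolute deadlines
    and return (mean, max) wake-up error in milliseconds.'''
    errors = []
    next_time = time.monotonic()
    for _ in range(n):
        next_time += period_s
        time.sleep(max(0.0, next_time - time.monotonic()))
        # how far past (or short of) the deadline did we actually wake?
        errors.append(abs(time.monotonic() - next_time) * 1000.0)
    return statistics.mean(errors), max(errors)
```

Running this for a few hundred iterations under varying system load is
enough to see both the steady-state error and the surge spikes.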

I can live with this Rube Goldberg arrangement with this level of
performance, no problem. It works. If it's possible to use one of the
Pi's internal timers, and that timer isn't affected by a surge on the Pi,
then great: I can get rid of an external device. That's what David put
me onto.

> I have said the following previously, so apologies for the repeat, 
> but it is worth saying again:

No apologies necessary. It's really appreciated that you map it out so
clearly; while I've seen bits and pieces of this, it's good to see it
all in one place. I've on occasion taken (with permission) some of
David's writing and put it on my blog or wiki to keep it where I (and
potentially others) can find it. Would you have any issue with me
consolidating your message and posting it to the NZPRG wiki?

> For anybody who is using the Pi, and who has
> 
>  * actual timing REQUIREMENTS or expectations expressed in single-digit 
>    milliseconds or smaller
>  * or who simply wants to make sure the Pi is firing on all cylinders 
>   (timing wise),
>  * or who simply wants to rule out timing issues as one of the variables
>    while trying to get stuff to work:

That would include me.

> If you are not doing ALL of the following, then you are simply wasting your time:
> 
>  1. Run your application – and all its threads - at elevated priority. 
>     Not only that, be sure to run your real-time threads at higher priority 
>     than your non-real-time threads.    The same would be true when one is 
>     using a RTOS. The OS can’t magically know which activity is most important.
>     You have to tell it.
>  2. Use a real-time Linux kernel (keyword: PREEMPT_RT)
>  3. Tell the kernel to isolate one of the Pi’s 4 cores and then run your real-time 
>     stuff on that core (add isolcpus=3 to cmdline.txt)
> 
> 1+2 are simply a must.
> 
> You are not allowed to have any expectations whatsoever regarding fully
> predictable timing if you are not doing 1+2.
> 
> Period. 
>
> 3 is icing on the cake and will shave off only a few more microseconds in practice.

Fair enough. I may be wasting my time. Like I said, I'm stubborn, and I have no
issue with trying unorthodox solutions. After you mentioned #1 (maybe a few weeks
ago) I started (I think, if I did it right) running my main thread at elevated
priority, but that's probably not the correct approach; I should probably run only
my Clock thread that way. But I could also be configuring any of this wrong. I'm
new at this.

>  4. It is also implied of course that the real-time threads in your application
>     don’t do things that are incompatible with real-time.  For example, perform complex
>     file I/O or call  complex library functions and expect those to finish their
>     work in a predictable amount of time.   The same would be true in case of a RTOS
> 
> “Linux is not a RTOS” ?    I would argue, if you do 1+2+3 then it is.

Well, with my publish-subscribe architecture, the only timing requirement I really
have is that a system clock sends out TICKs and TOCKs to trigger all the behaviours
and keep things in sync; such is the nature of that architecture (true on my Pi as
well as on a thousand VMs running a microservice architecture in the cloud). My
Rube Goldberg solution, or using one of the Pi's internal timers, would be fine.

I completely appreciate what you're saying about what's required to turn a Pi into
an RTOS (and whether or not that is technically correct, it is obviously so in
practice, as you demonstrate). But that does create a rather significant hurdle
for those who won't/don't/can't compile their own Linux kernel. I could do that,
as I have before, but I'm trying to avoid needing to. If I can get around it,
great. In the end maybe you're correct and there's no way to avoid 1+2[+3].

> Most recent test on my Pi4 with above 4 prerequisites met:

[Chris' performance results...]

> In other words, come hell or high water, the scheduling latency will
> be < 100us.
> 
> Throw in Covid-19 and the current conditions in Texas, let’s just
> say true worst case is somewhere around 200us.
> 
> If that is not good enough, then follow the advice given previously
> about normalizing / compensating.
> 
> Could somebody please show me a use case where the performance I just listed,
> coupled if necessary with normalizing the data, is not good enough?

For anything I'm doing, and anything I imagine people in this group
doing, this certainly seems good enough. It sounds like you've set
an impressive benchmark.

> Now regarding Murray’s situation – throw in python with multiple threads
> implemented via the python “threading” module, then critical prerequisite
> #1 is no longer met, because python threads don’t have priorities or 
> preemption.

Yeah, understood. That's the big mountain I'm trying to either climb or
tunnel under with my deep rabbit hole. I first (ignorantly, based on my
experience in Java) wrote everything as separate threads. I've now moved
over to a publish-subscribe architecture, and if the system clock used
for the TICK messages is reliable, I *think* things will work out fine,
even absent real, proper, thread prioritisation.

> Murray – I would still encourage you to measure your timing errors 
> after meeting prerequisite 1+2+3.   I always use a Pi4, because frankly
> the extra performance of it has a positive impact on timing, but I have
> recently built a real-time kernel that is compatible with a Pi 3.  Once
> I have done a quick sanity check on it, I will make it available for
> download.

And when you do have that available please let us on the list know -- I'll
be one of the first to give it a whirl.

Thanks much,

Murray

----
* KR01 Telerobotic Joyride
   https://youtu.be/Lw5Hz95IyBk
...........................................................................
Murray Altheim <murray18 at altheim dot com>                       = =  ===
http://www.altheim.com/murray/                                     ===  ===
                                                                    = =  ===
     In the evening
     The rice leaves in the garden
     Rustle in the autumn wind
     That blows through my reed hut.
            -- Minamoto no Tsunenobu


