[Dprglist] PID-tuned Clock in Python?

Tue Feb 9 01:06:53 PST 2021

Hi,

Executive Summary: This has to do with Publish-Subscribe message buses,
PID controllers, the accuracy of system clocks, Python clocks, threads
and timing loops, and yes, I know that last subject tickles the fancy
of those using microcontrollers where these kinds of problems never happen. :-)

----

Okay, my (Murray's) Robot Operating System ("ROS"), written in Python,

   https://github.com/ifurusato/ros

includes a Clock class. It's basically a configurable system clock
that publishes TICK messages at a fixed rate and a TOCK message every
20 TICKs. Things I want to run at 20Hz listen for TICK messages; things
that need to happen less frequently (1Hz) listen for TOCKs.

The Clock's messages are sent into a message bus, and other Python
classes can subscribe to these messages, a typical Publish-Subscribe
model. In this way I have gotten around using a whole bunch of separate
Threads, especially given threads in Python don't quite work as they do
in Java. For example, I have an instance of my Velocity class attached
to each of the instances of my Motor class, and those TICK messages
provide the timing information so I can do:

    _velocity = motor.velocity

and get a value back in cm/sec. The Motor's PID and Velocity classes
receive and respond to these TICK messages and don't contain their own
timing loops, as they used to when I was using separate Threads for
everything. In theory, if I were truly disciplined, there'd be just that
single Clock Thread in the whole of my ROS. In reality it's not quite
possible, but the exceptions aren't worth noting (e.g., GamePad). Point
is, I used to have a separate Thread for each of my PID loops and would
have had two more for my Velocity loops; now they all just respond to
that Clock TICK. No loops at all, except the one in Clock.
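
To make the TICK/TOCK idea concrete, here is a minimal sketch of a bus
and a clock. It is not the code from the ros repository: MessageBus,
step() and the topic strings are hypothetical names, and the bus calls
its subscribers synchronously for brevity, which the real bus may not.

```python
from collections import defaultdict

class MessageBus:
    """Hypothetical minimal publish-subscribe bus (not the ros repo's API)."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload=None):
        # Synchronous dispatch: each callback runs before the next one.
        for callback in list(self._subscribers[topic]):
            callback(payload)

class Clock:
    """Publishes a TICK on every step and a TOCK on every 20th TICK."""
    def __init__(self, bus, tock_modulo=20):
        self._bus = bus
        self._tock_modulo = tock_modulo
        self._count = 0

    def step(self):
        # A real clock would call this from its single timing loop.
        self._count += 1
        self._bus.publish('TICK', self._count)
        if self._count % self._tock_modulo == 0:
            self._bus.publish('TOCK', self._count)

bus = MessageBus()
ticks, tocks = [], []
bus.subscribe('TICK', ticks.append)   # a 20Hz consumer
bus.subscribe('TOCK', tocks.append)   # a 1Hz consumer
clock = Clock(bus)
for _ in range(40):                   # two seconds' worth of TICKs at 20Hz
    clock.step()
print(len(ticks), len(tocks))
```

Keeping the counting in step() separate from any sleeping makes the
TICK/TOCK logic testable without waiting on a real timer.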

This is somewhat akin to the kinds of master control loops we see on a
microcontroller, such as David Anderson's SR04 loop:

     void sensors()                      // SR04 20 Hz sensor loop
     {
         INT32 t;
         mseconds(&t);                   // set local 32 bit timer t to now

         while(1) {                      // endless loop
             if (srat_system & ARBITRATE) {

                     speedctrl();       // PID
                     slewspeed_task();  // velocity profiler
                     prowl_task();      // seek navigation coordinates
                     bumper_task();     // ballistic collision recovery
                     photo_task();      // seek/avoid light
                     ir_task();         // seek/avoid IR reflections
                     sonar_task();      // seek/avoid sonar reflections
                     motion_task();     // motion detector
                     xlate_task();      // rotate and scan
                     behaviors_task();  // scan for soda can profiles
                     passive_task();    // jump toward/away from movement
                     seek_task();       // seek/avoid IR beacon
                     boundary_task();   // detect virtual walls with odometry
                     feelers();         // gripper grasping reflex
                     arbitrate();       // send highest priority to motors
              }
              tsleep(&t,50);            // suspend in multitasking queue
         }                              // for the remainder of 50 ms, then loop
     }

except that with the Publish-Subscribe model we don't have to wait for
each of those task() calls to return, as my tasks now just subscribe to
the TICK messages of the system Clock. Adding a new task means that upon
instantiation it just subscribes to the message bus and listens for TICKs.
It can also publish to the bus itself. So a subsumption loop could be
considered as a collection of tasks that publish and subscribe over a
common message bus, where messages can set suppression/inhibition flags
on other tasks while running ballistic tasks. This would either replace
or augment a multi-tasking OS or true multi-threaded system.
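
As a rough illustration of that last idea, the sketch below has two
behaviours proposing commands over a bus, with a SUPPRESS message
setting a flag on one of them. Every name here (Bus, Behavior, Arbiter,
the topic strings, the priorities) is made up for the example, and
dispatch is synchronous to keep it short.

```python
from collections import defaultdict

class Bus:
    """Hypothetical minimal bus with synchronous dispatch."""
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, topic, fn):
        self._subs[topic].append(fn)

    def publish(self, topic, payload=None):
        for fn in list(self._subs[topic]):
            fn(payload)

class Behavior:
    """Proposes a motor command on every TICK unless suppressed."""
    def __init__(self, bus, name, priority, command):
        self.name, self.priority, self.command = name, priority, command
        self.suppressed = False
        self._bus = bus
        bus.subscribe('TICK', self._on_tick)
        bus.subscribe('SUPPRESS', self._on_suppress)

    def _on_suppress(self, payload):
        name, flag = payload
        if name == self.name:
            self.suppressed = flag

    def _on_tick(self, _payload):
        if not self.suppressed:
            self._bus.publish('PROPOSAL',
                              (self.priority, self.name, self.command))

class Arbiter:
    """Collects proposals and returns the highest-priority one."""
    def __init__(self, bus):
        self._proposals = []
        bus.subscribe('PROPOSAL', self._proposals.append)

    def arbitrate(self):
        winner = max(self._proposals, key=lambda p: p[0], default=None)
        self._proposals.clear()
        return winner

bus = Bus()
arbiter = Arbiter(bus)
Behavior(bus, 'cruise', 1, 'forward')
Behavior(bus, 'bumper', 9, 'reverse')

bus.publish('TICK')
first = arbiter.arbitrate()                 # bumper outranks cruise
bus.publish('SUPPRESS', ('bumper', True))   # a message sets the flag
bus.publish('TICK')
second = arbiter.arbitrate()                # only cruise proposes now
print(first, second)
```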

The Clock on my robots is currently configured to run at 20Hz, a 50ms
period. The Clock class runs a single Thread loop that uses a Rate
class internally as its timer. You configure Rate's frequency in Hertz
and its wait() function pads the time between calls so that each pass
through the loop matches the requested period. If the elapsed time
already exceeds the set period it pads with zero and simply returns. I
use this Rate class
in a lot of places where it might pad zero, but in the case of the Clock
this never happens, so if I set the Clock's Rate to 20Hz, I get a nice
20Hz loop sending TICKs, and every 20 of those a TOCK.
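
A pacer of that kind can be sketched in a few lines. This is only my
guess at the general shape, assuming time.monotonic() and time.sleep();
it is not the contents of lib/rate.py, and the return value is added
here purely so the padding can be inspected.

```python
import time

class Rate:
    """Hypothetical fixed-rate pacer; not the ros repo's rate.py."""
    def __init__(self, hertz):
        self._period_s = 1.0 / hertz
        self._last = time.monotonic()

    def wait(self):
        """Pad out the rest of the period; return the padding in seconds."""
        elapsed = time.monotonic() - self._last
        remaining = self._period_s - elapsed
        if remaining > 0:
            time.sleep(remaining)
        # Restart the period from "now". Any oversleep is not recovered,
        # so in this sketch it shows up as a slightly long loop period.
        self._last = time.monotonic()
        return max(remaining, 0.0)

if __name__ == "__main__":
    rate = Rate(20)
    for _ in range(5):
        # ... loop body would go here ...
        rate.wait()
```

Because this version restarts its period after each sleep, small
overshoots in time.sleep() accumulate rather than cancel. Scheduling
against an absolute deadline (last deadline plus one period) is one
alternative that avoids that particular drift.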

In any case, I noticed that my Clock's rate is not *quite* 20Hz. The
actual period varies just slightly, maybe 50.10300ms instead of 50.00000ms,
sometimes worse [*]. I've run this on my 3.5GHz Intel i7 workstation, on
a MacBook 12 inch, and on a Raspberry Pi 3 B+. Surprisingly, the MacBook
is by far the worst offender, my desktop i7 the most consistent, the
Raspberry Pi closer to my i7 than to the MacBook, i.e., not so bad.
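
For anyone wanting to reproduce this kind of measurement, the sketch
below timestamps a plain sleep loop and summarises the periods. The
function names are made up for the example, and a bare time.sleep()
loop stands in for the real Clock, so the numbers will differ from mine.

```python
import statistics
import time

def period_stats_ms(timestamps_ns):
    """Return (mean, population std dev, worst) loop period in ms."""
    periods = [(b - a) / 1e6
               for a, b in zip(timestamps_ns, timestamps_ns[1:])]
    return statistics.mean(periods), statistics.pstdev(periods), max(periods)

def sample_loop(hertz=20, count=20):
    """Timestamp a naive sleep loop; a stand-in for the real Clock."""
    period_s = 1.0 / hertz
    stamps = []
    for _ in range(count):
        stamps.append(time.perf_counter_ns())
        time.sleep(period_s)
    return stamps

if __name__ == "__main__":
    mean_ms, sd_ms, worst_ms = period_stats_ms(sample_loop())
    print(f"mean {mean_ms:.5f}ms  sd {sd_ms:.5f}ms  worst {worst_ms:.5f}ms")
```

For scale, taking the 50.103ms figure at face value: that is about 0.2%
slow, or roughly 19.96Hz, and over an hour's worth of TICKs (72,000 of
them) the extra 0.103ms per TICK adds up to about 7.4 seconds.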

The idea here was that if I noticed that the 50ms loop was consistently
0.5ms too slow, I could set a trim value of -0.5ms on the Rate so that
the overall loop was closer to 50.0000ms. Or something like that.

So after thinking about capturing that error and adding a trim value to
the Rate class (as used by Clock), I realised that I was basically doing
the Proportional part of a PID controller, so I cleaned up the PID class
I'm using for my motor control so that it was more generic and actually
stuck it into the Clock. So there's now a boolean flag "enable_trim" for
the Clock: when True it enables the PID loop trim function. I'm still
playing around with its tuning.
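
Here is a small simulation of the trim idea, not the Clock code itself.
It assumes a very simple model in which every loop runs a constant
0.103ms long, uses a textbook PID with made-up gains, and works in
simulated time so nothing actually sleeps. PID, simulate() and all the
parameter names are hypothetical.

```python
class PID:
    """Textbook positional PID; the gains used below are illustrative."""
    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self._integral = 0.0
        self._last_error = None

    def update(self, measured, dt):
        error = self.setpoint - measured
        self._integral += error * dt
        if self._last_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._last_error) / dt
        self._last_error = error
        return (self.kp * error + self.ki * self._integral
                + self.kd * derivative)

def simulate(kp, ki, lag_ms=0.103, target_ms=50.0, ticks=200):
    """Assumed model: measured period = target + trim + constant lag."""
    pid = PID(kp, ki, 0.0, setpoint=target_ms)
    dt = target_ms / 1000.0          # PID called once per TICK, in seconds
    trim_ms = 0.0
    measured_ms = target_ms
    for _ in range(ticks):
        measured_ms = target_ms + trim_ms + lag_ms
        trim_ms = pid.update(measured_ms, dt)
    return measured_ms, trim_ms

p_only = simulate(kp=0.5, ki=0.0)    # proportional term alone
pi = simulate(kp=0.2, ki=10.0)       # proportional plus integral
print("P only: period %.5fms, trim %.5fms" % p_only)
print("PI:     period %.5fms, trim %.5fms" % pi)
```

In this model the proportional term alone settles with part of the lag
still uncorrected, while adding the integral term drives the trim to
the full -0.103ms. That only says something about a constant offset; it
does not model jitter.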

So, raising the first question: does this make any sense?! Does it make
sense to use the error from a system (Python) clock that is known to be
somewhat inconsistent in order to try to provide a not-perfect-but-improved
ROS Clock? I'm guessing this might work for "consistent" time errors (such
as a consistent lag) but not for intermittent clock burst errors, where
e.g., for one clock cycle it's suddenly 49ms or 52ms. I realise that
ideally the thing fixing the Clock would be better than the Clock, not
using the same underlying system clock to try to fix the Clock, in situ.
A bit of Alice and a rabbit hole here.

And the second question: I'm running my Clock at 20Hz. Can the PID loop
used to trim the Clock run at the same 20Hz rate, or does it need to be
running significantly faster (or slower)? Currently I'm just capturing
the error on every Clock loop (every TICK) and calling the PID for the
trim adjustment at 20Hz.

I figure this _at_least_ ticks the box of being PID-related...

Cheers,

Murray

[*] Is a 0.1ms error on 50ms really significant? Maybe not. Maybe none of
    this trimming and complexity is necessary. Sometimes I do go down
    rabbit holes just to see what's down there.

PS. I won't be able to attend tomorrow's RBNV meeting, but thought I'd
    send this to the mailing list for discussion.

----
The code for this is as follows (all Python3, notably all in flux):

Test class:   https://github.com/ifurusato/ros/blob/master/clock_test.py
Clock class:  https://github.com/ifurusato/ros/blob/master/lib/clock.py
Rate class:   https://github.com/ifurusato/ros/blob/master/lib/rate.py
PID class:    https://github.com/ifurusato/ros/blob/master/lib/pid.py
...........................................................................
Murray Altheim <murray18 at altheim dot com>                       = =  ===
http://www.altheim.com/murray/                                     ===  ===
                                                                    = =  ===
     In the evening
     The rice leaves in the garden
     Rustle in the autumn wind
     That blows through my reed hut.
            -- Minamoto no Tsunenobu