<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
.MsoChpDefault
{mso-style-type:export-only;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:575550502;
mso-list-type:hybrid;
mso-list-template-ids:-1048514904 -1 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
{mso-level-text:"%1\)";
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1
{mso-list-id:2117090485;
mso-list-type:hybrid;
mso-list-template-ids:1660742330 -1 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l1:level1
{mso-level-text:%1;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
Murray – this is a follow-up to our discussion about timing, in Python, on the Pi during the 12/1 RBNV meeting.

(I can get into more details and/or show the tests in action on the command line during the next RBNV if there is interest.)

I ran a few tests on my Pi 4 to see whether there is anything unique or different about Python when it comes to real-time performance. I mainly wanted to check whether Python performs any worse than a normal process. (I don't believe there is a way for it to do any better than other processes.)

You mentioned time.monotonic() on the call, so I played with that a bit, but in the end I used time.perf_counter(). On the Pi the two timing functions appear to be equivalent, but on Windows time.perf_counter() delivers sub-millisecond granularity whereas time.monotonic() only gives you a granularity of roughly 16 ms.

I usually use a utility called "cyclictest" (apt-get install rt-tests) to evaluate real-time performance on a Linux system and a tool called "stress" (apt-get install stress) to create a load. Cyclictest basically does a sleep() and then checks whether the thread woke up at the expected time.

It turns out that it only takes a few lines of Python code to replicate the essence of what cyclictest does, i.e.:

    Loop:
        t1 = timestamp();
        sleep(1 millisecond);
        t2 = timestamp();
        // keep track of min, avg, max
    // report results

Conclusion: Python is bound by the same scheduling laws and limitations as every other process under Linux, which means the following:

1.

If you don't explicitly control the scheduling class and the priority, a thread is scheduled under the standard policy, the "completely fair scheduler" aka SCHED_NORMAL. This means a sleep(1 ms) will sometimes turn into a sleep(1 ms + N ms), where N is essentially unbounded and depends heavily on what else is going on; only on a system that is 100% idle will N be close to 0 most of the time.

On my Pi, with only pi-hole and htop running in the background, my Python test reported a latency (i.e. timing error) as high as 14 milliseconds (i.e. a sleep(1) turned into a sleep(15)). Cyclictest, on the other hand, reported a worst-case latency that was "only" 6 milliseconds.

2.

If you do explicitly control the scheduling class and priority, for example by using the chrt utility (chrt -f 60 python3 ./t.py), you can expect your timers to behave with A LOT more accuracy. (The same request can also be made from inside Python itself; see the sketch right below.)
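This is not part of my test code, just a minimal sketch for reference: os.sched_setscheduler() lets the script ask for SCHED_FIFO on its own instead of being wrapped in chrt. Linux only, and it needs root or CAP_SYS_NICE:

    import os

    # Ask the kernel for SCHED_FIFO at priority 60, same effect as "chrt -f 60".
    # Linux only; requires root or CAP_SYS_NICE.
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(60))
    except PermissionError:
        print("warning: could not switch to SCHED_FIFO, staying on SCHED_NORMAL")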
Both the Python test and cyclictest reported a worst-case timing error of about 250 microseconds during my tests.

The key here is to use scheduling class SCHED_FIFO (the -f in chrt) or SCHED_RR.

3.

Even if you use SCHED_FIFO with a sufficiently high priority, the fact that standard Linux distributions are not configured for real-time becomes immediately evident as soon as you run certain stress tests. Basically, even a low-priority process/thread that makes system calls can cause the Linux kernel to enter a critical section from which it won't emerge until several milliseconds later and during which preemption (i.e. rescheduling) is disabled.

Both the Python test and cyclictest reported an error in the 5 millisecond range for the following memory allocation workload: "stress --vm 4 --vm-bytes 64M".

4.

If you want to do better than that, you need a kernel that has the real-time patches applied and the configuration option PREEMPT_RT enabled. With this, the Linux kernel no longer enters these multi-millisecond critical sections, so it doesn't get much in the way of good real-time performance. Also, the interrupt handlers of most device drivers get turned into prioritized threads, so even a poorly written device driver won't get much in the way, provided you use a priority higher than that of the interrupt threads. PREEMPT_RT is required for ROS2 if you expect to benefit from the real-time improvements that ROS2 has made over ROS.

With this, latencies of <250 microseconds even under all types of stress conditions are quite doable. I have seen reports of <100 microseconds as well.
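For reference, the command lines for this kind of measurement look roughly like the following (flags from memory, so double-check the man pages):

    # in one terminal: the memory-allocation load
    stress --vm 4 --vm-bytes 64M

    # in another terminal: measure wakeup latency with one thread at
    # SCHED_FIFO priority 60, waking every 1000 microseconds
    sudo cyclictest -t1 -p 60 -i 1000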
Here is the Python timing test code. Results are in microseconds, except for the last value (the duration of one 1000-iteration pass), which is in seconds; min/avg/max are the error relative to the requested 1 ms sleep:

    import time

    while True:
        min = 1000.0
        max = 0.0
        avg = 0.0
        sum = 0.0
        i = 0
        T0 = time.perf_counter()
        while i < 1000:
            i = i + 1
            t1 = time.perf_counter()
            time.sleep(0.001)          # ask for a 1 ms sleep
            t2 = time.perf_counter()
            t3 = t2 - t1               # how long the sleep actually took
            if t3 > max:
                max = t3
            if t3 < min:
                min = t3
            sum = sum + t3

        avg = sum / i

        # convert to microseconds and subtract the requested 1000 us
        avg = (avg * 1000000) - 1000
        min = (min * 1000000) - 1000
        max = (max * 1000000) - 1000
        T1 = time.perf_counter()
        duration = T1 - T0
        print("min,avg,max,duration = %7.1f , %7.1f , %7.1f , %3.1f" % (min, avg, max, duration))
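To reproduce the scenarios above, run the script once as a normal process (python3 ./t.py) and once under the real-time scheduler (chrt -f 60 python3 ./t.py, typically as root), optionally with the stress workload running in a second terminal.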