[Dprglist] Timing in Python on the RPi
Chris N
netterchris at gmail.com
Thu Dec 3 11:57:54 PST 2020
Murray – this is a follow-up to our discussion about timing, in Python, on the Pi during the 12/1 RBNV meeting.
(I can get into more details and/or show the tests in action on the command line during the next RBNV if there is interest)
I ran a few tests on my Pi 4 to see if there is anything unique or different about Python when it comes to real-time performance. I mainly wanted to check whether Python performs any worse than a normal process. (I don’t believe there is a way for it to do any better than other processes.)
You mentioned time.monotonic() on the call, so I played with that a bit but in the end I used time.perf_counter(). On the Pi the two timing methods in Python seem to be equivalent, but on Windows time.perf_counter() delivers sub-millisecond granularity whereas time.monotonic() gives you a granularity of 16ms or so.
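A quick way to compare the two clocks on any given machine is to ask Python what it reports for each of them. This is only a sketch: the helper name clock_report is made up for illustration, and the reported resolution is what the OS advertises, not a measured value.

```python
import time

def clock_report(names=("monotonic", "perf_counter")):
    """Return {clock name: (underlying implementation, advertised resolution in seconds)}."""
    report = {}
    for name in names:
        info = time.get_clock_info(name)
        report[name] = (info.implementation, info.resolution)
    return report

if __name__ == "__main__":
    for name, (impl, res) in clock_report().items():
        print(f"{name:14s} {impl:34s} resolution = {res:.9f} s")
```

If both clocks show the same implementation and resolution, as I would expect on the Pi, it should not matter which one you use for this kind of test.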
I usually use a utility called “cyclictest” (apt-get install rt-tests) to evaluate real-time performance on a Linux system, and a tool called “stress” (apt-get install stress) to create a load. Cyclictest basically does a sleep() and then checks whether the thread woke up at the expected time.
It turns out that it only takes a few lines of Python code to replicate the essence of what cyclictest does, i.e.
Loop:
    t1 = timestamp();
    sleep(1 millisecond);
    t2 = timestamp();
    // keep track of min, avg, max of (t2 - t1)
// report results
Conclusion: Python is subject to the same scheduling rules and limitations as any other process under Linux, which means the following:
1. If you don’t explicitly control the scheduling class and the priority, a thread is scheduled under the standard policy, which is the “completely fair scheduler”, aka SCHED_NORMAL. This means a sleep(1 ms) will sometimes turn into a sleep(1 ms + N ms), where N is essentially unbounded and depends heavily on what else is going on. Only on a system that is 100% idle will N be close to 0 most of the time.
On my Pi, with only pi-hole and htop running in the background, my Python test reported a latency (i.e. timing error) as high as 14 milliseconds (i.e. a 1 ms sleep turned into a 15 ms sleep). Cyclictest, on the other hand, reported a worst-case latency of “only” 6 milliseconds.
2. If you do explicitly control the scheduling class and priority, for example by using the chrt utility (chrt -f 60 python3 ./t.py), you can expect your timers to be A LOT more accurate.
Both Python and cyclictest reported a worst-case timing error of about 250 microseconds during my tests.
The key here is to use scheduling class SCHED_FIFO (the -f in chrt) or SCHED_RR.
3. Even if you use SCHED_FIFO with a sufficiently high priority, the fact that standard Linux distributions are not configured for real-time becomes evident as soon as you run certain stress tests. Basically, even a low-priority process/thread that makes system calls can cause the Linux kernel to enter a critical section during which preemption (i.e. rescheduling) is disabled and from which it won’t emerge until several milliseconds later.
Both the Python test and cyclictest reported an error in the 5 millisecond range for the following memory allocation workload: “stress --vm 4 --vm-bytes 64M”
4. If you want to do better than that, you need a kernel that has the real-time patches applied and the configuration option PREEMPT_RT enabled. With this, the Linux kernel no longer enters these multi-millisecond critical sections, so it no longer gets in the way of good real-time performance. Also, the interrupt handlers of most device drivers get turned into prioritized threads, so even a poorly written device driver won’t get in the way much, provided you use a priority that is higher than that of the interrupt threads. PREEMPT_RT is required for ROS2 if you expect to benefit from the real-time improvements that ROS2 has made over ROS.
With this, latencies of <250 microseconds even under all types of stress conditions are quite doable. I have seen reports of <100 microseconds also.
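Before reading too much into test numbers, it helps to know whether the running kernel is a PREEMPT_RT build at all. The following is a rough heuristic, not an authoritative check: it assumes the kernel version string (what uname -v prints) mentions PREEMPT_RT or PREEMPT RT, and it assumes that RT-patched kernels expose a /sys/kernel/realtime file containing 1. Both function names are made up for illustration.

```python
import platform

def read_realtime_flag(path="/sys/kernel/realtime"):
    """Return the file's contents, or None if it is missing (assumed RT-kernel marker)."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None

def looks_like_rt_kernel(version_string, realtime_flag=None):
    """Heuristic: True if the flag file says 1 or the version string names PREEMPT_RT."""
    if realtime_flag is not None and realtime_flag.strip() == "1":
        return True
    # Treat "PREEMPT RT" and "PREEMPT_RT" the same way
    normalized = version_string.upper().replace(" ", "_")
    return "PREEMPT_RT" in normalized

if __name__ == "__main__":
    is_rt = looks_like_rt_kernel(platform.version(), read_realtime_flag())
    print("PREEMPT_RT kernel (heuristic):", is_rt)
```

A stock Raspberry Pi OS kernel would be expected to print False here; if it prints True on a kernel you did not build with the RT patches, treat the heuristic as wrong rather than the kernel as real-time.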
Here is the Python timing test code. The min, avg and max values are timing errors in microseconds (measured sleep time minus the requested 1000 microseconds); the last value, duration, is in seconds:
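import time

while True:
    # Statistics for one batch of 1000 sleeps, in seconds.
    # (Names chosen to avoid shadowing the built-ins min, max and sum.)
    t_min = 1000.0
    t_max = 0.0
    t_sum = 0.0
    i = 0
    T0 = time.perf_counter()
    while i < 1000:
        i = i + 1
        t1 = time.perf_counter()
        time.sleep(0.001)
        t2 = time.perf_counter()
        t3 = t2 - t1          # how long the 1 ms sleep actually took
        if t3 > t_max:
            t_max = t3
        if t3 < t_min:
            t_min = t3
        t_sum = t_sum + t3
    t_avg = t_sum / i
    # Convert to microseconds and subtract the requested 1000 us to get the error
    avg_us = (t_avg * 1000000) - 1000
    min_us = (t_min * 1000000) - 1000
    max_us = (t_max * 1000000) - 1000
    T1 = time.perf_counter()
    duration = T1 - T0
    print("min,avg,max,duration = %7.1f , %7.1f , %7.1f , %3.1f" % (min_us, avg_us, max_us, duration))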
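The chrt invocation from item 2 can also be approximated from inside the script, which may be handy if you don't want to rely on how the script gets launched. This is a sketch under a few assumptions: it is Linux-only, it needs root or the CAP_SYS_NICE capability, the priority of 60 simply mirrors the chrt example, and the helper name try_enable_fifo is made up.

```python
import os

def try_enable_fifo(priority=60):
    """Try to move the calling process to SCHED_FIFO; return True on success."""
    if not hasattr(os, "sched_setscheduler"):
        return False  # scheduling API not available on this platform
    try:
        # pid 0 means "the calling process"
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically a permission error when not running with enough privileges
        return False
    return True

if __name__ == "__main__":
    if try_enable_fifo():
        print("now running under SCHED_FIFO")
    else:
        print("could not switch to SCHED_FIFO, staying on the default policy")
```

Calling try_enable_fifo() once before the measurement loop should give results comparable to launching the unmodified script through chrt, though I have not compared the two side by side.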