Orocos 2.6, Xenomai 2.6.2 and SIGXCPU (CPU time limit exceeded)

Hi

I have just installed Orocos 2.6 on a fresh Xenomai 2.6.2 revision hosted by Ubuntu 12.10 x86_64. A problem appears when I run helloWord example

WARNING: You are not root. This program *may* require that you are root. 0.045 [ Info ][main()] ./helloworld manually raises LogLevel to 'Info' (5). See also file 'orocos.log'. 0.045 [ Info ][main()] **** Creating the 'Hello' component **** 0.045 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_OTHER, priority=1, CPU affinity=0, with name='Hello' 0.046 [ Info ][Hello] Thread created with scheduler type 'ORO_SCHED_OTHER', priority 1, cpu affinity 0 and period 0 (PID= 0 ). 0.046 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_OTHER, priority=1, CPU affinity=0, with name='GlobalEngine' 0.046 [ Info ][GlobalEngine] Thread created with scheduler type 'ORO_SCHED_OTHER', priority 1, cpu affinity 0 and period 0 (PID= 0 ). 0.047 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_RT, priority=0, CPU affinity=0, with name='Activity' 0.047 [ Info ][Activity] Thread created with scheduler type 'ORO_SCHED_RT', priority 0, cpu affinity 0 and period 0.1 (PID= 0 ). CPU time limit exceeded (core dumped)

The same problem appears during exiting from deployer-xenomai (I am able to run deployer without problems).

Last few lines from strace ./helloWord are following

write(3, "0.060 [ Info ][GlobalEngine] T"..., 136) = 136 --- SIGWINCH (Window changed) @ 0 (0) --- sched_setscheduler(6505, SCHED_OTHER, { 0 }) = 0 rt_sigreturn(0x1969) = 0 write(2, "0.061 [ Info ][Thread] ", 250.061 [ Info ][Thread] ) = 25 write(2, "Creating Thread for scheduler=OR"..., 92Creating Thread for scheduler=ORO_SCHED_RT, priority=0, CPU affinity=0, with name='Activity') = 92 write(2, "\n", 1 ) = 1 write(3, "0.061 [ Info ][Thread] Creatin"..., 118) = 118 sched_get_priority_min(SCHED_OTHER) = 0 sched_get_priority_max(SCHED_OTHER) = 0 mmap(NULL, 128000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f50358f7000 mprotect(0x7f50358f7000, 4096, PROT_NONE) = 0 sched_get_priority_min(SCHED_OTHER) = 0 sched_get_priority_max(SCHED_OTHER) = 0 clone(child_stack=0x7f50359153b0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f5035915dd0, tls=0x7f5035915b00, child_tidptr=0x7f5035915dd0) = 6509 sched_setscheduler(6509, SCHED_OTHER, { 0 }) = 0 futex(0x7f5035916118, FUTEX_WAKE_PRIVATE, 1) = 1 write(2, "0.062 [ Info ][Activity] ", 270.062 [ Info ][Activity] ) = 27 write(2, "Thread created with scheduler ty"..., 103Thread created with scheduler type 'ORO_SCHED_RT', priority 0, cpu affinity 0 and period 0.1 (PID= 0 ).) = 103 write(2, "\n", 1 ) = 1 write(3, "0.062 [ Info ][Activity] Threa"..., 131) = 131 --- SIGXCPU (CPU time limit exceeded) @ 0 (0) --- rt_sigaction(SIGXCPU, {SIG_DFL, [], SA_RESTORER, 0x7f5034ca0cb0}, NULL, 8) = 0 tgkill(6505, 6505, SIGXCPU) = 0 rt_sigreturn(0x1969) = 0 --- SIGXCPU (CPU time limit exceeded) @ 0 (0) ---

+++ killed by SIGXCPU (core dumped) +++

Backtrace from ddd ./helloWord

  1. 14 0x00000000004a8092 in main (argc=1, argv=0x7fffffffddc8) at /worek/install/orocos-toolchain-xeno/ocl/helloworld/HelloWorld.cpp:159
  2. 13 0x00000000004a8315 in ORO_main_impl (argc=1, argv=0x7fffffffddc8) at /worek/install/orocos-toolchain-xeno/ocl/helloworld/HelloWorld.cpp:172
  3. 12 0x00000000004a9d82 in OCL::HelloWorld::HelloWorld (this=0x7fffffffd5d0, name=...) at /worek/install/orocos-toolchain-xeno/ocl/helloworld/HelloWorld.cpp:123
  4. 11 0x00007ffff7ae9c52 in RTT::TaskContext::setActivity (this=0x7fffffffd5d0, new_act=0x737000) at /worek/install/orocos-toolchain-xeno/rtt/rtt/TaskContext.cpp:357
  5. 10 boost::shared_ptr<RTT::base::ActivityInterface>::operator= (this=this@entry=0x7fffffffd6d8, r=...) at /usr/include/boost/smart_ptr/shared_ptr.hpp:311
  6. 9 ~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/boost/smart_ptr/shared_ptr.hpp:164
  7. 8 ~shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:305
  8. 7 0x00007ffff7af4a8e in release (this=0x771680) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:145
  9. 6 0x00007ffff7adc259 in RTT::Activity::~Activity (this=0x73f080, __in_chrg=<optimized out>) at /worek/install/orocos-toolchain-xeno/rtt/rtt/Activity.cpp:83
  10. 5 0x00007ffff7adc207 in RTT::Activity::~Activity (this=0x73f080, __in_chrg=<optimized out>) at /worek/install/orocos-toolchain-xeno/rtt/rtt/Activity.cpp:80
  11. 4 RTT::os::Thread::~Thread (this=0x73f090, __in_chrg=<optimized out>) at /worek/install/orocos-toolchain-xeno/rtt/rtt/os/Thread.cpp:326
  12. 3 ~MutexRecursive (this=0x73f0d8, __in_chrg=<optimized out>) at /worek/install/orocos-toolchain-xeno/rtt/rtt/os/Mutex.hpp:230
  13. 2 unlock (this=0x73f0d8) at /worek/install/orocos-toolchain-xeno/rtt/rtt/os/Mutex.hpp:242
  14. 1 0x00007ffff7b28ca9 in rtos_mutex_rec_unlock (m=0x73f0e0) at /worek/install/orocos-toolchain-xeno/rtt/rtt/os/xenomai/fosi.h:316
  15. 0 0x00007ffff7522f5b in rt_mutex_release () from /usr/xenomai/lib/libnative.so.3

Could you please point me how can I solve this problem?

Best regards, Mariusz

The latest working Xenomai version with OROCOS

Hi

Because I have problem with setting up Xenomai 2.6.x with OROCOS 2.6, could someone please tell what is the latest working Xenomai version with OROCOS? I mean which compile and run without problems. Thank you in advance.

Mariusz

The latest working Xenomai version with OROCOS

On Mon, Jan 7, 2013 at 9:39 PM, <mariuszjaniak [..] ...> wrote:

> Hi
>
> Because I have problem with setting up Xenomai 2.6.x with OROCOS 2.6, could
> someone please tell what is the latest working Xenomai version with
> OROCOS? I
> mean which compile and run without problems. Thank you in advance.
>

I've only tested it against 2.5.x, but I'm willing to help debugging with
2.6.x

Peter

Re: Orocos 2.6, Xenomai 2.6.2 and SIGXCPU

mjaniak wrote:

Hi

I have just installed Orocos 2.6 on a fresh Xenomai 2.6.2 revision hosted by Ubuntu 12.10 x86_64. A problem appears when I run helloWord example

WARNING: You are not root. This program *may* require that you are root.
0.045 [ Info ][main()] ./helloworld manually raises LogLevel to 'Info' (5). See also file 'orocos.log'.
0.045 [ Info ][main()] **** Creating the 'Hello' component ****
0.045 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_OTHER, priority=1, CPU affinity=0, with name='Hello'
0.046 [ Info ][Hello] Thread created with scheduler type 'ORO_SCHED_OTHER', priority 1, cpu affinity 0 and period 0 (PID= 0 ).
0.046 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_OTHER, priority=1, CPU affinity=0, with name='GlobalEngine'
0.046 [ Info ][GlobalEngine] Thread created with scheduler type 'ORO_SCHED_OTHER', priority 1, cpu affinity 0 and period 0 (PID= 0 ).
0.047 [ Info ][Thread] Creating Thread for scheduler=ORO_SCHED_RT, priority=0, CPU affinity=0, with name='Activity'
0.047 [ Info ][Activity] Thread created with scheduler type 'ORO_SCHED_RT', priority 0, cpu affinity 0 and period 0.1 (PID= 0 ).
CPU time limit exceeded (core dumped)

The same problem appears during exiting from deployer-xenomai (I am able to run deployer without problems).

Last few lines from strace ./helloWord are following

write(3, "0.060 [ Info ][GlobalEngine] T"..., 136) = 136
--- SIGWINCH (Window changed) @ 0 (0) ---
sched_setscheduler(6505, SCHED_OTHER, { 0 }) = 0
rt_sigreturn(0x1969) = 0
write(2, "0.061 [ Info ][Thread] ", 250.061 [ Info ][Thread] ) = 25
write(2, "Creating Thread for scheduler=OR"..., 92Creating Thread for scheduler=ORO_SCHED_RT, priority=0, CPU affinity=0, with name='Activity') = 92
write(2, "\n", 1
) = 1
write(3, "0.061 [ Info ][Thread] Creatin"..., 118) = 118
sched_get_priority_min(SCHED_OTHER) = 0
sched_get_priority_max(SCHED_OTHER) = 0
mmap(NULL, 128000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f50358f7000
mprotect(0x7f50358f7000, 4096, PROT_NONE) = 0
sched_get_priority_min(SCHED_OTHER) = 0
sched_get_priority_max(SCHED_OTHER) = 0
clone(child_stack=0x7f50359153b0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f5035915dd0, tls=0x7f5035915b00, child_tidptr=0x7f5035915dd0) = 6509
sched_setscheduler(6509, SCHED_OTHER, { 0 }) = 0
futex(0x7f5035916118, FUTEX_WAKE_PRIVATE, 1) = 1
write(2, "0.062 [ Info ][Activity] ", 270.062 [ Info ][Activity] ) = 27
write(2, "Thread created with scheduler ty"..., 103Thread created with scheduler type 'ORO_SCHED_RT', priority 0, cpu affinity 0 and period 0.1 (PID= 0 ).) = 103
write(2, "\n", 1
) = 1
write(3, "0.062 [ Info ][Activity] Threa"..., 131) = 131
--- SIGXCPU (CPU time limit exceeded) @ 0 (0) ---
rt_sigaction(SIGXCPU, {SIG_DFL, [], SA_RESTORER, 0x7f5034ca0cb0}, NULL, 8) = 0
tgkill(6505, 6505, SIGXCPU) = 0
rt_sigreturn(0x1969) = 0
--- SIGXCPU (CPU time limit exceeded) @ 0 (0) ---
+++ killed by SIGXCPU (core dumped) +++

Backtrace from ddd ./helloWord

#14 0x00000000004a8092 in main (argc=1, argv=0x7fffffffddc8) at /worek/install/orocos-toolchain-xeno/ocl/helloworld/HelloWorld.cpp:159
#13 0x00000000004a8315 in ORO_main_impl (argc=1, argv=0x7fffffffddc8) at /worek/install/orocos-toolchain-xeno/ocl/helloworld/HelloWorld.cpp:172
#12 0x00000000004a9d82 in OCL::HelloWorld::HelloWorld (this=0x7fffffffd5d0, name=...) at /worek/install/orocos-toolchain-xeno/ocl/helloworld/HelloWorld.cpp:123
#11 0x00007ffff7ae9c52 in RTT::TaskContext::setActivity (this=0x7fffffffd5d0, new_act=0x737000) at /worek/install/orocos-toolchain-xeno/rtt/rtt/TaskContext.cpp:357
#10 boost::shared_ptr::operator= (this=this@entry=0x7fffffffd6d8, r=...) at /usr/include/boost/smart_ptr/shared_ptr.hpp:311
#9 ~shared_ptr (this=, __in_chrg=) at /usr/include/boost/smart_ptr/shared_ptr.hpp:164
#8 ~shared_count (this=, __in_chrg=) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:305
#7 0x00007ffff7af4a8e in release (this=0x771680) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:145
#6 0x00007ffff7adc259 in RTT::Activity::~Activity (this=0x73f080, __in_chrg=) at /worek/install/orocos-toolchain-xeno/rtt/rtt/Activity.cpp:83
#5 0x00007ffff7adc207 in RTT::Activity::~Activity (this=0x73f080, __in_chrg=) at /worek/install/orocos-toolchain-xeno/rtt/rtt/Activity.cpp:80
#4 RTT::os::Thread::~Thread (this=0x73f090, __in_chrg=) at /worek/install/orocos-toolchain-xeno/rtt/rtt/os/Thread.cpp:326
#3 ~MutexRecursive (this=0x73f0d8, __in_chrg=) at /worek/install/orocos-toolchain-xeno/rtt/rtt/os/Mutex.hpp:230
#2 unlock (this=0x73f0d8) at /worek/install/orocos-toolchain-xeno/rtt/rtt/os/Mutex.hpp:242
#1 0x00007ffff7b28ca9 in rtos_mutex_rec_unlock (m=0x73f0e0) at /worek/install/orocos-toolchain-xeno/rtt/rtt/os/xenomai/fosi.h:316
#0 0x00007ffff7522f5b in rt_mutex_release () from /usr/xenomai/lib/libnative.so.3

Could you please point me how can I solve this problem?

Best regards,
Mariusz

After discussion with Gilles from Xenomai team about this problem (http://www.xenomai.org/pipermail/xenomai/2012-December/027156.html) I have prepared simple test case. For this reason I went through OROCOS internals and try to track down Xenomai system calls issued by OROCOS during component construction. I end up when component receive SIGXCPU signal, when default TaskContext activity is releasing mutex (breaker) in thread destructor associated with this activity. Code is bellow

/****************************************************************************
* mutexTest.c
* Xenomai mutex test for SIGXCPU
* Mariusz Janiak
* Wroclaw 2013
*****************************************************************************/

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <execinfo.h>
#include <sched.h>
#include <native/task.h>
#include <native/mutex.h>
#include <native/sem.h>

#define ORO_SCHED_OTHER 1 /** Soft real-time */

#ifdef UNUSED
#elif defined(__GNUC__)
# define UNUSED(x) UNUSED_ ## x __attribute__((unused))
#else
# define UNUSED(x) x
#endif

void test(void *UNUSED(arg));
void warn_upon_switch(int UNUSED(sig));

RT_TASK mainTask, testTask;
RT_MUTEX mutex;
RT_SEM sem;

void warn_upon_switch(int sig __attribute__((unused)))
{
void *bt[32];
int nentries;

nentries = backtrace(bt, sizeof(bt) / sizeof(bt[0]));
backtrace_symbols_fd(bt, nentries, fileno(stderr));
}

void test(void *UNUSED(arg))
{
/* thread_function() in Thread.cpp:83 -- Thread::configure() in
Thread.cpp:489 -- rtos_task_set_period(...) in fosi_internal.cpp:387 --
rtos_task_make_periodic(...) in fosi_internal.cpp:378*/
rt_task_set_periodic(NULL, TM_NOW, TM_INFINITE);
/* thread_function() in Thread.cpp:86 -- rtos_sem_signal(...) in fosi.h:188*/
rt_sem_v(&sem);
/* thread_function() in Thread.cpp:89 -- MutexLock */
rt_mutex_acquire(&mutex, TM_INFINITE);
rt_mutex_release(&mutex);
/* thread_function() in Thread.cpp:116 -- rtos_sem_wait(...) in fosi.h:194 */
rt_sem_p(&sem, TM_INFINITE);
/* Do something */
rt_task_suspend(NULL);
}

int main(int UNUSED(argc), char *UNUSED(argv[]))
{
int ret=0;
struct sched_param param;
struct sigaction sa;

/* rtos_task_create_main(...) in fosi_internal.cpp:82 */
mlockall(MCL_CURRENT|MCL_FUTURE);
/* rtos_task_create_main(...) in fosi_internal.cpp:91 */
param.sched_priority = sched_get_priority_max(ORO_SCHED_OTHER);
if (param.sched_priority != -1 ){
sched_setscheduler(0, ORO_SCHED_OTHER, &param);
}
/* rtos_task_create_main(...) in fosi_internal.cpp:102 */
ret = rt_task_shadow(&mainTask, "MutexTest", 0, 0);
if(ret < 0){
printf("ERROR: rt_task_shadow(...)\n");
return -1;
}
/* rtos_task_create_main(...) in fosi_internal.cpp:162 */
sa.sa_sigaction = my_warn_upon_switch;
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sigaction(SIGXCPU, &sa, 0);

/* Thread::Thread(...) in Thread.cpp:238 -- automatic in constructor */
rt_mutex_create(&mutex, "breaker");
/* Thread::setup(...) in Thread.cpp:257 -- in MutexLock constructor */
ret = rt_mutex_acquire(&mutex, TM_INFINITE);
/* Thread::setup(...) in Thread.cpp:264 -- rtos_sem_init(...) in fosi.h:176*/
rt_sem_create(&sem, "sem", 0, S_PRIO);
/* Thread::setup(...) in Thread.cpp:293 -- rtos_task_create(...) in
fosi_internal.cpp:240 */
ret = rt_task_spawn(&testTask, "testTask", 128000, 1, T_JOINABLE | (0 & T_CPUMASK), test, NULL);
if(ret < 0){
printf("ERROR: rt_task_spawn(...)\n");
return -1;
}
/* Thread::setup(...) in Thread.cpp:309 -- rtos_sem_wait(...) in fosi.h:194 */
rt_sem_p(&sem, TM_INFINITE);

/* Thread::~Thread() in Thread.cpp:326 -- automatic in destructor when
TaskContext::setActivity(...) is called by HelloWord object (default
activity created in object constructor is replaced by new activity)*/
rt_mutex_release(&mutex);
/* Do something */
sleep(1);
rt_task_delete(&testTask);
return 0;
}

Unfortunately this example behaves quite differently from helloWord program. First of all Xenomai not generate SIGXCPU signal, I have to add following call

rt_task_set_mode(0, T_WARNSW, NULL);

in thread function to get a signal. I haven't found in OROCOS a place where such call is issued (rtos_enable_rt_warning(...) in fosi.h:319). After that signal is generated when mutex is locked by 'main' thread and then by 'test' thread not during mutex unlocking. Installing signal handler in 'main' thread in test example results proper signal handling while similar procedure in OROCOS case, not (application abort).

I am not expert in OROCOS internals so is someone here who can help me with this issue?

Best regards,
Mariusz