Re: (no subject)

The output of gdb ./main-test:

Reading symbols from /home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/main-test...done.
(gdb) run
Starting program: /home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/main-test
[Thread debugging using libthread_db enabled]

Program terminated with signal SIGKILL, Killed.
The program no longer exists.
And the output of dmesg afterwards:
[252277.419411] BUG: unable to handle kernel NULL pointer dereference at 00000504
[252277.419415] IP: [<f8da2176>] rtai_lxrt_invoke+0x11dc/0x1be6 [rtai_lxrt]
[252277.419421] *pdpt = 000000003644b001 *pde = 0000000000000000
[252277.419424] Oops: 0000 [#4] PREEMPT SMP
[252277.419426] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.7/0000:09:00.0/local_cpus
[252277.419428] Modules linked in: rtai_sem rtai_lxrt rtai_hal binfmt_misc ppdev ipv6 psmouse serio_raw evdev lp parport ext3 jbd mbcache usbhid hid sg sr_mod sd_mod cdrom ata_generic ohci1394 ahci ieee1394 libata r8169 mii scsi_mod ehci_hcd usbcore [last unloaded: rtai_hal]
[252277.419444]
[252277.419447] Pid: 15877, comm: main-test Tainted: G D (2.6.32.20-hal-rtai-3.9.1 #2) System Product Name
[252277.419449] EIP: 0060:[<f8da2176>] EFLAGS: 00013246 CPU: 0
[252277.419451] EIP is at rtai_lxrt_invoke+0x11dc/0x1be6 [rtai_lxrt]
[252277.419453] EAX: f85050d0 EBX: 00000000 ECX: f8dabcc0 EDX: 00000000
[252277.419454] ESI: bffff038 EDI: 00000001 EBP: f387df30 ESP: f387dee0
[252277.419456] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[252277.419457] Process main-test (pid: 15877, ti=f387c000 task=f6b97250 task.ti=f387c000)
[252277.419459] I-pipe domain Linux
[252277.419460] Stack:
[252277.419461] f4b93518 f387deec c0155789 f387df00 c017feac f85050d0 f4b93518 f4b93530
[252277.419465] <0> 0002d804 f387df0c c0155789 f387df20 c017feac e12554c8 f70025d8 f4b93518
[252277.419469] <0> f387df2c bffff030 0807e708 f387df44 f387df4c f859de3e f387df48 b7ff6280
[252277.419474] Call Trace:
[252277.419478] [<c0155789>] ? __ipipe_restore_root+0x22/0x24
[252277.419481] [<c017feac>] ? kmem_cache_free+0x90/0x98
[252277.419484] [<f85050d0>] ? rt_sem_wait_if+0x0/0x3ef [rtai_sem]
[252277.419486] [<c0155789>] ? __ipipe_restore_root+0x22/0x24
[252277.419488] [<c017feac>] ? kmem_cache_free+0x90/0x98
[252277.419491] [<f859de3e>] ? intercept_syscall_prologue+0x53/0x109 [rtai_hal]
[252277.419494] [<f859de3e>] ? intercept_syscall_prologue+0x53/0x109 [rtai_hal]
[252277.419496] [<c01550e6>] ? __ipipe_dispatch_event+0x28/0x185
[252277.419499] [<c0185012>] ? __fput+0x16a/0x172
[252277.419502] [<c018502d>] ? fput+0x13/0x15
[252277.419504] [<c0182709>] ? filp_close+0x51/0x5b
[252277.419507] [<c011454b>] ? __ipipe_syscall_root+0x5f/0xed
[252277.419510] [<c01029f7>] ? sysenter_past_esp+0x5c/0x6f
[252277.419511] Code: cc 8b 15 2c 04 00 00 8b 45 c8 e8 71 f3 45 c7 83 7d dc 00 0f 84 88 00 00 00 8b 4d dc 89 d8 8b 15 30 04 00 00 e8 57 f3 45 c7 eb 76 <8b> 83 04 05 00 00 85 c0 7e 3f ff 76 20 ff 76 1c ff 76 18 ff 76
[252277.419537] EIP: [<f8da2176>] rtai_lxrt_invoke+0x11dc/0x1be6 [rtai_lxrt] SS:ESP 0068:f387dee0
[252277.419541] CR2: 0000000000000504
[252277.419543] ---[ end trace 1f0bdb4c46f0ac75 ]---

----Original message----
From: peter [..] ...
Date: 11/01/2013 10.48
To: <dgerb [..] ...>
Cc: "Orocos Developers"<orocos-dev [..] ...>
Subject: Re: [Orocos-Dev] (no subject)

On Thu, Jan 10, 2013 at 2:04 PM, <dgerb [..] ...> wrote:

peter wrote:
On Wed, Jan 9, 2013 at 9:43 AM, wrote:

> I use Orocos toolchain 2.6 under ubuntu 10.04 with RTAI 3.9. The system
> hangs when an activity is created and only if the module "rtai_sem" is
> loaded.
>

The rtai_sem module must be loaded. It has been many years since I tested
Orocos against RTAI/LXRT. Solving this will require typical debugging
skills such as analysing the last lines of 'dmesg', turning on the debug
features of the RTAI kernel modules, and checking whether the RTAI API
changed semantics between 3.5 and 3.9 (i.e. we check for a return value
that differs between 3.5 and 3.9).

Another thing you should do in RTT is:

cd build
cmake .. -DENABLE_TESTS=ON
make check

and report on which unit tests work and which don't.

Good luck,

Peter
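
Reporting which unit tests work and which don't can be scripted once the "make check" output has been saved to a file. The following is only a minimal sketch: it assumes ctest's usual one-line result format ("N/M Test #N: name ....status  time sec"), the helper name summarize is made up for illustration, and the third sample line is hypothetical (the run quoted in this thread has no passing result line).

```python
import re

# Sample result lines in ctest's one-line format. The first two mirror the
# run quoted in this thread; the third is a hypothetical passing test.
SAMPLE = """\
1/38 Test #1: main-test ........................***Exception: SegFault  0.07 sec
2/38 Test #2: list-test ........................***Exception: SegFault  0.08 sec
3/38 Test #3: example-test .....................   Passed    0.10 sec
"""

# index/total, test number, test name, dotted filler, status, elapsed time
RESULT = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#(\d+):\s+(\S+)\s+\.+\s*(.*?)\s+[\d.]+\s+sec\s*$"
)

def summarize(output):
    """Split ctest result lines into passed names and (name, status) failures."""
    passed, failed = [], []
    for line in output.splitlines():
        match = RESULT.match(line)
        if match is None:
            continue  # not a result line (test command, timeout, test output, ...)
        _number, name, status = match.groups()
        if status == "Passed":
            passed.append(name)
        else:
            # ctest prefixes failures with '***'; drop it for readability
            failed.append((name, status.lstrip("*")))
    return passed, failed

passed, failed = summarize(SAMPLE)
print("passed:", passed)
print("failed:", failed)
```

A test that hangs the machine, like core-test in this thread, never prints a result line, so it will simply be missing from both lists.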

This is the output of make check:

[..][100%] Built target typekit_test
UpdateCTestConfiguration from :/home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/DartConfiguration.tcl
Parse Config file:/home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/DartConfiguration.tcl
UpdateCTestConfiguration from :/home/laboratorio/orocos/orocos-toolchain/rtt/build/DartConfiguration.tcl
Parse Config file:/home/laboratorio/orocos/orocos-toolchain/rtt/build/DartConfiguration.tcl
Test project /home/laboratorio/orocos/orocos-toolchain/rtt/build
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
test 1
Start 1: main-test
1: Test command: /home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/main-test
1: Test timeout computed to be: 1500
1/38 Test #1: main-test ........................***Exception: SegFault 0.07 sec
test 2
Start 2: list-test
2: Test command: /home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/list-test
2: Test timeout computed to be: 1500
2: Running 2 test cases...
2:
2: *** No errors detected
2/38 Test #2: list-test ........................***Exception: SegFault 0.08 sec
test 3
Start 3: core-test
3: Test command: /home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/core-test
3: Test timeout computed to be: 1500
(the system hangs)

Ok. We first need to find out why main-test segfaults. I see in the trace that the kernel hit a NULL pointer dereference in the rtai_lxrt module, which is what triggered the crash.

You can try to build rtt with this extra option:

cd rtt/build
cmake .. -DOROSEM_OS_LXRT_CHECK=ON
make
cd tests
gdb ./main-test
run
... crash ...
bt

In case this does not work, try to run without gdb too.

Peter
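
The NULL pointer read from the trace can be cross-checked against the "Code:" line of the oops, where the byte in angle brackets marks the instruction at the faulting EIP. The sketch below is an illustration only: it decodes just the one x86 form that appears there (opcode 8b with a 32-bit displacement and no SIB byte), the function names are made up, and the constants are copied by hand from the dmesg output in this thread.

```python
# Values copied from the dmesg output quoted in this thread.
CODE_TAIL = "eb 76 <8b> 83 04 05 00 00 85 c0"  # end of the 'Code:' line
EBX = 0x00000000                               # from 'EBX: 00000000'
CR2 = 0x00000504                               # from 'CR2: 0000000000000504'

# 32-bit register names in x86 encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def faulting_bytes(code_line):
    """Return the bytes from the <xx> marker onwards (the faulting instruction)."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]

def decode_mov_disp32(insn):
    """Decode only 'mov r32, [r32 + disp32]': opcode 0x8b, ModRM mod=10, no SIB."""
    opcode, modrm = insn[0], insn[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 0b10 or rm == 0b100:
        raise ValueError("not the simple form this sketch handles")
    disp = int.from_bytes(bytes(insn[2:6]), "little")
    return REG32[reg], REG32[rm], disp

dest, base, disp = decode_mov_disp32(faulting_bytes(CODE_TAIL))
fault_address = EBX + disp
print(f"mov {dest}, [{base} + {disp:#x}]")
print(f"computed address {fault_address:#x}, CR2 {CR2:#x}")
```

The decoded instruction loads from EBX plus 0x504, and with EBX at zero that gives exactly the address in CR2. This is consistent with rtai_lxrt_invoke reading a field at offset 0x504 through a pointer that was NULL; which structure that pointer refers to cannot be told from the oops alone.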

(no subject)

On Mon, Jan 14, 2013 at 12:13 PM, dgerb [..] ... <dgerb [..] ...> wrote:
> the output of gdb ./main-test
>
> Reading symbols from
> /home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/main-test...done.
>
> (gdb) run
>
> Starting program:
> /home/laboratorio/orocos/orocos-toolchain/rtt/build/tests/main-test
>
> [Thread debugging using libthread_db enabled]
>
>
> Program terminated with signal SIGKILL, Killed.

Hmm, that's right, you can't debug a kernel oops with GDB.

I can't provide further help in debugging this without having an RTAI
system of my own.

It's likely that there has been a change in how RTAI handles the
LXRT calls, and that we still rely on older data structures.

You could test this by turning the cmake option OS_AGNOSTIC off:

cd rtt/build
cmake .. -DOS_AGNOSTIC=OFF
make check

It's worth a try, although I can't predict how well everything will
build. You may have to add extra RTAI include paths here and there
so that the rtai_[...].h header files are found, especially when
building components.

Peter