We are striking some issues with using methods with parameters over
CORBA, and are wondering if anything has changed with v1.10 in this
respect? Demonstrated between combinations of Ubuntu Hardy and Mac OS
X Leopard and Snow Leopard.
This only occurs with methods that take parameters (for any of int,
double, std::string, KDL::Frame, ...), but does not occur for ports,
non-parameter methods, properties, attributes and commands. We have a
deployer and a GUI connecting through a name service (it doesn't matter
at which end the name server runs). We are using omniORB.
It is caused by having two network interfaces, one wired and one
wireless, in which one interface is down/disconnected (the wired in
our case). It appears that for methods with parameters, the deployer
tries to call back into the GUI on the disconnected interface instead
of the connected one (by default). We used to see this on one of our
embedded vehicles, and successfully bypassed it by ifdown'ing the
interface. This no longer works.
With a disconnected wired interface and omniORB default parameters,
we get a lockup of the GUI when trying to connect to a remote method
that has one or more parameters. Backtrace of the GUI:
2.553 [ Debug ][MethodC] Created 'Any' Assignable Expression server for type int
^C
Program received signal SIGINT, Interrupt.
0x00007fff861c0a16 in recvfrom ()
(gdb) bt
#0 0x00007fff861c0a16 in recvfrom ()
#1 0x0000000106fb9ae1 in omni::tcpConnection::Recv ()
#2 0x0000000106f789b3 in omni::giopStream::inputMessage ()
#3 0x0000000106f8e108 in omni::giopImpl12::inputReplyBegin ()
#4 0x0000000106f8efaf in omni::giopImpl12::inputMessageBegin ()
#5 0x0000000106f7f03f in omni::GIOP_C::ReceiveReply ()
#6 0x0000000106f61a8a in omniRemoteIdentity::dispatch ()
#7 0x0000000106f47ad4 in omniObjRef::_invoke ()
#8 0x0000000106b52e99 in RTT::Corba::_objref_MethodInterface::createMethod (this=0x107e384b0, method=0x10a501448 "requestManualMotion", args=@0x107e5d1e0) at /g/bo/rtt/src/corba/OperationInterfaceC.cc:652
#9 0x0000000106ab4de4 in RTT::Corba::CorbaMethodFactory::produce (this=0x10a501460, args=@0x107c286c0) at CorbaMethodFactory.hpp:122
#10 0x0000000103b66bd6 in RTT::OperationFactory<RTT::DataSourceBase*>::produce (this=0x107e33cd8, name=@0x107c286b8, args=@0x107c286c0) at OperationFactory.hpp:506
#11 0x0000000103b66d9e in RTT::MethodC::D::checkAndCreate (this=0x107c286b0) at /g/o/rtt/src/MethodC.cpp:70
#12 0x0000000103b66ebe in RTT::MethodC::D::newarg (this=0x107c286b0, na=@0x7fff5fbfcba0) at /g/o/rtt/src/MethodC.cpp:83
#13 0x0000000103b661d0 in RTT::MethodC::arg (this=0x107c282e8, a=@0x7fff5fbfcbf0) at /g/o/rtt/src/MethodC.cpp:150
#14 0x00000001000de35c in RTT::detail::DataSourceStorageImpl<1, void () (int)>::initArgs<RTT::MethodC> (this=0x107c282e0, cc=@0x107c282e8) at DataSourceStorage.hpp:157
#15 0x00000001000de428 in RTT::detail::RemoteMethod<void ()(int) >::RemoteMethod (this=0x107c282d0, of=0x107e33cd8, name=@0x7fff5fbfccf0) at RemoteMethod.hpp:140
#16 0x00000001000e8e7c in RTT::MethodRepository::getMethod<void ()(int) > (this=0x107e33cd8, name=@0x7fff5fbfd7f0) at MethodRepository.hpp:147
#17 0x00000001000260ae in liive::gui::TaskData::init (this=0x7fff5fbfd9a0, argc=3, argv=0x7fff5fbff178) at /z/l/liive/gui/taskdata.cpp:277
#18 0x0000000100019003 in main (argc=3, argv=0x7fff5fbff178) at /z/l/liive/gui/liivegui.cpp:55
(gdb)
while the deployer computer shows (via netstat) an attempted
connection from its wireless interface to the wired interface of the
GUI computer.
Running the same setup while instructing the deployer's omniORB to use
only the wireless interface (via "clientTransportRule" in the omniORB
config file) results in a GUI-side CORBA exception
2.563 [ Debug ][MethodC] Created 'Any' Assignable Expression server for type int
terminate called after throwing an instance of 'CORBA::TRANSIENT'
Program received signal SIGABRT, Aborted.
0x00007fff861caff6 in __kill ()
(gdb) bt
#0 0x00007fff861caff6 in __kill ()
#1 0x00007fff8626c072 in abort ()
#2 0x00007fff87e3c5d2 in __gnu_cxx::__verbose_terminate_handler ()
#3 0x00007fff878dff5d in _objc_terminate ()
#4 0x00007fff87e3aae1 in __cxxabiv1::__terminate ()
#5 0x00007fff87e3ab16 in std::terminate ()
#6 0x00007fff87e3abfc in __cxa_throw ()
#7 0x0000000106f23f9b in omni::omniExHelper::TRANSIENT ()
#8 0x0000000106f47ec6 in omniObjRef::_invoke ()
#9 0x0000000106b52e99 in RTT::Corba::_objref_MethodInterface::createMethod (this=0x109e00ab0, method=0x109e01b98 "requestManualMotion", args=@0x109c28c90) at /g/bo/rtt/src/corba/OperationInterfaceC.cc:652
#10 0x0000000106ab4de4 in RTT::Corba::CorbaMethodFactory::produce (this=0x109e01bb0, args=@0x109c28bc0) at CorbaMethodFactory.hpp:122
#11 0x0000000103b66bd6 in RTT::OperationFactory<RTT::DataSourceBase*>::produce (this=0x109801018, name=@0x109c28bb8, args=@0x109c28bc0) at OperationFactory.hpp:506
#12 0x0000000103b66d9e in RTT::MethodC::D::checkAndCreate (this=0x109c28bb0) at /g/o/rtt/src/MethodC.cpp:70
#13 0x0000000103b66ebe in RTT::MethodC::D::newarg (this=0x109c28bb0, na=@0x7fff5fbfcba0) at /g/o/rtt/src/MethodC.cpp:83
#14 0x0000000103b661d0 in RTT::MethodC::arg (this=0x109c288e8, a=@0x7fff5fbfcbf0) at /g/o/rtt/src/MethodC.cpp:150
#15 0x00000001000de35c in RTT::detail::DataSourceStorageImpl<1, void () (int)>::initArgs<RTT::MethodC> (this=0x109c288e0, cc=@0x109c288e8) at DataSourceStorage.hpp:157
#16 0x00000001000de428 in RTT::detail::RemoteMethod<void ()(int) >::RemoteMethod (this=0x109c288d0, of=0x109801018, name=@0x7fff5fbfccf0) at RemoteMethod.hpp:140
#17 0x00000001000e8e7c in RTT::MethodRepository::getMethod<void ()(int) > (this=0x109801018, name=@0x7fff5fbfd7f0) at MethodRepository.hpp:147
#18 0x00000001000260ae in liive::gui::TaskData::init (this=0x7fff5fbfd9a0, argc=3, argv=0x7fff5fbff178) at /z/l/liive/gui/taskdata.cpp:277
#19 0x0000000100019003 in main (argc=3, argv=0x7fff5fbff178) at /z/l/liive/gui/liivegui.cpp:55
(gdb)
with the deployer computer correctly showing a connection across the
wireless network to the GUI.
With connected wired/wireless networks, this works fine. Both Orocos
and our application have been rebuilt from scratch on both computers.
Any ideas?
Stephen
CORBA changes in v1.10?
On Mon, Nov 2, 2009 at 21:41, S Roderick <kiwi [dot] net [..] ...> wrote:
> We are striking some issues with using methods with parameters over
> CORBA, and are wondering if anything has changed with v1.10 in this
> respect? Demonstrated between combinations of Ubuntu Hardy and Mac OS
> X Leopard and Snow Leopard.
Nothing intentional anyway. There was an addition for TAO-only, using
the Messaging spec. Our CORBA proxies got unused threads (bug fix in
trunk), and as just discovered, a default build uses no optimization
flags :-]
>
> This only occurs with methods that take parameters (for any of int,
> double, std::string, KDL::Frame, ...), but does not occur for ports,
> non-parameter methods, properties, attributes and commands. We have a
> deployer and a GUI connecting through a name service (doesn't matter
> which end the name server runs). We are using omniORB.
>
> It is caused by having two network interfaces, one wired and one
> wireless, in which one interface is down/disconnected (the wired in
> our case). It appears that for methods with parameters, the deployer
> tries to call back into the GUI on the disconnected interface instead
> of the connected one (by default). We used to see this on one of our
> embedded vehicles, and successfully bypassed it by ifdown'ing the
> interface. This no longer works.
So by downing the wired interface on the GUI, the ORB on the deployer
side chooses the wireless interface instead?
>
> With a disconnected wired interface, and omniORB default parameters,
> we get a lockup of the GUI when trying to connect to a remote method
> that has one or more parameters. Back trace of GUI
>
Looks normal to me. We operate on the object reference received by the
server. That reference alone dictates how we will try to reach the
other side. So the problem is either that the remote side's ORB has
bound to the wrong interface and returns the wrong reference (an
unreachable object), or that the local ORB is extracting the wrong
interface in the case of multi-endpoint references.
I'm not so familiar with properly dealing with multi-endpoint
references, but I'm assuming most connection problems (like latencies
etc.) come from their use. Maybe we need to tell the deployer to bind
to a certain interface only?
>
> while the deployer computer shows (via netstat) an attempted
> connection from its wireless interface to the wired interface of the
> GUI computer.
So a wrong object reference telling the deployer to call back on a
downed interface.
>
> Running the same setup while instructing the deployer's omniORB to use
> only the wireless interface (via "clientTransportRule" in the omniORB
> config file) results in a GUI-side CORBA exception
I'd say that you must adjust the GUI's omniORB config file, such that
your GUI's ORB binds only to the wireless interface and returns object
references saying that it must be contacted over that interface.
>
Sidenote: the RTT factory code should catch this exception.
>
> with the deployer computer correctly showing a connection across the
> wireless network to the GUI.
>
> With connected wired/wireless networks, this works fine. Both Orocos
> and our application have been rebuilt from scratch on both computers.
>
> Any ideas?
If you can't connect to a given object (transient), either the object
reference or the network routing is wrong. So I hope your problem lies
with the GUI's ORB. I'm starting to believe that we need a good way to
configure and control these behaviours, because they are the most
reported problems regarding CORBA.
Peter
CORBA changes in v1.10?
On Nov 2, 2009, at 16:29 , Peter Soetens wrote:
> On Mon, Nov 2, 2009 at 21:41, S Roderick <kiwi [dot] net [..] ...> wrote:
>> We are striking some issues with using methods with parameters over
>> CORBA, and are wondering if anything has changed with v1.10 in this
>> respect? Demonstrated between combinations of Ubuntu Hardy and Mac OS
>> X Leopard and Snow Leopard.
>
> Nothing intentional anyway. There was an addition for TAO-only, using
> the Messaging spec. Our CORBA proxies got unused threads (bug fix in
> trunk), and as just discovered, a default build uses no optimization
> flags :-]
>
>>
>> This only occurs with methods that take parameters (for any of int,
>> double, std::string, KDL::Frame, ...), but does not occur for ports,
>> non-parameter methods, properties, attributes and commands. We have a
>> deployer and a GUI connecting through a name service (doesn't matter
>> which end the name server runs). We are using omniORB.
>>
>> It is caused by having two network interfaces, one wired and one
>> wireless, in which one interface is down/disconnected (the wired in
>> our case). It appears that for methods with parameters, the deployer
>> tries to call back into the GUI on the disconnected interface instead
>> of the connected one (by default). We used to see this on one of our
>> embedded vehicles, and successfully bypassed it by ifdown'ing the
>> interface. This no longer works.
>
> So by downing the wired interface on the GUI, the ORB on the deployer
> side chooses the wireless interface instead?
Yes, that is what used to work. The name server was running on the GUI
computer, and with the wireless connected but the wired disconnected
(but up) on the vehicle, you get part way through the corba connection
business and then it just stops. If you ifdown the wired and then
start the deployer, it all works fine (used to!). This is with
omniORB's default parameters which don't specify any particular
networks.
>> With a disconnected wired interface, and omniORB default parameters,
>> we get a lockup of the GUI when trying to connect to a remote method
>> that has one or more parameters. Back trace of GUI
>>
>
> Looks normal to me. We operate on the object reference received by the
> server. This reference only dictates how we will try to reach the
> other side. So the problem is that the remote side's orb has bound to
> the wrong interface and returns the wrong reference (an unreachable
> object) or the local orb is extracting the wrong interface, in case of
> multi-endpoint references.
>
> I'm not so familiar with properly dealing with multi-endpoint
> references, but I'm assuming most connection problems (like latencies
> etc) come from their use. Maybe we need to tell the deployer to bind
> to a certain interface only ?
We have to instruct omniORB, through the deployer, where the name
service is (and it can't be "localhost"; it must be the IP address on
certain distros, like $#$*% Ubuntu Jaunty and newer - we never solved
that one!) and then CORBA is supposed to figure it out. As we can pass
parameters through the deployer to the ORB, I think we're covered here.
>> while the deployer computer shows (via netstat) an attempted
>> connection from its wireless interface to the wired interface of the
>> GUI computer.
>
> So a wrong object reference telling the deployer to call back on a
> downed interface.
That's what it looks like to me too. What is puzzling is what has
changed recently in Orocos or our application, to cause this. Those
are the only changes. This actually cropped up late last week,
_before_ I had done the rebasing to get my git repo correctly in line
with v1.10. So if there has been a change, it might have been back in
v1.8. Unfortunately, I'm not completely sure. :-(
>> Running the same setup while instructing the deployer's omniORB to use
>> only the wireless interface (via "clientTransportRule" in the omniORB
>> config file) results in a GUI-side CORBA exception
>
> I'd say that you must adjust the GUI's omniorb config file ! Such that
> your GUI's ORB only binds to wireless and returns object references
> saying that it must be contacted over the wireless interface.
We have been trying that, and we spent a lot of time trying it when
this first cropped up months ago on the vehicle. Some part of the code
(CORBA or Orocos or our app) just seems to ignore what we tell it, and
goes out and uses the wrong interface. I really can't see how it
_should_ be the application/Orocos (i.e. user) code, and truly think
it's an ORB config issue. Damned if we can find it though ...
>>
>
> Sidenote: the RTT factory code should catch this exception.
You want a bug report for this then?
>> with the deployer computer correctly showing a connection across the
>> wireless network to the GUI.
>>
>> With connected wired/wireless networks, this works fine. Both Orocos
>> and our application have been rebuilt from scratch on both computers.
>>
>> Any ideas?
>
> If you can't connect to a given object (transient), the object
> reference or network routing are wrong. So I hope your problem lies
> with the GUI's ORB. I'm starting to believe that we need a good way to
> configure and control these behaviours because they are the most
> reported problems regarding CORBA.
Agreed. CORBA is a wonderful use-and-forget tool, once you have the
damn configuration right! And it's much harder when you don't have
working DNS, which is our situation ...
I'll keep at it and let you know what I find ..
Stephen
CORBA changes in v1.10?
On Nov 2, 2009, at 17:10 , Stephen Roderick wrote:
> On Nov 2, 2009, at 16:29 , Peter Soetens wrote:
>
>> On Mon, Nov 2, 2009 at 21:41, S Roderick <kiwi [dot] net [..] ...> wrote:
>>> We are striking some issues with using methods with parameters over
>>> CORBA, and are wondering if anything has changed with v1.10 in this
>>> respect? Demonstrated between combinations of Ubuntu Hardy and Mac
>>> OS
>>> X Leopard and Snow Leopard.
>>
>> Nothing intentional anyway. There was an addition for TAO-only,
>> using
>> the Messaging spec. Our CORBA proxies got unused threads (bug fix in
>> trunk), and as just discovered, a default build uses no optimization
>> flags :-]
Solved ... UGH!!!!
I will update the CORBA wiki page with this, but basically, if you
have a multi-homed machine and you want to use just one of the
interfaces (i.e. wireless in our case), then you must specify the
endPoint for the Naming Service to use _in the configuration file_. It
isn't sufficient to do so on the command line. So the following
endPoint = giop:tcp:10.0.10.14:
is now in our omniorb.cfg file. Note that the damn omniORB name server
_still_ publishes on our wired network, but it publishes the wireless
endpoint above before the wired one, and everything works. If you supply
the endpoint on the command line to the name service (instead of in
the config file), then it publishes the wired endpoint first and
the wireless endpoint second. The GUI program then fails to look up
the name server over the wireless interface.
Note that the clientTransportRule and serverTransportRule parameters
appear to have no effect on this problem. They do not appear to
affect the solution either.
If you still strike problems with the above, then specify the end
point at both ends. Then the name server, deployer and GUI (in our
system) all have been explicitly instructed to use only that network.
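A minimal sketch of the config-file approach described above, assuming the wireless address 10.0.10.14 from this thread. The temporary file location is hypothetical; omniORB reads the file named by the OMNIORB_CONFIG environment variable, so the same config can be pointed to at each end.

```shell
# Sketch only: write an omniORB config file that pins the endpoint to the
# wireless address used in this thread (10.0.10.14).
# The directory is a throwaway temp dir chosen for illustration.
CFG_DIR="$(mktemp -d)"
OMNIORB_CONFIG="$CFG_DIR/omniorb.cfg"
export OMNIORB_CONFIG

cat > "$OMNIORB_CONFIG" <<'EOF'
# Listen on the wireless address; the empty port leaves the port choice open.
endPoint = giop:tcp:10.0.10.14:
EOF

# Show what any ORB started from this shell would now read.
cat "$OMNIORB_CONFIG"
```

Programs started from the same shell (name server, deployer or GUI) would then pick up this file, provided they are not given a different config location.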
NB we also specify "ORBInitRef corbaloc:iiop:10.0.10.14:2809/NameService"
specifically on the command line for both deployer and GUI.
Aye yay yay ... :-(
Stephen
PS: And I have absolutely NO idea why this all of a sudden cropped up
for us. It used to work ....
CORBA changes in v1.10?
On Tue, Nov 3, 2009 at 5:35 PM, S Roderick <kiwi [dot] net [..] ...> wrote:
> On Nov 2, 2009, at 17:10 , Stephen Roderick wrote:
>
>> On Nov 2, 2009, at 16:29 , Peter Soetens wrote:
>>
>>> On Mon, Nov 2, 2009 at 21:41, S Roderick <kiwi [dot] net [..] ...> wrote:
>>>> We are striking some issues with using methods with parameters over
>>>> CORBA, and are wondering if anything has changed with v1.10 in this
>>>> respect? Demonstrated between combinations of Ubuntu Hardy and Mac
>>>> OS
>>>> X Leopard and Snow Leopard.
>>>
>>> Nothing intentional anyway. There was an addition for TAO-only,
>>> using
>>> the Messaging spec. Our CORBA proxies got unused threads (bug fix in
>>> trunk), and as just discovered, a default build uses no optimization
>>> flags :-]
>
> Solved ... UGH!!!!
>
> I will update the CORBA wiki page with this, but basically, if you
> have a multi-homed machine and you want to use just one of the
> interfaces (ie wireless in our case), then you must specify the
> endPoint to use to the Naming Service _in the configuration file_. It
> isn't sufficient to do so on the command line. So the following
>
> endPoint = giop:tcp:10.0.10.14:
>
> is now in our omniorb.cfg file. Note that the damn omniORB name server
> _still_ publishes on our wired network, but it does the wireless
> endpoint above before the wired and everything works. If you supply
> the end point on the command line to the name service (instead of in
> the config file), then it publishes the wired end point first and then
> the wireless end point second. The GUI program then fails to look up
> the name server over the wireless interface.
>
> Note that the clientTransportRule and serverTransportRule parameters
> appear to have no effect on this problem. Also, they do not appear to
> affect the solution either.
>
> If you still strike problems with the above, then specify the end
> point at both ends. Then the name server, deployer and GUI (in our
> system) all have been explicitly instructed to use only that network.
We have seen the exact same problem (on two Jaunty systems), we'll try
your suggestion and report back if it also solves our problem.
Ruben
> NB we also specify "ORBInitRef corbaloc:iiop:10.0.10.14:2809/
> NameService" specifically on the command line for both deployer and GUI.
>
> Aye yay yay ... :-(
> Stephen
>
> PS: And I have absolutely NO idea why this all of a sudden cropped up
> for us. It used to work ....
CORBA changes in v1.10?
On Nov 4, 2009, at 04:23 , Ruben Smits wrote:
> On Tue, Nov 3, 2009 at 5:35 PM, S Roderick <kiwi [dot] net [..] ...> wrote:
>> On Nov 2, 2009, at 17:10 , Stephen Roderick wrote:
>>
>>> On Nov 2, 2009, at 16:29 , Peter Soetens wrote:
>>>
>>>> On Mon, Nov 2, 2009 at 21:41, S Roderick <kiwi [dot] net [..] ...> wrote:
>>>>> We are striking some issues with using methods with parameters
>>>>> over
>>>>> CORBA, and are wondering if anything has changed with v1.10 in
>>>>> this
>>>>> respect? Demonstrated between combinations of Ubuntu Hardy and Mac
>>>>> OS
>>>>> X Leopard and Snow Leopard.
>>>>
>>>> Nothing intentional anyway. There was an addition for TAO-only,
>>>> using
>>>> the Messaging spec. Our CORBA proxies got unused threads (bug fix
>>>> in
>>>> trunk), and as just discovered, a default build uses no
>>>> optimization
>>>> flags :-]
>>
>> Solved ... UGH!!!!
>>
>> I will update the CORBA wiki page with this, but basically, if you
>> have a multi-homed machine and you want to use just one of the
>> interfaces (ie wireless in our case), then you must specify the
>> endPoint to use to the Naming Service _in the configuration file_. It
>> isn't sufficient to do so on the command line. So the following
>>
>> endPoint = giop:tcp:10.0.10.14:
>>
>> is now in our omniorb.cfg file. Note that the damn omniORB name
>> server
>> _still_ publishes on our wired network, but it does the wireless
>> endpoint above before the wired and everything works. If you supply
>> the end point on the command line to the name service (instead of in
>> the config file), then it publishes the wired end point first and
>> then
>> the wireless end point second. The GUI program then fails to look up
>> the name server over the wireless interface.
>>
>> Note that the clientTransportRule and serverTransportRule parameters
>> appear to have no effect on this problem. Also, they do not appear to
>> affect the solution either.
>>
>> If you still strike problems with the above, then specify the end
>> point at both ends. Then the name server, deployer and GUI (in our
>> system) all have been explicitly instructed to use only that network.
>
> We have seen the exact same problem (on two Jaunty systems), we'll try
> your suggestion and report back if it also solves our problem.
>
> Ruben
>
>> NB we also specify "ORBInitRef corbaloc:iiop:10.0.10.14:2809/
>> NameService" specifically on the command line for both deployer and
>> GUI.
Just in case, the above should read "ORBInitRef
NameService=corbaloc:iiop:10.0.10.14:2809/NameService".
I would be interested to know if supplying the above as a command line
parameter to both your GUI and deployer, but with "localhost" instead
of "10.0.10.14", works for you in Jaunty. It doesn't for us, yet it
does in all previous versions of Ubuntu and in all other systems we
use. We gave up trying to figure out why ...
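For reference, a small sketch of how the corrected option could be assembled for both programs. The names `my-deployer` and `my-gui` are hypothetical placeholders, and the leading dash on `-ORBInitRef` is the usual omniORB command-line form, assumed here; how the option reaches the ORB depends on how each program forwards its arguments. The commands are printed rather than executed.

```shell
# Sketch only: build the corrected name-service reference once and reuse it.
NS_HOST="10.0.10.14"   # wireless address from this thread
NS_PORT="2809"         # port used in the corbaloc above
INIT_REF="NameService=corbaloc:iiop:${NS_HOST}:${NS_PORT}/NameService"

# Hypothetical program names; echo the command lines instead of running them.
for prog in my-deployer my-gui; do
    echo "$prog -ORBInitRef $INIT_REF"
done
```

Setting NS_HOST to "localhost" gives the variant asked about above for Jaunty.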
Stephen