'New' OS dependency: mutex_lock_timed

We only needed recursive/non-recursive lock/trylock from the RTOS
until now. I'm adding 'lock_timed' (which specifies a timeout for
which the thread may wait for the lock, see 'man
pthread_mutex_timedlock').
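For reference, a minimal sketch of what such a primitive could look like on a
POSIX system (the names rtos_mutex_t and rtos_mutex_lock_timed are
illustrative, not necessarily what ends up in the fosi headers):

  #include <pthread.h>
  #include <time.h>

  typedef pthread_mutex_t rtos_mutex_t;

  // Returns 0 on success, -1 on timeout or error; the C++ layer only
  // checks for 0, so all error codes are collapsed.
  int rtos_mutex_lock_timed(rtos_mutex_t* m, double timeout_s)
  {
      struct timespec abs_deadline;
      clock_gettime(CLOCK_REALTIME, &abs_deadline);  // timedlock takes an absolute deadline
      abs_deadline.tv_sec  += (time_t)timeout_s;
      abs_deadline.tv_nsec += (long)((timeout_s - (time_t)timeout_s) * 1e9);
      if (abs_deadline.tv_nsec >= 1000000000L) {     // normalise the timespec
          abs_deadline.tv_sec  += 1;
          abs_deadline.tv_nsec -= 1000000000L;
      }
      return pthread_mutex_timedlock(m, &abs_deadline) == 0 ? 0 : -1;
  }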

I need this in the new Activity/Thread implementation to let the
caller of 'Thread::stop()' not wait infinitely if the thread he wants
to stop is in an infinite loop or suspended while it held the
execution lock. I checked it for Win32, gnulinux, MacOS-X, RTAI and
Xenomai, and all support this primitive out of the box. I don't think
there's a problem here, but just to let you know.

Also of importance is that if Thread::stop() does not work within
5*period for periodic threads or 1 second for non-periodic threads, it
will bail out (i.e. return false). Yes, those are arbitrary, hard-coded
values.
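In pseudo-code, the policy is roughly this (a sketch only; the member names
and the reuse of the rtos_mutex_lock_timed sketch above are assumptions, not
the actual implementation):

  bool Thread::stop()
  {
      // 5 * period for periodic threads, 1 second for non-periodic threads.
      double timeout_s = (mperiod > 0.0) ? 5.0 * mperiod : 1.0;
      if (rtos_mutex_lock_timed(&mexeclock, timeout_s) != 0)
          return false;   // could not get the execution lock in time: bail out
      // ... ask loop()/step() to finish, join the thread, ...
      rtos_mutex_unlock(&mexeclock);   // rtos_mutex_unlock is illustrative as well
      return true;
  }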
I still need to work out what will happen if such a badly behaving
thread is being cleaned up (destructor). I would let the
implementation in rtos_task_delete() (which must implement joining the
thread) in the fosi layer handle it (for example, suspend/kill after a
timeout)... because what will work in such a case is going to be very
dependent on the underlying OS, and not a policy we need to hard-code
in the generic code.
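One (GNU/Linux-only, purely illustrative) shape such a fallback could take in
a fosi implementation; pthread_timedjoin_np is a GNU extension and the
rtos_task_t layout is made up:

  #include <pthread.h>
  #include <time.h>

  struct rtos_task_t { pthread_t thread; };

  void rtos_task_delete(rtos_task_t* task)
  {
      struct timespec deadline;
      clock_gettime(CLOCK_REALTIME, &deadline);
      deadline.tv_sec += 2;                      // arbitrary grace period
      if (pthread_timedjoin_np(task->thread, NULL, &deadline) != 0) {
          pthread_cancel(task->thread);          // last resort: cancel the thread
          pthread_join(task->thread, NULL);      // and reap it
      }
  }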

The new Thread/Activity classes will also handle 'stop()',
'setScheduler()' and 'setPeriod()',... more gracefully. Today, some of
these may only be called by a different thread, the same thread or
when the thread is not running. All these restrictions will be lifted:
anyone (including the thread itself) will be able to change its
behaviour at any time. Some of these restrictions were imposed by our
early RTAI work, but I came to the conclusion that the RTAI fosi layer
needs to handle these exceptions, not the generic code.

I'll push this to my github account tomorrow.

Peter

'New' OS dependency: mutex_lock_timed

On Thu, 11 Jun 2009, Peter Soetens wrote:

> We only needed recursive/non-recursive lock/trylock from the RTOS
> until now. I'm adding 'lock_timed' (which specifies a timeout for
> which the thread may wait for the lock, see 'man
> pthread_mutex_timedlock').

Ok... do you also foresee the same error return values as this POSIX
example? (I think it's difficult to find the "right" set of return values
that is (i) minimally necessary, and (ii) maximally feasible on all
possible OSs...

> I need this in the new Activity/Thread implementation to let the
> caller of 'Thread::stop()' not wait infinitely if the thread he wants
> to stop is in an infinite loop or suspended while it held the
> execution lock. I checked it for Win32, gnulinux, MacOS-X, RTAI and
> Xenomai, and all support this primitive out of the box. I don't think
> there's a problem here, but just to let you know.

Ok...

> Also of importance is that if Thread::stop() does not work within
> 5*period for periodic threads or 1 second for non-periodic threads, it
> will bail out (i.e. return false). Yes, those are arbitrary, hard-coded
> values.
And why is it that you chose this "ugly solution"? Because adding the
timeout value explicitly would introduce an ABI incompatibility?

> I still need to work out what will happen if such a badly behaving
> thread is being cleaned up (destructor). I would let the
> implementation in rtos_task_delete() (which must implement joining the
> thread) in the fosi layer handle it (for example, suspend/kill after a
> timeout)... because what will work in such a case is going to be very
> dependent on the underlying OS, and not a policy we need to hard-code
> in the generic code.

But it might be useful to give Orocos/RTT-level code an indication about
the fact that there _is_ an OS-level problem...

> The new Thread/Activity classes will also handle 'stop()',
> 'setScheduler()' and 'setPeriod()',... more gracefully. Today, some of
> these may only be called by a different thread, the same thread or
> when the thread is not running. All these restrictions will be lifted:
> anyone (including the thread itself) will be able to change its
> behaviour at any time. Some of these restrictions were imposed by our
> early RTAI work, but I came to the conclusion that the RTAI fosi layer
> needs to handle these exceptions, not the generic code.
Ok.

>
> I'll push this to my github account tomorrow.
>
> Peter

Herman

'New' OS dependency: mutex_lock_timed

On Fri, Jun 12, 2009 at 08:01, Herman Bruyninckx wrote:
> On Thu, 11 Jun 2009, Peter Soetens wrote:
>
>> We only needed recursive/non-recursive lock/trylock from the RTOS
>> until now. I'm adding 'lock_timed' (which specifies a timeout for
>> which the thread may wait for the lock, see 'man
>> pthread_mutex_timedlock').
>
> Ok... do you also foresee the same error return values as this POSIX
> example? (I think it's difficult to find the "right" set of return values
> that is (i) minimally necessary, and (ii) maximally feasible on all
> possible OSs...

There's no need to mimic all the return values. We mostly only care about
success (== 0) or failure (== -1). The user never sees these values anyway
because the fosi layer is used by the C++ primitives, which know what to do
in either case.
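For example, a timed-lock helper in the C++ layer only needs to expose a
boolean; the class below is an illustration built on the
rtos_mutex_lock_timed sketch from the first mail, not the actual RTT API:

  class MutexTimedLock
  {
      rtos_mutex_t& m;
      bool locked;
  public:
      MutexTimedLock(rtos_mutex_t& mutex, double timeout_s)
          : m(mutex), locked(rtos_mutex_lock_timed(&mutex, timeout_s) == 0) {}
      ~MutexTimedLock() { if (locked) rtos_mutex_unlock(&m); }
      bool isSuccessful() const { return locked; }  // all a caller ever checks
  };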

>
>> Also of importance is that if Thread::stop() does not work within
>> 5*period for periodic threads or 1 second for non-periodic threads, it
>> will bail out (i.e. return false). Yes, those are arbitrary, hard-coded
>> values.
> And why is it that you chose this "ugly solution"? Because adding the
> timeout value explicitly would introduce an ABI incompatibility?

I've been thinking about adding a stop(timeout) which allows the user to specify
how long he's prepared to wait (synchronise) until the activity/thread
really stopped.

The problem I was facing is: do I choose a sensible default or do I burden the
user with the decision? I certainly wanted a sensible default for 95% of
the applications out there. And now I'm probing you guys to see which
kind of burden you would like to carry :-)
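For illustration, such a stop(timeout) could look like this (hypothetical
signature, not a committed API): a default argument keeps the sensible
default while still letting the user choose.

  class Thread
  {
  public:
      // timeout <= 0 picks the built-in default:
      // 5 * period for periodic threads, 1 second otherwise.
      bool stop(double timeout = 0.0);
      // ...
  };

  // activity.stop();      // accept the default
  // activity.stop(10.0);  // wait up to 10 seconds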

Peter

'New' OS dependency: mutex_lock_timed

On Fri, 12 Jun 2009, Peter Soetens wrote:

> On Fri, Jun 12, 2009 at 08:01, Herman Bruyninckx wrote:
>> On Thu, 11 Jun 2009, Peter Soetens wrote:
>>
>>> We only needed recursive/non-recursive lock/trylock from the RTOS
>>> until now. I'm adding 'lock_timed' (which specifies a timeout for
>>> which the thread may wait for the lock, see 'man
>>> pthread_mutex_timedlock').
>>
>> Ok... do you also foresee the same error return values as this POSIX
>> example? (I think it's difficult to find the "right" set of return values
>> that is (i) minimally necessary, and (ii) maximally feasible on all
>> possible OSs...
>
> There's no need to mimic all the return values. We mostly only care about
> success (== 0) or failure (== -1). The user never sees these values anyway
> because the fosi layer is used by the C++ primitives, which know what to do
> in either case.

How do they know, in case of "failure" to discriminate between different
failure causes? Or is it not necessary to be able to make such a
distinction?

>>> Also of importance is that if Thread::stop() does not work within
>>> 5*period for periodic threads or 1 second for non-periodic threads, it
>>> will bail out (i.e. return false). Yes, those are arbitrary, hard-coded
>>> values.
>> And why is it that you chose this "ugly solution"? Because adding the
>> timeout value explicitly would introduce an ABI incompatibility?
>
> I've been thinking about adding a stop(timeout) which allows the user to
> specify how long he's prepared to wait (synchronise) until the
> activity/thread really stopped.
>
> The problem I was facing is: do I choose a sensible default or do I
> burden the user with the decision? I certainly wanted a sensible default
> for 95% of the applications out there. And now I'm probing you guys
> to see which kind of burden you would like to carry :-)
>
Okay, I see... I am, mostly, in favour of a policy of sensible defaults,
but in this case I find it very difficult to define what is "sensible"...
For the practical reason that stopping threads is an inherently unsolvable
problem (at least in its full generality) and opens an (inevitable) can of
worms of potential race conditions. For example, what happens when you time
out on your stop(timeout) call and are in the process of taking the
necessary actions to clean up the mess, _and_ at that moment the stopped
thread does show some signs of life...? Especially tricky and dangerous if
that "sign of life" happens to be putting a value on the DA output of a
motor drive... :-(

I feel your pain! :-)

Herman

'New' OS dependency: mutex_lock_timed

On Fri, Jun 12, 2009 at 10:46, Herman Bruyninckx wrote:
> On Fri, 12 Jun 2009, Peter Soetens wrote:
>
>> On Fri, Jun 12, 2009 at 08:01, Herman Bruyninckx wrote:
>>> On Thu, 11 Jun 2009, Peter Soetens wrote:
>>>
>>>> We only needed recursive/non-recursive lock/trylock from the RTOS
>>>> until now. I'm adding 'lock_timed' (which specifies a timeout for
>>>> which the thread may wait for the lock, see 'man
>>>> pthread_mutex_timedlock').
>>>
>>> Ok... do you also foresee the same error return values as this POSIX
>>> example? (I think it's difficult to find the "right" set of return values
>>> that is (i) minimally necessary, and (ii) maximally feasible on all
>>> possible OSs...
>>
>> There's no need to mimic all the return values. We mostly only care about
>> success (== 0) or failure (== -1). The user never sees these values anyway
>> because the fosi layer is used by the C++ primitives, which know what to do
>> in either case.
>
> How do they know, in case of "failure" to discriminate between different
> failure causes? Or is it not necessary to be able to make such a
> distinction?

We don't discriminate. It didn't work, so we bail out. We don't retry. At
best we log() something to the user console. But in practice we have never
needed these discriminative return values. They're overestimated. We just
check for == 0.

Peter

'New' OS dependency: mutex_lock_timed

On Fri, 12 Jun 2009, Peter Soetens wrote:

> On Fri, Jun 12, 2009 at 10:46, Herman Bruyninckx wrote:
>> On Fri, 12 Jun 2009, Peter Soetens wrote:
>>
>>> On Fri, Jun 12, 2009 at 08:01, Herman Bruyninckx wrote:
>>>> On Thu, 11 Jun 2009, Peter Soetens wrote:
>>>>
>>>>> We only needed recursive/non-recursive lock/trylock from the RTOS
>>>>> until now. I'm adding 'lock_timed' (which specifies a timeout for
>>>>> which the thread may wait for the lock, see 'man
>>>>> pthread_mutex_timedlock').
>>>>
>>>> Ok... do you also foresee the same error return values as this POSIX
>>>> example? (I think it's difficult to find the "right" set of return values
>>>> that is (i) minimally necessary, and (ii) maximally feasible on all
>>>> possible OSs...
>>>
>>> There's no need to mimic all the return values. We mostly only care about
>>> success (== 0) or failure (== -1). The user never sees these values anyway
>>> because the fosi layer is used by the C++ primitives, which know what to do
>>> in either case.
>>
>> How do they know, in case of "failure" to discriminate between different
>> failure causes? Or is it not necessary to be able to make such a
>> distinction?
>
> We don't discriminate. It didn't work, so we bail out. We don't retry. At
> best we log() something to the user console. But in practice we have never
> needed these discriminative return values. They're overestimated. We just
> check for == 0.
>
This "bailing out when it doesn't work" sounds like a meaningful default to
me! Any effort to come up with a portable 'recovery' is bound to fail
anyway...

Herman

'New' OS dependency: mutex_lock_timed

> >>> There's no need to mimic all the return values. We mostly only care
> >>> about success (== 0) or failure (== -1). The user never sees these values
> >>> anyway because the fosi layer is used by the C++ primitives, which know
> >>> what to do in either case.
> >>
> >> How do they know, in case of "failure" to discriminate between different
> >> failure causes? Or is it not necessary to be able to make such a
> >> distinction?
> >
> > We don't discriminate. It didn't work, so we bail out. We don't retry. At
> > best we log() something to the user console. But in practice we have
> > never needed these discriminative return values. They're overestimated.
> > We just check for == 0.
>
> This "bailing out when it doesn't work" sounds like a meaningful default to
> me! Any effort to come up with a portable 'recovery' is bound to fail
> anyway...

This is true at the module level, absolutely not true at the supervision
level. There is definitely the need to know if
- the stop() functioned
- it did not function because stopping was not possible (rejected by task)
- it did not function because of a task malfunction

In the second case, you can assume that the task still functions properly. In
the third case *you can't assume anything* and therefore have to take action.
Which is something a supervision level can do. I already KILLED (SIGKILL) a
motor module process and started a small executable that simply reset the
motors, when the motor module control seemed to malfunction. That is something a
supervision system can do easily *as soon as it knows what is happening*.
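For illustration, the distinction could be surfaced as something like the
following (made-up names, not an existing RTT type):

  enum StopResult {
      STOP_OK,           // stop() functioned
      STOP_REJECTED,     // stopping was not possible (rejected by the task)
      STOP_MALFUNCTION   // task malfunction: all bets are off
  };

  void superviseStop(StopResult r)
  {
      switch (r) {
      case STOP_OK:          /* nothing to do */                       break;
      case STOP_REJECTED:    /* task is still healthy, retry later */  break;
      case STOP_MALFUNCTION: /* e.g. kill the process, reset motors */ break;
      }
  }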

'New' OS dependency: mutex_lock_timed

On Fri, 12 Jun 2009, Sylvain Joyeux wrote:

>>>>> There's no need to mimic all the return values. We mostly only care
>>>>> about success (== 0) or failure (== -1). The user never sees these values
>>>>> anyway because the fosi layer is used by the C++ primitives, which know
>>>>> what to do in either case.
>>>>
>>>> How do they know, in case of "failure" to discriminate between different
>>>> failure causes? Or is it not necessary to be able to make such a
>>>> distinction?
>>>
>>> We don't discriminate. It didn't work, so we bail out. We don't retry. At
>>> best we log() something to the user console. But in practice we have
>>> never needed these discriminative return values. They're overestimated.
>>> We just check for == 0.
>>
>> This "bailing out when it doesn't work" sounds like a meaningful default to
>> me! Any effort to come up with a portable 'recovery' is bound to fail
>> anyway...
>
> This is true at the module level, absolutely not true at the supervision
> level. There is definitely the need to know if
> - the stop() functioned
> - it did not function because stopping was not possible (rejected by task)
> - it did not function because of a task malfunction

I agree! But, similar to what I said in another thread, there is
Coordination functionality needed within RTT that is different from
Coordination functionality in the FOSI layer! The former is about the
failure/success of _your application's_ Activities, the latter is about OS
threads. I have the feeling that separating both Coordination
functionalities is a Good Thing... :-)

> In the second case, you can assume that the task still functions properly. In
> the third case *you can't assume anything* and therefore have to take action.
> Which is something a supervision level can do. I already KILLED (SIGKILL) a
> motor module process and started a small executable that simply reset the
> motors, when the motor module control seemed to malfunction. That is something a
> supervision system can do easily *as soon as it knows what is happening*.

Exactly. And that application-dependent knowledge is not something that
OS-level services are going to provide...

Herman

'New' OS dependency: mutex_lock_timed

On Fri, Jun 12, 2009 at 12:13, Sylvain Joyeux wrote:
>> >>> There's no need to mimic all the return values. We mostly only care
>> >>> about success (== 0) or failure (== -1). The user never sees these values
>> >>> anyway because the fosi layer is used by the C++ primitives, which know
>> >>> what to do in either case.
>> >>
>> >> How do they know, in case of "failure" to discriminate between different
>> >> failure causes? Or is it not necessary to be able to make such a
>> >> distinction?
>> >
>> > We don't discriminate. It didn't work, so we bail out. We don't retry. At
>> > best we log() something to the user console. But in practice we have
>> > never needed these discriminative return values. They're overestimated.
>> > We just check for == 0.
>>
>> This "bailing out when it doesn't work" sounds like a meaningful default to
>> me! Any effort to come up with a portable 'recovery' is bound to fail
>> anyway...
>
> This is true at the module level, absolutely not true at the supervision
> level. There is definitely the need to know if
>  - the stop() functioned
>  - it did not function because stopping was not possible (rejected by task)
>  - it did not function because of a task malfunction
>
> In the second case, you can assume that the task still functions properly. In
> the third case *you can't assume anything* and therefore have to take action.
> Which is something a supervision level can do. I already KILLED (SIGKILL) a
> motor module process and started a small executable that simply reset the
> motors, when the motor module control seemed to malfunction. That is something a
> supervision system can do easily *as soon as it knows what is happening*.

I would only provide the 1st and 3rd case in RTT, and leave a rejectable
stop to the user API. So if an RTT::stop() fails, you're in big trouble.
If you want a policy where the computation can 'suggest' it shouldn't be
stopped, this must be specified at the user level, using a state machine
or whatever.

To go into detail: this also holds for the breakLoop() function. If it
returns false, there is nothing left to do but kill the thread or
process.
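A simplified sketch of that breakLoop() contract (the RunnableInterface
base class and surrounding details are abbreviated here):

  class MyLoop /* : public RTT::OS::RunnableInterface */
  {
      volatile bool quit;
  public:
      MyLoop() : quit(false) {}
      void loop()
      {
          while (!quit) { /* non-periodic work */ }
      }
      bool breakLoop()
      {
          quit = true;   // best effort to make loop() return
          return true;   // false would mean: nothing left but killing the thread/process
      }
  };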

Peter

'New' OS dependency: mutex_lock_timed

On Jun 11, 2009, at 16:45 , Peter Soetens wrote:

> We only needed recursive/non-recursive lock/trylock from the RTOS
> until now. I'm adding 'lock_timed' (which specifies a timeout for
> which the thread may wait for the lock, see 'man
> pthread_mutex_timedlock').
>
> I need this in the new Activity/Thread implementation to let the
> caller of 'Thread::stop()' not wait infinitely if the thread he wants
> to stop is in an infinite loop or suspended while it held the
> execution lock. I checked it for Win32, gnulinux, MacOS-X, RTAI and
> Xenomai, and all support this primitive out of the box. I don't think
> there's a problem here, but just to let you know.
>
> Also of importance is that if Thread::stop() does not work within
> 5*period for periodic threads or 1 second for non-periodic threads, it
> will bail out (i.e. return false). Yes, those are arbitrary, hard-coded
> values.

Would it be possible to make these global variables instead? That way,
if we wanted to change them we could do so manually in main() prior to
starting orocos. We only need the ability to change the values at
startup.
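Purely as an illustration of "make these global variables" (neither the
variable names nor the namespace below are existing RTT symbols):

  // Hypothetical globals, settable once in main() before any Orocos
  // threads are created.
  namespace RTT { namespace OS {
      extern double thread_stop_timeout_factor;        // default 5.0 (times the period)
      extern double thread_stop_timeout_nonperiodic;   // default 1.0 second
  } }

  int main(int argc, char** argv)
  {
      RTT::OS::thread_stop_timeout_factor      = 10.0;
      RTT::OS::thread_stop_timeout_nonperiodic = 2.0;
      // ... start the deployer / application as usual ...
      return 0;
  }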

Of course, how we get this kind of behaviour in the Deployer is
another story???

All the changes are sounding good though ... :-)
Stephen

'New' OS dependency: mutex_lock_timed

On Thu, Jun 11, 2009 at 04:53:57PM -0400, S Roderick wrote:
> On Jun 11, 2009, at 16:45 , Peter Soetens wrote:
>
> > We only needed recursive/non-recursive lock/trylock from the RTOS
> > until now. I'm adding 'lock_timed' (which specifies a timeout for
> > which the thread may wait for the lock, see 'man
> > pthread_mutex_timedlock').
> >
> > I need this in the new Activity/Thread implementation to let the
> > caller of 'Thread::stop()' not wait infinitely if the thread he wants
> > to stop is in an infinite loop or suspended while it held the
> > execution lock. I checked it for Win32, gnulinux, MacOS-X, RTAI and
> > Xenomai, and all support this primitive out of the box. I don't think
> > there's a problem here, but just to let you know.
> >
> > Also of importance is that if Thread::stop() does not work within
> > 5*period for periodic threads or 1 second for non-periodic threads, it
> > will bail out (i.e. return false). Yes, those are arbitrary, hard-coded
> > values.
>
> Would it be possible to make these global variables instead? That way,
> if we wanted to change them we could do so manually in main() prior to
> starting orocos. We only need the ability to change the values at
> startup.

As we are in a component-based framework and we already have nice ways
of dealing with configuration options (Properties), why not make such
configuration options RTT::Properties of a meta-component at
Process/Node-level?
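Something along these lines, for instance (the component and property names
are illustrative, and the Property/addProperty usage is only sketched from
the 1.x interface):

  #include <rtt/TaskContext.hpp>
  #include <rtt/Property.hpp>

  class OSSettings : public RTT::TaskContext
  {
      RTT::Property<double> stop_timeout;
  public:
      OSSettings(const std::string& name)
          : RTT::TaskContext(name),
            stop_timeout("stop_timeout", "Seconds to wait in Thread::stop()", 1.0)
      {
          this->properties()->addProperty(&stop_timeout);  // configurable like any other property
      }
  };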

Markus

'New' OS dependency: mutex_lock_timed

On Thu, Jun 11, 2009 at 22:53, S Roderick wrote:
> On Jun 11, 2009, at 16:45 , Peter Soetens wrote:
>
>> We only needed recursive/non-recursive lock/trylock from the RTOS
>> until now. I'm adding 'lock_timed' (which specifies a timeout for
>> which the thread may wait for the lock, see 'man
>> pthread_mutex_timedlock').
>>
>> I need this in the new Activity/Thread implementation to let the
>> caller of 'Thread::stop()' not wait infinitely if the thread he wants
>> to stop is in an infinite loop or suspended while it held the
>> execution lock. I checked it for Win32, gnulinux, MacOS-X, RTAI and
>> Xenomai, and all support this primitive out of the box. I don't think
>> there's a problem here, but just to let you know.
>>
>> Also of importance is that if Thread::stop() does not work within
>> 5*period for periodic threads or 1 second for non-periodic threads, it
>> will bail out (i.e. return false). Yes, those are arbitrary, hard-coded
>> values.
>
> Would it be possible to make these global variables instead? That way, if we
> wanted to change them we could do so manually in main() prior to starting
> orocos. We only need the ability to change the values at startup.
>
> Of course, how we get this kind of behaviour in the Deployer is another
> story???

We need to make it vastly easier to write plugins, such that "it's
just another my-run-time-settings-plugin" to tweak this. I doubt if we
need to burden the Deployer with this. Such tweaking should best
happen in a user component because we can't know anyway what's best.
Markus suggested making the DeploymentComponent's functionality some kind
of library such that user code could more quickly add its own
application-specific deployer (in C++ or in scripting). We're actually
quite close to that already.

Peter

'New' OS dependency: mutex_lock_timed

On Jun 11, 2009, at 17:01 , Peter Soetens wrote:

>> On Thu, Jun 11, 2009 at 22:53, S Roderick wrote:
>> On Jun 11, 2009, at 16:45 , Peter Soetens wrote:
>>
>>> We only needed recursive/non-recursive lock/trylock from the RTOS
>>> until now. I'm adding 'lock_timed' (which specifies a timeout for
>>> which the thread may wait for the lock, see 'man
>>> pthread_mutex_timedlock').
>>>
>>> I need this in the new Activity/Thread implementation to let the
>>> caller of 'Thread::stop()' not wait infinitely if the thread he
>>> wants
>>> to stop is in an infinite loop or suspended while it held the
>>> execution lock. I checked it for Win32, gnulinux, MacOS-X, RTAI and
>>> Xenomai, and all support this primitive out of the box. I don't
>>> think
>>> there's a problem here, but just to let you know.
>>>
>>> Also of importance is that if Thread::stop() does not work within
>>> 5*period for periodic threads or 1 second for non-periodic threads, it
>>> will bail out (i.e. return false). Yes, those are arbitrary, hard-coded
>>> values.
>>
>> Would it be possible to make these global variables instead? That
>> way, if we
>> wanted to change them we could do so manually in main() prior to
>> starting
>> orocos. We only need the ability to change the values at startup.
>>
>> Of course, how we get this kind of behaviour in the Deployer is
>> another
>> story???
>
> We need to make it vastly easier to write plugins, such that "it's
> just another my-run-time-settings-plugin" to tweak this. I doubt if we
> need to burden the Deployer with this. Such tweaking should best
> happen in a user component because we can't know anyway what's best.
> Markus suggested making the DeploymentComponent's functionality some kind
> of library such that user code could more quickly add its own
> application-specific deployer (in C++ or in scripting). We're actually
> quite close to that already.

If they're globally available, there's nothing stopping you from writing a
component that sets the values appropriately. A *lot* easier than
writing a plugin. But we need to come up with a better, more scalable,
and less typing-intensive method to address global configuration issues.
These values are some of those, and I would argue that some of the logging
configuration (e.g. being able to log to stdout, etc.) are others.

I don't quite understand what Markus is getting at. I have thought
before of dumping a deployment component into a GUI, and then
kick-starting the application components from that. Is this the kind of
thing he was talking about? If so, I would argue you could do a lot of
that already with the DeploymentComponent as is.

'New' OS dependency: mutex_lock_timed

On Thu, Jun 11, 2009 at 07:43:14PM -0400, Stephen Roderick wrote:
> On Jun 11, 2009, at 17:01 , Peter Soetens wrote:

> > Markus suggested making the DeploymentComponent's functionality some kind
> > of library such that user code could more quickly add its own
> > application-specific deployer (in C++ or in scripting). We're actually
> > quite close to that already.

Good! However, I do not suggest changing the Deployment component;
it can stay as it is. I just want to factor out the basic mechanisms
for deployment into a library, so that these can be reused for more
sophisticated deployers.

> I don't quite understand what Markus is getting at. I have thought
> before of dumping a deployment component into a GUI, and then
> kick-starting the application components from that. Is this the kind of
> thing he was talking about? If so, I would argue you could do a lot of
> that already with the DeploymentComponent as is.

I'm not really thinking about GUIs, but more about doing more complex
verification or coordination tasks. For instance, once the components
have "required" entities in addition to "provided" ones, these could be
checked in such an advanced deployer. Or think of fault handling,
which might mean instantiating a new component, configuring it and then
changing the peer connections as smoothly as possible...

I would not want to do all this in C++.

Markus

'New' OS dependency: mutex_lock_timed

On Fri, 12 Jun 2009, Markus Klotzbuecher wrote:

> On Thu, Jun 11, 2009 at 07:43:14PM -0400, Stephen Roderick wrote:
>> On Jun 11, 2009, at 17:01 , Peter Soetens wrote:
>
>>> Markus suggested making the DeploymentComponent's functionality some kind
>>> of library such that user code could more quickly add its own
>>> application-specific deployer (in C++ or in scripting). We're actually
>>> quite close to that already.
>
> Good! However, I do not suggest changing the Deployment component;
> it can stay as it is. I just want to factor out the basic mechanisms
> for deployment into a library, so that these can be reused for more
> sophisticated deployers.
>
>> I don't quite understand what Markus is getting at. I have thought
>> before of dumping a deployment component into a GUI, and then
>> kick-starting the application components from that. Is this the kind of
>> thing he was talking about? If so, I would argue you could do a lot of
>> that already with the DeploymentComponent as is.
>
> I'm not really thinking about GUIs, but more about doing more complex
> verification or coordination tasks. For instance, once the components
> have "required" entities in addition to "provided" ones, these could be
> checked in such an advanced deployer.

"could" -> "should" ! :-)

> Or think of fault handling,
> which might mean instantiating a new component, configuring it and then
> changing the peer connections as smoothly as possible...
>
> I would not want to do all this in C++.

I agree. This kind of advanced deployment is already solved (to some
extent) by others, most notably OSGi... I think it is more effective to
spend time finding out how to make this Java-centric deployment
coordination framework work with our C++/C code than to reinvent
that (very elaborate!) wheel on our own and in C++...

Herman