naming of component threads

I’m attempting to debug a CPU-intensive component in a large/complex deployment. Reducing the complexity while reproducing the CPU usage has proven very difficult, so I need to leave the deployment as is. Right now, our components all inherit from a base class that provides a small amount of runtime statistics (actual period, max update duration, etc.), but that information isn’t helpful enough. What I need to know, initially, is the name of the component that’s bogging things down. The trouble is, when I deploy everything (using rttlua, in this case), all ‘htop’ shows me is a bunch of processes with the name “rttlua”, which is less than helpful.

I found a patch that looked promising (http://bugs.orocos.org/show_bug.cgi?id=1016) linked to from a thread (http://www.orocos.org/forum/orocos/orocos-users/logging-timestamps-gener...) that discussed a related issue. The patch isn’t included in the latest Hydro pre-built or the master branch, but seems like an innocuous addition. If anything, I think this patch should be given a higher priority. It seems like a very useful thing.

I tried the other suggestion of using pthread_setname_np (in my base class’ constructor/startHook) but with very odd results. I found that multiple components would end up with the same thread name, with multiple duplicate entries in ‘htop’. Note, not all components would get the same name, but some would.

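/** Code in base class that sets name
// Linux thread names hold at most 16 bytes, including the terminating NUL.
char thread_name[16];

// Print the name the calling thread has before it is renamed.
if (not pthread_getname_np(pthread_self(), thread_name, sizeof(thread_name)))
{
    std::cout << "comp name: " << name << " default threadname: " << std::string(thread_name) << std::endl;
}

// Renames the thread that executes this line, which is not necessarily the component's own thread.
pthread_setname_np(pthread_self(), name.c_str());

// Read the name back to see what was actually set.
if (not pthread_getname_np(pthread_self(), thread_name, sizeof(thread_name)))
{
    std::cout << "comp name: " << name << " new threadname: " << std::string(thread_name) << std::endl;
}
**/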

/** Code in rttlua that sets component activities

naming of component threads

Hi Dustin,

On Mon, Dec 1, 2014 at 4:39 PM, Gooding, Dustin R. (JSC-ER411) <
dustin [dot] r [dot] gooding [..] ...> wrote:

> I’m attempting to debug a CPU-intensive component in a large/complex
> deployment. Reducing the complexity while reproducing the CPU usage has
> proven very difficult, so I need to leave the deployment as is. Right now,
> our components all inherit from a base class that provide a small amount of
> runtime statistics (actual period, max update duration, etc), but that
> information isn’t helpful enough. What I need to know, initially, is the
> name of the component that’s bogging things down. The trouble is, when I
> deploy everything (using rttlua, in this case), all ‘htop’ shows me is a
> bunch of processes with the name “rttlua” which is less than helpful.
>
> I found a patch that looked promising (
> http://bugs.orocos.org/show_bug.cgi?id=1016) linked to from a thread (
> http://www.orocos.org/forum/orocos/orocos-users/logging-timestamps-gener...)
> that discussed a related issue. The patch isn’t included in the latest
> Hydro pre-built or the master branch, but seems like an innocuous
> addition. If anything, I think this patch should be given a higher
> priority. It seems like a very useful thing.
>
> I tried the other suggestion of using pthread_setname_np (in my base
> class’ constructor/startHook) but with very odd results. I found that
> multiple components would end up with the same thread name, with multiple
> duplicate entries in ‘htop’. Note, not all components would get the same
> name, but some would.
>
> /** Code in base class that sets name
> char thread_name[16];
>
> if (not pthread_getname_np(pthread_self(), thread_name, 16))
> {
> std::cout << "comp name: " << name << " default threadname: " <<
> std::string(thread_name) << std::endl;
> }
>
> pthread_setname_np(pthread_self(), name.c_str());
>
> if (not pthread_getname_np(pthread_self(), thread_name, 16))
> {
> std::cout << "comp name: " << name << " new threadname: " <<
> std::string(thread_name) << std::endl;
> }
> **/
>

If this code is executed in the constructor or startHook, you get the
thread name of the MainThread, i.e. main(), or whichever thread created or
started your component.

You need to put this code in an 'OwnThread' operation so that it is
executed by the component's own thread. Then call this operation
from Lua with a regular 'call'.
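
The point above can be sketched without RTT at all. The following is only an illustration under stated assumptions: FakeComponent, name_current_thread and current_thread_name are hypothetical names, the std::thread stands in for the component's own thread, and Linux/glibc is assumed for pthread_setname_np. It shows that the call names whichever thread executes it, so it has to run inside the thread you want to label.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <thread>

// Returns the name of the calling thread (Linux/glibc assumed).
std::string current_thread_name() {
    char buf[16] = {0};  // 15 characters plus the terminating NUL
    int rc = pthread_getname_np(pthread_self(), buf, sizeof(buf));
    assert(rc == 0);     // a 16-byte buffer is always large enough
    (void)rc;
    return buf;
}

// Names the *calling* thread. Names longer than 15 characters are rejected
// with ERANGE, so the name is truncated first.
bool name_current_thread(const std::string& name) {
    return pthread_setname_np(pthread_self(), name.substr(0, 15).c_str()) == 0;
}

// Hypothetical stand-in for a component. The worker thread plays the role
// of the component's own thread.
struct FakeComponent {
    std::string name;
    std::string seen_in_worker;

    void run() {
        std::thread worker([this] {
            name_current_thread(name);               // executed by the worker itself
            seen_in_worker = current_thread_name();  // read back inside the worker
        });
        worker.join();
    }
};
```

Calling name_current_thread() from the code that constructs or starts FakeComponent would rename that caller's thread instead, which matches the behaviour described for the constructor and startHook.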

Peter


naming of component threads

On 12/01/2014 04:39 PM, Gooding, Dustin R. (JSC-ER411) wrote:
> I’m attempting to debug a CPU-intensive component in a large/complex deployment. Reducing the complexity while reproducing the CPU usage has proven very difficult, so I need to leave the deployment as is. Right now, our components all inherit from a base class that provide a small amount of runtime statistics (actual period, max update duration, etc), but that information isn’t helpful enough. What I need to know, initially, is the name of the component that’s bogging things down. The trouble is, when I deploy everything (using rttlua, in this case), all ‘htop’ shows me is a bunch of processes with the name “rttlua” which is less than helpful.
>
> I found a patch that looked promising (http://bugs.orocos.org/show_bug.cgi?id=1016) linked to from a thread (http://www.orocos.org/forum/orocos/orocos-users/logging-timestamps-gener...) that discussed a related issue. The patch isn’t included in the latest Hydro pre-built or the master branch, but seems like an innocuous addition. If anything, I think this patch should be given a higher priority. It seems like a very useful thing.
>
> I tried the other suggestion of using pthread_setname_np (in my base class’ constructor/startHook) but with very odd results. I found that multiple components would end up with the same thread name, with multiple duplicate entries in ‘htop’. Note, not all components would get the same name, but some would.

Hi

The thread name does not always map directly to a TaskContext name. Sometimes several TaskContexts run on the same thread.

Thread naming is a bit scattered in the RTT sources. For instance (if you use the DeploymentComponent):

- When you create a periodic task, it will probably use a PeriodicActivity (ocl/deployment/DeploymentComponent.cpp line 1973). The PeriodicActivity uses a timer thread instance (rtt/extras/PeriodicActivity.hpp line 226) that is shared by all periodic activities with the same period (rtt/extras/TimerThread.cpp TimerThread::Instance line 66). So all of those activities will have the same task name, "TimerThreadInstance" :(
- When you create a default activity (ocl/deployment/DeploymentComponent.cpp line 1969), the Activity ctor (rtt/Activity.cpp line 75) will name the task with the component name.
- When you create a NonPeriodicActivity (ocl/deployment/DeploymentComponent.cpp line 1976), the Activity ctor (rtt/Activity.cpp line 75) will be called with the default argument for the name. So the task will always be named "Activity" :(
- I haven't checked what it does for other activities (SlaveActivity, FileDescriptorActivity, etc.).

What you describe is probably due to TaskContext grouping (for PeriodicActivity). I don't know how to call ‘setActivity’ differently from the Lua interface.
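
To make the grouping effect concrete, here is a small sketch that does not use RTT. The function name names_seen_on_shared_thread is hypothetical, the single std::thread stands in for a shared timer thread, and Linux/glibc is assumed. When several components rename the one thread that executes them all, the last rename wins and every component then reports the same name.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <thread>
#include <vector>

// Simulates several components whose hooks are all executed by one shared
// thread. Returns the thread name each component observes once every
// component has tried to rename "its" thread.
std::vector<std::string> names_seen_on_shared_thread(
        const std::vector<std::string>& components) {
    std::vector<std::string> seen;
    std::thread shared([&] {
        // Each component renames the thread it runs on; they all share one.
        for (const auto& c : components) {
            pthread_setname_np(pthread_self(), c.substr(0, 15).c_str());
        }
        // Afterwards every component reads back the same, last-written name.
        char buf[16] = {0};
        int rc = pthread_getname_np(pthread_self(), buf, sizeof(buf));
        assert(rc == 0);
        (void)rc;
        seen.assign(components.size(), std::string(buf));
    });
    shared.join();
    return seen;
}
```

With components named CompA, CompB and CompC, all three entries in the result are "CompC", which would look like duplicate thread names in a tool such as htop.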

Regards.

Paul.


naming of component threads

FYI, at the beginning of the year I revived the patches Paul Chavent
wrote a while back to enable LTTng support for tracing this kind of
event. They live on RTT's lttng branch and allow tracing all hooks,
directly with the task name.

I had to put it on ice. It was useful to me, but I did not (and do not)
have the time to test it properly.

Might be useful ...

Sylvain


naming of component threads

On Dec 1, 2014, at 12:21 PM, Sylvain Joyeux <sylvain [dot] joyeux [..] ...> wrote:
>
> FYI, I revived at the beginning on the year the patches from Paul
> Chavent from a while back to enable lttng support for tracing of these
> kind of events. They live on RTT's lttng branch, and allow to trace
> all hooks, directly with the task name.
>
> I had to put it on ice. Was useful to me, but did not (and does not)
> have the time to test it properly.
>
> Might be useful ...
>
> Sylvain
>

That looks fantastic. Do you think this functionality adds too much overhead (dependency- or performance-wise) to eventually be included in master?

My initial guess, fwiw, is that I have a non-periodic RT-priority component that’s overrunning (that is, it is triggered again before the previous run has finished). When I move the deployment back to non-RT components, I don’t get the telltale 100% CPU spike, but then I’m not “RT”. Granted, I’m not even using a “real” RT kernel, just the built-in kernel preemption in vanilla.

So with this branch, I’d just add this RTT package to my personal catkin workspace (thus overriding the system RTT), then use lttng to trace the updateHook (or maybe InputPort) and watch them go? I take it this only tracepoints the beginning of hooks and not the end? Tracing the ends of hooks might be useful too, to catch the type of overruns I think are happening.
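
For what it's worth, the begin/end idea can be approximated without lttng. The sketch below is not the lttng branch and uses no RTT API: HookTrace and ScopedHook are hypothetical helpers built on std::chrono, and the budget is an assumed value the user supplies (for example the expected interval between triggers). It records when a hook starts and ends and counts the runs that took longer than the budget.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: stores begin/end timestamps of hook executions.
class HookTrace {
    struct Sample { Clock::time_point begin, end; };
    std::vector<Sample> samples_;

public:
    void record(Clock::time_point begin, Clock::time_point end) {
        assert(end >= begin);  // a hook cannot end before it starts
        samples_.push_back({begin, end});
    }

    // Number of executions that lasted longer than `budget`.
    std::size_t overruns(Clock::duration budget) const {
        std::size_t n = 0;
        for (const auto& s : samples_) {
            if (s.end - s.begin > budget) ++n;
        }
        return n;
    }
};

// RAII guard: construct it at the top of a hook, and the end time is
// recorded automatically when the hook returns.
class ScopedHook {
    HookTrace& trace_;
    Clock::time_point begin_;

public:
    explicit ScopedHook(HookTrace& trace)
        : trace_(trace), begin_(Clock::now()) {}
    ~ScopedHook() { trace_.record(begin_, Clock::now()); }
};
```

Placing a ScopedHook as the first statement of an updateHook-like function would capture both ends of each run. Note that HookTrace as written is not thread-safe, so each component would need its own instance.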

Thanks!


dustin