[SLL] technically speaking, what might the vendor do?

Chuck Wolber chuckw at quantumlinux.com
Mon Dec 1 23:43:17 PST 2008


On Mon, 1 Dec 2008, Mathew D. Watson wrote:

[...]

> The problem is, I _really_ think it's a bad idea to run user apps as 
> root. I said as much to the vendor and threw in that I consider 
> requiring to run as root an API defect.  I received the following reply.

You did not say what the machine was *ALSO* used for. Nor did you describe 
where the machine sits in relation to the Internet.

*IF* the machine is a dedicated box for this application only *AND* it is 
not directly connected to the Internet or any other unsecured network, 
*THEN* there's nothing wrong with running the app as root.

Failing the above, there are still good reasons to run an app as root. I 
simply do not have the time to outline them all... Suffice to say, it is 
not an *EITHER*/*OR* proposition. It requires some critical thinking to 
decide if you are at risk or not.

However, as a *GENERAL* rule, the safest policy is to run each app as a 
different non-root user within its own chroot jail. Keep reading though...


> " Machine Vision applications and processes are highly demanding on the 
> host OS. Since Linux is a non RTOS, we need to run the application as a 
> super user to prevent other (non [Vendor] API) threads from interrupting 
> the process.  This is not a "[Vendor] API defect", it is beyond our 
> control or influence. "

Hmmm, well sorta... They are right to bring up the RTOS thing. However, 
the only things that cannot be interrupted in the modern (non RTOS) Linux 
kernel are hardware interrupts. Everything else is fair game, regardless 
of user...

That was not always the case. Back in the olden days, kernel code could 
not be preempted and you would get a rather choppy response when you 
tried to listen to MP3s or watch movies if you had a lot going on "under 
the hood". Then in the 2.6 series they made the kernel preemptible 
(CONFIG_PREEMPT), pared back the BKL (Big Kernel Lock) and made the rest 
of the locking much more granular, and voila, everything but hardware is 
interruptible...

In an RTOS kernel everything can be interrupted too (except hardware 
interrupts), but there is a (hard or soft) guarantee regarding the max 
length of time a thread can go without a CPU and/or IO timeslice. It is 
actually more complicated than that, and if you want to know why, get a 
good book on real time programming.

As for your vendor's claim that the app has to run as root to get a higher 
priority, that is just not the case. Root is needed to *RAISE* a priority, 
not to run at it. As the root user, you can renice anything to the highest 
priority (-20), even processes *NOT* running as root. In other words, it 
does not matter which user the app runs as; once reniced, it gets the same 
priority a root owned process would. The only thing you get running as 
root is access, not extra performance.
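
To make the renice point concrete, here is a minimal Python sketch. It 
assumes a small privileged helper (or an admin at a root shell) adjusts 
the priority of the vendor's process while the app itself keeps running 
as an ordinary user. The name set_nice is made up for illustration; 
os.getpriority and os.setpriority are the standard library wrappers 
around getpriority(2) and setpriority(2).

```python
import os


def set_nice(pid, nice_value):
    """Try to set the nice value of `pid`; return the value now in effect.

    Hypothetical helper for illustration only. Lowering the nice value
    (i.e. raising priority) fails with PermissionError unless the caller
    is root or otherwise privileged, so that case is swallowed here and
    the unchanged value is reported back.
    """
    if not -20 <= nice_value <= 19:
        raise ValueError("nice values range from -20 (highest) to 19 (lowest)")
    try:
        os.setpriority(os.PRIO_PROCESS, pid, nice_value)
    except PermissionError:
        # Not privileged enough: the priority stays where it was.
        pass
    return os.getpriority(os.PRIO_PROCESS, pid)
```

Run as root, set_nice(vendor_pid, -20) would do what renice does from the 
shell, where vendor_pid is whatever PID the unprivileged app ended up 
with.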


> To my eye "it is beyond our control or influence" is not true, but I am 
> not an expert (I just know enough to be dangerous).  Technically 
> speaking, what might the vendor do? (short of using something like 
> rtlinux or rtai)

What they are saying does not pass the sniff test. As for what they would 
do to fix it, it is hard to tell without knowing the app better. My 
instinct tells me that the hardware they tested the app on was much faster 
than your hardware. You might ask them what they are running on their test 
setups. They probably have a systemic problem in the software that's being 
duct taped over with fast hardware. This can probably be fixed, but they 
may not be willing to spend what it takes to do so, and you may not be 
willing to pay for the software if they did. Logic is complicated and good 
developers are expensive (bad developers are even more expensive).

The first step towards solving the problem is to find out if it is CPU or 
IO bound. Use vmstat to figure that out (run it with an interval and 
compare the wa column against us and sy). If you are IO bound, then data 
is not being written fast enough to the disk when your camera is sending 
data to the computer running the vendor's software. To fix that, you can 
use a different IO scheduler and/or upgrade your IO subsystem (i.e. get 
faster disks). The vendor could also probably help here by buffering more 
efficiently.
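
If you would rather not eyeball the numbers, a rough Python sketch of the 
same check follows. It reads the text that vmstat prints and compares 
the wa (IO wait) column against us + sy (CPU busy). The function name and 
both thresholds are my own assumptions, so tune them to taste.

```python
def classify_vmstat(text, wait_threshold=20, busy_threshold=80):
    """Classify vmstat output as 'io-bound', 'cpu-bound', or 'neither'.

    The thresholds are percentages and are arbitrary assumptions.
    """
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    # The column-name row is the one that contains both 'us' and 'wa'.
    header = next(cols for cols in rows if "us" in cols and "wa" in cols)
    # Keep purely numeric rows; this also skips repeated header lines.
    samples = [dict(zip(header, map(int, cols)))
               for cols in rows if all(c.isdigit() for c in cols)]
    if len(samples) > 1:
        # The first sample is an average since boot; skip it when we can.
        samples = samples[1:]
    if not samples:
        raise ValueError("no vmstat samples found")
    avg_wait = sum(s["wa"] for s in samples) / len(samples)
    avg_busy = sum(s["us"] + s["sy"] for s in samples) / len(samples)
    if avg_wait >= wait_threshold:
        return "io-bound"
    if avg_busy >= busy_threshold:
        return "cpu-bound"
    return "neither"
```

Feed it the output of something like "vmstat 1 10" captured while the 
camera is streaming, for example via subprocess.run(["vmstat", "1", "10"], 
capture_output=True, text=True).stdout.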

If it is CPU bound, turn off wasteful stuff like the GUI desktop and 
other non-essential processes, and run this app from the command line. If 
they only have a GUI version, then that may be part of the problem. You 
may also need to consider a CPU upgrade. A lot of times, a vendor will 
mistakenly think they're playing it smart to process data as it rolls in. 
The vendor could be even smarter here by buffering raw data and deferring 
processing to a "bottom half" type mechanism. It would add a delay to how 
fast the machine can accept images, but at least you'll always get an 
accurate result. If possible, you could overcome this delay by having 
multiple image processing machines.
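
For what it is worth, the buffer-then-process idea can be sketched in a 
few lines of Python. This is only an illustration of the pattern, not 
anything from the vendor's API: run_pipeline, frames and process are 
placeholder names, and a real implementation would sit inside their 
capture code.

```python
import queue
import threading


def run_pipeline(frames, process, buffer_size=64):
    """Accept raw frames into a bounded buffer; process them on a worker.

    `frames` is any iterable of raw images and `process` is the (slow)
    analysis function. Both are placeholders for illustration.
    """
    buffer = queue.Queue(maxsize=buffer_size)
    results = []
    done = object()  # sentinel marking the end of the stream

    def worker():
        # The deferred "bottom half": pull raw frames and do the slow work.
        while True:
            frame = buffer.get()
            if frame is done:
                break
            results.append(process(frame))

    thread = threading.Thread(target=worker)
    thread.start()
    for frame in frames:
        # The cheap "top half": just stash the raw data. put() blocks only
        # when the buffer is full, i.e. processing has fallen far behind.
        buffer.put(frame)
    buffer.put(done)
    thread.join()
    return results
```

With a single worker the results come back in arrival order. The buffer 
size is the knob that trades RAM for tolerance of bursts from the camera.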

Some other things you should consider are maxing out the RAM on the 
machine and updating the kernel to the very latest release (2.6.27.7).

..Chuck..

P.S. You mentioned "gigabit ethernet machine". It goes without saying that 
gigabit ethernet is hardly the only link in the chain. A faster network 
interface is unlikely to solve your problem.

-- 
http://www.quantumlinux.com                 | "An idea does not gain
 Quantum Linux Laboratories, LLC.           |  truth as it gains
 ACCELERATING Business with Open Technology |  followers." Amanda Bloom

