Thursday, May 27, 2010

What is the logic behind killing processes during an Out of Memory situation?

As per the kernel source code, the OOM killer works as follows.

A function called badness() calculates a score, in points, for each process.

* Points are added for the following processes:

Processes with a large memory size.
Niced processes (nice value greater than 0).

* Points are reduced for the following processes:

Processes that have been running for a long time or have used a lot of CPU time.
Processes running as the superuser (uid or euid 0, or holding CAP_SYS_ADMIN).
Processes with direct hardware access (CAP_SYS_RAWIO).

The process with the highest number of points is killed, unless it is already in the midst of freeing up memory on its own.

The system then waits for some time to see whether enough memory has been freed. If killing one process does not free enough memory, the steps above are repeated.

As per the select_bad_process() function, a process with 0 points or fewer cannot be killed. OOM kills continue until no candidate processes are left to kill. If the system cannot find a candidate process to kill, it panics.
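The selection rule can be sketched in plain userspace C. This is only an illustration of the "highest positive score wins" idea described above, not the kernel's select_bad_process() itself; the struct candidate type and the pick_victim() name are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for a task and its already computed badness score. */
struct candidate {
    int pid;
    unsigned long points;
};

/*
 * Return the candidate with the highest score, or NULL when no candidate
 * has more than 0 points. A NULL result corresponds to the case where the
 * system has nothing left to kill.
 */
static const struct candidate *pick_victim(const struct candidate *c, size_t n)
{
    const struct candidate *chosen = NULL;
    unsigned long maxpoints = 0;

    for (size_t i = 0; i < n; i++) {
        /* Strictly greater than: a score of 0 can never be selected. */
        if (c[i].points > maxpoints) {
            chosen = &c[i];
            maxpoints = c[i].points;
        }
    }
    return chosen;
}
```

Because maxpoints starts at 0 and the comparison is strict, a process scoring 0 is never chosen, which matches the "0 points or fewer" rule above.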

static unsigned long badness(struct task_struct *p, unsigned long uptime)
{
    unsigned long points, cpu_time, run_time, s;

    if (!p->mm)
        return 0;

    if (p->flags & PF_MEMDIE)
        return 0;
    /*
     * The memory size of the process is the basis for the badness.
     */
    points = p->mm->total_vm;

    /*
     * CPU time is in tens of seconds and run time is in thousands
     * of seconds. There is no particular reason for this other than
     * that it turned out to work very well in practice.
     */
    cpu_time = (p->utime + p->stime) >> (SHIFT_HZ + 3);

    if (uptime >= p->start_time.tv_sec)
        run_time = (uptime - p->start_time.tv_sec) >> 10;
    else
        run_time = 0;

    s = int_sqrt(cpu_time);
    if (s)
        points /= s;
    s = int_sqrt(int_sqrt(run_time));
    if (s)
        points /= s;

    /*
     * Niced processes are most likely less important, so double
     * their badness points.
     */
    if (task_nice(p) > 0)
        points *= 2;

    /*
     * Superuser processes are usually more important, so we make it
     * less likely that we kill those.
     */
    if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_ADMIN) ||
            p->uid == 0 || p->euid == 0)
        points /= 4;

    /*
     * We don't want to kill a process with direct hardware access.
     * Not only could that mess up the hardware, but usually users
     * tend to only have this flag set on applications they think
     * of as important.
     */
    if (cap_t(p->cap_effective) & CAP_TO_MASK(CAP_SYS_RAWIO))
        points /= 4;
#ifdef DEBUG
    printk(KERN_DEBUG "OOMkill: task %d (%s) got %d points\n",
           p->pid, p->comm, points);
#endif
    return points;
}
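To get a feel for the numbers, the scoring arithmetic of badness() can be reproduced in userspace. The following is a rough sketch under some assumptions: struct proc_model, isqrt() and model_badness() are hypothetical names made up for this example, cpu_time and run_time are taken as already scaled the way the kernel scales them, and the early "return 0" cases (no mm, PF_MEMDIE) are left out.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the fields badness() looks at. */
struct proc_model {
    unsigned long total_vm;  /* memory size, the starting score */
    unsigned long cpu_time;  /* already scaled (roughly tens of seconds) */
    unsigned long run_time;  /* already scaled (roughly thousands of seconds) */
    int nice;                /* nice value */
    bool superuser;          /* uid/euid 0 or CAP_SYS_ADMIN */
    bool raw_io;             /* CAP_SYS_RAWIO */
};

/* Floor of the square root; stands in for the kernel's int_sqrt(). */
static unsigned long isqrt(unsigned long x)
{
    unsigned long r = 0;

    /* (r + 1) <= x / (r + 1) is (r + 1)^2 <= x without overflow. */
    while ((r + 1) <= x / (r + 1))
        r++;
    return r;
}

/* Same sequence of adjustments as badness(), on the model struct. */
static unsigned long model_badness(const struct proc_model *p)
{
    unsigned long points = p->total_vm;
    unsigned long s;

    s = isqrt(p->cpu_time);
    if (s)
        points /= s;
    s = isqrt(isqrt(p->run_time));
    if (s)
        points /= s;

    if (p->nice > 0)
        points *= 2;   /* niced: doubled */
    if (p->superuser)
        points /= 4;   /* superuser: quartered */
    if (p->raw_io)
        points /= 4;   /* direct hardware access: quartered again */
    return points;
}
```

For example, a process with total_vm = 100000, a scaled cpu_time of 16 and a scaled run_time of 256 gets 100000 / 4 / 4 = 6250 points. The same process with a positive nice value gets 12500 points, and as superuser (not niced) it gets 1562 points.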
