NAME
syscall - system calls overview
DESCRIPTION
System calls in the kernel are implemented through a set of switch tables for each emulation type. Each table is generated from the “master” file by sys/kern/makesyscalls.sh through the appropriate rules in the Makefile.
The “master” file is a text file consisting of a list of lines for each system call. A line may be continued onto the next one by ending it with a backslash. Each line is a set of fields separated by whitespace:
number type ...
Where:
- number
- is the system call number;
- type
- is one of:
- STD
- always included;
- OBSOL
- obsolete, not included in the system;
- UNIMPL
- unimplemented, not included in the system;
- NODEF
- included, but don't define the syscall number;
- NOARGS
- included, but don't define the syscall args structure;
- INDIR
- included, but don't define the syscall args structure, and allow it to be "really" varargs;
- COMPAT_XX
- a compatibility system call, only included if the corresponding option is configured for the kernel (see options(4)).
The rest of the line for the STD, NODEF, NOARGS, and COMPAT_XX types is:
{ pseudo-proto } [alias]
pseudo-proto is a C-like prototype used to generate the system call argument list, and alias is an optional name alias for the call. The function in the prototype has to be defined somewhere in the kernel sources, as it will be used as the entry point for the corresponding system call.
For other types the rest of the line is a comment.
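As a rough illustration of what a pseudo-proto turns into, the sketch below shows an argument structure and entry point for a hypothetical pseudo-proto "int sys_example(int fd, size_t len);". The names sys_example and reg_t, and the simplified syscallarg() and SCARG() macros defined here, are assumptions made so that the fragment compiles on its own; the definitions generated for the kernel are more involved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for illustration, not the kernel's definitions. */
typedef long reg_t;				/* stands in for register_t */
#define syscallarg(x)	union { reg_t pad; x datum; }
#define SCARG(p, k)	((p)->k.datum)

struct proc;					/* opaque in this sketch */

/*
 * What an argument structure for the hypothetical pseudo-proto
 *	int sys_example(int fd, size_t len);
 * could look like: one padded slot per prototype argument.
 */
struct sys_example_args {
	syscallarg(int) fd;
	syscallarg(size_t) len;
};

/* Entry point named in the pseudo-proto; it receives the packed arguments. */
int
sys_example(struct proc *p, void *v, reg_t *retval)
{
	struct sys_example_args *uap = v;

	(void)p;				/* unused in this sketch */
	assert(uap != NULL && retval != NULL);

	if (SCARG(uap, fd) < 0)
		return EBADF;			/* reject a negative descriptor */

	retval[0] = (reg_t)SCARG(uap, len);	/* value handed back to the caller */
	return 0;
}
```

Here uap plays the role of the args pointer that the machine dependent code passes to the sy_call entry.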
To generate the header and code files from the “master” file, make(1) has to be run from the directory containing the “master” file.
Usage
Entry from the user space for the system call is machine dependent. Typical code to invoke a system call from the machine dependent sources might look like this:
    const struct sysent *callp;
    register_t args[8], rval[2];
    struct proc *p = curproc;
    int code, nsys, error, orig_error;

    ...

    /* "code" is the system call number passed from the user space */

    ...

    if (code < 0 || code >= nsys)
        callp += p->p_emul->e_nosys;    /* illegal */
    else
        callp += code;

    /* copyin the arguments from the user space */
    ...

    rval[0] = 0;

    /* the following steps are now performed using mi_syscall() */
    #ifdef SYSCALL_DEBUG
    scdebug_call(p, code, args);
    #endif
    #ifdef KTRACE
    if (KTRPOINT(p, KTR_SYSCALL))
        ktrsyscall(p, code, argsize, args);
    #endif
    /* keep the unmapped result in orig_error for the tracing calls below */
    orig_error = error = (*callp->sy_call)(p, args, rval);

    switch (error) {
    case 0:
        /* normal return */
        ...
        break;
    case ERESTART:
        /*
         * adjust PC to point before the system call in the
         * user space, so that on return there the kernel is
         * re-entered and the same system call is repeated
         */
        ...
        break;
    case EJUSTRETURN:
        /* just return */
        break;
    default:
        /*
         * an error returned:
         * call an optional emulation errno mapping
         * routine and return back to the user.
         */
        if (p->p_emul->e_errno)
            error = p->p_emul->e_errno[error];
        ...
        break;
    }

    /* the following steps are now performed using mi_syscall_return() */
    #ifdef SYSCALL_DEBUG
    scdebug_ret(p, code, orig_error, rval);
    #endif
    userret(p);
    #ifdef KTRACE
    if (KTRPOINT(p, KTR_SYSRET))
        ktrsysret(p, code, orig_error, rval[0]);
    #endif
The SYSCALL_DEBUG parts of the code are explained in the Debugging section below. For the KTRACE portions of the code refer to the ktrace(9) document for further explanations.
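The switch-table lookup in the fragment above can be tried out in user space. The following is a minimal sketch, assuming a two-entry table: every demo_ name, the reg_t type, and the DEMO_NOSYS index (standing in for e_nosys) are hypothetical, and the entry structure carries only the two fields needed here, not everything the kernel's struct sysent holds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef long reg_t;			/* stands in for register_t */
struct proc;				/* opaque in this sketch */

/* Cut-down switch table entry: argument count and entry point only. */
struct demo_sysent {
	int	sy_narg;
	int	(*sy_call)(struct proc *, void *, reg_t *);
};

/* Handler used for every illegal system call number. */
static int
demo_nosys(struct proc *p, void *v, reg_t *rval)
{
	(void)p; (void)v; (void)rval;
	return ENOSYS;
}

/* Toy system call: returns the sum of its two arguments in rval[0]. */
static int
demo_add(struct proc *p, void *v, reg_t *rval)
{
	reg_t *args = v;

	(void)p;
	rval[0] = args[0] + args[1];
	return 0;
}

#define DEMO_NOSYS	0		/* plays the role of e_nosys */

static const struct demo_sysent demo_table[] = {
	{ 0, demo_nosys },		/* 0: no such call */
	{ 2, demo_add },		/* 1: demo_add(a, b) */
};
#define DEMO_NSYS ((int)(sizeof(demo_table) / sizeof(demo_table[0])))

/* Select the table entry for "code" and call through it. */
int
demo_dispatch(int code, reg_t *args, reg_t *rval)
{
	const struct demo_sysent *callp = demo_table;

	assert(args != NULL && rval != NULL);

	if (code < 0 || code >= DEMO_NSYS)
		callp += DEMO_NOSYS;	/* illegal */
	else
		callp += code;

	rval[0] = 0;
	return (*callp->sy_call)(NULL, args, rval);
}
```

With args set to { 2, 3 }, calling demo_dispatch(1, args, rval) returns 0 and leaves 5 in rval[0], while an out-of-range code ends up in demo_nosys and returns ENOSYS.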
Debugging
For debugging purposes the line
option SYSCALL_DEBUG
should be included in the kernel configuration file (see options(4)). This allows tracing for calls, returns, and arguments for both implemented and non-implemented system calls. A global integer variable scdebug contains a mask for the desired logging events:
- SCDEBUG_CALLS
- (0x0001) show calls;
- SCDEBUG_RETURNS
- (0x0002) show returns;
- SCDEBUG_ALL
- (0x0004) show even syscalls that are implemented;
- SCDEBUG_SHOWARGS
- (0x0008) show arguments to calls.
Use ddb(4) to set scdebug to the desired value.
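The mask values combine with bitwise OR. The sketch below uses the values listed above and a hypothetical helper, demo_logs_call(), to show how a given mask might select call events. It assumes, as the SCDEBUG_ALL description suggests, that calls to implemented system calls are only logged when that bit is set; the helper is not a kernel function.

```c
#include <assert.h>

/* Values as listed in the Debugging section. */
#define SCDEBUG_CALLS		0x0001	/* show calls */
#define SCDEBUG_RETURNS		0x0002	/* show returns */
#define SCDEBUG_ALL		0x0004	/* show even implemented syscalls */
#define SCDEBUG_SHOWARGS	0x0008	/* show arguments to calls */

/*
 * Hypothetical helper: would a call event be logged under "mask"?
 * "implemented" is non-zero when the system call has a real handler.
 */
int
demo_logs_call(int mask, int implemented)
{
	/* only the four documented bits are expected */
	assert((mask & ~(SCDEBUG_CALLS | SCDEBUG_RETURNS |
	    SCDEBUG_ALL | SCDEBUG_SHOWARGS)) == 0);

	if (!(mask & SCDEBUG_CALLS))
		return 0;		/* call tracing is off */
	if (implemented && !(mask & SCDEBUG_ALL))
		return 0;		/* implemented calls need SCDEBUG_ALL */
	return 1;
}
```

For example, a mask of SCDEBUG_CALLS | SCDEBUG_RETURNS | SCDEBUG_SHOWARGS has the value 0x000b.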
CODE REFERENCES
- sys/kern/makesyscalls.sh
- a sh(1) script for generating C files out of the syscall master file;
- sys/kern/syscalls.conf
- a configuration file for the shell script above;
- sys/kern/syscalls.master
- master file describing names and numbers for the system calls;
- sys/kern/syscalls.c
- system call names lists;
- sys/kern/init_sysent.c
- system call switch tables;
- sys/sys/syscallargs.h
- system call argument lists;
- sys/sys/syscall.h
- system call numbers;
- sys/sys/syscall_mi.h
- machine-independent syscall entry and return handling.
SEE ALSO
HISTORY
The syscall manual page appeared in OpenBSD 3.4.