Cron

Posted by on Tuesday, 7 April, 2009

Technically, cron is just the clock daemon (/usr/sbin/cron or perhaps /usr/sbin/crond) that executes commands at specific times. However, a handful of configuration files and programs go into making up the cron package. Like many system processes, cron never ends.

The controlling files for cron are the cron tables, or crontabs. The crontabs are often located in /var/spool/cron/crontabs. However, on SuSE you will find them in /var/spool/cron/tabs. Each file in this directory is named after the user who submitted the cron jobs in it.

Unlike the cron daemons of some other UNIX dialects, the Linux cron daemon does not sleep until the next cron job is due. Instead, it wakes up once a minute and checks for jobs to run. Also, you should not edit the crontab files directly. You can open them with a text editor like vi, but there is the potential for messing things up. Instead, use the tool that Linux provides: crontab. (See the man-page for more details.)

The crontab utility has several functions. First, it is the means by which files containing the cron jobs are submitted to the system. Second, it can list the contents of your crontab. If you are root, it can also submit and list jobs for any user. The catch is that jobs cannot be submitted individually: using crontab, you must submit all of a user’s jobs at the same time.

At first, that might sound a little annoying. However, let’s take a look at the process of “adding” a job. To add a cron job, you must first list the contents of the existing crontab with the -l option. If you are root and wish to add something to another user’s crontab, use the -u option followed by the user’s logname. Then redirect this output to a file, which you can then edit. (Note that on some systems crontab has an -e option (for “edit”), which will do all the work for you. See the man-page for more details.)

For example, let’s say that you are the root user and want to add something to the UUCP user’s crontab. First, save the existing crontab to a file with this command:

crontab -l -u uucp >/tmp/crontab.uucp

To add an entry, simply add a new line to the file. Save the file, exit your editor, and run the crontab utility again. This time, omit the -l option (which lists the crontab) and give the name of the file instead. The crontab utility can also accept input from stdin, so you could leave off the file name and crontab would let you type in the cron jobs directly. Keep in mind that any previous crontab is replaced no matter which method you use.
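The list, edit, and resubmit cycle can also be scripted. Here is a minimal sketch, assuming you want to append one job line to a saved crontab file without adding it twice. The helper name add_cron_line is made up for this example, and the crontab commands themselves are left as comments because running them replaces a live crontab.

```shell
# add_cron_line FILE LINE
# Append LINE to the saved crontab FILE unless the exact line is already there.
# (Hypothetical helper; it only touches FILE, never the live crontab.)
add_cron_line() {
    acl_file=$1
    acl_line=$2
    touch "$acl_file"
    # -x: match the whole line, -F: treat the * characters literally
    grep -qxF -- "$acl_line" "$acl_file" || printf '%s\n' "$acl_line" >> "$acl_file"
}

# Typical use as root (commented out because it changes uucp's crontab):
#   crontab -l -u uucp > /tmp/crontab.uucp
#   add_cron_line /tmp/crontab.uucp '10 * * * * /usr/lib/uucp/uudemon.poll > /dev/null'
#   crontab -u uucp /tmp/crontab.uucp
```

Because the whole file is resubmitted each time, checking for an existing line first keeps repeated runs from piling up duplicate jobs.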

The file /tmp/crontab.uucp now contains the contents of UUCP’s crontab. It might look something like this:

39,9 * * * * /usr/lib/uucp/uudemon.hour > /dev/null
10 * * * * /usr/lib/uucp/uudemon.poll > /dev/null
45 23 * * * ulimit 5000; /usr/lib/uucp/uudemon.clean > /dev/null
48 10,14 * * 1-5 /usr/lib/uucp/uudemon.admin > /dev/null

Despite its appearance, each crontab entry consists of only six fields. The first five represent the time the job should be executed and the sixth is the actual command. The first five fields are separated by either a space or a tab and represent the following units, respectively:

  • minutes (0-59)
  • hour (0-23)
  • day of the month (1-31)
  • month of the year (1-12)
  • day of the week (0-6, 0=Sunday)

To specify all possible values, use an asterisk (*). You can specify a single value simply by including that one value. For example, the second line in the previous example has a value of 10 in the first field, meaning 10 minutes after the hour. Because all of the other four time fields are asterisks, this means that the command is run every hour of every day at 10 minutes past the hour.

Ranges of values are composed of the first value, a dash, and the ending value. For example, the fourth line has a range (1-5) in the day of the week column, meaning that the command is only executed on days 1-5, Monday through Friday.

To specify different values that are not within a range, separate the individual values with a comma. In the fourth line of the example, the hour field has the two values 10 and 14. This means that the command is run at 10 a.m. and 2 p.m.

Note that times are additive. Let’s look at an example:

10 * 1,16 * 1-5 /usr/local/bin/command

The command is run 10 minutes after every hour on the first and sixteenth, as well as Monday through Friday. If either the first or the sixteenth were on a weekend, the command would still run because the day of the month field would apply. However, this does not mean that if the first is a Monday, the command is run twice.

The crontab entry can be defined to run at different intervals than just every hour or every day. The granularity can be specified to every two minutes or every three hours without having to put each individual entry in the crontab.

Let’s say we wanted to run the previous command not at 10 minutes after the hour, but every ten minutes. We could make an entry that looked like this:

0,10,20,30,40,50 * 1,16 * 1-5 /usr/local/bin/command

This runs every 10 minutes: at the top of the hour, 10 minutes after, 20 minutes after, and so on. To make life easier, we could simply create the entry like this:

*/10 * 1,16 * 1-5 /usr/local/bin/command

This syntax may be new to some administrators. (It was to me.) The slash (/) sets a step: within the given range (in this case *, which covers every minute), the command is run only every so many minutes; in this case, every 10 minutes.

We can also use this even when we specify a range. For example, if the job was only supposed to run between 20 minutes after the hour and 40 minutes after the hour, the entry might look like this:

20-40 * 1,16 * 1-5 /usr/local/bin/command

What if you wanted it to run at these times, but only every three minutes? The line might look like this:

20-40/3 * 1,16 * 1-5 /usr/local/bin/command

To make things even more complicated, you could say that you wanted the command to run every two minutes between the hour and 20 minutes after, every three minutes between 20 and 40 minutes after, then every 5 minutes between 40 minutes after and the hour.

0-20/2,21-40/3,41-59/5 * 1,16 * 1-5 /usr/local/bin/command
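To check how a time field will be read before you submit it, you can mimic the rules in a few lines of shell. The following is only a sketch, not cron's own code: field_matches is a made-up helper that handles the asterisk, single values, ranges, comma lists, and steps for numeric values. It assumes a step counts from the start of its range, and it does not understand day or month names.

```shell
# field_matches EXPR VALUE MIN MAX
# Succeeds (exit 0) if VALUE is selected by the time field EXPR.
# MIN and MAX are the limits of the field, for example 0 and 59 for minutes.
field_matches() {
    fm_rest=$1; fm_value=$2; fm_min=$3; fm_max=$4
    while [ -n "$fm_rest" ]; do
        # Peel off the next comma-separated part.
        fm_part=${fm_rest%%,*}
        if [ "$fm_part" = "$fm_rest" ]; then
            fm_rest=
        else
            fm_rest=${fm_rest#*,}
        fi

        # Split off an optional /step (the default step is 1).
        fm_step=1
        case $fm_part in
            */*) fm_step=${fm_part##*/}; fm_part=${fm_part%%/*} ;;
        esac

        # Work out the range this part covers.
        case $fm_part in
            '*') fm_lo=$fm_min; fm_hi=$fm_max ;;
            *-*) fm_lo=${fm_part%-*}; fm_hi=${fm_part#*-} ;;
            *)   fm_lo=$fm_part; fm_hi=$fm_part ;;
        esac

        # Inside the range and on a step boundary counted from its start?
        if [ "$fm_value" -ge "$fm_lo" ] && [ "$fm_value" -le "$fm_hi" ] &&
           [ $(( (fm_value - fm_lo) % fm_step )) -eq 0 ]; then
            return 0
        fi
    done
    return 1
}

# Example: is minute 46 selected by the last entry above?
#   field_matches '0-20/2,21-40/3,41-59/5' 46 0 59 && echo "runs at :46"
```

Looping over the minutes 0 through 59 with this function is a quick way to list every minute on which an entry would fire.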

One really nice thing that a lot of Linux dialects do is allow you to specify abbreviations for the days of the week and the months. It’s a lot easier to remember that fri is for Friday instead of 5.

With the exception of certain errors in the time fields, errors are not reported until cron runs the command. All error messages and output are mailed to the user. At least that’s what the crontab man-page says and that is basically true. However, as you see in the previous examples, you are redirecting stdout to /dev/null. If you wanted to, you could also redirect stderr there and you would never see whether there were any errors.

Output is mailed to the user because there is no real terminal on which the cronjobs are being executed. Therefore, there is no screen to display the errors. Also, there is no keyboard to accept input. Does that mean you cannot give input to a cron job? No. Think back to the discussion on shell scripts. We can redefine stdin, stdout and stderr. This way they can all point to files and behave as we expect.

One thing I would like to point out is that I do not advocate doing redirection in the command field of the crontab. I like doing as little there as possible. Instead, I put the absolute path to a shell script. I can then test the crontab entry with something simple. Once that works, I can make changes to the shell script without having to resubmit the cronjob.
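A wrapper script of this kind might be built around something like the following sketch. The function name run_logged, the script path, and the log file are all assumptions for illustration. The point is that the redirection lives in the script, so the crontab entry stays a bare path.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Run COMMAND with stdout and stderr appended to LOGFILE and hand back
# COMMAND's exit status. (Hypothetical helper for a cron wrapper script.)
run_logged() {
    rl_log=$1
    shift
    {
        # Write a header line so each run can be found in the log.
        printf '== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@"
    } >> "$rl_log" 2>&1
}

# Inside a wrapper such as /usr/local/bin/nightly.sh (made-up path):
#   run_logged /var/log/nightly.log /usr/local/bin/command
#
# The crontab entry then needs no redirection at all:
#   45 23 * * * /usr/local/bin/nightly.sh
```

Since nothing is written to stdout or stderr by the wrapper itself, cron has nothing to mail, and the log file takes over the job of recording errors.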

Keep in mind that cron is not exact. It synchronizes itself to the top of each minute. On a busy system in which you lose clock ticks, jobs may not be executed until a couple of minutes after the scheduled time. In addition, there may be other processes with higher priorities that delay cron jobs. In some cases (particularly on very busy systems), jobs that are scheduled to run every minute might end up being skipped.

Access to the cron facility is controlled through two files, both in /etc. If you have a file cron.allow, it lists the users who are allowed to use cron. The file cron.deny lists the users who are specifically not allowed to use cron. If neither file exists, only the system users have access. However, if you want everyone to have access, create an empty cron.deny file. In other words, no one is denied access.
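The way the two files interact can be sketched as a small shell function. This is an illustration rather than cron's actual code: cron_access_for is a made-up name, the files are assumed to hold one user name per line, and cron.allow is assumed to take precedence when both files exist. What happens when neither file exists is left to the system, so the function only reports that case.

```shell
# cron_access_for USER [DIR]
# Print "allowed", "denied", or "site-default" for USER, based on the
# cron.allow and cron.deny files in DIR (default /etc).
cron_access_for() {
    ca_user=$1
    ca_dir=${2:-/etc}
    if [ -f "$ca_dir/cron.allow" ]; then
        # cron.allow exists: only the users listed in it may use cron.
        if grep -qxF -- "$ca_user" "$ca_dir/cron.allow"; then
            echo allowed
        else
            echo denied
        fi
    elif [ -f "$ca_dir/cron.deny" ]; then
        # Only cron.deny exists: everyone except the listed users may use cron.
        if grep -qxF -- "$ca_user" "$ca_dir/cron.deny"; then
            echo denied
        else
            echo allowed
        fi
    else
        # Neither file exists: the outcome depends on the system.
        echo site-default
    fi
}
```

With an empty cron.deny and no cron.allow, the function prints allowed for every user, which matches the "no one is denied" case described above.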

It is often useful for root to run jobs as a different user without having to switch users (for example, using the su command). Most Linux dialects provide a mechanism in the form of the /etc/crontab file. This file is typically only writable by root and in some cases, only root can read it (which is often necessary in high security environments). The general syntax is the same as the standard crontabs, with a couple of exceptions.

The first difference is the header, which you can see here:

SHELL=/bin/sh
PATH=/usr/bin:/usr/sbin:/sbin:/bin:/usr/lib/news/bin
MAILTO=root
#
# check scripts in cron.hourly, cron.daily, cron.weekly, and cron.monthly
#
59 *  * * *     root  rm -f /var/spool/cron/lastrun/cron.hourly
14 0  * * *     root  rm -f /var/spool/cron/lastrun/cron.daily
29 0  * * 6     root  rm -f /var/spool/cron/lastrun/cron.weekly
44 0  1 * *     root  rm -f /var/spool/cron/lastrun/cron.monthly

The SHELL variable defines the shell under which each command will run. The PATH variable is like the normal PATH environment variable and defines the search path. The MAILTO variable says who should get email messages, which includes error messages and the standard output of the executed commands.

The structure of the actual entries is pretty much the same, with the exception of the user name (root in each case here). This way, the root user (or whoever can edit /etc/crontab) can define which user executes the command. Keep in mind that this can be a big security hole. If someone can write to this file, they can create an entry that runs as root and therefore has complete control of the system.
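Since the user field decides who each command runs as, it can be worth checking which accounts a system crontab actually uses. The following sketch assumes the layout shown above (five time fields, then the user, then the command); the function name list_system_cron_users is made up for this example.

```shell
# list_system_cron_users FILE
# Print each distinct user named in the sixth field of the job lines in a
# system crontab such as /etc/crontab. (Hypothetical helper.)
list_system_cron_users() {
    awk '
        /^[ \t]*#/ { next }                              # skip comments
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { next }  # skip VAR=value lines
        NF >= 7 { print $6 }                             # 5 time fields, user, command
    ' "$1" | sort -u
}

# Example:
#   list_system_cron_users /etc/crontab
```

Any name in the output that you do not expect is a reason to look at who has been editing the file.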

The next command in the cron “suite” is at. Its function is to execute a command at a specific time. The difference is that once the at job has run, it disappears from the system. As with cron, two files, at.allow and at.deny, control who may use the at program.

The batch command is also used to run commands once. However, commands submitted with batch are run when the system gets around to it, which means when the system is less busy, for example, in the middle of the night. It’s possible that such jobs are spread out over the entire day, depending on the load of the system.

One thing to note is the behavior of at and batch. Both read the commands to run from standard input, not from arguments on their own command line. You first run at or batch, which brings you to a new line where you type the commands you want to execute. After each command, press Enter. When you are done, press Ctrl-D.

Because these two commands read from stdin, you do not have to type the commands in by hand each time. One possibility is to redirect input from a file. For example:

at now +1 hour < command_list

where command_list is a file containing a list of commands. You could also have at (or batch) at the end of a pipe:

cat command_list | at now + 1 hour

cat command_list | batch

Another interesting thing about both at and batch is that they create a kind of shell script to execute your command. When you run at or batch, a file is created in /usr/spool/cron/atjobs. This file contains the system variables that you would normally have defined, plus some other information that is contained in /usr/lib/cron.proto. This essentially creates an environment as though you had logged in.

Last-Modified: 2007-03-07 19:38:50


Partitioning basics

Posted by on Tuesday, 7 April, 2009

Most computer users are aware of the existence of their hard drive, even if they’re not familiar with how it works. It’s hard not to be, since the first things people tend to ask when you get a new computer are how much hard drive space you have, how fast the machine is, and how much memory is included. From there, if you’re used to using Windows, then you’ve probably dealt with the installer letting you break your hard drive into “multiple drives,” probably referred to as C:, E:, F:, and so on.

You may not realize it, but if you’ve done this, you’ve already created partitions. A partition is a virtual drive inside a drive, created through storing information about the drive’s virtual layout in special locations on the drive itself. The system’s BIOS and operating system(s) then utilize this information to determine where to look on the drive for boot instructions and data.

A hard drive may look like a short, rectangular box, but the data is stored on a round section that looks somewhat like a stack of records. The way data gets written to and read from this drive involves drive heads, which are housed on the carriage assembly shown in the image. Each of these electromagnetic heads alters data on a hard drive by manipulating the magnetic media at such a minuscule level that the head turns individual bits on and off.

When a PC boots, once the BIOS finishes loading hardware information, it looks to the very first spot on the hard drive, which is referred to as the Master Boot Record (MBR). How do we find this spot? The data on a hard drive is stored in a series of rings called tracks, and the tracks are subsequently broken up into equally sized pieces called sectors. Tracks are numbered from 0 on up, and the counting starts at the outermost ring. Sectors are numbered from 1 on up.

The MBR is on track 0, sector 1: the very first location on the drive. In this tiny spot on your hard drive, two key pieces of information live. The first is the data on how many partitions you’ve created on the drive, and their vital statistics. The second is vital to whether your machine will boot properly or not without a floppy disk: the MBR contains a pointer to the specific partition or hard drive that has the boot information.

If you’re familiar with Linux installation, then you’ll know that the boot loaders LILO and GRUB place information in the MBR.

When you partition a hard drive, you’re creating a virtual drive within a drive. Rather than dealing with tracks and sectors, though, you’re dealing with cylinders. A hard drive these days is actually a stack of platters with drive heads that work with the top and bottom of each platter. Every one of those platters has identical tracks and sectors, and a cylinder is comprised of the data contained within a particular ring of tracks as shown in Figure 1.
Fig 1

There are BIOS limits on the number of partitions you can have on a single drive in a PC. Perhaps this will change one day, when drives get so large that you might want more than sixteen partitions, but I wouldn’t expect it soon. How you lay out these partitions is important: of those sixteen partitions, only four can be primary; the rest have to be logical. So if you want one to four partitions, make them all primary. If you want five to sixteen, make three of the partitions primary and the fourth a special type of container partition called an extended partition, and then create as many logical partitions as you need (up to a total of sixteen) inside the extended one.

Confused? It doesn’t really make a lot of sense at first glance, but that’s how it works. You might notice that you don’t really get sixteen partitions out of the deal. You get fifteen, since the extended partition is just a container for other partitions; you won’t have any data in it. See Figure 2 for a conceptual view of how these partitions might work.
Fig 2
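The layout rule can be written down as a few lines of shell. This is just a sketch of the rule as described here, using the limit of fifteen usable partitions given above; the function name plan_partitions is made up, and it creates nothing on disk.

```shell
# plan_partitions N
# Print how N usable partitions would be split into primary, extended,
# and logical partitions under the rule described in the text.
plan_partitions() {
    pp_n=$1
    if [ "$pp_n" -lt 1 ] || [ "$pp_n" -gt 15 ]; then
        echo "unsupported: ask for 1 to 15 partitions" >&2
        return 1
    fi
    if [ "$pp_n" -le 4 ]; then
        # Four or fewer: everything can be a primary partition.
        echo "$pp_n primary"
    else
        # More than four: three primaries, plus an extended partition
        # that holds the remaining ones as logical partitions.
        echo "3 primary, 1 extended, $(( pp_n - 3 )) logical"
    fi
}

# Example:
#   plan_partitions 6
```

Note that the extended partition is counted separately in the output, because it holds no data of its own.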

Notice in Figure 4 how there’s this weird label associated with each of the partitions I mentioned. In the Windows world, partitions and hard drives are labeled exactly the same way, starting with C:, then D:, and so on. In the Linux world, each drive also has a letter associated with it, and from there it gets more complicated (but in the end it makes a lot more sense once you understand it).

So let’s start with the hard drives themselves. The primary drive in your machine is drive a, the secondary is b, and so on. Then you have to distinguish between an IDE drive and a SCSI drive. Most people these days use IDE drives in their home computers; if you’re using a SCSI drive, you had to go out of your way to get it, so hopefully you’ll know that’s what you’ve got. Linux sees all IDE drives as hd, and all SCSI drives as sd.

Therefore, if the first drive on your machine is an IDE, then Linux sees that drive as hda. The second IDE drive is hdb, and so on. If your first drive is a SCSI, then it’s seen as sda, the second as sdb, and so on. On a machine where for some odd reason you’ve mixed and matched hardware between SCSI and IDE, Linux will letter the drives in the order that the computer sees them.

Now on to the partitions. Each partition gets a number, just as you saw in Figure 4. No matter what drive we’re talking about, the partitions are simply numbered in order: 1, 2, 3, 4, and so on. The number is then tacked onto the end of the drive reference, so the first partition on the first IDE hard drive is hda1, the second on that drive is hda2, etc. If your second drive is a SCSI, then you’d end up with sdb1, sdb2, sdb3, and so on.

There’s just one more thing in Figure 4 I haven’t explained yet. Understanding it requires wrapping your mind around the fact that everything to Linux is a file. Your hard drive is a file, your partitions are files, your printers are files, your directories are files, your monitor’s a file, and so on.

All devices on your system (pieces of hardware that Linux needs to interact with) have a corresponding file in the /dev directory. This includes both your hard drives and your partitions. This is why, in Figure 4, you see references to /dev/hda1, /dev/hda2, and so on. That’s how the Linux kernel looks at your drives and partitions. Of course, each type of hard drive needs a different driver to interface with the kernel, but the nice thing is that we’re spared those details. We can just use these handy shortcuts.
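The naming scheme is regular enough to write down as a small function. The sketch below simply strings the pieces together as described: hd or sd for the drive type, a letter for the drive, and a number for the partition. The name device_name and its argument order are made up for this example, and the function does not check that the device exists.

```shell
# device_name TYPE DRIVE PARTITION
# TYPE is "ide" or "scsi", DRIVE is the drive's position (1 = first drive),
# PARTITION is the partition number. Prints the matching /dev file name.
device_name() {
    case $1 in
        ide)  dn_prefix=hd ;;
        scsi) dn_prefix=sd ;;
        *)    echo "unknown drive type: $1" >&2; return 1 ;;
    esac
    # Pick the drive letter: 1 -> a, 2 -> b, and so on.
    dn_letter=$(echo abcdefghijklmnopqrstuvwxyz | cut -c "$2")
    printf '/dev/%s%s%s\n' "$dn_prefix" "$dn_letter" "$3"
}

# Example:
#   device_name ide 1 2
```

For a drive as a whole rather than one of its partitions, the partition number is simply left off the end of the name.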

Aside from the drive designators (/dev/hda1, etc.), your partitions are named after where you want the kernel to place them within the filesystem. Unlike Windows where you treat every partition as a different drive, Linux leaves the physical locations out of it and just lets you navigate through directories and files. So your root directory (/) might be on /dev/hda1, your home directories (all stored in /home) might all be on /dev/hda2, and then maybe Fred’s home directory (/home/fred) is even on another machine across the network but you’re using NFS to attach it to this machine as well.

You create Linux partitions during the installation process. How many you make depends on quite a number of factors, but I have some general advice that I tend to give anyone that asks me. Let’s start with the general ones that all Linux machines must have and work from there.

All Linux boxes need a root partition (/) and a swap partition. Typically speaking, you want the boot partition as close to the start of the drive as possible, that is, on the lowest cylinder numbers. If you only used these two partitions, you would therefore put root on the drive first, making root /dev/hda1 and swap /dev/hda2. The swap partition, in general, should be either the same size as your machine’s memory (RAM), or up to twice that size, depending on who you talk to. Root’s size depends on how much room your particular Linux distribution requires.

You can make more than two partitions, and in most cases you should. At the very least, I tend to advise making a boot partition (/boot) as well, and placing that on the drive first. That way, if something in your root partition gets damaged, you’ll still be able to boot the machine to try to fix it. Another popular addition is a home partition (/home), so that if you want to completely reinstall the machine you can wipe everything but leave home untouched. Keep in mind, though, that if you do this, you need to create user accounts in the same order as you had the last time or your permissions will be quite messed up in /home.

When you’re dealing with servers, there are even more partitions you might want to create. A temporary partition (/tmp) will make sure that temp files can’t fill up your filesystem, and also protects the root and boot portions from potential damage since temporary files are changed so often. For the same reason, on a server machine you might want to create a separate /var partition, since that’s where your log files, mail, and other such items are kept and constantly changed.

If you’re using a network and want to keep certain items on a central machine, then you might create a separate /usr partition on the server and then mount it using NFS onto all of your Linux boxes. You might do the same with /home, so everyone has access to their full home directories no matter what machine they log into.

There’s lots more you can do here as well. If you use Samba to access files from Windows, Macintosh, and other machines on your network, then they too will ultimately show up as partitions plugged into your filesystem.

As you can see, Linux deals with drives and partitions quite differently from other non-Unix operating systems. It takes a bit of getting used to, but once you understand the basic concepts life gets a lot easier. The nice thing is that you don’t have to remember what drive you have particular types of data on! There’s still lots more to learn but this primer should get you started on the basics. From there, look into SMB, NFS, mounting, devices, and more.

Dee-Ann LeBlanc has been writing about computers since 1994, when she did her first computer book. Since then, she’s written 10 books, over forty articles, a number of courses, and twelve presentations (which she also presented), with most of these works involving Linux. Her latest book is Linux Routing from New Riders, and you can find out more at http://www.Dee-AnnLeBlanc.com/.

Artwork provided by Bryan Hoff (http://www.themeparkmultimedia.com/)

Last-Modified: 2007-03-07 19:38:50