find is a long-standing UNIX® utility. Its role is to recursively scan one or more directories and find files which match a certain set of criteria in those directories. Even though it is very useful, the syntax is truly obscure, and using it requires a little practice. The general syntax is:
find [options] [directories] [criterion1] ... [criterionN] [action]
If you do not specify any directory, find will search the current directory. If you do not specify criteria, this is equivalent to “true”, thus all files will be found. The options, criteria and actions are so numerous that we will only mention a few of each here. Here are some options:
-xdev: do not search directories located on other file systems.
-mindepth <n>: descend at least n levels below the specified directory before searching for files.
-maxdepth <n>: search for files which are located at most n levels below the specified directory.
-follow: follow symbolic links if they link to directories. By default, find does not follow links.
-daystart: when using tests related to time (see below), take the beginning of the current day as a time stamp instead of the default (24 hours before the current time).
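The depth options are easiest to grasp on a small tree. The following is a minimal sketch, assuming GNU find (which provides -mindepth and -maxdepth); the directory and file names are made up for the demonstration and everything happens in a temporary directory.

```shell
# Build a small throwaway tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/a/b/c"
touch "$demo/top.txt" "$demo/a/one.txt" "$demo/a/b/two.txt" "$demo/a/b/c/three.txt"

# The start directory "." is level 0, so top.txt is at level 1,
# a/one.txt at level 2, and so on.

# At most 2 levels below the start directory.
shallow=$(cd "$demo" && find . -maxdepth 2 -type f | LC_ALL=C sort)

# At least 3 levels below the start directory.
deep=$(cd "$demo" && find . -mindepth 3 -type f | LC_ALL=C sort)

echo "maxdepth 2:"; echo "$shallow"
echo "mindepth 3:"; echo "$deep"

rm -rf "$demo"
```

Note that the depth options are placed before the tests; GNU find prints a warning when they come after one.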
A criterion may be one or more of several atomic tests. Some useful tests are:
-type <file_type>: search for a given type of file. file_type can be one of: f (regular file), d (directory), l (symbolic link), s (socket), b (block mode file), c (character mode file) or p (named pipe).
-name <pattern>: find files whose names match the given pattern. With this option, the pattern is treated as a shell globbing pattern (see Section 3, “Shell Globbing Patterns”).
-atime <n>, -amin <n>: find files which were last accessed n days ago (-atime) or n minutes ago (-amin). You can also specify <+n> or <-n>, in which case the search will be done for files accessed more than n (+n) or less than n (-n) days/minutes ago.
-anewer <a_file>: find files which have been accessed more recently than file a_file was last modified.
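-ctime <n>, -cmin <n>, -cnewer <file>: same as for -atime, -amin and -anewer, but applies to the last time that the status of the file (its permissions, owner, link count and so on) was changed. Modifying the contents of a file also updates this time. To test the last time that the contents of the file were modified, use -mtime <n>, -mmin <n> and -newer <file> instead.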
-regex <pattern>: same as -name, but pattern is treated as a regular expression, and it must match the whole path of the file rather than just its name.
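The tests above can be tried out in a scratch directory. This is only a sketch, assuming GNU find; the file names are invented for the demonstration.

```shell
# Throwaway tree with hypothetical file names.
demo=$(mktemp -d)
mkdir -p "$demo/docs/notes.d"
touch "$demo/docs/index.html" "$demo/docs/old.htm" "$demo/docs/readme.txt"

# -type d: directories only (the start directory itself is included).
dirs=$(cd "$demo" && find . -type d | LC_ALL=C sort)

# -name looks only at the last path component. Quote the pattern so
# that the shell passes it to find unexpanded.
html=$(cd "$demo" && find . -type f -name "*.htm*" | LC_ALL=C sort)

# -regex is matched against the whole path, hence the leading ".*".
txt=$(cd "$demo" && find . -regex ".*\.txt")

# The files were just created, so they were accessed less than 5
# minutes ago (-5) and none was accessed more than 5 minutes ago (+5).
recent=$(cd "$demo" && find . -type f -amin -5 | wc -l)
stale=$(cd "$demo" && find . -type f -amin +5 | wc -l)

echo "$dirs"; echo "$html"; echo "$txt"
echo "recent: $recent, stale: $stale"

rm -rf "$demo"
```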
There are many other tests; refer to find(1) for more details. To combine tests, you can use one of:
<c1> -a <c2>: true if both c1 and c2 are true; -a is implicit, therefore you can type <c1> <c2> <c3> if you want all c1, c2 and c3 tests to match.
<c1> -o <c2>: true if either c1 or c2 is true, or both. Note that -o has a lower precedence than -a, therefore if you want to match files which match criteria c1 or c2 and also match criterion c3, you will have to use parentheses and write ( <c1> -o <c2> ) -a <c3>. You must escape (deactivate) parentheses, as otherwise they will be interpreted by the shell!
-not <c1>: inverts test c1, therefore -not <c1> is true if c1 is false.
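The effect of precedence is easier to see side by side. The sketch below assumes GNU find and uses invented names, including a directory whose name ends in .htm, to show what the parentheses change.

```shell
# Throwaway tree with hypothetical names.
demo=$(mktemp -d)
touch "$demo/a.htm" "$demo/b.html" "$demo/c.txt"
mkdir "$demo/d.htm"      # a directory whose name also ends in .htm

# Without parentheses, -a (implicit here) binds tighter than -o, so
# this reads as:  -name "*.htm"  -o  ( -name "*.html" -a -type f )
# and the directory d.htm is reported too.
loose=$(cd "$demo" && find . -name "*.htm" -o -name "*.html" -type f | LC_ALL=C sort)

# With escaped parentheses, -type f applies to both name tests.
strict=$(cd "$demo" && find . \( -name "*.htm" -o -name "*.html" \) -type f | LC_ALL=C sort)

# -not inverts a test: regular files whose name does not end in .txt.
others=$(cd "$demo" && find . -type f -not -name "*.txt" | LC_ALL=C sort)

echo "loose:";  echo "$loose"
echo "strict:"; echo "$strict"
echo "others:"; echo "$others"

rm -rf "$demo"
```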
Finally, you can specify an action for each file found. The most frequently used are:
-print: just prints the name of each file on the standard output. This is the default action.
-ls: prints on the standard output the equivalent of ls -ilds for each file found.
-exec <command_line>: executes command command_line on each file found. The command line command_line must end with a ;, which you must escape so that the shell does not interpret it. The file position is marked with {}. See the usage examples.
-ok <command>: same as -exec but asks for confirmation for each command.
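As a small illustration of -exec, the sketch below copies matching files into a backup directory. The directory layout and file names are hypothetical and live in a temporary directory.

```shell
# Throwaway tree with hypothetical names.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/backup"
printf 'one\n' > "$demo/src/a.conf"
printf 'two\n' > "$demo/src/b.conf"
touch "$demo/src/notes.txt"

# cp is run once per matching file: {} is replaced by the path of the
# file, and the escaped ; marks the end of the command line.
find "$demo/src" -type f -name "*.conf" -exec cp {} "$demo/backup" \;

copied=$(ls "$demo/backup" | LC_ALL=C sort)
echo "$copied"

rm -rf "$demo"
```

Replacing -exec with -ok in the same command would make find ask for confirmation before each cp.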
The best way to consolidate all of the options and parameters is with some examples. Suppose we want to find all directories in the /usr/share directory. We would type:
find /usr/share -type d
Suppose you have an HTTP server. All your HTML files are in /var/www/html, which is also your current directory. You want to find all files whose contents have not been modified for a month. Because you have pages from several writers, some files have the html extension and some have the htm extension. You want to link these files in the /var/www/obsolete directory. You would type[27]:
find \( -name "*.htm" -o -name "*.html" \) -a -mtime +30 \
     -exec ln {} /var/www/obsolete \;
This is a fairly complex example, and requires a little explanation. The criterion is this:
\( -name "*.htm" -o -name "*.html" \) -a -mtime +30
which does what we want: it finds all files whose names end either in .htm or .html (“\( -name "*.htm" -o -name "*.html" \)”), and (-a) whose contents have not been modified for more than 30 days, which is roughly a month (-mtime +30). Here -mtime tests the last modification of the contents, and +30 selects files for which that modification lies more than 30 days in the past. Note the parentheses: they are necessary here, because -a has a higher precedence. If there weren't any, the expression would have matched all files ending with .htm, plus all files ending with .html which haven't been modified for a month, and the -exec action would only have been run for the latter, which is not what we want. Also note that parentheses are escaped from the shell: if we had put ( .. ) instead of \( .. \), the shell would have interpreted them as its own sub-shell syntax instead of passing them to find. Another solution would have been to put parentheses between double quotes or single quotes, but a backslash here is preferable as we only have to isolate one character.
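To rehearse this kind of command without touching a real web tree, one can build a scratch copy of the layout. The sketch below is only an approximation: it assumes GNU find and GNU touch (for the relative date given to -d), it uses a temporary directory in place of /var/www, and it relies on -mtime +30 because modification times can be set artificially with touch. All file names are invented.

```shell
site=$(mktemp -d)                 # stands in for /var/www
mkdir "$site/html" "$site/obsolete"

# Two files touched just now...
touch "$site/html/new.html" "$site/html/notes.txt"
# ...and three files whose modification time is set 40 days in the past
# (GNU touch accepts relative dates with -d).
touch -d "40 days ago" "$site/html/old.htm" "$site/html/old.html" "$site/html/old.txt"

(
  cd "$site/html" &&
  find . \( -name "*.htm" -o -name "*.html" \) -a -mtime +30 \
       -exec ln {} "$site/obsolete" \;
)

# Only the old .htm and .html files should now be linked.
linked=$(ls "$site/obsolete" | LC_ALL=C sort)
echo "$linked"

rm -rf "$site"
```

Both directories sit under the same temporary directory, so the hard links created by ln stay within one file system.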
And finally, there is the command to be executed for each file:
-exec ln {} /var/www/obsolete \;
Here too you have to escape the ; character from the shell. Otherwise the shell would interpret it as a command separator. If you happen to forget, find will complain that -exec is missing an argument.
A last example: you have a huge directory (/shared/images) containing all kinds of images. You regularly use the touch command to update the times of a file named stamp in this directory, so that you have a time reference. You want to find all JPEG images which are newer than the stamp file, but because you got the images from various sources, these files have extensions jpg, jpeg, JPG or JPEG. You also want to avoid searching in the old directory. You want this file list to be mailed to you, and your user name is peter:
find /shared/images -cnewer /shared/images/stamp \
     -a -iregex ".*\.jpe?g" \
     -a -not -regex ".*/old/.*" \
     | mail -s "New images" peter
Of course, this command is not very useful if you have to type it each time, and you would like it to be executed regularly. A simple way to have the command run periodically is to use the cron daemon as shown in the next section.
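Before scheduling the image search, it can be rehearsed on a scratch tree. The following sketch assumes GNU find (for -iregex and -not) and replaces /shared/images with a temporary directory; the image names are made up, and the pipe to mail is left out so that the result can simply be inspected.

```shell
img=$(mktemp -d)                  # stands in for /shared/images
mkdir "$img/old"

touch "$img/before.jpg"           # exists before the time reference
touch "$img/stamp"                # the time reference
sleep 1                           # make sure later files are strictly newer
touch "$img/a.jpg" "$img/b.JPEG" "$img/c.png" "$img/old/d.jpg"

# Same tests as in the example: changed after stamp, JPEG extension in
# any case, and not located under an "old" directory.
new=$(find "$img" -cnewer "$img/stamp" \
        -a -iregex ".*\.jpe?g" \
        -a -not -regex ".*/old/.*" | LC_ALL=C sort)

echo "$new"

rm -rf "$img"
```

Only a.jpg and b.JPEG should be listed: before.jpg is not newer than stamp, c.png has the wrong extension, and d.jpg lies under old.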
[27] Note that this example requires that /var/www and /var/www/obsolete be on the same file system!