Links
The simplest filing system is ‘flat’, where all files are in the same place. This is not manageable for any significant number of files.
A more familiar structure is probably the tree. This allows files to be categorised, and the same filename can be reused in different directories. The full name of a file therefore really includes the path to it, either from the root of the filestore or relative to a known position.
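As a small illustration of that last point, here is a sketch assuming a POSIX-like shell with `mktemp -d` available. The directory and file names (`work`, `home`, `notes.txt`) are hypothetical and used only for the demonstration.

```shell
# Sketch: the same filename in two directories; the path tells them apart.
top=$(mktemp -d)                       # scratch root for the demonstration
mkdir -p "$top/work" "$top/home"
echo "work notes" > "$top/work/notes.txt"
echo "home notes" > "$top/home/notes.txt"

# Absolute path: spelled out all the way from the root of the filestore.
abs=$(cat "$top/work/notes.txt")

# Relative path: resolved from a known position (the current directory).
cd "$top/home"
rel=$(cat notes.txt)
up=$(cat ../work/notes.txt)            # .. climbs to the parent first

echo "$abs / $rel / $up"
```

The two files share the name `notes.txt` but never clash, because the directory they live in is effectively part of the name.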
Sometimes it is convenient to have a file in more than one place. Rather than copying the file, this may be done by adding a link, which is an alias for a file.
Adding links can cause complications. The structure ceases to be a simple tree and becomes a directed graph. In particular, it is undesirable to form cycles in this graph: recursive searches would never reach the ‘bottom’ of the structure. A particular system may therefore not allow such links to be created.
In Unix, at least, there is more than one way of building a link. These will largely appear to be the same, but there are important differences.
Symbolic links
A symbolic (or ‘soft’) link is a special form of file which is simply a form of pointer to another file. The other file could be a directory; making it point at a directory higher up the hierarchy (i.e. creating a cycle) is probably a bad idea, but shouldn’t be catastrophic.
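To see what such a cycle looks like in practice, the following sketch builds one in a scratch directory. It assumes a POSIX-like shell with `mktemp -d`; the names `top`, `sub` and `up` are hypothetical.

```shell
# Sketch: a symbolic link that points back up the tree creates a cycle.
cd "$(mktemp -d)"
mkdir -p top/sub
ln -s .. top/sub/up                    # "up" resolves to top, the parent of sub

# The cycle can be walked round as many times as you like.
if [ -d top/sub/up/sub/up/sub ]; then looped=yes; else looped=no; fi

# find does not follow symbolic links unless asked to, so it still terminates.
entries=$(find top | wc -l | tr -d ' ')

echo "looped=$looped entries=$entries"
```

A tool that did follow every link blindly would keep descending through `up` for ever, which is why recursive utilities usually leave symbolic links alone by default.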
Exercise: Try this in a Unix terminal:
ln -s `which ls` dir
(The `which ls` command finds and returns the path to your usual `ls` command.) You now have your ‘very own’, Windows-like `dir` command … which simply points at the operating system utility. (You can delete the link again, harmlessly, with `rm dir`.)
A symbolic link, being an alias for an object, does not have to link to anything real.
Exercise: Try this in a Unix terminal:
ln -s this_is_not_a_filename_which_exists qwerty
(you can shorten the filenames). `qwerty` will now appear in your directory but `cat qwerty` will fail (because there is no real file there).
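The same experiment can be scripted so that the link and its target are inspected separately. This is a sketch assuming a POSIX-like shell with `mktemp -d` and `readlink` available; `no_such_file` is a hypothetical name chosen because it does not exist.

```shell
# Sketch: a dangling symbolic link exists, but there is nothing behind it.
cd "$(mktemp -d)"
ln -s no_such_file qwerty              # ln -s does not check the target

if [ -L qwerty ]; then is_link=yes; else is_link=no; fi    # tests the link itself
if [ -e qwerty ]; then resolves=yes; else resolves=no; fi  # follows the link
target=$(readlink qwerty)              # the stored path, not followed

echo "link=$is_link resolves=$resolves target=$target"
```

The link stores only a path. Whether anything lives at that path is checked when the link is used, not when it is made.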
Hard links
To understand hard links it is probably sensible to check up on (e.g.) i-nodes first. Note that an i-node defines a file and the filename is simply associated with that (unique) node number.
When a file is created, an i-node is allocated and a directory entry is made which includes the i-node number. Another way of stating this is: the file is created and a name is linked to it.
When a hard link is created it is a directory entry which links to the i-node – to the file itself, not another name. The link is the file.
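One consequence is that neither name is the ‘original’: data written through one name is read back through the other. A minimal sketch, assuming a POSIX-like shell, `mktemp -d`, and a filing system that supports hard links:

```shell
# Sketch: two directory entries, one file.
cd "$(mktemp -d)"
echo "first line" > xxx
ln xxx yyy                             # second directory entry, same i-node
echo "second line" >> yyy              # write through the new name ...
via_xxx=$(cat xxx)                     # ... and read through the old one
echo "$via_xxx"
```

Both lines appear when reading `xxx`, even though the second was appended via `yyy`.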
Exercise: Try this in a Unix terminal: Choose or create a file (e.g. `xxx`).
- `stat xxx` – you will probably see “Links: 1”, as in there is one directory entry, somewhere, pointing to this file.
- Create a hard link to that file: (e.g.) `ln xxx yyy`.
- `stat xxx` – you will probably see “Links: 2” now.
- `stat yyy` – the same file so the data is the same!
- You can remove either `xxx` or `yyy` – removing both will remove the file though.
Note that the filing system is keeping track of the number of links to an i-node.
- “Deleting a file” which has multiple links will not actually delete the data.
- Deleting a file when that is the only link would leave the i-node isolated; at this point it becomes available for reuse.
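The bookkeeping can be watched directly. The sketch below assumes a POSIX-like shell with `mktemp -d`; `links` is a hypothetical helper that picks the link count out of the second field of `ls -l` output.

```shell
# Hypothetical helper: print the hard link count of a file.
links() { ls -ld "$1" | awk '{print $2}'; }

cd "$(mktemp -d)"
echo "some data" > xxx
one=$(links xxx)                       # one name for the file
ln xxx yyy
two=$(links xxx)                       # two names for the same file
rm xxx                                 # removes a name, not the data
left=$(links yyy)
data=$(cat yyy)

echo "$one $two $left: $data"          # prints "1 2 1: some data"
```

Only when the count reaches zero is there no way left to reach the i-node, and the space can be reclaimed.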
Experiment
Have a look at the output from `ls -la`. The first number is the hard link count. This will be `1` for most data files but more for a directory. Create a new, empty directory and the count for `.` will be `2`: one from the parent and one from itself. The count for `..` is likely to be more because every subdirectory also points at its parent with a hard link. Example:
From the red directory:
drwxr-xr-x 2 jdg apt 4096 Sep 21 16:12 .
drwxr-xr-x 4 jdg apt 4096 Sep 21 16:12 ..
-rw-r--r-- 1 jdg apt 16 Sep 21 16:12 thing
- `.` has two links: one from the parent directory and one from itself.
- `..` (the parent directory) has four links: from its own parent, itself and two subdirectories (we are in one of these).
- `thing` is just a file and has one link – from this directory. It could have more, but hasn’t here.
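The effect of adding a subdirectory can be checked with a short script. This is a sketch assuming a POSIX-like shell with `mktemp -d`; `links` is a hypothetical helper and `red` and `sub` are illustrative names. The counts in the comments are what traditional Unix filing systems report. Some filing systems (Btrfs is one) do not maintain directory link counts this way, so treat the numbers as typical rather than guaranteed.

```shell
# Hypothetical helper: print the hard link count of a file or directory.
links() { ls -ld "$1" | awk '{print $2}'; }

cd "$(mktemp -d)"
mkdir red
before=$(links red)                    # typically 2: the parent's entry, plus red/.
mkdir red/sub
after=$(links red)                     # typically 3: red/sub/.. now points here too

echo "before=$before after=$after"
```

Each new subdirectory contributes one more link to its parent through its own `..` entry.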
Links and i-nodes
Another thing you might check (try `ls -i` for example) is looking at the i-node numbers for soft- and hard-links. Creating a new hard link does not allocate a new i-node; however a soft (symbolic) link is really just a file in its own right and each one has its own i-node.
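A scripted version of that check might look like the following sketch, which assumes a POSIX-like shell with `mktemp -d`. `inode` is a hypothetical helper; it uses `ls -di` so that the symbolic link itself is examined rather than followed. The names `aa`, `hard` and `soft` are illustrative.

```shell
# Hypothetical helper: print the i-node number of a directory entry.
inode() { ls -di "$1" | awk '{print $1}'; }

cd "$(mktemp -d)"
echo data > aa
ln aa hard                             # new name, same i-node
ln -s aa soft                          # new file holding the path "aa"

i_aa=$(inode aa)
i_hard=$(inode hard)
i_soft=$(inode soft)

echo "aa=$i_aa hard=$i_hard soft=$i_soft"
```

The first two numbers match and the third differs; the actual values depend on the filing system.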
Thought exercise
Test your understanding with what happens in each of these sequences.
- Create a file `aa`
- Create a symbolic link to `aa` called `bb`
- Delete `aa`
- Read `bb`
- Create a file `aa`. Straightforward so far!
- Create a symbolic link to `aa` called `bb`. `aa` is a real file; `bb` is a link to that filename.
- Delete `aa`. The file is gone.
- Read `bb`. No chance. The link is there, but it links to nothing.
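This sequence can be run as a script, with one extra step to show that the link follows the name rather than the file. It is a sketch assuming a POSIX-like shell with `mktemp -d`; the final step, recreating `aa`, is an addition to the sequence above.

```shell
# Sketch: a symbolic link is tied to a filename, not to the file's data.
cd "$(mktemp -d)"
echo "original" > aa
ln -s aa bb
rm aa
if cat bb 2>/dev/null; then read1=ok; else read1=failed; fi  # link now dangles

echo "replacement" > aa                # a different file reusing the old name
read2=$(cat bb)                        # the link resolves again

echo "$read1 / $read2"                 # prints "failed / replacement"
```

The second read succeeds, but it returns the new file's contents: the original data went when `aa` was deleted.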
… and is this any different?
- Create a file `aa`
- Create a hard link to `aa` called `bb`
- Delete `aa`
- Read `bb`
- Create a file `aa`. Straightforward so far!
- Create a hard link to `aa` called `bb`. `aa` is still there; `bb` is another name for the file.
- Delete `aa`. The link is removed and the filename `aa` has gone. However as there is still a link to the file the data is retained.
- Read `bb`. … and there it is.
The middle two steps are effectively renaming the file.
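The renaming claim can be checked by comparing i-node numbers before and after. This is a sketch assuming a POSIX-like shell with `mktemp -d`; `inode` is a hypothetical helper built on `ls -di`.

```shell
# Hypothetical helper: print the i-node number of a directory entry.
inode() { ls -di "$1" | awk '{print $1}'; }

cd "$(mktemp -d)"
echo "contents" > aa
before=$(inode aa)
ln aa bb                               # add the new name
rm aa                                  # drop the old one
after=$(inode bb)
data=$(cat bb)
if [ -e aa ]; then old_name=present; else old_name=gone; fi

echo "aa is $old_name, data: $data"
```

The i-node number is unchanged, so `bb` is the very same file under a new name, which is the net effect a rename within one directory would have.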