[David Shayer was a senior engineer on Norton Utilities for
Macintosh 3.0, 4.0, and 5.0. Before that he worked on Public
Utilities, a disk repair program that won the MacUser Editor's
Choice Award, and on Sedit, a low-level disk editor.]
Optimizing Disks Is a Waste of Time
-----------------------------------
by David Shayer <das@sentience.com>
Optimizing disks is a waste of time. There, I said it. The horse
is out of the bag, the cat is out of the barn. So why do so many
people believe that an optimizer is an essential part of any Mac
user's tool kit? And what does it mean to optimize a disk, anyway?
**Background Fragments** -- When you save a file to disk, the file
system looks for an empty space to write the data. If there isn't
a single space large enough, it divides the file among several
smaller spaces. When a file is stored in more than one piece, we
say it's fragmented. Each piece is called a fragment, or an
extent.
A file may be broken into two fragments, or 20 fragments, or 200
fragments. The file system doesn't care; it can handle any number
of fragments equally well. However, reading a fragmented file
takes longer than reading an unfragmented one. The more fragments
in the file, the longer it takes to read. That's because the hard
disk's head must move to each fragment and read each one
separately. Reading a single chunk of data sequentially is
fast, even when the chunk is rather large. But moving the head
from track to track for each fragment is comparatively slow.
(And I mean "comparatively" - we're talking about additional
milliseconds here.)
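To put rough numbers on those milliseconds, here's a back-of-the-envelope sketch in Python. The transfer rate and seek time below are illustrative assumptions, not measurements of any real drive:

```python
# Back-of-envelope estimate of the extra cost of fragmentation.
# Both figures are assumptions for illustration only.

TRANSFER_MB_PER_SEC = 50.0   # assumed sustained read speed
SEEK_MS = 10.0               # assumed average head seek time

def read_time_ms(file_mb, fragments):
    """Rough time to read a file split into `fragments` pieces."""
    transfer_ms = file_mb / TRANSFER_MB_PER_SEC * 1000.0
    seek_ms = fragments * SEEK_MS   # one head move per fragment
    return transfer_ms + seek_ms

# A 10 MB file in one piece vs. twenty pieces:
print(read_time_ms(10, 1))    # 210.0 ms
print(read_time_ms(10, 20))   # 400.0 ms
```

Even at 20 fragments, the penalty under these assumptions is a couple hundred milliseconds, which is consistent with the point above: slower, but rarely something you'd notice.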
The solution to this slowdown? Defragmenting or optimizing. Some
programs claim to defragment a disk, others claim to optimize it,
and a few offer both functions. What's the difference?
Defragmenting combines files that are broken up across multiple
fragments into a single fragment. But defragmenting files solves
only part of the problem, since the free space on a disk is often
split into many pieces, a little here and a little there. In effect,
the free space is fragmented. You may have 5 GB of free space, but
it could be in 5,000 chunks of 1 MB each. The next file saved may
be fragmented, simply because there isn't enough unfragmented free
space. That's where optimizing comes in - it defragments all the
fragmented files _and_ the free space.
Some optimizers also position similar files, such as all the
operating system files, physically next to one another. The claim
is that this speeds up the computer even more, because operating
system files are likely to be accessed together, which prevents
the disk head from needing to move long distances to read the next
file. Although the concept sounds good at first blush, I'm dubious
that this technique creates any perceptible speed increase. Beyond
a few simple cases, it's very difficult to divine in advance which
file the computer will want next.
So optimizing the disk should make your Mac run faster, right?
Well, maybe. If a file you use all the time is fragmented, such as
a key part of the operating system, then defragmenting that file
could really help. But the operating system is usually written to
the disk right after it has been freshly formatted. The disk is
empty, so the operating system is rarely fragmented. If a file you
rarely use is fragmented, such as that QuickTime movie from Aunt
Ethel's birthday party, it doesn't matter as long as you can
access the file - play the movie, in this case - normally. In
short, avoiding fragmentation is helpful only on files that are
accessed constantly.
So where did this cult of disk optimization come from? Back in the
early days of Windows, and DOS before that, PCs used the FAT (File
Allocation Table) file system. Legend has it that the FAT file
system was pretty bad about fragmenting files, so disks quickly
became badly fragmented. Back then, disks - and computers in
general - were extremely slow, especially by today's standards.
With those painfully slow disks and computers, optimizing a disk
could provide noticeable performance improvements. Modern
computers and disks are of course much faster, and they also
have much larger and more sophisticated disk caches, all of
which significantly reduces the impact of a fragmented disk.
When Apple designed the HFS (Hierarchical File System) file system
for the Mac, and later when they replaced HFS with HFS+, they took
special care to try to minimize fragmentation. All hard disks
store data in 512-byte chunks called sectors. FAT, HFS, and HFS+
use larger chunks, called clusters on FAT and allocation blocks
on HFS. One purpose of clusters and allocation blocks is to try to
reduce fragmentation, by storing files in larger pieces. But HFS
goes one step further. When saving a file to disk, the Mac file
system allocates space in even larger chunks, called clumps, in
a further effort to reduce fragmentation.
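To make the arithmetic concrete, here's a small sketch. The allocation block and clump sizes are assumptions chosen for illustration; actual values vary with the disk's size and the file system's settings:

```python
# Illustrative arithmetic for sectors, allocation blocks, and clumps.
# Block and clump sizes are assumed values, not what any given disk uses.

import math

SECTOR = 512                  # bytes per sector
ALLOC_BLOCK = 4 * 1024        # assumed allocation block: 8 sectors
CLUMP = 16 * ALLOC_BLOCK      # assumed clump: 16 blocks (64 KB)

def blocks_needed(file_bytes):
    """Allocation blocks a file occupies (space comes in whole blocks)."""
    return math.ceil(file_bytes / ALLOC_BLOCK)

def clump_allocation(file_bytes):
    """Bytes reserved when space is handed out a clump at a time."""
    return math.ceil(file_bytes / CLUMP) * CLUMP

# A 100 KB file:
print(blocks_needed(100 * 1024))      # 25 allocation blocks
print(clump_allocation(100 * 1024))   # 131072 bytes (two 64 KB clumps)
```

Handing out space in clumps means each extension of a file grabs a larger contiguous run, so a growing file accumulates fragments more slowly than it would if space were granted one block at a time.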
When the Mac file system saves a file, it looks for a free space
large enough to hold the entire file. If there isn't one, it
fills the largest free space available, then the next largest,
and so on, in an effort to keep the number of fragments as small
as possible. HFS never fragments a file if it can avoid doing so.
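That strategy can be sketched as a toy allocator. This is a simplified model of the behavior described above, not Apple's actual code:

```python
# Toy model of the allocation strategy: prefer a single free extent
# that holds the whole file; otherwise fill the largest extents first.

def allocate(size, free_extents):
    """Return the list of fragment sizes used to store `size` blocks."""
    # First choice: some single extent holds the whole file.
    if any(e >= size for e in free_extents):
        return [size]  # one fragment -- no fragmentation at all
    # Otherwise, fill from the largest extents down, which keeps
    # the fragment count as low as possible.
    used = []
    for extent in sorted(free_extents, reverse=True):
        if size <= 0:
            break
        piece = min(extent, size)
        used.append(piece)
        size -= piece
    return used

print(allocate(100, [200, 50, 30]))   # [100] -- fits in one extent
print(allocate(100, [60, 50, 30]))    # [60, 40] -- two fragments
```

Note the consequence: a file fragments only when no single free extent is big enough, which is exactly why free-space fragmentation on a full disk matters so much.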
**Real World Fragmentation** -- There are two things that lead
to disk fragmentation for most people: full disks and email.
Overall, the Mac's HFS+ file system does a good job of keeping
fragmentation to a minimum, assuming a reasonable amount of
free space remains on the disk to use when laying down files in
contiguous chunks. How much free space should you maintain? There
is no set answer, but leaving 20 to 25 percent of a disk free is
a good rule of thumb.
If your 60 GB hard disk has only 5 GB free, that doesn't mean that
you have a single empty space on the disk where the entire 5 GB is
available. Rather, there are dozens, if not hundreds, of smaller
free areas. The largest single chunk of free space may be only 500
MB. When a disk is very full, not only is there less total free
space, but the size of the largest free area becomes much smaller.
Thus the likelihood of fragmentation goes way up.
What kind of files tend to be fragmented? The most likely
candidates are files that grow regularly, with little bits of
data added to them over time. Each time the file system extends
the file, it looks for another piece of free space, and the file
fragments a little more. Various types of files fit this profile,
but the prime candidate is email.
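A toy simulation shows why such files fragment over time. The probability that the space just past the file's end is still free is a made-up parameter, purely for illustration:

```python
# Toy model of incremental growth (purely illustrative): each time the
# file is extended and the space just past its last fragment is already
# occupied by something else, a new fragment is started.

import random

def simulate_growth(extensions, p_adjacent_free=0.2, seed=1):
    """Count fragments after `extensions` appends, where the space
    adjacent to the file's end is free with the given probability."""
    random.seed(seed)
    fragments = 1
    for _ in range(extensions):
        if random.random() >= p_adjacent_free:
            fragments += 1   # adjacent space taken: start a new fragment
    return fragments

print(simulate_growth(100))   # roughly 80 fragments after 100 appends
```

On a busy disk the space after a mailbox file is usually claimed by something else between appends, so the fragment count climbs steadily, much like the 100-plus pieces described below.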
My email program's In box file has been fragmented into more than
100 pieces. Does this matter? No, it still functions perfectly.
Doesn't it slow down my email program? Certainly, but not enough
for me to notice. The main reason people optimize their disk is
to make their Mac run faster. Doubtless it does make using the Mac
somewhat faster, but I've rarely seen a perceptible speed increase
in real-world usage.
**Pros and Cons** -- So increasing the speed of your Mac, even if
the improvement is nearly imperceptible, is one reason to optimize
your hard disk. There is a second reason to consider defragmenting
files. If you suffer a disk crash, disk recovery software has a
harder time recovering badly fragmented files than unfragmented
files, simply because there are more pieces to track down. And
which files are most likely to be fragmented? Email files, which
are also the most likely to have changed recently, and thus the
least likely to be in your last backup. (Obligatory reminder - if
you don't have a recent backup, make one right after you finish
reading this article. Really.)
There are also some good reasons not to optimize, and ironically,
one of them is speed. Optimizers are slow. It takes many hours to
optimize a disk. Does it make sense to tie up your Mac for hours
just to make it respond a second faster when you're opening a
mailbox?
More worrying is the fact that if the optimizer crashes, the disk
could be, to use the technical term, "horked." That's because an
optimizer must move nearly every piece of data on the disk. The
best optimizers use algorithms that make it nearly impossible to
lose data, even if the power goes out in the middle of a long
optimization, but there's always a slim chance of something bad
happening when you let a program move everything on your disk.
The problem is that no program is perfect. Earlier versions of
some optimizers have had bugs that resulted in lost data or
damaged disks. I don't know of any currently shipping optimizers
with these types of catastrophic bugs. But that's not to say a
future version won't contain a bug, or that a current version
won't have trouble when combined with a new version of the Mac
OS. Be careful when you're using optimization software!
**Optimization Advice** -- If you're going to optimize your disk,
be sure to check the disk first with a program like Apple's Disk
First Aid or Disk Utility. A damaged disk could cause even the
best optimizer to crash when it runs across corrupted data or
data in a completely unexpected place.
It's also a good idea to back up your entire disk (or at least
your most important data) first. But once you have a backup, you
could just erase the disk and restore from your backup. Doing
this optimizes the disk as effectively as running any optimizer.
Plus, reformatting a hard disk ensures you have clean directory
structures, and if you reformat it with the option of writing
zeroes to every sector (which takes a long time and isn't
worthwhile unless you've been experiencing odd disk problems),
you'll also make the drive map out any bad blocks it may have
developed. That's why I say my favorite optimizer is Retrospect -
with it you can both protect your data and optimize your disk.
<http://www.dantz.com/products/mac_express/>
Speed Disk, the optimizer in Symantec's Norton Utilities, has some
useful features for analyzing a disk. It rates the overall disk
fragmentation as light, moderate, or severe. It's almost certainly
not worth optimizing a disk unless the fragmentation is severe,
and often not even then. That's because Speed Disk considers a
disk severely fragmented based on a combination of how many files
are fragmented, how fragmented they are, and how fragmented the
b-trees (disk directory structures) are. The last item is what can
make it seem alarmist, because the b-trees act as triggers: if
they're fragmented a certain amount, Speed Disk can automatically
assign the whole disk a severe rating, even if the other files on
the disk wouldn't otherwise generate that rating.
<http://www.symantec.com/nu/nu_mac/>
Speed Disk shows a graph of the files and free space on the disk,
letting you see how badly the free space is fragmented. It also
lists the size of the largest free block, a useful piece of
information to keep in mind because any file larger than that
will, by necessity, be fragmented when it is saved. If you
routinely work with files larger than your largest free block,
optimizing the disk would be advisable.
Lastly, Speed Disk lists all fragmented files and the number of
fragments per file, and it lets you defragment individual files.
Why would you want to do this? HFS+ can track up to eight
fragments in a file's catalog record. If a file has more than
eight fragments, HFS+ creates additional records, called extent
records, to track the extra fragments. Since files with more than
eight fragments require accessing these additional records each
time they are opened, a file with more than eight fragments is
certainly a reasonable candidate to be defragmented, assuming of
course that you access it frequently enough for defragmenting to
make a real difference.
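Based on the description above, the bookkeeping works out like this. The constants simply restate the eight-extent limit just described:

```python
# Rough model of HFS+ fragment tracking: the catalog record holds the
# first eight extents, and each additional extent record holds up to
# eight more.

import math

CATALOG_EXTENTS = 8       # extents stored directly in the catalog record
EXTENTS_PER_RECORD = 8    # extents per overflow extent record

def extent_records_needed(fragments):
    """Extra extent records required for a file in `fragments` pieces."""
    overflow = max(0, fragments - CATALOG_EXTENTS)
    return math.ceil(overflow / EXTENTS_PER_RECORD)

print(extent_records_needed(8))     # 0 -- fits in the catalog record
print(extent_records_needed(9))     # 1 extra record to look up
print(extent_records_needed(100))   # 12 extra records to chase on open
```

Every one of those extra records is another lookup the file system performs when the file is opened, which is why eight fragments is a sensible threshold for deciding whether a file is worth defragmenting.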
There's usually no need for Speed Disk's capability to defragment
individual files. That's because you can usually defragment a file
yourself, by simply duplicating it in the Finder. When the file
system creates the duplicate file, it automatically uses only a
single fragment for the file, assuming there is enough contiguous
free space on the disk. Then you can delete the original and
rename the copy with the original file's name.
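The same trick can be sketched in Python. This is a simplified model of what the Finder does: a plain byte copy doesn't preserve Mac-specific metadata such as resource forks, which Finder duplication handles for you, so treat this as an illustration of the idea rather than a drop-in replacement:

```python
# Sketch of defragmenting a single file by copying it: the fresh copy
# is written in one extent (given enough contiguous free space), then
# the original is deleted and the copy renamed into its place.
# Caution: does not preserve resource forks or Finder metadata.

import os
import shutil

def defragment_by_copy(path):
    """Copy a file, remove the original, and rename the copy."""
    temp = path + ".defrag"
    shutil.copy2(path, temp)   # fresh copy; timestamps preserved
    os.remove(path)            # delete the (fragmented) original
    os.rename(temp, path)      # give the copy the original name
```

The key assumption is the one stated above: the copy lands in a single fragment only if a contiguous run of free space large enough for the whole file exists.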
Alsoft's DiskWarrior 3 offers the unique feature of showing a
graph of a disk's directory, using a color gradient to show items
that are out of order. DiskWarrior's "rebuild" function is usually
used to repair damaged disks, but when used on healthy disks, it
"optimizes" their directories. Although Alsoft calls this feature
"optimization," it's quite different from what all other disk
optimizers do. Other disk optimizers defragment the files on a
disk. DiskWarrior puts the disk's catalog in order.
<http://www.alsoft.com/DiskWarrior/>
The catalog is composed of nodes, which contain records that
correspond to files. The nodes form a tree structure, with all the
nodes linked together in a specific order. The file system tends
to keep the nodes in order. But as files are added to and deleted
from the disk, nodes are likewise created, deleted, and shuffled
around, and they can end up out of order. This is not dangerous,
or even wrong, just not optimal.
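A toy measure of that shuffling: count the links where the next node in tree order isn't the next node on disk, forcing the head to seek. This is purely illustrative and not DiskWarrior's actual metric:

```python
# Toy metric for catalog node ordering (illustrative only): how many
# times does following the nodes in tree order require jumping to a
# non-adjacent position on disk?

def out_of_order_jumps(node_positions):
    """Count links where the next node in tree order is not the
    next node on disk, i.e. where the head must seek."""
    return sum(1 for a, b in zip(node_positions, node_positions[1:])
               if b != a + 1)

print(out_of_order_jumps([0, 1, 2, 3]))   # 0 -- perfectly ordered
print(out_of_order_jumps([0, 5, 1, 9]))   # 3 -- every link needs a seek
```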
DiskWarrior reorders the nodes. In theory, this should make a disk
faster for the same reason defragmenting a file makes it faster,
namely that related information is stored together, so the disk's
head doesn't have to seek to distant sectors when retrieving it.
In the real world, I doubt the speed increase is noticeable,
especially since the file system caches key pieces of the catalog
in memory, making access much faster than when the information
is stored only on the hard disk. DiskWarrior is excellent at
recovering disks with damaged directories, but optimizing a
properly functioning catalog is gratuitous.
**Bottom Line** -- To sum up, for most people, most of the time,
there's simply not enough to be gained from optimizing a disk to
make it worth the bother. There's nothing wrong with optimizing,
and for a severely fragmented disk that is responding slowly when
reading regularly accessed files, it may even be worthwhile. But
in general, it's not necessary and carries a small risk. If you
really want to optimize your disk, the best approach is to make
a backup (with a second backup for safety's sake), reformat your
hard disk, and restore from the backup.