
05-15-11 - SSDs

There's been some recent rumor-mongering about high failure rates of SSDs. I have no idea whether or not it's true, but I will offer a few tips:

1. Never ever defrag an SSD. Your OS may have automatic defrag scheduled; make sure it is off.

2. Make sure you have a new enough driver that supports TRIM. Make sure you aren't running your SSD in RAID or some other configuration that breaks TRIM support on certain drivers/chipsets. (One way to check whether the OS is actually issuing TRIM is sketched after this list.)

3. If you use an Intel SSD you can get the Intel SSD Toolbox; if you have your computer set up correctly, it should not need to do anything, e.g. if you run the "Optimize" it should just finish immediately, but it will tell you if you've borked things up.

4. If you have one of the old SSDs with bad firmware or drivers (not Intel), then you may need to do more. Many of them didn't balance writes properly. Probably your best move is just to buy a new Intel and move your data over (never try to save money on disks), but in the meantime, most of the manufacturers offer some kind of "optimize" tool which will go move all the sectors around. For example, SuperTalent's Performance Refresh Tool is here.

5. Unnecessary writes *might* be killers for an SSD. One thing you can do is check the SMART info on your drive (CrystalDiskInfo works, as does the Intel SSD Toolbox), which will tell you the total amount written over the drive's lifetime. So far my 128 GB drive has seen 600 GB of writes. If you see something like 10 TB of writes, that should be a red flag that data is getting rewritten over and over for no good reason, thrashing your drive. (A sketch of pulling this number out of SMART follows after this list.) So then you might proceed to take some paranoid steps:

Disable virtual memory. Disable Superfetch, the indexing service, etc. Put Firefox's SQLite db files on a RAM disk. Run "filemon" and watch for file writes, see who's doing them, and stop them.

Now it's certainly true that *in theory*, if the SSD's wear levelling is working correctly, you should never have to worry about write endurance with a modern SSD - it's simply not possible to write enough data to overload it; even if you have a runaway app that just sits and writes data all the time, the SSD should not fail for a very long time (see here for example). But why stress the wear leveller when you don't need to? It's sort of like crashing your car into a wall because airbags are very good.
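To put numbers on tips 2 and 5, here's a minimal sketch of the kind of check I mean. It assumes Windows for the TRIM query and that smartmontools' smartctl is installed and on the PATH; the device name and the SMART attribute name/units vary by drive and vendor, so treat the result as a rough figure, not gospel:

```python
# Sketch: check whether the OS is issuing TRIM, and how much has been
# written to the drive over its lifetime.  Assumes Windows for the
# fsutil check and that smartmontools' smartctl is on the PATH.
import subprocess

def trim_enabled():
    # "DisableDeleteNotify = 0" means Windows is sending TRIM commands.
    out = subprocess.run(["fsutil", "behavior", "query", "DisableDeleteNotify"],
                         capture_output=True, text=True).stdout
    return "= 0" in out

def lifetime_writes_gb(device="/dev/sda"):
    # Look for the common Total_LBAs_Written attribute (raw value is in
    # 512-byte sectors on most drives).  Some drives instead report host
    # writes in 32 MiB units under a different attribute name, so this is
    # only a rough number.
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        if "Total_LBAs_Written" in line:
            raw = int(line.split()[-1])
            return raw * 512 / (1024 ** 3)
    return None

if __name__ == "__main__":
    print("TRIM enabled:", trim_enabled())
    print("Lifetime host writes (GB):", lifetime_writes_gb())
```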
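And in the same spirit as "run filemon and see who's doing it", here's a little sketch that just samples per-process write counters (it assumes the psutil Python package; real filemon / Process Monitor gives you far more detail, this just points a finger at the worst offenders):

```python
# Poor man's filemon: sample per-process write counters twice and print
# who wrote the most in between.  Note the counters include all I/O the
# process issued, not just file writes, so treat it as a hint.
import time
import psutil

def top_writers(interval=10.0, top=10):
    before = {}
    for p in psutil.process_iter(["name"]):
        try:
            before[p.pid] = (p.info["name"], p.io_counters().write_bytes)
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            pass

    time.sleep(interval)

    deltas = []
    for p in psutil.process_iter():
        try:
            name, old = before.get(p.pid, (None, None))
            if old is not None:
                deltas.append((p.io_counters().write_bytes - old, name))
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            pass

    for written, name in sorted(deltas, reverse=True)[:top]:
        print(f"{written / 1e6:10.1f} MB  {name}")

if __name__ == "__main__":
    top_writers()
```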


I'm really not a fan of write caching, because I've had too many incidents of crashes causing partial flushes of the cache to corrupt the file system. That may be less of an issue now that crashes are quite rare.

What would really be cool is a properly atomic file system, and then you could cache writes in atoms.

Normally when you talk about an atomic file system you mean that each micro-operation is atomic, e.g. a file delete or file rename or whatever will either commit correctly or not change anything. But it would also be nice to have atomicity at a higher level.
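For a single file you can already fake that micro-operation kind of atomicity on top of today's filesystems with the usual write-temp-then-rename trick; a minimal sketch (the helper name is mine, nothing standard):

```python
# Minimal sketch of an atomic single-file update: write a temp file in
# the same directory, flush it to disk, then rename over the target.
# os.replace is atomic on both POSIX and Windows (Python 3.3+), so a
# crash leaves either the old contents or the new, never a mix.
import os
import tempfile

def atomic_write(path, data: bytes):
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # make sure the data hits the disk first
        os.replace(tmp, path)      # the atomic commit point
        # (for full durability you'd also fsync the directory on POSIX)
    except BaseException:
        os.unlink(tmp)
        raise

if __name__ == "__main__":
    atomic_write("settings.cfg", b"threads=4\n")
```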

For example, when you run a program installer, it should group all its operations into one "changelist", and then the whole changelist is either committed to the filesystem, or, if there is an error somewhere along the line, the whole thing fails to commit and nothing changes.

However, not everything should be changelisted. For example, when you tell your code editor to "save all", it should save the files one by one, so that if it crashes during the save you keep as much as possible. Maybe the ideal thing would be a GUI prompt where you could see "this changelist failed to commit properly, do you want to abort it or commit the partial set?"
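Just to make the changelist idea concrete, here's a rough sketch of what it might look like faked in user space (all the names here are made up). Note the commit is only approximately atomic - a crash in the middle of the rename loop still leaves a partial set on disk, which is exactly why this really wants support from the filesystem itself:

```python
# Rough sketch of a user-space "changelist": stage every write to a temp
# file first, then commit them all with renames at the end.  This is NOT
# truly atomic - a crash during the rename loop can still leave a partial
# commit - which is the point of wanting the filesystem to do it natively.
import os
import tempfile

class Changelist:
    def __init__(self):
        self.staged = []   # list of (temp_path, final_path)

    def write(self, path, data: bytes):
        dirname = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".cl-")
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        self.staged.append((tmp, path))

    def abort(self):
        # Throw away everything that was staged; nothing visible changed.
        for tmp, _ in self.staged:
            try:
                os.unlink(tmp)
            except OSError:
                pass
        self.staged = []

    def commit(self):
        # All the risky work (writing the data) already happened; the
        # commit itself is just a run of renames over the targets.
        for tmp, path in self.staged:
            os.replace(tmp, path)
        self.staged = []

# Usage: an installer stages all its files and only commits if every
# write succeeded; any error before commit() means abort() and no change.
if __name__ == "__main__":
    cl = Changelist()
    try:
        cl.write("readme.txt", b"hello\n")
        cl.write("config.ini", b"[run]\nfast=1\n")
        cl.commit()
    except Exception:
        cl.abort()
        raise
```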

2 comments:

Pinky's Brain said...

A versioning file system with application control of version numbers would make it trivial to implement that higher level of atomicity.

Unfortunately most of the existing ones seem to use automatic snapshotting rather than giving the application control.

Raghar said...

Allowing applications to abuse the filesystem is a great invitation for a virus, or a DRM system which would kill your HD on purpose.

BTW when you have a 200 GB HD, 4 TB of writes per year shouldn't be that uncommon. Movies, anime, various programs that need to be tested - all of them can write TBs per year. (Not to mention virtual memory.)
