From the Unmixed Files of Dr. Alan W. Dove

April 20, 2022

When I started my journalism career 25 years ago, I had some major worries: how to expand my client list, how to keep paying the rent, whether I was even qualified to be doing this kind of work. Amid those existential concerns, seemingly minor issues of file organization didn’t even register. My workflow then was expedient, but not sustainable.

I did all of my writing in a semi-legitimately acquired copy of Microsoft Word, saved the files on my Mac PowerBook in whatever folders seemed convenient at the time, and didn’t think the concept of an archive was relevant to my life. Sure, I knew it would be handy to search old assignments for source contacts and background whenever I revisited a subject, but on a computer that’s easy to do regardless of where the text is or what format it’s in. Or so I thought.

Harsh experience eventually taught me otherwise. After several years in this business, I had a pile of old stories scattered across my hard drive. I’d done a reasonable job keeping backups in case of a hardware crash, but file management was another matter. My growing collection of old notes and stories, a mixture of Word .doc and AppleWorks .cwk files, became progressively harder to search. Microsoft repeatedly altered the .doc format as part of their predatory vendor lock-in strategy, and in 2007 Apple discontinued all support for AppleWorks. Many of the files in my poorly structured archive became inaccessible.

That’s when I started thinking about archival file formats and organization, and implemented several changes to my workflow. My experience since then proves that this intervention worked; I can now search a systematically structured file tree for anything I’ve written since 1998, and all of the files since about 2008 are in a format I’m certain will remain readable for years to come. While my specific approach might not work for everyone, I urge every writer to at least think about it. Having a system and understanding its limitations is more important than the specific system you choose.

First, let’s talk about file formats. In the history of computing, there is only one file format that could be called “archival”: plain text. Often stored with a file extension of .txt, these files are readable by any word processor, but also by dedicated text editors available on every computer ever built. That’s because source code is written in plain text. At the most fundamental level, computing depends on the ability to read and modify these files.

As a result, every operating system comes with at least one text editor: TextEdit on the Mac, Notepad on Windows, Gedit on many Linux distributions. Beyond these basic options, there’s an entire library of editors available. Most are designed for writing computer code, but those can often be customized for prose, and there are a few text editors just for writers. I’ve tried dozens of them in my daily work, and will likely try many more.

Indeed, one great benefit of working in plain text is that there is no barrier to switching tools. Every text editor can open every other text editor’s files. For the past few months I’ve been working in VSCode, the first decent thing Microsoft has built since DOS, but if my mood changes I can switch to one of several other text editors I have installed right now. Writers who want something built specifically for their needs should look at Ghostwriter, or search for another “markdown text editor” for their platform of choice.

Even if my sermon hasn’t persuaded you to switch to plain text, I hope I’ve made the point that file formats matter. The modern Word document format (.docx) is supposed to be a “standard,” so if you trust that to hold true into the future, it’s a reasonable choice. A more stable option might be the truly open OpenDocument format (.odt for text documents) produced by LibreOffice, which has the added benefit of being Free software. LibreOffice can also open most old .doc and .cwk files, rescuing idiots like me who made the mistake of trusting Microsoft and Apple way back when.
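For anyone facing a pile of legacy files, here is a rough Python sketch of a batch rescue. It is not the author's procedure. It assumes LibreOffice's command-line program is reachable as `soffice` on the PATH, which is often not true on a Mac without extra setup. The helper names and the single output folder are hypothetical choices for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source: Path, outdir: Path, target: str = "odt") -> list[str]:
    """Build a headless LibreOffice command converting one file to `target`."""
    return [
        "soffice", "--headless",
        "--convert-to", target,
        "--outdir", str(outdir),
        str(source),
    ]


def convert_legacy_files(folder: Path, outdir: Path, target: str = "odt") -> int:
    """Convert every .doc and .cwk file under `folder`; return how many were attempted."""
    # Assumption: the LibreOffice binary is named `soffice` and is on the PATH.
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice not found on PATH")
    outdir.mkdir(parents=True, exist_ok=True)
    count = 0
    for source in sorted(folder.rglob("*")):
        if source.suffix.lower() in {".doc", ".cwk"}:
            subprocess.run(build_convert_command(source, outdir, target), check=True)
            count += 1
    return count
```

Because every converted file lands in one output folder, two originals with the same name would collide. Run it on a copy of the archive and spot-check the results before trusting them.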

Now, how do we store these files? Modern operating systems do a decent job of searching plain text or XML-based word processing formats, but trust the voice of experience when I say that having them in some kind of systematic hierarchy helps a lot. My system stems from the specific needs of my workflow, which is built entirely around deadlines. Every assignment has a deadline and a client, so that’s how I organize the associated files.

Let’s say I’m working on a story for Science magazine, due two weeks from now. I’d have a folder called 20220504SCI (4 May 2022 deadline for Science). Those who prefer longer names might use 2022-05-04-Science, but having standardized on my current format I’ve stuck with it. In there, my main notes file for the story would be saved as 20220504SCI.md.
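The naming convention is simple enough to express in a few lines. Here is a minimal Python sketch; the function names are hypothetical, and forcing the client code to upper case is an assumption based on the SCI example.

```python
from datetime import date


def project_name(deadline: date, client_code: str) -> str:
    """Return a name like 20220504SCI: the deadline as YYYYMMDD plus a client code."""
    # Assumption: client codes are short and written in upper case, as in "SCI".
    return f"{deadline:%Y%m%d}{client_code.upper()}"


def notes_filename(deadline: date, client_code: str) -> str:
    """The main notes file shares the folder's name, with a Markdown extension."""
    return project_name(deadline, client_code) + ".md"


print(project_name(date(2022, 5, 4), "SCI"))    # 20220504SCI
print(notes_filename(date(2022, 5, 4), "SCI"))  # 20220504SCI.md
```

A side benefit of leading with year, month, and day is that an ordinary alphabetical sort of the folder names is also a chronological sort by deadline.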

The .md extension refers to Markdown, a very simple notation format that’s saved as plain text. Some text editors can display Markdown files in a slightly prettier format, but they’ll open just fine in any text editor.

If there are PDFs, images, or other related files, they’ll all be in that folder as well. I’ll draft the piece itself in another text file with a descriptive title (e.g. SingleCellSequencingStory.md), but because it’s in the same folder, I know which assignment it was for. Interview transcripts will be saved as [DateSourcename].txt (e.g. 20220502Smith.txt for an interview with Dr. Smith on 2 May 2022), again in the main story folder.

At the file system level, all of the projects I’m working on currently are in a folder I’ve creatively named Current. That folder functions as a rudimentary “To Do” list, as anything in it is still in some part of the production process. I also use it to keep track of my accounting. Invoices go in each project’s folder, and a project only leaves the Current folder after the check for it clears. Once the project is done, its folder goes into the PublishedArchive directory, which I’ve further subdivided by year and month.
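The move from Current to the archive could be scripted along these lines. This is only a sketch under stated assumptions: the year and month subfolders are taken from the deadline in the folder name, and the archive is laid out as PublishedArchive/YYYY/MM. The text does not say which date or exact layout is used. The function names are hypothetical, and checking that the check has cleared stays a human job.

```python
import shutil
from pathlib import Path


def archive_destination(project: Path, archive_root: Path) -> Path:
    """Derive <archive_root>/YYYY/MM/<project> from the folder's date prefix."""
    name = project.name
    # Assumption: the folder name starts with the deadline as YYYYMMDD.
    if len(name) < 8 or not name[:8].isdigit() or not 1 <= int(name[4:6]) <= 12:
        raise ValueError(f"{name!r} does not start with a YYYYMMDD date")
    return archive_root / name[:4] / name[4:6] / name


def archive_project(project: Path, archive_root: Path) -> Path:
    """Move a finished project folder out of Current and into the archive."""
    dest = archive_destination(project, archive_root)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(project), str(dest))
    return dest
```

Calling `archive_project(Path("Current/20220504SCI"), Path("PublishedArchive"))` would leave the project at PublishedArchive/2022/05/20220504SCI, with its notes, transcripts, and invoice still together.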

Now, if I’m working on a piece about a topic I’ve covered before, I can find all of my prior work on it very quickly. Initial search results will tell me where the files are, and just from their names I’ll know exactly when and where the story appeared. Alternatively, if a client or prospective client emails saying they really liked such-and-such piece that I wrote a couple of years ago, I can note the date of that piece, open the appropriate archive folder, and be looking at all of the details within seconds. The story draft, interview transcripts, and my original notes will open in my current favorite text editor, whatever it is.
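The operating system's own search handles this well, but the idea can also be sketched in Python to show how much information the folder names carry. The function names, the restriction to .md and .txt files, and the all-letters client code pattern are assumptions for illustration.

```python
import re
from pathlib import Path

# Assumption: project folders are named YYYYMMDD followed by a letters-only client code.
NAME_PATTERN = re.compile(r"^(\d{4})(\d{2})(\d{2})([A-Za-z]+)$")


def search_archive(root: Path, term: str):
    """Yield (project_folder_name, file_path) for each text file mentioning `term`."""
    needle = term.lower()
    for path in sorted(root.rglob("*")):
        if path.suffix.lower() not in {".md", ".txt"} or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text.lower():
            # Files sit directly in the project folder, so the parent names the assignment.
            yield path.parent.name, path


def describe(folder_name: str) -> str:
    """Turn a name like 20220504SCI into '2022-05-04, client SCI'."""
    match = NAME_PATTERN.match(folder_name)
    if not match:
        return folder_name
    year, month, day, client = match.groups()
    return f"{year}-{month}-{day}, client {client}"
```

Looping over `search_archive(Path("PublishedArchive"), "sequencing")` and printing `describe(folder)` for each hit would list the deadline and client for every assignment that mentions the term.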

Having an organized, accessible archive didn’t magically generate more clients, pay the bills, or dispel my impostor syndrome, but it did make many aspects of the work itself easier. It’s also a transition that, done properly, only needs to be done once. Now I can spend more time on the big worries.
