File Organization

File Organization: Naming

Overview

Teaching: 30 min
Exercises: 10 min
Questions
  • What are the common file organization errors?

  • What are best practices for file organization?

Objectives
  • Highlight common SNAFUs

  • Learn to employ unit testing

Names matter

NO

myabstract.docx
Joe’s Filenames Use Spaces and Punctuation.xlsx
figure 1.png
fig 2.png
JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt

YES

2014-06-08_abstract-for-sla.docx
joes-filenames-are-getting-better.xlsx
fig01_scatterplot-talk-length-vs-interest.png
fig02_histogram-talk-attendance.png
1986-01-28_raw-data-from-challenger-o-rings.txt

Three principles for (file) names:

  1. Machine readable
  2. Human readable
  3. Plays well with default ordering

### Awesome file names :)


### Machine readable


#### Globbing

[Figure: excerpt of the complete file listing]


[Figure: example of globbing to narrow the file listing]
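
As a rough illustration of globbing, here is a minimal Python sketch. It assumes a hypothetical directory listing built from the "YES" names earlier in this lesson, and `glob_names` is a helper invented for the example; it matches names in memory rather than reading a real directory.

```python
from fnmatch import fnmatchcase

# Hypothetical listing, reusing the "YES" filenames from earlier in this lesson.
filenames = [
    "2014-06-08_abstract-for-sla.docx",
    "joes-filenames-are-getting-better.xlsx",
    "fig01_scatterplot-talk-length-vs-interest.png",
    "fig02_histogram-talk-attendance.png",
    "1986-01-28_raw-data-from-challenger-o-rings.txt",
]


def glob_names(names, pattern):
    """Return the names that match a shell-style glob pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Narrow the listing to the figure files only.
figures = glob_names(filenames, "fig*.png")
print(figures)
```

On a real directory, `pathlib.Path(".").glob("fig*.png")` applies the same kind of pattern to the files on disk.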


[Figure: the same search using the Mac OS Finder search facilities]


[Figure: the same search using a regex in R]
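
For readers working in Python rather than R, a comparable regex filter might look like the sketch below. The listing is hypothetical (it reuses the "YES" names from earlier in this lesson), and the two patterns are only one reasonable way to express "figure files" and "files with a date prefix".

```python
import re

# Hypothetical listing, reusing the "YES" filenames from earlier in this lesson.
filenames = [
    "2014-06-08_abstract-for-sla.docx",
    "joes-filenames-are-getting-better.xlsx",
    "fig01_scatterplot-talk-length-vs-interest.png",
    "fig02_histogram-talk-attendance.png",
    "1986-01-28_raw-data-from-challenger-o-rings.txt",
]

# "fig", two digits, an underscore, a description, then ".png".
figure_pattern = re.compile(r"^fig\d{2}_.+\.png$")
# A YYYY-MM-DD prefix followed by an underscore.
date_pattern = re.compile(r"^\d{4}-\d{2}-\d{2}_")

figure_files = [name for name in filenames if figure_pattern.search(name)]
dated_files = [name for name in filenames if date_pattern.search(name)]

print(figure_files)
print(dated_files)
```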


#### Punctuation

Deliberate use of “-” and “_” allows recovery of metadata from the filenames.

[Figures: recovering metadata from filenames in R]

This happens to be R, but the same is possible in the shell, Python, etc.
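
Since Python is mentioned as an alternative, here is a minimal sketch of recovering metadata by splitting on the delimiters. It assumes a naming convention in which “_” separates the date from the descriptive part and “-” separates words within the description; `parse_dated_name` is a hypothetical helper, not something from the original lesson.

```python
from pathlib import Path


def parse_dated_name(filename):
    """Split a 'YYYY-MM-DD_some-descriptive-slug.ext' name into its parts.

    Assumes "_" separates the date from the slug and "-" separates
    the words inside the slug.
    """
    path = Path(filename)
    date, slug = path.stem.split("_", 1)
    return {
        "date": date,
        "words": slug.split("-"),
        "extension": path.suffix,
    }


info = parse_dated_name("2014-06-08_abstract-for-sla.docx")
print(info)
# {'date': '2014-06-08', 'words': ['abstract', 'for', 'sla'], 'extension': '.docx'}
```

The split only works because the delimiters are used consistently; a name with spaces or stray underscores would need special handling.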


#### Recap: machine readable

### Human readable


#### Example

Which set of file(name)s do you want at 3 a.m. before a deadline?

[Figure: the sets of file(name)s being compared]


#### Embrace the slug


#### Recap: Human readable

Easy to figure out what the heck something is, based on its name


### Plays well with default ordering

#### Examples

[Figure: filenames in chronological order]


Logical order: Put something numeric first


Dates: Use the ISO 8601 standard for dates: YYYY-MM-DD
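
To see why YYYY-MM-DD plays well with default ordering, consider the sketch below. It sorts a few hypothetical filenames as plain strings; two names come from the "YES" list earlier in this lesson, and the rest (including the MM-DD-YYYY variants) are invented for contrast.

```python
from datetime import date

# ISO 8601 prefixes: plain string sorting is also chronological sorting.
iso_names = [
    "2014-06-08_abstract-for-sla.docx",
    "1986-01-28_raw-data-from-challenger-o-rings.txt",
    "2014-01-15_hypothetical-notes.txt",  # invented for this example
]
iso_sorted = sorted(iso_names)
print(iso_sorted)

# Check that the sorted order really is chronological.
iso_dates = [date.fromisoformat(name.split("_")[0]) for name in iso_sorted]
print(iso_dates == sorted(iso_dates))  # True

# Hypothetical MM-DD-YYYY prefixes: string sorting groups by month, not by year.
us_names = [
    "06-08-2014_abstract.docx",
    "01-28-1986_raw-data.txt",
    "01-15-2014_notes.txt",
]
us_sorted = sorted(us_names)
print(us_sorted)  # the 2014 file lands before the 1986 file
```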


[Figure: from Twitter]


Left pad other numbers with zeros

If you don’t left pad, you get this:

 10_final-figs-for-publication.R
 1_data-cleaning.R
 2_fit-model.R

which is just sad :(
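
The ordering above can be reproduced, and repaired, with a short Python sketch. `pad_prefix` is a hypothetical helper that assumes every name starts with an integer followed by “_”; the width of 2 is an arbitrary choice that suits up to 99 files.

```python
# The unpadded names from the listing above.
unpadded = [
    "1_data-cleaning.R",
    "2_fit-model.R",
    "10_final-figs-for-publication.R",
]

# Default (string) ordering puts 10 before 1 and 2.
print(sorted(unpadded))
# ['10_final-figs-for-publication.R', '1_data-cleaning.R', '2_fit-model.R']


def pad_prefix(name, width=2):
    """Left pad the leading number of 'N_description' with zeros."""
    number, rest = name.split("_", 1)
    return f"{int(number):0{width}d}_{rest}"


padded = sorted(pad_prefix(name) for name in unpadded)
print(padded)
# ['01_data-cleaning.R', '02_fit-model.R', '10_final-figs-for-publication.R']
```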


#### Recap: Plays well with default ordering

### Recap


#### Three principles for (file) names

  1. Machine readable
  2. Human readable
  3. Plays well with default ordering

Pros


Go forth and use awesome file names :)


Key Points