Wednesday, March 23, 2011

Editing SMOKE input data text files

Editing SMOKE input data text files can be a long process that sucks and is prone to mistakes. Or it can be a medium-length process that doesn't suck and is way less prone to mistakes.

Here's how to go with that second method:

1. sftp/scp a copy of the file you wanna edit to your desktop. (Ex: get ptinv_ptipm_cap2002v2_02apr2007_v4_orl.txt)

2. Import that sucker into Excel:

2a. Tell Excel it's a text file and find it on your computer. Then say it's delimited, and start the import at line 16 or wherever the data actually starts after the header. There's a little preview window so you don't mess it up.

2b. Tell it that the delimiter is a comma, and that text has no identifier ({none} instead of "). Again there's a preview window, so make sure the columns are in the right places and there are still quotes around the text. Then just click Finish (but see 2c in the comments below first).

3. You can do all sorts of Excel stuff with the data then. For example, I sorted by the longitude column and then deleted all the point sources east of SAGE (longitude -89.415 degrees).

4. Then go File>>Save As and choose .csv as your file format and save it someplace.

5. Then find that file on your computer and Open With TextEdit.

6. It'll open, but will have a million (actually 3) quotation marks everyplace there should be just one. So go Edit>>Find, and do a Find/Replace of 3 quotes (""") with 1 quote ("). It takes a second and then looks way better.

7. Go to the right directory on NOX (or whatever server) and create a new file by running vi "newfile.txt"

8. Press i to go into insert mode. Then do a Select All and Copy on the window showing the .csv file on your computer, and Paste into your vi window. It takes a little while but eventually it all gets there.

9. Then if you like you can copy the header from the original file and paste it at the top of this file too, so that it looks identical to your original (other than the data you changed).

So now you can do all sorts of stuff like zero-out certain pollutants, or certain sources. Or multiply emissions by a certain factor. Whatever Excel will let you do, really.
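For a simple edit like the longitude cut in step 3, the same result can be sketched in Python instead of Excel. This is not the Excel method described above, just a minimal alternative under some assumptions: the function name drop_sources_east_of, the header line count, and the longitude column position are all hypothetical here and would need to match your actual file. The sample rows are made up and are not real inventory data.

```python
import csv

def drop_sources_east_of(lines, n_header_lines, lon_col, cutoff_lon):
    """Return the header lines plus the data lines at or west of cutoff_lon.

    Kept lines are passed through unchanged, so quotes, trailing commas
    and leading zeros stay exactly as they were in the input.
    """
    kept = list(lines[:n_header_lines])
    for line in lines[n_header_lines:]:
        if not line.strip():
            continue  # skip blank lines
        # Parse a copy of the line only to read the longitude field.
        fields = next(csv.reader([line]))
        if float(fields[lon_col]) <= cutoff_lon:
            kept.append(line)
    return kept

# Made-up sample: one header line, then two sources.
# Column 3 as the longitude column is an assumption for this example.
sample = [
    "#HEADER line (made up)",
    '"01","0001","Plant A",-90.10,43.00,12.5,',
    '"01","0002","Plant B",-88.00,43.10,3.0,',
]

result = drop_sources_east_of(sample, n_header_lines=1, lon_col=3,
                              cutoff_lon=-89.415)
print("\n".join(result))
```

Because the kept lines are never rewritten, there are no triple quotes to clean up afterward and the header stays in place. Anything that changes values (like multiplying emissions by a factor) would need the fields written back out, which this sketch does not attempt.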

2 comments:

  1. Oh yeah. I forgot something that is way important:

    2c. Click "next" instead of "finish" and make sure that all columns are being imported as text, even if they are actually numbers. This keeps leading 0's around for the ID numbers that need to have a fixed number of digits or SMOKE freaks out.

  2. right. and another thing:

    6a. Make sure that whatever file you are producing looks just like the body of the one it will replace. If there are trailing commas just hanging out in the original file, for example, you can do something in Excel to recreate that. I just typed in "qqqqq" in the first row in any empty boxes, selected the entire rest of the column and pasted. Then, just after replacing the mysterious triple " marks with single ones, I replaced all instances of "qqqqq" with nothing.

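The triple quotes from step 6 look mysterious, but they are probably just standard CSV escaping. Here is a small Python sketch of what seems to be going on. It uses Python's csv module as a stand-in for Excel, on the assumption that Excel escapes quotes the same way when it saves a .csv file. The sample line is made up.

```python
import csv
import io

original = '"0123","Plant A",5'

# Like step 2b: no text identifier, so the quote marks stay inside each field
# and every field stays text (the ID keeps its leading zero, as in 2c).
fields = next(csv.reader([original], quoting=csv.QUOTE_NONE))

# Like step 4: a standard CSV writer wraps any field containing a quote mark
# in quotes and doubles the quotes inside it, which gives three in a row.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(fields)
saved = buffer.getvalue().rstrip("\n")

# Like step 6: replace each run of three quotes with one.
repaired = saved.replace('"""', '"')

print(fields)
print(saved)
print(repaired)
```

In this sketch the repaired line comes out identical to the original one, which is the goal from 6a: the new file body should look just like the one it replaces.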