Wednesday, April 14, 2010

Bulk COPY a CSV into PostgreSQL, skipping first row

Looked for a solution to this yesterday and couldn't find it. Asked my Linux guru Jeremy today and he had an easy solution, so this might be helpful to others.

The scenario is you have a big CSV file, and you want to bulk copy it into PostgreSQL, but the first row of the file isn't data, it's got the column names in it. In my case, the text file is 65 Megs so it's not like you can just edit it in a text editor to delete the offending line. (The data happens to be the combined US and Canada zip/postal code database from ZipInfo.com, fyi.)

SQL Server has a bulk insert GUI that lets you specify a start row. Needed that functionality here.

Solution:

Use wc to find out how many rows are in your file:

$ wc ZCUG.TXT
872135 1871133 69105493 ZCUG.TXT


wc prints the line, word, and byte counts, in that order. The first number returned, in my case 872135, is the number of rows in the file. Subtract one and tail that many lines, writing the output to a new file:

$ tail -n 872134 ZCUG.TXT > ZCUG-trimmed.txt

Boom! A new file without the row of column names.
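
If you'd rather not do the subtraction by hand, here is a minimal sketch of two ways to let the shell handle it. The file sample.txt and its contents are made up for illustration; swap in your own file name. The first command repeats the wc-then-tail approach with the arithmetic done inline, and the second uses tail's "+N" form, which starts output at line N instead of counting back from the end.

```shell
# Hypothetical sample file standing in for ZCUG.TXT (header row plus two data rows)
printf 'ZIP,CITY\n10001,NEW YORK\nK1A0B1,OTTAWA\n' > sample.txt

# Same wc-then-tail approach, with the subtraction done by the shell
total=$(wc -l < sample.txt)
tail -n "$((total - 1))" sample.txt > sample-by-count.txt

# Alternative: "+2" tells tail to start printing at line 2, skipping only the header
tail -n +2 sample.txt > sample-trimmed.txt

# The two outputs should be identical; cmp exits 0 and prints nothing when they are
cmp sample-by-count.txt sample-trimmed.txt
```

The "+2" form doesn't need the line count at all, so it is probably the easier one to remember. Also worth knowing: if you load the file in CSV mode, PostgreSQL's COPY accepts a HEADER option that skips the first line, which avoids creating a second file.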

Tuesday, March 30, 2010

owasp-esapi-python configuration

I tried to send this issue to the esapi-python mailing list (after subscribing) but it doesn't look like that is a functioning list. So any help with the following would be greatly appreciated.


Hi!

Thanks for your work on owasp-esapi-python! I am trying to integrate it into a project and will certainly spread the word to help drum up support for this as I make headway.

I've run into an issue during configuration:

When I run this in the Python 2.6 interactive shell, it returns a single line of output...

>>> from esapi.core import ESAPI
>>> ESAPI.encryptor().gen_keys()
Creating new keys in /esapi/keyring/

The documentation leads me to believe that it should also output an Encryptor_MasterSalt but, if it's supposed to do that here, it isn't doing so for me. Let me know what info I can provide. This is on Ubuntu 9.10.

Thanks in advance,
- Steve