Handling UTF-8-sig for encoding? #102

Hi all,

I was recently trying to use `csvlink` to filter the data from two well-formatted data sets. I tried to follow the documentation, but it was not working at all. I repeatedly got the following error despite the field "sample" being in both my files.

`csvlink: error: Could not find field 'sample' in input`

Ultimately, I was able to dump the CSV and noticed my header was printing as `\ufeffsample`, which led me to figure out this was a byte order mark (BOM) issue. I made the following change to csvlink.py and the code ran for me.

```
-                self.input_1 = open(self.configuration['input'][0], encoding='utf-8').read()
+                self.input_1 = open(self.configuration['input'][0], encoding='utf-8-sig').read()
             except IOError:
                 raise self.parser.error("Could not find the file %s" %
                                    (self.configuration['input'][0], ))

             try:
-                self.input_2 = open(self.configuration['input'][1], encoding='utf-8').read()
+                self.input_2 = open(self.configuration['input'][1], encoding='utf-8-sig').read()
``` 
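
For anyone who wants to reproduce the underlying behavior outside of `csvlink`, here is a minimal sketch using only the Python standard library. The sample bytes and the `header` helper are made up for illustration and are not part of `csvlink`. It shows that decoding with `utf-8` leaves the BOM glued to the first column name, while `utf-8-sig` strips it.

```python
import csv
import io

# Illustrative CSV bytes that start with a UTF-8 BOM (EF BB BF).
raw = b"\xef\xbb\xbfsample,value\nA,1\n"


def header(data, encoding):
    """Hypothetical helper: decode raw CSV bytes and return the header row."""
    text = data.decode(encoding)
    return next(csv.reader(io.StringIO(text)))


plain = header(raw, "utf-8")     # BOM stays attached to the first field name
sig = header(raw, "utf-8-sig")   # BOM is stripped while decoding

print(repr(plain[0]))  # '\ufeffsample'
print(repr(sig[0]))    # 'sample'

# utf-8-sig also decodes input that has no BOM at all.
no_bom = header(b"sample,value\nA,1\n", "utf-8-sig")
print(no_bom)          # ['sample', 'value']
```

Since `utf-8-sig` reads files both with and without a BOM, using it for reading should not break inputs that already work with `utf-8`.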

