This repository was archived by the owner on Jan 22, 2019. It is now read-only.

Two double quotes in a column cause an "Unexpected character" exception #151

@youribonnaffe

I have a CSV file with the following content (just a limited extract here):

route_id,agency_id,route_short_name,route_long_name,route_desc,route_type,route_url,route_color,route_text_color
OCE669711,OCESN,"",""Cars Réguliers ""L 11""  (Nantes - St Gilles Croix de Vie)"",,3,,,

Parsing this CSV content with CsvMapper causes the following error:

com.fasterxml.jackson.core.JsonParseException: Unexpected character ('C' (code 67)): Expected separator ('"' (code 34)) or end-of-line
 at [Source: java.io.StringReader@279ad2e3; line: 2, column: 23]

	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1702)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:558)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:456)
	at com.fasterxml.jackson.dataformat.csv.CsvParser._reportUnexpectedCsvChar(CsvParser.java:1089)
	at com.fasterxml.jackson.dataformat.csv.impl.CsvDecoder._nextQuotedString(CsvDecoder.java:838)
	at com.fasterxml.jackson.dataformat.csv.impl.CsvDecoder.nextString(CsvDecoder.java:601)
	at com.fasterxml.jackson.dataformat.csv.CsvParser._handleNextEntry(CsvParser.java:678)
	at com.fasterxml.jackson.dataformat.csv.CsvParser.nextFieldName(CsvParser.java:575)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:505)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:362)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:27)
	at com.fasterxml.jackson.databind.MappingIterator.nextValue(MappingIterator.java:277)
	at com.fasterxml.jackson.databind.MappingIterator.readAll(MappingIterator.java:317)
	at com.fasterxml.jackson.databind.MappingIterator.readAll(MappingIterator.java:303)
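
My reading of the message ("Expected separator or end-of-line") is that the leading `""` of the fourth column is taken as an opening quote immediately followed by a closing quote, so the `C` that follows is rejected. The sketch below is only an illustration of that strict quoting rule, not Jackson's actual `CsvDecoder` logic; the class and method names are made up for this example, and it assumes `,` as the separator and `"` as the quote character.

```java
final class StrictQuotedFieldDemo {

    /**
     * Returns the 0-based index of the first character that breaks strict
     * quoting (a closing quote not followed by a separator or end of line),
     * the line length for an unterminated quoted field, or -1 if none is found.
     */
    static int firstQuotingError(String line) {
        int i = 0;
        int n = line.length();
        while (i < n) {
            if (line.charAt(i) == '"') {
                i++; // skip the opening quote
                while (true) {
                    if (i >= n) {
                        return n; // quoted field never closed
                    }
                    if (line.charAt(i) == '"') {
                        if (i + 1 < n && line.charAt(i + 1) == '"') {
                            i += 2; // doubled quote inside a quoted field: escaped quote
                            continue;
                        }
                        i++; // closing quote
                        break;
                    }
                    i++;
                }
                if (i < n && line.charAt(i) != ',') {
                    return i; // something other than a separator after the closing quote
                }
            } else {
                // unquoted field: read up to the next separator
                while (i < n && line.charAt(i) != ',') {
                    i++;
                }
            }
            i++; // skip the separator
        }
        return -1;
    }

    public static void main(String[] args) {
        String row = "OCE669711,OCESN,\"\",\"\"Cars Réguliers \"\"L 11\"\"  (Nantes - St Gilles Croix de Vie)\"\",,3,,,";
        int pos = firstQuotingError(row);
        System.out.println(pos + " -> " + row.charAt(pos));
    }
}
```

On the data row above, this scanner stops at the `C` of `Cars`, which is the same character the exception complains about.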

Here is a unit test to reproduce the issue:

    @Test
    public void doubleQuotes() throws Exception {
        String content =
                "route_id,agency_id,route_short_name,route_long_name,route_desc,route_type,route_url,route_color,route_text_color\n" +
                        "OCE669711,OCESN,\"\",\"\"Cars Réguliers \"\"L 11\"\"  (Nantes - St Gilles Croix de Vie)\"\",,3,,,";

        CsvSchema schema = CsvSchema.emptySchema().withHeader();
        MappingIterator<Map<String, String>> it = new CsvMapper().readerFor(Map.class)
                .with(schema)
                .readValues(content);

        assertEquals(1, it.readAll().size());
    }

Is there a way to configure the parser to be more lenient about this use of quotes?
Unfortunately, the CSV file is not under my control, so I cannot change its format.

Parsing this file with OpenCSV worked, but I was hoping to switch to Jackson for better performance.
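
In the meantime, one possible workaround is to rewrite the offending fields before handing each line to the parser. The sketch below is a plain-Java pre-processing step, not a Jackson feature; the class and method names are hypothetical. It assumes that a field wrapped as `""...""` was meant to be a normally quoted field, so the outer literal quotes are dropped, that `,` is the separator, and that such fields contain no commas or line breaks. Those assumptions hold for the extract above but may not hold for the whole file.

```java
import java.util.regex.Pattern;

final class DoubledQuoteNormalizer {

    // A field that starts with "" (right after a separator or the line start),
    // continues with a character that is neither a quote nor a comma, and ends
    // with "" right before a separator or the end of the line.
    private static final Pattern WRAPPED =
            Pattern.compile("(^|,)\"\"([^\",](?:[^\"]|\"\")*?)\"\"(?=,|$)");

    /** Rewrites ""text"" fields as "text", leaving inner doubled quotes as escaped quotes. */
    static String normalizeLine(String line) {
        return WRAPPED.matcher(line).replaceAll("$1\"$2\"");
    }

    public static void main(String[] args) {
        String row = "OCE669711,OCESN,\"\",\"\"Cars Réguliers \"\"L 11\"\"  (Nantes - St Gilles Croix de Vie)\"\",,3,,,";
        // The empty "" field is left alone; only the fourth column is rewritten.
        System.out.println(normalizeLine(row));
    }
}
```

Each normalized line could then be joined back with `\n` and passed to `readValues` as in the test above. A regex like this is fragile by nature, so it is worth checking the rewritten lines against a sample of the real file before relying on it.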
