Mongoimport: How to specify field types ?

Q) When using mongoimport to import a CSV, the field “00234” will get imported as the number 234, is there any way to specify field type of each field or is there any other solution for the same so that we can store the data in actual format.

Answer :

Unfortunately, {{mongoimport}} does not allow input types to be specified at present; this is an open feature request which is currently being worked on (see TOOLS-67). {{mongoimport}} attempts to interpret CSV values as minimally as possible, which is why quoted CSV fields containing only numbers are parsed to numerical types.

 

There are a number of workarounds which you can implement:

 

*1. Write a custom CSV parser.*

The most straightforward option is to write your own CSV parser which is aware of each field’s format. This would process each row of the CSV in turn, build an appropriate JSON document from it, and insert it into the database directly. It would likely be very simple to create, would be customisable to whatever degree you found necessary, and – unlike the other options listed below – would avoid having to modify the data either pre- or post-import.

 

*2. Escape-quote the values in the CSV file.* {{mongoimport}} ignores the outermost set of quotation marks when interpreting types, but [accepts double quotation marks as an escape character|http://docs.mongodb.org/master/reference/program/mongoimport/#cmdoption–type] per [RFC 4180|http://tools.ietf.org/html/rfc4180.html]. Consider the following CSV field:

 

{code}

“””00234″””

{code}

 

This will be interpreted as a String by {{mongoimport}}, and will appear in the MongoDB document as follows:

 

{code}

{ fieldname : “\”00234\”” }

{code}

 

Note that the escaped quotation marks are retained in the MongoDB document.

 

*3. Change the field’s type in MongoDB post-import.* MongoDB’s [{{$type}} operator|http://docs.mongodb.org/manual/reference/operator/query/type/] allows you to determine the type of data held in a particular field. Using it, you can scan the collection for CSV-imported fields which have been interpreted as the wrong type, and convert them to the desired format. The code snippet below demonstrates how to do this.

 

{code}

  1. db.foo.find( { fieldname : { $type : 16 } } ).forEach( function (x)  {

  x.fieldname = new String(x.fieldname);

 

while(x.fieldname.length < 5)

                x.fieldname = ‘0’ + x.fieldname;

 

  db.foo.save(x);

});

{code}

 

The above code will look for all occurrences of {{fieldname}} in database {{foo}} which are Integers (i.e. type 16), and will convert them to Strings. It will then normalize the length of the String to 5 characters by prepending them with ‘0’. In this way, the script will convert all instances of {{fieldname:234}} to {{fieldname:”00234″}}.

 

If specifying typed fields for {{mongoimport}} is a feature that you would particularly like to see, we encourage you to visit the feature request ticket (TOOLS-67) and click “Vote for this issue” on the right-hand side of the page. By doing this, you will help to raise its profile relative to other requests and improve the likelihood of it being added to a future release of the product.

 

  • Ask Question