[MongoDB]: How to get nested documents through mongoimport
Suppose you have a flat CSV like the one below, and you want mongoimport to create nested documents like those in the dbversity collection shown further down. How do we do that?
Files,FileName,Extension,Size,SubFolders
\\dbversity\\dbversity_files,dbversity_file00.db,db,1024,Subfolder1
\\dbversity\\dbversity_files,dbversity_file11.db,db,500,Subfolder2
\\dbversity\\dbversity_file1,dbversity_file22.db,db,250,Subfolder1
[root@dbversity ~]# mongo --port 27010
MongoDB shell version: 2.4.11
connecting to: 127.0.0.1:27010/test
rs1:PRIMARY>
rs1:PRIMARY> db.dbversity.find().pretty()
{
    "_id" : "\\dbversity\\dbversity_files",
    "Size" : 6312131,
    "DirectoryName" : "dbversity_file00.db",
    "Files" : [
        {
            "FileName" : "dbversity_file00.db",
            "Extension" : "db",
            "Size" : 1024
        },
        {
            "FileName" : "dbversity_file11.jar",
            "Extension" : "jar",
            "Size" : 500
        },
        {
            "FileName" : "dbversity_file22.db",
            "Extension" : "db",
            "Size" : 250
        },
        {
            "FileName" : "dbversity_file33.pdf",
            "Extension" : "pdf",
            "Size" : 2048
        }
    ],
    "SubFolders" : [
        "Subfolder1",
        "Subfolder2"
    ]
}
{
    "_id" : "\\dbversity\\dbversity_files1",
    "Size" : 6312131,
    "DirectoryName" : "dbversity_file00.db",
    "Files" : [
        {
            "FileName" : "dbversity_file00.db",
            "Extension" : "db",
            "Size" : 1024
        },
        {
            "FileName" : "dbversity_file11.jar",
            "Extension" : "jar",
            "Size" : 500
        },
        {
            "FileName" : "dbversity_file22.db",
            "Extension" : "db",
            "Size" : 250
        },
        {
            "FileName" : "dbversity_file33.pdf",
            "Extension" : "pdf",
            "Size" : 2048
        }
    ],
    "SubFolders" : [
        "Subfolder1",
        "Subfolder2"
    ]
}
rs1:PRIMARY>
[root@dbversity ~]# mongoexport --port 27010 --db test -c dbversity --csv --fields Files,FileName,Extension,Size,SubFolders --out /tmp/dbversity_outfile.csv
connected to: 127.0.0.1:27010
exported 2 records
[root@dbversity ~]#
[root@dbversity ~]# cat /tmp/dbversity_outfile.csv
Files,FileName,Extension,Size,SubFolders
"[ { ""FileName"" : ""dbversity_file00.db"", ""Extension"" : ""db"", ""Size"" : 1024 }, { ""FileName"" : ""dbversity_file11.jar"", ""Extension"" : ""jar"", ""Size"" : 500 }, { ""FileName"" : ""dbversity_file22.db"", ""Extension"" : ""db"", ""Size"" : 250 }, { ""FileName"" : ""dbversity_file33.pdf"", ""Extension"" : ""pdf"", ""Size"" : 2048 } ]",,,6312131.0,"[ ""Subfolder1"", ""Subfolder2"" ]"
"[ { ""FileName"" : ""dbversity_file00.db"", ""Extension"" : ""db"", ""Size"" : 1024 }, { ""FileName"" : ""dbversity_file11.jar"", ""Extension"" : ""jar"", ""Size"" : 500 }, { ""FileName"" : ""dbversity_file22.db"", ""Extension"" : ""db"", ""Size"" : 250 }, { ""FileName"" : ""dbversity_file33.pdf"", ""Extension"" : ""pdf"", ""Size"" : 2048 } ]",,,6312131.0,"[ ""Subfolder1"", ""Subfolder2"" ]"
[root@dbversity ~]#
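If you try to import it back into a test database, the import runs, but it does not recreate the nested structure: the Files and SubFolders arrays come back as plain strings, and because --headerline is not specified, the header row is imported as a document too (hence 3 objects for 2 exported records).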
[root@dbversity ~]# mongoimport --port 27010 --db test -c newcol -f Files,FileName,Extension,Size,SubFolders --file /tmp/dbversity_outfile.csv --type csv
connected to: 127.0.0.1:27010
Mon Nov 24 01:44:15.901 imported 3 objects
[root@dbversity ~]#
[root@dbversity ~]# mongo --port 27010
MongoDB shell version: 2.4.11
connecting to: 127.0.0.1:27010/test
rs1:PRIMARY>
rs1:PRIMARY> db.newcol.find()
{ "_id" : ObjectId("5472d3bffafd7a3149da96f2"), "Files" : "Files", "FileName" : "FileName", "Extension" : "Extension", "Size" : "Size", "SubFolders" : "SubFolders" }
{ "_id" : ObjectId("5472d3bffafd7a3149da96f3"), "Files" : "[ { \"FileName\" : \"dbversity_file00.db\", \"Extension\" : \"db\", \"Size\" : 1024 }, { \"FileName\" : \"dbversity_file11.jar\", \"Extension\" : \"jar\", \"Size\" : 500 }, { \"FileName\" : \"dbversity_file22.db\", \"Extension\" : \"db\", \"Size\" : 250 }, { \"FileName\" : \"dbversity_file33.pdf\", \"Extension\" : \"pdf\", \"Size\" : 2048 } ]", "FileName" : "", "Extension" : "", "Size" : 6312131, "SubFolders" : "[ \"Subfolder1\", \"Subfolder2\" ]" }
{ "_id" : ObjectId("5472d3bffafd7a3149da96f4"), "Files" : "[ { \"FileName\" : \"dbversity_file00.db\", \"Extension\" : \"db\", \"Size\" : 1024 }, { \"FileName\" : \"dbversity_file11.jar\", \"Extension\" : \"jar\", \"Size\" : 500 }, { \"FileName\" : \"dbversity_file22.db\", \"Extension\" : \"db\", \"Size\" : 250 }, { \"FileName\" : \"dbversity_file33.pdf\", \"Extension\" : \"pdf\", \"Size\" : 2048 } ]", "FileName" : "", "Extension" : "", "Size" : 6312131, "SubFolders" : "[ \"Subfolder1\", \"Subfolder2\" ]" }
rs1:PRIMARY>
rs1:PRIMARY>
rs1:PRIMARY> db.newcol.find().pretty()
{
    "_id" : ObjectId("5472d3bffafd7a3149da96f2"),
    "Files" : "Files",
    "FileName" : "FileName",
    "Extension" : "Extension",
    "Size" : "Size",
    "SubFolders" : "SubFolders"
}
{
    "_id" : ObjectId("5472d3bffafd7a3149da96f3"),
    "Files" : "[ { \"FileName\" : \"dbversity_file00.db\", \"Extension\" : \"db\", \"Size\" : 1024 }, { \"FileName\" : \"dbversity_file11.jar\", \"Extension\" : \"jar\", \"Size\" : 500 }, { \"FileName\" : \"dbversity_file22.db\", \"Extension\" : \"db\", \"Size\" : 250 }, { \"FileName\" : \"dbversity_file33.pdf\", \"Extension\" : \"pdf\", \"Size\" : 2048 } ]",
    "FileName" : "",
    "Extension" : "",
    "Size" : 6312131,
    "SubFolders" : "[ \"Subfolder1\", \"Subfolder2\" ]"
}
{
    "_id" : ObjectId("5472d3bffafd7a3149da96f4"),
    "Files" : "[ { \"FileName\" : \"dbversity_file00.db\", \"Extension\" : \"db\", \"Size\" : 1024 }, { \"FileName\" : \"dbversity_file11.jar\", \"Extension\" : \"jar\", \"Size\" : 500 }, { \"FileName\" : \"dbversity_file22.db\", \"Extension\" : \"db\", \"Size\" : 250 }, { \"FileName\" : \"dbversity_file33.pdf\", \"Extension\" : \"pdf\", \"Size\" : 2048 } ]",
    "FileName" : "",
    "Extension" : "",
    "Size" : 6312131,
    "SubFolders" : "[ \"Subfolder1\", \"Subfolder2\" ]"
}
rs1:PRIMARY>
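In the output above, Files and SubFolders hold JSON text rather than real arrays. One possible repair, sketched here in plain JavaScript, is to parse those string fields back into arrays. The helper name reviveNestedFields is hypothetical and not part of MongoDB; the sample document is shortened for illustration.

```javascript
// Sketch only: reviveNestedFields is a hypothetical helper, not a MongoDB API.
// It returns a copy of the document in which the named fields, if they hold
// JSON text for an array or object, are replaced by the parsed value.
function reviveNestedFields(doc, fieldNames) {
  const out = { ...doc };
  for (const name of fieldNames) {
    const value = out[name];
    if (typeof value !== "string") continue;
    const trimmed = value.trim();
    // Only attempt fields that look like a JSON array or object.
    if (!trimmed.startsWith("[") && !trimmed.startsWith("{")) continue;
    try {
      out[name] = JSON.parse(trimmed);
    } catch (err) {
      // Not valid JSON: leave the original string in place.
    }
  }
  return out;
}

// Shortened stand-in for one document from db.newcol (illustrative data).
const importedDoc = {
  Files: '[ { "FileName" : "dbversity_file00.db", "Extension" : "db", "Size" : 1024 } ]',
  FileName: "",
  Extension: "",
  Size: 6312131,
  SubFolders: '[ "Subfolder1", "Subfolder2" ]'
};

console.log(JSON.stringify(reviveNestedFields(importedDoc, ["Files", "SubFolders"]), null, 2));
```

The same function body could presumably be applied inside a forEach loop over the collection in the mongo shell, writing each repaired document back, but that step is not shown or tested here.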
So, is there a better way to create nested documents/sub-documents, for example with a shell script or JavaScript?
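One possible approach, sketched below, is to pre-process the flat CSV into one JSON document per directory before importing, since mongoimport reads JSON as well as CSV. The helpers parseFlatCsv and nestByDirectory are hypothetical names, and the CSV parsing assumes no quoted fields or embedded commas. The top-level Size and DirectoryName fields are left out because the flat CSV has no columns for them.

```javascript
// Sketch only: parseFlatCsv and nestByDirectory are hypothetical helpers,
// not part of MongoDB or its tools.

// Parse a simple CSV (assumes no quoted fields and no embedded commas).
function parseFlatCsv(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  const header = lines[0].split(",");
  return lines.slice(1).map((line) => {
    const values = line.split(",");
    const row = {};
    header.forEach((name, i) => { row[name] = values[i]; });
    return row;
  });
}

// Group rows by the "Files" column and build one nested document per directory.
function nestByDirectory(rows) {
  const byDirectory = new Map();
  for (const row of rows) {
    if (!byDirectory.has(row.Files)) {
      byDirectory.set(row.Files, { _id: row.Files, Files: [], SubFolders: [] });
    }
    const doc = byDirectory.get(row.Files);
    doc.Files.push({
      FileName: row.FileName,
      Extension: row.Extension,
      Size: Number(row.Size)
    });
    // Keep each sub-folder name only once per directory.
    if (!doc.SubFolders.includes(row.SubFolders)) {
      doc.SubFolders.push(row.SubFolders);
    }
  }
  return Array.from(byDirectory.values());
}

// The flat CSV from the top of this post (String.raw keeps the backslashes as typed).
const flatCsv = String.raw`Files,FileName,Extension,Size,SubFolders
\\dbversity\\dbversity_files,dbversity_file00.db,db,1024,Subfolder1
\\dbversity\\dbversity_files,dbversity_file11.db,db,500,Subfolder2
\\dbversity\\dbversity_file1,dbversity_file22.db,db,250,Subfolder1`;

// Print one JSON document per line.
for (const doc of nestByDirectory(parseFlatCsv(flatCsv))) {
  console.log(JSON.stringify(doc));
}
```

Saved to a file (say nested.json, a made-up name), output of this shape could then be passed to mongoimport with --file and without --type csv, so that the arrays arrive as real arrays instead of strings.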
Warning
mongoimport and mongoexport do not reliably preserve all rich BSON data types because JSON can only represent a subset of the types supported by BSON. As a result, data exported or imported with these tools may lose some measure of fidelity. See the Extended JSON reference for more information.
JSON can only represent a subset of the types supported by BSON. To preserve type information, mongoimport accepts strict mode representation for certain types.
For example, to preserve type information for the BSON types Date and NumberLong during mongoimport, the data should be in strict mode representation, as in the following:
{ "_id" : 1, "volume" : { "$numberLong" : "2980000" }, "date" : { "$date" : "2014-03-13T13:47:42.483-0400" } }
For the NumberLong type, mongoimport converts the value into a float during the import.
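As a rough illustration of the strict mode shape shown above, the sketch below wraps JavaScript Date and BigInt values in $date and $numberLong objects before serializing. The function name toStrictMode is hypothetical, using BigInt to stand in for a 64-bit integer is an assumption of this sketch, and the ISO-8601 date string with a Z suffix assumes a tool version that accepts that form.

```javascript
// Sketch only: toStrictMode is a hypothetical helper, not a MongoDB API.
// It walks a value and replaces Date and BigInt leaves with strict mode
// wrappers ({ "$date": ... } and { "$numberLong": ... }).
function toStrictMode(value) {
  if (value instanceof Date) {
    // Assumption: the importing tool accepts an ISO-8601 string here.
    return { $date: value.toISOString() };
  }
  if (typeof value === "bigint") {
    // 64-bit integers are carried as strings so no precision is lost.
    return { $numberLong: value.toString() };
  }
  if (Array.isArray(value)) {
    return value.map(toStrictMode);
  }
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = toStrictMode(inner);
    }
    return out;
  }
  return value;
}

// Illustrative document; the values mirror the example in the text.
const sampleDoc = {
  _id: 1,
  volume: 2980000n,
  date: new Date("2014-03-13T17:47:42.483Z")
};

console.log(JSON.stringify(toStrictMode(sampleDoc)));
```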
MongoDB Extended JSON
JSON can only represent a subset of the types supported by BSON. To preserve type information, MongoDB adds the following extensions to the JSON format:
- Strict mode. Strict mode representations of BSON types conform to the JSON RFC. Any JSON parser can parse these strict mode representations as key/value pairs; however, only MongoDB's internal JSON parser recognizes the type information conveyed by the format.
- mongo Shell mode. MongoDB's internal JSON parser and the mongo shell can parse this mode.
The representation used for the various data types depends on the context in which the JSON is parsed.
Parsers and Supported Format
Input in Strict Mode
The following can parse representations in strict mode with recognition of the type information.
- REST Interfaces
- mongoimport
- --query option of various MongoDB tools
Other JSON parsers, including mongo shell and db.eval(), can parse strict mode representations as key/value pairs, but without recognition of the type information.
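To make the distinction concrete, the sketch below parses a strict mode line with a generic JSON parser (JavaScript's JSON.parse), which sees only key/value pairs, and then again with a reviver that recognizes the two wrappers. The reviver reviveStrictMode is a hypothetical helper, mapping $numberLong to BigInt is an assumption of this sketch, and the sample line writes the time zone as Z for portable Date parsing.

```javascript
// Sketch only: reviveStrictMode is a hypothetical JSON.parse reviver,
// not MongoDB's internal parser.
function reviveStrictMode(key, value) {
  const isSingleKeyObject =
    value !== null &&
    typeof value === "object" &&
    !Array.isArray(value) &&
    Object.keys(value).length === 1;
  if (isSingleKeyObject && typeof value.$numberLong === "string") {
    return BigInt(value.$numberLong);
  }
  if (isSingleKeyObject && typeof value.$date === "string") {
    return new Date(value.$date);
  }
  return value;
}

// Illustrative strict mode line (time zone written as Z).
const strictLine =
  '{ "_id" : 1, "volume" : { "$numberLong" : "2980000" }, "date" : { "$date" : "2014-03-13T17:47:42.483Z" } }';

// A generic parser keeps the wrappers as ordinary nested objects.
const asPlainPairs = JSON.parse(strictLine);
console.log(typeof asPlainPairs.volume, Object.keys(asPlainPairs.volume));

// With the reviver, the wrappers become typed values.
const withTypes = JSON.parse(strictLine, reviveStrictMode);
console.log(typeof withTypes.volume, withTypes.date instanceof Date);
```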