Books example data set

From Knowledge Kitchen
Jump to navigation Jump to search


These instructions will help you set up an example data set of popular books in Western literature in either a MongoDB database or a MySQL database.

The data is given as both JSON and CSV formatted text files. JSON is the preferred text file format for importing data into a MongoDB collection, while CSV is the preferred text file format for importing data into a MySQL table. MongoDB can also import CSV text files for simple data sets such as this example.


Import JSON-formatted data into MongoDB

With JSON files, the field names are embedded into each line of the text file.... this makes sense, since MongoDB is schema-less, documents within the same collection can have completely different set of fields. In this example, for simplicity, all of our documents have the same fields.


Save data in JSON text file

Save the following text into a file named books.json on the web server:

{"title": "Northanger Abbey", "author": "Austen, Jane", "year_written": 1814, "edition": "Penguin", "price":  18.2}
{"title": "War and Peace", "author": "Tolstoy, Leo", "year_written": 1865, "edition": "Penguin", "price":  12.7}
{"title": "Anna Karenina", "author": "Tolstoy, Leo", "year_written": 1875, "edition": "Penguin", "price":  13.5}
{"title": "Mrs. Dalloway", "author": "Woolf, Virginia", "year_written": 1925, "edition": "Harcourt Brace", "price":  25}
{"title": "The Hours", "author": "Cunnningham, Michael", "year_written": 1999, "edition": "Harcourt Brace", "price":  12.35}
{"title": "Huckleberry Finn", "author": "Twain, Mark", "year_written": 1865, "edition": "Penguin", "price":  5.76}
{"title": "Bleak House", "author": "Dickens, Charles", "year_written": 1870, "edition": "Random House", "price":  5.75}
{"title": "Tom Sawyer", "author": "Twain, Mark", "year_written": 1862, "edition": "Random House", "price":  7.75}
{"title": "A Room of One's Own", "author": "Woolf, Virginia", "year_written": 1922, "edition": "Penguin", "price":  29}
{"title": "Harry Potter", "author": "Rowling, J.K.", "year_written": 2000, "edition": "Harcourt Brace", "price":  19.95}
{"title": "One Hundred Years of Solitude", "author": "Marquez", "year_written": 1967, "edition": "Harper  Perennial", "price":  14.00}
{"title": "Hamlet, Prince of Denmark", "author": "Shakespeare", "year_written": 1603, "edition": "Signet  Classics", "price":  7.95}
{"title": "Lord of the Rings", "author": "Tolkien, J.R.", "year_written": 1937, "edition": "Penguin", "price":  27.45}


Import into MongoDB collection

Go to the UNIX command line on the web server, navigate to whichever folder you saved the above file into, and issue the following command:

mongoimport --db your_database_name --collection books --host class-mongodb.cims.nyu.edu --username your_username --password your_password --type json --file books.json 

Where your_database_name, your_username, and your_password are the credentials you received when creating an account on our MongoDB database server.


Importing CSV-formatted data into MongoDB

When importing data in CSV format, it is easiest to use comma-separated values with double quotes as the delimiter and where the field names are in the first line of the file.


Save data in CSV text file

Save the following into a file named books.csv:

"title","author","year_written","edition","price"
"Northanger Abbey", "Austen, Jane", 1814,"Penguin",18.2
"War and Peace", "Tolstoy, Leo", 1865, "Penguin",12.7
"Anna Karenina", "Tolstoy, Leo", 1875, "Penguin",13.5
"Mrs. Dalloway", "Woolf, Virginia", 1925, "Harcourt Brace",25
"The Hours", "Cunnningham, Michael", 1999, "Harcourt Brace",12.35
"Huckleberry Finn", "Twain, Mark", 1865, "Penguin",5.76
"Bleak House", "Dickens, Charles", 1870, "Random House",5.75
"Tom Sawyer", "Twain, Mark", 1862, "Random House",7.75
"A Room of One's Own", "Woolf, Virginia", 1922,"Penguin",29
"Harry Potter","Rowling, J.K.",2000,"Harcourt Brace",19.95
"One Hundred Years of Solitude","Marquez",1967,"Harper  Perennial",14.00
"Hamlet, Prince of Denmark","Shakespeare",1603,"Signet  Classics",7.95
"Lord of the Rings","Tolkien, J.R.",1937,"Penguin",27.45


Import into MongoDB collection

Go to the UNIX command line on the server, navigate to whichever folder you saved the above files into, and issue the following command:

mongoimport --db your_database_name --collection books --host class-mongodb.cims.nyu.edu --username your_username --password your_password --type csv --headerline --file books.csv 

Where your_database_name, your_username, and your_password are the credentials you received when creating an account on our MongoDB database server.


Importing CSV-formatted data into MySQL table

If you are importing data into MySQL, you will need to first create a table with the correct field names and data types.


Create MySQL table

DROP TABLE IF EXISTS books;
CREATE TABLE books (
   book_id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
   title varchar(100) NOT NULL,
   author varchar(50) NOT NULL,
   year_written int(4) default 0,
   edition varchar(20) NOT NULL,
   price decimal(7,3),
   created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
   PRIMARY KEY(book_id)
);


Import into MySQL table

And now load the same CSV file listed above] into this table using the following command in the MySQL client:

LOAD DATA LOCAL INFILE 'books.csv' 
 INTO TABLE books
 FIELDS TERMINATED BY ',' 
 ENCLOSED BY '"'
 LINES TERMINATED BY '\n'
 IGNORE 1 LINES
 (title, author, year_written, edition, price);


What links here