MongoDB - CRUD
The dirty work of database data management.
Overview
Concept
MongoDB is a popular “document-oriented” database.
-
MongoDB records are stored as Javascript Object Notation (JSON) documents.
-
commands and queries issued to the command-line MongoDB client program are written in Javascript.
-
MongoDB integrates easily with popular programming languages, such as Javascript, Python, and others.
Document-oriented database
MongoDB is not a relational database. It does not use the SQL query language. It attempts to fill a need for document-centric databases that can scale up or down rapidly, as needed.
-
no more tables, rows, or columns
-
data are stored as “documents” in “collections”
-
a database server can house many “collections”
-
documents can contain the equivalent of primary and foreign keys
No SQL
MongoDB is part of the “NoSQL” movement of databases that do not use SQL.
-
MongoDB uses its own query language, based on Javascript
-
many app or web programmers know Javascript, so this should make it more approachable
-
read more about the NoSQL trend in database systems
Flexible data model
Schema-less, meaning there is no fixed set of fields in any document
-
Documents can have whatever fields you decide to store in them
-
Even documents within the same collection can differ in their fields
Cross-platform
MongoDB can run be “installed” and “setup” on all popular platforms, including Mac, Windows, Linux, and UNIX
- “Drivers” are available to be “installed” for most popular programming languages to allow them to communicate with MongoDB databases
Highly scalable
Built-in ability to be “sharded”
-
a single database can be split across many servers, or even many thousands of servers
-
this allows MongoDB to potentially run thousands of times faster than if a database were on a single computer with limited processing and memory power
Fast for large data sets
MongoDB is fast for large data sets
-
includes sophisticated indexes to “pre-sort” results before queried
-
supports many aggregate functions
Geospatial search
MongoDB has built-in ability to store geospatial coordinates
-
this means it can perform searches for documents “near me”
-
easily performs searches within a particular radius of any location
-
in other words, it can order results by how far they are from any geospatial location
Industry-proven
MongoDB is used by:
-
GitHub
-
Etsy
-
Google
-
Ebay
-
Expedia
-
and many more
Despite this MongoDB marketing, you can be sure that these organizations also use relational SQL databases aplenty.
CRUD with the mongo shell client
The mongosh
command line client (called mongo
in older versions) uses a Javascript command shell interpreter to allow users to issue commands to any database to which they connect.
-
as with databases of all kinds, most MongoDB commands engage in some form of CRUD
- Create
- Read
- Update
- Delete
We will show how to do common CRUD operations using the mongosh
client.
Servers & Clients
Basic setup
Before using MongoDB, you will need a MongoDB server and at least one MongoDB client. See details for how to get MongoDB set up.
Example Data Set
Save the data file
This set of slides shows examples using a data set about books. To follow along, save the following JSON data into a file named books.json.
{"title": "Northanger Abbey", "author": "Austen, Jane", "year_written": 1814, "edition": "Penguin", "price": 18.2}
{"title": "War and Peace", "author": "Tolstoy, Leo", "year_written": 1865, "edition": "Penguin", "price": 12.7}
{"title": "Anna Karenina", "author": "Tolstoy, Leo", "year_written": 1875, "edition": "Penguin", "price": 13.5}
{"title": "Mrs. Dalloway", "author": "Woolf, Virginia", "year_written": 1925, "edition": "Harcourt Brace", "price": 25}
{"title": "The Hours", "author": "Cunnningham, Michael", "year_written": 1999, "edition": "Harcourt Brace", "price": 12.35}
{"title": "Huckleberry Finn", "author": "Twain, Mark", "year_written": 1865, "edition": "Penguin", "price": 5.76}
{"title": "Bleak House", "author": "Dickens, Charles", "year_written": 1870, "edition": "Random House", "price": 5.75}
{"title": "Tom Sawyer", "author": "Twain, Mark", "year_written": 1862, "edition": "Random House", "price": 7.75}
{"title": "A Room of One's Own", "author": "Woolf, Virginia", "year_written": 1922, "edition": "Penguin", "price": 29}
{"title": "Harry Potter", "author": "Rowling, J.K.", "year_written": 2000, "edition": "Harcourt Brace", "price": 19.95}
{"title": "One Hundred Years of Solitude", "author": "Marquez", "year_written": 1967, "edition": "Harper Perennial", "price": 14.00}
{"title": "Hamlet, Prince of Denmark", "author": "Shakespeare", "year_written": 1603, "edition": "Signet Classics", "price": 7.95}
{"title": "Lord of the Rings", "author": "Tolkien, J.R.", "year_written": 1937, "edition": "Penguin", "price": 27.45}
Import the data into MongoDB
To import the data file into a MongoDB database collection, go to the UNIX/Linux command line from where you typically connect to a MongoDB database, navigate to whichever folder you saved the data file into, and issue the following command:
- Note that this requires the
mongoimport
utility - see installation instructions if the command is not found on your system.
mongoimport --db your_database_name --collection books --host database_host --username your_username --password your_password --type json --file books.json
-
replace
database_host
,your_database_name
,your_username
, andyour_password
with the correct credentials. -
MongoDB can import both JSON and CSV plain text data formats, although JSON is much more closely related to MongoDB’s internal data structure and can be considered its native format.
-
note that the
mongoimport
command is not run from within themongosh
client.
Import the data into MongoDB (Docker version)
If running the MongoDB server within a Docker container, you may have to modify this command (see discussion](https://stackoverflow.com/questions/49895447/i-want-to-execute-mongoimport-on-a-docker-container)). For example:
- copy the data file into the container:
docker cp books.json mongodb_dockerhub:/books.json
- import the data into MongoDB:
docker exec mongodb_dockerhub mongoimport -d your_database_name --authenticationDatabase admin --collection books --username admin --password secret --type json --file /books.json
Connect to the database with the mongo shell client
Once the data has been imported, you can log in to the database server using the mongosh
client (or mongo
on older systems):
mongosh your_database_name --host your_hostname -u your_username -p
- replace
your_database_name
with your own NYU Net ID - replace
your_hostname
with your database host name - replace
your_username
with your own NYU Net ID - you will be prompted to enter your MongoDB server password… do so… this is not the same as your i6 password
Connect to the database with the mongo shell client (Docker version)
If running the MongoDB server within a Docker container, you may have to modify this command. For example:
- “remotely” log into the container:
docker exec -ti mongodb_dockerhub bash
- log into the database:
mongosh -u admin -p secret
Create
Collections
There is no need to explicitly create a collection. Collections are created automatically as soon as the first document is saved into them.
insertOne(), insertMany()
Documents can be created and stored into collections by importing from a JSON data file, as we have seen, or by using the insertOne()
and insertMany()
functions.
db.books.insertOne({
title: "Pattern Recognition",
author: "William Gibson",
year_written: 2003,
edition: "Penguin",
price: 14.99,
})
Where books
is a collection name that already exists or will be created automatically.
insertMany()
works much the same way, but takes an array of documents as its argument.
Read
find()
The MongoDB find()
function retrieves existing data from records in a given collection. It takes two arguments, both objects.
-
the criteria - that criteria that records must match to be included in the result set
- e.g.
db.books.find({ author: "Tolstoy, Leo" }})
- this acts much like SQL’s
WHERE
clause
- e.g.
-
the projection - limits which fields from the records matching the criteria will be included in the records in the result set
- e.g.
db.books.find({}, { title: 1, year: 1 })
- this acts much like the list of fields to retrieve in SQL’s
SELECT
query
- e.g.
Retrieve all documents
db.books.find()
Where books
is the collection from which to retrieve documents.
One criterion
db.books.find({
author: "Tolstoy, Leo",
})
Retrieves all books where the author
field has the value Leo Tolstoy
Projection
db.books.find({}, { author: 1 })
Retrieves all books, but only includes the author
field in the results, in addition to the _id
field, which is always included in results unless explicitly disabled.
Projection (continued)
db.books.find({ author: "Tolstoy, Leo" }, { _id: 0, title: 1, year: 1 })
Retrieves all books authored by Tolstoy, Leo
, and only includes the title
and year
field in the results. The _id
field is explicitly removed.
One criterion with an operator
db.books.find({
price: {
$gt: 25,
},
})
The $gt
operator performs a “greater than” comparison.
Two criteria
db.books.find(
{
price: {
$gte: 10,
$lte: 20,
},
edition: "Penguin",
},
{
_id: 0,
}
)
Notes:
- and logic is implicit in the syntax of the price criteria
- the projection (the second argument) indicates to not include the
_id
field in the results.
Two criteria
db.books.find(
{},
{
_id: 0,
title: 1,
}
)
Notes:
- This will return all book documents, since the criteria argument is a blank object.
- Only the
title
field will be included in the results for each document, since the projection argument indicates to do so.
One criterion using the $or operator
db.books.find({
$or: [
{
price: {
$gte: 20,
},
},
{
edition: "Penguin",
},
],
})
Notes:
- or logic has to be explicitly indicated by the $or operator
- the syntax for the
$and
operator follows the same pattern
Counting records
The count()
function can be run on any query to count results.
db.books
.find({
edition: "Penguin",
})
.count()
Operators
Comparisons
A variety of operators allow criteria to be fine-tuned.
$eq
- equal to$ne
- not equal to$gt
- greater than$gte
- greater than or equal to$lt
- less than$lte
- less than or equal to
Logic
Some other common operations have operators as well:
$and
- performs and logic$or
- performs or logic$in
- determines whether a value in a given field matches one of the values in an array$nin
- determines whether a value in a given field does not match one of the values in an array
Implicit equality
The following retrieval will match books where the edition
field has exactly the value, “Penguin”.
db.books.find({
edition: "Penguin",
})
The equality operator is implicit.
Implicit $eq
The same retrieval, in a more formal style, explicitly uses the $eq
operator.
db.books.find({
edition: {
$eq: "Penguin",
},
})
The equality $eq
operator is explicit.
Implicit and
As we’ve seen, MongoDB’s default operator is the and operator.
db.books.find({
price: {
$gte: 10,
$lte: 20,
},
edition: "Penguin",
})
-
There is no need to explicitly mention and.
-
This will find books that match the
price
and theedition
criteria -
The
price
criterion will match books that are greater than $10 and less than $20 -
The
edition
criterion will match books that have exactly “Penguin” as their edition value.
Explicit $and
The and logic in a query could be made more explicit. We can check a field in each documents against an array of criteria.
db.books.find({
$and: [
{
price: {
$gte: 10,
},
},
{
price: {
$lte: 20,
},
},
],
})
- This is quite a mouthful of
(
,[
,{
,$
,,
,}
,]
, and)
characters. - The implicit and seems nicer, when possible.
Explicit $or
or logic is written in much the same way as the explicit $and
.
db.books.find({
$or: [
{
author: "Gibson, William",
},
{
author: "Tolstoy, Leo",
},
],
})
$in
If checking for a single field to have one of the values from a list of values, the $in
operator can be less verbose than the explicit $or operator.
db.books.find({
author: {
$in: ["Gibson, William", "Tolstoy, Leo"],
},
})
$nin
Check that a document’s field value is not one of an array of values.
db.books.find({
year_written: {
$nin: [1865, 1925, 1927, 1802, 1814, 1999],
},
})
Update
updateOne(), updateMany()
To update existing documents, use the updateOne()
or updateMany()
methods.
db.books.updateOne({ year_written: { $lt: 2000 } }, { $set: { price: 5.99 } })
- the first argument specifies the criteria indicating which documents to update
- the second argument represents the fields to be changed and their new values
updateMany()
works much the same way, but takes an array of documents as its argument.
Delete
deleteOne(), deleteMany()
To delete existing documents, use the deleteOne()
and deleteMany()
methods.
db.books.deleteOne({
author: "William Gibson",
title: "Pattern Recognition",
})
The argument sets the criteria that must be matched for a single document to be deleted.
The deleteMany()
method works much the same way, but will delete more than one matching document.
Conclusions
Thank you. Bye.