Syllabus
Database Design & Web Implementation
New York University
Department of Computer Science
Course description
In this course, we introduce principles and applications of databases. We begin by studying issues related to data collection and discovery, and techniques for prepping and munging the raw data to shape it into a form that is ready for database storage. Studying the design of relational databases, we learn how to craft databases to model “real-world” data in a way that lends itself to the types of manipulation and analysis that we have in mind. Once we have a clear understanding of these issues in relational database design, we look at more advanced topics in data, such as data analysis in Python, NoSQL databases like MongoDB, web APIs, blockchain, and building simple web apps that publish data held in databases.
Prerequisites
Introduction to Computer Programming (either CSCI-UA.0002 or CSCI-UA.0003) with a grade of C or better. Students who have successfully completed Data Management and Analysis (CSCI-UA 479) are not eligible to take this course.
Learning objectives
Upon completing this course, students will be familiar with some of the most common database systems and practices, including:
- Explorations of “data in the wild”
- Spreadsheets as databases
- Using Python to prepare datafiles
- Understanding common textual representations of data, e.g. CSV, JSON, XML, HTML
- Common issues in scrubbing, munging, and massaging data
- Normal forms and relational database design
- Relational database programming with SQL using SQLite
- MongoDB as an example of a NoSQL database
- An overview of Python integration with both SQL and NoSQL databases
- Implementation of a Python-based database-driven web application using MongoDB, Flask, and pymongo
- Data analysis using pandas
- Data visualization with matplotlib
- Time permitting, we will cover blockchain technology in detail as a unique variety of database
To achieve mastery in these topics, students will take quizzes and exercises corresponding to each lecture topic as well as a midterm and final exam.
While one class session is usually dedicated to hands-on student work to start each assigned exercise, students are expected to work independently to complete the exercises for approximately 5-10 hours each week.
Instructor
Amos Bloomberg
amos at cs dot nyu dot edu
WWH 424
Textbooks
All books are available online:
- via Online Access through NYU Libraries / O'Reilly Safari (log in using your NYU Net ID email address; see notes on accessing via Safari apps)
- free html / ebook versions supplied by author or publisher
- database platform's documentation
Readings will be selected from the following books:
- Python for Everybody: Exploring Data Using Python 3 by Charles Severance (py4e)
- Bad Data Handbook by Q. Ethan McCallum
- Using SQLite by Jay A. Kreibich
- Database Design by Adrienne Watt (primary author)
- Learning MySQL and MariaDB: heading in the right direction with MySQL and MariaDB by Russell J. T. Dyer
- MongoDB Manual
Getting help
Help resources available to you are listed in order of “seriousness” of your problem:
Messaging
Our course uses Discord as its main communication channel for announcements and discussion. This is a good place to ask questions that anyone - other students, graders, tutors, or the professor - can answer.
Create a private channel named assgn_fb1258, where fb1258 is replaced with your own NYU Net ID. Invite the graders to that channel (we will tell you how to do this).
You are not required to supply any personally-identifiable information when signing up for any software services we use. Discuss with the professor if you have concerns or questions about privacy.
Tutoring
Tutors for this course are waiting to answer your questions remotely using Zoom meeting software.
Tutoring hours (all times in Eastern Time):
- TBD
Talk with the professor
- see me before class
- raise your hand or simply speak during class
- see me after class
- come to my open office hours - hours to be distributed in class
Additional tutoring resources
Academic support is also available through the University Learning Center.
Attendance & participation
Attendance is mandatory. In-class and online message board participation is encouraged. Students who do not attend class regularly and who do not participate in discussions tend to do poorly.
Required software and hardware
All students require access to a computer on which they can write software using a specific set of applications. Computers at any of the university’s computer labs will do, as will any laptop or desktop computer.
i6 account
In addition to your NYU Home Account, we will be using a special computer account on a Unix Web server named i6.cims.nyu.edu which will be assigned to you automatically based on your enrollment. This is called an i6 account and we will use it to host our websites.
- Common questions about i6 accounts are answered on this FAQ page.
- If you forget your i6 password and would like to reset it, go to this page for instructions on how to do so.
- If you do not receive notification that an account has been created for you, check your spam, and try to reset your password using the link above.
Computer labs
Macintosh computers with all of the necessary software installed are available to you in the ITS labs. You do not need your own computer nor do you need to purchase any software. However, you will be learning how to use various programs and may wish to have access to them at home or on your laptop. In this case, you must purchase your own license or use a trial version, which is sometimes available from the publisher. You can download software provided by ITS to all students, including SFTP programs, by going to the ITS software page.
The main computer lab to use for this class is the LaGuardia Co-op, located at 541 LaGuardia Place. There are other labs on campus, but this is the one where tutors will be available to meet with you.
Saving your work in the lab
You will be able to save your work in the ITS labs under your NYU Home Account and/or on your own flash drives. Although you can write to the hard disks of the machines in the labs, you cannot be sure that you will have access to the same machine the next time you enter the lab, and the drives in the lab are frequently erased. A good option is to upload your files online and download them as needed.
Grading
You will receive a grade calculated mechanically according to the following rubric:
- 15%: Quizzes
- 35%: Exercises
- 25%: Midterm exam
- 25%: Final exam
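As an illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the final grade is a plain weighted average; the names WEIGHTS_PERCENT and course_grade are hypothetical and not part of any course tooling.

```python
from typing import Mapping

# Weights from the rubric above, in percent (they sum to 100).
WEIGHTS_PERCENT = {
    "quizzes": 15,
    "exercises": 35,
    "midterm": 25,
    "final": 25,
}


def course_grade(scores: Mapping[str, float]) -> float:
    """Weighted average of component scores.

    Assumption: every component score is on a 0-100 scale.
    """
    total = sum(WEIGHTS_PERCENT[name] * scores[name] for name in WEIGHTS_PERCENT)
    return total / 100


example = {"quizzes": 90, "exercises": 80, "midterm": 70, "final": 100}
print(course_grade(example))
```

Because exercises carry the largest weight, a 10-point change in the exercise average moves the course grade by 3.5 points, more than the same change in any other component.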
Quizzes
Quizzes are completed outside of class. You must be logged into Google with your NYU Net ID account in order to view the quizzes.
Each quiz is submitted as a Google Form.
Exercises
Exercises are usually begun in-class with the remainder completed outside of class.
All exercises are submitted by pushing code to GitHub.
- we will cover how to push code to GitHub
- unless you have good reason to do otherwise, follow best practices for all basic file names and file extensions
Late policy
All assigned work is due before class on the due date indicated on the schedule.
- for every 24 hours that work is late, we apply a 10% penalty on the grade, up to a maximum penalty of 30%.
- after 72 hours, we will no longer accept the work.
- for group work, each member of the group will be penalized individually.
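The late-penalty arithmetic above can be sketched in Python as follows. The policy does not say whether a partial 24-hour period counts, so this sketch assumes that every started 24-hour period adds 10%; that rounding rule, like the function name late_penalty_percent, is an assumption for illustration and not an official statement of how graders compute penalties.

```python
import math
from typing import Optional


def late_penalty_percent(hours_late: float) -> Optional[int]:
    """Return the late penalty in percent, or None if the work is no longer accepted.

    Assumption: each started 24-hour period of lateness counts as a full period.
    """
    if hours_late <= 0:
        return 0  # submitted on time
    if hours_late > 72:
        return None  # past the 72-hour cutoff, work is not accepted
    periods = math.ceil(hours_late / 24)
    return min(10 * periods, 30)  # 10% per period, capped at 30%


for hours in (0, 1, 25, 72, 73):
    print(hours, late_penalty_percent(hours))
```

Under this assumption, work that is one hour late and work that is 24 hours late receive the same 10% penalty.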
Extensions
Students are automatically granted two late-assignment extensions of up to 3 days each, with the exception that all assignments must be submitted before the last day of regular classes before the final exam period.
- When submitting an assignment for which you would like to use one of these automatic extensions, you must notify the grader that you are using the extension; otherwise, your assignment will be rejected.
- for any group work, each member of the group must use an extension (or lose points if none is available) for the entire group to submit work late.
- Do not ask for any extensions from the professor.
Regrade requests
If a student requests a regrade of any work, we will regrade the work in full, not just the part that the student believes has been mis-graded.
Academic Integrity
Working with others and leveraging all resources available to you is a prerequisite for success. This is different from copying, cheating, plagiarism, and mental laziness. All submitted work must be your own. There are very reliable systems we use to detect plagiarism in computer code (http://theory.stanford.edu/~aiken/moss/). If you submit any work that is not your own, you risk failure or worse.
Please read the Computer Science department’s policy on academic integrity and the University-wide policy which supersedes it.