Three problems this week, though they are all related. There are multiple template files for this one, so make sure you download the full zip of all the starting files once you accept below.

Problem 1: Constraining Movies

You’d like to set up some tables to keep track of some movie scores from a variety of reviewers. As such, you have sketched out a basic design shown in the below schematic:

The file in the starting repository named Prob1.sql has the initial layout of these tables, in addition to defining a schema to place them into. Using this file as your starting point, you should add the necessary lines to create the correct columns and types for each table, including primary and foreign keys. Additionally, there are some further constraints that you would like to place on the data:

The combination of title and year in the movies table should be unique. (Only one movie with a given title should come out each year)
The name of the reviewer should not be null.
The number of stars a reviewer gives a movie should not be null.
The date that a reviewer rates a movie should not be null.
Only movies that came out after the start of the year 1900 are allowed in the movies table.
Reviewers can only give 1, 2, 3, 4, or 5 stars.
All ratings must come after the start of the year 2000.

After adding your columns and constraints, you should be able to execute everything within Prob1.sql to create your necessary tables. To test that everything has gone well, you should then be able to run all the commands in Successful_Inserts.sql, which should add all the given rows without any errors. If an error occurs here, go back to your tables and determine what may have happened and fix it. Make sure your column names and types are just as indicated in the schematic. Once everything in Successful_Inserts is inserting successfully, upload your Prob1.sql back to GitHub to submit this problem. (Though you may want to wait until the next two problems are complete to ensure you don’t need to make further changes here.)

Problem 2: Finding Flaws

Now that you have created and populated the tables from Problem 2, trying to run any of the commands in Prob3_Failures.md should result in an error. For each command below (and also shown in Prob3_Failures.md), indicate specifically what constraint caused the operation to fail and why. Use plain English here to describe the constraint: do not just regurgitate the error message that pops up! Realize that some queries may fail because of the same constraint. If any of these queries completes successfully, then you have missed a constraint somewhere back in your table creation that you need to go fix, and then regenerate and populate your tables.

INSERT INTO hw5.movies (id, title, year, director) 
VALUES
 (2, 'The Graduate', 1967, 'Mike Nichols');

INSERT INTO hw5.movies (id, title, year, director) 
VALUES
  (51, 'The Graduate', 1865, 'Mike Nichols');

INSERT INTO hw5.ratings (reviewer_id, movie_id, stars, rating_date) 
VALUES
  (4, 36, 5, '2010-01-01');

```
DROP TABLE hw5.reviewers;
```

INSERT INTO hw5.ratings (reviewer_id, movie_id, rating_date) 
VALUES
  (1, 3, '2015-04-02');

Problem 3: Finding Compound Flaws

This is a continuation of what you were doing in Problem 3. Trying to run any of the commands in Prob3_Failures.md should result in an error. For each command, indicate specifically what constraint(s) caused the operation to fail and why. Use plain English here to describe the constraint: do not just regurgitate the error message that pops up! Realize that some queries may fail because of the same constraint, and some queries might have multiple constraints being violated. So don’t just assume a single thing is wrong. If any of these queries completes successfully, then you have missed a constraint somewhere back in your table creation that you need to go fix, and then regenerate and populate your tables.

INSERT INTO hw5.ratings (reviewer_id, movie_id, stars, rating_date) 
VALUES
  (1, 2, 0, '2012-02-18');

INSERT INTO hw5.ratings (reviewer_id, movie_id, stars, rating_date) 
VALUES
  (4, 3, 4.3, '1999-12-31');

INSERT INTO hw5.reviewers (id) 
VALUES
  (20);

INSERT INTO hw5.ratings (reviewer_id, movie_id, stars) 
VALUES
  (1, 3, 3);

INSERT INTO hw5.movies (id,title,year,director) 
VALUES
  (51, 'The Graduate', 1867, 'Mike Nichols'),
  (52, 'The Graduate', 1867, 'Krzysztof Kieslowski');

Sidenote: These ratings were all artificially generated for these mysterious reviewers, so if your favorite movie on the list didn’t get an appropriate score, then I apologize!

Problem 4: That’s Not Normal

One of the more common ways that you will receive denormalized data is through various JSON formats. Consider then the example JSON response shown below, which might depict a response from an internal API containing information about student assignment submissions and corresponding feedback and rubric items.

[
  {
    "student_id": "2948294",
    "name": "Janet",
    "submitted_date": "2022-09-22",
    "assignment": "Homework 1",
    "total_available_points": 20,
    "deductions": [
      {
          "id": 1, 
          "comment": "Didn't read instructions well", 
          "deduction": -1
      },
      {
          "id": 2, 
          "comment": "Missed a negative", 
          "deduction": -0.5
      }
    ]
  }, ...etc...
]

The repository has a more complete example that you can look through, which includes multiple submissions. Note that you can have the same deduction item showing up multiple times for a given assignment. Your task here is to take this information and progressively create from it tables of various normal forms. I will not ask you to show the specific contents within these tables, but am instead just interested in the structure of the tables. As such, the easiest way to represent table structure (and potential relations between tables) is through ER Diagrams. For each of the below parts, I’d ask you to upload an ER diagram created by a tool such as DrawSQL, ChartDB, DBDiagram.io, or similar to the repository.

The first step is always to get the data / table into the first normal form. As a reminder, in this form you just need to have 1 piece of information per cell, and have some primary or compound primary key that uniquely identifies each row of the table. Create an ERD for a table that would hold the above data. This will be just a single table in this case, so you really just need to worry about specifying the column names and types, as well as determining what your primary key will be.
To put the table into 2NF, you need to ensure that there are no partial dependencies present, where some columns only depend on a subset of the columns that comprise a compound primary key. If you do not have a compound primary key, then your table is already in 2NF. Break the above into more tables as necessary to ensure that you no longer have any partial dependencies in any table. Be sure to include in the ERD which tables and columns are related to others through foreign key dependencies.
Finally, to ensure that your tables are in 3NF, you must ensure that there are no transitive dependencies, where one column actually depends on another (non-primary key) column, instead of the primary key. Continue to break apart your tables as necessary to ensure that there are no transitive dependencies between any non-primary key columns. Again, be sure to include in your ERD the foreign key constraints that indicate what rows are related.

Don’t forget to include an image of the ERD for each step of the process above! If you could upload them back to GitHub with names like Prob4_1NF.png, Prob4_2NF.png, and Prob4_3NF.png, that would be very appreciated!