This project presents an analytical web platform that helps evaluate the quality of movies based on publicly available data. By analyzing over 4,000 films, we identified how factors like IMDb rating, Metascore, genre, release year, budget, and box office returns influence the perception of a movie.
The platform provides interactive visualizations and lets users who know a film's parameters obtain data-driven insights into its quality. The tool can benefit viewers, investors, analysts, and studios looking for objective evaluations.
Link to the deployed website: https://data-wrangling-visualization-project.onrender.com/
• Metascore and IMDb ratings are generally consistent, but viewers tend to avoid extreme ratings.
• Genre and year of release greatly influence how a film is perceived.
• Budget and box office are important, but they don't tell the whole story without genre.
• Box office success ≠ successful film.
At this stage, our team:
- Collected data on 4,000 films from the IMDb website using Scrapy
- Cleaned and preprocessed the film data
- Prepared an advanced data analysis, including an assessment of the dataset's completeness
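The cleaning step above can be sketched with pandas. The column names below (`imdb_rating`, `metascore`, `budget`) are assumptions for illustration, not the project's actual schema:

```python
import pandas as pd

def clean_films(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows missing core ratings and coerce numeric columns.
    Column names are hypothetical, not the project's real schema."""
    df = df.dropna(subset=["imdb_rating", "metascore"]).copy()
    # Budgets scraped as text ("n/a", "$1,000,000") become NaN or numbers.
    df["budget"] = pd.to_numeric(df["budget"], errors="coerce")
    # IMDb scores are 0-10; Metascore is 0-100 -- rescale for comparison.
    df["metascore_10"] = df["metascore"] / 10.0
    return df

films = pd.DataFrame({
    "title": ["A", "B", "C"],
    "imdb_rating": [7.9, None, 6.4],
    "metascore": [81, 70, 55],
    "budget": ["1000000", "2000000", "n/a"],
})
cleaned = clean_films(films)
print(len(cleaned))  # row with the missing IMDb rating is dropped -> 2
```

A completeness check like the one in Advanced_Data_Analysis.ipynb typically starts from `df.isna().mean()` per column before deciding what to drop or impute.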
For the second checkpoint, our team:
- Prepared a full analysis of the dataset
- Developed a unique interactive website with a stylish modern design
- Conducted an analysis of patterns between research factors such as:
  - IMDb rating
  - Metascore rating
  - Genre
  - Year of release
  - Box office receipts
  - Production budget
  - Popularity of the cast
- Added interactive graphs visualizing these relationships to the website
- Deployed the project to: https://data-wrangling-visualization-project.onrender.com/
- Added Docker support
- HTML + CSS + JavaScript for the frontend
- Chart.js, D3.js, and Plotly for visualizations
- Flask for the backend
- Scrapy for scraping
- All scraped data is stored as JSON files
- Python (Matplotlib, NumPy, Seaborn, Pandas) for EDA
- The starwars folder contains the code for web scraping the site
- Scraped data is stored in the films_data.json file
- The data_preparation.ipynb and Advanced_Data_Analysis.ipynb notebooks contain the code for cleaning and analyzing the dataset
- Cleaned and grouped datasets can be found in the data folder
- assets folder - it contains various icons and pictures for the site
- lib folder - it contains library files for the site
- index.html
- script.js
- Styles.css
This file contains all the main functions and the backend part of the site
Below are attached screenshots of our website:
To run the project with Docker, follow these steps:

1. Ensure you have Docker and Docker Compose installed on your system. You can download them from Docker's official website.

2. Clone the repository:

   git clone <repository-url>
   cd <repository-folder>

3. Ensure you have a docker-compose.yml file in the root of your project with the following content:

   services:
     app:
       build:
         context: .
       ports:
         - "8080:8080"

4. Build and start the services:

   docker-compose up --build

5. Open your browser and navigate to: http://127.0.0.1:8080

6. To stop the services, press Ctrl+C and run:

   docker-compose down
To run the project locally, follow these steps:

1. Clone the repository:

   git clone <repository-url>
   cd <repository-folder>

2. Set up a virtual environment (optional but recommended):

   python3 -m venv venv
   source venv/bin/activate

3. Install the required dependencies:

   pip install -r requirements.txt

4. Run the Flask application:

   flask run --port=8080

5. Open your browser and navigate to: http://127.0.0.1:8080
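The Flask backend started above can be sketched as follows. The route names and file paths here are assumptions for illustration; the project's actual app may organize them differently:

```python
import json
from flask import Flask, jsonify, send_from_directory

app = Flask(__name__)

@app.route("/")
def index():
    # Serve the static frontend (index.html, script.js, Styles.css).
    return send_from_directory(".", "index.html")

@app.route("/api/films")
def films():
    # Hypothetical endpoint exposing the scraped dataset to the charts.
    with open("films_data.json", encoding="utf-8") as f:
        return jsonify(json.load(f))

if __name__ == "__main__":
    app.run(port=8080)
```

With such an endpoint, the Chart.js/D3.js/Plotly code in script.js can fetch `/api/films` and build the interactive graphs client-side.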







