From 1ac1464040c6489f776192792d3c3dcf24c1276e Mon Sep 17 00:00:00 2001 From: Pierre Lermant Date: Wed, 14 Sep 2022 14:29:26 -0700 Subject: [PATCH 1/5] Add files via upload --- .../api_data_wrangling_mini_project.ipynb | 234 ++++++++++++------ 1 file changed, 155 insertions(+), 79 deletions(-) diff --git a/mec-3.4.1-api-mini-project/api_data_wrangling_mini_project.ipynb b/mec-3.4.1-api-mini-project/api_data_wrangling_mini_project.ipynb index 0d34bd5cc..4bbca8426 100755 --- a/mec-3.4.1-api-mini-project/api_data_wrangling_mini_project.ipynb +++ b/mec-3.4.1-api-mini-project/api_data_wrangling_mini_project.ipynb @@ -2,147 +2,147 @@ "cells": [ { "cell_type": "markdown", + "metadata": {}, "source": [ "This exercise will require you to pull some data from https://data.nasdaq.com/ (formerly Quandl API)." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "As a first step, you will need to register a free account on the https://data.nasdaq.com/ website." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "After you register, you will be provided with a unique API key, that you should store:\r\n", - "\r\n", - "*Note*: Use a `.env` file and put your key in there and `python-dotenv` to access it in this notebook. \r\n", - "\r\n", - "The code below uses a key that was used when generating this project but has since been deleted. Never submit your keys to source control. There is a `.env-example` file in this repository to illusrtate what you need. Copy that to a file called `.env` and use your own api key in that `.env` file. Make sure you also have a `.gitignore` file with a line for `.env` added to it. \r\n", - "\r\n", + "After you register, you will be provided with a unique API key, that you should store:\n", + "\n", + "*Note*: Use a `.env` file and put your key in there and `python-dotenv` to access it in this notebook. 
\n", + "\n", + "The code below uses a key that was used when generating this project but has since been deleted. Never submit your keys to source control. There is a `.env-example` file in this repository to illustrate what you need. Copy that to a file called `.env` and use your own api key in that `.env` file. Make sure you also have a `.gitignore` file with a line for `.env` added to it. \n", + "\n", "The standard Python gitignore is [here](https://github.com/github/gitignore/blob/master/Python.gitignore) you can just copy that. " - ], - "metadata": {} + ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 3, + "metadata": {}, + "outputs": [], "source": [ "# get api key from your .env file\n", "import os\n", - "from dotenv import load_dotenv # if missing this module, simply run `pip install python-dotenv`\n", + "from dotenv import load_dotenv # if missing this module, simply run `pip install python-dotenv`\n", "\n", "load_dotenv()\n", - "API_KEY = os.getenv('NASDAQ_API_KEY')\n", + "pierreKey = os.getenv('NASDAQ_API_KEY')\n", "\n", - "print(API_KEY)" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "KRfk96yoWvruWZ-LjPbo\n" - ] - } - ], - "metadata": {} + "#print(pierreKey)" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Nasdaq Data has a large number of data sources, but, unfortunately, most of them require a Premium subscription. Still, there are also a good number of free datasets." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "For this mini project, we will focus on equities data from the Frankfurt Stock Exhange (FSE), which is available for free. We'll try and analyze the stock prices of a company called Carl Zeiss Meditec, which manufactures tools for eye examinations, as well as medical lasers for laser eye surgery: https://www.zeiss.com/meditec/int/home.html. The company is listed under the stock ticker AFX_X."
- ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "You can find the detailed Nasdaq Data API instructions here: https://docs.data.nasdaq.com/docs/in-depth-usage" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "While there is a dedicated Python package for connecting to the Nasdaq API, we would prefer that you use the *requests* package, which can be easily downloaded using *pip* or *conda*. You can find the documentation for the package here: http://docs.python-requests.org/en/master/ " - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Finally, apart from the *requests* package, you are encouraged to not use any third party Python packages, such as *pandas*, and instead focus on what's available in the Python Standard Library (the *collections* module might come in handy: https://pymotw.com/3/collections/).\n", "Also, since you won't have access to DataFrames, you are encouraged to us Python's native data structures - preferably dictionaries, though some questions can also be answered using lists.\n", "You can read more on these data structures here: https://docs.python.org/3/tutorial/datastructures.html" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Keep in mind that the JSON responses you will be getting from the API map almost one-to-one to Python's dictionaries. Unfortunately, they can be very nested, so make sure you read up on indexing dictionaries in the documentation provided above." 
- ], - "metadata": {} + ] }, { "cell_type": "code", - "execution_count": 6, - "source": [ - "# First, import the relevant modules" - ], + "execution_count": 4, + "metadata": {}, "outputs": [], - "metadata": {} + "source": [ + "# First, import the relevant modules\n", + "import json\n", + "import requests" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "Note: API's can change a bit with each version, for this exercise it is reccomended to use the nasdaq api at `https://data.nasdaq.com/api/v3/`. This is the same api as what used to be quandl so `https://www.quandl.com/api/v3/` should work too.\r\n", - "\r\n", + "Note: APIs can change a bit with each version; for this exercise it is recommended to use the Nasdaq API at `https://data.nasdaq.com/api/v3/`. This is the same API that used to be Quandl, so `https://www.quandl.com/api/v3/` should work too.\n", + "\n", "Hint: We are looking for the `AFX_X` data on the `datasets/FSE/` dataset." - ], - "metadata": {} + ] }, { "cell_type": "code", "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\"dataset_data\":{\"limit\":null,\"transform\":null,\"column_index\":null,\"column_names\":[\"Date\",\"Open\",\"High\",\"Low\",\"Close\",\"Change\",\"Traded Volume\",\"Turnover\",\"Last Price of the Day\",\"Daily Traded Units\",\"Daily Turnover\"],\"start_date\":\"2020-12-01\",\"end_date\":\"2020-12-01\",\"frequency\":\"daily\",\"data\":[[\"2020-12-01\",112.2,112.2,111.5,112.0,null,51.0,5703.0,null,null,null]],\"collapse\":null,\"order\":null}}\n" + ] + } + ], "source": [ "# Now, call the Nasdaq API and pull out a small sample of the data (only one day) to get a glimpse\n", - "# into the JSON structure that will be returned" - ], - "outputs": [], - "metadata": {} + "# into the JSON structure that will be returned\n", + "re='https://data.nasdaq.com/api/v3/datasets/FSE/AFX_X/data.json?start_date=2020-12-01&end_date=2020-12-03&api_key='\n", +
"req=re+pierreKey\n", + "#print(re)\n", + "r=requests.get(req)\n", + "print(r.text)" + ] }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "# Inspect the JSON structure of the object you created, and take note of how nested it is,\n", "# as well as the overall structure" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "{'dataset': {'id': 10095370, 'dataset_code': 'AFX_X', 'database_code': 'FSE', 'name': 'Carl Zeiss Meditec (AFX_X)', 'description': 'Stock Prices for Carl Zeiss Meditec (2020-11-02) from the Frankfurt Stock Exchange.

Trading System: Xetra

ISIN: DE0005313704', 'refreshed_at': '2020-12-01T14:48:09.907Z', 'newest_available_date': '2020-12-01', 'oldest_available_date': '2000-06-07', 'column_names': ['Date', 'Open', 'High', 'Low', 'Close', 'Change', 'Traded Volume', 'Turnover', 'Last Price of the Day', 'Daily Traded Units', 'Daily Turnover'], 'frequency': 'daily', 'type': 'Time Series', 'premium': False, 'limit': None, 'transform': None, 'column_index': None, 'start_date': '2021-01-03', 'end_date': '2020-12-01', 'data': [], 'collapse': None, 'order': None, 'database_id': 6129}}\n" - ] - } - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "These are your tasks for this mini project:\n", "\n", @@ -153,28 +153,107 @@ "5. What was the largest change between any two days (based on Closing Price)?\n", "6. What was the average daily trading volume during this year?\n", "7. (Optional) What was the median trading volume during this year. (Note: you may need to implement your own function for calculating the median.)" - ], - "metadata": {} + ] }, { "cell_type": "code", - "execution_count": null, - "source": [], - "outputs": [], - "metadata": {} + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "For request = https://data.nasdaq.com/api/v3/datasets/FSE/AFX_X/data.json?start_date=2020-12-01&end_date=2020-12-03&api_key= we got 255 data points\n", + "Minimum opening price is: 34.0 , max opening price is: 53.11\n", + "Maximum price difference in a day is: 2.81\n", + "Maximum price difference between 2 consecutive days is: 2.56\n", + "Average daily trading volume is: 89124.34\n", + "Median daily trading volume is: 76286.0\n" + ] + } + ], + "source": [ + "req='https://data.nasdaq.com/api/v3/datasets/FSE/AFX_X/data.json?start_date=2017-01-01&end_date=2017-12-31&api_key=JmmAsy5eSmvidqxBF9Re'\n", + "t=requests.get(req)\n", + "# for testing only 
t='''{\"dataset_data\":{\"limit\":null,\"transform\":null,\"column_index\":null,\"column_names\":[\"Date\",\"Open\",\"High\",\"Low\",\"Close\",\"Change\",\"Traded Volume\",\"Turnover\",\"Last Price of the Day\",\"Daily Traded Units\",\"Daily Turnover\"],\"start_date\":\"2020-12-01\",\"end_date\":\"2020-12-01\",\"frequency\":\"daily\",\"data\":[[\"2020-12-01\",112.2,112.2,111.5,112.0,null,53.0,5703.0,null,null,null],[\"2020-12-02\",110.0,114.2,109.5,112.2,null,51.0,5703.0,null,null,null]],\"collapse\":null,\"order\":null}}'''\n", + "o=json.loads(t.text) #main dictionary object\n", + "d=o['dataset_data']['data'] #daily list information\n", + "#print(json.dumps(d))\n", + "\n", + "#NOTE, we perform separate loops for each question below.\n", + "#We could optimize the code to only parse the data once as a future exercice\n", + "\n", + "#3- let's compute the lowest and highest opening prices for the time range.\n", + "min=-1\n", + "max=0\n", + "for i in d:\n", + " if i[1]: #test for null values\n", + " if min == -1:\n", + " min = i[1] #opening price is the second element\n", + " max = i[1]\n", + " else:\n", + " if i[1] < min: min=i[1]\n", + " if i[1] > max: max=i[1]\n", + "print(\"For request =\",re,\"we got\",len(d),\"data points\")\n", + "print(\"Minimum opening price is:\",str(min),\", max opening price is:\",str(max))\n", + "\n", + "#4-computing max difference in one day, high is index 2, low is 3\n", + "diff=-1\n", + "for i in d:\n", + " if i[2] and i[3]:\n", + " if diff == -1: \n", + " diff = i[2]-i[3]\n", + " else:\n", + " if i[2]-i[3] > diff: diff= i[2]-i[3]\n", + "print(\"Maximum price difference in a day is:\",str(round(diff,2)))\n", + "\n", + "#5-computing max difference between 2 consecutive days, based on closing price\n", + "diff=0\n", + "for i in range(0,len(d)-2):\n", + " if d[i+1][4] and d[i][4]:\n", + " if diff < d[i+1][4]-d[i][4]: \n", + " diff = d[i+1][4]-d[i][4]\n", + "print(\"Maximum price difference between 2 consecutive days 
is:\",str(round(diff,2)))\n", + "\n", + "#6-Compute the average daily trading volume during this year - index 6 in data\n", + "sum=0\n", + "tot=0\n", + "for i in d:\n", + " if i[6]:\n", + " sum=sum+i[6]\n", + " tot=tot+1\n", + "print(\"Average daily trading volume is:\",str(round(sum/tot,2)))\n", + "\n", + "#7-Compute the median trading volume during this year\n", + "def median(l):\n", + " n = len(l)\n", + " s = sorted(l)\n", + " return (s[n//2-1]/2.0+s[n//2]/2.0, s[n//2])[n % 2] if n else None\n", + "\n", + "vol=[]\n", + "for i in d:\n", + " if i[6]:\n", + " vol.append(i[6])\n", + "print(\"Median daily trading volume is:\",str(round(median(vol),2)))" + ] }, { "cell_type": "code", "execution_count": null, - "source": [], + "metadata": {}, "outputs": [], - "metadata": {} + "source": [] } ], "metadata": { + "interpreter": { + "hash": "4885f37acae9217c235118400878352aafa7b76e66df698a1f601374f86939a7" + }, "kernelspec": { - "name": "python3", - "display_name": "Python 3.7.9 64-bit ('springboard': conda)" + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -186,12 +265,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.9" - }, - "interpreter": { - "hash": "4885f37acae9217c235118400878352aafa7b76e66df698a1f601374f86939a7" + "version": "3.9.12" } }, "nbformat": 4, "nbformat_minor": 4 -} \ No newline at end of file +} From 0e69b885581bfc0e4cd85273dec4d491257a28e6 Mon Sep 17 00:00:00 2001 From: Pierre Lermant Date: Wed, 14 Sep 2022 14:38:18 -0700 Subject: [PATCH 2/5] Add files via upload --- .../api_data_wrangling_mini_project.ipynb | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/mec-3.4.1-api-mini-project/api_data_wrangling_mini_project.ipynb b/mec-3.4.1-api-mini-project/api_data_wrangling_mini_project.ipynb index 4bbca8426..a911ffd73 100755 --- a/mec-3.4.1-api-mini-project/api_data_wrangling_mini_project.ipynb +++ 
b/mec-3.4.1-api-mini-project/api_data_wrangling_mini_project.ipynb @@ -157,14 +157,14 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "For request = https://data.nasdaq.com/api/v3/datasets/FSE/AFX_X/data.json?start_date=2020-12-01&end_date=2020-12-03&api_key= we got 255 data points\n", + "For request = https://data.nasdaq.com/api/v3/datasets/FSE/AFX_X/data.json?start_date=2017-01-01&end_date=2017-12-31&api_key= we got 255 data points\n", "Minimum opening price is: 34.0 , max opening price is: 53.11\n", "Maximum price difference in a day is: 2.81\n", "Maximum price difference between 2 consecutive days is: 2.56\n", @@ -174,7 +174,8 @@ } ], "source": [ - "req='https://data.nasdaq.com/api/v3/datasets/FSE/AFX_X/data.json?start_date=2017-01-01&end_date=2017-12-31&api_key=JmmAsy5eSmvidqxBF9Re'\n", + "re='https://data.nasdaq.com/api/v3/datasets/FSE/AFX_X/data.json?start_date=2017-01-01&end_date=2017-12-31&api_key='\n", + "req=re+pierreKey\n", "t=requests.get(req)\n", "# for testing only t='''{\"dataset_data\":{\"limit\":null,\"transform\":null,\"column_index\":null,\"column_names\":[\"Date\",\"Open\",\"High\",\"Low\",\"Close\",\"Change\",\"Traded Volume\",\"Turnover\",\"Last Price of the Day\",\"Daily Traded Units\",\"Daily Turnover\"],\"start_date\":\"2020-12-01\",\"end_date\":\"2020-12-01\",\"frequency\":\"daily\",\"data\":[[\"2020-12-01\",112.2,112.2,111.5,112.0,null,53.0,5703.0,null,null,null],[\"2020-12-02\",110.0,114.2,109.5,112.2,null,51.0,5703.0,null,null,null]],\"collapse\":null,\"order\":null}}'''\n", "o=json.loads(t.text) #main dictionary object\n", From df01e75afb04c174489f14974941d172f47547ac Mon Sep 17 00:00:00 2001 From: Pierre Lermant Date: Mon, 26 Sep 2022 16:15:48 -0700 Subject: [PATCH 3/5] Update Mini_Project_Data_Wrangling_Pandas.ipynb --- .../Mini_Project_Data_Wrangling_Pandas.ipynb | 3521 ++++++++++++++++- 1 file changed, 3520 
insertions(+), 1 deletion(-) diff --git a/mec-5.3.10-data-wranging-with-pandas-mini-project/Mini_Project_Data_Wrangling_Pandas.ipynb b/mec-5.3.10-data-wranging-with-pandas-mini-project/Mini_Project_Data_Wrangling_Pandas.ipynb index ed51607a2..c5e3c56f9 100755 --- a/mec-5.3.10-data-wranging-with-pandas-mini-project/Mini_Project_Data_Wrangling_Pandas.ipynb +++ b/mec-5.3.10-data-wranging-with-pandas-mini-project/Mini_Project_Data_Wrangling_Pandas.ipynb @@ -1,4 +1,21 @@ { + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Mini-Project: Data Wrangling and Transformation with Pandas\n", + "\n", + "Working with tabular data is a necessity for anyone with enterprises having a majority of their data in relational databases and flat files. This mini-project is adopted from the excellent tutorial on pandas by Brandon Rhodes which you have watched earlier in the Data Wrangling Unit. In this mini-project, we will be looking at some interesting data based on movie data from the IMDB.\n", + "\n", + "This assignment should help you reinforce the concepts you learnt in the curriculum for Data Wrangling and sharpen your skills in using Pandas. Good Luck!" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Please make sure you have one of the more recent versi{ "cells": [ { "cell_type": "markdown", @@ -30,6 +47,3508 @@ "%matplotlib inline" ] }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'1.4.2'" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pd.__version__" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Taking a look at the Movies dataset\n", + "This data shows the movies based on their title and the year of release" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 244914 entries, 0 to 244913\n", + "Data columns (total 2 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 title 244914 non-null object\n", + " 1 year 244914 non-null int64 \n", + "dtypes: int64(1), object(1)\n", + "memory usage: 3.7+ MB\n" + ] + } + ], + "source": [ + "movies = pd.read_csv('titles.csv')\n", + "movies.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "[HTML table: movies.head(), columns title and year; rows identical to the text/plain output]\n", + "
" + ], + "text/plain": [ + " title year\n", + "0 The Ticket to the Life 2009\n", + "1 Parallel Worlds: A New Rock Music Experience 2016\n", + "2 Morita - La hija de Jesus 2008\n", + "3 Gun 2017\n", + "4 Love or Nothing at All 2014" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "movies.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Taking a look at the Cast dataset\n", + "\n", + "This data shows the cast (actors, actresses, supporting roles) for each movie\n", + "\n", + "- The attribute `n` basically tells the importance of the cast role, lower the number, more important the role.\n", + "- Supporting cast usually don't have any value for `n`" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 3786176 entries, 0 to 3786175\n", + "Data columns (total 6 columns):\n", + " # Column Dtype \n", + "--- ------ ----- \n", + " 0 title object \n", + " 1 year int64 \n", + " 2 name object \n", + " 3 type object \n", + " 4 character object \n", + " 5 n float64\n", + "dtypes: float64(1), int64(1), object(4)\n", + "memory usage: 173.3+ MB\n" + ] + } + ], + "source": [ + "cast = pd.read_csv('cast.csv.zip')\n", + "cast.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "[HTML table: cast.head(10), columns title, year, name, type, character and n; rows identical to the text/plain output]\n", + "
" + ], + "text/plain": [ + " title year \\\n", + "0 Closet Monster 2015 \n", + "1 Suuri illusioni 1985 \n", + "2 Battle of the Sexes 2017 \n", + "3 Secret in Their Eyes 2015 \n", + "4 Steve Jobs 2015 \n", + "5 Straight Outta Compton 2015 \n", + "6 Straight Outta Compton 2015 \n", + "7 For Thy Love 2 2009 \n", + "8 Lapis, Ballpen at Diploma, a True to Life Journey 2014 \n", + "9 Desire (III) 2014 \n", + "\n", + " name type character \\\n", + "0 Buffy #1 actor Buffy 4 \n", + "1 Homo $ actor Guests \n", + "2 $hutter actor Bobby Riggs Fan \n", + "3 $hutter actor 2002 Dodger Fan \n", + "4 $hutter actor 1988 Opera House Patron \n", + "5 $hutter actor Club Patron \n", + "6 $hutter actor Dopeman \n", + "7 Bee Moe $lim actor Thug 1 \n", + "8 Jori ' Danilo' Jurado Jr. actor Jaime (young) \n", + "9 Syaiful 'Ariffin actor Actor Playing Eteocles from 'Antigone' \n", + "\n", + " n \n", + "0 31.0 \n", + "1 22.0 \n", + "2 10.0 \n", + "3 NaN \n", + "4 NaN \n", + "5 NaN \n", + "6 NaN \n", + "7 NaN \n", + "8 9.0 \n", + "9 NaN " + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cast.head(10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Taking a look at the Release dataset\n", + "\n", + "This data shows details of when each movie was release in each country with the release date" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 479488 entries, 0 to 479487\n", + "Data columns (total 4 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 title 479488 non-null object \n", + " 1 year 479488 non-null int64 \n", + " 2 country 479488 non-null object \n", + " 3 date 479488 non-null datetime64[ns]\n", + "dtypes: datetime64[ns](1), int64(1), object(2)\n", + "memory usage: 14.6+ MB\n" + ] + } + ], + "source": [ + "release_dates = 
pd.read_csv('release_dates.csv', parse_dates=['date'], infer_datetime_format=True)\n", + "release_dates.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "[HTML table: release_dates.head(), columns title, year, country and date; rows identical to the text/plain output]\n", + "
" + ], + "text/plain": [ + " title year country date\n", + "0 #73, Shaanthi Nivaasa 2007 India 2007-06-15\n", + "1 #BKKY 2016 Cambodia 2017-10-12\n", + "2 #Beings 2015 Romania 2015-01-29\n", + "3 #Captured 2017 USA 2017-09-05\n", + "4 #Ewankosau saranghaeyo 2015 Philippines 2015-01-21" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "release_dates.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Section I - Basic Querying, Filtering and Transformations" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### What is the total number of movies?" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "244914\n" + ] + } + ], + "source": [ + "print(movies.shape[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### List all Batman movies ever made" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Total Batman Movies: 2\n" + ] + }, + { + "data": { + "text/html": [ + "
\n", + "[HTML table: batman_df (exact title match, 2 rows), columns title and year; rows identical to the text/plain output]\n", + "
" + ], + "text/plain": [ + " title year\n", + "52734 Batman 1943\n", + "150621 Batman 1989" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "batman_df = movies[movies.title == 'Batman']\n", + "print('Total Batman Movies:', len(batman_df))\n", + "batman_df" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### List all Batman movies ever made - the right approach" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Total Batman Movies: 35\n" + ] + }, + { + "data": { + "text/html": [ + "
\n", + "[HTML table: batman_df.head(10), columns title and year; rows identical to the text/plain output]\n", + "
" + ], + "text/plain": [ + " title year\n", + "16813 Batman: Anarchy 2016\n", + "30236 Batman Forever 1995\n", + "31674 Batman Untold 2010\n", + "31711 Scooby-Doo & Batman: the Brave and the Bold 2018\n", + "41881 Batman the Rise of Red Hood 2018\n", + "43484 Batman: Return of the Caped Crusaders 2016\n", + "46333 Batman & Robin 1997\n", + "51811 Batman Revealed 2012\n", + "52734 Batman 1943\n", + "56029 Batman Beyond: Rising Knight 2014" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "batman_df = movies[movies.title.str.contains('Batman', case=False)]\n", + "print('Total Batman Movies:', len(batman_df))\n", + "batman_df.head(10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Display the top 15 Batman movies in the order they were released" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "batman_df.sort_values(by=['year'], ascending=True).iloc[:15]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q1 : List all the 'Harry Potter' movies from the most recent to the earliest" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " title year\n", + "143147 Harry Potter and the Deathly Hallows: Part 2 2011\n", + "152831 Harry Potter and the Deathly Hallows: Part 1 2010\n", + "109213 Harry Potter and the Half-Blood Prince 2009\n", + "50581 Harry Potter and the Order of the Phoenix 2007\n", + "187926 Harry Potter and the Goblet of Fire 2005\n", + "61957 Harry Potter and the Prisoner of Azkaban 2004\n", + "82791 Harry Potter and the Chamber of Secrets 2002\n", + "223087 Harry Potter and the Sorcerer's Stone 2001\n" + ] + } + ], + "source": [ + "print(movies[movies['title'].str.contains('Harry Potter',case=False)].sort_values('year',ascending=False))" + ] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "### How many movies were made in the year 2017?" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "11474" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(movies[movies.year == 2017])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q2 : How many movies were made in the year 2015?" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "8702" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(movies[movies.year == 2015])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q3 : How many movies were made from 2000 till 2018?\n", + "- You can chain multiple conditions using OR (`|`) as well as AND (`&`) depending on the condition" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "114070" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "movies[(movies['year']>=2000) & (movies['year']<=2018)].shape[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q4: How many movies are titled \"Hamlet\"?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "20\n" + ] + } + ], + "source": [ + "print(movies[movies['title']=='Hamlet'].shape[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q5: List all movies titled \"Hamlet\" \n", + "- The movies should only have been released on or after the year 2000\n", + "- Display the movies based on the year they were released (earliest to most recent)" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " title year\n", + "55639 Hamlet 2000\n", + "1931 Hamlet 2009\n", + "227953 Hamlet 2011\n", + "178290 Hamlet 2014\n", + "186137 Hamlet 2015\n", + "191940 Hamlet 2016\n", + "244747 Hamlet 2017" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "movies[(movies['year']>=2000) & (movies['title']=='Hamlet')].sort_values('year')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q6: How many roles in the movie \"Inception\" are of the supporting cast (extra credits)\n", + "- supporting cast are NOT ranked by an \"n\" value (NaN)\n", + "- check for how to filter based on nulls" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Answer is 27\n" + ] + } + ], + "source": [ + "# n is the only column with NaN, so dropping NaN rows removes the unranked (supporting-cast) rows\n", + "castClean=cast.dropna()\n", + "# Count the Inception rows with and without the NaN rows\n", + "a=cast[cast['title']=='Inception'].shape[0] \n", + "aClean=castClean[castClean['title']=='Inception'].shape[0] \n", + "print('Answer is',a-aClean)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q7: How many roles in the movie \"Inception\" are of the main cast\n", + "- main cast always have an 'n' value" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Answer is 51\n" + ] + }, + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " title year name \\\n", + "0 Closet Monster 2015 Buffy #1 \n", + "1 Suuri illusioni 1985 Homo $ \n", + "2 Battle of the Sexes 2017 $hutter \n", + "3 Secret in Their Eyes 2015 $hutter \n", + "4 Steve Jobs 2015 $hutter \n", + "... ... ... ... \n", + "3786171 Foxtrot 1988 Lilja ??risd?ttir \n", + "3786172 Niceland (Population. 1.000.002) 2004 Sigr??ur J?na ??risd?ttir \n", + "3786173 Skammdegi 1985 Dalla ??r?ard?ttir \n", + "3786174 U.S.S.S.S... 2003 Krist?n Andrea ??r?ard?ttir \n", + "3786175 Bye Bye Blue Bird 1999 Rosa ? R?gvu \n", + "\n", + " type character n \n", + "0 actor Buffy 4 31.0 \n", + "1 actor Guests 22.0 \n", + "2 actor Bobby Riggs Fan 10.0 \n", + "3 actor 2002 Dodger Fan NaN \n", + "4 actor 1988 Opera House Patron NaN \n", + "... ... ... ... \n", + "3786171 actress D?ra 24.0 \n", + "3786172 actress Woman in Bus 26.0 \n", + "3786173 actress Hj?krunarkona 9.0 \n", + "3786174 actress Afgr.dama ? bens?nst?? 17.0 \n", + "3786175 actress Pensionatv?rtinde NaN \n", + "\n", + "[3786176 rows x 6 columns]" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "print('Answer is',aClean)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q8: Show the top ten cast (actors\\actresses) in the movie \"Inception\" \n", + "- main cast always have an 'n' value\n", + "- remember to sort!" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " title year name type character \\\n", + "3731263 Inception 2010 Shannon Welles actress Old Mal \n", + "833376 Inception 2010 Jack Gilroy actor Old Cobb \n", + "2250605 Inception 2010 Jason Tendell actor Fischer's Driver \n", + "3473041 Inception 2010 Lisa (II) Reynolds actress Private Nurse \n", + "1812091 Inception 2010 Andrew Pleavin actor Businessman \n", + "2049179 Inception 2010 Felix Scott actor Businessman \n", + "807795 Inception 2010 Michael Gaston actor Immigration Officer \n", + "149008 Inception 2010 Peter Basham actor Fischer's Jet Captain \n", + "3444628 Inception 2010 Nicole Pulliam actress Lobby Sub Con \n", + "3203564 Inception 2010 Alex (II) Lombard actress Lobby Sub Con \n", + "\n", + " n \n", + "3731263 51.0 \n", + "833376 50.0 \n", + "2250605 49.0 \n", + "3473041 48.0 \n", + "1812091 47.0 \n", + "2049179 46.0 \n", + "807795 45.0 \n", + "149008 44.0 \n", + "3444628 43.0 \n", + "3203564 42.0 " + ] + }, + "execution_count": 68, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "castClean[castClean['title']=='Inception'].sort_values('n',ascending=False).head(10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q9:\n", + "\n", + "(A) List all movies where there was a character 'Albus Dumbledore' \n", + "\n", + "(B) Now modify the above to show only the actors who played the character 'Albus Dumbledore'\n", + "- For Part (B) remember the same actor might play the same role in multiple movies" + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " title year name \\\n", + "704984 Epic Movie 2007 Dane Farwell \n", + "792421 Harry Potter and the Goblet of Fire 2005 Michael Gambon \n", + "792423 Harry Potter and the Order of the Phoenix 2007 Michael Gambon \n", + "792424 Harry Potter and the Prisoner of Azkaban 2004 Michael Gambon \n", + "947789 Harry 
Potter and the Chamber of Secrets 2002 Richard Harris \n", + "947790 Harry Potter and the Sorcerer's Stone 2001 Richard Harris \n", + "1685537 Ultimate Hero Project 2013 George (X) O'Connor \n", + "2248085 Potter 2015 Timothy Tedmanson \n", + "\n", + " type character n \n", + "704984 actor Albus Dumbledore 17.0 \n", + "792421 actor Albus Dumbledore 37.0 \n", + "792423 actor Albus Dumbledore 36.0 \n", + "792424 actor Albus Dumbledore 27.0 \n", + "947789 actor Albus Dumbledore 32.0 \n", + "947790 actor Albus Dumbledore 1.0 \n", + "1685537 actor Albus Dumbledore NaN \n", + "2248085 actor Albus Dumbledore NaN \n" + ] + }, + { + "data": { + "text/plain": [ + "Index(['Dane Farwell', 'George (X) O'Connor', 'Michael Gambon',\n", + " 'Richard Harris', 'Timothy Tedmanson'],\n", + " dtype='object', name='name')" + ] + }, + "execution_count": 75, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#A\n", + "print(cast[cast['character']=='Albus Dumbledore'])" + ] + }, + { + "cell_type": "code", + "execution_count": 93, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Index(['Dane Farwell', 'George (X) O'Connor', 'Michael Gambon',\n", + " 'Richard Harris', 'Timothy Tedmanson'],\n", + " dtype='object', name='name')" + ] + }, + "execution_count": 93, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#B\n", + "cast[cast['character']=='Albus Dumbledore'].groupby('name').count().index" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q10:\n", + "\n", + "(A) How many roles has 'Keanu Reeves' played throughout his career?\n", + "\n", + "(B) List the leading roles that 'Keanu Reeves' played on or after 1999 in order by year." + ] + }, + { + "cell_type": "code", + "execution_count": 91, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "56\n" + ] + }, + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " title year name type \\\n", + "1892378 Siberia 2018 Keanu Reeves actor \n", + "1892362 John Wick: Chapter 2 2017 Keanu Reeves actor \n", + "1892399 The Whole Truth 2016 Keanu Reeves actor \n", + "1892366 Knock Knock 2015 Keanu Reeves actor \n", + "1892361 John Wick 2014 Keanu Reeves actor \n", + "1892342 47 Ronin 2013 Keanu Reeves actor \n", + "1892359 Henry's Crime 2010 Keanu Reeves actor \n", + "1892382 Street Kings 2008 Keanu Reeves actor \n", + "1892385 The Day the Earth Stood Still 2008 Keanu Reeves actor \n", + "1892388 The Lake House 2006 Keanu Reeves actor \n", + "1892348 Constantine 2005 Keanu Reeves actor \n", + "1892358 Hard Ball 2001 Keanu Reeves actor \n", + "1892383 Sweet November 2001 Keanu Reeves actor \n", + "1892397 The Replacements 2000 Keanu Reeves actor \n", + "1892390 The Matrix 1999 Keanu Reeves actor \n", + "\n", + " character n \n", + "1892378 Lucas Hill 1.0 \n", + "1892362 John Wick 1.0 \n", + "1892399 Ramsey 1.0 \n", + "1892366 Evan 1.0 \n", + "1892361 John Wick 1.0 \n", + "1892342 Kai 1.0 \n", + "1892359 Henry Torne 1.0 \n", + "1892382 Detective Tom Ludlow 1.0 \n", + "1892385 Klaatu 1.0 \n", + "1892388 Alex Wyler 1.0 \n", + "1892348 John Constantine 1.0 \n", + "1892358 Conor O'Neill 1.0 \n", + "1892383 Nelson Moss 1.0 \n", + "1892397 Shane Falco 1.0 \n", + "1892390 Neo 1.0 " + ] + }, + "execution_count": 91, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#A\n", + "print(len(cast[cast['name']=='Keanu Reeves'].groupby('character')))" + ] + }, + { + "cell_type": "code", + "execution_count": 92, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " title year name type \\\n", + "1892378 Siberia 2018 Keanu Reeves actor \n", + "1892362 John Wick: Chapter 2 2017 Keanu Reeves actor \n", + "1892399 The Whole Truth 2016 Keanu Reeves actor \n", + "1892366 Knock Knock 2015 Keanu Reeves actor \n", + "1892361 John Wick 2014 Keanu Reeves actor \n", + "1892342 47 Ronin 2013 Keanu Reeves actor \n", + "1892359 Henry's Crime 2010 Keanu Reeves actor \n", + "1892382 Street Kings 2008 Keanu Reeves actor \n", + "1892385 The Day the Earth Stood Still 2008 Keanu Reeves actor \n", + "1892388 The Lake House 2006 Keanu Reeves actor \n", + "1892348 Constantine 2005 Keanu Reeves actor \n", + "1892358 Hard Ball 2001 Keanu Reeves actor \n", + "1892383 Sweet November 2001 Keanu Reeves actor \n", + "1892397 The Replacements 2000 Keanu Reeves actor \n", + "1892390 The Matrix 1999 Keanu Reeves actor \n", + "\n", + " character n \n", + "1892378 Lucas Hill 1.0 \n", + "1892362 John Wick 1.0 \n", + "1892399 Ramsey 1.0 \n", + "1892366 Evan 1.0 \n", + "1892361 John Wick 1.0 \n", + "1892342 Kai 1.0 \n", + "1892359 Henry Torne 1.0 \n", + "1892382 Detective Tom Ludlow 1.0 \n", + "1892385 Klaatu 1.0 \n", + "1892388 Alex Wyler 1.0 \n", + "1892348 John Constantine 1.0 \n", + "1892358 Conor O'Neill 1.0 \n", + "1892383 Nelson Moss 1.0 \n", + "1892397 Shane Falco 1.0 \n", + "1892390 Neo 1.0 " + ] + }, + "execution_count": 92, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#B\n", + "cast[(cast['name']=='Keanu Reeves') & (cast['n']==1.0) &(cast['year'] >= 1999)].sort_values('year',ascending=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q11: \n", + "\n", + "(A) List the total number of actor and actress roles available from 1950 - 1960\n", + "\n", + "(B) List the total number of actor and actress roles available from 2007 - 2017" + ] + }, + { + "cell_type": "code", + "execution_count": 97, + "metadata": {}, + "outputs": [ + { + "data": { + 
"text/plain": [ + "139020" + ] + }, + "execution_count": 97, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(cast[(cast['year']<=1960) &(cast['year'] >= 1950)].groupby('character'))" + ] + }, + { + "cell_type": "code", + "execution_count": 99, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "619945" + ] + }, + "execution_count": 99, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(cast[(cast['year']<=2017) &(cast['year'] >= 2007)].groupby('character'))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section I - Q12: \n", + "\n", + "(A) List the total number of leading roles available from 2000 to present\n", + "\n", + "(B) List the total number of non-leading roles available from 2000 - present (exclude support cast)\n", + "\n", + "(C) List the total number of support\\extra-credit roles available from 2000 - present" + ] + }, + { + "cell_type": "code", + "execution_count": 101, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "38919" + ] + }, + "execution_count": 101, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(cast[(cast['n']==1.0) &(cast['year'] >= 2000)].groupby('character'))" + ] + }, + { + "cell_type": "code", + "execution_count": 103, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "473347" + ] + }, + "execution_count": 103, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(castClean[(castClean['n']>1.0) &(castClean['year'] >= 2000)].groupby('character'))" + ] + }, + { + "cell_type": "code", + "execution_count": 112, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "382872" + ] + }, + "execution_count": 112, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#C\n", + "len(cast[(pd.isnull(cast.loc[:,'n'])) &(cast['year'] >= 2000)].groupby('character'))" + ] + }, + { + "cell_type": 
"markdown", + "metadata": {}, + "source": [ + "# Section II - Aggregations, Transformations and Visualizations" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## What are the top ten most common movie names of all time?\n" + ] + }, + { + "cell_type": "code", + "execution_count": 113, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Hamlet 20\n", + "Carmen 17\n", + "Macbeth 16\n", + "Maya 12\n", + "Temptation 12\n", + "The Outsider 12\n", + "Freedom 11\n", + "The Three Musketeers 11\n", + "Honeymoon 11\n", + "Othello 11\n", + "Name: title, dtype: int64" + ] + }, + "execution_count": 113, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "top_ten = movies.title.value_counts()[:10]\n", + "top_ten" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Plot the top ten common movie names of all time" + ] + }, + { + "cell_type": "code", + "execution_count": 114, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 114, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAdIAAAD4CAYAAABYIGfSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAew0lEQVR4nO3dfZQcVZ3/8feHEEKQGBYCOEZkFAcwQJiQIfJkBEVWwRURENiwJri7Oe4qiB5wo7Is+EREFEQUGFkEXIQsKygQwLAxKE+B9CQhkwQUgaAEBBZ/DIRAgMn390ffIUWne56qpztJf17n9Omqe2/d+nZNpb+5t6q7FRGYmZnZ4GxW7wDMzMw2Zk6kZmZmOTiRmpmZ5eBEamZmloMTqZmZWQ6b1zsAq70xY8ZEc3NzvcMwM9uodHR0/F9EbF9a7kTagJqbmykUCvUOw8xsoyLp8XLlnto1MzPLwYnUzMwsBydSMzOzHJxIzczMcvDNRg2oc2UXzTNm1zsMswFbMfOIeodgth6PSM3MzHJwIq0iSe+Q9CtJD0t6RNIPJG0hqVXS4Zl2Z0k6bYB9r5A0Ji2vqnbsZmY2OE6kVSJJwPXALyOiBdgV2Br4FtAKHF55azMz21g5kVbPB4FXIuKnABHRDXwR+CfgXOA4SYslHZfaj5N0h6RHJZ3S04mkEyXdn9peKmlYpR2q6LuSlkrqzPRtZmY14kRaPXsAHdmCiHgBWAF8E5gVEa0RMStV7w78LTAJ+A9JwyW9FzgOODAiWoFuYEov+/wkxdHu3sChwHclNZVrKGm6pIKkQvfqrsG9QjMzW4/v2q0eATGA8tkRsQZYI+kZYEfgQ8BEYEFxppiRwDO97PMg4Jo0+n1a0m+BfYEbSxtGRDvQDjCiqaVcPGZmNghOpNWzDDg6WyDprcBOFEeWpdZklrsp/i0EXBkRX+nnPjWIOM3MrIo8tVs9c4GtJH0aIF3b/B5wBfA0MKqffRwjaYfUx7aSdu6l/e8oXnsdJml7YDJw/+BfgpmZDZQTaZVERABHAcdKehj4A/AK8FVgHsWbixb3dkNQRCwHzgDmSFoC3A6UveaZ3AAsAR4AfgN8OSL+Uo3XY2Zm/aPi+781khFNLdE09YJ6h2E2YP5mI6snSR0R0VZa7mukDWivsaMp+A3JzKwqPLVrZmaWgxOpmZlZDk6kZmZmOTiRmpmZ5eBEamZmloMTqZmZWQ5OpGZmZjk4kZqZmeXgRGpmZpaDE6mZmVkO/orABtS5sovmGbPrHYbZgPm7dm1D5BGpmZlZDg2VSCWtKlmfJumiesVjZmYbv4ZKpGZmZtXmRJpI2lnSXElL0vM7U/kVki6UdI+kRyUdk9nmdEkL0jZnp7JvSPpCps23JJ0i6WBJv5X035L+IGmmpCmS7pfUKWmXPuIYcHxmZjb0Gi2RjpS0uOcBfD1TdxFwVUSMB64GLszUNQEHAR8DZgJIOgxoASYBrcBESZOB/wSmpjabAcen/gD2Br4A7AX8A7BrREwCLgNO7iOOAcVXStJ0SQVJhe7VXX0fKTMz65dGS6QvR0RrzwM4M1O3P/DztPwziompxy8jYm1ELAd2TGWHpcciYCGwO9ASESuA5yRN6KmPiOfSNgsi4qmIWAM8AsxJ5Z1Acx9xDDS+N4mI9ohoi4i2YVuNLn90zMxswPzxl8ois7wms6zM8zkRcWmZbS8DpgFvAy6v0M/azPpaKv8toh/l5eIzM7MaaLQRaW/uoTgNCzAFuKuP9r8GPiNpawBJYyXtkOpuAD4C7JvaVSOOgcZnZmY14BHpOqcAl0s6HXgWOKm3xhExR9J7gXslAawCTgSeiYhXJc0Dno+I7irFMaD4zMysNhRRaebQBivdZLQQODYiHq53PKXa2tqiUCjUOwwzs42KpI6IaCst99RulUkaB/wRmLshJlEzM6suT+1WWbpz9t31jsPMzGrDI1IzM7McnEjNzMxycCI1MzPLwYnUzMwsBydSMzOzHJx
IzczMcnAiNTMzy8GJ1MzMLAd/IUMD6lzZRfOM2fUOw2zAVsw8ot4hmK3HI1IzM7McnEjNzMxy6DWRStpO0uL0+IuklWn5eUnLB7NDSSdl+nxVUmdaninpLEmnDe6lDCiGOyT9Sen3z1LZLyWtGmR/V0g6pp9tt5H0r4PZj5mZbXh6TaQR8VxEtEZEK3AJcH5abgXWDmaHEfHTTJ9PAoek9Rn92V5F1RhJPw8cmPrcBmiqQp/9sQ1QtUQqaVi1+jIzs4HLk5CGSfqJpGWS5kgaCSBpF0m3SeqQdKek3QfY77g0YnxU0impz2ZJD0r6McXf+dxJ0umSFkhaIunsno0lnSjp/jTKvbSXRHMtcHxa/iRwfaaPgyXdnFm/SNK0tDxT0vK03/NKO5X0jTRC3axCjDOBXVJ8303bDOi1SFol6euS7gP27ysmMzMbOnkSaQvwo4jYg+Lo7uhU3g6cHBETgdOAHw+w392BvwUmAf8haXgq3w24KiImpOWW1KYVmChpsqT3AscBB6YRbzcwpcJ+5gKTU3I6HpjVV2CStgWOAvaIiPHAN0vqzwV2AE4CDi0XIzADeCSNwk+XdNggXstbgKUR8T5geW8xZWKbLqkgqdC9uquvl2pmZv2U5+Mvj0XE4rTcATRL2ho4ALguc/lxxAD7nR0Ra4A1kp4Bdkzlj0fE/LR8WHosSutbU0xG44GJwIK0/5HAMxX20w3cRTFZjYyIFZmYK3kBeAW4TNJs4OZM3b8D90XEdICUIMvF+KeSPgfzWrqBX/QjpjdERDvF/+Qwoqkl+nqhZmbWP3kS6ZrMcjfFN/rNgOfTCKpa/fbE+FKmXMA5EXFpdkNJJwNXRsRX+rmva4EbgLNKyl/nzaP1LQEi4nVJk4APURzFfh74YGqzgOJoctuI+GsvMTaX7Gswr+WViOjuR0xmZjbEqvrxl4h4AXhM0rHwxo1Be1dzH8mvgc+kETCSxkrageJ07TFpGUnbStq5l37uBM4Brikpf5zitdoRkkZTTFKk/Y2OiFuAUylOxfa4jeL1z9mSRvUS44vAqGq9lj5iMjOzITYU32w0BbhY0hnAcIqjvgequYOImJOuId6bpj1XASdGxPK03znpzt7XgM9RTIzl+glgvZtzIuLPkv4bWAI8zLpp11HAryRtSXEk+cWS7a5LSfRG4HDg52VifETS3ZKWArem66R5XkuvMZmZ2dBSMZdYI2lra4tCoVDvMMzMNiqSOiKirbTc32xkZmaWgxOpmZlZDk6kZmZmOTiRmpmZ5eBEamZmloMTqZmZWQ5OpGZmZjk4kZqZmeXgRGpmZpaDE6mZmVkOQ/Fdu7aB61zZRfOM2fUOw2zAVsw8ot4hmK3HI1IzM7McnEgHSFK3pMWZR3OV+58m6aJq9mlmZkPHU7sD93KlHy5X8XfQFBFraxuSmZnVi0ekOUlqlvSgpB8DC4GdJJ0uaYGkJZLOzrQ9UdL9aSR7qaRhqfwkSX+Q9FvgwEz7nSXNTf3MlfTOVH6FpIslzZP0qKQPSLo8xXFFbY+AmVljcyIduJGZad0bUtluwFURMSEttwCTgFZgoqTJ6ce7jwMOTCPabmCKpCbgbIoJ9MPAuMy+Lkr9jgeuBi7M1P0N8EGKP+R9E3A+sAewl6TW0qAlTZdUkFToXt1VhcNgZmbgqd3BeNPUbrpG+nhEzE9Fh6XHorS+NcXEOh6YCCwozgAzEngGeB9wR0Q8m/qbBeyatt0f+GRa/hlwbiaOmyIiJHUCT0dEZ9p+GdAMLM4GHRHtQDvAiKYW/5q7mVmVOJFWx0uZZQHnRMSl2QaSTgaujIivlJR/AuhvYsu2W5Oe12aWe9b9dzUzqxFP7Vbfr4HPSNoaQNJYSTsAc4Fj0jKStpW0M3AfcLCk7SQNB47N9HUPcHxangLcVasXYWZm/eORS5VFxJx0PfTeNIW7CjgxIpZLOgOYI2kz4DXgcxExX9JZwL3AUxRvWBqWujsFuFzS6cCzwEm1fTV
mZtYXRfhyWaMZ0dQSTVMvqHcYZgPmbzayepLUERFtpeUekTagvcaOpuA3JDOzqvA1UjMzsxycSM3MzHJwIjUzM8vBidTMzCwHJ1IzM7McnEjNzMxycCI1MzPLwYnUzMwsBydSMzOzHJxIzczMcvBXBDagzpVdNM+YXe8wzGrO39VrQ8EjUjMzsxw2yUSafttzcXr8RdLKtPy8pOU5+/6EpCWSHpLUmX6Yu69tWiUd3kebt0v6nwp1d0ha7xcHzMys/jbJqd2IeA5oBUi/9bkqIs6T1AzcPNh+Je0NnAd8OCIek/Qu4HZJj0bEkl42bQXagFt6iflJ4JjBxlYS57CI6K5GX2Zm1rtNckTah2GSfiJpmaQ5kkYCSNpF0m2SOiTdKWn3MtueBnw7Ih4DSM/nAKenPt4YOUoaI2mFpC2ArwPHpVHxcZI+kBkxL5I0SlKzpKVp25GSrk0j31nAyJ4AJB0m6V5JCyVdJ2nrVL5C0pmS7gKOHaqDZ2Zmb9aIibQF+FFE7AE8DxydytuBkyNiIsWE+eMy2+4BdJSUFVJ5WRHxKnAmMCsiWiNiVur/cxHRCrwfeLlks38BVkfEeOBbwEQoJmfgDODQiNgn7ftLme1eiYiDIuLa0jgkTZdUkFToXt1VKVwzMxugTXJqtw+PRcTitNwBNKdR3QHAdZJ62o0os62A6EdZX+4Gvi/pauD6iHgis1+AycCFABGxRFLPtPF+wDjg7tR+C+DezHazKu0wItop/meBEU0tA43XzMwqaMREuiaz3E1x2nQz4Pk0QuzNMorXOrPXQ/cBem5gep11o/wtK3USETMlzQYOB+ZLOhR4pbRZmU0F3B4RJ1To+qXewzczs2prxKnd9UTEC8Bjko4FUNHeZZqeB3wl3bREev4q8L1Uv4I0Dcubbxx6ERjVsyJpl4jojIjvUJyeLb0e+ztgSmq7JzA+lc8HDpT0nlS3laRdB/p6zcysepxI15kC/KOkByiOPI8sbZCmhP8NuEnSQ8BNwJczU8XnAf8i6R5gTGbTecC4npuNgFMlLU37ehm4tWRXFwNbpyndLwP3p/0/C0wDrkl181k/CZuZWQ0pwpfLGs2IppZomnpBvcMwqzl/s5HlIakjItb7TH8jXiNteHuNHU3BbyhmZlXhqV0zM7McnEjNzMxycCI1MzPLwYnUzMwsBydSMzOzHJxIzczMcnAiNTMzy8GJ1MzMLAcnUjMzsxycSM3MzHLwVwQ2oM6VXTTPmF3vMMxqzt+1a0PBI1IzM7McGjaRStou/azZYkl/kbQys75Flfd1qqStBtpO0i2StqlmLGZmVl0Nm0gj4rmIaI2IVuAS4Pye9Yh4tcq7OxXoM5GWtouIwyPi+SrHYmZmVdSwibQcSRMl/VZSh6RfS2pK5XdIOl/S7yQ9KGlfSddLeljSN1ObZkkPSbpS0hJJ/yNpK0mnAG8H5kmal9peLKkgaZmks1NZuXYrJI1Jy19KPwa+VNKpmX0+KOknqa85kkbW+LCZmTU0J9J1BPwQOCYiJgKXA9/K1L8aEZMpjl5/BXwO2BOYJmm71GY3oD0ixgMvAP8aERcCTwKHRMQhqd3X0o/Djgc+IGl8hXbFwKSJwEnA+4D9gH+WNCFVtwA/iog9gOeBo8u+OGl6St6F7tVdgzk+ZmZWhhPpOiMoJsbbJS0GzgDekam/MT13Assi4qmIWAM8CuyU6v4cEXen5f8CDqqwr09JWggsAvYAxvUR20HADRHxUkSsAq4H3p/qHouIxWm5A2gu10FEtEdEW0S0DdtqdB+7MzOz/vLHX9YRxQS5f4X6Nel5bWa5Z73nOEbJNqXrSHoXcBqwb0T8P0lXAFv2I7ZKsrF0A57aNTOrIY9I11kDbC9pfwBJwyXtMcA+3tmzPXACcFdafhEYlZbfCrwEdEnaEfhoZvtsu6zfAZ9I11zfAhwF3DnA2MzMbAg4ka6zFjgG+I6kB4DFwAED7ONBYKqkJcC2wMWpvB24VdK
8iHiA4pTuMorXYe/ObP9Gu2ynEbEQuAK4H7gPuCwiFg0wNjMzGwKKWG/20QZBUjNwc0TsWe9Y+tLW1haFQqHeYZiZbVQkdaQbRd/EI1IzM7McfLNRlUTECop3/ZqZWQPxiNTMzCwHJ1IzM7McnEjNzMxycCI1MzPLwYnUzMwsBydSMzOzHJxIzczMcnAiNTMzy8FfyNCAOld20Txjdr3DMKu5FTOPqHcItgnyiNTMzCwHJ1IzM7McnEhrTFJI+llmfXNJz0q6uZ5xmZnZ4DiR1t5LwJ6SRqb1DwMr6xiPmZnl4ERaH7cCPXc9nABc01MhaZKkeyQtSs+7pfI7JbVm2t0taXyl9mZmVhtOpPVxLXC8pC2B8cB9mbqHgMkRMQE4E/h2Kr8MmAYgaVdgREQs6aX9m0iaLqkgqdC9umsIXpKZWWPyx1/qICKWSGqmOBq9paR6NHClpBYggOGp/Drg3yWdDnwGuKKP9qX7bAfaAUY0tUTVXoyZWYPziLR+bgTOIzOtm3wDmBcRewJ/B2wJEBGrgduBI4FPAT/vrb2ZmdWGR6T1cznQFRGdkg7OlI9m3c1H00q2uQy4CbgzIv7aj/ZmZjbEPCKtk4h4IiJ+UKbqXOAcSXcDw0q26QBeAH7an/ZmZjb0FOHLZRsLSW8H7gB2j4i1g+2nra0tCoVC1eIyM2sEkjoioq203CPSjYSkT1O8u/dreZKomZlVl6+RbiQi4irgqnrHYWZmb+YRqZmZWQ5OpGZmZjk4kZqZmeXgRGpmZpaDE6mZmVkOTqRmZmY5OJGamZnl4ERqZmaWg7+QoQF1ruyiecbseodhZjW2YuYR9Q5hk+QRqZmZWQ5OpAMkKST9LLO+uaRnJd08yP5WSBozgPYHSzogs36FpGMGs28zM8vPiXTgXgL2lDQyrX+Ydb8HWgsHAwf01cjMzGrDiXRwbgV6LjacAFzTUyFpkqR7JC1Kz7ul8mGSzpPUKWmJpJMz/Z0u6f70eE9qv72kX0hakB4HSmoGPgt8UdJiSe9P209O+3rUo1Mzs9pyIh2ca4HjJW0JjKf482Y9HgImR8QE4Ezg26l8OvAuYEJEjAeuzmzzQkRMAi4CLkhlPwDOj4h9gaOByyJiBXBJKm+NiDtT2ybgIOBjwMxyAUuaLqkgqdC9umvwr9zMzN7Ed+0OQkQsSaPDE4BbSqpHA1dKagECGJ7KDwUuiYjXUx9/zWxzTeb5/Ez7cZJ62rxV0qgKIf0y/Ubpckk7Voi5HWgHGNHU4l9zNzOrEifSwbsROI/iNcvtMuXfAOZFxFEp2d6RykUxsZYTZZY3A/aPiJezDTOJNWtNtknfoZuZWbV4anfwLge+HhGdJeWjWXfz0bRM+Rzgs5I2B5C0babuuMzzvZn2n+9pIKk1Lb4IVBqZmplZjTmRDlJEPBERPyhTdS5wjqS7gWGZ8suAPwFLJD0A/H2mboSk+4AvAF9MZacAbenGpOUUbzICuAk4quRmIzMzqxNF+HJZoxnR1BJNUy+odxhmVmP+ZqN8JHVERFtpua+RNqC9xo6m4H9QZmZV4aldMzOzHJxIzczMcnAiNTMzy8GJ1MzMLAcnUjMzsxycSM3MzHJwIjUzM8vBidTMzCwHJ1IzM7McnEjNzMxy8FcENqDOlV00z5hd7zDMbCPj7+otzyNSMzOzHJxIc5L0NknXSnpE0nJJt0jatd5xmZlZbTiR5iBJwA3AHRGxS0SMA74K7NifbSX5+JuZbeT8Rp7PIcBrEXFJT0FELAYWSZoraaGkTklHAkhqlvSgpB8DC4H3S3pI0mWSlkq6WtKhku6W9LCkSWm7t0i6XNICSYsy/U2TdL2k21L7c2t+BMzMGpwTaT57Ah1lyl8BjoqIfSgm2++l0SvAbsBVETEBeBx4D/ADYDywO/D3wEHAaRRHtwBfA34TEfum/r4r6S2prhU4DtgLOE7STuUClTRdUkFSoXt1V46XbGZmWb5rd2gI+LakycBaYCzrpns
fj4j5mbaPRUQngKRlwNyICEmdQHNqcxjwcUmnpfUtgXem5bkR0ZW2Xw7sDPy5NKCIaAfaAUY0tURVXqWZmTmR5rQMOKZM+RRge2BiRLwmaQXF5AfwUknbNZnltZn1taz7+wg4OiJ+n91Q0vtKtu/Gf1Mzs5ry1G4+vwFGSPrnngJJ+1IcFT6TkughaT2PXwMn90wPS5qQsz8zM6sSJ9IcIiKAo4APp4+/LAPOAm4B2iQVKI5OH8q5q28Aw4ElkpamdTMz2wComAuskYxoaommqRfUOwwz28g0+jcbSeqIiLbScl9Pa0B7jR1NocH/QZiZVYunds3MzHJwIjUzM8vBidTMzCwHJ1IzM7McnEjNzMxycCI1MzPLwYnUzMwsBydSMzOzHJxIzczMcvA3GzWgzpVdNM+YXe8wzMxqaqi+4tAjUjMzsxycSM3MzHJwIq0iSatK1qdJuqhKfd8hab1fHShpc6qkraqxPzMz6x8n0k3LqYATqZlZDTmR1oikv5N0n6RFkv5X0o6p/CxJV0qaI2mFpE9KOldSp6TbJA0v09dhku6VtFDSdZK2lnQK8HZgnqR5tX59ZmaNyom0ukZKWtzzAL6eqbsL2C8iJgDXAl/O1O0CHAEcCfwXMC8i9gJeTuVvkDQGOAM4NCL2AQrAlyLiQuBJ4JCIOKQ0MEnTJRUkFbpXd1Xp5ZqZmT/+Ul0vR0Rrz4qkaUDPdc13ALMkNQFbAI9ltrs1Il6T1AkMA25L5Z1Ac8k+9gPGAXdLIvV1b1+BRUQ70A4woqklBvKizMysMifS2vkh8P2IuFHSwcBZmbo1ABGxVtJrEdGT6Nay/t9IwO0RccLQhmtmZv3hqd3aGQ2sTMtTc/QzHzhQ0nsAJG0laddU9yIwKkffZmY2QE6ktXMWcJ2kO4H/G2wnEfEsMA24RtISiol191TdDtzqm43MzGpH62YRrVG0tbVFoVCodxhmZhsVSR0Rsd7n+T0iNTMzy8GJ1MzMLAcnUjMzsxycSM3MzHJwIjUzM8vBd+02IEkvAr+vdxy9GEOOjwgNsQ05NnB8eTm+fDb1+HaOiO1LC/3NRo3p9+Vu4d5QSCpsqPFtyLGB48vL8eXTqPF5atfMzCwHJ1IzM7McnEgbU3u9A+jDhhzfhhwbOL68HF8+DRmfbzYyMzPLwSNSMzOzHJxIzczMcnAi3URJ+oik30v6o6QZZeol6cJUv0TSPjWMbSdJ8yQ9KGmZpC+UaXOwpC5Ji9PjzFrFl/a/QlJn2vd6P5VT5+O3W+a4LJb0gqRTS9rU9PhJulzSM5KWZsq2lXS7pIfT899U2LbXc3UI4/uupIfS3+8GSdtU2LbXc2EI4ztL0srM3/DwCtvW6/jNysS2QtLiCtsO6fGr9H5S0/MvIvzYxB7AMOAR4N3AFsADwLiSNocDtwIC9gPuq2F8TcA+aXkU8Icy8R0M3FzHY7gCGNNLfd2OX5m/9V8oflC8bscPmAzsAyzNlJ0LzEjLM4DvVIi/13N1COM7DNg8LX+nXHz9OReGML6zgNP68fevy/Erqf8ecGY9jl+l95Nann8ekW6aJgF/jIhHI+JV4FrgyJI2RwJXRdF8YBtJTbUILiKeioiFaflF4EFgbC32XUV1O34lPgQ8EhGP12Hfb4iI3wF/LSk+ErgyLV8JfKLMpv05V4ckvoiYExGvp9X5wDuqvd/+qnD8+qNux6+HJAGfAq6p9n77o5f3k5qdf06km6axwJ8z60+wfqLqT5shJ6kZmADcV6Z6f0kPSLpV0h61jYwA5kjqkDS9TP0GcfyA46n8BlbP4wewY0Q8BcU3O2CHMm02lOP4GYozDOX0dS4Mpc+nqefLK0xNbgjH7/3A0xHxcIX6mh2/kveTmp1/TqSbJpUpK/2cU3/aDClJWwO/AE6NiBdKqhdSnK7cG/gh8MtaxgYcGBH7AB8FPidpckn9hnD8tgA+DlxXprrex6+/NoTj+DXgdeDqCk36OheGysXALkAr8BT
F6dNSdT9+wAn0PhqtyfHr4/2k4mZlygZ8/JxIN01PADtl1t8BPDmINkNG0nCKJ/3VEXF9aX1EvBARq9LyLcBwSWNqFV9EPJmenwFuoDgFlFXX45d8FFgYEU+XVtT7+CVP90x3p+dnyrSp93k4FfgYMCXSRbNS/TgXhkREPB0R3RGxFvhJhf3W+/htDnwSmFWpTS2OX4X3k5qdf06km6YFQIukd6VRy/HAjSVtbgQ+ne4+3Q/o6pkGGWrpmsp/Ag9GxPcrtHlbaoekSRTP1edqFN9bJI3qWaZ4U8rSkmZ1O34ZFUcC9Tx+GTcCU9PyVOBXZdr051wdEpI+Avwb8PGIWF2hTX/OhaGKL3vN/agK+63b8UsOBR6KiCfKVdbi+PXyflK782+o7qTyo74PineV/oHiHWlfS2WfBT6blgX8KNV3Am01jO0gitMnS4DF6XF4SXyfB5ZRvItuPnBADeN7d9rvAymGDer4pf1vRTExjs6U1e34UUzoTwGvUfxf/j8C2wFzgYfT87ap7duBW3o7V2sU3x8pXh/rOQcvKY2v0rlQo/h+ls6tJRTf3Js2pOOXyq/oOecybWt6/Hp5P6nZ+eevCDQzM8vBU7tmZmY5OJGamZnl4ERqZmaWgxOpmZlZDk6kZmZmOTiRmpmZ5eBEamZmlsP/B7piWX5OilDDAAAAAElFTkSuQmCC", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "top_ten.plot(kind='barh')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q1: Which years in the 2000s saw the most movies released? (Show top 3)" + ] + }, + { + "cell_type": "code", + "execution_count": 134, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
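<div>\n", + "<table border=\"1\" class=\"dataframe\">\n", + "  <thead>\n", + "    <tr style=\"text-align: right;\">\n", + "      <th></th>\n", + "      <th>title</th>\n", + "    </tr>\n", + "    <tr>\n", + "      <th>year</th>\n", + "      <th></th>\n", + "    </tr>\n", + "  </thead>\n", + "  <tbody>\n", + "    <tr>\n", + "      <th>2017</th>\n", + "      <td>11474</td>\n", + "    </tr>\n", + "    <tr>\n", + "      <th>2016</th>\n", + "      <td>9440</td>\n", + "    </tr>\n", + "    <tr>\n", + "      <th>2015</th>\n", + "      <td>8702</td>\n", + "    </tr>\n", + "  </tbody>\n", + "</table>\n", + "</div>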
" + ], + "text/plain": [ + " title\n", + "year \n", + "2017 11474\n", + "2016 9440\n", + "2015 8702" + ] + }, + "execution_count": 134, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "movies[movies['year']>=2000].groupby(by='year').count().sort_values(('title'),ascending=False)[:3]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q2: # Plot the total number of films released per-decade (1890, 1900, 1910,....)\n", + "- Hint: Dividing the year and multiplying with a number might give you the decade the year falls into!\n", + "- You might need to sort before plotting" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD4CAYAAAAAczaOAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAWg0lEQVR4nO3df7RlZX3f8fcngwwCAiqQThEdXCVZhVHQuSJWY7Iw4I8QtTW2UKqu1dSo0Sw0sSljYhr7l7oSE4lN4iwdtYnxRzVWjVjqIiZGJZo7gs4MSBl0GkeJA/EHiksi8O0f+7nM4XLv3HMv+9yzh/N+rXXW2fc5e+/zHYb5zpl9nv15UlVIkmbDj027AEnS+rHpS9IMselL0gyx6UvSDLHpS9IMOWLaBazkxBNPrM2bN0+7DEk6rOzcufPWqjpp8fjgm/7mzZuZn5+fdhmSdFhJ8v+WGvfyjiTNEJu+JM2QFZt+klOTfDLJ9Un2JLm0jT+//Xx3krmR/R/e9v9+krcsOtfWJLuS7E1yeZL0/0uSJC1nnE/6dwK/VlX/EjgXeHmSM4DdwL8BPrVo/x8CrwVevcS5/gj4JeD09njGGuuWJK3Bik2/qm6uqi+07e8B1wOnVNX1VXXDEvvfXlWfpmv+90iyCTiuqq6uLvDnfwDP7eHXIEka06qu6SfZDDwO+Nwa3usUYP/Iz/vbmCRpnYzd9JMcC3wQeGVV3baG91rq+v2SEZ9JfinJfJL5W265ZQ1vJUlaylhNP8mD6Br+u6vqz9f4XvuBR4z8/AjgG0vtWFXbq2ququZOOuk+9xZIktZonNk7Ad4OXF9Vb1rrG1XVzcD3kpzbzvlC4MNrPZ8kafXGuSP3ycALgF1Jrm1jrwE2An8AnAR8LMm1VfV0gCT7gOOAI5M8F7igqq4DXga8E3gw8PH2kCStkxWbfpuJs9x8+g8tc8zmZcbngS3jFidJ6pd35ErSDLHpS9IMselL0gyx6UvSDLHpS9IMGXzT3/X17067BEl6wLg/0coPS/KJJDe254e28fOT7GwRyjuTnDdyLqOVJWmK7k+08mXAVVV1OnBV+xngVuDnq+oxwIuAPxk5l9
HKkjRFa45WBp4DvKvt9i5aTHJVXVNVC5k6e4Cjkmw0WlmSpu/+RCv/eMvTWcjVOXmJQ54HXFNVd7CKaOXRlM27fuA1fUnqy8SilZOcCbwBeMnC0BK7LRmtPJqyueHo48ctUZK0gvsTrfzNdslmYVWsAyP7P4Iul+eFVXVTGx47WlmSNBn3J1r5I3Rf1NKeP9z2PwH4GLCtqj6zsLPRypI0feN80l+IVj4vybXt8Szg9cD5SW4Ezm8/A7wC+BfAa0f2X7je/zLgbcBe4CbGiFZ+zCle3pGkvqSbSDNcc3NzNT8/P+0yJOmwkmRnVc0tHh/8HbmSpP7Y9CVphtj0JWmG2PQlaYbY9CVphqy4MHqSU+lycv4ZcDewvarenORhwPuAzcA+4N9W1beTPBz4APAE4J1V9YqRc20F3gk8GLgCuLRWmD606+vfZfNlH1v9r0ySDmP7Xv9zEznvJFI2fwi8Fnj1EucyZVOSpmgSKZu3V9Wn6Zr/PUzZlKTpm3TK5qixUzYlSZMxsZTNpU6xxNiS1/ONVpakyZhIyuYyxk7ZNFpZkiaj95TN5ZiyKUnTt2LgWpKnAH8D7KKbsgnwGrrr+u8HHgn8PfD8qvpWO2YfcBxwJPAd4IKqui7JHAenbH4c+JWVpmwauCZJq7dc4NqK8/TbTJylrscDPG2ZYzYvMz4PbFnpPSVJk+EduZI0Q2z6kjRDbPqSNENs+pI0Q2z6kjRDVpy9M22mbGqIJpWAKE3aODdnnZrkk0muT7InyaVt/GFJPpHkxvb80JFjtiXZm+SGJE8fGd+aZFd77fJ2k5YkaZ30Hq3cXrsIOJMuOvkPk2xo5zJaWZKmqPdo5Tb+3qq6o6q+CuwFzjFaWZKmbxLRyqcAXxs5bCFCeexoZVM2JWkyJhGtvFyE8tjRyqZsStJkTCJaeT9w6sjhCxHKY0crS5ImYxLRyh8BLkqyMclpdF/Yft5oZUmavnHm6T8ZeAGwK8m1bew1wOuB9yf5RVq0MkBV7UnyfuA6upk/L6+qu9pxL+Pe0cofX+nNH3PK8cw7J1qSerFinv60macvSau3XJ6+MQySNENs+pI0Q2z6kjRDbPqSNENs+pI0Q1acsplkB3AhcKCqtrSxs4A/Bo4F9gGXVNVtSY4E3grMAXcDl1bVX7VjtnJwuuYV7bUVpw4ZrTx9xghLDxzjfNJ/J/dNw3wbcFlVPQb4EPCf2/iLAdr4+cDvJll4DxM2JWnKxknZ/BTwrUXDPwl8qm1/Anhe2z6DLmaZqjoAfAeYM2FTkoZhrdf0dwPPbtvP52DWzheB5yQ5okUwbG2vjZ2wKUmanLU2/f9It5jKTuAhwD+18R10DX0e+H3gs3RRDGMnbILRypI0KWtaI7eqvgxcAJDkJ4Cfa+N3Aq9a2C/JZ4EbgW+zioTNqtoObAfYuOn0YedESNJhZE2f9JOc3J5/DPhNupk8JDk6yTFt+3zgzqq6zoRNSRqGcaZsvgf4GeDEJPuB/wocm+TlbZc/B97Rtk8GrkxyN/B1unTOBatO2ARTNiWpTys2/aq6eJmX3rzEvvvoZvYsdZ55YMtqipMk9cs7ciVphtj0JWmG2PQlaYbY9CVphtj0JWmGrOnmrPVkyuZkmaApzZYVP+kn2ZHkQJLdI2NnJbk6ya4kH01yXBt/UJJ3tfHrk2wbOWZrG9+b5PJ2k5YkaR31Ha38fGBjG98KvCTJ5vaa0cqSNGV9RysXcEySI+juvP0n4DajlSVpGPqOVv4AcDtwM/D3wO9U1bdYZbSyKZuSNBl9RyufA9wF/HPgNODXkjyaVUYrV9X2qpqrqrkNRx+/xhIlSYv1Gq0M/Hvgf1fVj4ADST5Dt17u37CKaGVJ0mT0Gq1Md0nnvHSOAc4Fvmy0siQNQ9/Ryv+9be+mu6Tzjqr6UnvNaGVJmrK+o5W/T/fF7lLnMVpZkqbMGAZJmiE2fUmaITZ9SZohNn1JmiE2fUmaIeNM2dwBXAgcqKotbewsur
n5xwL7gEuq6rYkl3AwfA3gscDjq+raJFs5OGXzCuDSlsNzSEYr98MIZUnQc8pmVb27qs6uqrOBFwD7quradowpm5I0ZX2nbI66GHgPgCmbkjQMfadsjvp3tKbPKlM2JUmT0XfKJgBJngj8oKoWVttaVcqm0cqSNBl9p2wuuIiDn/Kh+2Q/dspmVW0HtgNs3HT6il/2SpLG03fK5sLY84H3LoyZsilJw9B3yibAU4H9VfWVRacyZVOSpixjTJWfqrm5uZqfn592GZJ0WEmys6rmFo97R64kzRCbviTNEJu+JM0Qm74kzRCbviTNkDXdnLWeTNkcn0maklay4if9JDuSHEiye2TsrCRXJ9mV5KNJjht57bHttT3t9aPa+Nb2894kl7ebtCRJ66jXaOUkRwB/Cry0qs6ku6nrR+0Yo5Ulacr6jla+APhSVX2xHfuPVXWX0cqSNAx9Ryv/BFBJrkzyhSS/3sZXFa1syqYkTUbf0cpHAE8BLmnP/zrJ01hltHJVba+quaqa23D08WssUZK0WN/RyvuBv66qW9trVwCPp7vOP3a0siRpMvqOVr4SeGySo9uXuj8NXGe0siQNQ6/RylX17SRvAv6O7vLNFVW1MMneaGVJmjKjlSXpAchoZUmSTV+SZolNX5JmiE1fkmaITV+SZsg4UzZ3ABcCB6pqSxs7i25u/rHAPuCSqrotyWbgeuCGdvjfVtVL2zFbOThl8wrg0hpj6tADNVrZGGRJ09BrymZzU1Wd3R4vHRk3ZVOSpqzvlM0lmbIpScPQd8omwGlJrkny10l+qo2tKmVTkjQZfads3gw8sqoeB/wq8GdtVa1VpWwarSxJk9FrymZV3QHc0bZ3JrmJLmN/P6tI2ayq7cB2gI2bTh92ToQkHUZ6TdlMclKSDW370XRf2H7FlE1JGoZeUzaBpwL/LcmdwF10a+UufAlsyqYkTZkpm5L0AGTKpiTJpi9Js8SmL0kzxKYvSTPEpi9JM2RNN2etp6GlbJqOKelwtuIn/SQ7khxIsntk7KwkVyfZleSjLWph9JhHJvl+klePjG1t++9Ncnm7SUuStI4mEa0M8Hvc9+Yro5Ulacp6j1ZO8lzgK8CekTGjlSVpAHqNVk5yDPBfgNct2n9V0cqmbErSZPQdrfw64Peq6vuL9l9VtHJVba+quaqa23D08WssUZK0WK/RysATgV9I8kbgBODuJD8EPsgqopUlSZOxpqaf5OSqOrA4Wrmqfmpkn98Gvl9Vb2k/fy/JucDn6KKV/+B+1i5JWqW+o5UPxWhlSZoyo5Ul6QHIaGVJkk1fkmaJTV+SZohNX5JmiE1fkmbIOFM2dwAXAgeqaksbO4tubv6xwD7gkqq6Lck5wPaFQ4HfrqoPtWO2cnDK5hXApTXG1KGhRSsfirHLkoau75TN3cBcVZ3djnlrkoW/WEzZlKQp6zVls6p+UFV3tvGjaPk6pmxK0jD0mrIJkOSJSfYAu4CXtr8EVpWyKUmajL5TNqmqz1XVmcATgG1JjmKVKZtGK0vSZPSdsjm6z/VJbge20H2yHztls6q2074Q3rjp9GHnREjSYWRNn/STnNye75WymeS0hS9ukzyK7tr/vqq6GfheknPb2rgvBD7cQ/2SpFXoO2XzKcBlSX4E3A38clXd2l4zZVOSpsyUTUl6ADJlU5Jk05ekWWLTl6QZYtOXpBli05ekGbKmm7PW01BTNk3UlHQ4WvGTfpIdSQ4k2T0ydlaSq5PsSvLRJMe18fOT7GzjO5OcN3LM1ja+N8nl7SYtSdI66jta+Vbg59v4i4A/GTnGaGVJmrK+o5WvqaqFTJ09wFFJNhqtLEnD0Hu08ojnAddU1R2sMlrZlE1Jmozeo5UBkpwJvAF4ycLQEudYNv+hqrZX1VxVzW04+vg1lihJWqz3aOUkj6C7zv/CqrqpDa8qWlmSNBl9RyufAHwM2FZVn1nY32hlSRqGFVM2R6OVgW/SopWB0WjlbVVVSX4T2AbcOH
KKC6rqQJI57h2t/Cs1RsSnKZuStHrLpWwarSxJD0BGK0uSbPqSNEts+pI0Q2z6kjRDbPqSNENWvDkryQ7gQuBAVW1pY2fRzc0/FtgHXFJVtyV5OPAB4AnAO6vqFSPn2crBKZtXAJeOM2VzvaOVjUyW9EDWd8rmD4HXAq9e4jymbErSlPWdsnl7VX2arvnfw5RNSRqGSaZsjlpVyqYkaTImkrK5hFWlbBqtLEmT0XvK5jJWlbJZVduB7QAbN50+7JwISTqM9JqyuRxTNiVpGMaZsnlPymaS/bSUzSSjKZvvGNl/H3AccGSS59KlbF4HvIx7p2x+fJwCH3PK8cw7jVKSerFi06+qi5d56c3L7L95mfF5YMvYlUmSeucduZI0Q2z6kjRDbPqSNENs+pI0Q2z6kjRDbPqSNENWbPpJdiQ5kGT3yNhZSa5OsivJR5McN/LatiR7k9yQ5Okj41vb/nuTXN5u0pIkraNeo5WTnAFcBJzZjvnDJBvaMUYrS9KU9RqtDDwHeG9V3VFVXwX2AucYrSxJw9B3tPIpwNdG9luIUF5VtPJoyuYtt9yyxhIlSYv1Ha28XITyqqKVq2p7Vc1V1dxJJ520xhIlSYv1Ha28n3svqLIQobyqaGVJ0mT0Ha38EeCiJBuTnEb3he3njVaWpGHoNVq5qvYkeT9wHXAn8PKquqvtt6ZoZUlSf9JNphmuubm5mp+fn3YZknRYSbKzquYWj3tHriTNEJu+JM0Qm74kzRCbviTNEJu+JM2QtaZsnp3kb5Nc2+ISzmnjRyZ5R0vT/GKSnxk5xpRNSZqytaZsvhF4XVWdDfxW+xngxQAtffN84HfbDVxgyqYkTd1aUzYLWMjQP56DkQpnAFe14w4A3wHmTNmUpGFYU/YO8ErgyiS/Q/cXx79q418EnpPkvXQZPFvb892sImVTkjQZa/0i92XAq6rqVOBVwNvb+A66hj4P/D7wWbo4hlWlbBqtLEmTsdam/yK6zB2A/wmcA1BVd1bVq6rq7Kp6DnACcCOrTNk0WlmSJmOtTf8bwE+37fPoGjtJjk5yTNs+H7izqq4zZVOShmGtKZsvBt6c5Ajgh3SzcgBOprvWfzfwdeAFI6cyZVOSpmzFpl9VFy/z0tYl9t1Ht37uUueZB7aspjhJUr+8I1eSZohNX5JmyOAXUUnyPeCGadcxhhOBW6ddxBiss1/W2S/r7M+jquo+0x/XenPWerphqdVfhibJvHX2xzr7ZZ39OlzqXIqXdyRphtj0JWmGHA5Nf/u0CxiTdfbLOvtlnf06XOq8j8F/kStJ6s/h8ElfktQTm74kzZDBNv0kz0hyQ1te8bJ1es+lloZ8WJJPJLmxPT905LVtrb4bkjx9ZHzJpSGTbEzyvjb+uSSb11jnqUk+meT6JHuSXDrEWpMcleTzbenMPUleN8Q6R95jQ5JrkvzFUOtMsq+d/9ok8wOu84QkH0jy5fb/6ZOGVmeSn2z/HRcetyV55dDq7F1VDe4BbABuAh4NHEm3OMsZ6/C+TwUeD+weGXsjcFnbvgx4Q9s+o9W1ETit1buhvfZ54El06wh8HHhmG/9l4I/b9kXA+9ZY5ybg8W37IcD/bfUMqtZ2zmPb9oOAzwHnDq3OkXp/Ffgz4C8G/Hu/Dzhx0dgQ63wX8J/a9pF0MeuDq3Ok3g3APwCPGnKdfTym+uaH+A14EnDlyM/bgG3r9N6buXfTvwHY1LY30d0sdp+agCtb3ZuAL4+MXwy8dXSftn0E3R196aHmD9OtSTzYWoGjgS8ATxxinXRrPFxFFxW+0PSHWOc+7tv0B1Un3VKqX1183NDqXFTbBcBnhl5nH4+hXt45BfjayM/TXF7xx6tbD4D2fHIbX67GU1h+ach7jqmqO4HvAg+/P8W1fy4+ju5T9OBqbZdMrgUOAJ+oqkHWSbfS26/TLe25YIh1FvB/kuxMshBpPrQ6Hw3cAryjXS57W7p1NoZW56iLgPe07SHXeb8NtemvannFKV
muxkPV3uuvK8mxwAeBV1bVbYfadZn3nXitVXVXVZ1N90n6nCSHiteeSp1JLgQOVNXOcQ9Z5j3X4/f+yVX1eOCZwMuTPPUQ+06rziPoLpP+UVU9Drid7jLJcqb6ZynJkcCz6VYBPOSuy7znuv2Z78NQm/5+ugXVFxxyecUJ+2aSTQDt+UAbX67GQy0Nec8x6RagOR741lqKSvIguob/7qpaWLpykLUCVNV3gL8CnjHAOp8MPDvJPuC9wHlJ/nSAdVJV32jPB4AP0S1VOrQ69wP727/qAD5A95fA0Opc8EzgC1X1zfbzUOvsxVCb/t8Bpyc5rf0tfBHwkSnV8hG6NYFpzx8eGb+ofTt/GnA68Pk69NKQo+f6BeAvq13sW4123rcD11fVm4Zaa5KTkpzQth8M/Czw5aHVWVXbquoRVbWZ7v+1v6yq/zC0OpMck+QhC9t016F3D63OqvoH4GtJFhZUehpw3dDqHHExBy/tLD73kOrsxzS/UDjUA3gW3ayUm4DfWKf3fA9wM/Ajur+hf5Hu+ttVdOsAXwU8bGT/32j13UD7tr6Nz9H9YbwJeAsH73w+iu6fkHvpvu1/9BrrfArdPxG/BFzbHs8aWq3AY4FrWp27gd9q44Oqc1HNP8PBL3IHVSfdtfIvtseehT8XQ6uznedsYL793v8v4KEDrfNo4B+B40fGBldnnw9jGCRphgz18o4kaQJs+pI0Q2z6kjRDbPqSNENs+pI0Q2z6kjRDbPqSNEP+P2L/EnZw4YjpAAAAAElFTkSuQmCC", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "movies['decade']=movies['year'].apply(lambda x: int(x/10.0)*10)\n", + "d=movies.decade.value_counts().sort_index()\n", + "d.plot(kind='barh')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q3: \n", + "\n", + "(A) What are the top 10 most common character names in movie history?\n", + "\n", + "(B) Who are the top 10 people most often credited as \"Herself\" in movie history?\n", + "\n", + "(C) Who are the top 10 people most often credited as \"Himself\" in movie history?" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "character\n", + "Himself 20746\n", + "Dancer 12477\n", + "Extra 11948\n", + "Reporter 8434\n", + "Student 7773\n", + "Doctor 7669\n", + "Party Guest 7245\n", + "Policeman 7029\n", + "Nurse 6999\n", + "Bartender 6802\n", + "dtype: int64" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cast.groupby('character').size().sort_values(ascending=False)[:10]" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "name\n", + "Queen Elizabeth II 12\n", + "Mar?a Luisa (V) Mart?n 9\n", + "Luisa Horga 9\n", + "Joyce Brothers 9\n", + "Margaret Thatcher 8\n", + "Hillary Clinton 8\n", + "Oprah Winfrey 6\n", + "Mar?a Isabel (III) Mart?n 6\n", + "Sumie Sakai 6\n", + "Joan Rivers 6\n", + "dtype: int64" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cast[(cast['character']=='Herself') & (cast[\"type\"]=='actress')].groupby('name').size().sort_values(ascending=False)[:10]" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "name\n", + "Adolf Hitler 
99\n", + "Richard Nixon 44\n", + "Ronald Reagan 41\n", + "John F. Kennedy 37\n", + "George W. Bush 25\n", + "Winston Churchill 24\n", + "Martin Luther King 23\n", + "Bill Clinton 22\n", + "Ron Jeremy 22\n", + "Franklin D. Roosevelt 21\n", + "dtype: int64" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cast[(cast['character']=='Himself') & (cast[\"type\"]=='actor')].groupby('name').size().sort_values(ascending=False)[:10]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q4: \n", + "\n", + "(A) What are the top 10 most frequent roles that start with the word \"Zombie\"?\n", + "\n", + "(B) What are the top 10 most frequent roles that start with the word \"Police\"?\n", + "\n", + "- Hint: The `startswith()` function might be useful" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "character\n", + "Zombie 6264\n", + "Zombie Horde 206\n", + "Zombie - Protestor - Victim 78\n", + "Zombie Extra 70\n", + "Zombie Dancer 43\n", + "Zombie Girl 36\n", + "Zombie #1 36\n", + "Zombie #2 31\n", + "Zombie Vampire 25\n", + "Zombie Victim 22\n", + "dtype: int64" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cast[cast['character'].str.startswith('Zombie')].groupby('character').size().sort_values(ascending=False)[:10]" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "character\n", + "Policeman 7029\n", + "Police Officer 4808\n", + "Police Inspector 742\n", + "Police Sergeant 674\n", + "Police officer 539\n", + "Police 456\n", + "Policewoman 415\n", + "Police Chief 410\n", + "Police Captain 387\n", + "Police Commissioner 337\n", + "dtype: int64" + ] 
+ }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cast[cast['character'].str.startswith('Police')].groupby('character').size().sort_values(ascending=False)[:10]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q5: Plot how many roles 'Keanu Reeves' has played in each year of his career." + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAD5CAYAAADWfRn1AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAguUlEQVR4nO3de7Rd87338fcndyIRBE2lj9AiEiTYyVBOFSl6FNXW9fFoc+jRQ/Xy6M2leqHOMVrt0NL2NG1Rp6oJ6inqfqtL0SSklKBosBOVSFTiEkS+zx+/305WtrX3WmHNtefa+/MaY4+991xrzvlbjOSXOX/f72cqIjAzM+tOv54egJmZlZ8nCzMzq8mThZmZ1eTJwszMavJkYWZmNXmyMDOzmgYUdWBJ7wEuAt4FrASmRcQPJW0ITAfGAPOAQyPiBUkbAZcBk4ALI+KEimMdBpwK9Af+EBFfrXX+kSNHxpgxYxr6mczMervZs2c/HxEbd96uovosJI0CRkXEfZKGAbOBg4CpwJKIOEvSScAGEfE1SUOBHYHtgO06Jos8idwP7BwRiyT9CrgoIm7u7vxtbW0xa9asQj6bmVlvJWl2RLR13l7YlUVEPAs8m39eJmkusBnwUWCP/LZfAbcBX4uIl4E7Jb2v06G2BB6LiEX595uATwDdThYPzn+RMSf9oQGfpPXMO+sjPT0EM+tlmrJmIWkM6arhXmDTPJF0TCib1Nj9cWCspDGSBpCuTt5T3GjNzKyzwicLSesBlwNfjIila7t/RLwAHEda57iDtM6xootzHStplqRZb77y4tsftJmZraGw21AAkgaSJoqLI+J3efNzkkZFxLN5XWNhreNExFXAVfmYxwJvdvG+acA0gMGjtnLolVkv88Ybb9De3s7y5ct7eigtb8iQIYwePZqBAwfW9f4iq6EE/BKYGxE/qHjpSuBTwFn5++/rONYmEbFQ0gbA8cChtfbZfrP1meV792a9Snt7O8OGDWPMmDGkv2Ls7YgIFi9eTHt7O1tssUVd+xR5ZbEbcBTwoKQ5edsppElihqRjgKeBQzp2kDQPGA4MknQQsE9EPAz8UNKE/LbTI+KxAsdtZiW1fPlyTxQNIImNNtqIRYsW1X5zVuSaxVOkSqeB+euCiLgGCFLfBfl7wKoS2b8Dg0h9FqPzRAHpauRN0lrFVEkjCxy3mZWYJ4rGWNv/jkVOFiuAL0XEtsAuwGcljQNOAm6OiK1I5a8n5fcvB04Dvlx5kFwB9UNgz4jYAXgAOAEzsxKYOnUql112WdPON2fOHK655pqmna9DK/RZKH8NlbSYdJvq8Vrnd5+FWe/X6D/jzf6zExFEBP361f/v9jlz5
jBr1iz222+/uvdZsWIFAwa8s7/uS99nERFvkEpnHwQWAONIC+dmZk130UUXscMOOzBhwgSOOuooAG6//XZ23XVXttxyy1VXGS+99BJTpkxhp512Yvvtt+f3v0+1PPPmzWPbbbfl+OOPZ6edduKZZ57huOOOo62tjfHjx/PNb35z1blmzpzJrrvuyoQJE5g8eTIvvvgi3/jGN5g+fToTJ05k+vTpvPzyyxx99NFMmjSJHXfccdV5LrzwQg455BAOOOAA9tlnn3f8uQstnYW39lms7X2yXH57HGmyeRI4FzgZ+E6V9x4LHAvQf/hbok3MzN6Rhx56iDPPPJO77rqLkSNHsmTJEk488USeffZZ7rzzTh555BEOPPBADj74YIYMGcIVV1zB8OHDef7559lll1048MADAXj00Ue54IIL+MlPfgLAmWeeyYYbbsibb77JlClTeOCBBxg7diyHHXYY06dPZ9KkSSxdupR1112X008/nVmzZnHeeecBcMopp7DXXntx/vnn889//pPJkyfzoQ99CIC7776bBx54gA033PAdf/ZW6LOYCBART+RjzmD1OscaKvss2trawqWzZtZIt9xyCwcffDAjR6Yam46/hA866CD69evHuHHjeO6554B0i+mUU07h9ttvp1+/fsyfP3/Va5tvvjm77LLLquPOmDGDadOmsWLFCp599lkefvhhJDFq1CgmTZoEwPDhw6uO6YYbbuDKK6/k7LPPBlLF2NNPPw3A3nvv3ZCJAlqjz2I+ME7Sxjkfam9gbgFDNjPrVkRUrSIaPHjwGu8BuPjii1m0aBGzZ89m4MCBjBkzZlUz4dChQ1e9/+9//ztnn302M2fOZIMNNmDq1KksX768y3NVG9Pll1/ONttss8b2e++9d43zvFNFrll8jNRn8R+SXpXULmk/4L+Br0h6HfgK8FNIpbOSXgUuAI7N7x8HLCNVVj2TX/8q0Lj/AmZmdZoyZQozZsxg8eLFACxZsqTL97744otssskmDBw4kFtvvZWnnnqq6vuWLl3K0KFDWX/99Xnuuee49tprARg7diwLFixg5syZACxbtowVK1YwbNgwli1btmr/fffdl3PPPXfVJHX//fc35LN2VuRtqLtJseKVEeXzgM8A36uIKP8P4Guk0tm96RRRnq0KDpQ0G/hNgeM2M6tq/PjxnHrqqXzwgx+kf//+7Ljjjl2+98gjj+SAAw6gra2NiRMnMnbs2KrvmzBhAjvuuCPjx49nyy23ZLfddgNg0KBBTJ8+nc997nO8+uqrrLPOOtx0003sueeenHXWWUycOJGTTz6Z0047jS9+8YvssMMORARjxozh6quvbvhnL+x5Fm85kfR74Lz8tUfFmsVtEbFNxfumAm2dJouO17YCbgH+V9QY+OBRW8WoT53TwE9g1hp6c+n03Llz2XbbbXt6GL1Gtf+eXT3PovSls50cAUyvNVGYmVljlT6ivJPDgUu6OZcjys3MClDoZNFd6Wx+va6I8vzeCcCAiJjd1XsiYlpEtEVEW/9113+Hozczsw6tUDrb4Qi6uarozBHlZr1TvSWl1r21vZvfKhHlkJ5hUX8Yipn1OkOGDGHx4sVstNFGnjDegY7nWQwZMqTufYqcLDoiyt9FiiKfFhHXSNqQ7iPKJ5EiyldVQ0kaBNwEXClpJXBqRFxe4NjNrIRGjx5Ne3v7Wj2HwarreFJevYqcLDoiylf1WUi6EZhKiijv6LM4idV9FqeR+yw6HetUYGFEbC2pH9CY/nUzaykDBw6s+8lu1litEFEOcDQwNh9rJfB8rfP35YhyM+u7iuqzKX2fhaQR+cczJN0n6VJJmxY4XDMz66QV+iwGAKOBuyJiJ1KMyNldnMt9FmZmBWiFiPLFwCvAFfn3S4Fjqr3REeVmZsUo7Mqijj4LqKPPIkd7XMXqdY4pwMNd7mBmZg1X5JVFR0T5a5I+Q7pCOJYUUf5nSaeT4sfbYFXpbDswBHijU5/F5sDVeQJaA
exe4LjNzKyTItcsOiLKh5AWsV9hzYjyQcD3SBHlsDqi/DhST8boioa814DdI2KdiBgWEcUEtpuZWVWtUjq71vpy6Wxvjqg2s55R+tLZChdImiPpNLnP38ysqVqhdBbgyIjYHvhA/jqqi3O5dNbMrAAtEVEeEfPz92WkR6pO7uJ9jig3MytA6SPKJQ0ARkTE83ny2Z8UKtgtR5SbmTVO6SPKSem11+eJoj9povh5geM2M7NOirwN1RFRPjB/XRAR15AiybuLKB9EiigfHREPR8TLEbFzROwAPAFMiYg3Cxy3mZl1UuRk0RFRvi2wC/BZSeNIkeQ3R8RWwM35d1gdUf7lageT9HHgpQLHa2ZmXWiJPotcUXUiqQN8Rj3nd5+FmVnjtEqfxRnA90ld4GZm1mSl77OQNBF4X0RcUcd73WdhZlaAVogofz+wc66UGgBsIum2iNij8xsdUW5mVoxWiCj/aUS8OyLGAP8CPFZtojAzs+K0RES5pOuAUcA6wEhJ/V0+a2bWPK0SUX5oREwAtgFupaKRz8zMitcSpbMVC+MDSE17Uev8fbl0tq9z6bBZ47VK6SySricthi8DLitmpGZmVk3pS2c7RMS+pHWLwcBeXZzLpbNmZgVoiYjyDhGxnFRN9dEuXndEuZlZAVohonw9YFjuyxgA7AfcUev8jig3M2ucVogoXwxcKWkwKaL8FlL5rZmZNUnpI8pJC9oL83aAlyNiRYHjNjOzTlolovzsiBhLqqjaTdK/FjhuMzPrpPR9FhHxCqkRj4h4XdJ9wOha5+/LfRbuMzCzRmuZPot8nBHAAaQrEjMza5KW6bPIlVCXAD+KiCe7eI/7LMzMCtAKEeUdpgF/i4hzunqDI8rNzIpR+ojyfKzvAOsDX2zwMM3MrA6ljygHlgKnAq8Br0paCpwcEb8ocOxmZlah9BHlEdEOzAT2JE0ks4D5BY7bzMw6KX3pbF7XGB4Rd+ffLwIOAq7t7vwunTUza5xWKJ3djHR7qkN73mZmZk3SCqWzqrKt6sOPXDprZlaMVogob2fNju3RwIJqb3REuZlZMUofUZ77MZZJ2oV0G+uTwLm1zu+IcjOzxil9RHlOnj0OuBBYh7Sw3e3itpmZNVZpIsqznwFLSKWxx+SJAuC9+b3LgVciouqahZmZFaM0EeX5tcOB8cCHgZ9I6p+b9b4HTImI8cCmkqYUOG4zM+ukNH0WeftvI+I14O+SHgcmkyadxyJiUd7nJuAT1EiedZ+FmVnjlKnPYjPgmYrdOvopHgfGShqTk2cPAt7TjHGbmVlSpj6Lqv0UEfECaYF7OnAHKTKk6mNV3WdhZlaMMkWUt7PmFcOqfoqIuAq4Kh/zWODNaudzRLmZWTHKFFF+JXC4pMGStgC2Av6cj7VJ/r4BcDzgxFkzsyYqTUR5RDwk6R/AS6Ry2m9HRMcVxBWSds4/P0IqrzUzsyYpTUR5Lp19F7AesA1wdC6dHUC6yhidj3UDcEKB4zYzs05aoXR2Fmnxe6ikxaQO78drnb8vl872dS4dNmu80pfORsQbpGqoB0kL3uNIayFmZtYkpS+dzRVVx5Emm3cDDwAnd3Eul86amRWgTBHlXZXOTgSIiCdyJtQMYNdq53NEuZlZMcoUUX4l8BtJPyBdQXSUzm4KjJO0cY782BuYW+v8jig3M2uc0kSU59LZGcDDpA7tz+bS2QWSvg3cLukNUprt1ALHbWZmnZQtorzy95UV288nRX0MIV1x7FHUoM3M7K1KH1Gej3UqsDAitiZVQ/2xwHGbmVknrdBncTdwNDA2H2sl8Hyt87vPou9yn4VZ45W+z0LSiPz7GZLuk3SppE2bMW4zM0tK32dBuvoZDdwVETuRrjTO7uJc7rMwMytAK0SULyblSl2Rt18KHFPtfI4oNzMrRukjynMj3lWsXueYQiqvNTOzJmmViPLNgavzBLQC2L3AcZuZWSeljyjPx3oN2D0i1omIYRFxf4HjNjOzTlqldHatu
XTW+iqXDlsRSl86W/H7BZLmSDot344yM7MmaYXSWYAjI2J74AP566guzuXSWTOzArRCRDkRMT9/Xwb8hnR76i0cUW5mVozSR5TnZ3CPiIjn8+SzP3BTrfM7otzMrHFKH1EuaShwfZ4o+pMmip8XOG4zM+uk9BHlEfEysKhi++ACx2xmZlV0O1lI6iep6iNM69DIiPJDI2ICsB2wMflqxMzMmqPb21ARsVLS94H3r+2BG9lnUVFFNQAYxJpXI1W5z8L6KvdZWBHquQ11g6RPvJPehkb0WUi6nlQ5tQy47O2OxczM1l49k8WJpKTX1yQtlbRMUnf9EmtoUJ8FEbEvMIq0ZrFXF+dyn4WZWQFqVkNFxLC3e/AGRZRXjmW5pCtJt6xurDJWR5SbmRWgrmooSRtImixp946vOvZpSES5pPUqmvgGAPsBj9QzbjMza4yaVxaSPg18gfQv/Tmkyqa76eJWUIWGRJTnPovZkkaS+ix+ko9hZmZNUs+VxReAScBTEbEnaaF6UR37NSSiPCKeAz5OeqbFqxHxuYhYUf9HNDOzd6qeDu7lea0ASYMj4hFJ29TaqcGls/cArE1Blktn+y6Xjpo1Xj2TRbukEcD/A26U9AKdFp5r6a50VlJl6ew9ledlzYhyMzPrIfVUQ30s//gtSbcC6wPX1XuCzqWz3VwddFs6W+e5jiWti9B/+MZrs6uZmXWj3mqof5H0bxHxR9JaRF3/4m9URHm9HFFuZlaMeqqhvkmqWNoGuIAUCvhrUqpsd/s1JKJ8bT5MJUeUm5k1Tj1XFh8DDgReBoiIBUA9jXodEeV75cehzpG0H2mS2FvS34C98+9ExENAR0T5deSIcgBJ35XUDqwrqV3St9biM5qZ2TtUz2TxekQEef0g9z3UoyER5dl04AXgSeB3wLfrHIOZmTVAPZPFDEk/A0ZI+nfqf/hQIyPKf0pauN4qf324zs9nZmYNUE/p7GukCWIpad3iGxHxllymzhrVZyFpHjA8Iu4GkHQRcBBwbXfnd5+F9VXuM7Ei1HNlsSnwX6QO6puo4/nXnb3DiPLN8s+dt5uZWZPUnCwi4uukWz+/BKYCf5P0n5LeW88JGhBRXnf/hSPKzcyKUc9tKCIicsjfP0hrERsAl0m6MSK+2tV+DYoob88/d95ebZyOKDczK0DNKwtJn5c0G/gucBewfUQcB+wMfKKb/RoSUZ5vVS2TtEs+5icr9jEzsyao58piJPDxiHiqcmN+Pvf+3ex3FfARUkT5Hnnbz0mL29tK+jowC/i4pEHAl0nrI0uB54DP5Ijyw4ARpIXwV4GLqbG4bWZmjVXPmsU3Ok8UFa/N7WbX75KuPh6PiIkRMZG05vHliFgP+DxwR0QsAf49H+9dpFtRi4DrJW1EijHfNUedXwlckfs+zMysSepas3g7IuL2XAVVaRvg9vzzjcD1wGnAOFLPBRGxUNI/SREjATwWER3Pz7iJdOvr5lrnd+ms9VUunbUi1BUk2EB/JUWHABzC6gXtvwAflTQgr1fsnF97HBgraUx+pOpBrLkIbmZmTdDsyeJoUif3bFK+1Ot5+/mkqqdZwDnAn4AVEfECcBwp7uMO0pP2unxKnktnzcyKUdhtqGoi4hFgHwBJW5MWwMmPSf2/He+T9Cfgb/m1q0iL5R3Pq3izm+OvKp0dPGorr2uYmTVIUycLSZvkNYl+wNeB/87b1wUUES9L2pt0VfFwp302AI4HDq3nXI4oNzNrnMImC0mXkMpkR+Z48W8C60n6bH7L70jPx4AU+XG9pJXAfFK0eYcfSpqQfz49Ih4rasxmZlZdkWsWrwL9gUcjYnRE/JLUK7GYFE44ntXPxVhAXqcANga2AJA0DNiWtLbxOnCepHMKHLOZmVVR5GRxIW+NEv8FcFJEbA9cAXwlb+/os9ie9ECk70vqFxHLOno0cp/GU6QrEjMza6Ky91mseqyqpK1It6vuqOf87rPou9xnYNZ4Ze+zqHQEMN3d22Zmz
VfqPotO+x4OXNLdwd1nYWZWDBX5D/V8G+rqiNiuymtbA7+OiMlVXvsT8OmK8tkJwKURsXW9525ra4tZs2a97bGbmfVFkmZHRFvn7U29spC0Sf7+lj4LSUPzz2v0WWRHUOOqwszMilPYZCHpCeAJYLykdknHAF+S9CqprPb9wGX57ZsBCyQtJ1U7/bjiOINIzXiflPSIpC6foWFmZsUo8sri34BJwEMVfRZ7AR+OiMGk53p3lM7uA/wux5C/F/hqvvoAOBX4UURsQaqa+mOBYzYzsypaoXT2aGBsfm0l8Hw953fpbN/l0lmzxit16aykEfn1MyTdJ+lSSZs2dcRmZlb60tkBwGjgrojYCbgbOLurg7t01sysGGWPKF8MvEKKBgG4FDimm+M7otzMrACtEFF+FSm99hZgCvBw1YN34ohyM7PGaYWI8q8B/5PTZheRqqzMzKyJSh1Rnv0K2BRYCWwELC9wzGZmVkWpI8or9juyIqp8YYFjNjOzKlqhz+JtcZ9F3+U+C7PGK3WfRcV+F0iaI+k0SWrecM3MDMrfZwHpFtT2wAfyV+Xi9xrcZ2FmVoyWiCiv2D4VaIuIE2qd2xHlZmZrryUjyvNtqZF5+0Bgf9KtLDMza6Ii+yyeAMYA/Sr6LLaW9HnSJPU00HGFsBkwS9Jg4A3g03n7YFL/xcB8rNeBw4oas5mZVVfqiPKIeDkidga+BVwN/CMi3ixwzGZmVkXpS2clrQecCBwLzKj3/H25dNalo2bWaK1QOnsG8H1SoKCZmfWAUpfOSpoIvC8irnjrod7KpbNmZsUoe0T5B4GdJc3LY91E0m0RsUcXx3dEuZlZAcoeUf4w8NP8njGkno096jmXI8rNzBqnFSLKzcyshxV5ZVEZUb4dgKQJwOHAeqyOKF/K6ojyNlZHlD+V97kOGJXHeoek/i6fNTNrrlaIKD80IiYA25EmkkMKHLOZmVVR+j6LiFhaMdZBQF0L1325z8KsL3OfUTFaoc8CSdcDC4FlwGXNG66ZmUHJ+yw6doqIfUnrFoNJkSFVuc/CzKwYrRZR/ilgkiPKzcyK0aoR5etJGpW3DwD2Ax5p5pjNzKz8EeUjgTmShuTf/wocWdSYzcysulJHlJMWtT+Wtw8n9W7sXeCYzcysilKXzkbEn4Fb8/bXJd0HjK7n/H25dNalg2bWaC1ROgsgaQRwAHlSMTOz5mmJ0tm8uH0J8KOIeLKrg7t01sysGGWPKO8wDfhbRJxT4/iOKDczK0DZI8qR9B1gfVZXSNXFEeVmZo1T6ohySaOBU0m9FfdJAjgvIn5R1LjNzOytilyzqIwo7yidvQ1YDLzG6ohyWB1RvoLVEeVERDvwn6RI8/dFxERPFGZmzdcKEeVXAW+JBDEzs+YpdZ8FKaL8HoB8C6pufbnPwvo299lYEVqmz8LMzHpOS/RZ1Mt9FmZmxWiZiHJJL0XEevWe2xHlZmZrryUjyps5NjMz61phk0WOKH8CGC+pXdIxwJckvUoqq30/qx+RuhmwQNJyUv/FjyuO8ytJbwBDJS2T9K2ixmxmZtWVPaIcYFtg9zzWO4F7CxyzmZlVUerSWUnPAMMj4m4ASRcBBwHX1jq/S2f7LpeOmjVe2UtnNyNVSXVoz9vMzKyJyl46W60Tr8vyLZfOmpkVo+wR5S+w5pPxRpNypLo6viPKzcwK0AoR5csk7UJa2P4kcG4953JEuZlZ45Q6ojw7jhRKuA5pYbvm4raZmTVWWSLK5wP3kNYpNgP+d8Vx3gusBJYDr0SRLedmZlZVWSLKDwEG5+07A5+RNEbSRsD3gCkRMR7YVNKUAsdsZmZVlKXPIkgd2gNIt5teB5aSrioei4hFeZ+bgE+QezK64z6Lvst9FmaNV5Y+i8uAl4FngaeBsyNiCfA4MDZfZQwgNeQ5utzMrMnK0mcxGXgTeDfpkapfkrRlRLxAWuCeDtwBzKOb6HL3WZiZFaMUfRakBe3rIuINYKGku0hPynsyIq4iP
VoVSceSJpWujr+qz6KtrS1cOmtm1hiliCgn3XraS8lQYBfgkU77bAAcT1okNzOzJipLRPnPgD3z9iXAvIh4IL92RY4ufzb/vqSoMZuZWXVliSjfD7gnR5RvAIyrWNTeChidX7sBOKHAMZuZWRVlL51V/hoqaTEwnFQhVZNLZ836JpdOF6PUpbN5wfs44EFSgOA44JdNHbGZmZW7dFbSQNJksWN+7QHg5K4O7tJZM7NiNHWyiIhHImKfiNgZuIS0AA4VpbMRsRDoKJ2dmPd7ImdCzQB27eb40yKiLSLa+q+7fpEfxcysTylFRDmrS2d/DaxLKp09B3ietNi9cY782BuYW8+5HFFuZtY4ZYko/3H++a+kBe0LOkpnJX0buF3SG8BTwNSixmxmZtWVJaL8NeAVUhS5WHMSO58U9TGEVEa7R4FjNjOzKkodUZ5fOxVYGBFbk6qh/ljgmM3MrIqy91lAqqAam4+5krSOUZP7LKyvcp+BFaHUfRaSRuTXz5B0n6RLJW3azAGbmVnJ+yxIVz6jgbsiYifgbuDsrg7uPgszs2KoyEda59tQV0fEdlVe2xr4dURMlvRjUjbU/+TXzgeuAy4FXgKGRcRKSe8h9WOMr3Xutra2mDVrVgM/jZlZ7ydpdkS0dd5e6ojy3Ih3FasroKYADzdzzGZmVmyfxRPAGKBfRZ/F1pI+T5qknmZ1guzPgDmkclsBN1dElG8OXC1JpKfk7V7UmM3MrLpSR5Tn114Ddo+IdSJiWETcX+CYzcysilYonX1b+nLprEsnzazRSl06W7HfBZLmSDot344yM7MmKnvpLMCRubP7A/nrqK4O7tJZM7NilD2inIiYn78vA35Dmli6Or4jys3MClDqiPK8hjEiIp7PD0LaH7ipnnM5otzMrHEKu7LIEeV3A9tIapd0DHCEpMeAR0iPSa2MKF+PtKYxk9UR5YOB6yU9QCqtnQ/8vKgxm5lZdYV2cPckScuAR3t6HD1kJHUGLvZS/vz+/P78b9/mEbFx541NvQ3VZI9Wa1nvCyTN6qufHfz5/fn9+Yv4/M2uhjIzsxbkycLMzGrqzZPFtJ4eQA/qy58d/Pn9+fu2Qj5/r13gNjOzxunNVxZmZtYgvW6ykPRhSY9KelzSST09nmaSdL6khZL+2tNj6QmS3iPpVklzJT0k6Qs9PaZmkjRE0p8l/SV//m/39JiaTVJ/SfdLurqnx9JskuZJejDn6DX8yW+96jaUpP7AY8DeQDupwe+IiOgTD0yStDvpyYIXVXs6YW8naRQwKiLukzQMmA0c1If+/wsYGhEv5cSDO4EvRMQ9PTy0ppF0IikqaHhE7N/T42kmSfOAtogopMekt11ZTAYej4gnI+J14LfAR3t4TE0TEbcDS2q+sZeKiGcj4r788zJgLrBZz46qeSJ5Kf86MH/1nn8N1iBpNPAR4Bc9PZbeqLdNFpsBz1T83k4f+svCVsvPUtkRuLeHh9JU+TbMHGAhcGNE9KXPfw7wVWBlD4+jpwRwg6TZko5t9MF722RR7VkXfeZfVpZIWg+4HPhiRLyjh2i1moh4MyImAqOByZL6xO1ISfsDCyNidk+PpQftFhE7Af9KehREQx9B3dsmi3ZWP1AJ0h+YBT00FusB+V795cDFEfG7nh5PT4mIfwK3AR/u2ZE0zW7Agfm+/W9ZnWLdZ0TEgvx9IXAF3TzO4e3obZPFTGArSVtIGgQcDlzZw2OyJskLvL8E5kbED3p6PM0maWNJI/LP6wAfIiU893oRcXJEjI6IMaQ/97dExP/p4WE1jaShuagDSUOBfUgp3g3TqyaLiFgBnEB6tvdcYEZEPNSzo2qeLmLh+5LdSE9S3CuXD86RtF9PD6qJRgG35kj/maQ1iz5XQtpHbQrcKekvwJ+BP0TEdY08Qa8qnTUzs2L0qisLMzMrhicLMzOryZOFmZnV5MnCzMxq8mRhZmY1ebIwM7OaPFmYmVlNnizMzKym/w/EGU9ATSN/OgAAA
ABJRU5ErkJggg==", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "cast[cast['name']=='Keanu Reeves'].groupby(by=['year']).agg({'character':'count'}).plot(kind='barh')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q6: Plot the cast positions (n-values) of Keanu Reeves's roles through his career over the years.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "year n \n", + "1985 1.0 1\n", + "1986 2.0 1\n", + " 3.0 1\n", + " 12.0 1\n", + "1988 1.0 1\n", + " 2.0 1\n", + " 5.0 1\n", + " 6.0 1\n", + "1989 1.0 1\n", + " 8.0 1\n", + "1990 2.0 1\n", + " 6.0 1\n", + "1991 1.0 1\n", + " 2.0 2\n", + "1992 4.0 1\n", + "1993 1.0 1\n", + " 5.0 1\n", + " 13.0 1\n", + "1994 1.0 1\n", + "1995 1.0 2\n", + "1996 1.0 2\n", + "1997 1.0 1\n", + " 2.0 1\n", + "1999 1.0 1\n", + " 42.0 2\n", + "2000 1.0 1\n", + " 3.0 2\n", + "2001 1.0 2\n", + "2003 3.0 1\n", + " 33.0 1\n", + " 59.0 1\n", + "2005 1.0 1\n", + " 17.0 1\n", + " 21.0 1\n", + "2006 1.0 1\n", + " 4.0 1\n", + "2008 1.0 2\n", + "2009 16.0 1\n", + "2010 1.0 1\n", + "2013 1.0 1\n", + " 2.0 1\n", + "2014 1.0 1\n", + "2015 1.0 1\n", + "2016 1.0 1\n", + " 4.0 1\n", + " 8.0 2\n", + " 25.0 1\n", + "2017 1.0 1\n", + " 11.0 1\n", + "2018 1.0 1\n", + "dtype: int64" + ] + }, + "execution_count": 57, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cast[cast['name']=='Keanu Reeves'].groupby(by=['year','n']).size()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q7: Plot the number of \"Hamlet\" films made in each decade" + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 71, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXAAAAEWCAYAAAB/tMx4AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAYCUlEQVR4nO3df5RedWHn8ffHJDT8CJWESRoNGGQDlGYxyoAoWyvyQxRr4tGI7BYDBzbtUhZqW7up9li1Z3dzate1HotuTnHJdouCAZoc5VBCWqy6iBlCQH4awYiRSKYRBEHk12f/uDcwDDPMMzPPc3O/yed1zpz7PN/nx/3MTPKZ+3zvc58r20RERHlesbsDRETExKTAIyIKlQKPiChUCjwiolAp8IiIQqXAIyIKNbXJlR188MGeP39+k6uMiCjeLbfc8q+2+4aPN1rg8+fPZ2BgoMlVRkQUT9IPRxrPFEpERKFS4BERhUqBR0QUqtE58IiIyXr66afZtm0bTz755O6O0nXTp09n3rx5TJs2raP7p8Ajoijbtm1jxowZzJ8/H0m7O07X2Gbnzp1s27aNww47rKPHZAolIory5JNPMmvWrD2qvAEkMWvWrHG9skiBR0Rx9rTy3mW831cKPCJiHB555BEuueQSAB588EHe9773AbB582auvfba5+932WWXceGFF/Y0S+bAI2Lc5q/42qQev3XlGV1KMvksw42VbVeBX3DBBbzqVa9izZo1QFXgAwMDvPOd7+xqnpeTLfCIiHFYsWIF9913H4sWLWLp0qUsXLiQp556io997GNcccUVLFq0iCuuuOJFjxkcHOS9730vxx13HMcddxzf+ta3upIlBR4RMQ4rV67k8MMPZ/PmzXzqU58CYJ999uGTn/wkZ555Jps3b+bMM8980WMuvvhiPvShD7Fx40auuuoqzj///K5kyRRKRESP3XDDDdx1113PX3/00Ud57LHHmDFjxqSet6MCl/Qh4HzAwHeBc4H9gCuA+cBW4P22H55UmoiIPdBzzz3HTTfdxL777tvV5x1zCkXSq4GLgH7bC4EpwAeAFcAG2wuADfX1iIg92owZM3jsscc6Hgc47bTT+NznPvf89c2bN3clS6dz4FOBfSVNpdryfhBYDKyub18NLOlKooiIFps1axYnnngiCxcu5MMf/vDz4yeddBJ33XXXiDsxP/vZzzIwMMAxxxzD0UcfzRe+8IWuZBlzCsX2jyX9FfAA8AvgetvXS5pje3t9n+2SZnclUUTEOHTzLYmduvzyy18yNnPmTDZu3PiisXPOOQeAgw8++CWl3g2dTKEcRLW1fRjwKmB/Sb/T6QokLZc0IGlgcHBw4kkjIuJFOplCOQX4ge1B208DVwNvBh6SNBegXu4Y6cG2V9nut93f1/eSMwJFRMQEdVLgDwAnSNpP1YH6JwN3A+uAZfV9lgFrexMxIiJG0skc+M2S1gCbgGeAW4FVwAHAlZLOoyr5pb0MGhGxi+098gOtbI/r/h29D9z2nwN/Pmz4l1Rb4xERjZk+fTo7d+7c4z5SdtfngU+fPr3jx+RIzIgoyrx589i2bRt74psidp2Rp1Mp8IgoyrRp0zo+Y82eLh9mFRFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShOjmp8ZGSNg/5elTSH0iaKWm9pC318qAmAkdERGXMArd9r+1FthcBxwJPANcAK4ANthcAG+rrERHRkPFOoZwM3Gf7h8BiYHU9vhpY0sVcERExhvEW+AeAL9WX59jeDlAvZ3czWEREvLyOC1zSPsC7ga+MZwWSlksakDSwJ57DLiJidxnPFvg7gE22H6qvPyRpLkC93DHSg2yvst1vu7+vr29yaSMi4nnjKfCzeGH6BGAdsKy+vAxY261QERExto4KXNJ+wKnA1UOGVwKnStpS37ay+/EiImI0Uzu5k+0
ngFnDxnZSvSslIiJ2gxyJGRFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFKrTM/K8UtIaSfdIulvSmyTNlLRe0pZ6eVCvw0ZExAs63QL/a+A620cBrwPuBlYAG2wvADbU1yMioiFjFrikA4G3AJcC2H7K9iPAYmB1fbfVwJLeRIyIiJF0sgX+WmAQ+N+SbpX0t5L2B+bY3g5QL2f3MGdERAzTSYFPBd4AfN7264HHGcd0iaTlkgYkDQwODk4wZkREDNdJgW8Dttm+ub6+hqrQH5I0F6Be7hjpwbZX2e633d/X19eNzBERQbV1/bJs/0TSjyQdafte4GTgrvprGbCyXq7tadKIiBaav+Jrk3r81pVnTPixYxZ47T8Dfy9pH+B+4FyqrfcrJZ0HPAAsnXCKiIgYt44K3PZmoH+Em07uapqIiOhYjsSMiChUCjwiolCdzoFH7DaT3UkEk9tRFNFW2QKPiChUCjwiolAp8IiIQqXAIyIKlZ2YEVGk7NzOFnhERLFS4BERhUqBR0QUKgUeEVGoFHhERKFS4BERhUqBR0QUKgUeEVGojg7kkbQVeAx4FnjGdr+kmcAVwHxgK/B+2w/3JmZERAw3ni3wk2wvsr3rzDwrgA22FwAbGMeZ6iMiYvImM4WyGFhdX14NLJl0moiI6FinBW7gekm3SFpej82xvR2gXs7uRcCIiBhZpx9mdaLtByXNBtZLuqfTFdSFvxzg0EMPnUDEiIgYSUdb4LYfrJc7gGuA44GHJM0FqJc7RnnsKtv9tvv7+vq6kzoiIsYucEn7S5qx6zJwGnAHsA5YVt9tGbC2VyEjIuKlOplCmQNcI2nX/S+3fZ2kjcCVks4DHgCW9i5mREQMN2aB274feN0I4zuBk3sRKiIixpYjMSMiCpUCj4goVAo8IqJQKfCIiEKlwCMiCpUCj4goVAo8IqJQKfCIiEKlwCMiCpUCj4goVAo8IqJQKfCIiEKlwCMiCpUCj4goVAo8IqJQKfCIiEJ1XOCSpki6VdJX6+szJa2XtKVeHtS7mBERMdx4tsAvBu4ecn0FsMH2AmBDfT0iIhrSUYFLmgecAfztkOHFwOr68mpgSVeTRUTEy+p0C/wzwJ8Azw0Zm2N7O0C9nD3SAyUtlzQgaWBwcHAyWSMiYogxC1zSu4Adtm+ZyApsr7Ldb7u/r69vIk8REREjGPOs9MCJwLslvROYDhwo6f8CD0maa3u7pLnAjl4GjYiIFxtzC9z2n9qeZ3s+8AHgn2z/DrAOWFbfbRmwtmcpIyLiJSbzPvCVwKmStgCn1tcjIqIhnUyhPM/2jcCN9eWdwMndjxQREZ3IkZgREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFCoFHhFRqBR4REShOjmp8XRJ35F0m6Q7JX2iHp8pab2kLfXyoN7HjYiIXTrZAv8l8DbbrwMWAadLOgFYAWywvQDYUF+PiIiGdHJSY9v+eX11Wv1lYDGwuh5fDSzpRcCIiBhZR3PgkqZI2gzsANbbvhmYY3s7QL2cPcpjl0sakDQwODjYpdgREdFRgdt+1vYiYB5wvKSFna7A9irb/bb7+/r6JhgzIiKGG9e7UGw/QnVW+tOBhyTNBaiXO7odLiIiRtfJu1D6JL2yvrwvcApwD7AOWFbfbRmwtkcZIyJiBFM7uM9cYLWkKVSFf6Xtr0q6CbhS0nnAA8DSHuaMiIhhxixw27cDrx9hfCdwcjfDzF/xtUk/x9aVZ3QhSURE++VIzIiIQqXAIyIKlQKPiChUCjwiolAp8IiIQqXAIyIKlQKPiChUCjwiolAp8IiIQnVyKH3sBjkqNSLGki3wiIhCpcAjIgqVAo+IKFQKPCKiUCnwiIhCpcA
jIgrVySnVDpH0z5LulnSnpIvr8ZmS1kvaUi8P6n3ciIjYpZMt8GeAP7L968AJwO9LOhpYAWywvQDYUF+PiIiGjFngtrfb3lRffgy4G3g1sBhYXd9tNbCkRxkjImIE45oDlzSf6vyYNwNzbG+HquSB2aM8ZrmkAUkDg4ODk4wbERG7dFzgkg4ArgL+wPajnT7O9irb/bb7+/r6JpIxIiJG0FGBS5pGVd5/b/vqevghSXPr2+cCO3oTMSIiRjLmh1lJEnApcLftTw+5aR2wDFhZL9f2JGFES0z2A8by4WLRbZ18GuGJwNnAdyVtrsc+QlXcV0o6D3gAWNqThBERMaIxC9z2NwGNcvPJ3Y0TERGdypGYERGFSoFHRBQqBR4RUagUeEREoVLgERGFSoFHRBQqBR4RUagUeEREoVLgERGFSoFHRBQqBR4RUagUeEREoVLgERGFSoFHRBQqBR4RUagUeEREocYscElflLRD0h1DxmZKWi9pS708qLcxIyJiuE62wC8DTh82tgLYYHsBsKG+HhERDRqzwG3/C/DTYcOLgdX15dXAku7GioiIsUx0DnyO7e0A9XL2aHeUtFzSgKSBwcHBCa4uIiKG6/lOTNurbPfb7u/r6+v16iIi9hoTLfCHJM0FqJc7uhcpIiI6MXWCj1sHLANW1su1XUvUAvNXfG3Sz7F15RldSBIRMbpO3kb4JeAm4EhJ2ySdR1Xcp0raApxaX4+IiAaNuQVu+6xRbjq5y1kiImIcciRmREShUuAREYVKgUdEFCoFHhFRqBR4REShUuAREYVKgUdEFGqiR2LGXiJHpUa0V7bAIyIKlQKPiChUCjwiolAp8IiIQmUnZkRBslM5hsoWeEREoVLgERGFSoFHRBRqUgUu6XRJ90r6vqQV3QoVERFjm3CBS5oC/A3wDuBo4CxJR3crWEREvLzJbIEfD3zf9v22nwK+DCzuTqyIiBiLbE/sgdL7gNNtn19fPxt4o+0Lh91vObC8vnokcO/E4wJwMPCvk3yOyWpDBmhHjjZkgHbkaEMGaEeONmSAduToRobX2O4bPjiZ94FrhLGX/DWwvQpYNYn1vHil0oDt/m49X6kZ2pKjDRnakqMNGdqSow0Z2pKjlxkmM4WyDThkyPV5wIOTixMREZ2aTIFvBBZIOkzSPsAHgHXdiRUREWOZ8BSK7WckXQj8IzAF+KLtO7uWbHRdm46ZhDZkgHbkaEMGaEeONmSAduRoQwZoR46eZZjwTsyIiNi9ciRmREShUuAREYVKgUdEFCoFHhFRqJzQISL2CJLeDiwBXk11UOGDwFrb1+2pOVr/LpQ2/FLakKEtOdqQoS052pChLTl2dwZJnwGOAP4P1UGGUB1c+EFgi+2L98QcrS7wNvxS2pChLTnakKEtOdqQoS05WpLhe7aPGGFcwPdsL+h1ht2Sw3Zrv+pveKRxUf3D2CsytCVHGzK0JUcbMrQlR0sy3A4cP8L48cB3G/x9NJqj7XPgT0o63vZ3ho0fBzy5F2VoS442ZGhLjjZkaEuONmQ4B/i8pBm88CrgEODR+ramNJqj7VMobwA+D4z0w7jA9i17Q4a25GhDhrbkkHQscMnuzFDnaMPPYrdnGJLl16jm4QVss/2Tpta9O3K0usB3acMvpQ0Z2pKjDRnakqMNGdqSY3dnqOeZj+fFO1K/45aUnKSjbN/T1edsyfc2br34YYyxvmm2nx42drDt3fZh8ZIusH3Jblz/AVQ7r+63/UiD690HeHrXf0xJJwFvAO50c+96OMb27U2sayySDgUetf2IpPlAP3C3m/lwuaE5+qm2vJ+hmvtu8v/naVSviLYAP66H5wH/hupVwPVNZRmNpAdsH9rV5yy4wLv+wxhlPScBfwf8CnArsNz21vq2Tbbf0OsM9br+cIThjwD/DcD2pxvIcIntC+rL/w64HLiP6j/J79q+ttcZ6nXfBrzV9sOSPgy8B7gW+C1gwPafNpDhWeAHwJeAL9m+q9frHCX
HCuB3gV8CfwX8MfAt4ATg0ob+XfwW8D+AR4Bj6/UfBDwNnG37Rw1kuBt4x67/m0PGDwOutf3rvc5Qr++zo90ELLN9YDfX1+qdmGP8MF7ZUIy/BN5u+876NHLrJZ1t+9uMfFaiXvkEVUndOWS9U6jmHZtywpDLfwEssb1J0muBK+t8TZhi++H68pnAb9r+haSVwCag5wVO9W6Ds4GzgHWSHqcq8y8PL5EeO5vqpOL7AVuB19oelLQ/cDPQ8wIHPgOcVq/3MODTtk+UdCpwKXBaAxmm8sL8+1A/BqY1sP5dzgX+iOoP6nBndXtlrS5wGv5hjGKfXS9Fba+p/9JfXW/5NPny5Teo/jPuD3zC9hOSltn+RIMZhjrQ9iYA2/dLmtLguh+VtND2HVTnGpwO/ILq33NTHw/hev0fBT4q6Xiqk5p8Q9KPbL+5oRzP1n+8nqL6Geyswz1eTQk3YortwfryA8Br6gzr6/eIN+GLwEZJXwZ2bfEfQvU7ubShDFCd6OYO2/9v+A2SPt7tlbV6CkXSPwF/NsoP4we2D2sgwwDwrqE7ZCTNA74KHG67yS1gJC0G/gT4n8Bf2n5tg+t+Avg+1SuA+cCh9TTGK4DbbS9sKMcxVNNat9VDJwJfB46h2vq7vIEMt9p+/QjjAt5i++u9zlCv7zJgH6o/7E9QzT9fB7wNmGH7/Q1k+CLVxswGYDHwY9t/KGk/YJPto3qdoc5xNPBuhuxIBdY1Ob0laSbwpO0nGllfywu80R/GKBlOAQZt3zZs/JXA79v+r7sh0/7Ax4E32n5Lg+t9zbCh7bafknQwVWld3WCWKVQvzY/ghZfP/9jUzlRJ/76JPxQd5JgKLKUq0DXAG6lenT4A/I3txxvIMA34j1RTObdRnZ3rWUn7ArNt/7DXGfZWrS7wiIhOSPpVqn0fS4C+engHsBZY2eAf9kZztPrjZCUdIOmTku6U9DNJg5K+LemcvSlDW3K0IUNbcrQhwxg5lu2GDHfsxp/FlcDDVO9OmmV7FnAS1TtjvtKCHA/3Ikert8AlrQWuAW4A3k81z/dl4M+o5tk+sjdkaEuONmRoS442ZGhLjpZkuNf2keO9rfgcbuhDXibyBdw27PrGevkK4J69JUNbcrQhQ1tytCFDW3K0JMP1VDv35wwZmwP8F+CGBn8fjeZo9RQK8LiqA0aQ9NvATwFsP0dz78FuQ4a25GhDhrbkaEOGtuRoQ4YzgVnA1yU9LOmnwI3ATKpXBU1pNkdTf5km+NfsGOA7VPNY3wSOqMf7gIv2lgxtydGGDG3J0YYMbcnRhgz1+o4CTgEOGDZ+elMZms7R2DfVgx/SucnQnhxtyNCWHG3I0JYcTWUALgLuBf6B6ojUxUNu29Tg99tojlbvxHw5auizUNqeoS052pChLTnakKEtOZrKIOm7wJts/1zVB3qtAf7O9l9rlIOu9oQcrT6UXtJon/Ymqh0De0WGtuRoQ4a25GhDhrbkaEMGqsP5fw5ge6uktwJrVB181uQ+iUZztLrAqX75b6d6D+VQAl5yeP0enKEtOdqQoS052pChLTnakOEnkhbZ3gxQbwG/i+ozUv5tQxkaz9H2Av8q1Y6AzcNvkHTjXpShLTnakKEtOdqQoS052pDhg1SfA/M8288AH5T0vxrK0HiOYufAIyL2dm1/H3hERIwiBR4RUagUeOxxJH1c0h/36LnPkfS5Xjx3xHilwCMiCpUCjz2CpI9KulfSDcCR9djhkq6TdIukb0g6qh6fI+kaSbfVX2+ux/+hvu+dkpYPee5zJX1P0tepzv6za7xP0lWSNtZfJxLRoLwLJYon6VjgMqqz0UylOrHxF4B3AL9ne4ukNwL/3fbbJF0B3GT7M6rO7HOA7Z9Jmmn7p6rOJLOR6iz3+1CdHPhY4GfAPwO32r5Q0uXAJba/KelQqjMCNXL28who//vAIzrxm8A1rk+9J2kd1YmO3wx8RS+c3PdX6uXbqN6vi+1nqYoZ4CJJ76kvHwI
sAH4NuNH1SXvr8j+ivs8pwNFDnv9ASTNsP9b17zBiBCnw2FMMfyn5CuAR24s6eXB9yPMpVJ9j8UR9AMr0UZ576DreZPsX4w0b0Q2ZA489wb8A75G0r6QZwG9TnaH9B5KWQnW2eEmvq++/AfhP9fgUSQcCvwo8XJf3UcAJ9X1vBt4qaZaqk/cuHbLe64ELd12RtKhn32HECFLgUTzbm4ArgM3AVcA36pv+A3CepNuAO4HF9fjFwEn1J8fdAvwGcB0wtf5gpr8Avl0/93bg48BNVKcM2zRk1RcB/ZJul3QX8Hs9+hYjRpSdmBERhcoWeEREoVLgERGFSoFHRBQqBR4RUagUeEREoVLgERGFSoFHRBQqBR4RUaj/D7mTiaUcT+qcAAAAAElFTkSuQmCC", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "cast['decade']=cast['year'].apply(lambda x: int(x/10.0)*10)\n", + "cast[cast['title']=='Hamlet'].groupby('decade').agg('count').plot(kind='bar',y='title')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q8: \n", + "\n", + "(A) How many leading roles were available to both actors and actresses, in the 1960s (1960-1969)?\n", + "\n", + "(B) How many leading roles were available to both actors and actresses, in the 2000s (2000-2009)?\n", + "\n", + "- Hint: A specific value of n might indicate a leading role" + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "9908" + ] + }, + "execution_count": 86, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(cast[(cast['decade']==1960) & (cast['n']==1.0)].character.unique())" + ] + }, + { + "cell_type": "code", + "execution_count": 87, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "19154" + ] + }, + "execution_count": 87, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(cast[(cast['decade']==2000) & (cast['n']==1.0)].character.unique())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q9: List, in order by year, each of the films in which Frank Oz has played more than 1 role." + ] + }, + { + "cell_type": "code", + "execution_count": 111, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
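<div>\n", + "<style scoped>\n", + " .dataframe tbody tr th:only-of-type {\n", + " vertical-align: middle;\n", + " }\n", + "\n", + " .dataframe tbody tr th {\n", + " vertical-align: top;\n", + " }\n", + "\n", + " .dataframe thead th {\n", + " text-align: right;\n", + " }\n", + "</style>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n",
+ " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th></th>\n", + " <th>character</th>\n", + " </tr>\n",
+ " <tr>\n", + " <th>year</th>\n", + " <th>title</th>\n", + " <th></th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n",
+ " <tr>\n", + " <th>1979</th>\n", + " <th>The Muppet Movie</th>\n", + " <td>8</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th rowspan=\"2\" valign=\"top\">1981</th>\n", + " <th>An American Werewolf in London</th>\n", + " <td>2</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>The Great Muppet Caper</th>\n", + " <td>6</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>1982</th>\n", + " <th>The Dark Crystal</th>\n", + " <td>2</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>1984</th>\n", + " <th>The Muppets Take Manhattan</th>\n", + " <td>7</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>1985</th>\n", + " <th>Follow That Bird</th>\n", + " <td>3</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>1992</th>\n", + " <th>The Muppet Christmas Carol</th>\n", + " <td>7</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>1996</th>\n", + " <th>Muppet Treasure Island</th>\n", + " <td>4</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th rowspan=\"2\" valign=\"top\">1999</th>\n", + " <th>Muppets from Space</th>\n", + " <td>4</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>The Adventures of Elmo in Grouchland</th>\n", + " <td>3</td>\n", + " </tr>\n",
+ " </tbody>\n", + "</table>\n", + "</div>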
" + ], + "text/plain": [ + " character\n", + "year title \n", + "1979 The Muppet Movie 8\n", + "1981 An American Werewolf in London 2\n", + " The Great Muppet Caper 6\n", + "1982 The Dark Crystal 2\n", + "1984 The Muppets Take Manhattan 7\n", + "1985 Follow That Bird 3\n", + "1992 The Muppet Christmas Carol 7\n", + "1996 Muppet Treasure Island 4\n", + "1999 Muppets from Space 4\n", + " The Adventures of Elmo in Grouchland 3" + ] + }, + "execution_count": 111, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "temp1=cast[(cast['name']=='Frank Oz')].groupby(['year','title']).agg({'character':'count'})\n", + "temp1[temp1['character']>1]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section II - Q10: List each of the characters that Frank Oz has portrayed at least twice" + ] + }, + { + "cell_type": "code", + "execution_count": 117, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "character\n", + "Animal 6\n", + "Bert 3\n", + "Cookie Monster 5\n", + "Fozzie Bear 4\n", + "Grover 2\n", + "Miss Piggy 6\n", + "Sam the Eagle 5\n", + "Yoda 6\n", + "Name: title, dtype: int64" + ] + }, + "execution_count": 117, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "temp2=cast[(cast['name']=='Frank Oz')].groupby('character').count()\n", + "temp2[temp2['title']>1].title" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Section III - Advanced Merging, Querying and Visualizations" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Make a bar plot with the following conditions\n", + "- Frequency of the number of movies with \"Christmas\" in their title \n", + "- Movies should be such that they are released in the USA.\n", + "- Show the frequency plot by month" + ] + }, + { + "cell_type": "code", + "execution_count": 120, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " title 
year country date\n", + "1237 12 Dog Days of Christmas 2014 USA 2014-11-28\n", + "1238 12 Dogs of Christmas: Great Puppy Rescue 2012 USA 2012-10-09\n", + "2653 2016 Dancing Dolls a Christmas Story 2017 USA 2017-01-15\n", + "6183 A Bad Moms Christmas 2017 USA 2017-11-01\n", + "6286 A Belle for Christmas 2014 USA 2014-11-04\n", + "... ... ... ... ...\n", + "418628 The Shootin' It Christmas Spectacular 2013 USA 2013-12-20\n", + "432835 This Christmas 2007 USA 2007-11-21\n", + "463173 What She Wants for Christmas 2012 USA 2012-12-01\n", + "465022 White Christmas 1954 USA 1954-10-14\n", + "474305 You Can't Fight Christmas 2017 USA 2017-11-01\n", + "\n", + "[137 rows x 4 columns]\n" + ] + }, + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 120, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD7CAYAAABzGc+QAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAN/0lEQVR4nO3dbZCd5V3H8e+PQLU8yMOwSaNQ1unEWqpT6uwgMziWNoCpqQ2dFqcw1tWiGUc64PjQieWVrwyOj+PDi0yBRvuAYEsTYazQYOpUa2EDyMOEmooRO4RkS1stHacK/fvi3JkJmw17snvOnXPJ9zOzc9/3dc7Z6zebzW/vc51z76aqkCS156QTHUCStDwWuCQ1ygKXpEZZ4JLUKAtckhplgUtSo07uc7Jzzz23pqen+5xSkpq3Z8+er1bV1MLxXgt8enqaubm5PqeUpOYl+ffFxl1CkaRGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDWq1wt5JOn/m+kt96zo8fu3blz2Yz0Dl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNssAlqVEWuCQ1ygKXpEZZ4JLUKAtckhplgUtSoyxwSWrUUH+VPsl+4JvAi8ALVTWT5BzgL4FpYD/w01X19fHElCQtdDxn4G+tqouqaqY73gLsqqp1wK7uWJLUk5UsoWwCtnf724GrVpxGkjS0YQu8gHuT7EmyuRtbU1UHALrt6sUemGRzkrkkc/Pz8ytPLEkChlwDBy6tqmeSrAbuS/LksBNU1TZgG8DMzEwtI6MkaRFDnYFX1TPd9hBwF3AxcDDJWoBue2hcISVJR1uywJOcluSMw/vAlcDjwE5gtrvbLLBjXCElSUcbZgllDXBXksP3/3hVfSbJg8AdSa4DngauHl9MSdJCSxZ4VT0FvGmR8eeA9eMIJUlamldiSlKjLHBJapQFLkmNssAlqVEWuCQ1atgrMSVpokxvuWfFn2P/1o0jSHLieAYuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNssAlqVEWuCQ1ygKXpEZZ4JLUKAtckhplgUtSoy
xwSWqUBS5JjbLAJalRFrgkNcoCl6RGWeCS1KihCzzJqiQPJ7m7Oz4nyX1J9nXbs8cXU5K00PGcgd8I7D3ieAuwq6rWAbu6Y0lST4Yq8CTnARuBDx8xvAnY3u1vB64aaTJJ0ssa9gz8D4EPAt85YmxNVR0A6LarRxtNkvRylizwJO8ADlXVnuVMkGRzkrkkc/Pz88v5FJKkRQxzBn4p8M4k+4Hbgbcl+ShwMMlagG57aLEHV9W2qpqpqpmpqakRxZYkLVngVfWbVXVeVU0D7wXur6qfAXYCs93dZoEdY0spSTrKSt4HvhW4Isk+4IruWJLUk5OP585VtRvY3e0/B6wffSRJ0jC8ElOSGmWBS1KjLHBJapQFLkmNssAlqVEWuCQ1ygKXpEZZ4JLUKAtckhplgUtSoyxwSWqUBS5JjbLAJalRFrgkNcoCl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNWrLAk3x3kgeS/HOSJ5L8Vjd+TpL7kuzrtmePP64k6bBhzsC/Dbytqt4EXARsSHIJsAXYVVXrgF3dsSSpJ0sWeA083x2e0n0UsAnY3o1vB64aR0BJ0uKGWgNPsirJI8Ah4L6q+iKwpqoOAHTb1cd47OYkc0nm5ufnRxRbkjRUgVfVi1V1EXAecHGSHxp2gqraVlUzVTUzNTW1zJiSpIWO610oVfUNYDewATiYZC1Atz006nCSpGMb5l0oU0nO6vZfDVwOPAnsBGa7u80CO8aUUZK0iJOHuM9aYHuSVQwK/46qujvJF4A7klwHPA1cPcackqQFlizwqnoUePMi488B68cRSpK0NK/ElKRGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjLHBJapQFLkmNssAlqVEWuCQ1ygKXpEZZ4JLUKAtckhplgUtSoyxwSWqUBS5JjbLAJalRFrgkNcoCl6RGWeCS1CgLXJIatWSBJzk/yd8l2ZvkiSQ3duPnJLkvyb5ue/b440qSDhvmDPwF4Neq6g3AJcD1SS4EtgC7qmodsKs7liT1ZMkCr6oDVfVQt/9NYC/wfcAmYHt3t+3AVWPKKElaxHGtgSeZBt4MfBFYU1UHYFDywOpjPGZzkrkkc/Pz8yuMK0k6bOgCT3I68EngV6rqv4Z9XFVtq6qZqpqZmppaTkZJ0iKGKvAkpzAo749V1ae64YNJ1na3rwUOjSeiJGkxw7wLJcAtwN6q+v0jbtoJzHb7s8CO0ceTJB3LyUPc51LgfcBjSR7pxj4EbAXuSHId8DRw9VgSSpIWtWSBV9XngRzj5vWjjSNJGpZXYkpSoyxwSWqUBS5JjbLAJalRFrgkNcoCl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGjXMX+SRpJeY3nLPih6/f+vGESV5ZfMMXJIaZYFLUqMscElqlAUuSY2ywCWpURa4JDXKApekRlngktQoC1ySGmWBS1KjlizwJLcmOZTk8SPGzklyX5J93fbs8caUJC00zBn4R4ANC8a2ALuqah2wqzuWJPVoyQKvqr8HvrZgeBOwvdvfDlw12liSpKUsdw18TVUdAOi2q0cXSZI0jLG/iJlkc5K5JHPz8/Pjnk6SXjGWW+AHk6wF6LaHjnXHqtpWVTNVNTM1NbXM6SRJCy23wHcCs93+LLBjNHEkScMa5m2EnwC+ALw+yVeSXAdsBa5Isg+4ojuWJPVoyT+pVlXXHOOm9SPOIkk6Dl6JKUmNssAlqVEWuCQ1ygKXpEZZ4JLUKAtckhplgUtSoyxwSWqUBS5JjVrySsw+TW+5Z8WfY//WjSNIIkmTzzNwSWqUBS5JjbLAJalRFrgkNcoCl6RGWeCS1CgLXJIaZYFLUqMscElqlAUuSY2aqEvpJ4WX9GsxK/2+GM
X3hN+bOpJn4JLUKAtckhplgUtSo1wD18Rz3VdanGfgktQoC1ySGuUSyoSalGWDSckh6WgrOgNPsiHJl5J8OcmWUYWSJC1t2QWeZBXwp8DbgQuBa5JcOKpgkqSXt5Iz8IuBL1fVU1X1P8DtwKbRxJIkLSVVtbwHJu8BNlTVL3TH7wN+tKo+sOB+m4HN3eHrgS8tPy4A5wJfXeHnWKlJyACTkWMSMsBk5JiEDDAZOSYhA0xGjlFkuKCqphYOruRFzCwydtRPg6raBmxbwTwvnTSZq6qZUX2+VjNMSo5JyDApOSYhw6TkmIQMk5JjnBlWsoTyFeD8I47PA55ZWRxJ0rBWUuAPAuuSfH+SVwHvBXaOJpYkaSnLXkKpqheSfAD4W2AVcGtVPTGyZMc2suWYFZiEDDAZOSYhA0xGjknIAJORYxIywGTkGFuGZb+IKUk6sbyUXpIaZYFLUqMscElqlAU+hCQ/mGR9ktMXjG84gZn+/ETNfUSGH0vyq0mu7HHOG5Kcv/Q9x57jVUl+Nsnl3fG1Sf4kyfVJTukxx+uS/HqSP0rye0l+KcmZfc2vE6vZFzGT/HxV3dbDPDcA1wN7gYuAG6tqR3fbQ1X1Iz1kWPj2zABvBe4HqKp3jjtDl+OBqrq42/9FBl+Xu4Argb+uqq09ZPhP4FvAvwKfAO6sqvlxz7tIjo8xeBfXqcA3gNOBTwHrGfy/mu0hww3ATwGfA34SeAT4OvAu4Jerave4M+gEq6omP4Cne5rnMeD0bn8amGNQ4gAP95ThIeCjwGXAW7rtgW7/LT1+zR8+Yv9BYKrbPw14rK8MDJ45XgncAswDnwFmgTN6/Fo82m1PBg4Cq7rjHL6thwyPHTHvqcDubv+1fX1vdvOdCWwFngSe6z72dmNn9ZVjiYx/09M83wP8NvAXwLULbvuzUc830b8PPMmjx7oJWNNTjFVV9TxAVe1PchnwV0kuYPFfJzAOM8CNwE3Ab1TVI0n+u6o+19P8h52U5GwGBZrqznyr6ltJXugpQ1XVd4B7gXu75Yq3A9cAvwsc9fsixuSk7gK20xiU55nA14DvAnpbQmHwA+TFbt4zAKrq6T6XcYA7GDwbvKyqngVI8hoGP1TvBK7oI0SSYz0bDoNnz324DdgHfBJ4f5J3MyjybwOXjHqyiS5wBiX9EwyeFh4pwD/2lOHZJBdV1SMAVfV8kncAtwI/3EeArrD+IMmd3fYgJ+bf7kxgD4OvfyV5TVU927020NcPs5fMU1X/y+AK4J1JXt1TBhic/T/J4CK2m4A7kzzF4D/p7T1l+DDwYJJ/An4cuBkgyRSDHyZ9ma6qm48c6Ir85iTv7zHHgwyWkxb7Xjyrpwyvq6p3d/ufTnITcH+SsSxzTvQaeJJbgNuq6vOL3Pbxqrq2hwznAS8cPrNYcNulVfUP486wyLwbgUur6kN9z72YJKcCa6rq33qY6weq6l/GPc8wknwvQFU9k+Qs4HIGS3sP9JjhjcAbgMer6sm+5l2Q4V7gs8D2qjrYja0Bfg64oqou7ynH48C7qmrfIrf9R1WN/cXvJHuBN3YnXYfHZoEPMliKvWCk801ygUuafN2y2hYGfw9gdTd8kMEzo61VtfAZ9LhyvIfBazFH/crqJFdV1ad7yPA7wL1V9dkF4xuAP66qdSOdzwKXNC59vVushRzjyGCBSxqbJE9X1WvNMZ4Mk/4ipqQJNyHvFpuIHH1nsMAlrdQkvFtsUnL0msECl7RSdzN4h8UjC29IsvsVlqPXDK6BS1Kj/GVWktQoC1ySGmWBS1KjLHBJapQFLkmN+j/DmTN6id3RWQAAAABJRU5ErkJggg==", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "christmas = release_dates[(release_dates.title.str.contains('Christmas')) & (release_dates.country == 'USA')]\n", + "print(christmas)\n", + "christmas.date.dt.month.value_counts().sort_index().plot(kind='bar')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section III - Q1: Make a bar plot with the following conditions\n", + "- Frequency of the number of movies with \"Summer\" in their title \n", + "- Movies should be such that they are released in the USA.\n", + "- Show the frequency plot by month" + ] + }, + { + "cell_type": "code", + "execution_count": 122, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 122, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD7CAYAAABzGc+QAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAPkUlEQVR4nO3da5BkdXnH8e/PXTGsXFM74gXWUUtI1HjLeElIFOWSVYxo9IUYzXrLVBIVclGzhlRReZHKYkyMFXOpLVkwkWAJ4r1UUIOUCSKzuMLionghyyqwQ0g0giWiT150UzXbzE7Pdp8e9r98P1Vbc/qcM+d5emfm16f/fS6pKiRJ7XnQ/d2AJGk0BrgkNcoAl6RGGeCS1CgDXJIaZYBLUqNWr2SxtWvX1vT09EqWlKTmbd269faqmhqcv6IBPj09zdzc3EqWlKTmJfmvxeY7hCJJjTLAJalRBrgkNcoAl6RGGeCS1CgDXJIaZYBLUqMMcElq1IqeyCO1aHrjJ0f6vps2ndpxJ9Ke3AOXpEYZ4JLUKANckhplgEtSowxwSWqUAS5JjTLAJalRBrgkNWpogCfZkmR3ku0D89+c5OtJrk/yjsm1KElazHL2wM8H1i+ckeR5wGnAk6vqicA7u29NkrSUoQFeVVcAdwzM/n1gU1X9uL/O7gn0Jklawqhj4McCv57kqiRfSPKMva2YZDbJXJK5+fn5EctJkgaNGuCrgSOBZwNvBT6YJIutWFWbq2qmqmampqZGLCdJGjRqgO8CLqmeLwM/A9Z215YkaZhRA/wjwPMBkhwLHATc3lFPkqRlGHo98CQXAicAa5PsAs4GtgBb+ocW3g1sqKqaZKOSpD0NDfCqOn0vi17VcS+SpH3gmZiS1CgDXJIaZYBLUqMMcElqlAEuSY0ywCWpUQa4JDXKAJekRhngktQoA1ySGmWAS1KjDHBJapQBLkmNMsAlqVEGuCQ1amiAJ9mSZHf/5g2Dy96SpJJ4OzVJWmHL2QM/H1g/ODPJMcDJwM6Oe5IkLcPQAK+qK4A7Fln0LuBtgLdSk6T7wUhj4EleDHy3qr7acT+SpGUaek/MQUnW
AGcBpyxz/VlgFmDdunX7Wk6StBej7IE/DngM8NUkNwFHA9ckefhiK1fV5qqaqaqZqamp0TuVJO1hn/fAq+o64GH3Pu6H+ExV3d5hX5KkIZZzGOGFwJXAcUl2JXn95NuSJA0zdA+8qk4fsny6s24kScvmmZiS1CgDXJIaZYBLUqMMcElqlAEuSY0ywCWpUQa4JDXKAJekRhngktQoA1ySGmWAS1KjDHBJapQBLkmNMsAlqVEGuCQ1ajk3dNiSZHeS7Qvm/XWSG5Jcm+TDSY6YaJeSpPtYzh74+cD6gXmXAU+qqicD3wDe3nFfkqQhhgZ4VV0B3DEw79Kquqf/8Ev0bmwsSVpBXYyBvw74VAfbkSTtg7ECPMlZwD3ABUusM5tkLsnc/Pz8OOUkSQuMHOBJNgAvAn67qmpv61XV5qqaqaqZqampUctJkgYMvSv9YpKsB/4UeG5V3dVtS5Kk5VjOYYQXAlcCxyXZleT1wHuAQ4HLkmxL8s8T7lOSNGDoHnhVnb7I7HMn0IskaR94JqYkNcoAl6RGGeCS1CgDXJIaZYBLUqMMcElqlAEuSY0ywCWpUQa4JDXKAJekRhngktQoA1ySGmWAS1KjDHBJapQBLkmNMsAlqVHLuSPPliS7k2xfMO/nk1yW5Mb+1yMn26YkadBy9sDPB9YPzNsIfK6qHg98rv9YkrSChgZ4VV0B3DEw+zTgff3p9wEv6bYtSdIwo46BH1VVtwD0vz5sbysmmU0yl2Rufn5+xHKSpEET/xCzqjZX1UxVzUxNTU26nCQ9YIwa4LcleQRA/+vu7lqSJC3HqAH+MWBDf3oD8NFu2pEkLddyDiO8ELgSOC7JriSvBzYBJye5ETi5/1iStIJWD1uhqk7fy6ITO+5FkrQPPBNTkhplgEtSowxwSWqUAS5JjTLAJalRBrgkNWroYYTS/mZ64ydH+r6bNp3acSfS/cs9cElqlAEuSY0ywCWpUQa4JDXKAJekRhngktQoA1ySGmWAS1KjxgrwJH+U5Pok25NcmOTnumpMkrS0kQM8yaOAM4CZqnoSsAp4RVeNSZKWNu4Qymrg4CSrgTXA98ZvSZK0HCMHeFV9F3gnsBO4Bfh+VV3aVWOSpKWNM4RyJHAa8BjgkcBDk7xqkfVmk8wlmZufnx+9U0nSHsYZQjkJ+E5VzVfVT4BLgF8dXKmqNlfVTFXNTE1NjVFOkrTQOAG+E3h2kjVJQu8u9Tu6aUuSNMw4Y+BXARcD1wDX9be1uaO+JElDjHVDh6o6Gzi7o14kSfvAMzElqVEGuCQ1ygCXpEYZ4JLUKANckhplgEtSowxwSWqUAS5JjTLAJalRBrgkNcoAl6RGGeCS1CgDXJIaZYBLUqMMcElqlAEuSY0aK8CTHJHk4iQ3JNmR5Fe6akyStLSx7sgDvBv4dFW9PMlBwJoOepIkLcPIAZ7kMOA5wGsAqupu4O5u2pIkDTPOHvhjgXngvCRPAbYCZ1bVnQtXSjILzAKsW7dujHKSDgTTGz850vfdtOnUjjtp3zhj4KuBpwP/VFVPA+4ENg6uVFWbq2qmqmampqbGKCdJWmicAN8F7Kqqq/qPL6YX6JKkFTBygFfVrcDNSY7rzzoR+FonXUmShhr3KJQ3Axf0j0D5NvDa8VuSJC3HWAFeVduAmW5akSTtC8/ElKRGGeCS1CgDXJIaZYBLUqMMcElqlAEuSY0a9zhwSR3zWiFaLvfAJalRBrgkNcoAl6RGGeCS1CgDXJIaZYBLUqMMcElqlAEuSY0aO8CTrErylSSf6KIhSdLydLEHfiawo4PtSJL2wVgBnuRo4FTgvd20I0larnGvhfJ3wNuAQ/e2QpJZYBZg3bp1e92Q139o2yg/P392+4cD/W9vpZ/fStYbeQ88yYuA3VW1dan1qmpzVc1U1czU1NSo5SRJA8YZQjkeeHGSm4APAM9P8v5OupIkDTVygFfV26vq6KqaBl4BfL6qXtVZZ5KkJXkcuCQ1qpMbOlTV5cDl
XWxLkrQ87oFLUqMMcElqlAEuSY0ywCWpUQa4JDXKAJekRhngktSoTo4D1/7nQL9AkST3wCWpWQa4JDXKAJekRhngktQoA1ySGmWAS1KjDHBJatQ498Q8Jsm/J9mR5PokZ3bZmCRpaeOcyHMP8CdVdU2SQ4GtSS6rqq911JskaQnj3BPzlqq6pj/9f8AO4FFdNSZJWlonY+BJpoGnAVd1sT1J0nBjXwslySHAh4A/rKofLLJ8FpgFWLdu3bjlOrPS1wrx2iSSujbWHniSB9ML7wuq6pLF1qmqzVU1U1UzU1NT45STJC0wzlEoAc4FdlTV33bXkiRpOcbZAz8eeDXw/CTb+v9e2FFfkqQhRh4Dr6ovAumwF0nSPvBMTElqlAEuSY0ywCWpUQa4JDXKAJekRhngktQoA1ySGmWAS1KjDHBJapQBLkmNMsAlqVEGuCQ1ygCXpEYZ4JLUKANckhplgEtSo8a9J+b6JF9P8s0kG7tqSpI03Dj3xFwF/APwAuAJwOlJntBVY5KkpY2zB/5M4JtV9e2quhv4AHBaN21JkoZJVY32jcnLgfVV9Yb+41cDz6qqNw2sNwvM9h8eB3x9hHJrgdtHanQ01mu33oH83Kz3wK336KqaGpw58k2NWfyGxvd5NaiqzcDmMeqQZK6qZsbZhvUeGPUO5OdmPesNGmcIZRdwzILHRwPfG68dSdJyjRPgVwOPT/KYJAcBrwA+1k1bkqRhRh5Cqap7krwJ+AywCthSVdd31tmexhqCsd4Dqt6B/NysZ709jPwhpiTp/uWZmJLUKANckhplgEtSowxwIMkvJDkxySED89dPoNYzkzyjP/2EJH+c5IVd11mi/r+sYK1f6z+/Uya0/WclOaw/fXCSv0jy8STnJDl8AvXOSHLM8DU7q3dQkt9JclL/8SuTvCfJG5M8eEI1H5fkLUneneRvkvzeJP4v1Y2mPsRM8tqqOq/jbZ4BvBHYATwVOLOqPtpfdk1VPb3DWmfTu3bMauAy4FnA5cBJwGeq6i+7qtWvN3hYZ4DnAZ8HqKoXd1zvy1X1zP7079L7f/0wcArw8ara1HG964Gn9I+I2gzcBVwMnNif/1sd1/s+cCfwLeBC4KKqmu+yxkC9C+j9rqwB/hc4BLiE3vNLVW3ouN4ZwG8CXwBeCGwD/gd4KfAHVXV5l/XUgapq5h+wcwLbvA44pD89DczRC3GAr0yg1ip6f5A/AA7rzz8YuHYCz+0a4P3ACcBz+19v6U8/dwL1vrJg+mpgqj/9UOC6CdTbsfC5DizbNonnR+9d6ynAucA88GlgA3DoBOpd2/+6GrgNWNV/nAn9vly3oMYa4PL+9Lqu/xb62z0c2ATcAPx3/9+O/rwjuq43pJdPTWCbhwF/Bfwr8MqBZf/YRY1xTqWfiCTX7m0RcNQESq6qqh8CVNVNSU4ALk7yaBa/XMA47qmqnwJ3JflWVf2gX/dHSX7WcS2AGeBM4CzgrVW1LcmPquoLE6gF8KAkR9ILuVR/77Sq7kxyzwTqbV/wruyrSWaqai7JscBPJlCvqupnwKXApf1hjBcApwPvBO5zrYoxPah/ktxD6QXq4cAdwEOAiQyh0Hux+Gm/xqEAVbVzQkM2H6T3bvCEqroVIMnD6b0gXgSc3GWxJHt7Nx167767dh5wI/Ah4HVJXkYvyH8MPLuLAvtdgNML6d+g99ZtoQD/OYF6tyZ5alVtA6iqHyZ5EbAF+KWOa92dZE1V3QX88r0z+2OMnQd4P2zeleSi/tfbmOzP/HBgK72fVSV5eFXd2v9soesXQ4A3AO9O8uf0LhB0ZZKbgZv7y7q2x3Ooqp/QO/v4Y0kOnkC9c+ntna6i9yJ8UZJv0/vj/8AE6r0XuDrJl4DnAOcAJJmi98LRtemqOmfhjH6Qn5PkdROodzW94aHFfhePmEC9x1XVy/rTH0lyFvD5JJ0NXe53Y+BJzgXOq6ovLrLs36rqlR3XO5renvGtiyw7vqr+o8NaD+m/+g7O
Xws8oqqu66rWXuqfChxfVX82yTqL1F0DHFVV35nQ9g8FHkvvxWlXVd02oTrHVtU3JrHtJWo+EqCqvpfkCHqfl+ysqi9PqN4TgV8EtlfVDZOosaDWpcBngffd+zNLchTwGuDkqjqp43rbgZdW1Y2LLLu5qjr9gDrJDuCJ/R2pe+dtAN5Gb9j20WPX2N8CXNIDQ3+4bSO9+wg8rD/7NnrvajZV1eC78HHrvZzeZzH3uaR1kpdU1Uc6rvcO4NKq+uzA/PXA31fV48euYYBL2t9M4oizA7GeAS5pv5NkZ1Wts97S9scPMSU9AKz0EWcHYj0DXNL9ZaWPODvg6hngku4vn6B3NMa2wQVJLrfecI6BS1KjvJiVJDXKAJekRhngktQoA1ySGmWAS1Kj/h94rLRfuNt50gAAAABJRU5ErkJggg==", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "summer = release_dates[(release_dates.title.str.contains('Summer')) & (release_dates.country == 'USA')]\n", + "summer.date.dt.month.value_counts().sort_index().plot(kind='bar')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section III - Q2: Make a bar plot with the following conditions\n", + "- Frequency of the number of movies with \"Action\" in their title \n", + "- Movies should be such that they are released in the USA.\n", + "- Show the frequency plot by week" + ] + }, + { + "cell_type": "code", + "execution_count": 124, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 124, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAD7CAYAAABDld6xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAQFUlEQVR4nO3ce5AlZX3G8edhBwRcWC47AQ0MYxmEoIaLw1IVouhCUQsbCX+QKrFCghVrkgq3eInZFKYQyuhiJZQa1GTDRYUQohIkShBJBCwSYGEXXBYXKoi7SIEB4gWIVCL4yx/9Ltvbe2amz+7pOT9mvp+qrunLb95+z3v6PKenT59xRAgAkNdOw+4AAGB6BDUAJEdQA0ByBDUAJEdQA0ByBDUAJDfSRaOLFy+O8fHxLpoGgDlpzZo1z0TEaK9tnQT1+Pi47r333i6aBoA5yfamqbZx6QMAkiOoASA5ghoAkiOoASA5ghoAkmt114ftjZKek/SSpBcjYqLLTgEAtujn9rx3RMQznfUEANATlz4AILm2Z9Qh6Zu2Q9LfRsSqZoHtSUmTkjQ2Nja4HgJTGF9x4zbrNq5c3qpuqlogo7Zn1MdGxFGSTpJ0lu23NQsiYlVETETExOhoz29BAgC2Q6ugjognys+nJF0vaUmXnQIAbDFjUNt+te09Ns9LOlHS+q47BgCotLlGvZ+k621vrr8mIr7Raa8AAC+bMagj4lFJh89CXwAAPXB7HgAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHIENQAkR1ADQHKtg9r2Atv32f56lx0CAGytnzPq8yRt6KojAIDeWgW17QMkLZd0WbfdAQA0jbSs+6SkD0naY6oC25OSJiVpbGxshzuGuWd8xY3brNu4cvkQegIMx/a+BmY8o7b9m5Keiog109VFxKqImIiIidHR0Rl3DABop82lj2MlnWJ7o6RrJS21fXWnvQIAvGzGoI6IP4
uIAyJiXNK7JH0rIn6n854BACRxHzUApNf2w0RJUkTcJum2TnoCAOiJM2oASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkCGoASI6gBoDkZgxq27vaXm37O7YftH3hbHQMAFAZaVHzv5KWRsTztneWdIftmyLiro77BgBQi6COiJD0fFncuUzRZacAAFu0ukZte4Ht+yU9JemWiLi7014BAF7W5tKHIuIlSUfY3kvS9bbfFBHr6zW2JyVNStLY2JgkaXzFjdu0tXHl8h3rcXKvhMc86D5mb68Lr4Q+zhWMdZ93fUTETyTdJmlZj22rImIiIiZGR0cH0zsAQKu7PkbLmbRs7ybpBEkPddwvAEDR5tLHayR9wfYCVcH+pYj4erfdAgBs1uauj3WSjpyFvgAAeuCbiQCQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQHEENAMkR1ACQ3IxBbftA27fa3mD7QdvnzUbHAACVkRY1L0r6QESstb2HpDW2b4mI73bcNwCAWpxRR8STEbG2zD8naYOkX+66YwCASpsz6pfZHpd0pKS7e2yblDQpSWNjY4PoGySNr7hxm3UbVy6f8/sGhq3X8S8N5zXQ+sNE2wslXSfpjyPi2eb2iFgVERMRMTE6OjrIPgLAvNYqqG3vrCqk/z4i/qnbLgEA6trc9WFJl0vaEBGXdN8lAEBdmzPqYyWdIWmp7fvLdHLH/QIAFDN+mBgRd0jyLPQFANAD30wEgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIjqAGgOQIagBIbsagtn2F7adsr5+NDgEAttbmjPrzkpZ13A8AwBRmDOqI+LakH81CXwAAPXCNGgCSGxlUQ7YnJU1K0tjYWF+/O77ixm3WbVy5fLvrumizn323kb29+WpYz0umumx9bGtY++2qzbqBnVFHxKqImIiIidHR0UE1CwDzHpc+ACC5Nrfn/YOkOyUdYvtx27/ffbcAAJvNeI06Ik6fjY4AAHrj0gcAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJEdQA0ByBDUAJNcqqG0vs/2w7Udsr+i6UwCALWYMatsLJH1G0kmSDpN0uu3Duu4YAKDS5ox6iaRHIuLRiPg/SddK+q1uuwUA2MwRMX2BfZqkZRHx3rJ8hqRjIuLsRt2kpMmyeIikhxtNLZb0TIs+zZW6Ye47e90w9529bpj7zl43zH3PRt1BETHaszoipp0k/baky2rLZ0j665l+r0c7986nuldCHxmbfHWvhD4yNrP7mCOi1aWPxyUdWFs+QNITLX4PADAAbYL6HkkH236d7V0kvUvSP3fbLQDAZiMzFUTEi7bPlnSzpAWSroiIB7djX6vmWd0w9529bpj7zl43zH1nrxvmvof5mGf+MBEAMFx8MxEAkiOoASA5ghoAku
skqG0favt42wsb65f1qF1i++gyf5jt99s+uVFzjO09y/xuti+0/TXbF9teVKvbxfbv2j6hLL/b9qW2z7K9c6PN19v+oO1P2f4r239Ybwv9s/1LLev27bovr1Rtx7DPNufVeA96DDOM38CD2va5km6QdI6k9bbrXzf/WKP2AkmflvQ52x+XdKmkhZJW2D6/VnqFpJ+V+U9JWiTp4rLuylrdlZKWSzrP9lWqvqxzt6SjJV3W6OPfSNq1bNtN1b3id9p++3Y+9G3syBNse5HtlbYfsv3fZdpQ1u3Vso2bavN72v647atsv7tR99na/P62P2f7M7b3tf0R2w/Y/pLt19Tq9mlM+0pabXtv2/vU6lbaXlzmJ2w/Kulu25tsH1erW2v7w7Zf3+JxTdi+1fbVtg+0fYvtn9q+x/aRc3AMl9XmF9m+3PY629fY3q/Rj4GOt+2Fti+y/WAZ46dt32X7zEbdsMZ6oGPYdvz6HMMdHpvW34zp45tDD0haWObHJd0r6byyfF+P2gWSdpf0rKQ9y/rdJK2r1W2oza9ttHF/bX5d+Tki6b8kLSjLbrT3QG3b7pJuK/NjzT7O8Fhvqs2vlLS4zE9IelTSI5I2STquVrdQ0kWSHpT0U0lPS7pL0pmNtm+W9KeS9q+t27+su6W27qgpprdIerJWd13p46mq7oO/TtKrmmMq6Ruq3mRXSFpX9jdW1t1Qq/uFpO83pp+Xn4/Wx7o2f6uko8v8G1T7dlb5vb+U9Jik1ZLeJ+m1U4z7alX/JOx0ST+QdFpZf7ykO+fgGNbbvkzSRyUdVMboq83X1CDHW9VJ15mqvuj2fkl/LulgSV+Q9LEEYz3QMWw7fn2OYauxmTZr2oZSH+H13cbywnLgXqJaqJZt9/WaL8v1AP6ypPeU+SslTdQG755a3XpJu0jaW9JzkvYp63fV1mH/QO1J31vSmnobjX60PbDavkDaHvgPTzPGD9fmX5L0rbLP5vRCr/Esy+dL+ndJ+zYO4vpz8tg0z8kHy/P65vqB26OvD0kaKfN3NbbVx6zeh7dK+qykH5bHMTnNcdPsY33bXBnDtdP0obk80PGW9J1GG/eUnztJeijBWA90DNuOX59j2GpspptmLOh3Kk/CEY11I5K+KOmlxvq7Je2++YmvrV/UGIRFkj4v6Xvld36u6oz1dkmH1+reV9ZvknSupH+T9HeqgvmCWt15qs50VpUnZvObwKikbzf62PbAavsCaXvgf1PShyTtV1u3n6p34X+trVsv6eApnosf1OY31Me4rPs9VWf2m3r1T9JHZzhQD1D1JnqJpD1UO4Op1ZxTHstSSR+R9ElJb5N0oaSreh30tXULJC2TdGVj/Z2STlR1aWuTpFPL+uO09ZtixjFc11huM4aPq3pT/4Cq49vTtDfQ8Zb0H5J+o8y/U9LNtW31AB7KWA96DNuOX59j2Gpsppu6COoDVDvFb2w7trH8qinqFqv2Dllbv4ekw1Wdze43xe++VuXPD0l7STpN0pIedW8s2w6d4fG0PbDavkDaHvh7q7oO/5CkH0v6UTl4L1b5S6HUnSbpkCn6d2pt/hOSTuhRs0zSf9aWL1K5dNWo+xVJX5liP+9Udfnmh1Nsf7ukf5R0n6o3zX9R9Z8Wd67VXNvHMXa4qj8nb5J0qKrPLX6i6kX863NtDCVd0JhGy/r9JX2xy/EuY726jO8dkt5Q1o9KOneasf5xGetPdDnWjW2nDGIMJb2jx/j9QX38+hzDVsfhtG20fXHM16ntgVWWp3qBjNRqfq3NgV/WHSrphOaLXtW/nW3WHb8DdSftaHuqPld4U0f926qurPvVlm0u0ZbLUG9UdUZ1co/26nWHqTr7mu26N0v68I6019FjPqbtvhu/d9VMNaVumzeb7a0rx+GXZ3u/ffbxreU5ObFtu3yFfAfYfk9EXNlFnas7U85S9c57hKoPZG8o29ZGxFF91p0j6ewB1g2lf7U2/0jVGcp0bV6g6kPHEUm3qAqm21
W9+d0cEX8xRd0xkm4bQt0O9W/Ij7nXP2pbquqyoSLilCnqrOoMdlB1bfe7Q3V99nF1RCwp8+9V9Xr4qqrLd1+LiJU99rm1tonO1POd8bGu6tTy7pn5Vrcdbba5q2hO1A25j2slXa3qr8rjys8ny/xxtbr7Blw36P22aq/fx1Kbv0dbLru8Wo3PfaaaZvzvefOd7XVTbVL1gUAndapuH3xekiJio6v7u79i+6BSO1/r+ql9MSJekvQz29+LiGfL77xg+xdzsG6Y+55Q9SH9+ZL+JCLut/1CRNze6N9bBlw36P22ba+fNneyvbeqGwYcEU9LUkT8j+0Xe7S7rTZpPp8nVfdjH6Hqnsv6NC7piQ7rWt09M9/q+myz7V1Fc6Ju2Psu6zffgXGppvlLcq7UtamVtFHVnSbfLz/3L+sXqnGr4JT7aFM0nydJl6vcpdFj2zUd1rW6e2a+1fXZZqu7iuZK3bD33di+XLXvBcz1un5rS/3ukl7XppYPEwEgOf57HgAkR1ADQHIENQAkR1ADQHIENQAk9/9CeQGZf4MtVgAAAABJRU5ErkJggg==", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "action = release_dates[(release_dates.title.str.contains('Action')) & (release_dates.country == 'USA')]\n", + "summer.date.dt.isocalendar().week.value_counts().sort_index().plot(kind='bar')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section III - Q3: Show all the movies in which Keanu Reeves has played the lead role along with their release date in the USA sorted by the date of release\n", + "- Hint: You might need to join or merge two datasets!" + ] + }, + { + "cell_type": "code", + "execution_count": 129, + "metadata": {}, + "outputs": [], + "source": [ + "merged=cast.merge(release_dates,on='title')\n", + "merged['year']=merged['year_x']\n", + "merged.drop(columns=['year_x','year_y'],inplace=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 131, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
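<div>\n", + "<style scoped>\n", + " .dataframe tbody tr th:only-of-type {\n", + " vertical-align: middle;\n", + " }\n", + "\n", + " .dataframe tbody tr th {\n", + " vertical-align: top;\n", + " }\n", + "\n", + " .dataframe thead th {\n", + " text-align: right;\n", + " }\n", + "</style>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n",
+ " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th>title</th>\n", + " <th>name</th>\n", + " <th>type</th>\n", + " <th>character</th>\n", + " <th>n</th>\n", + " <th>decade</th>\n", + " <th>country</th>\n", + " <th>date</th>\n", + " <th>year</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n",
+ " <tr>\n", + " <th>11159047</th>\n", + " <td>Speed</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Jack Traven</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1922-10-22</td>\n", + " <td>1994</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>11159049</th>\n", + " <td>Speed</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Jack Traven</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1936-05-08</td>\n", + " <td>1994</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>8505570</th>\n", + " <td>Sweet November</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Nelson Moss</td>\n", + " <td>1.0</td>\n", + " <td>2000</td>\n", + " <td>USA</td>\n", + " <td>1968-02-08</td>\n", + " <td>2001</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>21030349</th>\n", + " <td>The Night Before</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Winston Connelly</td>\n", + " <td>1.0</td>\n", + " <td>1980</td>\n", + " <td>USA</td>\n", + " <td>1988-04-15</td>\n", + " <td>1988</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>11560862</th>\n", + " <td>Bill &amp; Ted's Excellent Adventure</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Ted</td>\n", + " <td>1.0</td>\n", + " <td>1980</td>\n", + " <td>USA</td>\n", + " <td>1989-02-17</td>\n", + " <td>1989</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>3957308</th>\n", + " <td>Bill &amp; Ted's Bogus Journey</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Ted</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1991-07-19</td>\n", + " <td>1991</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>13820799</th>\n", + " <td>Little Buddha</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Siddhartha</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1994-05-25</td>\n", + " <td>1993</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>11159052</th>\n", + " <td>Speed</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Jack Traven</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1994-06-10</td>\n", + " <td>1994</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>2635357</th>\n", + " <td>Johnny Mnemonic</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Johnny Mnemonic</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1995-05-26</td>\n", + " <td>1995</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>14111937</th>\n", + " <td>A Walk in the Clouds</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Paul Sutton</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1995-08-11</td>\n", + " <td>1995</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>17336298</th>\n", + " <td>Chain Reaction</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Eddie Kasalivich</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1996-08-02</td>\n", + " <td>1996</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>18731184</th>\n", + " <td>Feeling Minnesota</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Jjaks Clayton</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1996-09-13</td>\n", + " <td>1996</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>10765357</th>\n", + " <td>The Devil's Advocate</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Kevin Lomax</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1997-10-17</td>\n", + " <td>1997</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>16565628</th>\n", + " <td>The Matrix</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Neo</td>\n", + " <td>1.0</td>\n", + " <td>1990</td>\n", + " <td>USA</td>\n", + " <td>1999-03-31</td>\n", + " <td>1999</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>31232</th>\n", + " <td>The Replacements</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Shane Falco</td>\n", + " <td>1.0</td>\n", + " <td>2000</td>\n", + " <td>USA</td>\n", + " <td>2000-08-11</td>\n", + " <td>2000</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>8505574</th>\n", + " <td>Sweet November</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Nelson Moss</td>\n", + " <td>1.0</td>\n", + " <td>2000</td>\n", + " <td>USA</td>\n", + " <td>2001-02-16</td>\n", + " <td>2001</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>1227124</th>\n", + " <td>Hard Ball</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Conor O'Neill</td>\n", + " <td>1.0</td>\n", + " <td>2000</td>\n", + " <td>USA</td>\n", + " <td>2001-09-14</td>\n", + " <td>2001</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>10820942</th>\n", + " <td>Constantine</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>John Constantine</td>\n", + " <td>1.0</td>\n", + " <td>2000</td>\n", + " <td>USA</td>\n", + " <td>2005-02-18</td>\n", + " <td>2005</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>3242481</th>\n", + " <td>The Lake House</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Alex Wyler</td>\n", + " <td>1.0</td>\n", + " <td>2000</td>\n", + " <td>USA</td>\n", + " <td>2006-06-16</td>\n", + " <td>2006</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>14231455</th>\n", + " <td>Street Kings</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Detective Tom Ludlow</td>\n", + " <td>1.0</td>\n", + " <td>2000</td>\n", + " <td>USA</td>\n", + " <td>2008-04-11</td>\n", + " <td>2008</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>5336469</th>\n", + " <td>The Day the Earth Stood Still</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Klaatu</td>\n", + " <td>1.0</td>\n", + " <td>2000</td>\n", + " <td>USA</td>\n", + " <td>2008-12-12</td>\n", + " <td>2008</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>8851090</th>\n", + " <td>47 Ronin</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Kai</td>\n", + " <td>1.0</td>\n", + " <td>2010</td>\n", + " <td>USA</td>\n", + " <td>2013-12-25</td>\n", + " <td>2013</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>3949480</th>\n", + " <td>John Wick</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>John Wick</td>\n", + " <td>1.0</td>\n", + " <td>2010</td>\n", + " <td>USA</td>\n", + " <td>2014-10-24</td>\n", + " <td>2014</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>17635617</th>\n", + " <td>Knock Knock</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Evan</td>\n", + " <td>1.0</td>\n", + " <td>2010</td>\n", + " <td>USA</td>\n", + " <td>2015-10-09</td>\n", + " <td>2015</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>17527650</th>\n", + " <td>John Wick: Chapter 2</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>John Wick</td>\n", + " <td>1.0</td>\n", + " <td>2010</td>\n", + " <td>USA</td>\n", + " <td>2017-02-10</td>\n", + " <td>2017</td>\n", + " </tr>\n",
+ " <tr>\n", + " <th>17635631</th>\n", + " <td>Knock Knock</td>\n", + " <td>Keanu Reeves</td>\n", + " <td>actor</td>\n", + " <td>Evan</td>\n", + " <td>1.0</td>\n", + " <td>2010</td>\n", + " <td>USA</td>\n", + " <td>2017-10-06</td>\n", + " <td>2015</td>\n", + " </tr>\n",
+ " </tbody>\n", + "</table>\n", + "</div>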
" + ], + "text/plain": [ + " title name type \\\n", + "11159047 Speed Keanu Reeves actor \n", + "11159049 Speed Keanu Reeves actor \n", + "8505570 Sweet November Keanu Reeves actor \n", + "21030349 The Night Before Keanu Reeves actor \n", + "11560862 Bill & Ted's Excellent Adventure Keanu Reeves actor \n", + "3957308 Bill & Ted's Bogus Journey Keanu Reeves actor \n", + "13820799 Little Buddha Keanu Reeves actor \n", + "11159052 Speed Keanu Reeves actor \n", + "2635357 Johnny Mnemonic Keanu Reeves actor \n", + "14111937 A Walk in the Clouds Keanu Reeves actor \n", + "17336298 Chain Reaction Keanu Reeves actor \n", + "18731184 Feeling Minnesota Keanu Reeves actor \n", + "10765357 The Devil's Advocate Keanu Reeves actor \n", + "16565628 The Matrix Keanu Reeves actor \n", + "31232 The Replacements Keanu Reeves actor \n", + "8505574 Sweet November Keanu Reeves actor \n", + "1227124 Hard Ball Keanu Reeves actor \n", + "10820942 Constantine Keanu Reeves actor \n", + "3242481 The Lake House Keanu Reeves actor \n", + "14231455 Street Kings Keanu Reeves actor \n", + "5336469 The Day the Earth Stood Still Keanu Reeves actor \n", + "8851090 47 Ronin Keanu Reeves actor \n", + "3949480 John Wick Keanu Reeves actor \n", + "17635617 Knock Knock Keanu Reeves actor \n", + "17527650 John Wick: Chapter 2 Keanu Reeves actor \n", + "17635631 Knock Knock Keanu Reeves actor \n", + "\n", + " character n decade country date year \n", + "11159047 Jack Traven 1.0 1990 USA 1922-10-22 1994 \n", + "11159049 Jack Traven 1.0 1990 USA 1936-05-08 1994 \n", + "8505570 Nelson Moss 1.0 2000 USA 1968-02-08 2001 \n", + "21030349 Winston Connelly 1.0 1980 USA 1988-04-15 1988 \n", + "11560862 Ted 1.0 1980 USA 1989-02-17 1989 \n", + "3957308 Ted 1.0 1990 USA 1991-07-19 1991 \n", + "13820799 Siddhartha 1.0 1990 USA 1994-05-25 1993 \n", + "11159052 Jack Traven 1.0 1990 USA 1994-06-10 1994 \n", + "2635357 Johnny Mnemonic 1.0 1990 USA 1995-05-26 1995 \n", + "14111937 Paul Sutton 1.0 1990 USA 1995-08-11 1995 
\n", + "17336298 Eddie Kasalivich 1.0 1990 USA 1996-08-02 1996 \n", + "18731184 Jjaks Clayton 1.0 1990 USA 1996-09-13 1996 \n", + "10765357 Kevin Lomax 1.0 1990 USA 1997-10-17 1997 \n", + "16565628 Neo 1.0 1990 USA 1999-03-31 1999 \n", + "31232 Shane Falco 1.0 2000 USA 2000-08-11 2000 \n", + "8505574 Nelson Moss 1.0 2000 USA 2001-02-16 2001 \n", + "1227124 Conor O'Neill 1.0 2000 USA 2001-09-14 2001 \n", + "10820942 John Constantine 1.0 2000 USA 2005-02-18 2005 \n", + "3242481 Alex Wyler 1.0 2000 USA 2006-06-16 2006 \n", + "14231455 Detective Tom Ludlow 1.0 2000 USA 2008-04-11 2008 \n", + "5336469 Klaatu 1.0 2000 USA 2008-12-12 2008 \n", + "8851090 Kai 1.0 2010 USA 2013-12-25 2013 \n", + "3949480 John Wick 1.0 2010 USA 2014-10-24 2014 \n", + "17635617 Evan 1.0 2010 USA 2015-10-09 2015 \n", + "17527650 John Wick 1.0 2010 USA 2017-02-10 2017 \n", + "17635631 Evan 1.0 2010 USA 2017-10-06 2015 " + ] + }, + "execution_count": 131, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "merged[(merged['name']==\"Keanu Reeves\") & (merged['n']==1) & (merged['country']==\"USA\")].sort_values('date')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section III - Q4: Make a bar plot showing the months in which movies with Keanu Reeves tend to be released in the USA?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 132, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 132, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAD7CAYAAABDld6xAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAANAklEQVR4nO3de4yldX3H8feHXbEsVxOmWIVl1KgttvHSKdjSKBWkyFqtlT+QaFFjN0210KvZ1iakfzRdGnshvSUbkbSVSgpSq5IqGoqJvSC7sOXiYlXcAuXStRepYArot3+cs3YYZpnD7PPsftl9v5LNnjnPOef7m9nZ93nOc86ZSVUhSerrkP29AEnSkzPUktScoZak5gy1JDVnqCWpOUMtSc2tHeNGjz322Jqfnx/jpiXpgLRt27avVdXccttGCfX8/Dxbt24d46Yl6YCU5F/3tM1DH5LUnKGWpOYMtSQ1Z6glqTlDLUnNGWpJas5QS1JzhlqSmhvlDS86MM1vumZV19u5ecPAK5EOLu5RS1JzhlqSmjPUktScoZak5gy1JDVnqCWpOUMtSc0ZaklqzlBLUnOGWpKaM9SS1JyhlqTmDLUkNWeoJak5Qy1JzRlqSWrOUEtSc4Zakpoz1JLU3EyhTvKLSW5PcluSDyf5rrEXJkmaWDHUSZ4LXAAsVNX3A2uAc8demCRpYtZDH2uBw5KsBdYB9463JEnSYiuGuqr+DXg/cBdwH/D1qrp26eWSbEyyNcnWXbt2Db9SSTpIzXLo41nAG4HnAc8BDk/y1qWXq6otVbVQVQtzc3PDr1SSDlKzHPo4A/hqVe2qqkeBq4EfGXdZkqTdZgn1XcArk6xLEuB0YMe4y5Ik7TbLMeobgKuAm4Bbp9fZMvK6JElTa2e5UFVdBFw08lokScvwnYmS1JyhlqTmDLUkNWeoJak5Qy1JzRlqSWrOUEtSc4Zakpoz1JLUnKGWpOYMtSQ1Z6glqTlDLUnNGWpJas5QS1JzhlqSmjPUktTcTL/h5elqftM1q7rezs0bBl6JJK2ee9SS1JyhlqTmDLUkNWeoJak5Qy1JzRlqSWrOUEtSc4Zakpoz1JLUnKGWpOYMtSQ1Z6glqTlDLUnNGWpJas5QS1JzhlqSmjPUktScoZak5gy1JDU3U6iTHJPkqiR3JNmR5IfHXpgkaWLWX257CfDJqjonyaHAuhHXJElaZMVQJzkKeBXwdoCqegR4ZNxlSZJ2m2WP+vnALuCyJC8FtgEXVtVDiy+UZCOwEWD9+vVDr1PLmN90zaqut3PzhoFXIj2R35/DmeUY9VrgFcCfVtXLgYeATUsvVFVbqmqhqhbm5uYGXqYkHbxmCfU9wD1VdcP046uYhFuStA+sGOqquh+4O8mLp2edDnxh1FVJkr5j1ld9/Dxw+fQVH3cC7xhvSZKkxWYKdVVtBxbGXYokaTm+M1GSmjPUktScoZak5gy1JDVnqCWpOUMtSc0ZaklqzlBLUnOGWpKaM9SS1JyhlqTmDLUkNWeoJak5Qy1JzRlqSWrOUEtSc4Zakpqb9VdxSfvc/KZrVnW9nZs3PC3mSbNyj1qSmjPUktScoZak5gy1JDVnqCWpOUMtSc0ZaklqzlBLUnOGWpKaM9SS1JyhlqTmDLUkNWeoJak5Qy1JzRlqSWrOUEtSc4Zakpoz1JLUnKGWpOZmDnWSNUluTvKJMRckSXq8p7JHfSGwY6yFSJKWN1OokxwPbAA+MO5yJElLrZ
3xcn8AvBc4ck8XSLIR2Aiwfv36vV6YpGHNb7pmVdfbuXnDwCt5+tvXX8sV96iTvB7496ra9mSXq6otVbVQVQtzc3OrWowk6YlmOfRxKvCGJDuBK4DXJPnQqKuSJH3HiqGuql+rquOrah44F7iuqt46+sokSYCvo5ak9mZ9MhGAqroeuH6UlUiSluUetSQ1Z6glqTlDLUnNGWpJas5QS1JzhlqSmjPUktScoZak5gy1JDVnqCWpOUMtSc0ZaklqzlBLUnOGWpKaM9SS1JyhlqTmDLUkNfeUfsPLEPyV9dKE/xeGtZqv59Pla+ketSQ1Z6glqTlDLUnNGWpJas5QS1JzhlqSmjPUktScoZak5gy1JDVnqCWpOUMtSc0ZaklqzlBLUnOGWpKaM9SS1JyhlqTmDLUkNWeoJak5Qy1Jza0Y6iQnJPm7JDuS3J7kwn2xMEnSxCy/3PYx4Jer6qYkRwLbkny6qr4w8tokScywR11V91XVTdPT/wPsAJ479sIkSRNP6Rh1knng5cANy2zbmGRrkq27du0aaHmSpJlDneQI4CPAL1TVg0u3V9WWqlqoqoW5ubkh1yhJB7WZQp3kGUwifXlVXT3ukiRJi83yqo8AlwI7qur3xl+SJGmxWfaoTwXeBrwmyfbpn7NHXpckaWrFl+dV1eeA7IO1SJKW4TsTJak5Qy1JzRlqSWrOUEtSc4Zakpoz1JLUnKGWpOYMtSQ1Z6glqTlDLUnNGWpJas5QS1JzhlqSmjPUktScoZak5gy1JDVnqCWpuRV/w4tmN7/pmlVdb+fmDQOvRNKBxD1qSWrOUEtSc4Zakpoz1JLUnKGWpOYMtSQ1Z6glqTlDLUnNGWpJas5QS1JzhlqSmjPUktScoZak5gy1JDVnqCWpOUMtSc0ZaklqzlBLUnOGWpKamynUSc5K8sUkX06yaexFSZL+34qhTrIG+GPgdcBJwFuSnDT2wiRJE7PsUZ8MfLmq7qyqR4ArgDeOuyxJ0m6pqie/QHIOcFZVvWv68duAU6rqPUsutxHYOP3wxcAXV7GeY4GvreJ6q7EvZznPec47eOatdtaJVTW33Ia1M1w5y5z3hLpX1RZgy1Nc2OMHJVuramFvbqPjLOc5z3kHz7wxZs1y6OMe4IRFHx8P3DvkIiRJezZLqG8EXpjkeUkOBc4FPjbusiRJu6146KOqHkvyHuBTwBrgg1V1+0jr2atDJ41nOc95zjt45g0+a8UnEyVJ+5fvTJSk5gy1JDVnqCWpuYMm1Em+N8npSY5Ycv5ZI807OckPTU+flOSXkpw9xqw9zP/zfTjrR6ef35kj3f4pSY6anj4syW8m+XiSi5McPcK8C5KcsPIlB5l1aJKfTnLG9OPzkvxRkncnecZIM1+Q5FeSXJLkd5P87BhfRw2n5ZOJSd5RVZcNeHsXAO8GdgAvAy6sqr+Zbrupql4x1KzpbV7E5GejrAU+DZwCXA+cAXyqqn5r4HlLXy4Z4MeA6wCq6g0Dz/t8VZ08Pf0zTL62fw2cCXy8qjYPPO924KXTVyBtAR4GrgJOn57/UwPP+zrwEPAV4MPAlVW1a8gZi2ZdzuT7ZB3w38ARwNVMPrdU1fkDz7sA+Angs8DZwHbgv4A3AT9XVdcPOU8Dqap2f4C7Br69W4Ejpqfnga1MYg1w8wjrv5XJSxnXAQ8CR03PPwy4ZYR5NwEfAk4DXj39+77p6VePMO/mRadvBOampw8Hbh1h3o7Fn+uSbdvH+PyYPNo8E7gU2AV8EjgfOHLgWbdM/14LPACsmX6ckb5Xbl00Yx1w/fT0+jH+L0xv+2hgM3AH8B/TPzum5x0zxswnWcvfjnCbRwG/DfwFcN6SbX8yxIxZ3kI+iiS37GkTcNzA49ZU1TcAqmpnktOAq5KcyPJvkd9bj1XVt4CHk3ylqh6czv5mkm+PMG8BuBB4H/CrVbU9yTer6rMjzAI4JMmzmMQsNd3brKqHkjw2wrzbFj3K+uckC1W1NcmLgEdHmF
dV9W3gWuDa6SGI1wFvAd4PLPvzGFbpkOkbyQ5nEs6jgf8EngmMcuiDyZ3Ct6YzjgSoqrvGOtQC/BWTR3enVdX9AEmezeSO70rgtUMOS7KnR8hh8oh6aJcBXwI+ArwzyZuZBPt/gVcOMWC/hZpJjH+cycOuxQL8w8Cz7k/ysqraDlBV30jyeuCDwA8MPAvgkSTrquph4Ad3nzk9Djh4qKdR+f0kV07/foBx/22PBrYx+beqJM+uqvunx//HuON7F3BJkt9g8sNu/jHJ3cDd021De9znUFWPMnk37seSHDbwrEuZ7GmuYXJHe2WSO5n8B79i4FkAHwBuTPJPwKuAiwGSzDG5gxjDfFVdvPiMabAvTvLOEebdyOTQznLfi8eMMO8FVfXm6emPJnkfcF2SwQ457rdj1EkuBS6rqs8ts+0vq+q8AWcdz2Qv9/5ltp1aVX8/1KzpbT5zem+69Pxjge+pqluHnLfMnA3AqVX162POWWbuOuC4qvrqSLd/JPB8JndC91TVAyPNeVFV/csYt72Hec8BqKp7kxzD5LmMu6rq8yPNewnwfcBtVXXHGDOWzLsW+AzwZ7v/zZIcB7wdeG1VnTHwvNuAN1XVl5bZdndVDfpEcZIdwEumO0y7zzsfeC+TQ64n7vWM/RVqSQeH6WGyTUx+jv13T89+gMmjlM1VtfRR9d7OO4fJcyVP+FHLSX6yqj468LzfAa6tqs8sOf8s4A+r6oV7PcNQS9pfhn6F14E6z1BL2m+S3FVV65335Pbnk4mSDgL7+BVeB+Q8Qy1pbPvyFV4H5DxDLWlsn2Dy6oftSzckud55K/MYtSQ1d9D8UCZJeroy1JLUnKGWpOYMtSQ1Z6glqbn/AxQYuWkhaOyFAAAAAElFTkSuQmCC", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "merged[(merged['name']==\"Keanu Reeves\") & (merged['country']==\"USA\")].date.dt.month.value_counts().sort_index().plot(kind='bar')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Section III - Q5: Make a bar plot showing the years in which movies with Ian McKellen tend to be released in the USA?" + ] + }, + { + "cell_type": "code", + "execution_count": 133, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 133, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAEICAYAAABPgw/pAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAaTklEQVR4nO3de7xdZX3n8c+XJFRKoBFyFCbhEDpNa4GKhNMIg1MDo5VbBV8FG6cC0lebgtDS1pkWtYOtndp02qEUomQYZTRj8QpiCkHBCwr2FSDkBiFQUiaQSCzhlhBBMfCbP54ndLGyL2uf7J3DefJ9v17rlXX57d9+nr32+WXtZ6+1tiICMzMb//Ya6waYmVl/uKCbmRXCBd3MrBAu6GZmhXBBNzMrhAu6mVkhJo7VE0+dOjVmzJgxVk9vZjYu3XPPPU9ExFCrbWNW0GfMmMGyZcvG6unNzMYlSY+02+YhFzOzQrigm5kVwgXdzKwQLuhmZoVwQTczK0Tjgi5pgqQVkm5ssU2SrpC0TtJqSbP620wzM+umlyP0i4G1bbadDMzM0zzgql1sl5mZ9ahRQZc0HTgV+GSbkNOBRZEsBaZIOrhPbTQzswaaXlh0OfDHwH5ttk8DNlSWN+Z1m6pBkuaRjuAZHh7upZ1mZuPajEtu2mnd+vmn9vU5uh6hSzoNeDwi7ukU1mLdTj+FFBFXR8RIRIwMDbW8ctXMzEapyZDL8cA7Ja0HPg+cKOmztZiNwCGV5enAY31poZmZNdK1oEfEByNiekTMAOYC34qI99bCFgPn5LNdjgW2RMSmei4zMxucUd+cS9L5ABGxEFgCnAKsA54DzutL68zMrLGeCnpE3AbclucXVtYHcGE/G2ZmZr3xlaJmZoVwQTczK4QLuplZIVzQzcwK4YJuZlYIF3Qzs0K4oJuZFcIF3cysEC7oZmaFcEE3MyuEC7qZWSFc0M3MCuGCbmZWCBd0M7NCuKCbmRXCBd3MrBBNfiT6NZLukrRK0hpJf94iZo6kLZJW5unSwTTXzMzaafKLRT8GToyIbZImAXdIujkiltbibo+I0/rfRDMza6JrQc8/L7ctL07KUwyyUWZm1rtGY+iSJkhaCTwO3BoRd7YIOy4Py9ws6Yh+NtLMzLprVNAj4sWIeBMwHZgt6chayHLg0Ig4CrgSuKFVHknzJC2TtGzz5s2jb7WZme2kp7NcIuIZ4DbgpNr6rRGxLc8vASZJmtri8VdHxEhEjAwNDY260WZmtrMmZ7kMSZqS5/cB3gY8UIs5SJLy/Oyc98m+t9
bMzNpqcpbLwcBnJE0gFeovRsSNks4HiIiFwJnABZK2A88Dc/OXqWZmtps0OctlNXB0i/ULK/MLgAX9bZqZmfXCV4qamRXCBd3MrBAu6GZmhXBBNzMrhAu6mVkhXNDNzArhgm5mVggXdDOzQrigm5kVwgXdzKwQLuhmZoVwQTczK4QLuplZIVzQzcwK4YJuZlYIF3Qzs0K4oJuZFaLJb4q+RtJdklZJWiPpz1vESNIVktZJWi1p1mCaa2Zm7TT5TdEfAydGxDZJk4A7JN0cEUsrMScDM/P0ZuCq/K+Zme0mXY/QI9mWFyflqf4D0KcDi3LsUmCKpIP721QzM+uk0Ri6pAmSVgKPA7dGxJ21kGnAhsryxrzOzMx2kyZDLkTEi8CbJE0BviLpyIi4rxKiVg+rr5A0D5gHMDw83Htrra9mXHLTTuvWzz91DFrSH636A+O7T031si9L2+9j6dX2WvZ0lktEPAPcBpxU27QROKSyPB14rMXjr46IkYgYGRoa6q2lZmbWUZOzXIbykTmS9gHeBjxQC1sMnJPPdjkW2BIRm/rdWDMza6/JkMvBwGckTSD9B/DFiLhR0vkAEbEQWAKcAqwDngPOG1B7zcysja4FPSJWA0e3WL+wMh/Ahf1tmpmZ9cJXipqZFcIF3cysEC7oZmaFcEE3MyuEC7qZWSFc0M3MCuGCbmZWCBd0M7NCuKCbmRXCBd3MrBAu6GZmhXBBNzMrhAu6mVkhXNDNzArhgm5mVggXdDOzQrigm5kVoslvih4i6duS1kpaI+niFjFzJG2RtDJPlw6muWZm1k6T3xTdDnwgIpZL2g+4R9KtEXF/Le72iDit/000M7Mmuh6hR8SmiFie558F1gLTBt0wMzPrTU9j6JJmkH4w+s4Wm4+TtErSzZKO6EfjzMysuSZDLgBImgxcB/xBRGytbV4OHBoR2ySdAtwAzGyRYx4wD2B4eHi0bTYzsxYaHaFLmkQq5v8QEdfXt0fE1ojYlueXAJMkTW0Rd3VEjETEyNDQ0C423czMqpqc5SLgU8DaiLisTcxBOQ5Js3PeJ/vZUDMz66zJkMvxwNnAvZJW5nUfAoYBImIhcCZwgaTtwPPA3IiI/jfXzMza6VrQI+IOQF1iFgAL+tUoMzPrna8UNTMrhAu6mVkhXNDNzArhgm5mVggXdDOzQrigm5kVwgXdzKwQLuhmZoVwQTczK4QLuplZIVzQzcwK4YJuZlYIF3Qzs0K4oJuZFcIF3cysEC7oZmaFcEE3MytEk98UPUTStyWtlbRG0sUtYiTpCknrJK2WNGswzTUzs3aa/KboduADEbFc0n7APZJujYj7KzEnAzPz9GbgqvyvmZntJl2P0CNiU0Qsz/PPAmuBabWw04FFkSwFpkg6uO+tNTOztnoaQ5c0AzgauLO2aRqwobK8kZ2LvpmZDVCTIRcAJE0GrgP+ICK21je3eEi0yDEPmAcwPDzcQzOtFzMuuWmndevnn9rXfO1y9vu5B6VpO3vpzyD6Ppav5668Rr3Ejuf3x6tNoyN0SZNIxfwfIuL6FiEbgUMqy9OBx+pBEXF1RIxExMjQ0NBo2mtmZm00OctFwKeAtRFxWZuwxcA5+WyXY4EtEbGpj+00M7Mumgy5HA+cDdwraWVe9yFgGCAiFgJLgFOAdcBzwHl9b6mZmXXUtaBHxB20HiOvxgRwYb8aZWZmvfOVomZmhXBBNzMrhAu6mVkhXNDNzArhgm5mVggXdDOzQrigm5kVwgXdzKwQLuhmZoVwQTczK4QLuplZIVzQzcwK4YJuZlYIF3Qzs0K4oJuZFcIF3cysEC7oZmaFaPKbotdIelzSfW22z5G0RdLKPF3a/2aamVk3TX5T9NPAAmBRh5jbI+K0vrTIzMxGpesRekR8F3hqN7TFzMx2Qb/G0I+TtErSzZKO6FNOMzPrQZMhl26WA4dGxDZJpwA3ADNbBUqaB8wDGB4e7sNTm5nZDrt8hB4RWyNiW55fAkySNLVN7N
URMRIRI0NDQ7v61GZmVrHLBV3SQZKU52fnnE/ual4zM+tN1yEXSZ8D5gBTJW0EPgJMAoiIhcCZwAWStgPPA3MjIgbWYjMza6lrQY+I93TZvoB0WqOZmY0hXylqZlYIF3Qzs0K4oJuZFcIF3cysEC7oZmaFcEE3MyuEC7qZWSFc0M3MCuGCbmZWCBd0M7NCuKCbmRXCBd3MrBAu6GZmhXBBNzMrhAu6mVkhXNDNzArhgm5mVoiuBV3SNZIel3Rfm+2SdIWkdZJWS5rV/2aamVk3TY7QPw2c1GH7ycDMPM0Drtr1ZpmZWa+6FvSI+C7wVIeQ04FFkSwFpkg6uF8NNDOzZvoxhj4N2FBZ3pjXmZnZbjSxDznUYl20DJTmkYZlGB4efnn9jEtu2il2/fxTd1rXKq6X2F2JG085x4NB7MvxorT+DMLu+ltvFzuWduX90Y8j9I3AIZXl6cBjrQIj4uqIGImIkaGhoT48tZmZ7dCPgr4YOCef7XIssCUiNvUhr5mZ9aDrkIukzwFzgKmSNgIfASYBRMRCYAlwCrAOeA44b1CNNTOz9roW9Ih4T5ftAVzYtxaZmdmo+EpRM7NCuKCbmRXCBd3MrBAu6GZmhXBBNzMrhAu6mVkhXNDNzArhgm5mVggXdDOzQrigm5kVwgXdzKwQLuhmZoVwQTczK4QLuplZIVzQzcwK4YJuZlYIF3Qzs0I0KuiSTpL0oKR1ki5psX2OpC2SVubp0v431czMOmnym6ITgI8Dbwc2AndLWhwR99dCb4+I0wbQRjMza6DJEfpsYF1EPBwRLwCfB04fbLPMzKxXTQr6NGBDZXljXld3nKRVkm6WdESrRJLmSVomadnmzZtH0VwzM2unSUFXi3VRW14OHBoRRwFXAje0ShQRV0fESESMDA0N9dRQMzPrrElB3wgcUlmeDjxWDYiIrRGxLc8vASZJmtq3VpqZWVdNCvrdwExJh0naG5gLLK4GSDpIkvL87Jz3yX431szM2ut6lktEbJd0EfB1YAJwTUSskXR+3r4QOBO4QNJ24HlgbkTUh2XMzGyAuhZ0eHkYZUlt3cLK/AJgQX+bZmZmvfCVomZmhXBBNzMrhAu6mVkhXNDNzArhgm5mVggXdDOzQrigm5kVwgXdzKwQLuhmZoVwQTczK4QLuplZIVzQzcwK4YJuZlYIF3Qzs0K4oJuZFcIF3cysEC7oZmaFaFTQJZ0k6UFJ6yRd0mK7JF2Rt6+WNKv/TTUzs066FnRJE4CPAycDhwPvkXR4LexkYGae5gFX9bmdZmbWRZMj9NnAuoh4OCJeAD4PnF6LOR1YFMlSYIqkg/vcVjMz60AR0TlAOhM4KSJ+Oy+fDbw5Ii6qxNwIzI+IO/LyN4E/iYhltVzzSEfwAL8APFh7uqnAEw3b3jTWOcvJWVp/nNPvj9HEHhoRQy2jI6LjBJwFfLKyfDZwZS3mJuAtleVvAsd0y93iuZb1O9Y5y8lZWn+c0++PfsZGRKMhl43AIZXl6cBjo4gxM7MBalLQ7wZmSjpM0t7AXGBxLWYxcE4+2+VYYEtEbOpzW83MrIOJ3QIiYruki4CvAxOAayJijaTz8/aFwBLgFGAd8Bxw3ijbc/UAYp2znJyl9cc5y3nusc4JNPhS1MzMxgdfKWpmVggXdDOzQrigm5kVwgXdzKwQLuhmr0KS3iHpKkmLJX01z5/Uw+MvrS1L0rslnZXn/1O+od77Je1ViZtae9x7c9w8Sapte5ekA/L8kKRFku6V9AVJ0ytxB0i6VNJv5+f+sKQbJf2NpNeOpu9N+9Ph9flWi3WD6Ptlko7v1p5K/AmSFuR+XydpvqSfa/z4sT7LRdI7SBcifTMi1lfW/1ZEXNPg8ZdGxEfz/LuA70TEU5KGgP8JHA3cD3wgIjZWHncAcBHpAqhPAR8CjgPWAh+LiKd7iavkPQH4ddKFVtuBh0hX2q6rxPTSzsuA6yLie11eh763M8e9AzgDmAZEzv/ViPhap/bUcr
y8jyo5u+7z/Ed0Vn7eLwMnku4b9ACwMCJe6vCc34qIE2vrpkbEE5Xl95LuVXQf8L+j8sfQdB813T+VvE3eH5cDPw8sIl20B+n1Ogd4KCIubvA8j0bEcGX5E8DrgL2BrcBPAf9IOt34X3fklLQ8Imbl+T8F/iNwLXAasDEi/rCS8/6IODzPfwFYCnwJeBvwmxHx9rxtCXAvsD/wi3n+i8DbgaMi4vRKzkZ9b9qfHLu6/vLk53gQICLeOMC+bwYeAYaALwCfi4gVtCBpPvB60pX2ZwD/D/hn4P2kv+EvtXrcK/RyWWm/J+BjwHeBy4F/AX6vsm15wxyPVubvr8x/AfhD0pvhfcCttcctAf6adGfI24Ar8w78KKlg9RSXY+cD/wd4L6kA/Q3wO8AK4KxRtnMzsCy/Kf4HcHSb12EQ7bw8550LvCVPc/O6v+9hP1f3UeN9Dnwit28x8FnSH8w5pBvE/X0lbnVtuhf48Y7lVvmBPyVdW3Fuzvt3tedutI+a7p8eX/d/bvN4kYrajuWtbaZnge21x96b/50EPAnsnZcn7tiWl1dUXy9g38rj7q3lfLAyf09t28r6fG7/99vF9dj3Rv3J63a8f94AHArMADbk+UMH3PcV+d+ZwH8D1pAOSD4C/HyrfVTpx/fy/GuB+xr9rTX9oxzERPrDm5jnp5AKxd+1eHEbvXGbvsi9vMl6fDM22iE9trPRG2JA7Wz0x9XjPmq0z6vtpHsRKuoPlvQf0ewWr/vsWo5Hgde32UcbWr2P8vzXOvTnAdInkWOAVV3eR/+LdMCwD+kTzBl5/QmkTzfV/rwWGAa2ADPy+gOp/MfZY98b9aey7l2kA4l35uWHW8QMou87HZgCbwT+inQX2+r6VcABeX4YWFrZtqbVfq5PYz2GPjEitgNExDPArwH7S/oS6aPUDs8AMyNi/9q0H1C9xcBtkj4qaZ88fwa8/DF3S+2598rjd4cAkyXNyLEH1p67aRzASzvG1YB/R7qylkjDHdUxuF7aGTnHQxHxFxFxBPBu4DWkYjjIdv5I0mx29svAj2rrnqHZPmq6zyENSRARPwHujnT7ZvLjX3z5BYp4J3Ad6aq6oyIN4/wkIh6JiEcq+faRdLSkY4AJEfHDSv4XeaWm+6jp/oHmr/v7gCsl3S/pljytJX3qel8lbhHpP61Wrq0t/0DS5Px8L49HSzoIeKEStwm4DPhb4Cnl22Dn99H2Ws6LgJdIQxdnAddLepb0qePsStxfkYrl3cBvAZ+UdCupeF9ey9m07037Q475Cul3G+ZIWszO77VB9V21xxERqyPigxFRHxv/GLBC0i3AHcBf5OcfIhX77ppU/UFNwI3AW1us/+/AS7Xlnf7Xztv+ujI/Cfgz0pHLo/kFf5b05h6uPe49wL/m6deBbwC3At8H5nWJ+0Y9Lsf+Bumj9y35+U/N64eAa0fZzhUNX8tBtHMWcCdp3PiWPK3N646p5Wy6jxrt87zuZmByi9iDgLtarN+X9Ae5mDTmWd/+7dp0cF5/ILW72jXdR033Ty+ve62fxwAjwEED+hvcF3hdg7gJwE932P4zwIFdHr/jk9nE3KeDO8SPqu9N+gMcBZzfQ85R973V+7fLcx2Q+zxlNPtzTL8UzUc/RMTzLbZNi4jv70LunyG9gZ7sEDOB9MXwdkkTgTeRhis2jSYuxx4A/Czp49Qzu9pOSZMjYlu3PINsZz7qmUY62tgYET9o0p42uXZ5n0valzRc8nib7UcBx0W6z1CTNk0Afioinmuzve0+6mX/5PhGr3v+Qng2r/wy+q6o/cE2jRtPOTu8Jm+IiAf6FVdkzrEs6C83QpoU6WNvdV39jIS9ACLiJaW7Ph4JrI+Ip2qPaxTXog2TSd98P1z9Q8s5frLjjZc/cs8ijf3dXMvxxoiof6Pe7vmGga0R8UweHhkBHoiI+0YbO4icOXaEylkZnd5YTWPHMuervT+SfpX0hfBDpE9YkL6Q/Tng/RFxSy9x4ylnJ6
qdubOrcUXmHOMj9BOA/0s65WgFaWhgfd5WPYXoDNKXEC8B55NOyfshqQBfEBH/2Etcjv1ERLw/z7+F9FH6X0hvst+NiCV52ypgTkQ8Lem/kr5cWQK8lfQx/YOVnC+STjX6HOn0pPvb9PsS4HdJZ2L8LfBfgO8BxwKfiojLeo0dUM63kr7weYb08fd7pC+3fgKcHREbKjkbxY5lznHUn7XAyVE5pTOvPwxYEhG/2EvcOMt5Ba0JODci9u8lrsScHcUAxuWaTqQvSY7I82eS/gc/Ni+vqMStII2pHUY6c+IX8vpDqYx9No3L66qnsH0bmJXnf7aWs3r2wTJgn/i3ccDVtZwrSJ8I/pJ0K+FVwCXkb/UrcWtI344fSBqXHYp/G/+7bzSxA8q5orLtMOAref7twC0t+t41dixzjqP+PEQeb649fm8qZ0Y0jRtnOZ8l/UzluS2mJ3qNKzFnp6nr/dAHbO+IWAMQEV/O/5Nfn48goxoYedw2f/TYcUHAI6pdFdY0rmb/iFieYx/OY6o7bJV0ZKShiCdIZy88Tyro9ZyR4z4MfFjpDJG5wO2SNkTEf8hxL0bE85JeyLmezA/+oV55QVovsYPIOSEiNuf5R8lnVETErUoXgIwmdixzjpf+XAPcLenzpNMvIQ3RzCVdNNZr3HjKeTfpoOKfauuR9GejiCsxZ1tjPeSyDDgtKl+yKV02eyPw7yOd8oakFaSzKl6SNDsi7srrJ5DOFz2yl7i87jnSUbRI5y0PRxpW2Yt05L0j5xtJw0I7Ths6HvgO6VzSyyLi2krOFRFxdIt+CviViPhOXv406ehkX9IPgmwHvka6EnK/iHh35bGNYgeU8xrSf6zfJF2h+f2I+CNJP036hPOGSs5GsWOZc7z0J8ceDryTypfRwOKoDeM1jRsvOZW+NP5RtPmCute4EnN2zDHGBf1twOaIWFVbPwW4MCL+Mi//Mumigh/V4maQfpz6s73E5XWH1pqzKSJeULqfw69ExPWV2AnAr5LG4ieS3oxfj9pZCpL+c7XAd+j3RF55SfubSacdPgp8PPL50b3EDijnJNJ5tYeT/kO7JiJeVDpT5XVROce7aexY5hwv/TEbtWgwLuPJk6fdN5HOa55PuhjnyTytzeum9BrnnGXl7DSN6ZWikiYrXY23RtIWSZslLZX0vkHGvYpznttDznN3Y877euh7x9ixzDle+kO6cdXTpLOrDoyIA4ETSGfHfGkUcSXkfLphznpciTnbG+Mjka+SLuedDvwR6V4YM4HPkO4uNpA459wzc46j/jxYfVwtx4O9xjlnWTk7TY2L7yAmdr4Bzt35371IF7kMJM4598yc46g/twB/TOXGW6Tbqv4J8I1e45yzrJydprG+OdcPlS7qQdKvAU9BusoTXnFTm37HOeeemXO89Oc3SNcIfEfS05KeIt0S+QDSjb96jXPOsnK216TqD2oinfp3F2ks7Q7y7UZJNyv6/UHFOeeemXO89CevewPpxxIm19afNJo45ywrZ7tpTAt6x4bBeWMR55x7Zs5XU3+A3yfdlvUGYD1wemXb8l7jnLOsnB3fR03fcLt7ovIrN7szzjn3zJyvpv6QfgRkcp6fQbrlxMV5eUWvcc5ZVs5O05he+q+df+vv5U2kLwMGEuece2bO8dIf0i0CtgFExHpJc4AvK10Mp1HEOWdZOdsa63u5vB54B+k8yyoB/zTAOOfcM3OOl/78QNKbImIlQERsk3Qa6Z4ovzSKOOcsK2d7TQ7jBzWRbszzljbbrh1UnHPumTnHUX+m0+ZXeoDje41zzrJydppeFT9wYWZmu26sz0M3M7M+cUE3MyuEC7qZWSFc0M3MCuGCbmZWiP8Pzjbdtgu9G8wAAAAASUVORK5CYII=", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "merged[(merged['name']==\"Ian McKellen\") & (merged['country']==\"USA\")].date.dt.year.value_counts().sort_index().plot(kind='bar')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} +ons of Pandas" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "\n", + "%matplotlib inline" + ] + }, { "cell_type": "code", "execution_count": 2, @@ -1172,4 +4691,4 @@ }, "nbformat": 4, "nbformat_minor": 2 -} \ No newline at end of file +} From bef0bcc2af2b979f0dc73d48ee2391478f3898a4 Mon Sep 17 00:00:00 2001 From: Pierre Lermant Date: Mon, 26 Sep 2022 16:18:11 -0700 Subject: [PATCH 4/5] Update Mini_Project_Data_Wrangling_Pandas.ipynb --- .../Mini_Project_Data_Wrangling_Pandas.ipynb | 1175 ----------------- 1 file changed, 1175 deletions(-) diff --git a/mec-5.3.10-data-wranging-with-pandas-mini-project/Mini_Project_Data_Wrangling_Pandas.ipynb b/mec-5.3.10-data-wranging-with-pandas-mini-project/Mini_Project_Data_Wrangling_Pandas.ipynb index c5e3c56f9..605a9d60b 100755 --- a/mec-5.3.10-data-wranging-with-pandas-mini-project/Mini_Project_Data_Wrangling_Pandas.ipynb +++ b/mec-5.3.10-data-wranging-with-pandas-mini-project/Mini_Project_Data_Wrangling_Pandas.ipynb @@ -1,21 +1,4 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ 
- "# Mini-Project: Data Wrangling and Transformation with Pandas\n", - "\n", - "Working with tabular data is a necessity for anyone with enterprises having a majority of their data in relational databases and flat files. This mini-project is adopted from the excellent tutorial on pandas by Brandon Rhodes which you have watched earlier in the Data Wrangling Unit. In this mini-project, we will be looking at some interesting data based on movie data from the IMDB.\n", - "\n", - "This assignment should help you reinforce the concepts you learnt in the curriculum for Data Wrangling and sharpen your skills in using Pandas. Good Luck!" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Please make sure you have one of the more recent versi{ "cells": [ { "cell_type": "markdown", @@ -3534,1161 +3517,3 @@ "nbformat": 4, "nbformat_minor": 2 } -ons of Pandas" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "import matplotlib.pyplot as plt\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "'0.25.3'" - ] - }, - "metadata": {}, - "execution_count": 2 - } - ], - "source": [ - "pd.__version__" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Taking a look at the Movies dataset\n", - "This data shows the movies based on their title and the year of release" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "RangeIndex: 244914 entries, 0 to 244913\n", - "Data columns (total 2 columns):\n", - " # Column Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 title 244914 non-null object\n", - " 1 year 244914 non-null int64 \n", - "dtypes: int64(1), object(1)\n", - 
"memory usage: 3.7+ MB\n" - ] - } - ], - "source": [ - "movies = pd.read_csv('titles.csv')\n", - "movies.info()" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
titleyear
0The Ticket to the Life2009
1Parallel Worlds: A New Rock Music Experience2016
2Morita - La hija de Jesus2008
3Gun2017
4Love or Nothing at All2014
\n", - "
" - ], - "text/plain": [ - " title year\n", - "0 The Ticket to the Life 2009\n", - "1 Parallel Worlds: A New Rock Music Experience 2016\n", - "2 Morita - La hija de Jesus 2008\n", - "3 Gun 2017\n", - "4 Love or Nothing at All 2014" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "movies.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Taking a look at the Cast dataset\n", - "\n", - "This data shows the cast (actors, actresses, supporting roles) for each movie\n", - "\n", - "- The attribute `n` basically tells the importance of the cast role, lower the number, more important the role.\n", - "- Supporting cast usually don't have any value for `n`" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "\nRangeIndex: 3786176 entries, 0 to 3786175\nData columns (total 6 columns):\ntitle object\nyear int64\nname object\ntype object\ncharacter object\nn float64\ndtypes: float64(1), int64(1), object(4)\nmemory usage: 173.3+ MB\n" - ] - } - ], - "source": [ - "cast = pd.read_csv('cast.csv.zip')\n", - "cast.info()" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
titleyearnametypecharactern
0Closet Monster2015Buffy #1actorBuffy 431.0
1Suuri illusioni1985Homo $actorGuests22.0
2Battle of the Sexes2017$hutteractorBobby Riggs Fan10.0
3Secret in Their Eyes2015$hutteractor2002 Dodger FanNaN
4Steve Jobs2015$hutteractor1988 Opera House PatronNaN
5Straight Outta Compton2015$hutteractorClub PatronNaN
6Straight Outta Compton2015$hutteractorDopemanNaN
7For Thy Love 22009Bee Moe $limactorThug 1NaN
8Lapis, Ballpen at Diploma, a True to Life Journey2014Jori ' Danilo' Jurado Jr.actorJaime (young)9.0
9Desire (III)2014Syaiful 'AriffinactorActor Playing Eteocles from 'Antigone'NaN
\n", - "
" - ], - "text/plain": [ - " title year \\\n", - "0 Closet Monster 2015 \n", - "1 Suuri illusioni 1985 \n", - "2 Battle of the Sexes 2017 \n", - "3 Secret in Their Eyes 2015 \n", - "4 Steve Jobs 2015 \n", - "5 Straight Outta Compton 2015 \n", - "6 Straight Outta Compton 2015 \n", - "7 For Thy Love 2 2009 \n", - "8 Lapis, Ballpen at Diploma, a True to Life Journey 2014 \n", - "9 Desire (III) 2014 \n", - "\n", - " name type character \\\n", - "0 Buffy #1 actor Buffy 4 \n", - "1 Homo $ actor Guests \n", - "2 $hutter actor Bobby Riggs Fan \n", - "3 $hutter actor 2002 Dodger Fan \n", - "4 $hutter actor 1988 Opera House Patron \n", - "5 $hutter actor Club Patron \n", - "6 $hutter actor Dopeman \n", - "7 Bee Moe $lim actor Thug 1 \n", - "8 Jori ' Danilo' Jurado Jr. actor Jaime (young) \n", - "9 Syaiful 'Ariffin actor Actor Playing Eteocles from 'Antigone' \n", - "\n", - " n \n", - "0 31.0 \n", - "1 22.0 \n", - "2 10.0 \n", - "3 NaN \n", - "4 NaN \n", - "5 NaN \n", - "6 NaN \n", - "7 NaN \n", - "8 9.0 \n", - "9 NaN " - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "cast.head(10)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Taking a look at the Release dataset\n", - "\n", - "This data shows details of when each movie was release in each country with the release date" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "RangeIndex: 479488 entries, 0 to 479487\n", - "Data columns (total 4 columns):\n", - " # Column Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 title 479488 non-null object \n", - " 1 year 479488 non-null int64 \n", - " 2 country 479488 non-null object \n", - " 3 date 479488 non-null datetime64[ns]\n", - "dtypes: datetime64[ns](1), int64(1), object(2)\n", - "memory usage: 14.6+ MB\n" - ] - } - ], - "source": [ - "release_dates = 
pd.read_csv('release_dates.csv', parse_dates=['date'], infer_datetime_format=True)\n", - "release_dates.info()" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
titleyearcountrydate
0#73, Shaanthi Nivaasa2007India2007-06-15
1#BKKY2016Cambodia2017-10-12
2#Beings2015Romania2015-01-29
3#Captured2017USA2017-09-05
4#Ewankosau saranghaeyo2015Philippines2015-01-21
\n", - "
" - ], - "text/plain": [ - " title year country date\n", - "0 #73, Shaanthi Nivaasa 2007 India 2007-06-15\n", - "1 #BKKY 2016 Cambodia 2017-10-12\n", - "2 #Beings 2015 Romania 2015-01-29\n", - "3 #Captured 2017 USA 2017-09-05\n", - "4 #Ewankosau saranghaeyo 2015 Philippines 2015-01-21" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "release_dates.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Section I - Basic Querying, Filtering and Transformations" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### What is the total number of movies?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "len(movies)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### List all Batman movies ever made" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "batman_df = movies[movies.title == 'Batman']\n", - "print('Total Batman Movies:', len(batman_df))\n", - "batman_df" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### List all Batman movies ever made - the right approach" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "batman_df = movies[movies.title.str.contains('Batman', case=False)]\n", - "print('Total Batman Movies:', len(batman_df))\n", - "batman_df.head(10)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Display the top 15 Batman movies in the order they were released" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "batman_df.sort_values(by=['year'], ascending=True).iloc[:15]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q1 : List all the 'Harry Potter' movies from 
the most recent to the earliest" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### How many movies were made in the year 2017?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "len(movies[movies.year == 2017])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q2 : How many movies were made in the year 2015?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q3 : How many movies were made from 2000 till 2018?\n", - "- You can chain multiple conditions using OR (`|`) as well as AND (`&`) depending on the condition" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q4: How many movies are titled \"Hamlet\"?" 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q5: List all movies titled \"Hamlet\" \n", - "- The movies should only have been released on or after the year 2000\n", - "- Display the movies based on the year they were released (earliest to most recent)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q6: How many roles in the movie \"Inception\" are of the supporting cast (extra credits)\n", - "- supporting cast are NOT ranked by an \"n\" value (NaN)\n", - "- check for how to filter based on nulls" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q7: How many roles in the movie \"Inception\" are of the main cast\n", - "- main cast always have an 'n' value" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q8: Show the top ten cast (actors\\actresses) in the movie \"Inception\" \n", - "- main cast always have an 'n' value\n", - "- remember to sort!" 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q9:\n", - "\n", - "(A) List all movies where there was a character 'Albus Dumbledore' \n", - "\n", - "(B) Now modify the above to show only the actors who played the character 'Albus Dumbledore'\n", - "- For Part (B) remember the same actor might play the same role in multiple movies" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q10:\n", - "\n", - "(A) How many roles has 'Keanu Reeves' played throughout his career?\n", - "\n", - "(B) List the leading roles that 'Keanu Reeves' played on or after 1999 in order by year." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q11: \n", - "\n", - "(A) List the total number of actor and actress roles available from 1950 - 1960\n", - "\n", - "(B) List the total number of actor and actress roles available from 2007 - 2017" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section I - Q12: \n", - "\n", - "(A) List the total number of leading roles available from 2000 to present\n", - "\n", - "(B) List the total number of non-leading roles available from 2000 - present (exclude 
support cast)\n", - "\n", - "(C) List the total number of support\\extra-credit roles available from 2000 - present" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Section II - Aggregations, Transformations and Visualizations" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## What are the top ten most common movie names of all time?\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "top_ten = movies.title.value_counts()[:10]\n", - "top_ten" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Plot the top ten common movie names of all time" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "top_ten.plot(kind='barh')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q1: Which years in the 2000s saw the most movies released? 
(Show top 3)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q2: # Plot the total number of films released per-decade (1890, 1900, 1910,....)\n", - "- Hint: Dividing the year and multiplying with a number might give you the decade the year falls into!\n", - "- You might need to sort before plotting" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q3: \n", - "\n", - "(A) What are the top 10 most common character names in movie history?\n", - "\n", - "(B) Who are the top 10 people most often credited as \"Herself\" in movie history?\n", - "\n", - "(C) Who are the top 10 people most often credited as \"Himself\" in movie history?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q4: \n", - "\n", - "(A) What are the top 10 most frequent roles that start with the word \"Zombie\"?\n", - "\n", - "(B) What are the top 10 most frequent roles that start with the word \"Police\"?\n", - "\n", - "- Hint: The `startswith()` function might be useful" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q5: Plot how many roles 'Keanu Reeves' has played in each year 
of his career." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q6: Plot the cast positions (n-values) of Keanu Reeve's roles through his career over the years.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q7: Plot the number of \"Hamlet\" films made by each decade" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q8: \n", - "\n", - "(A) How many leading roles were available to both actors and actresses, in the 1960s (1960-1969)?\n", - "\n", - "(B) How many leading roles were available to both actors and actresses, in the 2000s (2000-2009)?\n", - "\n", - "- Hint: A specific value of n might indicate a leading role" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q9: List, in order by year, each of the films in which Frank Oz has played more than 1 role." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section II - Q10: List each of the characters that Frank Oz has portrayed at least twice" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Section III - Advanced Merging, Querying and Visualizations" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Make a bar plot with the following conditions\n", - "- Frequency of the number of movies with \"Christmas\" in their title \n", - "- Movies should be such that they are released in the USA.\n", - "- Show the frequency plot by month" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "christmas = release_dates[(release_dates.title.str.contains('Christmas')) & (release_dates.country == 'USA')]\n", - "christmas.date.dt.month.value_counts().sort_index().plot(kind='bar')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section III - Q1: Make a bar plot with the following conditions\n", - "- Frequency of the number of movies with \"Summer\" in their title \n", - "- Movies should be such that they are released in the USA.\n", - "- Show the frequency plot by month" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section III - Q2: Make a bar plot with the following conditions\n", - "- Frequency of the number of movies with \"Action\" in their title \n", - "- Movies should be such that they are released in the USA.\n", - "- Show the frequency plot by week" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - 
"source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section III - Q3: Show all the movies in which Keanu Reeves has played the lead role along with their release date in the USA sorted by the date of release\n", - "- Hint: You might need to join or merge two datasets!" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section III - Q4: Make a bar plot showing the months in which movies with Keanu Reeves tend to be released in the USA?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Section III - Q5: Make a bar plot showing the years in which movies with Ian McKellen tend to be released in the USA?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6-final" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From f945342570a1a34daa32798cefcceb7cca468183 Mon Sep 17 00:00:00 2001 From: Pierre Lermant Date: Fri, 30 Sep 2022 13:16:34 -0700 Subject: [PATCH 5/5] Update Mini_Project_Wrangling_Json_Exercise.ipynb --- ...Mini_Project_Wrangling_Json_Exercise.ipynb | 1140 +++++++++++++---- 1 file changed, 893 insertions(+), 247 deletions(-) diff --git a/mec-5.4.4-json-data-wrangling-mini-project/Mini_Project_Wrangling_Json_Exercise.ipynb b/mec-5.4.4-json-data-wrangling-mini-project/Mini_Project_Wrangling_Json_Exercise.ipynb index 
a8bfea9e2..7eb7c1c11 100755 --- a/mec-5.4.4-json-data-wrangling-mini-project/Mini_Project_Wrangling_Json_Exercise.ipynb +++ b/mec-5.4.4-json-data-wrangling-mini-project/Mini_Project_Wrangling_Json_Exercise.ipynb @@ -17,10 +17,8 @@ }, { "cell_type": "code", - "execution_count": 3, - "metadata": { - "collapsed": true - }, + "execution_count": 2, + "metadata": {}, "outputs": [], "source": [ "import pandas as pd" @@ -35,10 +33,8 @@ }, { "cell_type": "code", - "execution_count": 6, - "metadata": { - "collapsed": true - }, + "execution_count": 3, + "metadata": {}, "outputs": [], "source": [ "import json\n", @@ -57,10 +53,8 @@ }, { "cell_type": "code", - "execution_count": 4, - "metadata": { - "collapsed": true - }, + "execution_count": 6, + "metadata": {}, "outputs": [], "source": [ "# define json string\n", @@ -74,20 +68,31 @@ " 'shortname': 'OH',\n", " 'info': {'governor': 'John Kasich'},\n", " 'counties': [{'name': 'Summit', 'population': 1234},\n", - " {'name': 'Cuyahoga', 'population': 1337}]}]" + " {'name': 'Cuyahoga', 'population': 1337}]}]\n" ] }, { "cell_type": "code", - "execution_count": 7, - "metadata": { - "collapsed": false - }, + "execution_count": 10, + "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
\n", + "
\n", + "\n", "\n", " \n", " \n", @@ -135,36 +140,47 @@ "4 Cuyahoga 1337" ] }, - "execution_count": 7, + "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# use normalization to create tables from nested element\n", - "json_normalize(data, 'counties')" + "pd.json_normalize(data, 'counties')" ] }, { "cell_type": "code", - "execution_count": 8, - "metadata": { - "collapsed": false - }, + "execution_count": 9, + "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
\n", + "
\n", + "\n", "
\n", " \n", " \n", " \n", " \n", " \n", - " \n", " \n", " \n", + " \n", " \n", " \n", " \n", @@ -172,63 +188,63 @@ " \n", " \n", " \n", - " \n", " \n", " \n", + " \n", " \n", " \n", " \n", " \n", " \n", - " \n", " \n", " \n", + " \n", " \n", " \n", " \n", " \n", " \n", - " \n", " \n", " \n", + " \n", " \n", " \n", " \n", " \n", " \n", - " \n", " \n", " \n", + " \n", " \n", " \n", " \n", " \n", " \n", - " \n", " \n", " \n", + " \n", " \n", " \n", "
namepopulationinfo.governorstateshortnameinfo.governor
0Dade12345Rick ScottFloridaFLRick Scott
1Broward40000Rick ScottFloridaFLRick Scott
2Palm Beach60000Rick ScottFloridaFLRick Scott
3Summit1234John KasichOhioOHJohn Kasich
4Cuyahoga1337John KasichOhioOHJohn Kasich
\n", "
" ], "text/plain": [ - " name population info.governor state shortname\n", - "0 Dade 12345 Rick Scott Florida FL\n", - "1 Broward 40000 Rick Scott Florida FL\n", - "2 Palm Beach 60000 Rick Scott Florida FL\n", - "3 Summit 1234 John Kasich Ohio OH\n", - "4 Cuyahoga 1337 John Kasich Ohio OH" + " name population state shortname info.governor\n", + "0 Dade 12345 Florida FL Rick Scott\n", + "1 Broward 40000 Florida FL Rick Scott\n", + "2 Palm Beach 60000 Florida FL Rick Scott\n", + "3 Summit 1234 Ohio OH John Kasich\n", + "4 Cuyahoga 1337 Ohio OH John Kasich" ] }, - "execution_count": 8, + "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# further populate tables created from nested element\n", - "json_normalize(data, 'counties', ['state', 'shortname', ['info', 'governor']])" + "pd.json_normalize(data, 'counties', ['state', 'shortname', ['info', 'governor']])" ] }, { @@ -245,182 +261,178 @@ }, { "cell_type": "code", - "execution_count": 9, - "metadata": { - "collapsed": false - }, + "execution_count": 11, + "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "[{u'_id': {u'$oid': u'52b213b38594d8a2be17c780'},\n", - " u'approvalfy': 1999,\n", - " u'board_approval_month': u'November',\n", - " u'boardapprovaldate': u'2013-11-12T00:00:00Z',\n", - " u'borrower': u'FEDERAL DEMOCRATIC REPUBLIC OF ETHIOPIA',\n", - " u'closingdate': u'2018-07-07T00:00:00Z',\n", - " u'country_namecode': u'Federal Democratic Republic of Ethiopia!$!ET',\n", - " u'countrycode': u'ET',\n", - " u'countryname': u'Federal Democratic Republic of Ethiopia',\n", - " u'countryshortname': u'Ethiopia',\n", - " u'docty': u'Project Information Document,Indigenous Peoples Plan,Project Information Document',\n", - " u'envassesmentcategorycode': u'C',\n", - " u'grantamt': 0,\n", - " u'ibrdcommamt': 0,\n", - " u'id': u'P129828',\n", - " u'idacommamt': 130000000,\n", - " u'impagency': u'MINISTRY OF EDUCATION',\n", - " u'lendinginstr': u'Investment Project Financing',\n", 
- " u'lendinginstrtype': u'IN',\n", - " u'lendprojectcost': 550000000,\n", - " u'majorsector_percent': [{u'Name': u'Education', u'Percent': 46},\n", - " {u'Name': u'Education', u'Percent': 26},\n", - " {u'Name': u'Public Administration, Law, and Justice', u'Percent': 16},\n", - " {u'Name': u'Education', u'Percent': 12}],\n", - " u'mjsector_namecode': [{u'code': u'EX', u'name': u'Education'},\n", - " {u'code': u'EX', u'name': u'Education'},\n", - " {u'code': u'BX', u'name': u'Public Administration, Law, and Justice'},\n", - " {u'code': u'EX', u'name': u'Education'}],\n", - " u'mjtheme': [u'Human development'],\n", - " u'mjtheme_namecode': [{u'code': u'8', u'name': u'Human development'},\n", - " {u'code': u'11', u'name': u''}],\n", - " u'mjthemecode': u'8,11',\n", - " u'prodline': u'PE',\n", - " u'prodlinetext': u'IBRD/IDA',\n", - " u'productlinetype': u'L',\n", - " u'project_abstract': {u'cdata': u'The development objective of the Second Phase of General Education Quality Improvement Project for Ethiopia is to improve learning conditions in primary and secondary schools and strengthen institutions at different levels of educational administration. The project has six components. The first component is curriculum, textbooks, assessment, examinations, and inspection. This component will support improvement of learning conditions in grades KG-12 by providing increased access to teaching and learning materials and through improvements to the curriculum by assessing the strengths and weaknesses of the current curriculum. This component has following four sub-components: (i) curriculum reform and implementation; (ii) teaching and learning materials; (iii) assessment and examinations; and (iv) inspection. The second component is teacher development program (TDP). 
This component will support improvements in learning conditions in both primary and secondary schools by advancing the quality of teaching in general education through: (a) enhancing the training of pre-service teachers in teacher education institutions; and (b) improving the quality of in-service teacher training. This component has following three sub-components: (i) pre-service teacher training; (ii) in-service teacher training; and (iii) licensing and relicensing of teachers and school leaders. The third component is school improvement plan. This component will support the strengthening of school planning in order to improve learning outcomes, and to partly fund the school improvement plans through school grants. It has following two sub-components: (i) school improvement plan; and (ii) school grants. The fourth component is management and capacity building, including education management information systems (EMIS). This component will support management and capacity building aspect of the project. This component has following three sub-components: (i) capacity building for education planning and management; (ii) capacity building for school planning and management; and (iii) EMIS. The fifth component is improving the quality of learning and teaching in secondary schools and universities through the use of information and communications technology (ICT). It has following five sub-components: (i) national policy and institution for ICT in general education; (ii) national ICT infrastructure improvement plan for general education; (iii) develop an integrated monitoring, evaluation, and learning system specifically for the ICT component; (iv) teacher professional development in the use of ICT; and (v) provision of limited number of e-Braille display readers with the possibility to scale up to all secondary education schools based on the successful implementation and usage of the readers. 
The sixth component is program coordination, monitoring and evaluation, and communication. It will support institutional strengthening by developing capacities in all aspects of program coordination, monitoring and evaluation; a new sub-component on communications will support information sharing for better management and accountability. It has following three sub-components: (i) program coordination; (ii) monitoring and evaluation (M and E); and (iii) communication.'},\n", - " u'project_name': u'Ethiopia General Education Quality Improvement Project II',\n", - " u'projectdocs': [{u'DocDate': u'28-AUG-2013',\n", - " u'DocType': u'PID',\n", - " u'DocTypeDesc': u'Project Information Document (PID), Vol.',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=090224b081e545fb_1_0',\n", - " u'EntityID': u'090224b081e545fb_1_0'},\n", - " {u'DocDate': u'01-JUL-2013',\n", - " u'DocType': u'IP',\n", - " u'DocTypeDesc': u'Indigenous Peoples Plan (IP), Vol.1 of 1',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000442464_20130920111729',\n", - " u'EntityID': u'000442464_20130920111729'},\n", - " {u'DocDate': u'22-NOV-2012',\n", - " u'DocType': u'PID',\n", - " u'DocTypeDesc': u'Project Information Document (PID), Vol.',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=090224b0817b19e2_1_0',\n", - " u'EntityID': u'090224b0817b19e2_1_0'}],\n", - " u'projectfinancialtype': u'IDA',\n", - " u'projectstatusdisplay': u'Active',\n", - " u'regionname': u'Africa',\n", - " u'sector': [{u'Name': u'Primary education'},\n", - " {u'Name': u'Secondary education'},\n", - " {u'Name': u'Public administration- Other social services'},\n", - " {u'Name': u'Tertiary education'}],\n", - " u'sector1': {u'Name': u'Primary education', u'Percent': 46},\n", - " u'sector2': {u'Name': u'Secondary education', u'Percent': 26},\n", - " u'sector3': {u'Name': u'Public administration- Other social 
services',\n", - " u'Percent': 16},\n", - " u'sector4': {u'Name': u'Tertiary education', u'Percent': 12},\n", - " u'sector_namecode': [{u'code': u'EP', u'name': u'Primary education'},\n", - " {u'code': u'ES', u'name': u'Secondary education'},\n", - " {u'code': u'BS', u'name': u'Public administration- Other social services'},\n", - " {u'code': u'ET', u'name': u'Tertiary education'}],\n", - " u'sectorcode': u'ET,BS,ES,EP',\n", - " u'source': u'IBRD',\n", - " u'status': u'Active',\n", - " u'supplementprojectflg': u'N',\n", - " u'theme1': {u'Name': u'Education for all', u'Percent': 100},\n", - " u'theme_namecode': [{u'code': u'65', u'name': u'Education for all'}],\n", - " u'themecode': u'65',\n", - " u'totalamt': 130000000,\n", - " u'totalcommamt': 130000000,\n", - " u'url': u'http://www.worldbank.org/projects/P129828/ethiopia-general-education-quality-improvement-project-ii?lang=en'},\n", - " {u'_id': {u'$oid': u'52b213b38594d8a2be17c781'},\n", - " u'approvalfy': 2015,\n", - " u'board_approval_month': u'November',\n", - " u'boardapprovaldate': u'2013-11-04T00:00:00Z',\n", - " u'borrower': u'GOVERNMENT OF TUNISIA',\n", - " u'country_namecode': u'Republic of Tunisia!$!TN',\n", - " u'countrycode': u'TN',\n", - " u'countryname': u'Republic of Tunisia',\n", - " u'countryshortname': u'Tunisia',\n", - " u'docty': u'Project Information Document,Integrated Safeguards Data Sheet,Integrated Safeguards Data Sheet,Project Information Document,Integrated Safeguards Data Sheet,Project Information Document',\n", - " u'envassesmentcategorycode': u'C',\n", - " u'grantamt': 4700000,\n", - " u'ibrdcommamt': 0,\n", - " u'id': u'P144674',\n", - " u'idacommamt': 0,\n", - " u'impagency': u'MINISTRY OF FINANCE',\n", - " u'lendinginstr': u'Specific Investment Loan',\n", - " u'lendinginstrtype': u'IN',\n", - " u'lendprojectcost': 5700000,\n", - " u'majorsector_percent': [{u'Name': u'Public Administration, Law, and Justice',\n", - " u'Percent': 70},\n", - " {u'Name': u'Public Administration, 
Law, and Justice', u'Percent': 30}],\n", - " u'mjsector_namecode': [{u'code': u'BX',\n", - " u'name': u'Public Administration, Law, and Justice'},\n", - " {u'code': u'BX', u'name': u'Public Administration, Law, and Justice'}],\n", - " u'mjtheme': [u'Economic management',\n", - " u'Social protection and risk management'],\n", - " u'mjtheme_namecode': [{u'code': u'1', u'name': u'Economic management'},\n", - " {u'code': u'6', u'name': u'Social protection and risk management'}],\n", - " u'mjthemecode': u'1,6',\n", - " u'prodline': u'RE',\n", - " u'prodlinetext': u'Recipient Executed Activities',\n", - " u'productlinetype': u'L',\n", - " u'project_name': u'TN: DTF Social Protection Reforms Support',\n", - " u'projectdocs': [{u'DocDate': u'29-MAR-2013',\n", - " u'DocType': u'PID',\n", - " u'DocTypeDesc': u'Project Information Document (PID), Vol.1 of 1',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000333037_20131024115616',\n", - " u'EntityID': u'000333037_20131024115616'},\n", - " {u'DocDate': u'29-MAR-2013',\n", - " u'DocType': u'ISDS',\n", - " u'DocTypeDesc': u'Integrated Safeguards Data Sheet (ISDS), Vol.1 of 1',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000356161_20131024151611',\n", - " u'EntityID': u'000356161_20131024151611'},\n", - " {u'DocDate': u'29-MAR-2013',\n", - " u'DocType': u'ISDS',\n", - " u'DocTypeDesc': u'Integrated Safeguards Data Sheet (ISDS), Vol.1 of 1',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000442464_20131031112136',\n", - " u'EntityID': u'000442464_20131031112136'},\n", - " {u'DocDate': u'29-MAR-2013',\n", - " u'DocType': u'PID',\n", - " u'DocTypeDesc': u'Project Information Document (PID), Vol.1 of 1',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000333037_20131031105716',\n", - " u'EntityID': u'000333037_20131031105716'},\n", - " {u'DocDate': u'16-JAN-2013',\n", - " 
u'DocType': u'ISDS',\n", - " u'DocTypeDesc': u'Integrated Safeguards Data Sheet (ISDS), Vol.1 of 1',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000356161_20130305113209',\n", - " u'EntityID': u'000356161_20130305113209'},\n", - " {u'DocDate': u'16-JAN-2013',\n", - " u'DocType': u'PID',\n", - " u'DocTypeDesc': u'Project Information Document (PID), Vol.1 of 1',\n", - " u'DocURL': u'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000356161_20130305113716',\n", - " u'EntityID': u'000356161_20130305113716'}],\n", - " u'projectfinancialtype': u'OTHER',\n", - " u'projectstatusdisplay': u'Active',\n", - " u'regionname': u'Middle East and North Africa',\n", - " u'sector': [{u'Name': u'Public administration- Other social services'},\n", - " {u'Name': u'General public administration sector'}],\n", - " u'sector1': {u'Name': u'Public administration- Other social services',\n", - " u'Percent': 70},\n", - " u'sector2': {u'Name': u'General public administration sector',\n", - " u'Percent': 30},\n", - " u'sector_namecode': [{u'code': u'BS',\n", - " u'name': u'Public administration- Other social services'},\n", - " {u'code': u'BZ', u'name': u'General public administration sector'}],\n", - " u'sectorcode': u'BZ,BS',\n", - " u'source': u'IBRD',\n", - " u'status': u'Active',\n", - " u'supplementprojectflg': u'N',\n", - " u'theme1': {u'Name': u'Other economic management', u'Percent': 30},\n", - " u'theme_namecode': [{u'code': u'24', u'name': u'Other economic management'},\n", - " {u'code': u'54', u'name': u'Social safety nets'}],\n", - " u'themecode': u'54,24',\n", - " u'totalamt': 0,\n", - " u'totalcommamt': 4700000,\n", - " u'url': u'http://www.worldbank.org/projects/P144674?lang=en'}]" + "[{'_id': {'$oid': '52b213b38594d8a2be17c780'},\n", + " 'approvalfy': 1999,\n", + " 'board_approval_month': 'November',\n", + " 'boardapprovaldate': '2013-11-12T00:00:00Z',\n", + " 'borrower': 'FEDERAL DEMOCRATIC REPUBLIC OF ETHIOPIA',\n", 
+ " 'closingdate': '2018-07-07T00:00:00Z',\n", + " 'country_namecode': 'Federal Democratic Republic of Ethiopia!$!ET',\n", + " 'countrycode': 'ET',\n", + " 'countryname': 'Federal Democratic Republic of Ethiopia',\n", + " 'countryshortname': 'Ethiopia',\n", + " 'docty': 'Project Information Document,Indigenous Peoples Plan,Project Information Document',\n", + " 'envassesmentcategorycode': 'C',\n", + " 'grantamt': 0,\n", + " 'ibrdcommamt': 0,\n", + " 'id': 'P129828',\n", + " 'idacommamt': 130000000,\n", + " 'impagency': 'MINISTRY OF EDUCATION',\n", + " 'lendinginstr': 'Investment Project Financing',\n", + " 'lendinginstrtype': 'IN',\n", + " 'lendprojectcost': 550000000,\n", + " 'majorsector_percent': [{'Name': 'Education', 'Percent': 46},\n", + " {'Name': 'Education', 'Percent': 26},\n", + " {'Name': 'Public Administration, Law, and Justice', 'Percent': 16},\n", + " {'Name': 'Education', 'Percent': 12}],\n", + " 'mjsector_namecode': [{'name': 'Education', 'code': 'EX'},\n", + " {'name': 'Education', 'code': 'EX'},\n", + " {'name': 'Public Administration, Law, and Justice', 'code': 'BX'},\n", + " {'name': 'Education', 'code': 'EX'}],\n", + " 'mjtheme': ['Human development'],\n", + " 'mjtheme_namecode': [{'name': 'Human development', 'code': '8'},\n", + " {'name': '', 'code': '11'}],\n", + " 'mjthemecode': '8,11',\n", + " 'prodline': 'PE',\n", + " 'prodlinetext': 'IBRD/IDA',\n", + " 'productlinetype': 'L',\n", + " 'project_abstract': {'cdata': 'The development objective of the Second Phase of General Education Quality Improvement Project for Ethiopia is to improve learning conditions in primary and secondary schools and strengthen institutions at different levels of educational administration. The project has six components. The first component is curriculum, textbooks, assessment, examinations, and inspection. 
This component will support improvement of learning conditions in grades KG-12 by providing increased access to teaching and learning materials and through improvements to the curriculum by assessing the strengths and weaknesses of the current curriculum. This component has following four sub-components: (i) curriculum reform and implementation; (ii) teaching and learning materials; (iii) assessment and examinations; and (iv) inspection. The second component is teacher development program (TDP). This component will support improvements in learning conditions in both primary and secondary schools by advancing the quality of teaching in general education through: (a) enhancing the training of pre-service teachers in teacher education institutions; and (b) improving the quality of in-service teacher training. This component has following three sub-components: (i) pre-service teacher training; (ii) in-service teacher training; and (iii) licensing and relicensing of teachers and school leaders. The third component is school improvement plan. This component will support the strengthening of school planning in order to improve learning outcomes, and to partly fund the school improvement plans through school grants. It has following two sub-components: (i) school improvement plan; and (ii) school grants. The fourth component is management and capacity building, including education management information systems (EMIS). This component will support management and capacity building aspect of the project. This component has following three sub-components: (i) capacity building for education planning and management; (ii) capacity building for school planning and management; and (iii) EMIS. The fifth component is improving the quality of learning and teaching in secondary schools and universities through the use of information and communications technology (ICT). 
It has following five sub-components: (i) national policy and institution for ICT in general education; (ii) national ICT infrastructure improvement plan for general education; (iii) develop an integrated monitoring, evaluation, and learning system specifically for the ICT component; (iv) teacher professional development in the use of ICT; and (v) provision of limited number of e-Braille display readers with the possibility to scale up to all secondary education schools based on the successful implementation and usage of the readers. The sixth component is program coordination, monitoring and evaluation, and communication. It will support institutional strengthening by developing capacities in all aspects of program coordination, monitoring and evaluation; a new sub-component on communications will support information sharing for better management and accountability. It has following three sub-components: (i) program coordination; (ii) monitoring and evaluation (M and E); and (iii) communication.'},\n", + " 'project_name': 'Ethiopia General Education Quality Improvement Project II',\n", + " 'projectdocs': [{'DocTypeDesc': 'Project Information Document (PID), Vol.',\n", + " 'DocType': 'PID',\n", + " 'EntityID': '090224b081e545fb_1_0',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=090224b081e545fb_1_0',\n", + " 'DocDate': '28-AUG-2013'},\n", + " {'DocTypeDesc': 'Indigenous Peoples Plan (IP), Vol.1 of 1',\n", + " 'DocType': 'IP',\n", + " 'EntityID': '000442464_20130920111729',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000442464_20130920111729',\n", + " 'DocDate': '01-JUL-2013'},\n", + " {'DocTypeDesc': 'Project Information Document (PID), Vol.',\n", + " 'DocType': 'PID',\n", + " 'EntityID': '090224b0817b19e2_1_0',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=090224b0817b19e2_1_0',\n", + " 'DocDate': '22-NOV-2012'}],\n", + " 'projectfinancialtype': 
'IDA',\n", + " 'projectstatusdisplay': 'Active',\n", + " 'regionname': 'Africa',\n", + " 'sector': [{'Name': 'Primary education'},\n", + " {'Name': 'Secondary education'},\n", + " {'Name': 'Public administration- Other social services'},\n", + " {'Name': 'Tertiary education'}],\n", + " 'sector1': {'Name': 'Primary education', 'Percent': 46},\n", + " 'sector2': {'Name': 'Secondary education', 'Percent': 26},\n", + " 'sector3': {'Name': 'Public administration- Other social services',\n", + " 'Percent': 16},\n", + " 'sector4': {'Name': 'Tertiary education', 'Percent': 12},\n", + " 'sector_namecode': [{'name': 'Primary education', 'code': 'EP'},\n", + " {'name': 'Secondary education', 'code': 'ES'},\n", + " {'name': 'Public administration- Other social services', 'code': 'BS'},\n", + " {'name': 'Tertiary education', 'code': 'ET'}],\n", + " 'sectorcode': 'ET,BS,ES,EP',\n", + " 'source': 'IBRD',\n", + " 'status': 'Active',\n", + " 'supplementprojectflg': 'N',\n", + " 'theme1': {'Name': 'Education for all', 'Percent': 100},\n", + " 'theme_namecode': [{'name': 'Education for all', 'code': '65'}],\n", + " 'themecode': '65',\n", + " 'totalamt': 130000000,\n", + " 'totalcommamt': 130000000,\n", + " 'url': 'http://www.worldbank.org/projects/P129828/ethiopia-general-education-quality-improvement-project-ii?lang=en'},\n", + " {'_id': {'$oid': '52b213b38594d8a2be17c781'},\n", + " 'approvalfy': 2015,\n", + " 'board_approval_month': 'November',\n", + " 'boardapprovaldate': '2013-11-04T00:00:00Z',\n", + " 'borrower': 'GOVERNMENT OF TUNISIA',\n", + " 'country_namecode': 'Republic of Tunisia!$!TN',\n", + " 'countrycode': 'TN',\n", + " 'countryname': 'Republic of Tunisia',\n", + " 'countryshortname': 'Tunisia',\n", + " 'docty': 'Project Information Document,Integrated Safeguards Data Sheet,Integrated Safeguards Data Sheet,Project Information Document,Integrated Safeguards Data Sheet,Project Information Document',\n", + " 'envassesmentcategorycode': 'C',\n", + " 'grantamt': 4700000,\n", 
+ " 'ibrdcommamt': 0,\n", + " 'id': 'P144674',\n", + " 'idacommamt': 0,\n", + " 'impagency': 'MINISTRY OF FINANCE',\n", + " 'lendinginstr': 'Specific Investment Loan',\n", + " 'lendinginstrtype': 'IN',\n", + " 'lendprojectcost': 5700000,\n", + " 'majorsector_percent': [{'Name': 'Public Administration, Law, and Justice',\n", + " 'Percent': 70},\n", + " {'Name': 'Public Administration, Law, and Justice', 'Percent': 30}],\n", + " 'mjsector_namecode': [{'name': 'Public Administration, Law, and Justice',\n", + " 'code': 'BX'},\n", + " {'name': 'Public Administration, Law, and Justice', 'code': 'BX'}],\n", + " 'mjtheme': ['Economic management', 'Social protection and risk management'],\n", + " 'mjtheme_namecode': [{'name': 'Economic management', 'code': '1'},\n", + " {'name': 'Social protection and risk management', 'code': '6'}],\n", + " 'mjthemecode': '1,6',\n", + " 'prodline': 'RE',\n", + " 'prodlinetext': 'Recipient Executed Activities',\n", + " 'productlinetype': 'L',\n", + " 'project_name': 'TN: DTF Social Protection Reforms Support',\n", + " 'projectdocs': [{'DocTypeDesc': 'Project Information Document (PID), Vol.1 of 1',\n", + " 'DocType': 'PID',\n", + " 'EntityID': '000333037_20131024115616',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000333037_20131024115616',\n", + " 'DocDate': '29-MAR-2013'},\n", + " {'DocTypeDesc': 'Integrated Safeguards Data Sheet (ISDS), Vol.1 of 1',\n", + " 'DocType': 'ISDS',\n", + " 'EntityID': '000356161_20131024151611',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000356161_20131024151611',\n", + " 'DocDate': '29-MAR-2013'},\n", + " {'DocTypeDesc': 'Integrated Safeguards Data Sheet (ISDS), Vol.1 of 1',\n", + " 'DocType': 'ISDS',\n", + " 'EntityID': '000442464_20131031112136',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000442464_20131031112136',\n", + " 'DocDate': '29-MAR-2013'},\n", + " {'DocTypeDesc': 'Project 
Information Document (PID), Vol.1 of 1',\n", + " 'DocType': 'PID',\n", + " 'EntityID': '000333037_20131031105716',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000333037_20131031105716',\n", + " 'DocDate': '29-MAR-2013'},\n", + " {'DocTypeDesc': 'Integrated Safeguards Data Sheet (ISDS), Vol.1 of 1',\n", + " 'DocType': 'ISDS',\n", + " 'EntityID': '000356161_20130305113209',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000356161_20130305113209',\n", + " 'DocDate': '16-JAN-2013'},\n", + " {'DocTypeDesc': 'Project Information Document (PID), Vol.1 of 1',\n", + " 'DocType': 'PID',\n", + " 'EntityID': '000356161_20130305113716',\n", + " 'DocURL': 'http://www-wds.worldbank.org/servlet/WDSServlet?pcont=details&eid=000356161_20130305113716',\n", + " 'DocDate': '16-JAN-2013'}],\n", + " 'projectfinancialtype': 'OTHER',\n", + " 'projectstatusdisplay': 'Active',\n", + " 'regionname': 'Middle East and North Africa',\n", + " 'sector': [{'Name': 'Public administration- Other social services'},\n", + " {'Name': 'General public administration sector'}],\n", + " 'sector1': {'Name': 'Public administration- Other social services',\n", + " 'Percent': 70},\n", + " 'sector2': {'Name': 'General public administration sector', 'Percent': 30},\n", + " 'sector_namecode': [{'name': 'Public administration- Other social services',\n", + " 'code': 'BS'},\n", + " {'name': 'General public administration sector', 'code': 'BZ'}],\n", + " 'sectorcode': 'BZ,BS',\n", + " 'source': 'IBRD',\n", + " 'status': 'Active',\n", + " 'supplementprojectflg': 'N',\n", + " 'theme1': {'Name': 'Other economic management', 'Percent': 30},\n", + " 'theme_namecode': [{'name': 'Other economic management', 'code': '24'},\n", + " {'name': 'Social safety nets', 'code': '54'}],\n", + " 'themecode': '54,24',\n", + " 'totalamt': 0,\n", + " 'totalcommamt': 4700000,\n", + " 'url': 'http://www.worldbank.org/projects/P144674?lang=en'}]" ] }, - 
"execution_count": 9, + "execution_count": 11, "metadata": {}, "output_type": "execute_result" } @@ -432,15 +444,26 @@ }, { "cell_type": "code", - "execution_count": 10, - "metadata": { - "collapsed": false - }, + "execution_count": 12, + "metadata": {}, "outputs": [ { "data": { "text/html": [ - "
\n", + "
\n", + "\n", "\n", " \n", " \n", @@ -471,7 +494,7 @@ " \n", " \n", " \n", - " \n", + " \n", " \n", " \n", " \n", @@ -486,8 +509,8 @@ " \n", " \n", " \n", - " \n", - " \n", + " \n", + " \n", " \n", " \n", " \n", @@ -495,7 +518,7 @@ " \n", " \n", " \n", - " \n", + " \n", " \n", " \n", " \n", @@ -510,8 +533,8 @@ " \n", " \n", " \n", - " \n", - " \n", + " \n", + " \n", " \n", " \n", " \n", @@ -523,9 +546,9 @@ "" ], "text/plain": [ - " _id approvalfy board_approval_month \\\n", - "0 {u'$oid': u'52b213b38594d8a2be17c780'} 1999 November \n", - "1 {u'$oid': u'52b213b38594d8a2be17c781'} 2015 November \n", + " _id approvalfy board_approval_month \\\n", + "0 {'$oid': '52b213b38594d8a2be17c780'} 1999 November \n", + "1 {'$oid': '52b213b38594d8a2be17c781'} 2015 November \n", "\n", " boardapprovaldate borrower \\\n", "0 2013-11-12T00:00:00Z FEDERAL DEMOCRATIC REPUBLIC OF ETHIOPIA \n", @@ -535,25 +558,21 @@ "0 2018-07-07T00:00:00Z Federal Democratic Republic of Ethiopia!$!ET \n", "1 NaN Republic of Tunisia!$!TN \n", "\n", - " countrycode countryname countryshortname \\\n", - "0 ET Federal Democratic Republic of Ethiopia Ethiopia \n", - "1 TN Republic of Tunisia Tunisia \n", + " countrycode countryname countryshortname ... \\\n", + "0 ET Federal Democratic Republic of Ethiopia Ethiopia ... \n", + "1 TN Republic of Tunisia Tunisia ... \n", "\n", - " ... sectorcode source \\\n", - "0 ... ET,BS,ES,EP IBRD \n", - "1 ... BZ,BS IBRD \n", - "\n", - " status supplementprojectflg \\\n", - "0 Active N \n", - "1 Active N \n", + " sectorcode source status supplementprojectflg \\\n", + "0 ET,BS,ES,EP IBRD Active N \n", + "1 BZ,BS IBRD Active N \n", "\n", " theme1 \\\n", - "0 {u'Percent': 100, u'Name': u'Education for all'} \n", - "1 {u'Percent': 30, u'Name': u'Other economic man... \n", + "0 {'Name': 'Education for all', 'Percent': 100} \n", + "1 {'Name': 'Other economic management', 'Percent... 
\n", "\n", " theme_namecode themecode totalamt \\\n", - "0 [{u'code': u'65', u'name': u'Education for all'}] 65 130000000 \n", - "1 [{u'code': u'24', u'name': u'Other economic ma... 54,24 0 \n", + "0 [{'name': 'Education for all', 'code': '65'}] 65 130000000 \n", + "1 [{'name': 'Other economic management', 'code':... 54,24 0 \n", "\n", " totalcommamt url \n", "0 130000000 http://www.worldbank.org/projects/P129828/ethi... \n", @@ -562,7 +581,7 @@ "[2 rows x 50 columns]" ] }, - "execution_count": 10, + "execution_count": 12, "metadata": {}, "output_type": "execute_result" } @@ -586,35 +605,662 @@ "3. In 2. above you will notice that some entries have only the code and the name is missing. Create a dataframe with the missing names filled in." ] }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'code': '8', 'name': 'Human development'}, {'code': '11', 'name': ''}]" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import json\n", + "from pandas.io.json import json_normalize\n", + "\n", + "df = pd.read_json('data/world_bank_projects.json')\n", + "#print(df.info())\n", + "df.mjtheme_namecode[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "
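<table>\n",
+ "<tr><th>countryname</th><th>sector</th><th>supplementprojectflg</th><th>projectfinancialtype</th><th>prodline</th><th>mjtheme</th><th>idacommamt</th><th>impagency</th><th>project_name</th><th>mjthemecode</th><th>closingdate</th><th>...</th><th>sector3</th><th>majorsector_percent</th><th>board_approval_month</th><th>theme_namecode</th><th>url</th><th>source</th><th>projectstatusdisplay</th><th>ibrdcommamt</th><th>sector_namecode</th><th>_id</th></tr>\n",
+ "<tr><th>People's Republic of China</th><td>19</td><td>19</td><td>19</td><td>19</td><td>17</td><td>19</td><td>19</td><td>19</td><td>19</td><td>16</td><td>...</td><td>11</td><td>19</td><td>19</td><td>17</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td></tr>\n",
+ "<tr><th>Republic of Indonesia</th><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>15</td><td>...</td><td>10</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td><td>19</td></tr>\n",
+ "<tr><th>Socialist Republic of Vietnam</th><td>17</td><td>16</td><td>17</td><td>17</td><td>17</td><td>17</td><td>17</td><td>17</td><td>17</td><td>14</td><td>...</td><td>10</td><td>17</td><td>17</td><td>17</td><td>17</td><td>17</td><td>17</td><td>17</td><td>17</td><td>17</td></tr>\n",
+ "<tr><th>Republic of India</th><td>16</td><td>16</td><td>16</td><td>16</td><td>16</td><td>16</td><td>15</td><td>16</td><td>16</td><td>13</td><td>...</td><td>9</td><td>16</td><td>16</td><td>16</td><td>16</td><td>16</td><td>16</td><td>16</td><td>16</td><td>16</td></tr>\n",
+ "<tr><th>Republic of Yemen</th><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>8</td><td>...</td><td>4</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td><td>13</td></tr>\n",
+ "<tr><th>People's Republic of Bangladesh</th><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>10</td><td>...</td><td>6</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td></tr>\n",
+ "<tr><th>Nepal</th><td>12</td><td>12</td><td>12</td><td>12</td><td>11</td><td>12</td><td>11</td><td>12</td><td>12</td><td>7</td><td>...</td><td>6</td><td>12</td><td>12</td><td>11</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td></tr>\n",
+ "<tr><th>Kingdom of Morocco</th><td>12</td><td>12</td><td>12</td><td>12</td><td>11</td><td>12</td><td>11</td><td>12</td><td>12</td><td>11</td><td>...</td><td>5</td><td>12</td><td>12</td><td>11</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td><td>12</td></tr>\n",
+ "<tr><th>Republic of Mozambique</th><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>9</td><td>...</td><td>6</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td></tr>\n",
+ "<tr><th>Africa</th><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>8</td><td>11</td><td>11</td><td>7</td><td>...</td><td>7</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td><td>11</td></tr>\n",
+ "</table>\n",
+ "<p>10 rows × 49 columns</p>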
" + ], + "text/plain": [ + " sector supplementprojectflg \\\n", + "countryname \n", + "People's Republic of China 19 19 \n", + "Republic of Indonesia 19 19 \n", + "Socialist Republic of Vietnam 17 16 \n", + "Republic of India 16 16 \n", + "Republic of Yemen 13 13 \n", + "People's Republic of Bangladesh 12 12 \n", + "Nepal 12 12 \n", + "Kingdom of Morocco 12 12 \n", + "Republic of Mozambique 11 11 \n", + "Africa 11 11 \n", + "\n", + " projectfinancialtype prodline mjtheme \\\n", + "countryname \n", + "People's Republic of China 19 19 17 \n", + "Republic of Indonesia 19 19 19 \n", + "Socialist Republic of Vietnam 17 17 17 \n", + "Republic of India 16 16 16 \n", + "Republic of Yemen 13 13 13 \n", + "People's Republic of Bangladesh 12 12 12 \n", + "Nepal 12 12 11 \n", + "Kingdom of Morocco 12 12 11 \n", + "Republic of Mozambique 11 11 11 \n", + "Africa 11 11 11 \n", + "\n", + " idacommamt impagency project_name \\\n", + "countryname \n", + "People's Republic of China 19 19 19 \n", + "Republic of Indonesia 19 19 19 \n", + "Socialist Republic of Vietnam 17 17 17 \n", + "Republic of India 16 15 16 \n", + "Republic of Yemen 13 13 13 \n", + "People's Republic of Bangladesh 12 12 12 \n", + "Nepal 12 11 12 \n", + "Kingdom of Morocco 12 11 12 \n", + "Republic of Mozambique 11 11 11 \n", + "Africa 11 8 11 \n", + "\n", + " mjthemecode closingdate ... sector3 \\\n", + "countryname ... \n", + "People's Republic of China 19 16 ... 11 \n", + "Republic of Indonesia 19 15 ... 10 \n", + "Socialist Republic of Vietnam 17 14 ... 10 \n", + "Republic of India 16 13 ... 9 \n", + "Republic of Yemen 13 8 ... 4 \n", + "People's Republic of Bangladesh 12 10 ... 6 \n", + "Nepal 12 7 ... 6 \n", + "Kingdom of Morocco 12 11 ... 5 \n", + "Republic of Mozambique 11 9 ... 6 \n", + "Africa 11 7 ... 
7 \n", + "\n", + " majorsector_percent board_approval_month \\\n", + "countryname \n", + "People's Republic of China 19 19 \n", + "Republic of Indonesia 19 19 \n", + "Socialist Republic of Vietnam 17 17 \n", + "Republic of India 16 16 \n", + "Republic of Yemen 13 13 \n", + "People's Republic of Bangladesh 12 12 \n", + "Nepal 12 12 \n", + "Kingdom of Morocco 12 12 \n", + "Republic of Mozambique 11 11 \n", + "Africa 11 11 \n", + "\n", + " theme_namecode url source \\\n", + "countryname \n", + "People's Republic of China 17 19 19 \n", + "Republic of Indonesia 19 19 19 \n", + "Socialist Republic of Vietnam 17 17 17 \n", + "Republic of India 16 16 16 \n", + "Republic of Yemen 13 13 13 \n", + "People's Republic of Bangladesh 12 12 12 \n", + "Nepal 11 12 12 \n", + "Kingdom of Morocco 11 12 12 \n", + "Republic of Mozambique 11 11 11 \n", + "Africa 11 11 11 \n", + "\n", + " projectstatusdisplay ibrdcommamt \\\n", + "countryname \n", + "People's Republic of China 19 19 \n", + "Republic of Indonesia 19 19 \n", + "Socialist Republic of Vietnam 17 17 \n", + "Republic of India 16 16 \n", + "Republic of Yemen 13 13 \n", + "People's Republic of Bangladesh 12 12 \n", + "Nepal 12 12 \n", + "Kingdom of Morocco 12 12 \n", + "Republic of Mozambique 11 11 \n", + "Africa 11 11 \n", + "\n", + " sector_namecode _id \n", + "countryname \n", + "People's Republic of China 19 19 \n", + "Republic of Indonesia 19 19 \n", + "Socialist Republic of Vietnam 17 17 \n", + "Republic of India 16 16 \n", + "Republic of Yemen 13 13 \n", + "People's Republic of Bangladesh 12 12 \n", + "Nepal 12 12 \n", + "Kingdom of Morocco 12 12 \n", + "Republic of Mozambique 11 11 \n", + "Africa 11 11 \n", + "\n", + "[10 rows x 49 columns]" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#Find the 10 countries with most projects\n", + "df.groupby('countryname').count().sort_values('sector',ascending=False)[:10]" + ] + }, + { + "cell_type": "code", + 
"execution_count": 52, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'1': 'Economic management', '2': 'Public sector governance', '3': 'Rule of law', '4': 'Financial and private sector development', '5': 'Trade and integration', '6': 'Social protection and risk management', '7': 'Social dev/gender/inclusion', '8': 'Human development', '9': 'Urban development', '10': 'Rural development', '11': 'Environment and natural resources management'}\n", + " 0 \\\n", + "0 {'code': '8', 'name': 'Human development'} \n", + "1 {'code': '1', 'name': 'Economic management'} \n", + "2 {'code': '5', 'name': 'Trade and integration'} \n", + "3 {'code': '7', 'name': 'Social dev/gender/inclu... \n", + "4 {'code': '5', 'name': 'Trade and integration'} \n", + ".. ... \n", + "495 {'code': '4', 'name': 'Financial and private s... \n", + "496 {'code': '8', 'name': 'Human development'} \n", + "497 {'code': '10', 'name': 'Rural development'} \n", + "498 {'code': '10', 'name': 'Rural development'} \n", + "499 {'code': '9', 'name': 'Urban development'} \n", + "\n", + " 1 \\\n", + "0 {'code': '11', 'name': ''} \n", + "1 {'code': '6', 'name': 'Social protection and r... \n", + "2 {'code': '2', 'name': 'Public sector governance'} \n", + "3 {'code': '7', 'name': 'Social dev/gender/inclu... \n", + "4 {'code': '4', 'name': 'Financial and private s... \n", + ".. ... \n", + "495 {'code': '7', 'name': 'Social dev/gender/inclu... \n", + "496 {'code': '5', 'name': 'Trade and integration'} \n", + "497 {'code': '6', 'name': ''} \n", + "498 {'code': '10', 'name': 'Rural development'} \n", + "499 {'code': '8', 'name': 'Human development'} \n", + "\n", + " 2 \\\n", + "0 None \n", + "1 None \n", + "2 {'code': '11', 'name': 'Environment and natura... \n", + "3 None \n", + "4 None \n", + ".. ... 
\n", + "495 None \n", + "496 {'code': '2', 'name': 'Public sector governance'} \n", + "497 None \n", + "498 {'code': '10', 'name': 'Rural development'} \n", + "499 {'code': '5', 'name': 'Trade and integration'} \n", + "\n", + " 3 4 code_0 \\\n", + "0 None None 8 \n", + "1 None None 1 \n", + "2 {'code': '6', 'name': 'Social protection and r... None 5 \n", + "3 None None 7 \n", + "4 None None 5 \n", + ".. ... ... ... \n", + "495 None None 4 \n", + "496 {'code': '8', 'name': 'Human development'} None 8 \n", + "497 None None 10 \n", + "498 None None 10 \n", + "499 {'code': '4', 'name': 'Financial and private s... None 9 \n", + "\n", + " name_0 code_1 \\\n", + "0 Human development 11 \n", + "1 Economic management 6 \n", + "2 Trade and integration 2 \n", + "3 Social dev/gender/inclusion 7 \n", + "4 Trade and integration 4 \n", + ".. ... ... \n", + "495 Financial and private sector development 7 \n", + "496 Human development 5 \n", + "497 Rural development 6 \n", + "498 Rural development 10 \n", + "499 Urban development 8 \n", + "\n", + " name_1 code_2 \\\n", + "0 Environment and natural resources management NaN \n", + "1 Social protection and risk management NaN \n", + "2 Public sector governance 11 \n", + "3 Social dev/gender/inclusion NaN \n", + "4 Financial and private sector development NaN \n", + ".. ... ... \n", + "495 Social dev/gender/inclusion NaN \n", + "496 Trade and integration 2 \n", + "497 Social protection and risk management NaN \n", + "498 Rural development 10 \n", + "499 Human development 5 \n", + "\n", + " name_2 code_3 \\\n", + "0 NaN NaN \n", + "1 NaN NaN \n", + "2 Environment and natural resources management 6 \n", + "3 NaN NaN \n", + "4 NaN NaN \n", + ".. ... ... 
\n", + "495 NaN NaN \n", + "496 Public sector governance 8 \n", + "497 NaN NaN \n", + "498 Rural development NaN \n", + "499 Trade and integration 4 \n", + "\n", + " name_3 code_4 name_4 \n", + "0 NaN NaN NaN \n", + "1 NaN NaN NaN \n", + "2 Social protection and risk management NaN NaN \n", + "3 NaN NaN NaN \n", + "4 NaN NaN NaN \n", + ".. ... ... ... \n", + "495 NaN NaN NaN \n", + "496 Human development NaN NaN \n", + "497 NaN NaN NaN \n", + "498 NaN NaN NaN \n", + "499 Financial and private sector development NaN NaN \n", + "\n", + "[500 rows x 15 columns]\n", + "{'1': 38, '2': 199, '3': 15, '4': 146, '5': 77, '6': 168, '7': 130, '8': 210, '9': 50, '10': 216, '11': 250}\n" + ] + } + ], + "source": [ + "#Find the top 10 major project themes (using column 'mjtheme_namecode')\n", + "dd=pd.DataFrame(df.mjtheme_namecode) #only focus on this column\n", + "\n", + "#let's create one column for each code and name for each project\n", + "norm1=pd.json_normalize(data=dd['mjtheme_namecode'])\n", + "\n", + "for i in range(5): #5 is number of columns in norm1\n", + " n='name'+str(i)\n", + " norm1[('code_'+str(i))]=pd.json_normalize(norm1.iloc[:,i])['code']\n", + " norm1[('name_'+str(i))]=pd.json_normalize(norm1.iloc[:,i])['name']\n", + " \n", + "norm1.dropna(axis=1, how='all',inplace=True)#drop empty columns\n", + "norm1.dropna(subset=\"code_0\",inplace=True) #drop rows with nan in code_0\n", + "\n", + "#fill-in missing names\n", + "\n", + "#first, figure out code/name pairs\n", + "pairs = {}\n", + "for i in range(11): #we know there are 11 codes\n", + " for j in range(len(norm1)):\n", + " if int(norm1.loc[j,'code_0'])==i+1 and norm1.loc[j,'name_0'] != \"\":\n", + " pairs[str(i+1)]=norm1.loc[j,'name_0']\n", + " break\n", + "print(pairs)\n", + "#now fill in names\n", + "for j in range (5):\n", + " for i in range(len(norm1)):\n", + " n='name_'+str(j)\n", + " c=\"code_\"+str(j)\n", + " if norm1.loc[i,c] != \"\" and pd.notna(norm1.loc[i,c]):\n", + " if norm1.loc[i,n]==\"\": 
norm1.loc[i,n]= pairs[norm1.loc[i,c]]# we had a code but missing name\n", + "print(norm1)\n", + "\n", + "# the question is a bit ambiguous: we count how often each theme code occurs across all code columns,\n", + "# since a given project (row) may have more than one theme attached to it.\n", + "pairCount=dict(pairs) # copy of pairs to re-use its keys without overwriting the code/name mapping; the values will hold counts\n", + "for i in range(len(pairCount)):\n", + " pairCount[str(i+1)]=0\n", + "for j in range (5):\n", + " for i in range(len(norm1)):\n", + " n='name_'+str(j)\n", + " c=\"code_\"+str(j)\n", + " if pd.notna(norm1.loc[i,c]):\n", + " pairCount[norm1.loc[i,c]]+=1\n", + "print(pairCount)" + ] + }, { "cell_type": "code", "execution_count": null, - "metadata": { - "collapsed": true - }, + "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { - "display_name": "Python 2", + "display_name": "Python 3 (ipykernel)", "language": "python", - "name": "python2" + "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", - "version": 2 + "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", - "pygments_lexer": "ipython2", - "version": "2.7.9" + "pygments_lexer": "ipython3", + "version": "3.9.12" } }, "nbformat": 4, - "nbformat_minor": 0 + "nbformat_minor": 1 }
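The theme exercise in the notebook cell above (flatten `mjtheme_namecode`, fill in missing names, count themes) can also be sketched more compactly with `pd.json_normalize` and its `record_path` argument. This is a minimal sketch, not the patch author's method: the `records` list is a hypothetical sample shaped like entries of `world_bank_projects.json`, and `fill_theme_names` is an illustrative helper name that does not appear in the notebook.

```python
import pandas as pd

# Hypothetical sample shaped like world_bank_projects.json entries (assumption,
# not the real dataset): each project carries a list of {code, name} themes.
records = [
    {"countryname": "Ethiopia", "mjtheme_namecode": [
        {"code": "8", "name": "Human development"},
        {"code": "11", "name": ""}]},
    {"countryname": "Tunisia", "mjtheme_namecode": [
        {"code": "1", "name": "Economic management"},
        {"code": "6", "name": "Social protection and risk management"}]},
    {"countryname": "Ethiopia", "mjtheme_namecode": [
        {"code": "11", "name": "Environment and natural resources management"},
        {"code": "8", "name": ""}]},
]


def fill_theme_names(projects):
    """Return one row per (project, theme) with empty names filled from the code."""
    # Flatten the nested lists: one row per theme entry, columns 'code' and 'name'.
    themes = pd.json_normalize(projects, record_path="mjtheme_namecode")
    # Build a code -> name lookup from the rows that do carry a name.
    named = themes[themes["name"] != ""]
    lookup = named.drop_duplicates(subset="code").set_index("code")["name"]
    # Overwrite every name from the lookup; codes never seen with a name become NaN.
    themes["name"] = themes["code"].map(lookup)
    return themes


themes = fill_theme_names(records)
theme_counts = themes["name"].value_counts().head(10)      # top themes by occurrence
country_counts = pd.DataFrame(records)["countryname"].value_counts().head(10)
print(theme_counts)
print(country_counts)
```

Working on the long (one row per theme) table avoids the hard-coded `range(5)` and `range(11)` loops, at the cost of losing the per-project column layout that `norm1` keeps.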