EMU COSC 146 Applied Programming and Scripting
In Python
Here are some web pages that you may need to refer to:
- Python.org
- Beginners Guide https://wiki.python.org/moin/BeginnersGuide/
- Python for Non-Programmers https://wiki.python.org/moin/BeginnersGuide/NonProgrammers
- Our Book
- Source code download https://www.dummies.com/store/product/Beginning-Programming-with-Python-For-Dummies-2nd-Edition.productCd-1119457890,navId-322468,descCd-DOWNLOAD.html
- John Mueller’s blog posts related to the book http://blog.johnmuellerbooks.com/category/technical/beginning-programming-with-python-for-dummies/
- Jupyter
- Examples https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks#social-data (The whole page is interesting, not just the social-data Jupyter examples)
- Bloom’s Taxonomy
- Bloom’s Taxonomy of Learning Domains http://www.nwlink.com/~donclark/hrd/bloom.html
- Question Starters https://education.illinoisstate.edu/downloads/casei/5-02-Revised%20Blooms.pdf
- Dummies Cheat Sheets
- Data for projects
- Wikipedia list of datasets for machine-learning https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research#Games
- https://github.com/jdorfman/awesome-json-datasets
- https://github.com/awesomedata/awesome-public-datasets
- https://www.kaggle.com/datasets
- https://github.com/wtsxDev/Machine-Learning-for-Cyber-Security
- Air Now API https://docs.airnowapi.org/
- https://www.ncdc.noaa.gov/cdo-web/datasets
- JSON resources https://github.com/burningtree/awesome-json
- Jupyter examples https://github.com/josephcslater/JupyterExamples/blob/master/SymPy_Multiple_Eqn_solution.ipynb
- Datasets for student projects
- George Cowan (This is an example of the kind of information that would be appropriate. You can ask for this dataset if you like, but you must also include at least one additional dataset.)
- Example WHO cumulative Corona Virus cases by date
- https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset
- Download the datasets by clicking on the Download button under the header image. I have only looked at one of the files: covid_19_data.csv. Its column descriptions can be found by expanding the Description section that is a little below the download button. The file seems to be updated almost daily.
- I think this would be an excellent dataset to explore because it is such an important topic, even though it is not related to my major. Categorical fields would be Country/Region, and perhaps Province/State. There are several numerical fields. Graphing the growth of cases over time for a selected state like Michigan would be possible, but the date field looks different from what we have studied. I may need some help there.
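The date question raised for covid_19_data.csv could be approached with a minimal sketch like the one below, using only the standard library. It assumes the file has columns named ObservationDate (in MM/DD/YYYY form), Province/State, and Confirmed; check the actual header row before relying on those names. The sample rows are made-up placeholders, not real case counts, and cases_by_date is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime

# Made-up placeholder rows standing in for covid_19_data.csv.
# The column names and the MM/DD/YYYY date format are assumptions.
SAMPLE = """ObservationDate,Province/State,Country/Region,Confirmed
03/10/2020,Michigan,US,2
03/11/2020,Michigan,US,5
03/11/2020,Ohio,US,4
03/12/2020,Michigan,US,9
"""

def cases_by_date(csv_file, state, date_format="%m/%d/%Y"):
    """Return a sorted list of (date, confirmed) pairs for one state."""
    rows = []
    for row in csv.DictReader(csv_file):
        if row["Province/State"] == state:
            # strptime turns the text date into a real date object
            day = datetime.strptime(row["ObservationDate"], date_format).date()
            # float() first, in case the count is stored as "5.0"
            rows.append((day, int(float(row["Confirmed"]))))
    return sorted(rows)

michigan = cases_by_date(io.StringIO(SAMPLE), "Michigan")
for day, confirmed in michigan:
    print(day.isoformat(), confirmed)
```

With the real file, pass open("covid_19_data.csv", newline="") in place of the io.StringIO object. The resulting list of pairs can then be handed to whatever graphing tool the class is using.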
- Hunter Vayda-Ramirez: I would like to use
- https://github.com/awesomedata/awesome-public-datasets by downloading the titanic.csv.zip file.
- I thought this dataset was very cool! It contains both categorical and numerical data, and it appears to be information about the Titanic’s passengers. If it’s true, it’s pretty darn cool! It relates to my statistics work because it is a dataset, and it also relates to my professional writing major: it is a great example of multimodal design, using a form of media to convey information to an audience.
- https://github.com/GeostatsGuy/GeoDataSets/blob/master/dataset_101b.csv Click the Raw button, press Control-A to select the text, and paste it into a text editor. (Thank you George!)
- I would like to try the GeostatsGuy dataset, but I need help making sure I am getting the information correctly, since I didn’t see a file I could download. I am sure I could figure it out, but I would like clarification. There are other datasets available for these studies that I would find equally fun. This dataset has numerical fields, and it appears to be earthquake data. It was intended for a geostats class, so that would be my guess! It relates to my major because I am a Stats major. Plus it looks fun!
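As an alternative to copying and pasting from the Raw view, the file could probably be fetched directly. The sketch below assumes that GitHub serves raw files from raw.githubusercontent.com using the same path with the "/blob/" segment removed, which is where the Raw button leads at the time of writing. The function names github_raw_url and download are hypothetical helpers, not part of any library.

```python
import urllib.request

def github_raw_url(blob_url):
    """Convert a github.com 'blob' page URL into its raw-file URL."""
    prefix = "https://github.com/"
    if not blob_url.startswith(prefix) or "/blob/" not in blob_url:
        raise ValueError("not a GitHub blob URL: " + blob_url)
    # Drop the site prefix, then remove the first "/blob/" path segment.
    path = blob_url[len(prefix):].replace("/blob/", "/", 1)
    return "https://raw.githubusercontent.com/" + path

def download(url, filename):
    """Save the contents of url to a local file (needs network access)."""
    urllib.request.urlretrieve(url, filename)

page = "https://github.com/GeostatsGuy/GeoDataSets/blob/master/dataset_101b.csv"
raw = github_raw_url(page)
print(raw)
# download(raw, "dataset_101b.csv")  # uncomment to fetch the file
```

The download call is left commented out so the example runs without a network connection.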
- Cody Sharp
- Yu-Gi-Oh! Trading Cards Dataset
- https://www.kaggle.com/tathor/yugioh-trading-cards-dataset/data
- YuGiOh is a card game that I have a lot of interest in and I think it would be fun to explore the most common card type or monster attributes given the over 6,000 cards included in this data set.
- Video Game Sales Dataset
- https://www.kaggle.com/gregorut/videogamesales
- I think this would be an interesting dataset since it details the sales in different regions of the world and can illustrate which games are most popular in these regions.
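Finding the most common card type, or the most common value in any categorical column, might be sketched as follows with collections.Counter. The column name "type" and the sample rows are made-up placeholders; the real dataset's header row may use different names. most_common_values is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Made-up placeholder rows; the column name "type" is an assumption.
SAMPLE = """name,type
Card A,Effect Monster
Card B,Spell Card
Card C,Effect Monster
Card D,Trap Card
"""

def most_common_values(csv_file, column, n=3):
    """Return the n most frequent values in one column, with their counts."""
    counts = Counter(row[column] for row in csv.DictReader(csv_file))
    return counts.most_common(n)

top = most_common_values(io.StringIO(SAMPLE), "type")
print(top)
```

The same function would work for the video game sales data by passing a different column name, for example a platform or genre column if one exists.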
- Brendan Carr
- Summer Olympic Medals (1976 – 2008)
- APPROVED PROJECT DATASET
- https://www.kaggle.com/divyansh22/summer-olympics-medals
- Click the download button underneath the header image to download a zip folder containing the dataset. The dataset is named Summer-Olympic-medals-1976-to-2008.csv.
- The categorical fields are City, Year, Sport, Discipline, Event, Athlete, Gender, Country_Code, Country, Event_Gender, and Medal. I chose this dataset as my first choice even though it is not related to my major because it is interesting to see the data for all the medals in the Summer Olympics from 1976 to 2008.
- There are no numerical fields in this dataset.
- World Happiness Report
- https://www.kaggle.com/unsdsn/world-happiness
- Click the download button underneath the header image to download a zip folder containing the datasets. I was going to focus on the 2019.csv dataset.
- The fields are Overall Rank, Country or Region, Score, GDP per Capita, Social Support, Healthy Life Expectancy, Freedom to make life choices, Generosity, and Perceptions of Corruption. Country or Region is the only categorical field; the others hold numbers. Each factor from GDP per Capita to Perceptions of Corruption has a number that contributes to each country’s happiness score, which ranges from 0 to 10.
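One way to check which fields in a CSV file hold numbers and which hold categories is to try converting every value in a column to a number. The sketch below is a rough heuristic under that assumption, not a definitive rule: a column such as Year or a rank stores numbers but may still be treated as a category in an analysis. The sample rows and column names are made-up placeholders, and classify_columns is a hypothetical helper name.

```python
import csv
import io

# Made-up placeholder rows; not taken from either dataset.
SAMPLE = """Country,Year,Medal,Score
Placeholder A,1976,Gold,7.5
Placeholder B,1980,Silver,6.1
"""

def is_number(text):
    """Return True if text can be read as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def classify_columns(csv_file):
    """Label each column 'numerical' or 'categorical' by its values."""
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    kinds = {}
    for name in reader.fieldnames or []:
        # Ignore blank cells so missing data does not decide the label.
        values = [row[name] for row in rows if row[name] != ""]
        numeric = bool(values) and all(is_number(v) for v in values)
        kinds[name] = "numerical" if numeric else "categorical"
    return kinds

kinds = classify_columns(io.StringIO(SAMPLE))
print(kinds)
```

With a downloaded file, pass open("2019.csv", newline="") or the Olympic medals file in place of the io.StringIO object.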
- Abigail Hickey
- Tesla stock data from 2010 to 2020
- TSLA has been on the rise recently, with a crazy +100% spike in the last 30 days alone. With the history, maybe we can find out why?
- Link to download of dataset https://www.kaggle.com/timoboz/tesla-stock-data-from-2010-to-2020
- Videogame Sales
- This dataset contains a list of video games with sales greater than 100,000 copies. It was generated by a scrape of vgchartz.com.
- The script to scrape the data is available at https://github.com/GregorUT/vgchartzScrape. It is based on BeautifulSoup using Python. There are 16,598 records. 2 records were dropped due to incomplete information.
- Download dataset here: https://www.kaggle.com/gregorut/videogamesales
- Students Performance in Exams
- Context: Marks secured by the students.
- Content: This data set consists of the marks secured by the students in various subjects.
- Acknowledgements: http://roycekimmons.com/tools/generated_data/exams
- Inspiration: To understand the influence of the parents’ background, test preparation, etc. on students’ performance.
- Download Dataset here: https://www.kaggle.com/spscientist/students-performance-in-exams
- Alexis Gipson
- Novel Coronavirus (Deaths or Recovered) (2019-2020, ongoing)
- https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series
- Personally, this would be my first choice. It is a very important focus in our lives today, and working on it would be both interesting and a way to teach myself more about it statistically.
- Air Pollution of PM2.5 in Korea by ug/m3 (2015 - 2019, but more specifically 2019)
- http://kosis.kr/eng/statisticsList/statisticsListIndex.do?menuId=M_01_01&vwcd=MT_ETITLE&parmTabId=M_01_01#SelectStatsBoxDiv
- I know this is a weird topic overall, but it was only while I spent time in Korea last semester that I realized how bad the air quality can get in a given season or month, so I thought it would be interesting to compare.
- Cody Makinson
- I want to analyze NHL data from the birth of the NHL’s Original Six hockey teams. https://www.justsportsstats.com/hockeyindex.php?league=NHL I just need to download it in Excel format.
- I really enjoyed Pokémon growing up, so I would use the data we previously used in an assignment. I want to see if there is a type bias, among other statistical analyses. https://pokemondb.net/pokedex/all (I would have to play around a little to turn it into a .csv file.)
- Weather data. It is cyclical and a different kind of data than the other two. PRECIP_HLY_sample_csv
- My first choice is analyzing the hockey data. I have all the data in an Excel file. I will put all the raw data in one sheet and convert it to a .csv file.