There are three options; you can choose any one of them (there are no restrictions on which).

  1. Bring your own data from work. You can remove any private or confidential information; for example, if you are bringing sales or cost data for an item, product, or service, the name can be masked.
  2. Use data from a previous job or company that you still have access to (again, you can remove any private or confidential information).
  3. Use data from the public domain. In today’s world, there is no shortage of structured data. Here are some places where you can get data:
    • Any data source you have access to, such as the Hawkes Learning resources
      • Datasets (1) from Hawkes
      • Datasets (2) from Hawkes – Look at the additional datasets, not the chapter datasets
    • U.S. Bureau of Labor Statistics
    • U.S. Government’s open data
    • Centers for Medicare & Medicaid Services
    • Kaggle datasets
    • WHO data repository
    • World Bank Data
    • Google Public Data Explorer
      • It offers excellent visualizations and graphics.
      • But remember, we need the data itself to do the analysis. If you look at the bottom of any figure, Google provides the source name, and you can retrieve the data from there.
    • Any sports data from the appropriate website. Getting several years of data in a structured format might be a challenge, but with a few minutes to an hour of effort you can do it.
      • For example, cricket data can be obtained from ESPNcricinfo.

Grading Rubric


  • Keep it to 1.5 to 2 pages at most.
  • Describe your source of data (including the data fields you have) and what you want to accomplish based on the topics you have learned.
  • You can state the research hypothesis you plan to check, the confidence intervals you plan to estimate, or any relationship between variables that you think is important and plan to test.
  • Remember: I need at least a plan based on the first three modules (see examples). No analysis is needed at this stage, just what you plan to do.

I will provide feedback to each of you within 4 days (if you submit early, you get your feedback early). If I feel any change is needed, I will indicate that.

How the 15 points are given:

  • Your Data: 5 points (Note: Remember, the sample size should be at least 30 data points to do any parametric tests; aim for at least 50 to 100+ data points for a Master’s-level project)
  • Your plan of action: 10 points
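
If you want a quick check of your data before writing the proposal, the following is a minimal sketch in Python (an assumption on my part; any tool you are comfortable with works). It lists the data fields in a CSV file and compares the number of rows against the sample-size guidance above. The function name `describe_data`, the column names, and the sample rows are all hypothetical and made up for illustration; swap in your own file contents.

```python
import csv
import io

MIN_PARAMETRIC = 30  # minimum sample size stated in the rubric
TARGET = 50          # lower end of the suggested 50 to 100+ range


def describe_data(csv_text):
    """Return the data fields, the row count, and a sample-size verdict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)                # one dict per data row
    fields = reader.fieldnames or []   # column names from the header row
    n = len(rows)
    if n < MIN_PARAMETRIC:
        verdict = "too small for parametric tests"
    elif n < TARGET:
        verdict = "meets the minimum, below the suggested target"
    else:
        verdict = "meets the suggested target"
    return fields, n, verdict


# Hypothetical, made-up data used only to show the call.
sample = "item,monthly_sales\n" + "\n".join(
    f"Item {i},{100 + i}" for i in range(40)
)
fields, n, verdict = describe_data(sample)
print(fields, n, verdict)
```

The field list and row count it reports are the same details the proposal asks you to describe under "Your Data".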