Learn the basics of
If you want to become a data analyst pro, this course is for you. It will help you analyse data to make better decisions and give you a solid foundation in statistics so that you can interpret your data better. Before taking this course, you should be familiar with organizing and summarizing data using Excel analytic tools such as tables, pivot tables, and pivot charts. You should also be comfortable with (or willing to try) creating complex formulas and visualizations. By the end of this course, you will be able to conduct data analysis using three popular data analysis tools: Excel, R and Tableau.
1.Starting Your Data Analyst Journey
Your journey to becoming a data analyst starts with understanding more about data and how it is analysed. This lesson is geared to help you understand why data analysis is an important skill and where it can be used to improve your business decision-making. Each lesson in this module is carefully balanced between theory and practicals, and in this lesson, you will learn how to import and clean data using a variety of methods and tools. You will focus on logical checks that guide you towards thinking about data in a more logical fashion.
In this lesson, you will begin to understand data in a bit more detail. The aim is to help you understand the different data types (such as categorical vs numerical) and how to read graphically represented data. You will also learn how to describe data using descriptive statistics and how to use those statistics.
As your journey continues, you will learn how to install the Data Analysis ToolPak and work with some descriptive statistics. This lesson will touch briefly on the basics of probability (with specific reference to Bayes' Theorem) and delve into the details of the mean and variance of random variables. This topic very neatly ties together a concept that you have previously covered (the mean) with one you are yet to cover (variance).
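As a rough illustration of the ideas named above, here is a minimal Python sketch of Bayes' Theorem and of the mean and variance of a discrete random variable. The course itself works in Excel at this point, so the language, the function names and every number below are assumptions chosen only for illustration.

```python
from fractions import Fraction


def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """P(A | B) by Bayes' Theorem, where B is a positive signal for event A."""
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence


def mean_and_variance(distribution):
    """Mean and variance of a discrete random variable given as {value: probability}."""
    mean = sum(value * p for value, p in distribution.items())
    variance = sum(p * (value - mean) ** 2 for value, p in distribution.items())
    return mean, variance


# Hypothetical inputs: 1% prior, 90% true-positive rate, 5% false-positive rate.
posterior = bayes_posterior(Fraction(1, 100), Fraction(9, 10), Fraction(1, 20))

# A fair six-sided die as the random variable.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean, die_variance = mean_and_variance(die)

print(posterior, die_mean, die_variance)
```

Exact fractions are used so that the results can be checked by hand; with these assumed inputs the posterior works out to 2/13, even though the signal is right 90% of the time.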
Lesson 4 is all about how data is distributed. You will learn about the various data distributions (with reference to the Central Limit Theorem) and understand how to use the mean, median and standard deviation to see how your data is distributed. Lastly, you will learn about skewness and kurtosis.
5.How Confident Are You in the Sample
Being confident in your sample is important. This lesson will focus on understanding the difference between a sample and a population as well as when to use variance or standard deviation for each. You will also cover confidence intervals in more detail, and by the end of this lesson, you will be well on your way to feeling more confident!
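The sample-versus-population distinction can be sketched in a few lines of Python using only the standard library. The data values are hypothetical, and the interval uses a normal approximation as a simplifying assumption; with a sample this small, a t-based interval would usually be the more careful choice.

```python
import math
import statistics
from statistics import NormalDist

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

# Population variance divides by n; sample variance divides by n - 1.
population_variance = statistics.pvariance(sample)
sample_variance = statistics.variance(sample)
sample_sd = statistics.stdev(sample)

# Approximate 95% confidence interval for the mean (normal approximation).
mean = statistics.mean(sample)
z = NormalDist().inv_cdf(0.975)
margin = z * sample_sd / math.sqrt(len(sample))
lower, upper = mean - margin, mean + margin

print(population_variance, sample_variance)
print(lower, upper)
```

The sample variance comes out larger than the population variance for the same numbers, because dividing by n - 1 compensates for estimating the mean from the data itself.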
6.Hypothesising About the Outcome
Understanding what a hypothesis is marks an important step in your journey. This lesson will expand on what null and alternative hypotheses are and explore the difference between Type 1 and Type 2 errors. This lesson will also include more information on the Central Limit Theorem and the law of large numbers.
7.Testing for Differences: Categorical Variables
The penultimate lesson focuses on testing for differences in categorical variables. You will explore one-sample tests, the difference between two means of two populations, and Chi-square tests.
8.Testing for Differences: Numerical Variables
This module wraps up with testing for differences in numerical variables. In this lesson, one-sample tests, the difference between two means of two populations, and t-tests will be covered. By the end of this lesson, you will have a firm understanding of the basics of data and data analysis. However, the journey does not stop here, and in Module 2 you can expect more complex concepts and a deeper understanding of the topic.
In this lesson, we add a new tool to our data analyst toolkit, called R. We will go through the basic steps of downloading and installing the tool and start exploring some of the packages that are available today in R. We will end the lesson by introducing another common method to estimate population parameters, the maximum likelihood method.
The first topic will introduce the brilliant tidyverse package by Hadley Wickham, the chief data scientist at RStudio. Thereafter, we will use R to reproduce some of the exploratory data analysis we did with the Titanic dataset in Excel in Module 1. We will end this lesson with a short and sweet introduction to merging and joining datasets.
3.Introduction to Linear Regression
The first topic for this lesson will introduce linear regression; thereafter, we will dive deeper into understanding the concept of correlation. We will end this lesson by going back to basics with vectors and factors in R.
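To make the link between linear regression and correlation concrete, here is a minimal least-squares sketch. It is written in Python rather than the R used in this module, purely so that it runs with the standard library alone; the helper names and data points are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
r = pearson_r(xs, ys)
print(slope, intercept, r)
```

Both calculations share the same cross-product term, which is why the sign of the fitted slope always matches the sign of the correlation.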
4.Linear Regression Continued
This lesson will continue to broaden our understanding of linear regression and data frames. We will understand what it means for the model to fit the data well and gain some further insight into treating data in R. We will end the lesson by exploring some basics surrounding date values in R.
5.Dates and Times
Lesson 5 will continue to broaden our understanding of dealing with dates and times in R. Many datasets contain dates and times, and we need to make handling them as simple and effective as possible within our data analytics arsenal. Therefore, we will continue building on date and time data wrangling throughout a large part of this lesson. We will end the lesson by introducing time series analysis concepts.
6.Time Series Analysis
This lesson will delve deeper into time series analysis. We will further discuss the time series concepts introduced in the previous lesson and add some new ones. Thereafter, we will break down some of the time series models to understand them a bit better. We will end the lesson by looking at how we can apply these concepts in R in a more practical way.
7.Multiple Linear Regression
In this lesson, we will elaborate on the principles of multiple linear regression. We will talk more about the assumptions that accompany linear regression, how to simplify a multiple linear regression model, and problems that can occur when fitting a multiple linear regression model to the data. Thereafter, we will discuss what happens if your model does not fit a linear trend well, in other words, if the data is non-linear. The lesson will end with an introduction to logistic regression.
8.Introduction to Logistic Regression
In this lesson, we will elaborate on the introduction to logistic regression from Lesson 7. We will better understand when to use this model and how to interpret the outcome. Thereafter, we will elaborate on the model fit statistics we briefly touched on in previous lessons, such as the AIC and BIC statistics. We will end the lesson by cementing all the knowledge we have gained through Module 2 with a practical demonstration.