Statistics is the study of data. It is divided into two main areas: descriptive statistics, which describes the properties of the data, and inferential statistics, which draws conclusions about a population of interest from information extracted from a sample. Statistics lets you work with data in a much more information-driven and targeted way. Diagnostic Analytics takes descriptive data a step further and helps you understand why something happened in the past.

Population: All the elements that are the subject of a study, whatever they are: parts made in a factory, animals, data of any type, and so on. Variance: The average squared difference of the values from the mean, which measures how spread out a set of data is relative to the mean. If several values tie for the most frequent in the data, we have a multimodal distribution.

Conditional Probability: P(A|B) is the probability of event A occurring given that event B has occurred. Poisson Distribution: The distribution that expresses the probability of a given number of events k occurring in a fixed interval of time, if these events occur with a known constant average rate λ and independently of the time since the last event. In regression, check the residuals for normality.
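The variance, multimodal, and Poisson definitions above can be sketched with Python's standard library. This is only an illustration: the data values and the function name poisson_pmf are made up for the example, and the variance shown is the population form (dividing by the number of values), which matches the "average squared difference" wording.

```python
import math
import statistics


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the average rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Illustrative data, not taken from any real study.
values = [2, 4, 4, 6, 6, 8]

# Variance: average squared difference of the values from the mean.
mean = sum(values) / len(values)
variance = sum((x - mean) ** 2 for x in values) / len(values)

# 4 and 6 tie for the highest frequency, so this data set is multimodal.
modes = statistics.multimode(values)

print(f"variance = {variance:.3f}")
print(f"modes = {modes}")
print(f"P(k=2 | rate=3) = {poisson_pmf(2, 3.0):.4f}")
```

The hand-written variance agrees with statistics.pvariance; statistics.variance would divide by n - 1 instead, which is the usual choice when the values are a sample rather than the whole population.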
Statistics is a branch of mathematics that deals with the collection, organization, presentation, analysis, and interpretation of numerical data. Descriptive statistics is a group of statistical measurements used to describe the basic features of the data in a study. Inferential statistics aims to infer or make interpretations and arrive at a concluding statement. Data collection is an essential process in statistics that refers to the gathering of data. Statistics is a key part of data science, which is a multidisciplinary blend of data inference, algorithm development, and technology used to solve analytically complex problems. Statistics is also used to answer long-range planning questions.

In a study of 200 students, for example, the population is the set of all students, that is, the 200 students. Categorical data can be nominal (no order) or ordinal (ordered data).

The steps of a regression analysis are:
Step 1: Understand the model description, causality, and directionality.
Step 2: Check the data: categorical data, missing data, and outliers.
Step 3: Simple analysis. Check the effect of each independent variable on the dependent variable, and of the independent variables on each other.
Step 4: Multiple linear regression. Check the model and the correct variables.
Step 6: Interpretation of the regression output.

Normal/Gaussian Distribution: The curve of the distribution is bell-shaped and symmetrical. It is related to the Central Limit Theorem, which states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets larger. Independent events satisfy P(A∩B) = P(A)P(B); where P(A) ≠ 0 and P(B) ≠ 0, this is equivalent to P(A|B) = P(A) and P(B|A) = P(B). Significance Level and Rejection Region: The rejection region depends on the significance level. Correlation: Measures the relationship between two variables and ranges from -1 to 1; it is the normalized version of covariance.
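To make "the normalized version of covariance" concrete, here is a minimal sketch in plain Python. The function names and the hours/scores data are hypothetical, and the sample form of covariance (dividing by n - 1) is an assumption; the n - 1 factors cancel in the correlation either way.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by both standard deviations; lies between -1 and 1."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Made-up data: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = correlation(hours, scores)
print(f"covariance = {covariance(hours, scores):.2f}, correlation = {r:.3f}")
```

Covariance depends on the units of both variables, while the correlation does not, which is why the correlation is easier to compare across data sets.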
Understanding the fundamentals of statistics is a core capability for becoming a Data Scientist. A solid understanding of statistics is also crucially important in helping us better understand finance.

Probability is concerned with the outcome of trials. Mutually Exclusive Events: Two events are mutually exclusive if they cannot both occur at the same time.

Population: A complete set of data which we wish to study or analyze. This statistical concept should not be confused with the population of a city, for example. Sample: A representative group drawn from the population.

An independent variable is a variable that is controlled in a scientific experiment to test the effects on the dependent variable. A dependent variable is the variable being measured in a scientific experiment. One-way ANOVA compares the means of two or more independent groups using only one independent variable. Kurtosis: A measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution.

The significance level is denoted by α and is the probability of rejecting the null hypothesis when it is true. P-value: The probability of the test statistic being at least as extreme as the one observed, given that the null hypothesis is true. When p-value > α, we fail to reject the null hypothesis; when p-value ≤ α, we reject the null hypothesis and can conclude that we have a significant result.
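The p-value rule above can be illustrated with a small coin-flip sketch. The scenario is an assumption made for the example: the null hypothesis is that a coin is fair, and "at least as extreme" is taken one-sided (at least as many heads as observed). The function names are hypothetical.

```python
from math import comb


def binomial_p_value(heads: int, flips: int) -> float:
    """One-sided p-value: P(at least `heads` heads in `flips` fair-coin flips)."""
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Reject when p-value <= alpha; otherwise fail to reject."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Illustrative observation: 9 heads in 10 flips.
p = binomial_p_value(heads=9, flips=10)
print(f"p-value = {p:.4f} -> {decide(p)}")
```

With 9 heads in 10 flips the p-value is 11/1024, which is below α = 0.05, so the null hypothesis is rejected; with 7 heads it is 176/1024 and the test fails to reject.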
Descriptive statistics is one of the two basic types of statistics. It is used for the collection, summarization, presentation, and analysis of data. Statistical features are probably the most used statistics concept in data science. Cases and variables can be arranged in a so-called data matrix. Sampling is the process by which numerical values are selected from the population.

By Shirley Chen, MSBA in ASU | Data Analyst.

Range: The difference between the highest and lowest value in the dataset. Covariance: A quantitative measure of the joint variability between two or more variables. Probability Mass Function (PMF): A function that gives the probability that a discrete random variable is exactly equal to some value.

Null Hypothesis: A general statement that there is no relationship between two measured phenomena or no association among groups. A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. It tests the mean of a distribution when the population variance is already known.
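As a sketch of the Z-test described above, the following computes a one-sample test statistic and a two-sided p-value using Python's standard library. The function name z_test and all the numbers are illustrative assumptions, not values from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, pop_mean, pop_sd, n):
    """Two-sided one-sample z-test with a known population standard deviation."""
    # Standardize the sample mean under the null hypothesis.
    z = (sample_mean - pop_mean) / (pop_sd / sqrt(n))
    # Probability of a statistic at least this far from zero in either tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up example: 100 observations averaging 103 against a hypothesized mean of 100.
z, p = z_test(sample_mean=103.0, pop_mean=100.0, pop_sd=15.0, n=100)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

Here the population standard deviation (15) is treated as known, which is the condition that makes a Z-test appropriate.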
The primary role of statistics is to provide decision makers with methods for obtaining and analyzing information to help make decisions. Statistics also plays a central role in decision making for business and government, including marketing, strategic planning, manufacturing, and finance. Prescriptive Analytics provides recommendations regarding actions that will take advantage of the predictions and guide the possible actions toward a solution. Such analytics provide companies with actionable insights based on the information.

A trial is an event whose outcome is unknown. Independent Events: Two events are independent if the occurrence of one does not affect the probability of occurrence of the other. Bayes' Theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event.

Uniform Distribution: Also called a rectangular distribution, a probability distribution where all outcomes are equally likely. Exponential Distribution: A probability distribution of the time between the events in a Poisson point process. Chi-Square Distribution: The distribution of the sum of squared standard normal deviates.

A statistical test is the way to find out whether experimental results are significant. A T-test is a statistical test that compares means, for example to determine whether a sample mean differs from a population mean, and is typically used when the population variance is unknown. Independent samples must come from two completely different populations. Paired sample means that we collect data twice from the same group, person, item, or thing. Two-way ANOVA is the extension of one-way ANOVA using two independent variables to calculate the main effect and interaction effect. Chi-Square Test for Independence compares two sets of categorical data to see if there is a relationship.
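The Chi-Square Test for Independence mentioned above can be sketched as follows. The function name and the table of counts are hypothetical. The statistic works for any table size, but the closed-form p-value used here is valid only for one degree of freedom (a 2×2 table); for larger tables a chi-square table or a statistics library would be needed.

```python
from math import erfc, sqrt


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Made-up 2x2 table: two groups (rows) by two outcomes (columns).
observed = [[20, 30], [30, 20]]
stat, dof = chi_square_independence(observed)

# Upper-tail probability of a chi-square variable with ONE degree of freedom.
p_value = erfc(sqrt(stat / 2))
print(f"chi-square = {stat:.2f}, dof = {dof}, p-value = {p_value:.4f}")
```

The statistic is a sum of squared, scaled deviations between observed and expected counts, which is why it is compared against the chi-square distribution.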