5.6.2 Module 5 Quiz – Analyze the Data Using Statistics Exam Answers Full 100% | Data Analytics Essentials 2023 2024
This is the 5.6.2 Module 5 Quiz, Analyze the Data Using Statistics, with full exam answers for 2023 and 2024. It is the Module 5 quiz in the Cisco NetAcad SkillsForAll Data Analytics Essentials course. All answers have been verified and include explanations.
-
What are inferential data sets?
- They are data sets gathered from a representative sample to make generalizations or predictions about a population.
- They are data sets that describe the current or historical state of the observed population.
- They are data sets that allow for summarization of findings based on historical data observed about a population.
- They are often represented in pie charts, bar charts or histograms.
Answers Explanation & Hints:
Inferential data are gathered from a sample to make generalizations or predictions about a population. Because a representative sample is used instead of data from the entire population, one concern is that the particular groups chosen for the study, or the environment in which the study is carried out, may not accurately reflect the characteristics of the larger group. To support accurate analysis of the inferential data set, the representative sample must closely match the characteristics of the overall population.
-
How is the Microsoft Excel VLOOKUP tool used in data analysis?
- to summarize descriptive data sets
- to interpret inferential data sets
- to find specific information in a large spreadsheet
- to find the mean average of data sets
Answers Explanation & Hints:
VLOOKUP is a powerful data analysis function in Microsoft Excel that is used to find information in a large spreadsheet. VLOOKUP is a vertical lookup function, so the data must be organized in a table where each row holds one record and each column holds a different but related field of that record.
-
What can a data analyst do if they want to remove duplicate values in a Microsoft Excel spreadsheet?
- use VLOOKUP
- use conditional formatting
- parse the data from “text to column”
- highlight errors
Answers Explanation & Hints:
VLOOKUP can also be used to help with data cleaning by finding duplicates. With VLOOKUP you can compare two columns (or lists) and find duplicate values. A formula is created using VLOOKUP to target the columns with the duplicate data values.
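The two-column comparison described above can be sketched outside Excel as well. The Python snippet below is only an illustration of the idea, not the course's method: the helper name `find_duplicates` and the sample e-mail lists are hypothetical. It plays the same role as looking up each value of one column in the other column with an exact match.

```python
def find_duplicates(list_a, list_b):
    """Return the values of list_a that also appear in list_b, in order.

    Hypothetical helper: mirrors an exact-match lookup of every value
    in column A against column B.
    """
    seen_in_b = set(list_b)  # fast membership test for column B
    return [value for value in list_a if value in seen_in_b]


# Made-up sample data standing in for two spreadsheet columns.
customers_2023 = ["ana@example.com", "bo@example.com", "cy@example.com"]
customers_2024 = ["cy@example.com", "di@example.com", "ana@example.com"]

duplicates = find_duplicates(customers_2023, customers_2024)
print(duplicates)  # ['ana@example.com', 'cy@example.com']
```

Once the duplicate values are identified, they can be reviewed and removed from one of the lists.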
-
Which three key pieces of information are required to perform a VLOOKUP function in Microsoft Excel? (Choose three.)
- the column number in the range that contains the return value
- the mean average of the column
- the range where the value is located
- the locator value for the blank fields
- the lookup value
Answers Explanation & Hints:
To perform a VLOOKUP, the function needs three required pieces of information and accepts an optional fourth:
1. The value you want to look up, also known as the lookup value.
2. The range where the value is located.
3. The column number in the range that contains the return value.
4. Optionally, TRUE for an approximate match or FALSE for an exact match of the lookup value.
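In Excel these arguments appear in the order `=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])`. To make the role of each argument concrete, here is a minimal Python sketch that imitates the behaviour on a list of rows. The function `vlookup` and the `products` table are hypothetical names introduced for illustration, and the approximate-match branch assumes the first column is sorted in ascending order, as Excel also expects.

```python
def vlookup(lookup_value, table, col_index, approximate=False):
    """Imitate Excel's VLOOKUP on a list of rows (illustrative only).

    lookup_value: the value searched for in the first column.
    table:        the range, given as a list of row tuples.
    col_index:    1-based column number of the value to return.
    approximate:  False for an exact match, True for an approximate one.
    """
    if approximate:
        # Assumes the first column is sorted ascending; keeps the last
        # row whose key is less than or equal to the lookup value.
        match = None
        for row in table:
            if row[0] <= lookup_value:
                match = row
            else:
                break
        return None if match is None else match[col_index - 1]

    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None  # Excel would display #N/A here


# Made-up table: product ID, name, price.
products = [
    (101, "Keyboard", 25.0),
    (205, "Monitor", 180.0),
    (310, "Webcam", 45.0),
]

print(vlookup(205, products, 2))                    # Monitor
print(vlookup(300, products, 3, approximate=True))  # 180.0
print(vlookup(999, products, 2))                    # None
```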
-
A data analyst wants to find data point values that are significantly different from others in a data set. What are these values called?
- outliers
- scatter plots
- cluster
- associations
Answers Explanation & Hints:
An outlier is a value or data point that differs significantly from the others, being either much smaller or much larger. Because outliers can distort the results of data analysis, they need to be investigated and, where they turn out to be errors, removed so that the data can be analyzed effectively.
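The explanation does not say how "significantly different" is measured. One common rule of thumb, used here as an assumption rather than as the course's definition, flags values that lie more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. The helper name `find_outliers` and the sample readings are hypothetical.

```python
import statistics


def find_outliers(data, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    The 1.5 * IQR rule is an assumed convention, not taken from the quiz.
    """
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [value for value in data if value < lower or value > upper]


# Made-up readings with one value far from the rest.
readings = [10, 12, 11, 13, 12, 11, 95]
print(find_outliers(readings))  # [95]
```

Flagged values still need to be investigated before a decision is made about removing them.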
-
Why are different sampling techniques used to gather inferential statistical data?
- to reduce error and increase confidence in the generalizations about the findings
- to get accurate mean values
- to verify historical data
- to describe or summarize the values and observations of a data set
Answers Explanation & Hints:
Different sampling techniques are used to gather the sampling data set for inferential statistical data to reduce error and increase confidence in the generalizations about the findings. The type of sampling technique used will depend on the type of data.
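The explanation does not name specific techniques. As an illustration only, the sketch below contrasts two widely used ones, simple random sampling and stratified sampling. The function names, the `fraction` parameter, and the urban/rural population are all hypothetical.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, seed=0):
    """Pick k items so that every item has the same chance of selection."""
    rng = random.Random(seed)  # fixed seed only to keep the demo repeatable
    return rng.sample(population, k)


def stratified_sample(population, key, fraction, seed=0):
    """Sample the same fraction from each subgroup (stratum).

    This keeps subgroup proportions in the sample close to those in
    the population.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in population:
        strata[key(item)].append(item)

    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Made-up population: 80 urban and 20 rural respondents.
population = [("urban", i) for i in range(80)] + [("rural", i) for i in range(20)]

random_pick = simple_random_sample(population, 10)
stratified_pick = stratified_sample(population, key=lambda p: p[0], fraction=0.1)
print(len(random_pick), len(stratified_pick))
```

With this population the stratified sample always contains 8 urban and 2 rural respondents, while the simple random sample may over- or under-represent either group by chance.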
-
Which type of inferential and machine learning analysis is used to find groups of observations that are similar to each other?
- regression
- association
- cluster
- observation
Answers Explanation & Hints:
A number of types of inferential and machine learning analysis are very commonly used in Big Data analytics:
• Cluster – Used to find groups of observations that are similar to each other.
• Association – Used to find co-occurrences of values for different variables.
• Regression – Used to quantify the relationship, if any, between the variations of one or more variables.
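To illustrate the cluster item in the list above, here is a minimal sketch of one clustering technique, k-means, reduced to one-dimensional data. The quiz does not specify an algorithm, so the choice of k-means, the function name `kmeans_1d`, the starting centers, and the sample values are all assumptions made for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center,
    then move each center to the mean of its group, and repeat."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Keep the previous center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers, groups


# Made-up observations forming two obvious groups.
final_centers, final_groups = kmeans_1d([1, 2, 3, 20, 21, 22], centers=[0, 10])
print(final_centers)  # [2.0, 21.0]
print(final_groups)   # [[1, 2, 3], [20, 21, 22]]
```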
-
Why is regression analytics used in the inferential and machine learning analyses of big data?
- It is used to find groups of observations that are similar to each other.
- It is used to find co-occurrences of values for different variables.
- It is used to quantify the relationship, if any, between the variations of one or more variables.
- It is used to find the mean average at various data points.
Answers Explanation & Hints:
A number of types of inferential and machine learning analysis are very commonly used in Big Data analytics:
• Cluster – Used to find groups of observations that are similar to each other.
• Association – Used to find co-occurrences of values for different variables.
• Regression – Used to quantify the relationship, if any, between the variations of one or more variables.
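As a concrete case of quantifying a relationship between two variables, the sketch below fits a straight line by ordinary least squares. Simple linear regression is only one form of regression and is chosen here as an assumption. The function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that follows y = 2x + 1 exactly.
slope, intercept = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(slope, intercept)  # 2.0 1.0
```

The slope states how much the second variable changes, on average, for each unit change in the first.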
-
A data analyst wants to display the various segments of a country’s energy sources (e.g., oil, coal, gas, solar, wind) contributing to 100% of its energy sources in a visual format. What type of chart would be best used to accomplish this?
- bar chart
- pie chart
- column charts
- scatter plots
Answers Explanation & Hints:
Since the objective is to display the segments of the country’s energy sources out of a total of 100%, a pie chart is best suited to display this visually.
The main types of charts used to display data visually are:
- Pie charts are used to show the composition of a static number. Segments represent a percentage of that number. The total sum of the segments must equal 100%.
- Line charts are one of the most commonly used types of comparison charts. Use line charts when you have a continuous set of data, the number of data points is high, and you would like to show a trend in the data over time.
- Column charts are positioned vertically. They are probably the most common chart type used when you want to display the value of a specific data point and compare that value across similar categories.
- Bar charts are similar to column charts except they are positioned horizontally. Longer bars indicate larger numbers. They are best used when the names of the data points are long.
- Scatter plots are very popular for correlation visualizations, or when you want to show the distribution of a large number of data points. Scatter plots are also useful for demonstrating clustering or identifying outliers in the data.
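The requirement that pie segments sum to 100% comes down to simple arithmetic: each segment is its value divided by the total. The sketch below shows that calculation for the energy example. The helper name `percentage_shares` and the energy figures are invented for illustration and are not real statistics.

```python
def percentage_shares(amounts):
    """Convert raw amounts into percentage shares of their total."""
    total = sum(amounts.values())
    return {name: 100 * value / total for name, value in amounts.items()}


# Invented energy figures (arbitrary units), not real data.
energy = {"oil": 40, "coal": 25, "gas": 20, "solar": 10, "wind": 5}

shares = percentage_shares(energy)
print(shares["oil"])         # 40.0
print(sum(shares.values()))  # 100.0
```

These shares are what a charting tool would draw as the segments of the pie.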
-
A data analyst wants to display outliers in the data set. Which type of visual representation would best suit this task?
- line chart
- bar chart
- scatter plot
- pie chart
Answers Explanation & Hints:
Since the objective is to display the outliers of a data set, a scatter plot is the best chart for the task.
The main types of charts used to display data visually are:
- Pie charts are used to show the composition of a static number. Segments represent a percentage of that number. The total sum of the segments must equal 100%.
- Line charts are one of the most commonly used types of comparison charts. Use line charts when you have a continuous set of data, the number of data points is high, and you would like to show a trend in the data over time.
- Column charts are positioned vertically. They are probably the most common chart type used when you want to display the value of a specific data point and compare that value across similar categories.
- Bar charts are similar to column charts except they are positioned horizontally. Longer bars indicate larger numbers. They are best used when the names of the data points are long.
- Scatter plots are very popular for correlation visualizations, or when you want to show the distribution of a large number of data points. Scatter plots are also useful for demonstrating clustering or identifying outliers in the data.