Labour Day Special - Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: top65certs

CompTIA DA0-001 Dumps

Page: 1 / 20
Total 262 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

Which of the following query optimization techniques involves examining only the data that is needed for a particular task?

Options:

A.

Making a temporary table

B.

Creating a flat file

C.

Indexing documents

D.

Creating an execution plan

Question 2

During data profiling, an analyst decides to recode the status column in the following data set:

Which of the following data concerns explains why the analyst wants to take this action?

Options:

A.

Redundancy

B.

Duplication

C.

Invalidity

D.

Inconsistency

Question 3

A table in a hospital database has a column for patient height in inches and a column for patient height in centimeters. This is an example of:

Options:

A.

dependent data.

B.

duplicate data.

C.

invalid data

D.

redundant data

Question 4

A data analyst received the information in the table below from a recently completed marketing campaign:

Which of the following is the total order conversion rate?

Options:

A.

13.2%

B.

14.8%

C.

22.3%

D.

85.2%

Question 5

Given the following report:

Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).

Options:

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period lhe report covers

F.

The date on which the report was run

Question 6

Which of the following best describes how discrete data differs from continuous data?

Options:

A.

Discrete data cannot create a sloped line.

B.

Discrete data can only be a finite number of values.

C.

Discrete data can have decimal points.

D.

Discrete data applies only to numbers.

Question 7

Amanda needs to create a dashboard that will draw information from many other data sources and present it to business leaders.

Which one of the following tools is least likely to meet her needs?

Options:

A.

QuickSight.

B.

Tableau.

C.

Power BI.

D.

SPSS Modeler.

Question 8

What R package makes it easy to work with dates?

Options:

A.

Lubridate.

B.

Datemath.

C.

Stringr.

D.

ggplot.

Question 9

The director of operations at a power company needs data to help identify where company resources should be allocated in order to monitor activity for outages and restoration of power in the entire state. Specifically, the director wants to see the following:

* County outages

* Status

* Overall trend of outages

INSTRUCTIONS:

Please, select each visualization to fit the appropriate space on the dashboard and choose an appropriate color scheme. Once you have selected all visualizations, please, select the appropriate titles and labels, if applicable. Titles and labels may be used more than once.

If at any time you would like to bring back the initial state of the simulation, please click the Reset All button.

Options:

Question 10

The number of phone calls that the call center receives in a day is an example of:

Options:

A.

continuous data.

B.

categorical data.

C.

ordinal data.

D.

discrete data.

Question 11

Given the following customer and order tables:

Which of the following describes the number of rows and columns of data that would be present after performing an INNER JOIN of the tables?

Options:

A.

Five rows, eight columns

B.

Seven rows, eight columns

C.

Eight rows, seven columns

D.

Nine rows, five columns

Question 12

A data set was recorded using multimedia technology. Which of the following is a necessary step on the way to interpretation?

Options:

A.

Structural equation modeling

B.

Transcription

C.

Sequential analysis

D.

Sampling

Question 13

Consider the following dataset which contains information about houses that are for sale:

Which of the following string manipulation commands will combine the address and region name columns to create a full address?

full_address------------------------- 85 Turner St, Northern Metropolitan 25 Bloomburg St, Northern Metropolitan 5 Charles St, Northern Metropolitan 40 Federation La, Northern Metropolitan 55a Park St, Northern Metropolitan

Options:

A.

SELECT CONCAT(address, ' , ' , regionname) AS full_address FROM melb LIMIT 5;

B.

SELECT CONCAT(address, '-' , regionname) AS full_address FROM melb LIMIT 5;

C.

SELECT CONCAT(regionname, ' , ' , address) AS full_address FROM melb LIMIT 5

D.

SELECT CONCAT(regionname, '-' , address) AS full_address FROM melb LIMIT 5;

Question 14

A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the most efficient way to deliver this report?

Options:

A.

A workbook with multiple tabs for each region

B.

A daily email with snapshots of regional summaries

C.

A static report with a different page for every filtered view

D.

A dashboard with filters at the top that the user can toggle

Question 15

Given the data below:

In which of the following file formats is the data presented?

Options:

A.

Xs

B.

CSV

C.

RIF

D.

XML

Question 16

Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?

Options:

A.

Logical

B.

Date

C.

Aggregate

D.

System

Question 17

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

Options:

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed

Question 18

A collections manager has a team calling customers who are past due on their accounts in an attempt to collect payments. The manager receives the call list in the form of a printed report that is generated by the accounting department at the beginning of each week. Consequently, the collections team calls some customers who have made payments in the time since the report was last printed. Which of the following reporting enhancements could the accounting department implement to best reduce the number of calls on current accounts?

Options:

A.

Modify the date range on the report

B.

Include a time stamp on the report.

C.

Increase the frequency of report generation.

D.

Add a report run date to the report.

Question 19

Q3 2020 has just ended, and now a data analyst needs to create an ad-hoc sales report that demonstrates how well the Q3 2020 promotion went versus last year's Q3 promotion.

Which of the following date parameters should the analyst use?

Options:

A.

2019 vs. YTD 2020

B.

Q3 2019 vs. Q3 2020

C.

YTD 2019 vs. YTD 2020

D.

Q4 2019 vs. Q3 2020

Question 20

A data analyst is working with a team to create a dashboard for a client who requires on-demand access. Which of the following is the best delivery method to support the clients’ requirement?

Options:

A.

Email

B.

Scheduled

C.

Subscription

D.

Static

Question 21

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question 22

A JSON file is an example of:

Options:

A.

structured data.

B.

web data.

C.

machine data.

D.

processed data.

Question 23

Which of the following techniques is used to quantify data?

Options:

A.

Decoding

B.

Enumeration

C.

Coding

D.

Structure

Question 24

The ACME Corporation hired an analyst to detect data quality issues in their Excel documents. Which of the following are the most common issues? (Select TWO)

Options:

A.

Apostrophe.

B.

Commas.

C.

Symbols.

D.

Duplicates.

E.

Misspellings.

Question 25

A data analyst is performing a data merge within a spreadsheet using the tables below:

The analyst is attempting to pull the addresses from Table 2 into Table 1 using the last names and is receiving an error message. Which of the following steps can the analyst perform to fix the error?

Options:

A.

Use concatenate to combine the tables.

B.

Ensure the formula is pulling from right to left.

C.

Sort the data by the last name field.

D.

Review the spelling and data type.

Question 26

A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?

Options:

A.

A self-service report

B.

A research report

C.

An ad hoc report

D.

An operational report

Question 27

Which of the following would be considered non-personally identifiable information?

Options:

A.

Cell phone device name

B.

Customer’s name

C.

Government ID number

D.

Telephone number

Question 28

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 29

Which of the following is used for calculations and pivot tables?

Options:

A.

IBM SPSS

B.

SAS

C.

Microsoft Excel

D.

Domo

Question 30

A data analyst has a set of data that shows the number of gallons of oil produced each day. The company would like to know the standard deviation for the data set. The variance for the data is 36 gallons. Which of the following is the standard deviation for gallons produced?

Options:

A.

1.16

B.

6

C.

36

D.

72

Question 31

Angela is aggregating data from CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issues is Angela facing?

Choose the best answer.

Options:

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Question 32

A company’s marketing department wants to do a promotional campaign next month. A data analyst on the team has been asked to perform customer segmentation, looking at how recently a customer bought the product, at what frequency, and at what value. Which of the following types of analysis would this practice be considered?

Options:

A.

Prescriptive

B.

Trend

C.

Gap

D.

Custer

Question 33

A sales director has requested a report for individual team members within the division be developed. The director would like the report to be shared with all team members, but individual team members should not be identifiable within the report Which of the following access requirements would support the director's needs?

Options:

A.

Create an acceptable use policy for the sales data.

B.

Release the report as user-group-based access and include data masking.

C.

Get a data use agreement from the individual team members.

D.

Provide the report based on role and include data encryption.

Question 34

A data analyst has been asked to create an ad-hoc sales report for the Chief Executive Officer (CEO).

Which of the following should be included in the report?

Options:

A.

The sales representatives' home addresses.

B.

Line-item SKU numbers.

C.

YTD total sales.

D.

The customers' first and last names.

Question 35

Given the customer table below:

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question 36

A data analyst is developing a dashboard to track and monitor metrics. Which of the following best practices should be taken into during the FIRST pment process?

Options:

A.

Create a A Aupirarrame:

B.

Deploy to production.

C.

Copy a dashboard design from the Internet.

D.

Develop a dashboard.

Question 37

An analyst is preparing a report that contains weather data. The temperatures are shown in Fahrenheit. but they must be reported in Celsius. Which of the following should the analyst do to fix this issue?

Options:

A.

Normalize the data.

B.

Standardize the data.

C.

Rescale the data.

D.

Aggregate the data.

Question 38

Given the following graph:

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 39

A data analyst needs to create a dashboard to help identify trends in the data sets. Which of the following is an appropriate consideration for dashboard development?

Options:

A.

Data sources and attributes

B.

Frequently asked questions

C.

A report from the data source

D.

A comparison of data sets

Question 40

An analyst is currently working on a ticket for revamping a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?

Options:

A.

Talk to the group that made the request to determine the desired goal.

B.

Make changes to a frequently used report that is already in production.

C.

Build an additional dashboard with fewer views that are tailored toward each specific team.

D.

Develop a more streanMined dashboard to roll out by the next delivery date.

Question 41

Which of the following will MOST likely be streamed live?

Options:

A.

Machine data

B.

Key-value pairs

C.

Delimited rows

D.

Flat files

Question 42

A research analyst collects ten data points from 1.000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?

Options:

A.

NoSQL

B.

Flat file

C.

JSON

D.

Relational database

Question 43

Analytics reports should follow corporate style guidelines.

Options:

A.

True.

B.

False.

Question 44

While reviewing survey data, an analyst notices respondents entered “Jan,” “January,” and “01” as responses for the month of January. Which of the following steps should be taken to ensure data consistency?

Options:

A.

Delete any of the responses that do not have “January” written out.

B.

Replace any of the responses that have “01”.

C.

Filter on any of the responses that do not say “January” and update them to “January”.

D.

Sort any of the responses that say “Jan” and update them to “01”.

Question 45

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table -

In-store Transactions –

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows

Question 46

A data analyst needs to collect a similar proportion of data from every state. Which of the following sampling methods would be the most appropriate?

Options:

A.

Systematic sampling

B.

Convenience sampling

C.

Stratified sampling

D.

Random sampling

Question 47

Which of the following actions should be taken when transmitting data to mitigate the chance of a data leak occurring? (Choose two.)

Options:

A.

Data identification

B.

Data processing

C.

Data Reporting

D.

Data encryption

E.

Data masking

F.

Fata removal

Question 48

Emma is working in a data warehouse and finds a finance fact table links to an organization dimension, which in turn links to a currency dimension that not linked to the fact table.

What type of design pattern is the data warehouse using?

Options:

A.

Star.

B.

Sun.

C.

Snowflake.

D.

Comet.

Question 49

What would be an example of an acceptable form of primary identification for the Data+ exam?

Options:

A.

Passport.

B.

School ID card.

C.

Employee ID card.

D.

Credit card with photo and signature.

Question 50

A recurring event is being stored in two databases that are housed in different geographical locations. A data analyst notices the event is being logged three hours earlier in one database than in the other database. Which of the following is the MOST likely cause of the issue?

Options:

A.

The data analyst is not querying the databases correctly.

B.

The databases are recording different events.

C.

The databases are recording the event in different time zones.

D.

The second database is logging incorrectly.

Question 51

An analyst has been asked to validate data quality. Which of the following are the BEST reasons to validate data for quality control purposes? (Choose two.)

Options:

A.

Retention

B.

Integrity

C.

Transmission

D.

Consistency

E.

Encryption

F.

Deletion

Question 52

Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:

Using this information, which of the following students had the BEST score?

Options:

A.

Randy

B.

Katie

C.

Ralph

D.

Jean

Question 53

What SQL command is used to delete an entire table from a database?

Options:

A.

DROP.

B.

MODIFY.

C.

DELETE.

D.

ALTER.

Question 54

Which of the following is an example of a at flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 55

Which of the following best describes the law of large numbers?

Options:

A.

As a sample size decreases, its standard deviation gets closer to the average of the whole population.

B.

As a sample size grows, its mean gets closer to the average of the whole population

C.

As a sample size decreases, its mean gets closer to the average of the whole population.

D.

When a sample size doubles. the sample is indicative of the whole population.

Question 56

Which of the following are reasons to conduct data cleansing? (Select two).

Options:

A.

To perform web scraping

B.

To track KPls

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends

Question 57

Given the following data tables:

Which of the following MDM processes needs to take place FIRST?

Options:

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question 58

A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:

Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?

Options:

A.

Standardize the field names.

B.

Recode the data values.

C.

Overwrite the field names in one of the tables.

D.

Edit the field names in the data dictionary.

Question 59

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered to best display the data?

Options:

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and percentage of new custorners data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question 60

An analyst develops an IT document and needs to describe the technical terms used in the document. Which of the following is where the analyst should include descriptions of the technical terms?

Options:

A.

Glossary

B.

System diagram

C.

User requirements

D.

Index

Question 61

Which of the following is an example of a flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 62

Which of the following BEST describes standard deviation?

Options:

A.

A measure that is used to establish a relationship between two variables

B.

A measure of how data is distributed

C.

A measure of the amount of dispersion of a set of values

D.

A measure that is used to find the significant difference between variables

Question 63

An analyst is working on a project for a director. During this process. the analyst pulled the data. created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency

D.

Complete a trend analysis to be included in the report.

Question 64

A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?

Options:

A.

Display the version number next to each submission on the dashboard.

B.

Present a data refresh date at the top of the dashboard.

C.

Confirm the dashboard is adhering to the corporate style guide.

D.

Use permissions to ensure users only see certain versions of the submissions.

Question 65

Given the table below:

Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?

Options:

A.

Name, one

B.

Gender, two

C.

Level, three

D.

Code, four

E.

Region, five

Question 66

A data analyst has been asked to create a daily manufacturing report for the floor manager Which of the following metrics should be included in the report?

Options:

A.

Tons of steel produced per hour

B.

Annual sales budget

C.

End-of-day stock price

D.

Daily corporate employee count

Question 67

A data analyst is asked to create a sales report for the second-quarter 2020 board meeting, which will include a review of the business’s performance through the second quarter. The board meeting will be held on July 15, 2020, after the numbers are finalized. Which of the following report types should the data analyst create?

Options:

A.

Static

B.

Real-time

C.

Self-service

D.

Dynamic

Question 68

Given the information in the following tables:

Which of the following describes merging these tables to create a master file that includes all transactions for both online and in-store sales?

Options:

A.

Data audit

B.

Data completeness

C.

Data validation

D.

Data consolidation

Question 69

Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?

Options:

A.

SAS

B.

SQL

C.

Python

D.

R

Question 70

Which of the following is an example of a data-mining ETL tool?

Options:

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Question 71

An analyst needs to create an analytics dashboard for an employee intranet site to improve the search functionality, display relevant information, and maintain an updated FAQ page. Which of the following visualizations would best represent what employees are searching for?

Options:

A.

A word cloud

B.

A histogram

C.

A pie chart

D.

A scatter plot

Question 72

An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:

Which of the following charts would be BEST to use?

Options:

A.

Histogram

B.

Pie

C.

Line

D.

Scatter pot

E.

Waterfall

Question 73

Which of the following data types would a telephone number formatted as XXX-XXX-XXXX be considered?

Options:

A.

Numeric

B.

Date

C.

Float

D.

Text

Question 74

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D. over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 75

A county in Illinois is conducting a survey to determine the mean annual income per household. The county is 427sq mi (2.65q km). Which of the following sampling methods would MOST likely result in a representative sample?

Options:

A.

A stratified phone survey of 100 people that is conducted between 2:00 p.m. and 3:00 p.m.

B.

A systematic survey that is sent to 100 single-family homes in the county

C.

Surveys sent to ten randomly selected homes within 5mi (8km) of the county’s office

D.

Surveys sent to 100 randomly selected homes that are reflective of the population

Question 76

Which of the following data types best describe 4Ac1? (Select two).

Options:

A.

Alphanumeric

B.

Symbolic

C.

Numeric

D.

Float

E.

Boolean

F.

String

Question 77

A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.

Which of the following data manipulation techniques would he use to obtain this information?

Options:

A.

Data append

B.

Data blending

C.

Normalize data

D.

Data merge

Question 78

A data analyst is asked on the morning of April 9, 2020, to create a sales report that identifies sales year to date. The daily sales data is current through the end of the day. Which of the following date ranges should be on the report?

Options:

A.

January 1, 2020 to April 1, 2020

B.

January 1, 2020 to April 7, 2020

C.

January 1, 2020 to April 8, 2020

D.

January 1, 2020 to April 9, 2020

Page: 1 / 20
Total 262 questions