Question : A Data Scientist is assigned to build a model from a reporting data warehouse. The warehouse contains data collected from many sources and transformed through a complex, multi-stage ETL process. What is a concern the data scientist should have about the data?
Correct Answer : Explanation: Before data analysis can begin, the required data must be collected and processed to extract the useful information. The degree of initial processing and data preparation depends on the volume of data and on how straightforward its structure is to understand. Highly processed data may lose some important information.
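To illustrate how processing can discard information, here is a minimal sketch, assuming a hypothetical ETL stage that rolls raw events up into daily totals per customer. The event fields (`customer`, `day`, `amount`, `channel`) and the function `daily_totals` are invented for illustration and do not come from the question.

```python
from collections import defaultdict

# Hypothetical raw events as they might look before any ETL stage runs.
raw_events = [
    {"customer": "a", "day": "2024-01-01", "amount": 10.0, "channel": "web"},
    {"customer": "a", "day": "2024-01-01", "amount": 90.0, "channel": "store"},
    {"customer": "b", "day": "2024-01-01", "amount": 50.0, "channel": "web"},
    {"customer": "b", "day": "2024-01-01", "amount": 50.0, "channel": "web"},
]


def daily_totals(events):
    """Aggregate events to one total per (customer, day), dropping all other fields."""
    totals = defaultdict(float)
    for event in events:
        totals[(event["customer"], event["day"])] += event["amount"]
    return dict(totals)


# What a reporting warehouse might keep after this (assumed) aggregation step.
warehouse = daily_totals(raw_events)

# Both customers now look identical, although their raw behaviour differed.
print(warehouse)
```

After aggregation the two customers are indistinguishable: the channel and the spread of individual amounts can no longer be recovered from the warehouse rows, which is the kind of loss a model builder would want to check for.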
Question : Which word or phrase completes the statement? Emphasis color is to standard color as _______ .
Correct Answer : Explanation: Our brains are compelled to find meaning, whether it is intended or not. Because the eyes are attracted to bright and high-contrast colors, viewers will derive meaning from something that stands out. When you use color for emphasis, it's like shouting that this object or element has the greatest value. At the Lynda.com site, the bright yellow is used to prominently display their most important message.
Question : Which data asset is an example of semi-structured data?
Correct Answer : Explanation:

5.3. Semi-Structured Data
- idea predates XML but not HTML
- data is available electronically in
  - database systems
  - file systems, e.g., bibliographic data, Web data
  - data exchange formats, e.g., EDI, scientific data
- attempt to reconcile database and document "worlds"
- semi-structured data
  - organised in semantic entities
  - similar entities are grouped together
  - entities in same group may not have same attributes
  - order of attributes not necessarily important
  - not all attributes may be required
  - size of same attributes in a group may differ
  - type of same attributes in a group may differ

5.4. Example of Semi-Structured Data

    name: Peter Wood
    email: ptw@dcs.bbk.ac.uk, p.wood@bbk.ac.uk

    name:
        first name: Mark
        last name: Levene
    email: mark@dcs.bbk.ac.uk

    name: Alex Poulovassilis
    affiliation: Birkbeck

5.5. Semi-Structured Data Models
- based on labelled graphs rather than labelled trees
- used for data exchange among, and integration of, heterogeneous data sources
- schema information is in the edge labels
- sometimes called schemaless or self-describing
- data stored at the leaves
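
The properties listed above can be checked on the three example records. The following is a minimal sketch, assuming the records are held as plain Python dicts; modelling the two email addresses as a list is an assumption, and the helper names `attribute_profile` and `leaf_paths` are hypothetical.

```python
# The three example records, held as plain dicts (an assumed representation).
records = [
    {"name": "Peter Wood", "email": ["ptw@dcs.bbk.ac.uk", "p.wood@bbk.ac.uk"]},
    {"name": {"first name": "Mark", "last name": "Levene"}, "email": "mark@dcs.bbk.ac.uk"},
    {"name": "Alex Poulovassilis", "affiliation": "Birkbeck"},
]


def attribute_profile(items):
    """Map each attribute name to the set of Python type names seen for it."""
    profile = {}
    for item in items:
        for key, value in item.items():
            profile.setdefault(key, set()).add(type(value).__name__)
    return profile


def leaf_paths(value, path=()):
    """Yield (path, leaf) pairs: the labels on the path act as schema, data sits at the leaves."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from leaf_paths(child, path + (key,))
    elif isinstance(value, list):
        for child in value:
            yield from leaf_paths(child, path)
    else:
        yield path, value


# Same attribute, different types and sizes across the group.
profile = attribute_profile(records)

# Not every entity carries every attribute.
missing_email = [r["name"] for r in records if "email" not in r]

# Labels lead down to the data values for the second record.
mark_leaves = list(leaf_paths(records[1]))
```

Here `profile` shows that `name` is a string in two records and a nested structure in one, `missing_email` lists the entity without an email attribute, and `mark_leaves` pairs each stored value with the sequence of labels that describes it, which is what "self-describing" refers to.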