Question : On analyzing your time series data you suspect that the data represented as y1, y2, y3, ... , yn-1, yn may have a trend component that is quadratic in nature. Which pattern of data will indicate that the trend in the time series data is quadratic in nature?
Correct Answer : Explanation: One definition of a time series is a collection of quantitative observations that are evenly spaced in time and measured successively. Examples of time series include the continuous monitoring of a person's heart rate, hourly readings of air temperature, the daily closing price of a company stock, monthly rainfall data, and yearly sales figures. Time series analysis is generally used when there are 50 or more data points in a series. If the time series exhibits seasonality, there should be 4 to 5 cycles of observations in order to fit a seasonal model to the data.
Goals of time series analysis:
1. Descriptive: identify patterns in correlated data, such as trends and seasonal variation
2. Explanation: understand and model the data
3. Intervention analysis: how does a single event change the time series?
4. Quality control: deviations of a specified size indicate a problem
Time series are analyzed in order to understand the underlying structure and function that produce the observations. Understanding the mechanisms of a time series allows a mathematical model to be developed that explains the data in such a way that prediction, monitoring, or control can occur. Examples include prediction/forecasting, which is widely used in economics and business. Monitoring of ambient conditions, or of an input or an output, is common in science and industry. Quality control is used in computer science, communications, and industry.
It is assumed that a time series data set has at least one systematic pattern. The most common patterns are trends and seasonality. Trends are generally linear or quadratic. To find trends, moving averages or regression analysis are often used. Seasonality is a pattern that repeats itself systematically over time. A second assumption is that the data contain enough random noise that the systematic patterns are hard to identify. Time series analysis techniques therefore often apply some type of filter to the data in order to dampen the error. Other potential patterns have to do with lingering effects of earlier observations or earlier random errors.
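As an illustration of the quadratic case, a quadratic trend has constant second differences, and a degree-2 regression on the time index recovers its coefficients. The sketch below uses NumPy; the synthetic series and the helper names `second_differences` and `fit_quadratic_trend` are assumptions made for this example, not part of the original explanation.

```python
import numpy as np

def second_differences(y):
    """Return y[t+2] - 2*y[t+1] + y[t] for every valid t."""
    return np.diff(np.asarray(y, dtype=float), n=2)

def fit_quadratic_trend(y):
    """Least-squares fit of y[t] ~ a*t**2 + b*t + c for t = 0, 1, ..., n-1."""
    t = np.arange(len(y))
    a, b, c = np.polyfit(t, y, deg=2)  # coefficients, highest degree first
    return a, b, c

# Synthetic, noise-free series with an exact quadratic trend (illustrative only).
t = np.arange(10)
y = 0.5 * t**2 + 2.0 * t + 3.0

d2 = second_differences(y)       # roughly constant for a quadratic trend
a, b, c = fit_quadratic_trend(y)
print("second differences:", d2)
print("fitted coefficients:", a, b, c)
```

With real, noisy data the second differences will only fluctuate around a constant, so the regression fit is usually the more robust of the two checks.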
There are numerous software programs that will analyze time series, such as SPSS, JMP, and SAS/ETS. For those who want to learn or are comfortable with coding, MATLAB, S-PLUS, and R are other software packages that can perform time series analyses. Excel can be used if linear regression analysis is all that is required (that is, if all you want to find out is the magnitude of the most obvious trend). A word of caution about using multiple regression techniques with time series data: because time series observations are autocorrelated, they violate the assumption of independence of errors. Type I error rates increase substantially when autocorrelation is present. Also, inherent patterns in the data may dampen or enhance the effect of an intervention; in time series analysis, these patterns are accounted for within the analysis.
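One way to see the autocorrelation problem is to fit an ordinary linear trend and then inspect the residuals. The sketch below computes the lag-1 autocorrelation and the Durbin-Watson statistic (values near 2 suggest no first-order autocorrelation, values well below 2 suggest positive autocorrelation). The simulated AR(1) errors, the coefficient 0.8, and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Sample autocorrelation of x at lag 1."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

# Simulated series: linear trend plus AR(1) errors (illustrative assumption).
rng = np.random.default_rng(0)
n = 200
noise = rng.normal(size=n)
errors = np.zeros(n)
for i in range(1, n):
    errors[i] = 0.8 * errors[i - 1] + noise[i]

t = np.arange(n)
y = 1.0 + 0.05 * t + errors

# Ordinary least-squares linear trend, then residual diagnostics.
slope, intercept = np.polyfit(t, y, deg=1)
residuals = y - (intercept + slope * t)

r1 = lag1_autocorrelation(residuals)
dw = durbin_watson(residuals)
print("lag-1 autocorrelation:", r1)
print("Durbin-Watson:", dw)
```

When the residuals show this kind of dependence, the standard errors from a plain regression are not trustworthy, which is the reason for the caution above.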
Observations made over time can be either discrete or continuous. Both types of observations can be equally spaced, unequally spaced, or have missing data. Discrete measurements can be recorded at any time interval, but are most often taken at evenly spaced intervals. Continuous measurements can be spaced randomly in time, such as measuring earthquakes as they occur because an instrument is constantly recording, or can entail constant measurement of a natural phenomenon such as air temperature, or a process such as velocity of an airplane.
Question : Which analytical method is considered unsupervised?
Correct Answer : Explanation: kmeans uses an iterative algorithm that minimizes the sum of distances from each object to its cluster centroid, over all clusters. This algorithm moves objects between clusters until the sum cannot be decreased further. The result is a set of clusters that are as compact and well-separated as possible. You can control the details of the minimization using several optional input parameters to kmeans, including ones for the initial values of the cluster centroids, and for the maximum number of iterations. Clustering is primarily an exploratory technique to discover hidden structures of the data, possibly as a prelude to more focused analysis or decision processes. Some specific applications of k-means are image processing, medicine, and customer segmentation. Clustering is often used as a lead-in to classification. Once the clusters are identified, labels can be applied to each cluster to classify each group based on its characteristics. Marketing and sales groups use k-means to better identify customers who have similar behaviors and spending patterns.
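The iterative procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the `kmeans` function the explanation refers to: it assumes squared Euclidean distance, takes the initial centroids and a maximum iteration count as inputs, and the function name and toy data are made up for the example. No class labels are supplied at any point, which is what makes the method unsupervised.

```python
import numpy as np

def kmeans_sketch(points, init_centroids, max_iter=100):
    """Basic k-means: alternate assignment and centroid update.

    Assumes max_iter >= 1. Returns (labels, centroids, total), where total
    is the sum of squared distances from each point to its centroid.
    """
    points = np.asarray(points, dtype=float)
    centroids = np.array(init_centroids, dtype=float)
    labels = None
    for _ in range(max_iter):
        # Squared distance from every point to every centroid, shape (n, k).
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # no object changed cluster, so the sum cannot decrease
        labels = new_labels
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid in place
                centroids[j] = members.mean(axis=0)
    total = float(((points - centroids[labels]) ** 2).sum())
    return labels, centroids, total

# Toy data: two well-separated groups (illustrative only).
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids, total = kmeans_sketch(points, init_centroids=[[0, 0], [10, 10]])
print(labels)
print(centroids)
print(total)
```

The result depends on the initial centroids, so in practice the algorithm is often run several times from different starting points and the run with the smallest total is kept.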
Question : You have used k-means clustering to classify the behavior of customers for a retail store. You decide to use household income, age, gender, and yearly purchase amount as measures. You have chosen to use 8 clusters and notice that 2 clusters have only 3 customers assigned. What should you do?
Explanation: kmeans uses an iterative algorithm that minimizes the sum of distances from each object to its cluster centroid, over all clusters. This algorithm moves objects between clusters until the sum cannot be decreased further. The result is a set of clusters that are as compact and well-separated as possible. You can control the details of the minimization using several optional input parameters to kmeans, including ones for the initial values of the cluster centroids, and for the maximum number of iterations. Clustering is primarily an exploratory technique to discover hidden structures of the data, possibly as a prelude to more focused analysis or decision processes. Some specific applications of k-means are image processing, medicine, and customer segmentation. Clustering is often used as a lead-in to classification. Once the clusters are identified, labels can be applied to each cluster to classify each group based on its characteristics. Marketing and sales groups use k-means to better identify customers who have similar behaviors and spending patterns.
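For the situation in the question, a quick diagnostic is to count how many objects fall in each cluster and flag the very small ones. Tiny clusters often suggest that the chosen number of clusters is too large or that a few points are outliers, so a common follow-up is to rerun with fewer clusters and compare. The sketch below assumes NumPy; the label array, the cluster sizes, and the `min_size` threshold of 10 are hypothetical values picked for illustration.

```python
import numpy as np

def small_clusters(labels, k, min_size):
    """Return (sizes, flagged): the size of each of the k clusters and the
    indices of clusters with fewer than min_size members."""
    sizes = np.bincount(np.asarray(labels), minlength=k)
    flagged = [j for j in range(k) if sizes[j] < min_size]
    return sizes, flagged

# Hypothetical assignment into 8 clusters, two of which hold only 3 customers.
example_sizes = [40, 35, 3, 50, 28, 3, 44, 37]
labels = np.repeat(np.arange(8), example_sizes)

sizes, flagged = small_clusters(labels, k=8, min_size=10)
print("cluster sizes:", sizes)
print("clusters below threshold:", flagged)
```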
1. "Saturated" data, indicating potential issues with data definitions 2. Incomplete data, indicating potential issues with data transmission 4. The exhibit does not raise any obvious concerns with the data.