Question : You are deploying a stateful web application, www.HadoopExam.com, which uses Amazon RDS as its backend database. Which feature below does AWS RDS provide to keep your web application available at all times?
1. RDS takes automated backups of the database and copies them to another region, so in case of website failure you can recover it.
2. RDS offers multi-region deployment, so if one region goes completely down, another region can be used.
3. RDS offers Multi-AZ deployment, so if one Availability Zone goes down, another AZ can be used.
4. Data security is the responsibility of AWS.
Correct Answer : 3 Explanation: With optional Multi-AZ deployments, Amazon RDS also manages synchronous data replication across Availability Zones with automatic failover.
Amazon RDS manages the work involved in setting up a relational database: from provisioning the infrastructure capacity you request to installing the database software. Once your database is up and running, Amazon RDS automates common administrative tasks such as performing backups and patching the software that powers your database. With optional Multi-AZ deployments, Amazon RDS also manages synchronous data replication across Availability Zones with automatic failover.
Since Amazon RDS provides native database access, you interact with the relational database software as you normally would. This means you're still responsible for managing the database settings that are specific to your application. You'll need to build the relational schema that best fits your use case and are responsible for any performance tuning to optimize your database for your application's workflow.
When you create or modify your DB instance to run as a Multi-AZ deployment, Amazon RDS automatically provisions and maintains a synchronous "standby" replica in a different Availability Zone. Updates to your DB Instance are synchronously replicated across Availability Zones to the standby in order to keep both in sync and protect your latest database updates against DB instance failure. During certain types of planned maintenance, or in the unlikely event of DB instance failure or Availability Zone failure, Amazon RDS will automatically failover to the standby so that you can resume database writes and reads as soon as the standby is promoted. Since the name record for your DB instance remains the same, your application can resume database operation without the need for manual administrative intervention. With Multi-AZ deployments, replication is transparent: you do not interact directly with the standby, and it cannot be used to serve read traffic.
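As an illustration, Multi-AZ is requested when the DB instance is created or modified. The boto3 sketch below is hypothetical (the instance identifier, class, and credentials are placeholders), but setting the MultiAZ flag is the standard way to ask RDS to provision the synchronous standby:

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Hypothetical example: provision a MySQL instance with a synchronous
# standby in a second Availability Zone by setting MultiAZ=True.
# All identifiers and credentials below are placeholders.
response = rds.create_db_instance(
    DBInstanceIdentifier="hadoopexam-web-db",
    DBInstanceClass="db.m5.large",
    Engine="mysql",
    AllocatedStorage=100,
    MasterUsername="admin",
    MasterUserPassword="change-me-immediately",
    MultiAZ=True,  # RDS provisions and maintains the standby replica
)
print(response["DBInstance"]["DBInstanceStatus"])
```

Because failover simply re-points the same DNS name at the promoted standby, the application should always connect through the instance endpoint hostname rather than a cached IP address.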
Question : What does Amazon RDS manage on my behalf?
A. Amazon RDS manages the work involved in setting up a relational database, from provisioning the infrastructure capacity you request
B. Installing the database software
C. Amazon RDS automates common administrative tasks such as performing backups and patching the software that powers your database
D. Managing the database settings that are specific to your application
E. Building the relational schema that best fits your use case, along with any performance tuning to optimize the database for your application's workflow
1. A,B,C 2. B,C,D 3. C,D,E 4. A,D,E 5. A,C,E
Correct Answer : 1 Explanation: Amazon RDS manages the work involved in setting up a relational database: from provisioning the infrastructure capacity you request to installing the database software. Once your database is up and running, Amazon RDS automates common administrative tasks such as performing backups and patching the software that powers your database. With optional Multi-AZ deployments, Amazon RDS also manages synchronous data replication across Availability Zones with automatic failover.
Since Amazon RDS provides native database access, you interact with the relational database software as you normally would. This means you're still responsible for managing the database settings that are specific to your application. You'll need to build the relational schema that best fits your use case and are responsible for any performance tuning to optimize your database for your application's workflow.
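In other words, schema design (option E) and application-specific settings (option D) remain your job: you connect with an ordinary database client and issue ordinary SQL. A minimal sketch, assuming a MySQL engine and a hypothetical endpoint and credentials:

```python
import pymysql  # pip install pymysql

# Hypothetical endpoint and credentials; RDS exposes a normal MySQL
# endpoint, so schema design and index tuning are done with plain SQL.
conn = pymysql.connect(
    host="hadoopexam-web-db.abc123xyz.us-east-1.rds.amazonaws.com",
    user="admin",
    password="change-me-immediately",
    database="webapp",
)
with conn.cursor() as cur:
    # Application-specific schema: RDS does not design this for you.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS sessions (
            session_id VARCHAR(64) PRIMARY KEY,
            user_id    BIGINT NOT NULL,
            payload    TEXT,
            updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            INDEX idx_user (user_id)  -- index choice is your tuning decision
        )
    """)
conn.commit()
conn.close()
```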
Question : In which scenario would you consider an Amazon Redshift solution?
1. You receive a structured feed of 20 GB each day and want to store it in structured form for analysis, on top of 200 TB of data already accumulated.
2. It is the best solution for creating a NoSQL database such as HBase.
3. It is the best solution for application-level caching.
4. It is good as a store for daily retail banking transactions.
5. It is best for storing raw files received as a feed.
Correct Answer : 1 Explanation: Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of most other data warehousing solutions.
Traditional data warehouses require significant time and resources to administer, especially for large datasets. In addition, the financial cost associated with building, maintaining, and growing self-managed, on-premises data warehouses is very high. Amazon Redshift not only significantly lowers the cost of a data warehouse, but also makes it easy to analyze large amounts of data very quickly.
Amazon Redshift gives you fast querying capabilities over structured data using familiar SQL-based clients and business intelligence (BI) tools using standard ODBC and JDBC connections. Queries are distributed and parallelized across multiple physical resources. You can easily scale an Amazon Redshift data warehouse up or down with a few clicks in the AWS Management Console or with a single API call. Amazon Redshift automatically patches and backs up your data warehouse, storing the backups for a user-defined retention period. Amazon Redshift uses replication and continuous backups to enhance availability and improve data durability and can automatically recover from component and node failures. In addition, Amazon Redshift supports Amazon Virtual Private Cloud (Amazon VPC), SSL, AES-256 encryption and Hardware Security Modules (HSMs) to protect your data in transit and at rest.
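For example, the "single API call" scaling mentioned above might look like the boto3 sketch below; the cluster identifier and node sizing are placeholders, not values from the question:

```python
import boto3

redshift = boto3.client("redshift", region_name="us-east-1")

# Hypothetical cluster: resizing the data warehouse is one API call.
# Day-to-day querying still happens over standard ODBC/JDBC SQL clients.
redshift.resize_cluster(
    ClusterIdentifier="analytics-cluster",  # placeholder name
    ClusterType="multi-node",
    NodeType="ra3.xlplus",
    NumberOfNodes=4,
)
```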
Amazon DynamoDB stores structured data, indexed by primary key, and allows low-latency read and write access to items ranging from 1 byte up to 400 KB. Amazon S3 stores unstructured blobs and is suited for storing large objects up to 5 TB. In order to optimize your costs across AWS services, large objects or infrequently accessed data sets should be stored in Amazon S3, while smaller data elements or file pointers (possibly to Amazon S3 objects) are best saved in Amazon DynamoDB.
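A minimal sketch of that pointer pattern, with hypothetical bucket, table, and file names: the large blob lands in S3, and a small item in DynamoDB records where to find it.

```python
import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("feed-metadata")  # hypothetical table name

# Store the large object in S3 (objects can be up to 5 TB)...
with open("/tmp/feed.csv", "rb") as f:  # hypothetical local feed file
    s3.put_object(Bucket="example-feed-bucket",
                  Key="feeds/2015-01-01.csv", Body=f)

# ...and keep only a small pointer item in DynamoDB (items max out at 400 KB).
table.put_item(Item={
    "feed_id": "2015-01-01",
    "s3_bucket": "example-feed-bucket",
    "s3_key": "feeds/2015-01-01.csv",
    "size_bytes": 21474836480,
})
```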
Tomcat applications often store session-state data in memory. However, this approach doesn't scale well; once the application grows beyond a single web server, the session state must be shared between servers. A common solution is to set up a dedicated session-state server with MySQL. This approach also has drawbacks: you must administer another server, the session-state server is a single point of failure, and the MySQL server itself can cause performance problems.
DynamoDB, a NoSQL database store from Amazon Web Services (AWS), avoids these drawbacks by providing an effective solution for sharing session state across web servers.
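A minimal sketch of the idea, assuming a hypothetical "tomcat-sessions" table keyed on session_id (AWS also ships a ready-made DynamoDB session manager for Tomcat; this hand-rolled version is only for illustration):

```python
import time
import boto3

dynamodb = boto3.resource("dynamodb")
sessions = dynamodb.Table("tomcat-sessions")  # hypothetical table

def save_session(session_id, data, ttl_seconds=1800):
    # Any web server in the fleet can write or read the same item,
    # so no dedicated session-state server is needed.
    sessions.put_item(Item={
        "session_id": session_id,
        "data": data,
        # assumes DynamoDB TTL is enabled on this attribute
        "expires_at": int(time.time()) + ttl_seconds,
    })

def load_session(session_id):
    resp = sessions.get_item(Key={"session_id": session_id})
    return resp.get("Item")  # None if the session does not exist
```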
JSON Document Support: You can now store entire JSON-formatted documents as single DynamoDB items (subject to the newly increased 400 KB size limit).
This new document-oriented support is implemented in the AWS SDKs and makes use of some new DynamoDB data types. The document support (available now in the AWS SDK for Java, the SDK for .NET, the SDK for Ruby, and an extension to the SDK for JavaScript in the Browser) makes it easy to map your JSON data or native-language objects onto DynamoDB's native data types, and supports queries based on the structure of your document. You can also view and edit JSON documents from within the AWS Management Console.
With this addition, DynamoDB becomes a full-fledged document store. Using the AWS SDKs, it is easy to store JSON documents in a DynamoDB table while preserving their complex and possibly nested "shape." The new data types could also be used to store other structured formats such as HTML or XML by building a very thin translation layer.
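For instance, with the boto3 resource API a nested native object maps directly onto DynamoDB's Map and List types; the table and document below are hypothetical:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("documents")  # hypothetical table

# A nested Python dict maps onto DynamoDB's Map and List types,
# preserving the document's "shape" (subject to the 400 KB item limit).
table.put_item(Item={
    "doc_id": "order-1001",
    "customer": {"name": "Jane", "tier": "gold"},
    "items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 1},
    ],
})

doc = table.get_item(Key={"doc_id": "order-1001"})["Item"]
print(doc["customer"]["name"])  # the nested structure round-trips intact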
Question : You are deploying an application to track the GPS coordinates of delivery trucks in the United States. Coordinates are transmitted from each delivery truck once every three seconds. You need to design an architecture that will enable real-time processing of these coordinates from multiple consumers. Which service should you use to implement data ingestion?
1. Amazon Kinesis
2. AWS Data Pipeline
3.
4. Amazon Simple Queue Service
Correct Answer : 1 Explanation: What Is Amazon Kinesis? Use Amazon Kinesis to collect and process large streams of data records in real time. You'll create data-processing applications, known as Amazon Kinesis applications. A typical Amazon Kinesis application takes data from data generators called producers and puts it into an Amazon Kinesis stream as data records. These applications can use the Amazon Kinesis Client Library, and they can run on Amazon EC2 instances. The processed records can be sent to dashboards, used to generate alerts, used to dynamically change pricing and advertising strategies, or sent to a variety of other AWS services.
What Can I Do with Amazon Kinesis? You can use Amazon Kinesis for rapid and continuous data intake and aggregation. The type of data used includes IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data. Because the response time for the data intake and processing is in real time, the processing is typically lightweight. The following are typical scenarios for using Amazon Kinesis:
Accelerated log and data feed intake and processing: You can have producers push data directly into a stream. For example, push system and application logs and they'll be available for processing in seconds. This prevents the log data from being lost if the front end or application server fails. Amazon Kinesis provides accelerated data feed intake because you don't batch the data on the servers before you submit it for intake.
Real-time metrics and reporting: You can use data collected into Amazon Kinesis for simple data analysis and reporting in real time. For example, your data-processing application can work on metrics and reporting for system and application logs as the data is streaming in, rather than wait to receive batches of data.
Real-time data analytics: This combines the power of parallel processing with the value of real-time data. For example, process website clickstreams in real time, and then analyze site usability engagement using multiple different Amazon Kinesis applications running in parallel.
Complex stream processing: You can create Directed Acyclic Graphs (DAGs) of Amazon Kinesis applications and data streams. This typically involves putting data from multiple Amazon Kinesis applications into another stream for downstream processing by a different Amazon Kinesis application.
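Applied to the delivery-truck question above, each truck acts as a producer pushing one record every three seconds. A minimal producer sketch, with a hypothetical stream name and record layout:

```python
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

# Hypothetical producer: one GPS reading per truck every three seconds.
record = {"truck_id": "TRK-042", "lat": 40.7128, "lon": -74.0060,
          "ts": "2015-06-01T12:00:03Z"}

kinesis.put_record(
    StreamName="truck-gps-stream",           # placeholder stream name
    Data=json.dumps(record).encode("utf-8"),
    PartitionKey=record["truck_id"],  # keeps one truck's points in order on a shard
)
```

Partitioning on the truck ID means each truck's coordinates land on the same shard in order, while multiple consumer applications can read the stream in parallel.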
Question : You work at a credit card company that continuously processes credit card transactions and spending data, and also uses this data for analytics such as fraud detection. Given how critical and sensitive this information is, which steps would you take so that the data stored in RDS is secured?
A. Put the RDS instance in a private subnet and configure security groups and network access control lists so that only permitted ports and protocols can reach the data.
B. Create proper grants on the tables in the database.
C. Create IAM policies so that only permitted users can access the RDS instances.
D. Install anti-malware on the RDS instance with the help of AWS support.
E. Always use a VPN connection to access these RDS instances.