Showing posts with label AWS Blog.

Tuesday 23 October 2012

AWS Week in Review - October 8th to October 14th, 2012

 


Let's take a quick look at what happened in AWS-land last week:

Monday, October 8
Tuesday, October 9
Wednesday, October 10
Thursday, October 11
Friday, October 12
Sunday, October 14

SOURCE

AWS Week in Review - October 15th to October 21st, 2012

Wednesday 3 October 2012

Get Started With Oracle Applications Now With Our New Test Drive Program


AWS has just launched Oracle Test Drive Labs.
The purpose of the Oracle Test Drive program is to provide customers with the ability to quickly and easily explore the benefits of using Oracle software on AWS server infrastructure.
These labs have been developed by Oracle and AWS partners and are provided free of charge for educational and demonstration purposes.
Each Test Drive lab includes up to 5 hours of complimentary AWS server time to complete the lab, and you can return to try any or all of the Test Drive labs at any time, so feel free to experiment and explore!
Please note that there may be prerequisites for some labs. Be sure to understand them and acquire the required accounts or software before proceeding with the labs.


For example, Oracle Secure Backup to S3 requires an Oracle Technology Network (OTN) account.
The labs cover Oracle Database and Infrastructure products, Oracle Applications, and Oracle Fusion Middleware.
You can select from nearly a dozen labs, which include but are not limited to:


  • Oracle Data Guard Disaster Recovery
  • Oracle Secure Backup to S3
  • Siebel on AWS


Read the extract from Jeff's post below for a sample demo of backing up an Oracle database to AWS using the Oracle Secure Backup product.


One of the key advantages that customers and partners are telling us they really appreciate about AWS is its unique ability to cut down the time required to evaluate new software stacks. These "solution appliances" can now be easily deployed on AWS and evaluated by customers in hours or days, rather than in weeks or months, as is the norm with the previous generation of IT infrastructure.
With this in mind, AWS has teamed up with leading Oracle ecosystem partners on a new initiative called the Oracle Test Drive program.







Amazon RDS - Now Available in the AWS Free Usage Tier

Good News!!!

AWS has announced support for Amazon RDS in the AWS Free Usage Tier. Prior to this, only Amazon RDS for SQL Server was included in the Free Tier:
 

AWS RDS Free Tier (earlier)

 
But starting today, the Free Tier covers all Amazon RDS database engines:


AWS RDS Free Tier (starting October 1st, 2012)



NOTE:
* These free tiers are only available to new AWS customers, and are available for 12 months following your AWS sign-up date. When your free usage expires or if your application use exceeds the free usage tiers, you simply pay standard, pay-as-you-go service rates (see each service page for full pricing details). Restrictions apply; see offer terms for more details.

The Amazon RDS free tier is currently not available in a VPC.
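To make this concrete, here is a minimal sketch of launching a Free Tier-eligible Micro DB Instance using the boto3 Python SDK (a later SDK than the tools available when this was written); the identifier, password, and class/engine values are placeholders, so verify them against the current RDS documentation.

    import boto3

    # Assumes AWS credentials are already configured; all values below are placeholders.
    rds = boto3.client("rds", region_name="us-east-1")

    rds.create_db_instance(
        DBInstanceIdentifier="free-tier-demo",
        Engine="mysql",                   # MySQL, Oracle (BYOL), and SQL Server are Free Tier eligible
        DBInstanceClass="db.t1.micro",    # Micro DB Instance class covered by the Free Tier at the time
        AllocatedStorage=20,              # 20 GB of database storage, matching the Free Tier allowance
        MasterUsername="admin",
        MasterUserPassword="change-me-please",
        BackupRetentionPeriod=1,          # automated backups count toward the 20 GB backup allowance
    )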


Read the extract below from Jeff's post on the AWS Blog:

[...]  Our customers appreciate the fact that they can launch DB Instances on demand at very affordable hourly rates, with the option to purchase Reserved DB Instances to reduce their costs even more. Here's a video (featuring Biswaroop Palit of the Amazon RDS team) with more information about what RDS is and how it will simplify your life:



 
We are now adding RDS to the AWS Free Usage Tier. New AWS customers (see the AWS Free Usage Tier FAQ for eligibility details) can use the MySQL, Oracle (BYOL licensing model), or SQL Server database engines on a Micro DB Instance for up to 750 hours per month, along with 20 GB of database storage, 10 million I/Os and 20 GB of backup storage. When you combine this new capability with the existing EC2 usage available on the Free Usage Tier, you may be able to build and run a complete multi-tiered web application without spending a penny. Here's another video with more information on this important new development:
 


 
AWS WEBINAR
 
To help you get the most from Amazon RDS for Oracle, we'll be hosting a free RDS webinar at 10:00 AM (PT) on October 18th.
 
Attend the webinar to learn how RDS lets you focus on your business by addressing the key pain points that come with Oracle database administration.
 
 

Friday 28 September 2012

Amazon RDS Now Supports SQL Server 2012

Want to try SQL Server 2012? There's no need to invest in hardware or software. Amazon RDS now supports SQL Server 2012 with an easy-to-use interface and very affordable prices.

With added support for Microsoft SQL Server 2012, Amazon RDS customers can use the new features Microsoft has introduced as part of SQL Server 2012 including improvements to manageability, performance, programmability, and security.

Read the extract below from Jeff's blog post on the announcement:
The Amazon Relational Database Service (RDS) now supports SQL Server 2012. You can now launch the Express, Web, and Standard Editions of this powerful database from the comfort of the AWS Management Console. SQL Server 2008 R2 is still available, as are multiple versions and editions of MySQL and Oracle Database.

If you are from the Microsoft world and haven't heard of RDS, here's the executive summary:

You can run the latest and greatest offering from Microsoft in a fully managed environment. RDS will install and patch the database, make backups, and detect and recover from failures. It will also provide you with a point-and-click environment to make it easy for you to scale your compute resources up and down as needed.

What's New?

SQL Server 2012 supports a number of new features including contained databases, columnstore indexes, sequences, and user-defined roles:

  • A contained database is isolated from other SQL Server databases including system databases such as "master." This isolation removes dependencies and simplifies the task of moving databases from one instance of SQL Server to another.
  • Columnstore indexes are used for data warehouse style queries. Used properly, they can greatly reduce memory consumption and I/O requests for large queries.
  • Sequences are counters that can be used in more than one table.
  • The new user-defined role management system allows users to create custom server roles.

Read the SQL Server What's New documentation to learn more about these and other features.

You can launch SQL Server 2012 from the AWS Management Console. First you select the edition that best meets your needs:




Then you fill in the details (SQL Server 2012 is version 11), and your DB Instance will be launched in a matter of minutes:
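For readers who prefer an API call to the console, here is a hedged boto3 sketch of the same launch; boto3 arrived after this post, and the engine and version strings below are illustrative rather than authoritative, so confirm them against the RDS documentation.

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # Placeholder values throughout; "sqlserver-ex" denotes the Express Edition engine,
    # and the SQL Server 2012 (version 11) EngineVersion string shown here is only illustrative.
    rds.create_db_instance(
        DBInstanceIdentifier="sqlserver-2012-trial",
        Engine="sqlserver-ex",
        EngineVersion="11.00.2100.60.v1",
        DBInstanceClass="db.t1.micro",
        AllocatedStorage=30,
        MasterUsername="admin",
        MasterUserPassword="change-me-please",
        LicenseModel="license-included",
    )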




Yes, This is Cool!

You can now get started with SQL Server 2012 without having to invest in hardware or buying a license. If you are eligible for the AWS Free Usage Tier,  you can get started without spending a penny. You can launch a DB Instance, evaluate the product, do a trial migration of your data, and learn all about the new features at minimal cost. When the time comes to move your organization to SQL Server 2012, you'll already have experience using it in a real-world environment. 


For more information on what’s new in SQL Server 2012, please visit Microsoft’s SQL Server 2012 MSDN documentation.


To learn more about using RDS for SQL Server 2012, please visit the Amazon RDS for SQL Server detail page, AWS documentation and FAQs.

Thursday 27 September 2012

AWS Announcement : High Performance Provisioned IOPS Storage For Amazon RDS

 
Following the recent announcement of the EBS Provisioned IOPS offering, which lets you specify both volume size and volume performance in terms of I/O operations per second (IOPS), AWS has now announced High Performance Provisioned IOPS Storage for Amazon RDS.
 
You can now create an RDS database instance and specify your desired level of IOPS in order to get more consistent throughput and performance.
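As an illustration of what that looks like through an API (a boto3 sketch; the console exposes the same settings, and all values below are placeholders), the desired IOPS is specified alongside the storage size when the instance is created:

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # 100 GB of storage provisioned with 1,000 IOPS; MySQL and Oracle supported
    # 1,000-10,000 IOPS at launch, while SQL Server topped out at 7,000.
    rds.create_db_instance(
        DBInstanceIdentifier="oltp-piops-demo",
        Engine="mysql",
        DBInstanceClass="db.m1.large",    # placeholder instance class
        AllocatedStorage=100,             # GB
        Iops=1000,                        # desired provisioned IOPS
        MultiAZ=True,                     # recommended for mission-critical OLTP workloads
        MasterUsername="admin",
        MasterUserPassword="change-me-please",
    )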

Amazon RDS Provisioned IOPS is immediately available for new database instances in the US East (N. Virginia), US West (N. California), and EU West (Ireland) Regions, and AWS plans to launch it in other AWS Regions in the coming months.
 
AWS is rolling this out in two phases. Read more in the extract below from Jeff's announcement on the AWS Blog.
 
 
    
We are rolling this out in two stages. Here's the plan:
 
  • Effective immediately, you can provision new RDS database instances with 1,000 to 10,000 IOPS, and with 100GB to 1 TB of storage for MySQL and Oracle databases. If you are using SQL Server, the maximum IOPS you can provision is 7,000 IOPS. All other RDS features including Multi-AZ, Read Replicas, and the Virtual Private Cloud, are also supported.
  • In the near future, we plan to provide you with an automated way to migrate existing database instances to Provisioned IOPS storage for the MySQL and Oracle database engines. If you want to migrate an existing database instance to Provisioned IOPS storage immediately, you can export your data and re-import it into a new database instance equipped with Provisioned IOPS storage.

We expect database instances with RDS Provisioned IOPS to be used in demanding situations. For example, they are a perfect host for I/O-intensive transactional (OLTP) workloads.
We recommend that customers running production database workloads use Amazon RDS Provisioned IOPS for the best possible performance. (By the way, for mission critical OLTP workloads, you should also consider adding the Amazon RDS Multi-AZ option to improve availability.)


Check out the video with Rahul Pathak of the Amazon RDS team to learn more about this new feature and how some AWS customers are using it:




Responses from AWS customers:

  • AWS customer Flipboard uses RDS to deliver billions of page flips each month to millions of mobile phone and tablet users. Sang Chi, Data Infrastructure Architect at Flipboard told us:
"We want to provide the best possible reading and content delivery experience for a rapidly growing base of users and publishers. This requires us not only to use a high performance database today but also to continue to improve our performance in the future. Throughput consistency is critical for our workloads. Based on results from our early testing, we are very excited about Amazon RDS Provisioned IOPS and the impact it will have on our ability to scale. We’re looking forward to scaling our database applications to tens of thousands of IOPS and achieving consistent throughput to improve the experience for our users."
  • AWS customer Shine Technologies uses RDS for Oracle to build complex solutions for enterprise customers. Adam Kierce, their Director said:
"Amazon RDS Provisioned IOPS provided a turbo-boost to our enterprise class database-backed applications. In the past, we have invested hundreds of days in time consuming and costly code based performance tuning, but with Amazon RDS Provisioned IOPS we were able to exceed those performance gains in a single day. We have demanding clients in the Energy, Telecommunication, Finance and Retail industries, and we fully expect to move all our Oracle backed products onto AWS using Amazon RDS for Oracle over the next 12 months. The increased performance of Amazon's RDS for Oracle with Provision IOPS is an absolute game changer, because it delivers more (performance) for less (cost)."
 

Tuesday 18 September 2012

Amazon VPC - New Additions

 
AWS has added three new features/options to the Amazon Virtual Private Cloud (VPC) service.
 
 
Below are extracts from the two blog posts written by Jeff on these additions:
 
 
The Amazon Virtual Private Cloud (VPC) gives you the power to create a private, isolated section of the AWS Cloud. You have full control of network addressing. Each of your VPCs can include subnets (with access control lists), route tables, and gateways to your existing network and to the Internet.
 
You can connect your VPC to the Internet via an Internet Gateway and enjoy all the flexibility of Amazon EC2 with the added benefits of Amazon VPC. You can also setup an IPsec VPN connection to your VPC, extending your corporate data center into the AWS Cloud. Today we are adding two options to give you additional VPN connection flexibility:
  1. You can now create Hardware VPN connections to your VPC using static routing. This means that you can establish connectivity using VPN devices that do not support BGP such as Cisco ASA and Microsoft Windows Server 2008 R2. You can also use Linux to establish a Hardware VPN connection to your VPC. In fact, any IPSec VPN implementation should work.
  2. You can now configure automatic propagation of routes from your VPN and Direct Connect links (gateways) to your VPC's routing tables. This will make your life easier as you won’t need to create static route entries in your VPC route table for your VPN connections. For instance, if you’re using dynamically routed (BGP) VPN connections, your BGP route advertisements from your home network can be automatically propagated into your VPC routing table.
If your VPN hardware is capable of supporting BGP, this is still the preferred way to go as BGP performs a robust liveness check on the IPSec tunnel. Each VPN connection uses two tunnels for redundancy; BGP simplifies the failover procedure that is invoked when one VPN tunnel goes down.
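To make the two additions concrete, here is a hedged boto3 sketch (the SDK post-dates this announcement, and every ID is a placeholder) of creating a statically routed Hardware VPN connection and then enabling route propagation into a VPC route table:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # 1. Hardware VPN connection using static routing (no BGP required on the customer device).
    vpn = ec2.create_vpn_connection(
        Type="ipsec.1",
        CustomerGatewayId="cgw-12345678",      # placeholder customer gateway
        VpnGatewayId="vgw-12345678",           # placeholder virtual private gateway
        Options={"StaticRoutesOnly": True},
    )
    ec2.create_vpn_connection_route(
        VpnConnectionId=vpn["VpnConnection"]["VpnConnectionId"],
        DestinationCidrBlock="10.0.0.0/16",    # on-premises network reachable over the tunnel
    )

    # 2. Automatic propagation of routes from the gateway into a VPC route table.
    ec2.enable_vgw_route_propagation(
        RouteTableId="rtb-12345678",           # placeholder VPC route table
        GatewayId="vgw-12345678",
    )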

Saturday 15 September 2012

Amazon RDS News - Oracle Data Pump

The Amazon RDS team is rolling out new features at a very rapid clip.
The most-awaited feature, Oracle Data Pump, is finally here.

Extract from blog post by Jeff:
Customers have asked us to make it easier to import their existing databases into Amazon RDS. We are making it easy for you to move data on and off of the DB Instances by using Oracle Data Pump. A number of scenarios are supported including:
  • Transfer between an on-premises Oracle database and an RDS DB Instance.
  • Transfer between an Oracle database running on an EC2 instance and an RDS DB Instance.
  • Transfer between two RDS DB Instances.
These transfers can be run in either direction. We currently support the network mode of Data Pump where the job source is an Oracle database. Transfers using Data Pump should be considerably faster than those using the original Import and Export utilities. Oracle Data Pump is available on all new DB Instances running Oracle Database 11.2.0.2.v5. To use Data Pump with your existing v3 and v4 instances, please upgrade to v5 by following the directions in the RDS User Guide. To learn more about importing and exporting data from your Oracle databases, check out our new import/export guide.

For those who are not aware of what Oracle Data Pump is:

Oracle Data Pump is a feature of Oracle Database 11g Release 2 that enables very fast bulk data and metadata movement between Oracle databases. Oracle Data Pump provides new high-speed, parallel Export and Import utilities (expdp and impdp) as well as a Web-based Oracle Enterprise Manager interface.

  • Data Pump Export and Import utilities are typically much faster than the original Export and Import utilities. A single thread of Data Pump Export is about twice as fast as original Export, while Data Pump Import is 15-45 times faster than original Import.
  • Data Pump jobs can be restarted without loss of data, whether the stoppage was voluntary or involuntary.
  • Data Pump jobs support fine-grained object selection. Virtually any type of object can be included or excluded in a Data Pump job.
  • Data Pump supports the ability to load one instance directly from another (network import) and unload a remote instance (network export).

Friday 14 September 2012

Amazon EC2 Reserved Instance Marketplace

A superbly detailed blog post by Jeff on the Amazon EC2 Reserved Instance Marketplace.

No more words need to be added....


EC2 Options
I often tell people that cloud computing is equal parts technology and business model. Amazon EC2 is a good example of this; you have three options to choose from:
  • You can use On-Demand Instances, where you pay for compute capacity by the hour, with no upfront fees or long-term commitments. On-Demand instances are recommended for situations where you don't know how much (if any) compute capacity you will need at a given time.
  • If you know that you will need a certain amount of capacity, you can buy an EC2 Reserved Instance. You make a low, one-time upfront payment, reserve it for a one or three year term, and pay a significantly lower hourly rate. You can choose between Light Utilization, Medium Utilization, and Heavy Utilization Reserved Instances to further align your costs with your usage.
  • You can also bid for unused EC2 capacity on the Spot Market with a maximum hourly price you are willing to pay for a particular instance type in the Region and Availability Zone of your choice. When the current Spot Price for the desired instance type is at or below the price you set, your application will run.
Reserved Instance Marketplace
Today we are increasing the flexibility of the EC2 Reserved Instance model even more with the introduction of the Reserved Instance Marketplace. If you have excess capacity, you can list it on the marketplace and sell it to someone who needs additional capacity. If you need additional capacity, you can compare the upfront prices and durations of Reserved Instances on the marketplace to the upfront prices of one and three year Reserved Instances available directly from AWS. The Reserved Instances in the Marketplace are functionally identical to other Reserved Instances and have the then-current hourly rates; they will just have less than a full term and a different upfront price.
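From the buyer's perspective, the comparison can also be scripted. The following is a hedged boto3 sketch (placeholder parameters) that lists offerings with marketplace listings included, so upfront price and remaining duration can be compared against AWS's own one- and three-year terms:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    offerings = ec2.describe_reserved_instances_offerings(
        InstanceType="m1.large",          # placeholder instance type
        ProductDescription="Linux/UNIX",
        IncludeMarketplace=True,          # include third-party Reserved Instance Marketplace listings
    )

    for offer in offerings["ReservedInstancesOfferings"]:
        print(
            offer["ReservedInstancesOfferingId"],
            offer["Duration"],            # remaining term, in seconds
            offer["FixedPrice"],          # upfront price
            offer.get("Marketplace"),     # True for marketplace listings
        )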


AWS Expands in Japan

Amazon Web Services (AWS) is expanding in Japan with the addition of a third Availability Zone.
The move means that AWS will most likely be adding more data centers to keep up with the steady demand for its services since it first launched in Tokyo 18 months ago.

For people who are not aware of AWS Regions and Availability Zones:

Amazon Web Services serves hundreds of thousands of customers in more than 190 countries.
Currently, AWS spans 8 regions around the globe.
Each region has multiple availability zones.
Each availability zone can encompass multiple data centers.

See a detailed list of offerings at all AWS locations

Extracted below is a nice blog post by Jeff:

We announced an AWS Region in Tokyo about 18 months ago. In the time since the launch, our customers have launched all sorts of interesting applications and businesses there. Here are a few examples:
    • Cookpad.com is the top recipe site in Japan. They are hosted entirely on AWS, and handle more than 15 million users per month.
    • KAO is one of Japan's largest manufacturers of cosmetics and toiletries. They recently migrated their corporate site to the AWS cloud.
    • Fukuoka City launched the Kawaii Ward project to promote tourism to the virtual city. After a member of the popular Japanese idol group AKB48 raised awareness of this site, virtual residents flocked to the site to sign up for an email newsletter. They expected 10,000 registrations in the first week and were pleasantly surprised to receive over 20,000.
Demand for AWS resources in Japan has been strong and steady, and we've been expanding the region accordingly. You might find it interesting to know that an AWS region can be expanded in two different ways. First, we can add additional capacity to an existing Availability Zone, spanning multiple datacenters if necessary. Second, we can create an entirely new Availability Zone.
Over time, as we combine both of these approaches, a single AWS region can grow to encompass many datacenters. For example, the US East (Northern Virginia) region currently occupies more than ten datacenters structured as multiple Availability Zones.
 
AWS Tokyo Region and Availability Zones
 
Today, we are expanding the Tokyo region with the addition of a third Availability Zone. 
This will add capacity and will also provide you with additional flexibility. As is always the case with AWS, untargeted launches of EC2 instances will now make use of this zone with no changes to existing applications or configurations. If you are currently targeting specific Availability Zones, please make sure that your code can handle this new option.
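One simple way to honor that advice is to discover zones at run time instead of hard-coding them. A small boto3 sketch (assumed credentials; the SDK post-dates this post):

    import boto3

    # Tokyo region; enumerate the Availability Zones visible to this account so that
    # a newly added zone (such as the third Tokyo zone) is picked up automatically.
    ec2 = boto3.client("ec2", region_name="ap-northeast-1")

    for az in ec2.describe_availability_zones()["AvailabilityZones"]:
        print(az["ZoneName"], az["State"])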

Tuesday 28 August 2012

AWS Cost Allocation For Customer Bills

A good new feature from AWS to help customers keep control over costs, and a well-written blog post by Jeff...



Growth Challenges


You probably know how it goes when you put AWS to work for your company. You start small -- one Amazon S3 bucket for some backups, or one Amazon EC2 instance hosting a single web site or web application. Things work out well and before you know it, word of your success spreads to your team, and they start using it too. At some point the entire company jumps on board, and you become yet another AWS success story.

As your usage of AWS grows, you stop charging it to your personal credit card and create an account for your company. You use IAM to control access to the AWS resources created and referenced by each of the applications.

There's just one catch -- with all of those departments, developers, and applications making use of AWS from a single account, allocating costs to projects and to budgets is difficult because we didn't give you the necessary information. Some of our customers have told us that this cost allocation process can consume several hours of their time each month.

Cost Allocation Via Tagging


Extending the existing EC2 tagging system (keys and values), we are launching a new cost allocation system to make it easy for you to tag your AWS resources and to access billing data that is broken down by tag (or tags).

With this release you can tag the following types of AWS resources for cost allocation purposes:
  • S3 buckets
  • EC2 Instances
  • EBS volumes
  • Reserved Instances
  • Spot Instance requests
  • VPN connections
  • Amazon RDS DB Instances
  • AWS CloudFormation Stacks
Here's all that you need to do:
  1. Decide on Your Tagging Model - Typically, the key name identifies some axis that you care about and the key values identify the points along the axis. You could have a tag named Department, with values like Sales, Marketing, Development, QA, Engineering, and so forth. You could choose to align this with your existing accounting system. You can use multiple tags for cost allocation purposes, each of which represents an additional dimension of usage. If each department runs several AWS-powered applications (or stores lots of data in S3), you could add an Application tag, with the values representing all of the applications that are running on behalf of the department. You can use the tags to create your own custom hierarchy.
  2. Tag Your Resources - Apply the agreed-upon tags to your existing resources (a brief tagging sketch follows this list), and arrange to apply them to newly created resources as they appear. You can add up to ten tags per resource. You can do this from the AWS Management Console, the service APIs, the command line, or through Auto Scaling:

    AWS Cost Allocation For Customer Bills
    You can use CloudFormation to provision a set of related AWS resources and easily tag them.
  3. Tell AWS Which Tags Matter - Now you need to log in to the AWS Portal, sign up for billing reports, and tell the AWS billing system which tag keys are meaningful for cost allocation purposes by using the Manage Cost Allocation Report option:

    AWS Cost Allocation For Customer Bills - Manage Report
    AWS Cost Allocation For Customer Bills - Select Tags
    You can choose to include certain tags and to exclude others.
  4. Access Billing Data - The estimated billing data is generated multiple times per day and the month-end charges are generated within three days of the end of the month. You can access this data by enabling programmatic access and arranging for it to be delivered to your S3 bucket.
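Step 2 in practice: a minimal, hedged boto3 sketch of applying Department and Application tags to an EC2 instance, an EBS volume, and an S3 bucket; all IDs, bucket names, and tag values are placeholders.

    import boto3

    # Tag an EC2 instance and an EBS volume for cost allocation purposes.
    ec2 = boto3.client("ec2", region_name="us-east-1")
    ec2.create_tags(
        Resources=["i-0123456789abcdef0", "vol-0123456789abcdef0"],   # placeholder IDs
        Tags=[
            {"Key": "Department", "Value": "Marketing"},
            {"Key": "Application", "Value": "newsletter"},
        ],
    )

    # Tag an S3 bucket with the same keys; put_bucket_tagging replaces the bucket's tag set.
    s3 = boto3.client("s3")
    s3.put_bucket_tagging(
        Bucket="example-backup-bucket",
        Tagging={
            "TagSet": [
                {"Key": "Department", "Value": "Marketing"},
                {"Key": "Application", "Value": "newsletter"},
            ]
        },
    )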

Data Processing


The Cost Allocation Report will contain one additional column for each of the tag keys that you selected in step 3. The corresponding tag value (if any) will be included in the appropriate column of the data:

AWS Cost Allocation For Customer Bills
In the Cost Allocation Report above, the relevant keys were Owner, Stack, Cost Center, Application, and Project. The column will be blank if the AWS resource doesn't happen to have a value for the key. Data transfer and request charges are also included for tagged resources. In effect, these charges inherit the tags from the associated resource.

Once you have this data, you can feed it in to your own accounting system or you can slice and dice it any way you'd like for reporting or visualization purposes. For example, you could create a pivot table and aggregate the data along one or more dimensions:
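For instance, with Python and pandas (a sketch only; the column names in your report, including how tag columns are prefixed, may differ from the illustrative ones used here), the report can be pivoted by tag:

    import pandas as pd

    # Load a copy of the cost allocation report that was delivered to your S3 bucket.
    # Column names below are illustrative placeholders.
    report = pd.read_csv("aws-cost-allocation-2012-10.csv")

    # Aggregate cost across two tag dimensions, e.g. Department by Application.
    pivot = report.pivot_table(
        values="TotalCost",
        index="Department",
        columns="Application",
        aggfunc="sum",
    )
    print(pivot)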

AWS Cost Allocation For Customer Bills
 

 
 

Friday 24 August 2012

AWS New Whitepaper: Mapping and GeoSpatial Analysis in the Cloud Using ArcGIS

Great new whitepaper by Jinesh Varia...


Esri is one of the leaders in the Geographic Information Systems (GIS) industry and one of the largest privately held software companies focused on mapping and geospatial applications in the world with offices in more than 100 countries. Both public and private sector organizations use Esri technology to analyze and manage their geographic information and make better decisions – uses range from planning cities and improving the quality of life for residents, to site selection, customer analytics, and streamlining logistics.

Esri and AWS have been working together since 2008 to bring the power of GIS to the masses. The AWS Partner Team recently attended the 2012 Esri International User Conference with 14,000+ attendees, 300 exhibitors and a large number of ecosystem partners. A cloud computing theme dominated the conference.
Esri and AWS have co-authored a whitepaper, "Mapping and GeoSpatial Analysis Using ArcGIS", to provide users who have interest in performing spatial analysis using their data with complimentary datasets.

The paper discusses how users can publish and analyze imagery data (such as satellite imagery, or aerial imagery) and create and publish tile cache map services from spatially referenced data (such as data with x/y points, lines, polygons) in AWS using ArcGIS.

Download PDF: Mapping and GeoSpatial Analysis Using ArcGIS

The paper focuses on imagery because that has been the most challenging data type to manage in the cloud, but the approaches discussed are general enough to apply to any type of data.

It not only provides architecture guidance on how to scale ArcGIS servers in the cloud but also provides step-by-step guidance on publishing map services in the cloud.

For more information on GeoApps in the AWS Cloud, see the presentation "The Cloud as a Platform for Geo" below:
GeoApps in the AWS Cloud - Jinesh Varia from Amazon Web Services


SOURCE
 

Tuesday 21 August 2012

Amazon CloudSearch - Start Searching in One Hour for Less Than $100 / Month


Extract from Amazon Web Services Evangelist Jeff Barr's CloudSearch blog post for more information about how you can start searching in an hour for less than $100 a month...

Continuing along in our quest to give you the tools that you need to build ridiculously powerful web sites and applications in no time flat at the lowest possible cost, I'd like to introduce you to Amazon CloudSearch. If you have ever searched Amazon.com, you've already used the technology that underlies CloudSearch. You can now have a very powerful and scalable search system (indexing and retrieval) up and running in less than an hour.

You, sitting in your corporate cubicle, your coffee shop, or your dorm room, now have access to search technology at a very affordable price. You can start to take advantage of many years of Amazon R&D in the search space for just $0.12 per hour (I'll talk about pricing in depth later).


What is Search?

Search plays a major role in many web sites and other types of online applications. The basic model is seemingly simple. Think of your set of documents or your data collection as a book or a catalog, composed of a number of pages. You know that you can find the desired content quickly and efficiently by simply consulting the index.

Search does the same thing by indexing each document in a way that facilitates rapid retrieval. You enter some terms into a search box and the site responds (rather quickly if you use CloudSearch) with a list of pages that match the search terms.

As is the case with many things, this simple model masks a lot of complexity and might raise a lot of questions in your mind. For example:
  1. How efficient is the search? Did the search engine simply iterate through every page, looking for matches, or is there some sort of index?
  2. The search results were returned in the form of an ordered list. What factor(s) determined which documents were returned, and in what order (commonly known as ranking)? How are the results grouped?
  3. How forgiving or expansive was the search? Did a search for "dogs" return results for "dog?" Did it return results for "golden retriever," or "pet?"
  4. What kinds of complex searches or queries can be used? Does a search for "dog training" return the expected results? Can you search for "dog" in the Title field and "training" in the Description?
  5. How scalable is the search? What if there are millions or billions of pages? What if there are thousands of searches per hour? Is there enough storage space?
  6. What happens when new pages are added to the collection, or old pages are removed? How does this affect the search results?
  7. How can you efficiently navigate through and explore search results? Can you group and filter the search results in ways that take advantage of multiple named fields (often known as a faceted search)?
Needless to say, things can get very complex very quickly. Even if you can write code to do some or all of this yourself, you still need to worry about the operational aspects. We know that scaling a search system is non-trivial. There are lots of moving parts, all of which must be designed, implemented, instantiated, scaled, monitored, and maintained. As you scale, algorithmic complexity often comes in to play; you soon learn that algorithms and techniques which were practical at the beginning aren't always practical at scale.


What is Amazon CloudSearch?

Amazon CloudSearch is a fully managed search service in the cloud. You can set it up and start processing queries in less than an hour, with automatic scaling for data and search traffic, all for less than $100 per month.

CloudSearch hides all of the complexity and all of the search infrastructure from you. You simply provide it with a set of documents and decide how you would like to incorporate search into your application.

You don't have to write your own indexing, query parsing, query processing, results handling, or any of that other stuff. You don't need to worry about running out of disk space or processing power, and you don't need to keep rewriting your code to add more features.

With CloudSearch, you can focus on your application layer. You upload your documents, CloudSearch indexes them, and you can build a search experience that is custom-tailored to the needs of your customers.


How Does it Work?

The Amazon CloudSearch model is really simple, but don't confuse simple with simplistic -- there's a lot going on behind the scenes!

Here's all you need to do to get started (you can perform these operations from the AWS Management Console, the CloudSearch command line tools, or through the CloudSearch APIs):
  1. Create and configure a Search Domain. This is a data container and a related set of services. It exists within a particular Availability Zone of a single AWS Region (initially US East).
  2. Upload your documents. Documents can be uploaded as JSON or XML that conforms to our Search Document Format (SDF). Uploaded documents will typically be searchable within seconds. You can, if you'd like, send data over an HTTPS connection to protect it while it is in transit.
  3. Perform searches.
There are plenty of options and goodies, but that's all it takes to get started.
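Here is what those three steps can look like from code. This is a hedged sketch using boto3, a later SDK whose CloudSearch client follows the 2013 API rather than the original SDF-based one described above, so treat the domain name, index field, endpoints, and document fields as placeholders:

    import json
    import boto3

    # 1. Create and configure a search domain (names are placeholders).
    cs = boto3.client("cloudsearch", region_name="us-east-1")
    cs.create_domain(DomainName="movies")
    cs.define_index_field(
        DomainName="movies",
        IndexField={"IndexFieldName": "title", "IndexFieldType": "text"},
    )

    # 2. Upload documents as a JSON batch to the domain's document endpoint.
    doc_client = boto3.client(
        "cloudsearchdomain",
        endpoint_url="https://doc-movies-example.us-east-1.cloudsearch.amazonaws.com",  # placeholder
    )
    batch = [{"type": "add", "id": "tt0110912", "fields": {"title": "Pulp Fiction"}}]
    doc_client.upload_documents(documents=json.dumps(batch), contentType="application/json")

    # 3. Perform searches against the domain's search endpoint.
    search_client = boto3.client(
        "cloudsearchdomain",
        endpoint_url="https://search-movies-example.us-east-1.cloudsearch.amazonaws.com",  # placeholder
    )
    results = search_client.search(query="pulp")
    print(results["hits"]["found"])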

Amazon CloudSearch applies data updates continuously, so newly changed data becomes searchable in near real-time. Your index is stored in RAM to keep throughput high and to speed up document updates. You can also tell CloudSearch to re-index your documents; you'll need to do this after changing certain configuration options, such as stemming (converting variations of a word to a base word, such as "dogs" to "dog") or stop words (very common words that you don't want to index).
Amazon CloudSearch has a number of advanced search capabilities including faceting and fielded search:

Faceting allows you to categorize your results into sub-groups, which can be used as the basis for another search. You could search for "umbrellas" and use a facet to group the results by price, such as $1-$10, $10-$20, $20-$50, and so forth. CloudSearch will even return document counts for each sub-group.
Fielded searching allows you to search on a particular attribute of a document. You could locate movies in a particular genre or actor, or products within a certain price range.
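Here is a hedged sketch of both features, reusing the placeholder search endpoint and boto3 client from the earlier sketch; the price, title, and description fields are assumptions rather than fields from a real domain:

    import boto3

    search_client = boto3.client(
        "cloudsearchdomain",
        endpoint_url="https://search-movies-example.us-east-1.cloudsearch.amazonaws.com",  # placeholder
    )

    # Faceting: bucket umbrella results by price range and return counts per bucket.
    faceted = search_client.search(
        query="umbrellas",
        facet='{"price": {"buckets": ["[1,10]", "[10,20]", "[20,50]"]}}',
    )
    print(faceted["facets"])

    # Fielded search: "dog" in the title field and "training" in the description field.
    fielded = search_client.search(
        query="(and title:'dog' description:'training')",
        queryParser="structured",
    )
    for hit in fielded["hits"]["hit"]:
        print(hit["id"])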

 
Search Scaling
Behind the scenes, CloudSearch stores data and processes searches using search instances. Each instance has a finite amount of CPU power and RAM. As your data expands, CloudSearch will automatically launch additional search instances and/or scale to larger instance types. As your search traffic expands beyond the capacity of a single instance, CloudSearch will automatically launch additional instances and replicate the data to the new instance. If you have a lot of data and a high request rate, CloudSearch will automatically scale in both dimensions for you.

Amazon CloudSearch will automatically scale your search fleet up to a maximum of 50 search instances. We'll be increasing this limit over time; if you have an immediate need for more than 50 instances, please feel free to contact us and we'll be happy to help.

The net-net of all of this automation is that you don't need to worry about having enough storage capacity or processing power. CloudSearch will take care of it for you, and you'll pay only for what you use.

Pricing Model

The Amazon CloudSearch pricing model is straightforward:

You'll be billed based on the number of running search instances. There are three search instance sizes (Small, Large, and Extra Large) at prices ranging from $0.12 to $0.68 per hour (these are US East Region prices, since that's where we are launching CloudSearch).
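As a rough sanity check on the headline figure, a single Small search instance running around the clock works out to about $0.12/hour × 24 hours × 30 days ≈ $86 per month, which is how a modest search domain stays under $100 per month before indexing and data-transfer charges.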

There's a modest charge for each batch of uploaded data. If you change configuration options and need to re-index your data, you will be billed $0.98 for each Gigabyte of data in the search domain.
There's no charge for in-bound data transfer; data transfer out is billed at the usual AWS rates, and you can transfer data to and from your Amazon EC2 instances in the Region at no charge.

Advanced Searching

Like the other Amazon Web Services, CloudSearch allows you to get started with a modest effort and to add richness and complexity over time. You can easily implement advanced features such as faceted search, free text search, Boolean search expressions, customized relevance ranking, field-based sorting and searching, and text processing options such as stopwords, synonyms, and stemming.

CloudSearch Programming

You can interact with CloudSearch through the AWS Management Console, a complete set of Amazon CloudSearch APIs, and a set of command line tools. You can easily create, configure, and populate a search domain through the AWS Management Console.
Here's a tour, starting with the welcome screen:

 
You start by creating a new Search Domain:

 
You can then load some sample data. It can come from local files, an Amazon S3 bucket, or several other sources:

 
Here's how you choose an S3 bucket (and an optional prefix to limit which documents will be indexed):

 
You can also configure your initial set of index fields:

 
You can also create access policies for the CloudSearch APIs:

 
Your search domain will be initialized and ready to use within twenty minutes:

 
Processing your documents is the final step in the initialization process:

 
After your documents have been processed you can perform some test searches from the console:

 
The CloudSearch console also provides you with full control over a number of indexing options including stopwords, stemming, and synonyms:



 
CloudSearch in Action
Some of our early customers have already deployed some applications powered by CloudSearch. Here's a sampling:
  • Search Technologies has used CloudSearch to index the Wikipedia (see the demo).
  • NewsRight is using CloudSearch to deliver search for news content, usage and rights information to over 1,000 publications.
  • ex.fm is using CloudSearch to power their social music discovery website.
  • CarDomain is powering search on their social networking website for car enthusiasts.
  • Sage Bionetworks is powering search on their data-driven collaborative biological research website.
  • Smugmug is using CloudSearch to deliver search on their website for over a billion photos.

SOURCE