Quoting a great and concise article from SQLServerCentral: http://www.sqlservercentral.com/articles/Azure+SQL+Data+Warehouse+(ASDW)/172251
It is always good to know the options available when you are considering the best cloud provider for your needs. See below a summary of each option available currently.
Azure DWH part 28: The ASDW enemies
By Daniel Calbimonte, 2018/06/26
Superman has a Lex Luthor, the Ninja Turtles have The Shredder, the Smurfs have a Gargamel, Mozart had a Salieri, Edmond Dantès had a Fernand, and in our case, Azure SQL Data Warehouse has multiple enemies.
This time we will talk about the competitors. What do they offer? We will mention the following enemies:
- Amazon Redshift
- Alibaba Cloud Max Compute
- Snowflake
- Google Big Query
- Teradata
Amazon Redshift
Amazon Redshift is the first enemy of ASDW. It is a petabyte-scale data warehouse service in the cloud. It is similar to ASDW, and it belongs to AWS, which is the most popular cloud platform as of now (Azure is the second one). Redshift started in 2012 and is based on PostgreSQL, and it is easy to use and scale.
If you are familiar with PostgreSQL and prefer it over SQL Server, Redshift is a good choice. ASDW is similar to SQL Server and can be used with SQL Server Management Studio, so if you like SQL Server, you will prefer ASDW.
Amazon offers a data warehouse in the cloud that is easy to maintain at a low cost. The biggest advantage of the cloud is that you can scale easily. With an on-premises data warehouse, scaling means buying new hardware, migrating data and suffering a lot. Redshift can be easily integrated with well-known BI tools like MicroStrategy, Jaspersoft, Pentaho, Tableau, Business Objects, Cognos, etc. It is also easy to create replicas of your data warehouse in different regions, and to restore and encrypt your data.
I think it is the closest competitor because it offers a database platform with multiple services: not only a data warehouse in the cloud, but also several other services.
Regarding prices, currently there are 3 options:
- On-demand pricing is pay per hour. The price depends on the memory, storage, CPU, I/O and region. For example, the dc2.large node costs 0.25 USD per hour and the dc2.8xlarge costs 4.8 USD per hour.
- Redshift Spectrum queries cost 5 USD per terabyte scanned.
- Reserved instance pricing lets you save up to 75% over the on-demand price, but you must commit to a 1- or 3-year term.
For more information about prices, refer to this link: Amazon Redshift Pricing
- Getting Started with Amazon Redshift
- Introduction to Amazon Redshift – data warehouse Solution on AWS
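To make the Redshift price options above concrete, here is a minimal Python sketch that compares on-demand and reserved pricing for one node and adds a Spectrum scan charge. The rates are the ones quoted in this article. The 730-hour month, the function names, and the assumption that the full 75% discount applies are illustrative choices, not Amazon's billing rules.

```python
# Rough Redshift cost comparison using the figures quoted in this article.
# HOURS_PER_MONTH and all function names are assumptions for illustration.

HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def on_demand_monthly(rate_per_hour, nodes=1, hours=HOURS_PER_MONTH):
    """Monthly on-demand cost in USD for a cluster of identical nodes."""
    return rate_per_hour * nodes * hours


def reserved_monthly(rate_per_hour, discount, nodes=1, hours=HOURS_PER_MONTH):
    """Monthly cost in USD after a reserved-instance discount (0.75 = 75%)."""
    return on_demand_monthly(rate_per_hour, nodes, hours) * (1 - discount)


def spectrum_cost(terabytes_scanned, rate_per_tb=5.0):
    """Redshift Spectrum charge in USD for the data scanned."""
    return terabytes_scanned * rate_per_tb


if __name__ == "__main__":
    dc2_large = 0.25  # USD per hour, as quoted in the article
    print(on_demand_monthly(dc2_large))       # 182.5
    print(reserved_monthly(dc2_large, 0.75))  # 45.625
    print(spectrum_cost(2))                   # 10.0
```

Under these assumptions, a single dc2.large node running all month costs about 182 USD on demand and about 46 USD with the maximum reserved discount. Check the pricing page for current rates before relying on such numbers.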
Alibaba Cloud Max Compute
Max Compute is part of Alibaba Cloud. It is a cloud-based database used as a data warehouse. This cloud data warehouse claims to be very secure compared to the competitors and to comply with HIPAA for healthcare, Germany’s C5 standard, and PCI DSS.
It supports SQL, Graph, MapReduce, and MPI Integration Algorithm. It works with a Batch and Historical Data Tunnel, an easily scalable service that lets users import and export data.
The Data Hub is used to easily import incremental data. It uses 2D table storage with compression to reduce costs. It also supports MapReduce and Graph computing. It also supports a REST API, and it has its own SDK as well as Graph, Spark, and SQL. It is not very popular yet, but it is in the race.
Regarding prices, 1 GB or less is free. 1-100 GB costs between 0.0028 and 0.28 USD. 100 GB-9 TB costs between 0.0014 and 13 USD approx. 9-90 TB costs between 12 and 120 USD approx. For more information about prices, refer to this link: Max Compute Prices
Snowflake
Snowflake is another cloud-based data warehouse. It is a great data warehouse solution, but it is not part of a data platform. That means it is not part of an offering like AWS, Alibaba Cloud, or Azure that provides additional services in the cloud. It is just a data warehouse in the cloud, but it is a good one.
Snowflake supports SQL to access data, including semi-structured data like JSON. It is possible to access non-relational data using SQL, as we do in ASDW with PolyBase. It also offers immediate scaling and columnar storage. You can connect to Snowflake using Java (JDBC) or ODBC. There are also web consoles, native connectors, and a command-line client.
Snowflake claims to have a better architecture designed for the cloud and it is optimized for better performance.
Currently, the prices depend on the region and edition. There are several editions: Enterprise, Standard, Premier, Enterprise for Sensitive Data and Virtual Private Snowflake. In the US West and US East regions, all the editions cost 40 USD per TB per month for storage, with the compute costing 2.25 and 2 USD per hour. The Enterprise Edition and Enterprise for Sensitive Data cost 3 and 4 USD per compute hour, respectively.
For more information about prices, refer to this link: Select Pricing For Your Region
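Because Snowflake charges storage and compute separately, a monthly bill is the sum of two terms. The following Python sketch shows that arithmetic with the figures quoted above (40 USD per TB per month, 3 USD per compute hour for the Enterprise Edition). The function name and the flat per-hour compute model are assumptions for illustration. Actual billing may also depend on the size of the warehouse you run.

```python
# Sketch of a Snowflake monthly estimate: storage and compute are billed separately.
# The function name and the flat hourly compute model are assumptions.

def snowflake_monthly_cost(storage_tb, compute_hours,
                           storage_rate_per_tb=40.0,
                           compute_rate_per_hour=3.0):
    """Return the estimated monthly cost in USD."""
    storage = storage_tb * storage_rate_per_tb         # USD for stored data
    compute = compute_hours * compute_rate_per_hour    # USD for hours of compute used
    return storage + compute


if __name__ == "__main__":
    # 2 TB stored and 100 compute hours at the Enterprise rate quoted above.
    print(snowflake_monthly_cost(2, 100))  # 380.0
```

The separation matters in practice: a warehouse that is suspended most of the month pays mainly for storage, while a busy one is dominated by compute.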
Google Big Query
Big Query uses a serverless system that can handle petabytes of data. This is a solution that offers really fast queries; it is able to handle queries of petabytes of data in seconds. The Google guys are experts on Big Data and Google Big Query shows the power they have. Big Query is like any Google technology: cloud-based, fast, easy to learn, and simple.
It also uses SQL to access data. Big Query works with the Google Cloud Storage, and it works with the following technologies:
- Google Analytics 360 Suite
Big Query has a web console (Web UI) to access the data. It also includes a command-line tool, or you can use the REST API to query information. You can use Java, Python or .NET to access data.
The Big Query concept is to run a query with terabytes of information in seconds or minutes. You do not need a virtual machine and you do not need to worry about configuring hardware and software.
Storage costs 0.02 USD per GB per month, and the first 10 GB are free. Long-term storage costs 0.01 USD per GB per month. Queries cost 5 USD per TB scanned. Loading and copying data is free. For more information about pricing, refer to this link: Big data pricing
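The Big Query pricing model has three parts: active storage with a free tier, cheaper long-term storage, and a charge for the data that queries scan. Here is a minimal Python sketch of that structure. The function name and all parameter names are hypothetical, and the rates are passed in as arguments so that current prices can be plugged in. It also assumes that the 10 GB free tier applies to active storage only, which is a simplification.

```python
# Sketch of the Big Query cost structure: free storage tier, long-term storage,
# and scan-based query charges. All names are hypothetical. Rates are inputs.

def bigquery_monthly_cost(active_gb, long_term_gb, data_scanned,
                          active_rate, long_term_rate, query_rate,
                          free_storage_gb=10.0):
    """Return the estimated monthly cost in USD.

    data_scanned and query_rate must use the same unit of data.
    Assumption: the free tier is deducted from active storage only.
    """
    billable_active_gb = max(active_gb - free_storage_gb, 0.0)
    storage = billable_active_gb * active_rate + long_term_gb * long_term_rate
    queries = data_scanned * query_rate
    return storage + queries  # loading and copying data adds nothing


if __name__ == "__main__":
    # Placeholder rates chosen for easy arithmetic, not real prices.
    print(bigquery_monthly_cost(110, 20, 3,
                                active_rate=0.5,
                                long_term_rate=0.25,
                                query_rate=2.0))  # 61.0
```

Since queries are billed by the data they scan, selecting only the columns you need is usually the simplest way to keep this part of the bill low.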
Teradata
Teradata is a very popular database. It is commonly used as a data warehouse and also as a large-scale database. It is one of the most popular databases in the world and many people like it. However, like Snowflake, it is a single isolated solution and not part of a database platform like Azure, AWS or Alibaba Cloud. Those platforms offer not only a data warehouse, but also other solutions to complement it.
You have 3 options with Teradata:
- IntelliCloud™ offers a Teradata database in the Cloud+Aster Analytics+Hadoop.
- Public cloud offers a Teradata database+Aster Analytics in AWS or Azure.
- Private Cloud offers virtualized VMs with IntelliCloud and Public Cloud.
You can run queries across Big Data technologies using Teradata QueryGrid™. It is possible to have your database in the cloud, on-premises or in a hybrid environment. It also includes In-Memory Intelligent Processing and a gateway to actionable data insight.
The prices vary by edition. The Developer edition is free. The Base, Advanced, Enterprise and IntelliSphere editions have different prices per hour. For example, the EC2 m4.4xl costs 1.564 USD per hour and the Enterprise 4.17. For more information about prices, refer to this link: Teradata Software Pricing
Conclusion
In this article, we saw different alternatives for creating our data warehouse in the cloud. As you can see, there are a lot of competitors, and many of them have almost the same features. The price options change over time, and the features improve every day. It is good to know the competitors and check all the options available in the cloud data warehouse world.