(The Coming Need Title)



Why Now?

The Need

Compute.ai adds incremental but substantial value in Spark, Presto, and Trino environments where SQL is issued on Parquet and Iceberg. We support shared open catalogs such as the Hive Metastore, Spark, and AWS Glue catalogs. Benefits include >10x performance for complex real-world compute (including skewed data), no Out Of Memory (OOM) failures, and unlimited concurrency, all while using 1/5th the cloud infrastructure.

Cloud Data Warehouse users experiencing high costs can easily offload their compute by breaking away from proprietary data warehouses and leveraging open standards (as supported by Spark, Presto and Trino) like Parquet/Iceberg or a lakehouse.

Any open-source ELT tool can be used to run table transformations on compute.ai’s elastic servers and then push denormalized or summary tables back into the data warehouse for inexpensive access from BI tools.

The key to saving data warehouse compute costs is to offload complex compute (joins and group-bys) to a scalable open platform like compute.ai while reaping benefits such as:

  1. Leveraging existing data/analyst engineering toolchains
  2. Using open standards
  3. Dramatically lowering cloud costs
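As a minimal sketch of this offload pattern, the SQL below runs the expensive join and group-by on the open compute layer and persists a compact summary table for BI access. The table and column names (`sales`, `customers`, `sales_summary`) are purely illustrative, not part of any actual schema:

```sql
-- Hypothetical example: the heavy join + group-by runs on the
-- open compute layer (reading Parquet/Iceberg tables), and only
-- the small summary table is pushed back to the warehouse.
CREATE TABLE sales_summary AS
SELECT
    c.region,
    date_trunc('month', s.sale_date) AS sale_month,
    SUM(s.amount)                    AS total_revenue,
    COUNT(DISTINCT s.customer_id)    AS active_customers
FROM sales s
JOIN customers c
  ON s.customer_id = c.customer_id   -- expensive join offloaded
GROUP BY c.region, date_trunc('month', s.sale_date);
-- BI tools then query sales_summary in the warehouse cheaply.
```

The design point is that the warehouse never pays for the join or the scan of the raw tables; it only serves cheap lookups against the pre-aggregated result.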

Who It’s Not For

Compute.AI is not for customers who need a data warehouse but have low compute needs (and hence low costs) and are unconcerned about vendor lock-in.

Future Proofing

Future Proof Your Data Needs

Compute.AI unlocks the power of compute, making it as abundant as your business demands. All of this while providing 100% open standards and helping you leverage your current investments in Spark, Presto, and Trino.

Machine-generated sources of SQL are rapidly on the rise. Currently, “The supply of compute is so constrained, demand outstrips it by a factor of 10!” (a16z).

With AI-powered BI, autonomous semantic layer creation, and low-code and no-code applications projected to drive over $45B in revenue by 2025, over a trillion CPU and GPU cores will be recruited to handle relational + AI/ML workloads.

These workloads need to run in a fast, efficient, and cost-effective fashion with unlimited concurrency. As a reference point, the data warehouses of yesteryear can barely service 8 concurrent users before a new cluster needs to be spawned.

Open Standards

We Think Openly

At Compute.AI we believe in building open platforms that create value for the Open Source community, building upon decades of work by brilliant, collaborative minds. For us, it is less about the growth of data, which we all know will continue, and more about the compute needed to process it.

We work closely with the Spark, Presto, Trino, and Iceberg OSS communities among others, and believe in giving back to the community.

Proprietary data warehouses, aka brick-walled gardens, do not fit into the future that Compute.AI is helping build. They are neither cost-effective nor efficient enough to meet the demands of machines generating SQL and driving a trillion cores of CPUs and GPUs.

(Additional Info to Add??)

FAQ Section

What does it do for me?

How does it save costs and time?

How does it work if I have Spark or Snowflake?

Ask questions (“Do you make the purchasing decisions?”)

Are your Snowflake bills too high? (leads to a use case)

Are you having reliability issues with Spark/Trino/Presto?

Links to Use Cases Page

Links to Technology Page