
Exclusive: Voltron Data brings new power to AI with Theseus distributed query engine

by WeeklyAINews


The fictional Voltron robot (from the animated science fiction show of the same name) is all about combining multiple robot lions into one giant robot that is able to accomplish great tasks.

Voltron Data, which made its splashy debut in 2022 with $110 million in funding, is all about bringing together the power of multiple open source technologies, including Apache Arrow, Apache Parquet and Ibis, to help improve data access. Today, Voltron Data is taking the next step, announcing the new Theseus distributed query engine in a bid to dramatically accelerate data queries for increasingly demanding AI workloads.

Theseus is designed to accelerate large-scale data pipelines and queries using GPUs and other hardware accelerators.

“We built Theseus based on the exact same principles of what we were doing open source support for, with modular, composable, accelerated libraries that make data systems better,” Josh Patterson, co-founder and CEO of Voltron Data, told VentureBeat in an exclusive interview. “This is our next product as we continue to go down this journey of trying to be the leading designer and builder of data systems.”

Theseus is built for massive volumes of data

Theseus is optimized for running distributed queries on large datasets of 10 terabytes or more. It is targeted at organizations with petabyte-scale data processing needs, including Fortune 500 companies, government agencies, hedge funds, telcos, and media and entertainment companies.


A key goal of Theseus is to accelerate ETL (extract, transform, load), feature engineering, and other data preparation work to feed downstream AI and analytics systems faster. As AI systems get faster, they need more real-time data transformation.

“A lot of our users are saying their biggest problem today is that they’re starving their AI systems because they can’t get data fast enough,” Patterson said. “That was the main driver behind Theseus.”

A challenge with data queries today is that they are often limited by CPU compute capacity and performance. Theseus looks beyond traditional CPU approaches and uses accelerated computing technologies including GPUs. Patterson said that Theseus is “accelerator native,” meaning it is optimized to leverage Nvidia GPUs, networking, storage, and other accelerators.

According to Patterson, the accelerator-native approach allows Theseus to run queries faster at scale than traditional CPU-based distributed engines like Apache Spark.

One AI use case where Patterson sees Theseus being particularly useful is hyperparameter optimization. He explained that an organization can churn through a lot of parameters for optimization and feature engineering as part of the process of adjusting inputs to build better models.

“The faster you can do feature engineering, the faster you can do ETL, the faster you can bring in fresher data, the better your models are,” he said.
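To make the point concrete, here is a minimal, illustrative sketch of a hyperparameter sweep in which the feature engineering step is re-run for every trial, so its cost multiplies with the size of the search grid. Nothing here comes from Theseus or Voltron Data: the `rolling_mean` feature, the `grid_search` helper, and the toy data are all hypothetical stand-ins written in plain Python.

```python
from itertools import product
from statistics import mean


def rolling_mean(values, window):
    """Hypothetical feature engineering step: trailing moving average."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(mean(values[start:i + 1]))
    return out


def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return mean((p - t) ** 2 for p, t in zip(pred, target))


def grid_search(raw, target, windows, scales):
    """Try every (window, scale) pair and return the best (score, window, scale)."""
    best = None
    for window, scale in product(windows, scales):
        # Feature engineering is repeated for every trial in the grid.
        features = rolling_mean(raw, window)
        pred = [scale * f for f in features]
        score = mse(pred, target)
        if best is None or score < best[0]:
            best = (score, window, scale)
    return best


# Toy data: the target is constructed from window=2, scale=2.0.
raw = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
target = [2.0 * v for v in rolling_mean(raw, 2)]
best = grid_search(raw, target, windows=[1, 2, 3], scales=[1.0, 2.0, 3.0])
print(best)
```

With nine combinations, the feature step runs nine times. On real datasets that repeated preparation work is the part an accelerated query engine would aim to shorten.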

Theseus is interoperable from the ground up

Theseus embraces open standards like Apache Arrow, Apache Parquet, and Ibis for interoperability.


Patterson emphasized that Theseus is not a proprietary, siloed system, and data in any Apache Arrow-compatible data lake can be queried by it. He explained that data can be fed directly into many different popular machine learning tools and frameworks, including PyTorch, TensorFlow and different types of graph databases.

“We have this seamless way to basically move data in and out of the systems,” Patterson said.

Theseus itself is just the distributed query system. Patterson explained that it doesn’t have its own front-end user interface; rather, it accepts things like SQL queries and Ibis, so that people can map other front ends to it. The basic idea is to enable organizations to easily integrate Theseus into existing workflows.
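The idea of an engine with no user interface of its own, reached through SQL from whatever front end an organization already has, can be sketched roughly as follows. This is only an illustration of the pattern: the article does not show Theseus's API, so Python's built-in `sqlite3` stands in for the engine, and the `QueryEngine` protocol, `SQLiteEngine` class, `total_by_region` function, and `events` table are all hypothetical names.

```python
import sqlite3
from typing import Protocol


class QueryEngine(Protocol):
    """Hypothetical minimal contract a front end needs from any SQL engine."""

    def run(self, sql: str) -> list: ...


class SQLiteEngine:
    """Stand-in engine backed by an in-memory SQLite database (not Theseus)."""

    def __init__(self, rows):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE events (region TEXT, amount REAL)")
        self.conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

    def run(self, sql: str) -> list:
        return self.conn.execute(sql).fetchall()


def total_by_region(engine: QueryEngine) -> list:
    """Front-end logic: knows only SQL, not which engine executes it."""
    sql = (
        "SELECT region, SUM(amount) FROM events "
        "GROUP BY region ORDER BY region"
    )
    return engine.run(sql)


engine = SQLiteEngine([("east", 10.0), ("west", 5.0), ("east", 20.0)])
totals = total_by_region(engine)
print(totals)
```

Because `total_by_region` depends only on the `run` method, a different engine could be swapped in without changing the front-end code, which is the kind of workflow integration the article describes.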

Going to market with HPE and more partners

Voltron Data is going to market with Theseus via partnerships, and the first is with Hewlett Packard Enterprise (HPE).

Voltron Data has partnered with HPE to bring Theseus to the HPE GreenLake hybrid cloud platform. HPE GreenLake provides the infrastructure for Theseus while also giving customers a way to unify queries across other engines using Ibis.

Looking forward, Patterson said that Voltron Data plans to expand Theseus partnerships and add more functionality like user-defined functions. The goal is tighter integration into full data science pipelines.

“I think 2024 will primarily be about making it faster and easier to integrate with new different parts of the data science pipeline, because that really empowers users,” Patterson said.


© 2023 – All Right Reserved.