At this year’s global conference, Snowflake announced new collaborations with NVIDIA, making it easier for businesses to take full ownership of their data. “The era of enterprise AI is here,” said Snowflake CEO Sridhar Ramaswamy. The crowd of more than 20,000 attendees at Snowflake Summit 2024 in San Francisco erupted in cheers. Within seconds, the keynote session’s blue strobe lights softened, and NVIDIA CEO Jensen Huang appeared on a screen beside Ramaswamy.
The Dawn of Enterprise AI
“We’re at the beginning of a new era in computing,” Huang said on a video call from Taiwan, wearing his signature black leather jacket. “The computing infrastructure of the world is really built here. And so, I’m here to unite an ecosystem of companies, technology companies, so they can work on that AI infrastructure.” In this new era, according to Huang and Ramaswamy, businesses can build customized AI applications in the Snowflake data cloud, powered by NVIDIA AI. This essentially brings “computing to the data,” not the other way around, Huang said.
This major announcement, shared on June 3, brings the global partnership between NVIDIA and Snowflake to a new level of influence and makes it easier for enterprises to unlock the power of AI so they can literally chat with their data, Huang explained.
Snowflake and NVIDIA Announce New AI Data Application Tools
With this latest collaboration, Snowflake has adopted NVIDIA AI Enterprise software to integrate NVIDIA’s NeMo Retriever microservices into Snowflake Cortex AI, Snowflake’s fully managed large language model and vector search service. This enables organizations to seamlessly connect custom models to diverse business data and deliver highly accurate responses.
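For a sense of what that looks like in practice, here is a minimal, hypothetical sketch of querying governed data through Cortex from Python. The connection parameters and the support_tickets table are placeholders, and the exact model names and function availability should be confirmed against Snowflake’s Cortex documentation for your account.

```python
# Hypothetical sketch: calling an LLM over governed Snowflake data via Cortex.
# Connection values and the support_tickets table/columns are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # placeholder account identifier
    user="my_user",
    password="my_password",
    warehouse="ANALYTICS_WH",
    database="SUPPORT",
    schema="PUBLIC",
)

query = """
    SELECT
        ticket_id,
        SNOWFLAKE.CORTEX.COMPLETE(
            'snowflake-arctic',
            CONCAT('Summarize this support ticket in one sentence: ', body)
        ) AS summary
    FROM support_tickets
    LIMIT 10
"""

with conn.cursor() as cur:
    cur.execute(query)
    for ticket_id, summary in cur.fetchall():
        print(ticket_id, summary)

conn.close()
```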
In addition, Snowflake Arctic, which Snowflake bills as the most open, enterprise-grade LLM, is now fully supported by NVIDIA TensorRT-LLM software, providing users with highly optimized performance. Arctic is also available as an NVIDIA NIM inference microservice, allowing more developers to access Arctic’s efficient intelligence.
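Because NIM microservices typically expose an OpenAI-compatible HTTP interface, a deployed Arctic NIM can usually be queried with the standard openai Python client, as in the hypothetical sketch below. The endpoint URL and model identifier are placeholders rather than values confirmed by Snowflake or NVIDIA; check the running container’s documentation for the actual ones.

```python
# Hypothetical sketch: querying a locally deployed Arctic NIM through its
# OpenAI-compatible API. The base_url and model id are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # placeholder local NIM endpoint
    api_key="not-used-for-local-nim",     # many local deployments ignore the key
)

response = client.chat.completions.create(
    model="snowflake/arctic",             # placeholder; confirm via the /v1/models endpoint
    messages=[
        {"role": "system", "content": "You are a concise enterprise data assistant."},
        {"role": "user", "content": "Explain what a vector search service does."},
    ],
    max_tokens=200,
)

print(response.choices[0].message.content)
```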
Transforming Data Sets Ahead of Generative AI Implementation
Snowflake’s unified platform specializes in “offering people the car and not the parts,” said Titiaan Palazzi, head of power and utilities go-to-market strategy at Snowflake. It’s “a ready-to-go platform, not just different tools that require a lot of ownership.”
But even with the built-in simplicity that Snowflake Cortex AI, Snowflake ML, Snowflake Iceberg Tables, and Snowflake Horizon offer, IT leaders still need to understand exactly how their data is being transformed ahead of AI implementation.
Joining the stage at the keynote’s conclusion, industry experts — including Anu Jain, head of data technology at JPMorgan Chase; Shahran Haider, deputy chief data officer at NYC Health + Hospitals; and Thomas Davey, chief data officer at Booking.com — emphasized that IT leaders must still verify that their enterprise data passes three rounds of assessment, even with automation procedures in place. These assessments cover data integrity (trustworthiness, governance, and accuracy), data visualization (how clearly the data is represented and whether it tells a story), and data value to stakeholders (operational efficiencies, dozens of use cases across the business, and how readily LLMs can be operationalized at scale).
The Critical Role of Data Integrity, Visualization, and Value
“It’s about being really methodical,” said Phil Andriyevsky, principal, wealth and asset management data and analytics, at EY. It’s about making sure companies “refactor your data the right way, in order for it to be labeled consumable and digestible by the right business stakeholders,” he added.
The era of enterprise AI is defined by a data-centric mindset. AI will be leveraged at every level of the organization and accessible to all employees, whether entry-level or executive.
It’s also an era of immense collaboration — between legacy on-premises systems and the cloud, new schools of thought and old, and an ecosystem of technology partners driving complementary solutions.
“Snowflake has a powerful and unique partner ecosystem — part of our success is that we have many partners that amplify the power of our platform,” Ramaswamy said, naming Microsoft, Google, and CDW as just a few.
The Impact of Breakneck Speed
But above all, this era is defined by “breakneck speed,” Huang said. “The technology is moving so fast.” Time-to-market is everything, he added, signaling the ricochet effect that enterprise AI will have on the way we work today.
“The size and scale of this is similar to a late 90s sort of moment, with the wide adoption of the internet,” Andriyevsky added. “Industries will just transform because of the advancement of technology and AI. And I fundamentally think it’s gonna be better for industry. It’s gonna be better for the world.”
The Strategies for Re-Engineering Data Pipelines
Organizations need a secure data pipeline to extract real-time analytics from workloads and deliver trusted data. But data pipelines are becoming increasingly complex to manage.
That’s why companies such as Booking.com, Capital One, Fidelity, and CNN are re-engineering them with Apache Iceberg, the open table format, through Snowflake’s new Iceberg Tables offering. Iceberg Tables supports architectures such as data lakehouses, data lakes, and data meshes, and it lets IT leaders simplify pipeline development so they can work with open data on their own terms and scale flexibly according to business use cases.
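As a rough illustration, the hypothetical sketch below creates a Snowflake-managed Iceberg table from Python. The external volume, database, and column names are placeholders, and the exact CREATE ICEBERG TABLE options should be verified against Snowflake’s documentation for your cloud provider.

```python
# Hypothetical sketch: defining a Snowflake-managed Iceberg table.
# All object names and connection values are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="my_password",
    warehouse="PIPELINE_WH", database="LAKEHOUSE", schema="RAW",
)

ddl = """
    CREATE ICEBERG TABLE IF NOT EXISTS page_views (
        event_ts   TIMESTAMP_NTZ,
        user_id    STRING,
        url        STRING
    )
    CATALOG = 'SNOWFLAKE'                  -- Snowflake acts as the Iceberg catalog
    EXTERNAL_VOLUME = 'lakehouse_volume'   -- placeholder, pre-configured volume
    BASE_LOCATION = 'page_views/'
"""

with conn.cursor() as cur:
    cur.execute(ddl)

conn.close()
```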
Replacing Existing Batch Pipelines
Fidelity has reimagined its data pipelines using Snowflake Marketplace, saving the company time and resources in data engineering. The business units it supports, including fixed income and data science, can now analyze data faster, and the firm is spending “more time on research and less on pipeline management,” said Balaram Keshri, vice president of architecture at Fidelity.
With Snowflake managing its data, Fidelity has also significantly improved performance, loading, querying, and analyzing data faster. More broadly, the Snowflake Performance Index, which tracks query performance across the platform, reports that Snowflake has “reduced organizations’ query duration by 27% since it started tracking this metric, and by 12% over the past 12 months,” according to a press release.
Capital One’s Data Sharing Capabilities
Capital One, reportedly the first U.S. bank to migrate its entire on-premises data center to the cloud, has also found success with its new data pipelines, thanks to Snowflake’s data sharing capabilities. The feature enables multiple analysts to access related data without affecting one another’s performance. Users can also categorize the data according to workload type.
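The hypothetical sketch below shows the two ideas at play: separate virtual warehouses keep workload types from contending for compute, and a secure share exposes a read-only slice of data to another account without copying it. Object names are placeholders, and this is a generic Snowflake pattern rather than a description of Capital One’s setup.

```python
# Hypothetical sketch: workload isolation via per-workload warehouses, plus a
# secure share for read-only access. All object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_admin_user", password="my_password",
    role="ACCOUNTADMIN",
)

statements = [
    # One warehouse per workload type keeps BI dashboards and ad hoc analysis
    # from slowing each other down; both read the same underlying data.
    "CREATE WAREHOUSE IF NOT EXISTS BI_WH    WAREHOUSE_SIZE = 'SMALL'  AUTO_SUSPEND = 60",
    "CREATE WAREHOUSE IF NOT EXISTS ADHOC_WH WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 60",

    # A secure share grants read access to selected objects without copying data.
    "CREATE SHARE IF NOT EXISTS analytics_share",
    "GRANT USAGE ON DATABASE analytics TO SHARE analytics_share",
    "GRANT USAGE ON SCHEMA analytics.public TO SHARE analytics_share",
    "GRANT SELECT ON TABLE analytics.public.transactions TO SHARE analytics_share",
    "ALTER SHARE analytics_share ADD ACCOUNTS = partner_account",  # placeholder consumer account
]

with conn.cursor() as cur:
    for stmt in statements:
        cur.execute(stmt)

conn.close()
```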
“Snowflake is so flexible and efficient that you can quickly go from ‘data starved’ to ‘data drunk.’ To avoid that data avalanche and associated costs, we worked to put some controls in place,” Salim Syed, head of engineering for Capital One Software, wrote in a blog post.
CNN’s Transformation to Real-Time Data Pipelines
CNN’s dramatic pipeline transformation also gave it accelerated access to analytics. Over the past year, the multinational news channel and website, owned by Warner Bros. Discovery, has shifted to using real-time data pipelines for workloads that support critical parts of its content delivery strategy. The goal is to move the horizon of actionable data down “from hours to seconds” by replacing existing batch pipelines, noted JT Torrance, data engineer with Warner Bros. Discovery.
“We will move around 100 terabytes of data a day across about 600,000 queries from our various partners,” said Zach Lancaster, engineering manager at Warner Bros. Discovery. Now, with its scalable, newly managed pipeline, CNN can scrape the data for core use cases and prioritize workloads that drive the most business value.
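One common way to move in that direction, sketched hypothetically below, is to pair a stream on a landing table with a short-interval task that processes only new rows. This is a generic Snowflake streams-and-tasks pattern with placeholder names, not a description of CNN’s actual pipeline.

```python
# Hypothetical sketch: replacing a batch load with a stream plus a frequent task
# that only runs when new rows have arrived. All object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="my_password",
    warehouse="PIPELINE_WH", database="CONTENT", schema="DELIVERY",
)

statements = [
    # Change-tracking stream over the raw landing table.
    "CREATE STREAM IF NOT EXISTS raw_events_stream ON TABLE raw_events",

    # Task that wakes up every minute but only runs when the stream has data,
    # appending the new rows to the analytics table.
    """
    CREATE TASK IF NOT EXISTS load_events_task
        WAREHOUSE = PIPELINE_WH
        SCHEDULE  = '1 MINUTE'
        WHEN SYSTEM$STREAM_HAS_DATA('raw_events_stream')
    AS
        INSERT INTO analytics_events
        SELECT event_ts, user_id, content_id
        FROM raw_events_stream
    """,

    # Tasks are created suspended; resume to start processing.
    "ALTER TASK load_events_task RESUME",
]

with conn.cursor() as cur:
    for stmt in statements:
        cur.execute(stmt)

conn.close()
```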
Key Steps to Transform Your Data Pipeline
As user-friendly as the Snowflake platform is, IT leaders still need a clear strategy in mind as they improve their data pipelines. For starters, “think about how you can bring your stakeholders on board. You want them to become the ultimate stewards of the process,” Lancaster said.
Engaging stakeholders ensures that the pipeline transformation aligns with business goals and receives the necessary support for successful implementation.
Second, revisit your use cases. “Platforms develop over the years, as does your business, so try to re-evaluate your use cases and dial back your system,” Torrance advised. This approach can help with cost optimization and ensure that the pipeline meets current business needs.
Third, “make sure you understand the ask of each request and how you expect to use it over time in your data pipeline,” Lancaster said. Clear understanding of requests ensures that the pipeline is designed to handle future demands and remains flexible enough to accommodate changes.
Cross-Functional and Centralized Pipelines
A company redesigning its data pipeline should make the new pipeline cross-functional and ensure it serves the most central parts of the business. Consider “machine-to-machine use cases,” as these are important for interoperability across your entire tech stack.
Finally, remember that more intricate systems aren’t always better. “Think carefully. Just because I have a request, do I need to accomplish it? And does the added complexity add value to the business, or does it do a disservice to the stakeholder?” Lancaster said. This consideration ensures that the pipeline remains efficient and aligned with business objectives.
Conclusion
The Snowflake Summit 2024 highlighted the transformative potential of generative AI for businesses. By leveraging new tools and collaborations with NVIDIA, Snowflake is enabling companies to build robust, scalable, and efficient data pipelines. As organizations continue to navigate the complexities of data management, adopting flexible and scalable solutions like those offered by Snowflake will be key to maintaining a competitive advantage. Engaging stakeholders, revisiting use cases, and understanding requests are crucial steps in this transformation process.
Final Thoughts
As the era of enterprise AI unfolds, businesses must adapt to rapidly advancing technologies to stay ahead. By focusing on efficient pipeline management and leveraging advanced technologies, businesses can unlock the full potential of their data and drive innovation, ensuring they remain competitive in an ever-evolving landscape.