Vinoo Ganesh

Speaker, Technologist, and Startup Advisor

O’Reilly Radar: Data & AI

O'Reilly Radar

O’Reilly Radar: Data & AI will showcase what’s new, what’s important, and what’s coming in the field. It includes two keynotes and two concurrent three-hour tracks—designed to lay out for tech leaders the issues, tools, and best practices that are critical to an organization at any step of their data and AI journey. You’ll explore everything from prototyping and pipelines to deployment and DevOps to responsible and ethical AI. Link https://www.

Data SLA Nightmares & Lessons Learned

Databand 2021

Databricks Sr. Staff Developer Advocate, Denny Lee, Citadel Head of Business Engineering, Vinoo Ganesh, and Databand.ai Co-Founder & CEO, Josh Benamram, discuss the complexities and business necessity of setting clear data service-level agreements (SLAs). They share their experiences around the importance of contractual expectations and why data delivery success criteria are prone to disguise failures as success in spite of our best intentions. Denny, Vinoo, and Josh challenge businesses of all industries to see themselves as data companies by driving home a costly reality – what do businesses have to lose when their data is wrong?

Migrating to Parquet

Subsurface Summer 2021

I work at a data-as-a-service (DaaS) company that delivers PBs of geospatial data to customers across a variety of industries. We build and manage a central data lake, housing years of data, and operationalize that data to solve our customers’ problems. I recently gave a talk about the specifics of file formats at Spark+AI Summit 2020 that generated a lot of questions about my company’s migration from CSV to Apache Parquet.

Accelerating Data Evaluation

Data + Ai Summit 2021

As the data-as-a-service ecosystem continues to evolve, data brokers are faced with an unprecedented challenge – demonstrating the value of their data. Successfully crafting and selling a compelling data product relies on a broker’s ability to differentiate their product from the rest of the market. In smaller or static datasets, measures like row count and cardinality can speak volumes. However, when datasets are in the terabytes or petabytes though – differentiation becomes much difficult.

Guaranteeing pipeline SLAs and data quality standards with Databand

Airflow Summit 2021

We’ve all heard the phrase “data is the new oil.” But really imagine a world where this analogy is more real, where problems in the flow of data - delays, low quality, high volatility - could bring down whole economies? When data is the new oil with people and businesses similarly reliant on it, how do you avoid the fires, spills, and crises? As data products become central to companies’ bottom line, data engineering teams need to create higher standards for the availability, completeness, and fidelity of their data.