Designing Data Pipelines — with Interactivity

The data pipeline has become a fundamental component of the data science, data analyst, and data engineering workflow. Pipelines serve as the glue that links together various components of the data cleansing, data validation, and data transformation process. However, despite its importance to the data ecosystem, constructing the optimal data pipeline is generally an afterthought - if it’s considered at all. This makes any changes to the central pipeline highly error-prone and cumbersome. With the ever-growing demand for new kinds of data, especially from external vendors, constructing pipelines that are scalable and that allow for monitoring is pivotal for the safe and continued use of data. ...

September 19, 2022 · 1 min · 167 words · Vinoo Ganesh

O'Reilly Superstream Series: Data Pipelines

Data pipelines are the foundation for success in data analytics, so understanding how they work is of the utmost importance. Join us for four hours of expert-led sessions that will give you insight into how data is moved, processed, and transformed to support analytics and reporting needs. You’ll also learn how to address common challenges like monitoring and managing broken pipelines, explore considerations for choosing and connecting open source frameworks, commercial products, and homegrown solutions, and more. ...

August 10, 2022 · 2 min · 263 words · Vinoo Ganesh

Designing Data Pipelines — with Interactivity

The data pipeline has become a fundamental component of the data science, data analyst, and data engineering workflow. Pipelines serve as the glue that links together various components of the data cleansing, data validation, and data transformation process. However, despite its importance to the data ecosystem, constructing the optimal data pipeline is generally an afterthought - if it’s considered at all. This makes any changes to the central pipeline highly error-prone and cumbersome. With the ever-growing demand for new kinds of data, especially from external vendors, constructing pipelines that are scalable and that allow for monitoring is pivotal for the safe and continued use of data. ...

June 17, 2022 · 1 min · 167 words · Vinoo Ganesh

Designing Data Pipelines — with Interactivity

The data pipeline has become a fundamental component of the data science, data analyst, and data engineering workflow. Pipelines serve as the glue that links together various components of the data cleansing, data validation, and data transformation process. However, despite its importance to the data ecosystem, constructing the optimal data pipeline is generally an afterthought - if it’s considered at all. This makes any changes to the central pipeline highly error-prone and cumbersome. With the ever-growing demand for new kinds of data, especially from external vendors, constructing pipelines that are scalable and that allow for monitoring is pivotal for the safe and continued use of data. ...

March 10, 2022 · 1 min · 167 words · Vinoo Ganesh

Guaranteeing pipeline SLAs and data quality standards with Databand

We’ve all heard the phrase “data is the new oil.” But really imagine a world where this analogy is more real, where problems in the flow of data - delays, low quality, high volatility - could bring down whole economies? When data is the new oil with people and businesses similarly reliant on it, how do you avoid the fires, spills, and crises? As data products become central to companies’ bottom line, data engineering teams need to create higher standards for the availability, completeness, and fidelity of their data. ...

May 28, 2021 · 1 min · 193 words · Vinoo Ganesh