Pinned · Victor Oketch Sabare · Comparing Spark and MapReduce: The Pros and Cons of Two Popular Big Data Processing Frameworks on… · Spark and MapReduce are both popular big data processing frameworks that run on the Hadoop ecosystem. Both have their own unique features… · 3 min read · Jan 9, 2023
Pinned · Victor Oketch Sabare · Unlocking the Power of Big Data Processing with Resilient Distributed Datasets · A resilient distributed dataset (RDD) is a fundamental data structure in the Apache Spark framework for distributed computing. It is a… · 3 min read · Jan 10, 2023
Victor Oketch Sabare · Exploring Outliers, Leverage, and Influence · Unveiling Hidden Insights in Data Analysis · 7 min read · Jul 6, 2023
Victor Oketch Sabare · Building a Data Pipeline for Blockchain Data with Apache Kafka and Apache Flink · The rise of blockchain technology has brought about an explosion in the amount of data being generated and consumed by blockchain networks… · 5 min read · Mar 16, 2023
Victor Oketch Sabare · Introduction to Streamlit for Data Engineering · Data engineering is a critical aspect of any data-driven organization, where data scientists and analysts work with large amounts of data… · 4 min read · Mar 12, 2023
Victor Oketch Sabare in Towards Data Engineering · Building a Real-time Fraud Detection System with Apache Kafka and Apache Storm — A Step-by-Step… · Introduction · 6 min read · Jan 31, 2023
Victor Oketch Sabare · Building a data pipeline for natural language processing with Apache Kafka and Apache Spark. · Are you tired of slow and clunky data pipelines for your natural language processing (NLP) projects? Well, buckle up because we have the… · 3 min read · Jan 31, 2023
Victor Oketch Sabare in Towards Data Engineering · Building a Scalable and Real-time Data Pipeline for Social Media Analytics with Apache Kafka and… · Introduction · 4 min read · Jan 16, 2023
Victor Oketch Sabare · Mastering Missing Data: A Comprehensive Guide with Code Examples and Illustrations on How to Handle… · Handling missing data in a data pipeline can be a tricky task, but with the right approach, it can be effectively managed. In this article… · 4 min read · Jan 13, 2023
Victor Oketch Sabare · Mastering Negative Indexing in Python · In Python, we have two indexing systems for lists: · 1 min read · Jan 13, 2023