
Norbert Kozlov.ski

Decision Making Under Uncertainty

From alien invasions to cybernetic gladiators and time-travel heists, learn how to make optimal decisions with limited information. Utilize Bayesian methods to strategically navigate challenges and enhance outcomes in this comprehensive exploration of decision-making under uncertainty.

How to Set Up Mautic 5.x on Kubernetes: A Step-by-Step Guide

This guide is the fruit of my recent endeavor to set up Mautic 5.x, a marketing automation tool, in a local environment. The process shed light on the inner workings of the Doctrine queue and the role of cron jobs in background data processing. While Mautic setup guides are a dime a dozen, I found myself drawn to the installation methods using Docker Compose or a Kubernetes Helm chart. This post zeroes in on the latter, primarily to scratch my itch for expanding Kubernetes expertise.

Developers Guide to Data Lakehouse with Apache Iceberg

Why it matters 🚀

  • Draws a clear line between data storage and computation, counteracting data gravity and helping you avoid vendor lock-in.
  • Cost optimization: a properly implemented solution lets you retire costly existing data warehouse products.
  • Future-proof data architecture. Iceberg’s forward-looking design caters to evolving data sizes and formats, ensuring your data architecture remains scalable and efficient as your needs grow.
  • With features like atomic transactions and consistent updates, the solution ensures data reliability and integrity, minimizing the risk of data loss or corruption.

Your leverage 🃏

  • Gain practical insights into deploying a Data Lakehouse solution that rivals industry-level data warehouses, with step-by-step instructions tailored for developers.
  • Learn how to implement and benefit from powerful features like time travel, schema evolution, and hidden data partitioning, enhancing your ability to manage and analyze data effectively.
  • Access a ready-to-use template for setting up a local research environment leveraging MinIO, and integrate with popular query engines like Apache Spark and Trino for a comprehensive development experience. A minimal configuration sketch follows this list.
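
To make the MinIO-plus-Spark setup concrete, here is a minimal PySpark sketch of a local Iceberg catalog backed by a MinIO bucket. The endpoint, credentials, bucket, catalog, and table names are placeholder assumptions for a throwaway local environment, not values taken from the post, and the package versions are illustrative.

```python
# Minimal sketch: a local Iceberg "lakehouse" catalog on top of MinIO.
# All names, credentials, and versions below are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-local-lakehouse")
    # Iceberg Spark runtime + S3A filesystem support (versions illustrative).
    .config("spark.jars.packages",
            "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0,"
            "org.apache.hadoop:hadoop-aws:3.3.4")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    # A Hadoop-type catalog whose warehouse lives in a MinIO bucket.
    .config("spark.sql.catalog.lakehouse", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lakehouse.type", "hadoop")
    .config("spark.sql.catalog.lakehouse.warehouse", "s3a://warehouse/")
    # Point S3A at the local MinIO endpoint with its default dev credentials.
    .config("spark.hadoop.fs.s3a.endpoint", "http://localhost:9000")
    .config("spark.hadoop.fs.s3a.access.key", "minioadmin")
    .config("spark.hadoop.fs.s3a.secret.key", "minioadmin")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# Hidden partitioning: queries filter on event_ts directly, while Iceberg
# maps rows to daily partitions behind the scenes.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lakehouse.events (
        user_id  BIGINT,
        action   STRING,
        event_ts TIMESTAMP
    ) USING iceberg
    PARTITIONED BY (days(event_ts))
""")

spark.sql("INSERT INTO lakehouse.events VALUES (1, 'signup', current_timestamp())")

# Time travel: every commit produces a snapshot that can be inspected and queried.
spark.sql("SELECT snapshot_id, committed_at FROM lakehouse.events.snapshots").show(truncate=False)
```

Because the catalog is of the `hadoop` type, table metadata lives alongside the data files in the bucket, so this local setup needs no separate metastore service.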

Apache Iceberg 101

Apache Iceberg, introduced by Netflix, stands as a premier open table format implementation designed to address three critical challenges in data processing:

From Ingestion to Insight: Creating a Budget-Friendly Data Lake with AWS

Why it matters 🚀

  • The solution is equipped to handle both structured and unstructured data, a crucial aspect for both analytical and engineering tasks.
  • It is capable of facilitating both real-time streaming and batch data processing.
  • With proper configuration, it proves to be cost-efficient and scalable; both storage and processing tiers are decoupled and highly optimized.
  • The solution adheres to regulatory and compliance requirements, ensuring data protection and the safeguarding of sensitive information.

What you will learn

  • How specific AWS services synergize to provide a serverless data platform.
  • A no-code approach to establishing an infrastructure capable of collecting, transforming, and querying underlying data with minimal cost implications.
  • The practical distinctions between representing files in JSON and Parquet formats.
  • Techniques for querying streaming data in quasi real-time using SQL (see the sketch after this list).
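
As a taste of what the walkthrough covers, here is a rough Python sketch of the JSON-versus-Parquet distinction and an ad-hoc SQL query, assuming Athena as the query engine. The bucket, database, and table names are placeholders of my own; the post itself takes a no-code approach, so this only illustrates what the two formats and the query step look like from code.

```python
# Rough sketch, not the post's exact pipeline: all resource names are placeholders.
import boto3
import pyarrow.json as pj
import pyarrow.parquet as pq

# JSON vs. Parquet: the same newline-delimited JSON records are re-encoded
# as columnar, compressed Parquet, which a SQL engine scans far more cheaply.
events = pj.read_json("events.json")            # assumes NDJSON input
pq.write_table(events, "events.parquet", compression="snappy")

# Quasi real-time querying: run standard SQL over the landed files with Athena.
athena = boto3.client("athena", region_name="eu-west-1")
run = athena.start_query_execution(
    QueryString="SELECT action, COUNT(*) AS cnt FROM events GROUP BY action",
    QueryExecutionContext={"Database": "biohack_lake"},                 # placeholder database
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # placeholder bucket
)
print("Athena query started:", run["QueryExecutionId"])
```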

Story background

Imagine being the owner of a cutting-edge bio-hacking startup. Naturally, your focus lies in meticulously monitoring user behavior, uncovering invaluable insights, and computing pertinent metrics that stand up to the scrutiny of potential investors.