April 2024 Newsletter - How we Built a 19 PiB Logging Platform, Migrating to ClickHouse, Building a Rate Limiter, & more

Welcome to the April ClickHouse newsletter, where we round up what’s been happening in real-time data warehouses over the last month.

This month, we have the 24.3 release, a rate limiter built on ClickHouse, a MySQL-to-ClickHouse migration story, meetup videos, and more!

Featured community member

This month's featured community member is Shivji kumar Jha, a Staff Engineer for Data Platforms at Nutanix.

Shiv leads a five-member team, managing and supporting Nutanix's data platform, which acts as a service for messaging, streaming, event sourcing, analytics, and time series databases. Shiv actively engages with the communities of the technologies used at Nutanix, including ClickHouse.

We recently hosted a ClickHouse meetup in Nutanix’s office in Bangalore, India. Shiv was invaluable in making this event happen, helping organize it, and acting as an MC for the evening. He recorded all the talks and uploaded them to YouTube afterward. Shiv also participated in a follow-up Q&A session on 15th April to address unanswered questions from the meetup.

Thanks for all your work, Shiv, and we’ll see you at the next meetup!

Follow Shivji on LinkedIn

24.3 release

The big feature in the 24.3 release is the analyzer being enabled by default. The analyzer is a new query analysis and optimization infrastructure that’s been in the works for a couple of years. It lets you have multiple ARRAY JOIN clauses in a query, treats tuple elements like columns, handles queries with nested CTEs and subqueries, and more.

Read the release post

Storing Continuous Profiling Data in ClickHouse

Coroot is an open-source tool that turns observability data into actionable insights. Nikolay Sivko wrote a blog post describing how they built their own storage system for profiling data based on ClickHouse. After defining continuous profiling, Nikolay takes us through the data model and gives examples of queries that check on the performance of a service.

Read the blog post

Migrating to ClickHouse: Releem's Journey

Releem is a MySQL performance tuning tool that automatically detects performance degradation and optimizes configuration files. To do this, they collect metrics from hundreds of database servers across various operating systems and cloud solutions.

They used to store these metrics in MySQL, which started to struggle once it reached almost 5 billion records. Enter ClickHouse, which helped shrink the database size by a factor of 20, cut aggregation query times from 45 minutes to 2 minutes, and reduced the page load time of the Releem dashboard by 25%.

Read the blog post

How we Built a 19 PiB Logging Platform with ClickHouse and Saved Millions

Rory C., SRE at ClickHouse, shared his experience building a platform for the logging data generated by ClickHouse Cloud. Rory takes us through key design decisions, including whether to use Kafka and whether to use structured or unstructured logging. He also explains why the team decided to use OpenTelemetry to collect metrics and compares the cost of the in-house solution with that of an off-the-shelf product like Datadog.

Read the blog post

Building a Rate Limiter with ClickHouse

If you were going to build a rate limiter, the obvious choice for storing the data would be Redis. But Brad Lhotsky, Systems and Security Administrator at craigslist, was curious whether ClickHouse would be suitable and used it to build a proof-of-concept. Brad shared the slides of a talk explaining how he imported data from Kafka, built a bridge from the ACL API to ClickHouse, and tested high availability, all in just one week.
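If you haven't built a rate limiter before, the core idea is to count each client's recent requests and reject any that exceed a threshold. The sketch below shows that idea as a simple in-memory sliding window in Python. It is not Brad's design and does not touch ClickHouse, Kafka, or the ACL API; the class name, method names, and the limit and window values are all made up for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per client within the last `window_seconds`.

    Hypothetical illustration only: state lives in process memory, whereas a
    real deployment would keep it in a shared store.
    """

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        # One queue of accepted-request timestamps per client.
        self._hits = defaultdict(deque)

    def allow(self, client_id, now):
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: reject and do not record
        hits.append(now)
        return True


# Example: 3 requests per 60 seconds for a single (made-up) client address.
limiter = SlidingWindowRateLimiter(limit=3, window_seconds=60)
decisions = [limiter.allow("203.0.113.7", now=t) for t in (0, 10, 20, 30, 61)]
```

Here the fourth request (at 30 seconds) is rejected because three requests are already inside the window, and the fifth (at 61 seconds) is accepted because the first one has expired. A storage-backed version would replace the in-memory queues with a query that counts recent rows per client.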

View the slide deck

Video Corner

ClickHouse Cloud Updates

  • Over the last nine months, we’ve been rebuilding the UI for ClickHouse Cloud and started rolling it out to everybody last week.

  • Today, ClickPipes introduces beta support for continuous data ingestion from S3 and GCS. Reply to this post to let us know if you’re interested in trying it!

  • Tokyo (ap-northeast-1) has been added as a new region for AWS. Sign up now.

Post of the month

Our favorite post this month was by Divyendu Singh about real-time monitoring.

See it here

Upcoming events

 
