Data1984
This channel is mostly about data-related stuff. Some of the main topics are #DataEngineering #AI #SQL #Python #cloud

Contact: @gorros
Looks like the semantic layer is becoming a standard feature in analytics engineering. I think LLM adoption and MCP are driving it. I am experimenting with dbt's MetricFlow, but other vendors such as Snowflake also offer this functionality.
https://www.bigdatawire.com/2025/08/21/atscale-likes-its-odds-in-race-to-build-universal-semantic-layer/
And here is the podcast with Boris, Head of Claude Code.
🦐 PicoClaw is an ultra-lightweight personal AI Assistant inspired by nanobot, refactored from the ground up in Go through a self-bootstrapping process, where the AI agent itself drove the entire architectural migration and code optimization.

⚡️ Runs on $10 hardware with <10MB RAM: That's 99% less memory than OpenClaw and 98% cheaper than a Mac mini!
https://github.com/sipeed/picoclaw
ClickHouse MergeTree and HBase are the same thing at their core.

Not literally — but architecturally, they share the same DNA: the LSM Tree (Log-Structured Merge Tree).

Here's how it works:

1. Writes hit memory first → fast, no disk I/O
2. When memory fills, flush to an immutable sorted file on disk
3. Background compaction merges files → removes duplicates, applies deletes
4. Bloom filters + sparse indexes make reads fast without scanning everything
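A toy sketch of those four steps in Python, just to make the pattern concrete. Everything here (TinyLSM, memtable_limit, TOMBSTONE) is a made-up name for illustration, not how HBase or ClickHouse implement it, and a plain binary search stands in for Bloom filters and sparse indexes:

```python
import bisect

TOMBSTONE = object()  # hypothetical marker for a deleted key


class TinyLSM:
    """Toy LSM tree: an in-memory buffer plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # step 1: writes land in memory
        self.runs = []       # step 2: flushed sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just a write of a tombstone; compaction applies it later.
        self.put(key, TOMBSTONE)

    def flush(self):
        # Step 2: freeze the buffer into an immutable sorted run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def compact(self):
        # Step 3: merge all runs, newest value wins, tombstones are dropped.
        merged = {}
        for run in self.runs:  # oldest first, so newer values overwrite
            merged.update(run)
        live = sorted((k, v) for k, v in merged.items() if v is not TOMBSTONE)
        self.runs = [live] if live else []

    def get(self, key):
        # Step 4 (simplified): check memory, then runs from newest to oldest.
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is TOMBSTONE else value
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                value = run[i][1]
                return None if value is TOMBSTONE else value
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)      # buffer is full, first run is flushed
db.put("a", 10)
db.delete("b")      # buffer is full again, second run is flushed
db.compact()        # one run remains, holding only the live key "a"
```

Reads before compaction already return the right answer because newer runs are checked first; compaction only shrinks how much has to be checked.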

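HBase calls these HFiles. ClickHouse calls them Parts. Cassandra calls them SSTables. Same idea, with one caveat: ClickHouse MergeTree has no memtable, so each insert batch is sorted and written straight to disk as a new part.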

What ClickHouse adds on top:
★ Columnar layout inside each part (OLAP-optimized)
★ The merge step does useful analytical work — deduplication (ReplacingMergeTree), summation (SummingMergeTree), pre-aggregation (AggregatingMergeTree)
★ Sparse indexing at granule level (8192 rows by default) rather than row-level
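
To illustrate the last two points, here is a rough Python sketch of merge-time collapsing and a granule-level sparse index. It is a simplification under my own assumptions: parts are lists of (key, value) pairs sorted by key, and merge_parts, replacing, summing, build_sparse_index and lookup are hypothetical helpers, not ClickHouse APIs:

```python
import bisect
import heapq
from itertools import groupby
from operator import itemgetter


def merge_parts(parts, combine):
    """Merge sorted parts (oldest first) and collapse rows that share a key."""
    # Tag each row with its part's age so older rows sort first within a key.
    tagged = [[(key, age, value) for key, value in part]
              for age, part in enumerate(parts)]
    out = []
    for key, rows in groupby(heapq.merge(*tagged), key=itemgetter(0)):
        out.append((key, combine([value for _, _, value in rows])))
    return out


def replacing(values):
    return values[-1]    # keep the newest version, in the spirit of ReplacingMergeTree


def summing(values):
    return sum(values)   # add values up, in the spirit of SummingMergeTree


def build_sparse_index(part, granule_size):
    """Store only the first key of every granule instead of every row."""
    return [part[i][0] for i in range(0, len(part), granule_size)]


def lookup(part, index, granule_size, key):
    """Binary-search the sparse index, then scan a single granule."""
    g = bisect.bisect_right(index, key) - 1
    if g < 0:
        return None
    start = g * granule_size
    for k, v in part[start:start + granule_size]:
        if k == key:
            return v
    return None


old_part = [(1, 10), (2, 20), (4, 40)]
new_part = [(2, 5), (3, 30)]

deduplicated = merge_parts([old_part, new_part], replacing)
totals = merge_parts([old_part, new_part], summing)

index = build_sparse_index(deduplicated, granule_size=2)
found = lookup(deduplicated, index, 2, key=4)
```

The index holds one entry per granule, so it stays small enough to keep in memory even for huge parts, at the cost of scanning a few extra rows per lookup.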

I still teach HBase in my data engineering course — as a NoSQL example and as a core part of the Hadoop ecosystem. And honestly, I started my DE career working with it.

Sometimes I wondered: is this too specific? Should I simplify the curriculum and drop it?

But my teaching philosophy has always been to explain technologies by focusing on what's fundamental and shared across many systems. And this connection — HBase and ClickHouse both rooted in LSM Trees — is exactly why that approach pays off.

The tools change. The patterns underneath them don't.
I just learned you can run Claude Code locally with Ollama.

Ollama 0.19 (preview, released yesterday) is now powered by Apple's MLX framework — and one thing that caught my attention as a Claude Code user: Ollama now reuses its cache across conversations, meaning less memory overhead and more cache hits when using a shared system prompt with tools like Claude Code.

That's a meaningful improvement for agentic workflows.

The setup is a single command:

ollama launch claude --model qwen3.5:35b-a3b-coding-nvfp4

Whether you're working in an air-gapped environment or just tired of API costs — local coding agents are getting genuinely viable.

(Requires a Mac with 32GB+ unified memory)

https://ollama.com/blog/mlx