Macro Evals for Agentic Systems
This cookbook outlines a macro-evaluation workflow for analyzing multi-agent systems at scale using a simulated electric vehicle order pipeline. It demonstrates how to look past individual responses and evaluate systemic behaviors such as orchestration, routing, and tool choices by combining lower-level execution checks (via Promptfoo) into population-level trace analyses to discover and...
https://developers.openai.com/cookbook/examples/partners/macro_evals_for_agentic_systems/macro_evals_for_agentic_systems
OpenAI
When an agentic system fails, the problem is often larger than a single bad response. A handoff may happen too late, a specialist agent may
Python utility package for building Claude Code hooks
https://github.com/RasmusGodske/claude-hook-utils
GitHub
Production RAG with LangChain & Vector Databases – Full Course
Learn to build, debug, optimize, and scale RAG systems for production.
https://www.youtube.com/watch?v=mHxLXzYjQRE
YouTube
🚀 Free Production AI Starter Kit: https://bit.ly/production-ai-pack
This course teaches what tutorials skip: why 90% of RAG projects fail and how to fix them.
💻 Code, Parts 1-5: h…
Build a Live Object Detection App for the Reachy Mini With TensorFlow and PyCharm
In this tutorial, we build a live object detection app using TensorFlow and PyCharm, then deploy it onto the Reachy Mini open-source robot for real-time object tracking.
https://blog.jetbrains.com/pycharm/2026/05/build-a-live-object-detection-app-for-reachy-mini-with-tensorflow-and-pycharm/
The JetBrains Blog
nesquena / hermes-webui
Hermes WebUI: The best way to use Hermes Agent from the web or from your phone!
https://github.com/nesquena/hermes-webui
GitHub
Webwright
A simple SWE style browser agent framework that achieves SOTA results on long horizon web tasks.
https://github.com/microsoft/Webwright
GitHub
PookieDb
Django-style ORM for SQLite and PostgreSQL. Chainable QuerySet API, migrations, relationships, and an interactive CLI. Works outside Django.
https://pypi.org/project/pookiedb/
Breaking Circular Imports in Python Without Losing Type Safety
The article explores practical techniques for handling circular imports in Python while preserving static type safety, using a financial modeling framework as a real-world example. It argues that strategically placed local imports and dependency thunks are often the least-bad solution, offering better type checking and maintainability than context dictionaries, builder patterns, or large...
https://www.orcaset.com/blog/breaking-circular-imports-in-python-without-losing-type-safety
Orcaset
boundless-world-model
High-fidelity world models for general embodied intelligence, such as data engines and world simulators.
https://github.com/boundless-large-model/boundless-world-model
GitHub
Django: introducing django-integrity-policy
The article introduces django-integrity-policy, a package that helps Django applications enforce browser Integrity Policy headers, protecting against unauthorized or tampered third-party scripts and resources.
https://adamj.eu/tech/2026/05/31/introducing-django-integrity-policy/
adamj.eu (Adam Johnson)
Back in January, Firefox’s Security & Privacy Newsletter for 2025 Q4 piqued my interest with this mention:
AI Engineering for Developers
A tour through AI engineering for developers who already know how to ship software. Fourteen chapters, no LinkedIn voice, no slow warm-up. We will go from 'what is a foundation model' to 'how do you run agents in production on Google Cloud' without skipping the parts that matter.
https://www.lucavall.in/blog/ai-engineering-for-developers
Luca Cavallin
How Servers Work: A Hands-On Introduction to TCP Sockets
Learn how servers actually work by building a tiny TCP server and client from scratch. A hands-on introduction to sockets, TCP, and the network programming model every backend, DevOps, and platform engineer should go through at least once.
https://labs.iximiuz.com/tutorials/how-servers-work-tcp-sockets
iximiuz Labs
The Smallest Brain You Can Build: A Perceptron in Python
https://ranpara.net/posts/perceptron-explained-from-scratch/
Devarsh Ranpara
A perceptron explained from scratch in Python, with interactive demos. Learn weights, bias, the decision boundary, epochs, learning rate, and why we normalize data.
Are you expected to run five Python type-checkers now?
https://pyrefly.org/blog/too-many-type-checkers/
pyrefly.org
Library maintainers may feel overwhelmed by the plurality of type checkers that exist. We offer some guidance on how to focus their efforts where they matter most.
Open-LLM-VTuber / Open-LLM-VTuber
Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking face, running locally across platforms
https://github.com/Open-LLM-VTuber/Open-LLM-VTuber
GitHub