AI Agents Dominate Productivity – But Who's Using Them Right?

AI Agents Revolutionizing the Workplace – But Who’s Using Them Effectively?

Thursday, May 28, 2026

Hello, this weekly newsletter guides you through the most important new videos from a curated selection of AI and Coding YouTube channels. Each video comes with a compact summary, plus a daily overview of the dominant topics. If interested, simply click the link under the summary.

The week was characterized by intense discussions about the use of AI agents in various application areas. A central topic was the integration of agents into existing systems, such as connecting Archon with Jira, which allows you to have a separate conversation with Archon in every Jira ticket. This integration is intended to increase efficiency and improve collaboration between human developers and AI agents.

Another focus was on the use of Claude Code and other AI tools for creating and managing web applications. Various workflows and techniques were presented that boost productivity and accelerate application development. Particular emphasis was placed on the importance of structured planning and implementation processes that enable complex tasks to be handled efficiently.

Additionally, various tools and models were compared, including Claude Code, Codex, Cursor, and open-source solutions. The discussions showed that choosing the right tool depends on individual needs and workflows. Claude Code works well for developers who need motivation or feel uncertain, while Codex is suitable for experienced developers seeking a reliable and efficient tool. Cursor is ideal for teams looking for a comprehensive solution for leveraging AI in the cloud.

Another important aspect was the discussion about the future of AI and the role of agents in the workplace. It was emphasized that using AI not only increases productivity but can also enable new business models and ways of working. Integration of AI into existing systems and the development of agents that can work autonomously were identified as key factors for the future of AI.

Overall, the week demonstrated that AI agents are a powerful tool that can revolutionize the workplace. However, effectively using these tools requires careful planning and adaptation to individual needs and workflows. The discussions and demonstrations of the week have shown that there are already many successful applications, but also considerable potential for further innovations and improvements.

AI with Arnie (4 new videos)

Google, what’s going on?
22.5.2026, 08:50:07
The video critically analyzes Google’s latest AI offerings, particularly Gemini 3.5 Flash and Antigravity 2.0, which were presented at Google I/O. The author was initially impressed but was disappointed after deeper testing. Gemini 3.5 Flash, promoted as more powerful and faster than its predecessor, scores lower in benchmarks and practical tests than other models such as GPT 5.5 while costing more. It also consumes significantly more tokens, making it less efficient. Antigravity 2.0, the new version of the Antigravity app, is criticized as a copy of the Codex app and shows numerous errors and issues in initial tests and user feedback. The Gemini CLI, previously valued as an open-source project, is being replaced by a non-open-source Antigravity CLI, which has also drawn criticism.

The author finds no meaningful use case for the new models and is overall disappointed with Google’s offerings. He hopes for a stronger Gemini 3.5 Pro, which is scheduled for release next month.

**Final comment:** The video explicitly addresses Google’s Gemini 3.5 Flash, Antigravity 2.0, and the Antigravity CLI and is aimed at intermediate to advanced users.
Codex: ChatGPT with hands
17.5.2026, 19:08:45
**Summary:**

The video introduces two personal agents, Hermes and OpenClaw, and shows how they differ from classic agents like Claude Code and Codex. The demonstrated use cases include stock market research, managing training and nutrition, video editing, AI news summaries, real-time monitoring, MCP server access, YouTube thumbnail creation, and full server management. The installation and configuration of Hermes on a virtual private server (VPS) is explained in detail, including GitHub integration and the use of Supabase for database applications. The video emphasizes Hermes’s self-improvement capabilities and the benefits of OpenClaw for tasks requiring continuous monitoring. The integration of multi-agents and super-agents is also shown, as well as the ability to solve complex tasks with the `/goal` command. The video closes with a comparison of Hermes and OpenClaw that highlights their respective strengths and use cases.

**Final comment:**
The video explicitly addresses the AI tools Hermes and OpenClaw and is intended for intermediate and advanced users.
Claude Code is no longer enough
15.5.2026, 13:06:39
The video presents the announcement of Subq (Subquadratic), a new AI model with a context window of 12 million tokens, representing a significant advancement over existing models like ChatGPT (1 million tokens). Subq uses a subquadratic sparse attention architecture that enables linear rather than quadratic scaling, resulting in significantly higher efficiency and lower costs. The model is said to be 52 times faster and to cost only 5% as much as competing models like Opus. The announcement was published on X (formerly Twitter) by Alexander and includes an API for Subq Code and Subq Search.

The advantage of Subq lies in its ability to keep large codebases or lengthy documentation fully in context, which could make current workarounds like RAG (Retrieval-Augmented Generation) or agent workflows unnecessary. The sparse attention architecture considers only the relevant tokens, which significantly reduces the compute required and increases speed. However, detailed technical papers are not yet available, and the available benchmarks are limited and somewhat disputed.
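
To make the scaling argument concrete, here is a back-of-the-envelope sketch that counts attention score computations for dense attention (every token attends to every token) and for a sparse scheme in which each token attends to a fixed budget of `k` tokens. The budget `k = 4096` is a hypothetical value chosen only for illustration; the summary does not say how Subq selects tokens or how many it keeps.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by full attention: grows quadratically with n_tokens."""
    return n_tokens * n_tokens


def sparse_attention_pairs(n_tokens: int, k: int) -> int:
    """Query-key pairs when each token attends to at most k tokens: grows linearly."""
    return n_tokens * min(k, n_tokens)


if __name__ == "__main__":
    K = 4096  # hypothetical per-token budget, not a published Subq parameter
    for n in (1_000_000, 12_000_000):
        dense = dense_attention_pairs(n)
        sparse = sparse_attention_pairs(n, K)
        print(f"{n:>10,} tokens: dense={dense:.2e} sparse={sparse:.2e} "
              f"ratio={dense / sparse:,.0f}x")
```

Under these assumptions, going from 1 million to 12 million tokens multiplies the dense count by 144 but the sparse count by only 12, which is the gap the linear-scaling claim refers to.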

The potential of Subq is enormous, as it could revolutionize working with AI in fields such as coding, law, research, and audits. However, the question remains whether the model can deliver the promised quality and reliability with large context windows. Available benchmarks show promising results, but further testing and confirmation are needed.

The video explicitly addresses the new Subq model and is intended for intermediate or advanced users who engage with the technical details and implications of AI models.
Is this AI breakthrough real?
7.5.2026, 15:15:00
The video tests and compares OpenAI’s new models, particularly GPT 5.5 and the image model GPT Image 2.0, as well as the competing DeepSeek V4. The tests cover various applications such as creating a website, simulating a beehive, a 3D motorcycle racing game, an interactive factory and production simulation, a traffic simulation, creating ComfyUI and n8n workflows, and analyzing financial data. The benchmarks show that GPT 5.5 improves significantly on previous versions in many areas, particularly in Terminal-Bench and Vending-Bench. The video also discusses current issues at Anthropic, particularly rate limits, performance problems, and model unreliability, as well as the pros and cons of OpenAI and Anthropic plans. It recommends not relying on a single provider and using both models to balance their different strengths and weaknesses.

The video explicitly addresses OpenAI (GPT 5.5, GPT Image 2.0, Codex), Anthropic (Claude Code), and DeepSeek V4. It is intended for intermediate and advanced users, as it includes detailed tests and technical analyses.

Cole Medin (10 new videos)

Archon + Jira: Drag a Ticket, Get a Pull Request (Live Build)
24.5.2026, 04:41:55
**YouTube Video Summary:**

This video demonstrates the integration of Archon with Jira. Archon is an open-source tool serving as an AI-Coding harness builder that allows packaging software development processes with AI-coding assistants into workflows. The focus is on establishing a connection between Archon and Jira so that each Jira ticket can have its own separate conversation with Archon.

The process begins with creating a GitHub issue that serves as context for building a Jira adapter. A PIV System Evolution workflow is used, consisting of 12 steps ranging from planning through implementation and validation. The workflow utilizes various models like Claude Code, Sonnet, and Opus to handle tasks efficiently.
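
The plan, implement, validate cycle behind the PIV name (spelled out later in this issue) can be sketched as a small control loop. This is a minimal illustration, not Archon's actual 12-step workflow: the function name `piv_loop`, the three callables, and the idea of feeding validation issues back into the next implementation round are all assumptions made for the example.

```python
from typing import Callable


def piv_loop(task: str,
             plan: Callable[[str], str],
             implement: Callable[[str, str], str],
             validate: Callable[[str], list[str]],
             max_rounds: int = 3) -> tuple[str, int]:
    """Run plan -> implement -> validate, retrying implementation with validation feedback.

    Returns the accepted result and the number of rounds it took.
    """
    current_plan = plan(task)  # planning happens once, up front
    feedback = ""
    for round_number in range(1, max_rounds + 1):
        result = implement(current_plan, feedback)
        issues = validate(result)  # an empty list means validation passed
        if not issues:
            return result, round_number
        feedback = "; ".join(issues)
    raise RuntimeError(f"validation still failing after {max_rounds} rounds: {feedback}")
```

In a real harness each callable would wrap a separate agent session; capping the rounds keeps a failing task from looping indefinitely.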

Throughout the video, several challenges and solutions are discussed, including Atlassian authentication, webhook setup, and adapter customization to ensure proper communication with Jira. Various tools and models are mentioned, including Claude, OpenAI, Gemini, and open-source models.

By the end of the video, a successful test is performed where Archon responds to a request in a Jira ticket. The adapter is now able to communicate with Jira and respond to inquiries.

**Final Comment:**
The video explicitly addresses the use of Claude (Anthropic) and OpenAI models, with a focus on integrating Archon with Jira. It’s aimed at intermediate and advanced users who want to become familiar with AI-coding assistants and workflow integration.
Plan with Claude Opus, Build with Kimi K2.6? LIVE Mixed-Provider Benchmark
22.5.2026, 03:36:08
The video is a summary of major news from the artificial intelligence world. It covers several stories, including the latest version of OpenAI’s language model, which offers improved text generation capabilities. It also reports on a new open-source initiative enabling developers to train their own AI models. Another topic is the introduction of a new tool that simplifies integrating AI into existing enterprise software. The video also discusses the ethical implications of these developments and emphasizes the importance of transparency and accountability in AI research.

Final Comment: The video addresses OpenAI and open-source tools and is aimed at intermediate and advanced users.
Anthropic Just Dropped a Masterclass on Building Agent Harnesses (for Large Codebases)
21.5.2026, 00:00:30
The video discusses strategies for effectively using Claude Code with large and complex codebases. It begins by noting that many tutorials focus on simple code examples while working with large codebases is often overlooked. The creator presents ideas from an Anthropic blog post addressing the use of Claude Code in large codebases. The main thesis is that the “harness” (environment and tools) is just as important as the underlying model.

Key strategies include:
1. **Global Rules**: These should be lean and layered to help Claude Code navigate different parts of the codebase. It’s recommended to add further rule files in subdirectories to provide context-specific instructions.
2. **Hooks**: These can be used to make the entire AI environment self-improving. Start hooks can load context-specific information, while stop hooks can suggest updates to global rules.
3. **Skills**: These are reusable prompts or processes that only load when needed. They can be restricted to specific paths in the codebase to reduce context size.
4. **Language Server Protocol (LSP) and MCP Servers**: These enable Claude Code to use the same navigation as a developer in their IDE. This is particularly useful for large codebases, enabling more targeted searches.
5. **Sub-Agents**: These can be used for exploratory tasks to avoid overloading the main session’s context window. They perform analyses and return a summary.
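
The idea of layered rules can be sketched as a lookup that walks from the repository root down to the directory of the file being edited and collects every rule file on the way, most general first. This is only an illustration of the layering principle, not a description of how Claude Code resolves its rules internally; the function name `collect_rule_files` and the default file name `CLAUDE.md` are assumptions, since the summary does not name the file.

```python
from pathlib import Path


def collect_rule_files(repo_root: Path, target: Path,
                       rule_name: str = "CLAUDE.md") -> list[Path]:
    """Return rule files from the repo root down to target's directory, most general first."""
    repo_root = repo_root.resolve()
    target = target.resolve()
    directory = target if target.is_dir() else target.parent
    # relative_to raises ValueError if the target lies outside the repository.
    relative = directory.relative_to(repo_root)

    # Build the chain of directories: root, root/a, root/a/b, ...
    directories = [repo_root]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    # Keep only the directories that actually contain a rule file.
    return [d / rule_name for d in directories if (d / rule_name).is_file()]
```

For a file such as `services/billing/api.py`, this returns the root rules followed by any rules in `services/` and `services/billing/`, so only the instructions relevant to that part of the codebase end up in context.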

The creator also demonstrates a plugin that integrates some of these strategies into a demo codebase to facilitate implementation. He emphasizes the importance of actively maintaining and improving the AI environment (AI Layer) to enhance Claude Code’s effectiveness.

The video explicitly addresses Claude Code and is aimed at intermediate to advanced users who already have experience working in larger codebases.
Pushing My AI Dark Factory to Its Limits with Opus + Kimi Combined
19.5.2026, 03:35:08
The video shows a detailed exploration and demonstration of the coding agent Pi. The focus is on presenting Pi as a minimal, customizable coding agent tailored to the individual user’s workflows. The user integrates Pi with Archon, an open-source tool for building harnesses, and demonstrates how Pi can be used with various models like Kimi, Minimax, and Opus.

Key steps and findings from the video include:

1. **Setting Up Pi with Kimi**: The user shows how to configure Pi to work with a Kimi code subscription instead of Codex. The steps for API key setup and integration into Pi’s configuration are explained in detail.

2. **Installing and Using Extensions**: The user installs and tests various extensions from the Pi marketplace, including a sub-agents extension, a web access extension, and an Archon workflow management extension. These extensions enable additional features like desktop notifications, status bars, and integration of Archon workflows.

3. **Creating a Custom Extension**: The user creates a custom extension called “Archon Dispatch” that transforms Pi into a control panel for Archon background tasks. This extension enables running Archon workflows, displaying live status information, and receiving notifications when workflows complete.

4. **Issues and Solutions**: During the demonstration, some problems arise, particularly with integrating Archon workflows and displaying workflow results in Pi. The user attempts to solve these issues with Kimi but encounters limitations in model capabilities. He discusses the advantages of combining more powerful models like Opus with cheaper models like Kimi to achieve the best results.

5. **Comparison with Other Tools**: The user compares Pi with other coding agents like Codex and Claude Code and highlights Pi’s advantages, particularly its customizability and speed.

6. **Future Plans**: The user plans to continue working on integrating Pi and Archon in future livestreams and videos, and may develop an Archon workflow for creating Pi extensions.

The video explicitly addresses the AI tools and models Claude, Codex, Kimi, Minimax, Opus, and OpenRouter. It’s aimed at intermediate and advanced users interested in customizing and integrating coding agents.
Pi is INCREDIBLE – Building a Custom Coding Agent Live
17.5.2026, 03:42:53
**Summary:**

In this stream, the new workflow marketplace for Archon was presented and two community workflows were added. The first workflow, “Idea to Work Order,” helps convert ideas into detailed work orders for development. The second workflow, “Archon SmartMR Review,” is a GitLab equivalent to the pull request review workflow.

The process of workflow integration was demonstrated live, including creating a pull request, automatic review via GitHub Action, and subsequent release. Various technical challenges and improvements were also discussed, such as updating the Archon CLI and notifications about available updates.

**Final Comment:**

The video addresses the use of Claude (Anthropic) and specific tools like Archon. It’s aimed at intermediate and advanced users.
🔴 The AI Coding Marketplace is Finally LIVE!
15.5.2026, 03:17:26
The video demonstrates how to create fully animated videos with audio using AI. The process uses multiple technologies, including HyperFrames for rendering, Claude Code for control, ElevenLabs or Kokoro for voice output, and Archon as a workflow manager. The creator provides an open-source repository that makes it possible to create AI-generated videos in less than 10 minutes. The workflow includes scripting, audio creation, visual rendering, and syncing all elements. The creator emphasizes that while the technology isn’t perfect yet, it’s improving rapidly and already finding useful applications, particularly for explainer videos. The process is explained in detail, including the ability to customize templates and create your own. Examples of generated videos are shown at the end.

The video explicitly addresses HyperFrames, Claude Code, ElevenLabs, Kokoro, and Archon and is aimed at intermediate users.
Make the PERFECT Videos with Claude Code (Full Workflow)
14.5.2026, 00:00:24
**Summary:**

In this livestream, development of a workflow marketplace for Archon, an open-source harness builder for AI-coding, continued. The focus was on creating a marketplace where users can share their own workflows and use other workflows. The process involved creating an Archon workflow that automatically reviews and approves pull requests for new workflows. The stream started with merging an existing pull request that introduced the marketplace UI and continued with creating a new workflow to review pull requests.

The process involved several steps, including creating a plan, implementing the plan, and reviewing generated code. Various questions and adjustments were discussed to ensure the workflow functions correctly. The stream ended with creating a pull request for a test workflow and demonstrating how the automatic review process works.

**Final Comment:**
The video explicitly addresses the use of Claude (Claude Code) and is aimed at intermediate to advanced users.
Building the App Store for Agentic Engineering
12.5.2026, 04:02:35
**YouTube Video Summary:**

The video presents a live demo of the AI tool Archon, an open-source harness builder for AI-coding. The streamer showcases his current AI-coding workflow and how Archon accelerates this process by 10x. He demonstrates using Archon for various tasks, including handling GitHub issues (brownfield development) and creating new features (greenfield development).

1. **Brownfield Development:**
– The streamer shows how he uses Archon to handle multiple GitHub issues in parallel. Workflows are used that encompass planning, implementation, and validation.
– Workflows are designed to create comprehensive pull requests that can then be manually reviewed.
– It’s demonstrated how Archon integrates with a “Second Brain” (a knowledge and task organization system) to optimize the workflow.

2. **Greenfield Development:**
– The streamer plans and implements a new feature for Archon: a workflow marketplace enabling users to create and share their own workflows.
– The PIV loop (Plan, Implement, Validate), a structured approach to AI-driven development, is used.
– It’s shown how Archon workflows can be used for complex tasks like creating a new marketplace feature.

3. **Technical Details:**
– Archon enables creating workflows that can integrate various AI models and tools, including Claude, Codex, and others.
– Workflows are designed to be deterministic and repeatable, increasing reliability and efficiency.
– The streamer emphasizes the importance of human-in-the-loop processes to ensure result quality.

4. **Integration and Extensions:**
– It’s shown how Archon can be integrated with other tools like Beads (a memory system).
– The streamer discusses Archon’s advantages compared to other tools like n8n and emphasizes Archon’s specialization in AI-coding.

5. **Community and Further Development:**
– The streamer mentions the Dynamous Community, where he regularly offers workshops and courses to help users effectively use Archon and other AI tools.
– He announces that he’ll continue conducting livestreams to demonstrate Archon’s development and use.

**Final Comment:**
The video explicitly addresses the Archon tool and is aimed at intermediate to advanced users interested in AI-coding and workflow automation.
🔴LIVE – My AI Coding Workflow has 10x'd Again with Archon – See it in Action
10.5.2026, 05:56:02
The YouTuber expresses dissatisfaction with the current YouTube landscape, which heavily focuses on reporting about Claude and its latest features. He wants to stand out from the crowd by instead offering deeper, more technical content focusing on actual building and AI-coding principles. To this end, he plans to do three livestreams per week (Monday, Thursday, Saturday) showcasing projects like Archon and the Dark Factory experiment while working interactively with the community. He emphasizes that he’ll continue covering relevant AI news but with a focus on practical application and long-term value. The content is aimed at intermediate and advanced users interested in AI-coding and systems. The YouTuber explicitly addresses Claude and Claude Code.
Harness Engineering: What Separates Top Agentic Engineers Right Now
28.5.2026, 00:00:02
The video explains the term “harness engineering” and its significance in AI, particularly for AI-coding assistants. Harness engineering describes the process of designing an environment (wrapper) around an AI model to expand its capabilities and handle specific tasks more efficiently. It distinguishes between two main aspects: optimization within a single AI session and orchestrating multiple AI sessions into a larger workflow.

The first aspect, optimization within a session, builds on the concept of context engineering but goes further by introducing additional control mechanisms like hooks and sub-agents. The second aspect, orchestrating multiple sessions, enables handling more complex tasks by focusing each session on a specific subtask. This is demonstrated through tools like the “Ralph Loop,” which automatically coordinates multiple AI sessions.

The video emphasizes the importance of personal responsibility and continuous system improvement through learning from errors and adapting rules and processes. It also references Google Cloud Agent CLI as an example of a tool that facilitates building and deploying AI agents.

**Final Comment:** The video addresses Claude, OpenAI, Google Cloud Agent CLI and is aimed at intermediate and advanced users.

Nate Herk | AI Automation (10 new videos)

100 Hours Testing Claude Code vs ChatGPT Codex (honest results)
26.5.2026, 20:02:02
The video compares OpenAI Codex and Claude Code, two AI-powered coding agents, based on features, pricing, and three specific use cases. It starts with a brief introduction to both tools, highlighting that Claude Code from Anthropic offers more customization options, while Codex from OpenAI has a more unified workflow. The comparison includes analysis of three tasks: creating a research report, a landing page, and an interactive dashboard. Claude Code proves superior in frontend work and complex planning, while Codex excels in research-intensive tasks and rapid execution. Cost and token usage are analyzed in detail, with Codex being more efficient in token usage. The video concludes with a recommendation to choose the right tool based on your specific use case, emphasizing the rapid evolution of both tools.

**AI-Tools/Models/Providers:** OpenAI Codex, Claude Code (Anthropic)
**Target Audience:** Intermediate
The Playbook for a $100M AI Agency
25.5.2026, 16:23:09
**YouTube Video Summary:**

The video is an interview with Devin Karns, CEO and co-founder of Custom AI Studio, discussing the future of AI agencies and strategies for a successful exit. Here are the key points:

1. **Market Development and Value of Development**:
– The value of development trends toward zero as AI systems become increasingly powerful.
– Companies must focus on AI-native organizations to remain competitive.

2. **Future of AI Agencies**:
– Many AI projects being sold today won’t survive until 2027.
– Focus should be on delivering solutions that provide genuine value to businesses, rather than chasing short-term trends.

3. **Strategies for a Successful Exit**:
– Devin Karns shares his experiences and strategies for building an AI agency with high enterprise value.
– He emphasizes the importance of relationships, trust, and the ability to understand customers’ true needs.

4. **Five Things Devin Karns Wishes He Knew Earlier**:
– **Decide on a path**: Decide whether you want to build a lifestyle business or a business with high exit value.
– **Package your offering**: Develop a clear offering that highlights the value of your services.
– **Charge for true value**: Price your services based on the value you deliver, not the time you invest.
– **Build your pipeline before you need it**: Establish relationships and a pipeline of potential customers before you actually need them.
– **Hire for the company you want to be**: Hire employees who share your vision and have the skills needed to scale your business.

5. **Examples and Case Studies**:
– Devin Karns shares examples of successful projects, such as reducing an e-commerce company’s refund rate from 21% to 16%, resulting in significant cost savings.

**Closing Comment**:
The video explicitly discusses Claude, OpenAI, and open-source models, as well as specific tools like Claude Code and Copilot. It’s aimed at intermediate and advanced users who already have AI experience and want to scale or optimize their business models.
The AI Offer You Can Sell Tomorrow Morning
22.5.2026, 16:37:49
The YouTuber explains how to build an AI business as a beginner without directly selling projects or retainers. Instead, he suggests starting by selling hours (consulting) to build trust and overcome imposter syndrome. He describes a step-by-step approach, starting with selling hours (Rung 0), followed by audits (Rung 1), projects (Rung 2), and finally retainers (Rung 3). The YouTuber emphasizes that you should first gain experience and build trust before taking on larger projects. He provides practical tips on how to acquire your first 10 customers by teaching friends and acquaintances, engaging in communities, and gradually building a portfolio. The YouTuber also mentions the importance of practice and experience to overcome imposter syndrome and achieve long-term success.

Closing Comment: The video discusses generally available AI tools and models without mentioning specific providers and is aimed at beginners.
Give Me 10 Mins and I’ll Save You Millions of Claude Tokens
21.5.2026, 12:58:00
The video explains the concept of prompt caching in Claude Code and Claude, particularly how it saves tokens and costs. Key points include:

– **Token Savings**: Cached tokens cost only 10% of normal input costs. For example, the user saves millions of tokens daily through caching.
– **Cache Window**: In Claude Code, the cache lasts one hour by default and is deleted after one hour of inactivity; with API usage or sub-agents, it expires after only 5 minutes.
– **Caching Mechanism**: The cache includes system instructions, tool definitions, and project context. Each new message or change (e.g., model switching) can break the cache and result in higher costs.
– **Practical Tips**: Users should avoid breaks, manually clear the cache when switching tasks, and place large documents in projects rather than in the chat.
– **Tools**: The user provides a token dashboard and a session handoff skill to better manage the cache and save tokens.
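
The savings from the 10% figure above can be estimated with a small calculation. The sketch below is a simplified model under stated assumptions: the function name `input_cost` is made up for this example, the price of 3.00 per million input tokens is a hypothetical placeholder rather than a quoted rate, and any surcharge for writing to the cache is ignored.

```python
def input_cost(total_tokens: int, cached_fraction: float,
               price_per_million: float, cached_discount: float = 0.10) -> float:
    """Input cost when a share of the tokens is read from the prompt cache.

    cached_discount=0.10 reflects the "10% of normal input costs" figure from the summary.
    Cache-write surcharges are deliberately left out of this simplified model.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = total_tokens * cached_fraction
    uncached = total_tokens - cached
    return (uncached + cached * cached_discount) * price_per_million / 1_000_000


if __name__ == "__main__":
    PRICE = 3.00  # hypothetical price per million input tokens
    without_cache = input_cost(10_000_000, 0.0, PRICE)
    with_cache = input_cost(10_000_000, 0.9, PRICE)
    print(f"no cache: {without_cache:.2f}, 90% cached: {with_cache:.2f}")
```

With 10 million input tokens and 90% of them served from the cache, the modeled cost falls from 30.00 to about 5.70, a reduction of roughly 81%. A cache break means the next request pays the full price for everything again.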

The video is intended for Intermediate users who use Claude Code or Claude intensively and want to optimize their token costs. It explicitly discusses Claude and Claude Code.
What Karpathy Joining Anthropic Actually Means For Claude
19.5.2026, 21:36:51
The video discusses the significance of Andrej Karpathy joining Anthropic and analyzes why this move is important for both Karpathy and Anthropic. Karpathy, a central figure in the modern AI world, has an impressive career, including his role as a co-founder of OpenAI and his work at Tesla. His recent projects, such as Eureka Labs and the development of concepts like “Vibe Coding” and “Context Engineering,” demonstrate his ability not only to develop AI but also to teach others how to use it effectively.

Anthropic has made significant progress recently, particularly with Claude Code, which has become a popular tool for developers and businesses. Karpathy’s arrival could indicate that Anthropic is expanding its strategy by not only improving AI models but also enhancing their applications and integration into real workflows. Karpathy’s focus on “Context Engineering” and creating environments that enable AI models to work more effectively aligns well with Anthropic’s approach.

The video makes three predictions: First, that Anthropic will develop an app store for contexts and workflows. Second, that there will be more features like “/goal” that enable complex tasks to be handled automatically. Third, that Anthropic will create an education platform to help users package and contribute their own workflows.

The video explicitly discusses Claude/Anthropic and is intended for Intermediate and Advanced users.
How to Use Your Claude Code Projects in Codex in 5 Mins
18.5.2026, 19:09:24
The video shows how to migrate a project from Claude Code to Codex without duplicating it or losing important information. The main difference between the two tools lies in their file names and folder structures: Claude Code uses a `claude.md` file and a `claude` folder, while Codex uses an `agents.md` file, a `codex` folder, and an `agents` folder for Skills. Both tools, however, share the same knowledge base such as documents, references, and scripts.

The author explains that you can use a simple prompt to instruct Codex to create the necessary files and folders and transfer the content from the Claude Code files. It’s important to make changes in both the Claude Code files and the Codex files to ensure consistency. Additionally, the author notes that Codex Sub-Agents aren’t called automatically and there are some differences in tools and commands.
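
Keeping the two instruction files consistent can also be scripted. The sketch below is an alternative illustration, not the prompt-based method shown in the video: it copies the Claude Code instruction file to the Codex file name whenever their contents differ. The function name `mirror_instructions` is hypothetical, the default file names are taken from the summary as written (the casing in a real project may differ), and the copy is one-way, so edits made only in the Codex file would be overwritten.

```python
from pathlib import Path


def mirror_instructions(project: Path, source_name: str = "claude.md",
                        target_name: str = "agents.md") -> bool:
    """Copy the Claude Code instruction file to the Codex file name.

    Returns True if the target was created or updated, False if it was already in sync.
    Raises FileNotFoundError if the source file does not exist.
    """
    source = project / source_name
    target = project / target_name
    content = source.read_text(encoding="utf-8")
    if target.is_file() and target.read_text(encoding="utf-8") == content:
        return False  # already in sync, nothing to write
    target.write_text(content, encoding="utf-8")
    return True
```

Running it after every change to the source file, for example from a pre-commit step, keeps both tools reading the same instructions.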

A practical example shows how Claude Code and Codex can work together to create and style an HTML document. The author recommends using both tools and not committing to a single tool to remain flexible.

Closing Comment: The video explicitly discusses Claude Code and Codex tools and is intended for Intermediate users.
The AI Career Opportunity Nobody is Talking About in 2026
17.5.2026, 16:24:10
The video creator, Nate, discusses an alternative career option in AI, away from the commonly promoted model of starting an AI automation agency. He argues that many companies, particularly large corporations, are increasingly hiring Chief AI Officers (CAIO) or have already done so, and this represents a great opportunity for people who don’t want to work in sales. Nate references an IBM study showing that 76% of surveyed CEOs either already have a CAIO or want to hire one, representing a significant increase compared to previous years. He explains that the CAIO role emerged similarly to the Chief Information Security Officer (CISO) role to address a new, pressing need in companies.

Nate emphasizes that there isn’t just the CAIO role; every department in companies is seeking AI-savvy leadership. He presents two paths to pursue this direction: Path A, where you start as an AI consultant or in an agency and are eventually acquired by a company, and Path B, where you build AI expertise internally in your current job and qualify yourself for a promotion. He argues that Path B could be more accessible for many people, as 57% of CAIOs were promoted internally.

Nate stresses the importance of loving what you do, as otherwise you won’t have the necessary perseverance and motivation to succeed in this field. He encourages viewers to become the AI-native version of their current role rather than jumping into a new field that doesn’t appeal to them. He concludes by stating that you don’t need to change your role, but rather the version of your role you perform.

The video explicitly mentions IBM and their research as well as the role of Chief AI Officer in companies. It’s more suitable for Intermediate and Advanced audiences, as it builds on solid knowledge and experience in the AI field.
How to Deploy Your Claude Automations (3 Methods)
15.5.2026, 15:16:02
The video explains three methods to deploy agents from Anthropic’s Claude Code environment so they run even when the user is not actively using them. The methods are compared using a framework that asks where the agent runs (locally or in the cloud) and how autonomously it operates.

1. **Loops**:
– Simple method where Claude Code is instructed to create a loop that executes a specific task at regular intervals (e.g., every 10 minutes).
– Uses internal tools like `CronCreate`, `CronList`, and `CronDelete` for scheduling.
– Loops are session-specific and run either in the desktop app or terminal.
– **Advantages**: No additional setup required, full agent functionality within the session.
– **Disadvantages**: The session and computer must be running, maximum runtime of 7 days, fixed intervals with random delay (jitter).

2. **Desktop Scheduled Tasks and Cloud Routines**:
– **Desktop Scheduled Tasks**: Run locally on your computer and require the desktop app to remain open.
– **Cloud Routines**: Run in Anthropic’s cloud and don’t require an active session or running computer.
– Both methods inject a prompt into a Claude Code session and execute the task.
– **Advantages**: No additional infrastructure needed, full Claude Code functionality, Cloud Routines can also be triggered via API or GitHub events.
– **Disadvantages**: Cloud Routines have a minimum of 1 hour between executions, limited number of executions per day (depending on your plan).

3. **Deployment on Modal or Trigger.dev**:
– Here, a script (Python for Modal, TypeScript for Trigger.dev) is deployed to the respective cloud platform and runs there on a schedule or as an API endpoint.
– **Advantages**: No need to keep your own computer or session running, good for deterministic processes.
– **Disadvantages**: Limited full agent functionality, AI processing occurs via API and is therefore more expensive. Using Claude’s Agent SDK also enables agentic features but is similarly costly.
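
The loop behavior described under method 1 (fixed interval, random jitter, hard runtime cap) can be sketched in plain Python. This is only an illustration of the scheduling idea, not Claude Code's actual scheduler: the function name `planned_runs` and the 30-second jitter bound are assumptions, and only the 10-minute interval and the 7-day cap come from the summary above.

```python
import random
from datetime import datetime, timedelta


def planned_runs(start, interval, max_runtime=timedelta(days=7),
                 max_jitter=timedelta(seconds=30), seed=None):
    """Return the run times of a fixed-interval loop with random jitter.

    The 30-second jitter bound is a made-up value for illustration.
    """
    rng = random.Random(seed)
    deadline = start + max_runtime
    runs = []
    tick = start + interval
    while tick <= deadline:
        # Each run is delayed by a random amount so that many loops
        # scheduled for the same minute do not all fire at once.
        delay = timedelta(seconds=rng.uniform(0, max_jitter.total_seconds()))
        runs.append(tick + delay)
        tick += interval
    return runs


if __name__ == "__main__":
    runs = planned_runs(datetime(2026, 5, 15), timedelta(minutes=10), seed=0)
    print(len(runs), "runs planned before the loop expires")
```

The sketch makes the trade-off visible: the schedule is only approximately periodic, and it simply ends once the runtime cap is reached.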

Additionally, **Managed Agents** from Anthropic and **Hooks** in Claude Code are briefly mentioned for event-driven automation.

The video is suitable for **Intermediate** users who already have experience with Claude Code and agents. It discusses specific tools and providers, including **Claude (Anthropic)**, **Modal**, **Trigger.dev**, and the **Claude Agent SDK**.
Anthropic Just Dethroned OpenAI. Here’s What Happens Next.
13.5.2026, 21:20:51
The video discusses the current dynamics in AI-powered coding tools, particularly the competition between OpenAI (Codex) and Anthropic (Claude Code). It begins with the observation that Anthropic has surpassed OpenAI in business usage, followed by OpenAI’s response of offering companies two months of free Codex usage. Anthropic countered with a 50% increase in Claude Code’s weekly usage limits for the next two months. The author interprets these steps as part of a “free trial phase” in which companies and users use the tools intensively while providers focus on adoption and data collection. The author argues that current prices aren’t sustainable and that users ultimately provide valuable training data for the AI models. Historical patterns are also referenced, showing how similar dynamics in other industries led to price increases. The advice is to take advantage of the current offers while avoiding vendor lock-in, so that users stay flexible and are not committed to a single provider when conditions change.

**AI-Tools/Models/Providers:** OpenAI (Codex), Anthropic (Claude Code); **Target Audience:** Intermediate to Advanced.
Every Level of Claude Explained in 21 Minutes
12.5.2026, 13:59:35
The video provides a detailed guide to using Claude across five progressive levels: Enthusiast, Beginner, Intermediate, Advanced, and Architect. Each level is defined by specific features and strategies that progressively guide the user from basic applications to complex automations.

– **Level 1: Enthusiast** – Basic use of Claude for simple tasks like writing emails or explaining content. An important tip is using screenshots, as Claude can read images.
– **Level 2: Beginner** – Introduction to Projects, which provide context and continuity. Key features are Memory, Connectors (integration with tools like Slack or Google Drive), File Creation (creating Excel files, PowerPoint presentations, etc.), Artifacts (interactive applications), Inline Visuals (visual displays within chats), and Office Add-ins (integration with Microsoft Office).
– **Level 3: Intermediate** – Use of Claude Cowork for tasks on your own computer. Key features are File System Access, Skills (reusable workflows), Scheduled Tasks (planned tasks), Mobile Control (smartphone control), Claude Design (design prototypes and presentations), as well as Plugins and Computer Use (navigation within apps).
– **Level 4: Advanced** – Use of Claude Code for complex automations and parallel work processes. Key features include using CLAUDE.md (configuration file), Plan Mode (planning and executing tasks), Sub Agents (specialized agents), Worktrees (isolated work areas), MCP (Model Context Protocol for tool integration), as well as various optimization techniques and custom commands.
– **Level 5: Architect** – Creation of fully autonomous systems that operate without user interaction. Key features are Cloud Routines (scheduled cloud tasks), Hooks (security-relevant logic), Channels (control of external platforms), Headless Mode (autonomous task execution), Agent SDK (creating your own products), and Remote Control (remote session control).

The video emphasizes that the transition to the highest level is less technical and more about trusting the systems. It recommends starting with simple, low-stakes automations and gradually implementing more complex systems.

**Closing Comment:** The video explicitly discusses Claude and is intended for Intermediate to Advanced users.

Ben AI (4 new videos)

12 Claude Plugins, Skills & MCP's I Can't Live Without
26.5.2026, 09:00:44
This video introduces 12 tools and plugins for Claude Code and Cowork that extend Claude’s functionality. The main points are:

1. **Google Workspace CLI**: Provides access to all Google products (Drive, Gmail, Calendar, Sheets, Docs, Chat) without the limitations of Google MCPs and is more token-efficient. Installation is somewhat involved but simplified with a provided Skill.

2. **Higgsfield**: Enables Claude to access image and video models like Nano Banana and Cance. There’s both an MCP for Cowork and a CLI for Claude Code. Higgsfield allows generating, editing, and animating images as well as creating videos and slideshows.

3. **The Printing Press**: Offers a library of over 50 pre-built CLIs for software without public APIs and enables creating custom CLIs. This saves tokens and is more efficient than MCPs.

4. **Impeccable**: A set of Skills for Claude Code that improves HTML and website design. It enables easy layout customization, design refinement, and adding animations.

5. **Vercel**: Enables quickly deploying HTML content to a server and creating live URLs. Ideal for hosting and distributing websites, reports, and dashboards.

6. **Caveman Plugin**: Compresses text by up to 75%, saving tokens. Can be applied to Claude’s responses, Skills, and frequently used context files.

7. **Firecrawl**: An affordable and effective web scraping tool that can scrape 99% of websites, including those Claude cannot reach. Available in both MCP and CLI versions.

8. **Playwright CLI**: A browser automation library that’s faster, more reliable, and more cost-efficient than Claude’s native browser function. Ideal for repeatable scraping or action workflows.

9. **Claude Video Plugin**: Enables scraping videos by downloading them and creating screenshots. Can generate transcripts and analyze videos.

10. **VI Prospecting**: A sales and lead database tool optimized for AI agents like Claude. Allows filtering leads based on current intent signals such as hiring trends and recent funding rounds.

11. **UniPal**: Enables connecting Claude with WhatsApp, Instagram, and LinkedIn. Can read and send messages, particularly useful for WhatsApp and LinkedIn outreach.

The video explicitly focuses on Claude and is better suited for intermediate users.
Every Claude Cowork Feature Explained Clearly
20.5.2026, 10:06:05
This video offers a comprehensive introduction to using Claude Cowork, a tool that can change how users and their teams work. It begins by explaining fundamental concepts and features, divided into three main categories: Memory and Context Concepts, Capabilities and Automation Concepts, and Connectors and MCP Concepts. The focus is on how these concepts improve efficiency and relevance when working with Claude.

A central theme is addressing Claude’s limited context window, which is partially solved through features like global instructions and built-in reminders. Users are encouraged to leverage external files and folders to create persistent context, enabled by features like file access and the creation of CLAUDE.md files. Projects and a centralized “Second Brain” or AIOS (Artificial Intelligence Operating System) are presented as solutions for organizing and accessing context across different projects and teams.

Claude Cowork’s capabilities, such as code execution, Skills, Skills 2.0, and Evals, as well as scheduled tasks and routines, are explained in detail. These capabilities enable automating repetitive tasks and managing complex tasks more efficiently. Connectors and MCP (Model Context Protocol) are presented as means to connect Claude with external software applications, enabling workflow automation across various software platforms.

The video concludes with best practices for using Claude Cowork, including selecting the right model for different tasks, optimizing token usage, and deciding when to switch to Claude Code. Tips are also provided for introducing Claude Cowork to a team, including managing permissions and utilizing shared Skills and Plugins.

The video explicitly focuses on Claude and is better suited for intermediate to advanced users.
5 Skills to Build an AI Operating System Like The 1% (Full Guide)
16.5.2026, 08:48:55
This video shows how to set up a “Second Brain” or AI operating system to boost efficiency and productivity with AI tools. The author, Ben, emphasizes the importance of a well-structured and maintained system to optimize token costs and provide relevant context. He presents five Claude Skills that help quickly set up the system following best practices:

1. **OS Setup Skill**: Helps with initial Second Brain setup, including populating context, structuring folders, and creating CLAUDE.md files that serve as an instruction layer for AI agents.

2. **OS Operator Skill**: Sets up a scheduled task that pulls real-time context from various sources (e.g., meetings, Slack chats) and updates the Second Brain accordingly. This includes creating daily summaries, task lists, and maintaining existing files.

3. **OS Optimizer Skill**: Performs regular audits and optimizations to improve the Second Brain’s efficiency and token usage. This includes eliminating duplicates, correcting formatting, and improving folder structure.

4. **Team OS Skill**: Enables sharing and syncing the Second Brain within a team, including setting up read and write permissions for different team members.

5. **OS MCP Skill**: Creates an MCP (Model Context Protocol) server from the Second Brain so scheduled tasks and optimizations can run autonomously, even with the laptop closed.

The author recommends starting with Second Brain setup and gradually expanding it, as benefits increase over time with growing context. He offers additional resources and support for those wanting to dive deeper.

The video explicitly focuses on Claude and is better suited for intermediate and advanced users.
How to Actually Use Claude Design Like a Pro (Real Use Cases)
12.5.2026, 07:40:42
This video demonstrates how to efficiently use Claude Design for various design tasks, including presentations, social media content, and websites. The main focus is a four-stage process: First, setting up a comprehensive design system containing colors, fonts, and styles. Second, using templates to predetermine formats and layouts. Third, using Skills to predefine text and content. Fourth, integrating these elements into Claude Design to quickly create consistent designs. The creator emphasizes that this preparation avoids endless iterations and high token costs. Concrete examples and steps for setting up design systems and templates are shown, as well as using Skills to automate the process. Free resources and tools are also offered to facilitate getting started.

The video explicitly focuses on Claude Design and is better suited for intermediate users who already have some experience with AI tools.

Brian Casel (8 new videos)

I’m Using Claude Design for This — Not App Building
22.5.2026, 14:00:32
In this video, the speaker presents two concrete use cases for Claude Design. The first use case is creating animations for his YouTube videos. He shows an example of a test animation that is high-quality and aligned with the Builder Methods brand. These animations will be used in future videos to visually support complex topics. The second use case is creating presentation slides for conferences and workshops. He creates a test slide collection that maintains brand designs and offers high-quality, clean layouts. He particularly highlights the integrated speaker notes feature, which allows him to design his presentations more efficiently.

The video specifically focuses on the Claude Design tool and is intended more for intermediate and advanced users.
You don’t need to learn to code anymore
18.5.2026, 13:30:40
The video demonstrates how to create your own applications without coding knowledge by using AI as a tool. The focus is on the “spec-driven development” method, where a clear specification (PRD) is created, then divided into milestones and implemented by an AI-coding agent. The process begins with a rough idea, which is transformed into a detailed product requirements document (PRD). This document contains all relevant information such as scope, data model, integrations, and features. Next, the PRD is divided into milestones, each representing a completable unit. The creator demonstrates building an invoicing application as an example, using two custom-built tools: “build new” as a starter template and “PRD Creator” as an agent skill that supports the PRD creation process and milestone breakdown. The PRD Creator asks questions that an experienced product designer would ask, helping to cover all important details. In the end, you have a clear, written planning document that serves as the foundation for implementation by the AI-coding agent.

The video specifically addresses the AI models Claude, Codex, and Gemini as well as the tools Cursor and Resend. It is more suitable for intermediate and advanced users, as it assumes viewers already have basic knowledge of using AI tools.
How I build agents that work the night shift
12.5.2026, 12:01:07
The video shows a process the user calls “Radar Scan,” where multiple AI agents run on a Mac Mini with OpenClaw. One of these agents, named Veil, handles marketing and performs a radar scan daily at 4:00 AM. The results are saved as a markdown file in Dropbox and can be accessed via a Telegram link. The file is opened in a custom-built app called Brainown, which simplifies reading and writing markdown files. The scan primarily summarizes tweets from the user’s industry that he follows, including messages from companies, influencers, and thought leaders. The user organizes his Telegram contacts into human and agent friends and receives a report overview in Telegram, with the option to open the full file in Brainown.

The user addresses OpenClaw and the custom-built app Brainown, making this video more interesting for intermediate or advanced users.
Why You Need Claude Code Server Mode?
6.5.2026, 14:01:40
The video shows the evolution of creator Brian Casel’s workflow with AI agents from 2023 to 2026. Initially, he worked manually; starting in 2024, he used AI as an extension (e.g., tab completion, chatbots for text); in 2025, AI became a collaborator (specifically Claude Code); and in 2026, he orchestrates multiple agents simultaneously. He emphasizes that the quality of agent specifications is critical and that multitasking with agents is now a natural step. The video demonstrates various tools like Cursor, Claude Code, Superset, and Conductor suitable for agent-based development, with Superset currently being his favorite. He demonstrates how he switches between different worktrees and projects and uses agents on mobile to delegate tasks while pursuing other activities. He also integrates agents into internal tools like Spark Drop and Brain Down to connect product development and marketing. He uses OpenClaw and Claude Co-Work for automated, recurring tasks. The video concludes by emphasizing that multitasking with agents is the next logical step for builders.

Closing Comment: The video addresses Claude (specifically Claude Code and Opus), OpenClaw, and tools like Cursor, Superset, and Conductor. It targets intermediate and advanced builders who already have experience with AI agents.
Why Every AI Coding Tool is Converging on Plan Mode?
1.5.2026, 14:01:11
The video covers setting up OpenClaw, a tool for agent teams, and the decision between a cloud VPS or a physical device. The author explains that he chose a new Mac Mini M4, despite cheaper options like a cloud VPS starting at five dollars per month. His reasons for this choice include the ability to screen share, visual management, and future requirements for storage and bandwidth. If OpenClaw doesn’t meet expectations, the author plans to use the Mac Mini in his home studio.

The video addresses OpenClaw and is intended more for intermediate or advanced users familiar with agent teams and server setups.
Building a Custom AI News Agent with RSS and Telegram
29.4.2026, 14:00:14
The video shows how to create a new GitHub repository via the mobile website and then connect it with the Claude app to generate simple “Hello World” code. The process begins by creating a new repository on GitHub, selecting a template for a Rails app. The repository is then connected in the Claude app by authorizing repository access rights in GitHub. Claude generates simple “Hello World” code and creates a pull request that can be merged into the new repository.

The video specifically addresses GitHub and the Claude app and is intended more for intermediate users.
Multitasking With Agents: My 2026 Workflow
29.4.2026, 12:01:10
The video compares the AI platforms Claude and OpenClaw in terms of their suitability for building business processes. Claude is described as trustworthy and mature, particularly for creative and strategic tasks. OpenClaw, on the other hand, is portrayed as immature but innovative, with shortcomings in setup, documentation, and reliability. Despite these weaknesses, OpenClaw has advanced the concept of agent hiring and could play a significant role long-term. The choice of platform depends on priorities: trust and maturity (Claude) or innovation and potential (OpenClaw).

The video specifically addresses Claude and OpenClaw and targets intermediate and advanced users who want to use AI tools for business purposes.
VPS vs Mac Mini for OpenClaw
28.4.2026, 14:00:18
The video discusses the suitability of Claude Design for real work processes in product development. The author tests two main use cases: first, creating a new T-shirt marketplace, where he finds that Claude Design generates appealing mockups but encounters problems integrating them into an existing codebase and further development. Second, extending an existing website with a design system analyzed from a GitHub repository. Here too, he sees difficulties in practical implementation. Instead, he prefers working directly in Claude Code with a custom design system integrated into the codebase. However, the author identifies two meaningful use cases for Claude Design: as a visual planning tool in the early ideation phase for new products and for creating marketing assets like animations and presentations that align with brand identity. He emphasizes that Claude Design is not suitable as an end-to-end tool from idea to finished product, but can be useful in specific phases and use cases.

The video specifically addresses Claude Design and Claude Code and is intended more for intermediate to advanced users who already have experience with AI-assisted product development.

Melvynx (10 new videos)

Pi AGENT : the ultimate replacement for Claude Code
26.5.2026, 16:28:56
The video shows an initial test of the Pi tool, an orchestrator or “harness” for AI agents that enables creating and managing complex workflows and tool chains. The creator tests Pi’s installation and basic features, including integration of various AI models (e.g., OpenAI, Claude) and the use of plugins to extend the agent. Particular emphasis is placed on Pi’s flexibility to create and customize custom workflows and UI elements, as well as the ability to modify and extend the agent itself.

The creator experiments with various plugins, such as a todo-list plugin and a subagent plugin, to automate and manage tasks. He also demonstrates how Pi enables connecting the agent with different AI models and monitoring their costs. A critical note is made that using Pi can involve high costs, especially when using many agents and plugins simultaneously.

At the end, the question is raised whether Pi’s flexibility and adaptability justify the effort and costs, or whether it would be better to fall back on simpler, pre-built solutions.

**Final comment:** The video explicitly discusses the Pi tool and is aimed more at **Intermediate** or **Advanced** users who are familiar with AI agents and orchestration.
OpenClaw everywhere
26.5.2026, 09:40:36
The video discusses using AI chatbots like Claude and ChatGPT via the Telegram app. The author emphasizes Telegram’s reliability compared to the official apps of AI services, which he describes as unstable and unusable. He demonstrates how he sends voice messages to the AI and converts them to text, and explains how he can create memories with Claude to store conversation context. He also reports on his experiences with public transportation and taxi prices in his area, describing high costs and practical issues. He mentions that he plans to create a video tutorial on how to use public transportation maps in the future.

The video explicitly discusses Claude, ChatGPT, and Gemini, and is aimed more at intermediate users already familiar with AI chatbots.
Antigravity 2.0 : the WORST copy I’ve ever seen?
25.5.2026, 16:00:06
The video introduces Antigravity 2.0, a new Google tool that is heavily inspired by other AI-powered developer tools like Codex. The creator compares the user interface and features of Antigravity 2.0 with those of Codex and finds that many elements have been directly copied, which he considers normal in the competitive AI industry. Despite these similarities, he tests the tool and demonstrates its features, including Gemini model integration and the ability to create and manage projects. However, he criticizes the user experience, particularly the frequent permission requests and the unstable IDE, which he describes as poorly designed and error-prone. He also compares the performance of Gemini models with those of other providers and finds that they don’t reach the top tier in the coding domain. Overall, he considers Antigravity 2.0 a functional but unremarkable tool that Google could significantly improve with more effort and better design.

The video explicitly discusses Google’s Antigravity 2.0, Codex, and Gemini models, and is aimed more at intermediate or advanced users.
DEVS are in love with Claude (and stop wanting to try new things)
24.5.2026, 16:00:34
The author of the video reflects on his personal development in using AI tools, particularly Claude and Codex (OpenAI). He admits that he was once “in love” with Claude and overestimated its capabilities while criticizing OpenAI. Through experience with high costs and limited options with Claude, he recognized the advantages of Codex, particularly its superior user interface and efficiency. He emphasizes the importance of being pragmatic and flexible to switch between different tools as needed. The author has reorganized his configurations to more easily switch between tools like Claude, Codex, and Cursor. He recommends choosing the best tools based on their current capabilities and costs, and warns against being emotionally attached to a specific tool. He concludes with an invitation to use his configurations and tools to become more flexible.

The video explicitly discusses Claude (Anthropic) and Codex (OpenAI) and is aimed more at intermediate or advanced users.
What I’ve been preparing since Seoul…
24.5.2026, 07:00:38
The video shows the speaker’s excitement about an upcoming trip to San Francisco, motivated by several goals. Primarily, he wants to immerse himself in the heart of the AI world, as he is passionate about companies like Claude, OpenAI, and Gemini and could talk about their developments daily. He sees San Francisco as the ideal place to benefit from the dynamic AI scene and network with like-minded people. In addition to professional focus, he also pursues personal goals, such as meeting new people, experiencing American culture, and improving his English skills. The speaker plans to capture these experiences in vlogs to share them with his audience.

The video explicitly discusses Claude, OpenAI, and Gemini and is suited more for intermediate and advanced users interested in the AI industry.
Codex can CONTROL your computer (even when locked)
23.5.2026, 16:00:14
The video introduces several new features and updates to Codex, an AI tool used mainly for programming and automation. Here are the key points:

1. **Screenshot function with keyboard shortcut**: By pressing “Command + Command” you can insert a screenshot of the current application into the Codex chat. This facilitates debugging and working with various applications as context is transferred directly to the chat. However, there is no option to open the screenshot in a new chat, which somewhat limits usage.

2. **Goals integration**: The `/goal` command allows creating and managing goals directly in the Codex application. You can create, edit, pause, display, or delete goals. Codex can also independently create goals if given appropriate instructions.

3. **In-app Browser**: This feature allows opening and managing browser tabs directly in Codex. You can take screenshots, add annotations, and change styles. However, the user experience is sometimes somewhat rough as the application frequently reloads.

4. **Computer Use**: A new feature that enables controlling the Mac via the iOS app even when locked. This is still in beta phase and doesn’t always work reliably, but shows the potential to have the computer work even in the user’s absence.

The video explicitly discusses the AI tools Codex and Claude, and is aimed more at intermediate to advanced users who already have experience with AI tools.
AI changes all the time
23.5.2026, 07:02:11
The YouTuber responds to a comment addressing criticism of his content direction. He explains that as a solo builder and YouTuber, he can afford to constantly test new tools, while his viewers don’t have to. He emphasizes that it’s developers’ responsibility to reasonably decide which tools to use. He admits that he is often overly enthusiastic about reporting on new tools because it works well for YouTube and matches his personality. However, he stresses that he considers his viewers intelligent and trusts them to decide for themselves whether a tool is suitable for them. He criticizes other YouTubers who treat their viewers as dumb, and emphasizes that he himself is maximalist and direct in his approach.

The video does not discuss specific AI tools or models and is aimed at intermediate and advanced users.
How to ONE-SHOT all your features with AI (make it work for 2 hours non-stop)
21.5.2026, 16:00:44
The video demonstrates a detailed workflow for implementing complex features using AI tools like Claude and Codex. The process begins with an intensive brainstorming phase where the user describes their ideas and requirements in detail. Subsequently, a plan is created and checked multiple times to ensure all aspects of the feature vision are considered. The actual implementation is carried out using the Apex Skill, which enables AI models to work in multiple steps and self-correct. An important part of the workflow is the “Verify” parameter, which prompts the AI to check its own actions and ensure everything works correctly. The user also demonstrates how to work with tools like Dev Browser CLI or the integrated browsers in Claude or Codex to verify and correct the implementation. At the end, the user emphasizes the importance of precision and iteration to successfully implement complex features.

The video explicitly discusses Claude, Codex, and specific tools like Apex and Dev Browser CLI and is aimed more at intermediate to advanced users.
Is it possible to finish my Seoul trip without wagyu?
21.5.2026, 07:00:16
In the video, the preparation of wagyu meat in a restaurant is criticized. The speaker complains that the meat was overcooked and did not have the desired “medium rare” doneness. He explains that the meat was cooked too long due to the small pieces and is therefore very tough and dry. Additionally, the seasoning is criticized as only large salt chunks were used and no pepper was added. The mushrooms are described as good, but here too the excessive cooking is criticized. The speaker regrets that the quality of the fatty meat was not shown to its advantage due to overcooking. He decides to prepare the meat himself according to his preferences and translates his wishes into Korean to ensure he gets the desired result next time.

The video does not discuss specific AI tools or models and is aimed more at intermediate users as it focuses on detailed culinary criticism and preferences.
DeepSWE shows that GPT 5.5 is the best model in the world.
27.5.2026, 16:00:20
The video discusses the new Deep SWE Benchmark, which evaluates the capabilities of AI models in the field of software engineering. Unlike previous benchmarks such as SWE Bench Pro, Deep SWE measures model performance on realistic tasks that include more complex and lengthy code tasks. Results show that GPT-5.5 achieves the best performance with 70%, followed by GPT-5.4 with 56% and Claude Opus 4.7 with 54%. Models like Gemini 3.5 Flash and various Chinese models perform significantly worse. The benchmark also highlights the efficiency of the models, with GPT-5.5 consuming fewer tokens and thus being more cost-effective. The analysis shows that GPT-5.5 delivers consistent and reliable results, while other models like Claude often forget requirements or cheat. The benchmark is rated by many experts as realistic and useful as it reflects the actual use of AI models in practice.

**Final comment:** The video explicitly discusses GPT-5.5, Claude Opus 4.7, Gemini 3.5 Flash, and various Chinese models. It is aimed more at intermediate and advanced users.

Dave Ebbelaar (4 new videos)

Your Pip Install Is a Backdoor – Fix This Now!
21.5.2026, 13:28:18
The video addresses the growing threat of supply chain attacks on Python projects and offers three practical tips to protect yourself. Supply chain attacks often occur through compromised packages in package managers like npm or pip that contain malicious code and can steal sensitive data like SSH keys or API keys. Recent examples include attacks on Tanstack and Mistral AI. The security tips are:

1. **Switch to uv**: Use uv instead of pip, as it offers more security settings.
2. **Versioning and Time Windows**: Set `add-bounds = "exact"` in your `pyproject.toml` file so that added packages are pinned to exact versions, and set `exclude-newer` to a cooldown of, for example, 7 days so that freshly published (and possibly compromised) releases are not installed.
3. **Use `uv sync --locked`**: Run `uv sync --locked` to ensure only packages listed in the lock file are installed and to avoid conflicts.

Additionally, it’s recommended to instruct AI agents not to add new packages without explicit approval and to implement features manually instead to minimize dependencies.

**Final Comment:** The video covers uv and is geared toward intermediate to advanced Python developers.
The Complete Guide to Hybrid Search in RAG (BM25 + Embeddings + Reranker)
14.5.2026, 17:52:50
The video is a detailed tutorial showing how to build a hybrid retrieval system from scratch that combines BM25, dense embeddings, reciprocal rank fusion (RRF), and a reranker. The focus is on creating a production-ready system that reflects the state of the art in 2026. The tutorial begins by explaining the dataset used, the Financial QA dataset, which is part of the BEIR benchmark. This dataset contains financial questions and associated answers, which are used to evaluate the retrieval system.

The tutorial walks through the process of building BM25 and dense embedding indices, running queries, and combining results with RRF. Then a reranker is added to further improve results. The tutorial emphasizes the importance of evaluation and demonstrates how to measure system performance using normalized discounted cumulative gain (NDCG).
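The fusion step can be sketched in a few lines. This is a minimal illustration of reciprocal rank fusion, not the code from the video: the function name, the document IDs, and the two sample rankings are hypothetical, and `k = 60` is the commonly used default constant rather than a value confirmed by the source.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one scored ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical top-3 results from a BM25 index and a dense embedding index.
bm25_ranking = ["d3", "d1", "d2"]
dense_ranking = ["d1", "d4", "d3"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print([doc_id for doc_id, _ in fused])
```

Because RRF only uses rank positions, the BM25 scores and the embedding similarities never have to be put on a common scale. A reranker would then typically be applied to the top of the fused list.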

By the end, it explains how to apply the system to your own projects, including creating your own evaluation dataset. The tutorial is intended for developers and engineers who already have foundational knowledge of retrieval systems and want to use them professionally.
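For a self-built evaluation dataset, the NDCG metric mentioned in the tutorial can be computed as follows. This is a sketch under stated assumptions: it uses the linear-gain form of DCG (some implementations use `2**rel - 1` instead, which is identical for binary relevance labels), and the function names and the sample relevance judgments are hypothetical.

```python
import math


def dcg_at_k(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the first k relevance values."""
    return sum(rel / math.log2(pos + 1) for pos, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(ranked_ids: list[str], qrels: dict[str, float], k: int = 10) -> float:
    """NDCG@k for one query.

    ranked_ids: document IDs in the order the system returned them.
    qrels: relevance judgments for this query (missing IDs count as 0).
    """
    gains = [qrels.get(doc_id, 0.0) for doc_id in ranked_ids]
    ideal_gains = sorted(qrels.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments: two relevant documents for one query.
qrels = {"d1": 1.0, "d3": 1.0}
print(round(ndcg_at_k(["d1", "d2", "d3"], qrels, k=10), 4))
```

Averaging this value over all queries in the evaluation set gives a single number for comparing BM25 alone, embeddings alone, the fused ranking, and the reranked result.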

**Final Comment:** The video covers OpenAI (for embedding models) and Cohere (for the reranker) and is geared toward intermediate to advanced users.
Building Agentic RAG From Scratch in Pure Python
10.5.2026, 09:57:56
The video demonstrates how to build an agentic RAG (Retrieval-Augmented Generation) system from scratch in pure Python. The focus is on making enterprise data or private information accessible to large language models for use in AI automation. The author, Dave Ebbelaar, explains the differences between classical semantic RAG and agentic RAG, with the latter surpassing the former through feedback loops and repeated use of the language model’s intelligence.

The system is built in several steps: First, simple tools are defined that can list, search, and read files. These tools work with Markdown files in the filesystem. The author shows how to implement these tools in Python, including using regular expressions to search for patterns in files. Next, a simple agent is created with these tools that can answer questions about the content of the Markdown files. The agent uses the tools in a loop to find the correct information and self-correct.
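The three file tools described above might look roughly like this. It is a minimal sketch, not the author's code: the function names, the path-escape check, and the demo files are assumptions for illustration. In an agent, these functions would be exposed to the language model as callable tools and invoked in a loop until the model has enough context to answer.

```python
import re
import tempfile
from pathlib import Path


def list_files(root: Path) -> list[str]:
    """List all Markdown files below root, as paths relative to root."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.md"))


def search_files(root: Path, pattern: str) -> list[str]:
    """Return 'file:line: text' for every line matching the regex pattern."""
    try:
        regex = re.compile(pattern, re.IGNORECASE)
    except re.error as exc:
        # Returned as text so the model can read the problem and retry.
        return [f"error: invalid pattern: {exc}"]
    hits = []
    for path in sorted(root.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for line_no, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append(f"{path.relative_to(root).as_posix()}:{line_no}: {line.strip()}")
    return hits


def read_file(root: Path, relative_path: str) -> str:
    """Return the content of one file, refusing paths outside root."""
    target = (root / relative_path).resolve()
    if not target.is_relative_to(root.resolve()):
        return "error: path is outside the knowledge base"
    if not target.is_file():
        return "error: file not found; call list_files to see valid paths"
    return target.read_text(encoding="utf-8")


# Demo on a throwaway directory with two hypothetical notes.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "pricing.md").write_text("# Pricing\nThe Pro plan costs 49 EUR.\n", encoding="utf-8")
(demo_root / "faq.md").write_text("# FAQ\nRefunds take 5 days.\n", encoding="utf-8")

print(list_files(demo_root))
print(search_files(demo_root, r"pro plan"))
print(read_file(demo_root, "../secret.txt"))
```

Returning errors as plain strings instead of raising exceptions is a deliberate choice in this sketch: the model sees the message as a tool result and can correct its next call.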

The author also covers production best practices, such as using Rust-based tools like ripgrep for faster and more secure file searches, and implementing error messages that can be interpreted by the language model to improve the agent. He shows how to deploy the system in various environments such as VPS, container apps, or serverless functions.
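A search tool backed by ripgrep, with error messages written for the model rather than for a human operator, could be sketched like this. The function name, the result limit, the timeout, and the wording of the messages are assumptions, not details from the video. The sketch requires the `rg` binary on the `PATH` and reports its absence as a readable message instead of failing.

```python
import shutil
import subprocess


def ripgrep_search(pattern: str, directory: str, max_results: int = 50) -> str:
    """Search Markdown files with ripgrep and return text the model can act on."""
    if shutil.which("rg") is None:
        return "error: ripgrep (rg) is not installed on this system."
    try:
        # Arguments are passed as a list, so no shell interprets the pattern.
        result = subprocess.run(
            ["rg", "--line-number", "--no-heading", "--ignore-case",
             "--glob", "*.md", "--", pattern, directory],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except subprocess.TimeoutExpired:
        return "error: search timed out after 10 seconds; try a more specific pattern."
    if result.returncode == 1:  # ripgrep exits with 1 when nothing matched
        return f"no matches for {pattern!r}; try broader or different keywords."
    if result.returncode != 0:  # 2 signals an error such as an invalid regex
        return f"error: ripgrep failed: {result.stderr.strip()}"
    lines = result.stdout.splitlines()
    if len(lines) > max_results:
        omitted = len(lines) - max_results
        lines = lines[:max_results] + [f"... {omitted} more matches omitted; narrow the pattern."]
    return "\n".join(lines)


print(ripgrep_search("refund", "."))
```

Each failure path tells the model what went wrong and what to try next, which is what allows the agent loop to self-correct instead of stopping on an exception.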

The end of the video presents a complete example of an agentic RAG system in Python that follows production best practices and can be deployed in real-world projects.

**Final Comment:** The video covers OpenAI and is geared toward intermediate or advanced users.
If I Started AI Freelancing in 2026, I'd Do This
4.5.2026, 15:15:23
**Summary:**

The video describes a three-stage framework that the author calls the “Data Freelancer Blueprint” for succeeding as a freelancer in AI and data. The three stages are:

1. **Get Going:**
– Overcome psychological barriers and start with simple but useful projects, often considered “boring” like data automation or reporting.
– Build three end-to-end projects you can demonstrate and learn how to integrate and deploy code into real systems.
– Update your LinkedIn profile to clearly communicate what problems you can solve.

2. **Getting Paid:**
– Determine your hourly rate through research.
– Use your network for warm introductions and conversations with decision makers.
– After the conversation, create a detailed project proposal with milestones, deliverables, and cost estimates.

3. **Get Good:**
– Focus on improving lead generation, sales, and delivery.
– Build long-term contracts to secure stable income.
– Use various channels like LinkedIn, YouTube, and freelance platforms to generate leads.

The author emphasizes that freelancing in the tech sector is a safe and lucrative way to start a business and encourages viewers to take the first step.

**Final Comment:**
The video covers tools like n8n and Airtable (low-code/no-code solutions) as well as Python and TypeScript (custom code solutions) and is aimed at intermediate and advanced freelancers looking to enter or expand their business in the AI and data space.

Niklas Steenfatt (2 new videos)

Wie findet man eine Partnerin? (How Do You Find a Partner?)
24.5.2026, 20:09:03
The video shows a YouTuber’s marriage proposal to his girlfriend in the desert during South Africa’s Burning Man event. He shares his emotions and his girlfriend’s surprise as she accepts the proposal. The YouTuber reflects on what it means to choose one partner out of 8 billion people and on the criteria involved. He emphasizes that physical and emotional chemistry matters more than shared interests, and that long-term relationships require work. He also discusses the cool effect and the importance of open communication in relationships. The YouTuber recommends couples therapy and honest conversations to build a stable and happy partnership. He shares his personal journey and how he moved from initial doubts to a confident decision. At the end, he promotes the crypto broker Kraken and offers a bonus for new users.

**Final Comment:** The video doesn’t cover specific AI tools or models and is better suited for intermediate and advanced viewers, as it shares deep personal reflections and life experiences.
Ich habe ALLEN KI Agenten dieselbe Aufgabe gegeben (I Gave ALL AI Agents the Same Task)
27.5.2026, 16:54:40
The video compares four AI agents (Claude Code, Codex, Hermes, and Amadeus) on a range of tasks: summarizing tweets, recommending the best AI agent, creating graphics, programming a habit tracker, replicating a website, and making money. The agents were installed and tested on a Hostinger VPS using the Paperclip software.

The tasks revealed different strengths and weaknesses among the agents. Claude Code and Amadeus performed particularly well on programming tasks, while Hermes and Codex sometimes delivered similar results, which suggests copying. When asked to make money, the agents proposed various methods, some of them unrealistic or uncreative.

The video explicitly covers the AI agents Claude Code, Codex, Hermes, and Amadeus, as well as the Paperclip software and Hostinger VPS. It’s better suited for intermediate and advanced users, as it covers technical details and specific AI tools.