Jiwon Min Developer

Building a Production-Level RAG-based AI Agent: An Autonomous Research Automation System with CrewAI and LangChain

The demand for AI systems that can autonomously perform complex, multi-step tasks is growing, moving beyond simple chatbots that just answer questions. For example, if you assign a research task on “the latest AI semiconductor market trends,” the AI would independently search the web, summarize key information, analyze competitors, and generate a final report. This is the core concept of an AI Agent, and one of the most powerful technologies to implement it is a multi-agent system combined with RAG (Retrieval-Augmented Generation).

This article goes beyond a simple RAG tutorial to provide a detailed, end-to-end guide on building a RAG-based autonomous research AI agent that can run reliably in a production environment. With CrewAI, a role-based collaborative agent framework, at the center, we will implement RAG-based search tools using LangChain and design a workflow in which multiple agents collaborate to achieve a single goal. Through this guide, you will gain the practical know-how to create a functioning “AI team,” moving beyond simple LLM API calls.

High-Performance Microservice Communication with Python and gRPC: A Production-Level Guide

In modern cloud-native environments, numerous microservices communicate with each other to execute complex business logic. The most common communication method is the REST API. However, in environments where internal service-to-service communication (East-West traffic) is growing rapidly, REST-style APIs that exchange text-based JSON over HTTP can become a performance bottleneck. The overhead of message serialization/deserialization, the lack of an enforced API contract, and limited streaming capabilities are challenges that must be addressed in systems requiring high performance and low latency.

To solve these problems, gRPC (gRPC Remote Procedure Calls), developed by Google, has emerged as a powerful alternative. gRPC uses HTTP/2 as its transport layer and Protocol Buffers (Protobuf) as its Interface Definition Language (IDL) and serialization format, providing compact binary messages, multiplexed streaming, and strongly typed service contracts. This article is aimed at experienced server engineers and developers, and provides an in-depth, practical guide to building high-performance, gRPC-based microservices in a production environment using Python. We will go beyond a simple ‘Hello, World’ example to explore key best practices you’ll encounter in real-world operations, including error handling, authentication, timeouts, and health checks.
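The size difference between a text format and a binary format can be illustrated with the standard library alone. The sketch below is not Protobuf and does not reproduce its wire format: it uses Python's `struct` module as a stand-in for a fixed binary layout, and the `order` message and its field names are hypothetical.

```python
import json
import struct

# Hypothetical message with three numeric fields (not taken from the article).
order = {"order_id": 123456789, "quantity": 42, "unit_price": 19.99}

# Stand-in binary layout, NOT the Protobuf wire format:
# little-endian unsigned 64-bit id, unsigned 32-bit quantity, 64-bit float price.
ORDER_LAYOUT = struct.Struct("<QId")


def encode_json(msg: dict) -> bytes:
    """Serialize the message as UTF-8 JSON text (field names travel with every message)."""
    return json.dumps(msg).encode("utf-8")


def encode_binary(msg: dict) -> bytes:
    """Serialize the message with the fixed layout (field names live in the schema only)."""
    return ORDER_LAYOUT.pack(msg["order_id"], msg["quantity"], msg["unit_price"])


def decode_binary(data: bytes) -> dict:
    """Rebuild the message from the fixed layout."""
    order_id, quantity, unit_price = ORDER_LAYOUT.unpack(data)
    return {"order_id": order_id, "quantity": quantity, "unit_price": unit_price}


if __name__ == "__main__":
    print("JSON bytes:  ", len(encode_json(order)))
    print("binary bytes:", len(encode_binary(order)))
```

Both sides must agree on the layout ahead of time, which is the role a `.proto` schema plays in gRPC.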

A Comprehensive Guide to PostgreSQL Streaming Replication: Building High Availability (HA) and Read Scaling for Production Environments

At the heart of every production service lies a database. However, an architecture that relies on a single database instance becomes a critical single point of failure, where an unexpected hardware failure, network issue, or maintenance task can bring down the entire service. To address these risks and maximize service reliability, ensuring database high availability (HA) is a necessity, not an option.

PostgreSQL offers a powerful and reliable Streaming Replication feature to meet these demands. This feature allows you to replicate data from a primary server to one or more standby servers in near real-time. This minimizes service interruptions by enabling a quick switch to a standby server if the primary fails. Simultaneously, it allows for Read Scaling by distributing read queries to standby servers, improving overall database performance. This post provides an in-depth guide on how to build a production-level PostgreSQL streaming replication setup that can be applied directly in a real-world environment.
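The read-scaling idea can be sketched at the application level as a small router that sends writes to the primary and rotates reads across standbys. This is a simplified illustration, not a PostgreSQL feature or the article's implementation: the `ReplicaRouter` class and the host names are hypothetical, and a real setup would also have to account for replication lag and for `SELECT` statements that take locks or call writing functions.

```python
import itertools
from typing import Iterable


class ReplicaRouter:
    """Toy router: reads go to standbys in round-robin order, everything else to the primary."""

    def __init__(self, primary: str, standbys: Iterable[str]) -> None:
        self.primary = primary
        self._standbys = list(standbys)
        # itertools.cycle yields the standbys in order, forever.
        self._cycle = itertools.cycle(self._standbys) if self._standbys else None

    def route(self, sql: str) -> str:
        """Return the server address that should run this statement."""
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        # Naive classification: only plain SELECT statements count as reads.
        if first_word == "SELECT" and self._cycle is not None:
            return next(self._cycle)
        return self.primary


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    router = ReplicaRouter("primary:5432", ["standby1:5432", "standby2:5432"])
    for statement in ("SELECT 1", "INSERT INTO t VALUES (1)", "SELECT 2"):
        print(statement, "->", router.route(statement))
```

If no standby is configured, every statement falls back to the primary, so the router degrades to single-instance behavior.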

A Complete Guide to Building a Production-Level Centralized Logging System with Fluentd on Kubernetes

As Microservices Architecture (MSA) has become commonplace, Kubernetes has established itself as the standard for container orchestration. In a Kubernetes environment where numerous containers are dynamically created and destroyed, tracking distributed application logs and troubleshooting issues is nearly impossible with traditional methods. Accessing each pod and checking logs with the kubectl logs command is merely a temporary fix, with clear limitations for real-time incident response and root cause analysis.

To solve these problems, building a Centralized Logging System has become not an option, but a necessity. A centralized logging system collects all logs generated across the entire cluster into a single location, refining and storing them so that developers and operators can easily search and visualize them. This post provides an in-depth guide to building a production-level centralized logging system for Kubernetes using the EFK (Elasticsearch, Fluentd, Kibana) stack, centered around Fluentd, a powerful log collector and a graduated project of the CNCF (Cloud Native Computing Foundation).

Mastering Production-Ready Kubernetes Ingress with Amazon EKS and AWS Load Balancer Controller

When operating a Kubernetes cluster using Amazon EKS (Elastic Kubernetes Service), one of the most critical challenges is routing external traffic to services inside the cluster reliably and efficiently. While Kubernetes provides NodePort or LoadBalancer type services, they have clear limitations in meeting the complex demands of a production environment. For example, every time a LoadBalancer service is deployed, a new ELB (Elastic Load Balancer) is created, increasing costs. It’s also difficult to apply fine-grained L7 routing rules (path-based, host-based routing).

To solve these problems, Kubernetes offers an object called Ingress. An Ingress is a collection of rules that allow inbound connections to reach cluster services, and the component that actually fulfills these rules is the Ingress Controller. In the AWS environment, the AWS Load Balancer Controller integrates most seamlessly with EKS: it provisions an Application Load Balancer (ALB) for Ingress resources and a Network Load Balancer (NLB) for Services of type LoadBalancer. Using this controller, you can expose multiple services through a single ALB and declare features such as SSL/TLS certificate management, advanced traffic routing, and WAF integration in a Kubernetes-native way.
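Host-based and path-based routing can be modeled in a few lines. The sketch below is a simplified model of Ingress-style rules with prefix matching on whole path segments, where the longest matching prefix wins. It is not how the AWS Load Balancer Controller or an ALB evaluates rules internally, and the `Rule` type, host names, and service names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    host: str         # exact host name to match
    path_prefix: str  # path prefix, matched on whole path segments
    service: str      # backend service name (hypothetical)


def path_matches(prefix: str, path: str) -> bool:
    """True if `path` equals the prefix or lies beneath it, segment by segment."""
    normalized = prefix.rstrip("/")
    if normalized == "":
        return True  # "/" matches every path
    return path == normalized or path.startswith(normalized + "/")


def resolve(rules: Sequence[Rule], host: str, path: str) -> Optional[str]:
    """Pick the backend for a request; the longest matching prefix wins."""
    best: Optional[Rule] = None
    for rule in rules:
        if rule.host != host or not path_matches(rule.path_prefix, path):
            continue
        if best is None or len(rule.path_prefix) > len(best.path_prefix):
            best = rule
    return best.service if best else None


if __name__ == "__main__":
    rules = [
        Rule("shop.example.com", "/", "web"),
        Rule("shop.example.com", "/api", "api"),
        Rule("admin.example.com", "/", "admin"),
    ]
    print(resolve(rules, "shop.example.com", "/api/orders"))
    print(resolve(rules, "shop.example.com", "/apiv2"))
```

Because all three rules can sit behind one load balancer, a single entry point can serve several backends, which is the cost advantage over one load balancer per service.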

This post, aimed at experienced engineers, covers the entire process of building a stable and cost-effective ingress system by integrating Amazon EKS with the AWS Load Balancer Controller in a production environment. We’ll go beyond simple controller installation to cover IAM role setup, advanced configurations using essential annotations, and detailed solutions and best practices for problems you might encounter in the field.