Evaluating Vibe Coding as an AI-Orchestrated Development Methodology: A Case Study on Accelerating Complex Web-Based Educational Management Systems

Abstract

The emergence of generative AI has disrupted conventional software development practices, prompting considerable skepticism among IT professionals about whether such tools displace rather than augment human expertise. This study frames "Vibe Coding" as a collaborative methodology in which AI operates as a capable partner, not a substitute, and in which human guidance is required for review, analysis, and iterative refinement of generated outputs. The primary objective is to assess whether Vibe Coding, when structured through the Model Context Protocol (MCP) and schema engineering, can materially reduce development time for complex web systems (including CRUD operations, API integration, and custom business logic) relative to conventional approaches such as Waterfall. Two research questions drive the inquiry: (1) Can Vibe Coding compress development timelines for complex systems from months to days? and (2) How effective is AI as a collaborative partner in sustaining output quality through human-in-the-loop validation? A single case study approach was employed, applying the methodology to develop an ISO 9001:2015-compliant Management Information System (MIS) for Pondok Pesantren Abu Hurairah Mataram as a solo developer project. Metrics were tracked across seven days: total development time, time per phase (planning, development, debugging, and deployment), proportion of AI-generated code, prompt and iteration counts, bug frequency, debugging duration, total lines of code (LOC), and feature implementation success rate. Results show a completed system in seven days, with an estimated 70–85% of the codebase AI-generated and 15–30% manually refined for business logic, debugging, and performance tuning. Human intervention countered the AI hallucinations that occurred, repositioning the developer's role from syntax-level coding toward architectural orchestration and quality control. These findings suggest Vibe Coding raises productivity for solo developers in AI-saturated environments, though rigorous human oversight remains non-negotiable for production-grade systems.

Keywords: Vibe Coding; AI-Assisted Development; Model Context Protocol; Human-in-the-Loop; Software Productivity; Case Study

1. Introduction

The rapid integration of Large Language Models (LLMs) into software development workflows has sparked widespread debate within the software engineering community. Traditional developers often express skepticism, viewing AI-generated code as prone to bugs, design inconsistencies, and licensing risks, while proponents argue that LLMs mark the end of traditional programming. This polarization obscures a more practical middle ground: a collaborative model in which developers orchestrate the writing of source code rather than typing every line themselves.

In early 2025, the term "Vibe Coding" was introduced to describe a high-level development style in which code is generated from natural language prompts. When left unstructured, however, Vibe Coding suffers from AI hallucination, context drift, and code regression, which make it unsuitable for production-grade systems. This paper proposes a structured Vibe Coding framework that relies on the Model Context Protocol (MCP) to supply external documentation and on schema engineering to enforce strict design boundaries.

To test this methodology, this study tracks the construction of an ISO 9001:2015-compliant Management Information System (MIS) for Pondok Pesantren Abu Hurairah Mataram, developed by a single programmer. The goal is to determine whether structured Vibe Coding can safely compress traditional development timelines from several months to a few days without sacrificing stability or compliance with the standard.

2. The 3 Pillars of Structured Vibe Coding

Unstructured prompt engineering often leads to chaotic software iterations. To convert Vibe Coding into a repeatable engineering methodology, we establish three foundational pillars:

Pillar 1: Context Persistence & Model Context Protocol (MCP). LLMs are limited by context window size, and their adherence to earlier instructions decays over long chats. We address this by keeping local Markdown configuration guides (e.g., rules.md files) in the project repository and by running MCP servers. Together these let the LLM access up-to-date documentation, inspect schemas, and fetch specific files automatically, which minimizes instruction drift.
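The rules-file half of this pillar can be illustrated with a minimal Python sketch. It is not an MCP server and not the tooling used in the case study; it only shows the idea of re-attaching the same persistent rules to every request so they cannot drift out of a long conversation. The function name build_context, the section titles, the character budget, and the sample rules are all assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def build_context(rules_path: Path, schema_summary: str, task: str,
                  max_chars: int = 8000) -> str:
    """Assemble one prompt with the persistent rules placed before the task.

    Hypothetical helper: the section titles and the character budget are
    assumptions for illustration, not part of MCP or of the case study.
    """
    rules = rules_path.read_text(encoding="utf-8").strip()
    sections = [
        ("PROJECT RULES", rules),
        ("DATABASE SCHEMA", schema_summary.strip()),
        ("TASK", task.strip()),
    ]
    prompt = "\n\n".join(f"## {title}\n{body}" for title, body in sections)
    if len(prompt) > max_chars:
        # Fail loudly instead of silently truncating the rules.
        raise ValueError(f"context is {len(prompt)} chars, budget is {max_chars}")
    return prompt


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rules_file = Path(tmp) / "rules.md"
        # Sample rules are invented for this demo.
        rules_file.write_text(
            "- Follow the framework's folder conventions.\n"
            "- Never edit an existing migration.\n",
            encoding="utf-8",
        )
        print(build_context(rules_file, "departments(id, name)",
                            "Add a vehicle booking form."))
```

Raising an error when the budget is exceeded, rather than truncating, is a deliberate choice in this sketch: silent truncation would reintroduce the drift the rules file is meant to prevent.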

Pillar 2: Schema Engineering First. Generating code without a strict data contract causes database misalignment and broken references. Our methodology mandates designing database schemas, Entity-Relationship Diagrams (ERD), and relational constraints before writing a single line of application logic. The generated code must conform to the established database contract.
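One way to picture a database contract being enforced is a check that generated code only touches columns the schema declares. The following Python sketch uses an in-memory SQLite database as the contract; the table names, column names, and helper functions are hypothetical and do not reproduce the SIM-PAH schema, which was built with Laravel migrations.

```python
import sqlite3

# Hypothetical two-table contract, written before any application logic.
CONTRACT_DDL = """
CREATE TABLE departments (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE service_requests (
    id            INTEGER PRIMARY KEY,
    department_id INTEGER NOT NULL REFERENCES departments(id),
    title         TEXT NOT NULL,
    status        TEXT NOT NULL DEFAULT 'open'
);
"""


def load_contract(ddl: str) -> dict[str, set[str]]:
    """Build the schema in memory and return {table: {column, ...}}."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        contract = {}
        for table in tables:
            # PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
            info = conn.execute(f'PRAGMA table_info("{table}")')
            contract[table] = {row[1] for row in info}
        return contract
    finally:
        conn.close()


def undeclared_columns(contract: dict[str, set[str]], table: str,
                       used_columns: list[str]) -> list[str]:
    """Return the columns that generated code uses but the contract lacks."""
    if table not in contract:
        raise KeyError(f"table {table!r} is not part of the contract")
    return sorted(set(used_columns) - contract[table])


if __name__ == "__main__":
    contract = load_contract(CONTRACT_DDL)
    # Suppose a generated model refers to 'dept_id' instead of 'department_id'.
    print(undeclared_columns(contract, "service_requests",
                             ["title", "status", "dept_id"]))
```

A check of this kind could run in review or in continuous integration, so that a hallucinated column name is caught before it reaches a migration or a query.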

Pillar 3: Conditional Tech Stack Selection. AI coding assistants perform best on opinionated, highly structured frameworks. Choosing Laravel 12 for the backend and Vue 3 / Svelte 5 for the frontend restricts the search space for the AI model. These frameworks provide sensible defaults, strict folder structures, and clear conventions, which discourage the LLM from hallucinating custom design patterns.

3. Case Study & Implementation (SIM-PAH)

We evaluated the methodology by building SIM-PAH, an operational management and compliance system serving 28 internal departments at Pondok Pesantren Abu Hurairah Mataram. The system was designed to handle inventory management, service requests, vehicle booking, and automated compliance tracking for 39 QMS ISO 9001:2015 operational procedures.

Laravel 12, Inertia.js, and Vue 3 were selected as the stack. No admin panel builder such as Filament was used; all CRUD views, forms, and tables were written from scratch to test the AI's ability to maintain design consistency across custom code. Over 7 days, the developer used a highly capable AI assistant, communicating through structured Markdown prompts. The AI was given access to read project files, inspect database tables, and run migrations, while the human developer focused on review, security audits, and validation.

4. Results & Performance Metrics

The system was completed and deployed within 7 days. Standard Waterfall cycles for projects of similar scope are typically estimated at 3 to 6 months, so this represents a substantial reduction in time-to-market. We tracked the following engineering metrics during the implementation window:

Development Distribution: The completed codebase contains approximately 37,875 lines of code (excluding external packages) spread across 250+ custom files (Controllers, Models, migrations, and Vue views). The database schema encompasses 66 tables, supporting multi-tenant access controls for the 28 departments.

AI vs. Human Contribution: We estimate that 70–85% of the codebase was written by the generative AI, including boilerplate code, migrations, basic CRUD routes, and styling. The remaining 15–30% was written or modified by the human developer, focusing on complex authorization guards, custom multi-tenant database queries, payment integrations, and debugging subtle code regressions.
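The reported percentages can be translated into approximate line counts with simple arithmetic. The sketch below uses only the figures stated above (37,875 lines, a 70–85% AI share, a 7-day window); the variable and function names are illustrative, and integer arithmetic is used so the bounds are reproducible.

```python
TOTAL_LOC = 37_875            # reported total, excluding external packages
AI_SHARE_PERCENT = (70, 85)   # reported estimate range for AI-written code
DAYS = 7                      # reported implementation window


def loc_range(total: int, low_pct: int, high_pct: int) -> tuple[int, int]:
    """Lower and upper line counts for a percentage range (rounded down)."""
    return total * low_pct // 100, total * high_pct // 100


ai_low, ai_high = loc_range(TOTAL_LOC, *AI_SHARE_PERCENT)
# The human share is whatever remains, so its bounds are mirrored.
human_low, human_high = TOTAL_LOC - ai_high, TOTAL_LOC - ai_low
loc_per_day = TOTAL_LOC / DAYS

print(f"AI-generated:  {ai_low:,} to {ai_high:,} lines")
print(f"Human-written: {human_low:,} to {human_high:,} lines")
print(f"Average output: {loc_per_day:,.0f} lines per day")
```

Under these figures the AI share corresponds to roughly 26,500 to 32,200 lines and the human share to roughly 5,700 to 11,400 lines, an average of about 5,400 lines per day overall. Because the 70–85% range is itself an estimate, these counts should be read as orders of magnitude rather than measurements.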

Debugging and Hallucinations: Over the 7 days, the developer encountered 8 major AI hallucinations. These included instances where the AI generated incorrect database relationships or suggested deprecated library methods. In each case, human-in-the-loop code review identified the issue, and the developer corrected the code manually or adjusted the prompt context to resolve the discrepancy.
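The timeline and hallucination figures in this section can be put on a common scale with a short calculation. The sketch assumes a 30-day month and compares calendar time only; the Waterfall range is an estimate rather than a measured baseline, so the resulting factors are indicative at best. All names in the code are illustrative.

```python
DAYS_ACTUAL = 7              # reported implementation window
WATERFALL_MONTHS = (3, 6)    # reported estimate for similar scope
DAYS_PER_MONTH = 30          # assumption: calendar days per month
HALLUCINATIONS = 8           # reported major hallucinations
TOTAL_LOC = 37_875           # reported total lines of code


def compression_factor(baseline_months: int, actual_days: int,
                       days_per_month: int = DAYS_PER_MONTH) -> float:
    """Ratio of the estimated baseline duration to the observed duration."""
    return baseline_months * days_per_month / actual_days


factor_low, factor_high = (compression_factor(m, DAYS_ACTUAL)
                           for m in WATERFALL_MONTHS)
per_day = HALLUCINATIONS / DAYS_ACTUAL
per_kloc = HALLUCINATIONS / (TOTAL_LOC / 1000)

print(f"Timeline compression: {factor_low:.1f}x to {factor_high:.1f}x")
print(f"Major hallucinations: {per_day:.2f} per day, {per_kloc:.2f} per 1,000 lines")
```

With these inputs the compression factor falls between about 13x and 26x, and the hallucination rate is a little over one per day, or about 0.2 per thousand lines of code.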

5. Discussion & Human-in-the-Loop Validation

The case study indicates that structured Vibe Coding substantially raised developer output in this project. However, it also shows that human oversight remains crucial. Without a competent engineer performing review and security audits, the generated code would likely have shipped with security vulnerabilities (such as weak multi-tenant isolation) and performance bottlenecks.
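To make the multi-tenant isolation risk concrete, the following Python sketch contrasts an unscoped query with a query scoped to one department. It is an illustration only: the table, its columns, and the sample rows are hypothetical, and the actual system would express the same constraint through its own framework's query layer rather than raw SQLite.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table shared by several departments (tenants)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE inventory_items (
            id            INTEGER PRIMARY KEY,
            department_id INTEGER NOT NULL,
            name          TEXT NOT NULL
        );
        INSERT INTO inventory_items (department_id, name) VALUES
            (1, 'Projector'),
            (1, 'Whiteboard'),
            (2, 'Spare tyre');
    """)
    return conn


def items_unscoped(conn: sqlite3.Connection) -> list[str]:
    """Weak isolation: returns every department's rows to any caller."""
    rows = conn.execute("SELECT name FROM inventory_items ORDER BY id")
    return [row[0] for row in rows]


def items_for_department(conn: sqlite3.Connection,
                         department_id: int) -> list[str]:
    """Scoped query: the tenant filter is mandatory and parameterized."""
    rows = conn.execute(
        "SELECT name FROM inventory_items WHERE department_id = ? ORDER BY id",
        (department_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = make_demo_db()
    print(items_unscoped(conn))            # leaks rows across departments
    print(items_for_department(conn, 1))   # only department 1's rows
    conn.close()
```

Both functions run without error, which is why this class of defect is easy to miss in generated code: only a reviewer who knows the tenancy rule will notice that the first query is missing its filter.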

Instead of replacing the human programmer, Vibe Coding shifts the developer's role. The programmer becomes an architect, system designer, and code reviewer. The task changes from writing boilerplate syntax to analyzing architectural patterns, enforcing code quality, and validating business rules. The human developer's focus moves to a higher level of abstraction, leaving the production of syntax to AI systems.

6. Conclusion

Structured Vibe Coding, guided by context persistence, database-first design, and stable framework defaults, enables solo developers to build complex, enterprise-ready systems in a fraction of the time required by traditional methods. As generative models continue to advance, standardizing collaborative methodologies like Vibe Coding will be essential to establish clear safety protocols, maintain software quality, and maximize engineering velocity.

References

Adiatma, I. M. (2026). Evaluating Vibe Coding as an AI-Orchestrated Development Methodology: A Case Study on Accelerating Complex Web-Based Educational Management Systems. International Journal Software Engineering and Computer Science (IJSECS), 6(1), 313-322.
Anthropic. (2024). Model Context Protocol (MCP) Specification. https://modelcontextprotocol.org
International Organization for Standardization. (2015). ISO 9001:2015: Quality Management Systems - Requirements.
Otani, K., & Nakagawa, H. (2025). The Shift in Software Engineering Roles in the Era of Generative AI. Journal of Systems and Software Productivity, 44(2), 112-125.
