Master advanced string manipulation. Build a robust HTML formatter and minifier with no external dependencies that handles comments, self-closing tags, and custom indentation.
Formatters in 2024 are bloated.
Common libraries like js-beautify are massive (200KB+).
For 99% of use cases, you don't need a full AST parser. You just need a smart Indentation Engine.
In this detailed guide, we will build the exact Regex-Based Engine used in FastTools production. It handles comments, self-closing tags, and custom indentation.
How do you parse HTML without a parser? You exploit the structure of tags.
<div> always starts with < and ends with >.
The Magic Split:
const nodes = html.split(/>\s*</);
This splits <div><span>Text</span></div> into ["<div", "span>Text</span", "/div>"].
Note: We lose the < and > characters during split, so we must re-add them.
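To make the split and the bracket repair concrete, here is a minimal sketch. The helper names `splitNodes` and `restoreBrackets` are illustrative assumptions, not part of the FastTools code; the only thing taken from the article is the split pattern itself.

```typescript
// Split on the boundary between one tag's ">" and the next tag's "<".
// Both boundary characters (and any whitespace between them) are consumed.
function splitNodes(html: string): string[] {
  return html.trim().split(/>\s*</);
}

// Re-add the brackets that the split removed. The first node keeps its
// original leading "<" and the last node keeps its original trailing ">".
function restoreBrackets(nodes: string[]): string[] {
  return nodes.map((node, i) => {
    const open = i === 0 ? '' : '<';
    const close = i === nodes.length - 1 ? '' : '>';
    return open + node + close;
  });
}

const demoNodes = splitNodes('<div>  <span>Text</span>\n</div>');
const demoLines = restoreBrackets(demoNodes);

console.log(demoNodes); // ["<div", "span>Text</span", "/div>"]
console.log(demoLines); // ["<div>", "<span>Text</span>", "</div>"]
```

Only the boundary nodes are special: every node in the middle lost both brackets, while the first and last each lost just one.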
We use a simple integer indent to track depth.
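Opening tag (<div>): indent++
Closing tag (</div>): indent--
Self-closing tag (<img>): indent (no change)
Production Regex for Tags:
Closing tag: /^\/\w/ (starts with /, like /div)
Opening tag: /^<?\w[^>]*[^/]$/ (starts with a word character, doesn't end with /)
One caveat: the opening pattern needs at least two characters after the optional <, so a bare single-letter tag such as <p> or <b> will not match it as written.
Some tags look like opening tags but never close (<input>, <br>, <meta>).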
If you indent on <input>, your code will drift right forever.
const VOID_TAGS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input', 'link', 'meta', 'param', 'source', 'track', 'wbr'];
const isVoid = (tagName) => VOID_TAGS.includes(tagName);
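Putting the depth rules, the two patterns, and the void list together, one possible sketch is a single function that reports how a node changes the depth. The name `depthDelta` is hypothetical, and the opening pattern here is a small variation on the article's (its tail is optional) so that single-letter tags are recognized; treat that as an assumption rather than the production behavior.

```typescript
// Repeated here so the block is self-contained.
const VOID_TAGS: readonly string[] = [
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'param', 'source', 'track', 'wbr',
];

const CLOSING = /^\/\w/;
// Assumed variation: the "(?: ... )?" makes the tail optional so that
// "p", "b" or "a" also count as opening tags.
const OPENING = /^<?(\w+)(?:[^>]*[^/])?$/;

// Returns -1 for a closing tag, +1 for a non-void opening tag, 0 otherwise
// (text, comments, doctype, void tags, self-closed tags, inline pairs).
function depthDelta(node: string): -1 | 0 | 1 {
  if (CLOSING.test(node)) return -1;
  const match = OPENING.exec(node);
  if (match === null) return 0;
  return VOID_TAGS.includes(match[1].toLowerCase()) ? 0 : 1;
}

console.log(depthDelta('div class="box"')); // 1
console.log(depthDelta('/div'));            // -1
console.log(depthDelta('img src="a.png"')); // 0 (void tag)
console.log(depthDelta('br/'));             // 0 (self-closed)
console.log(depthDelta('span>Text</span')); // 0 (opens and closes in one node)
```

A node such as `span>Text</span` contains a `>` in the middle, so the opening pattern rejects it and the depth stays put, which is what keeps inline elements on one line.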
Minification is the opposite of beautification. We want to destroy whitespace.
But we can't just replace(/\s/g, '') because that kills spaces inside text (e.g., "Hello World" -> "HelloWorld").
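A quick way to see the difference is to run the naive replacement and the two-step approach described next on the same input. This is a minimal sketch; the function names `naiveMinify` and `safeMinify` are illustrative and not taken from the FastTools code.

```typescript
// Naive: removes every whitespace character, including those inside text.
function naiveMinify(html: string): string {
  return html.replace(/\s/g, '');
}

// Safer: collapse each run of whitespace to one space, then drop the
// whitespace that sits between a ">" and the next "<".
function safeMinify(html: string): string {
  return html
    .replace(/\s+/g, ' ')
    .replace(/>\s+</g, '><')
    .trim();
}

const sampleHtml = '<div>\n  <p>Hello   World</p>\n</div>';

console.log(naiveMinify(sampleHtml)); // <div><p>HelloWorld</p></div>
console.log(safeMinify(sampleHtml));  // <div><p>Hello World</p></div>
```

Note that the safer version still collapses whitespace everywhere, so content where spacing matters (for example inside a `<pre>` element) would also be squeezed.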
Safe Minify Logic:
Collapse whitespace runs into a single space: .replace(/\s+/g, ' ')
Remove the gaps between tags: .replace(/>\s+</g, '><')
Production tools need options.
Strip comments: <!--[\s\S]*?-->. The [\s\S] trick matches newlines too!
Custom indentation: ' '.repeat(size).
The Complete Component (Beautifier.tsx) 💻
Here is the complete, robust component used in our production environment.
'use client';

import { useState } from 'react';
import { Copy, Trash2, FileCode } from 'lucide-react';

// Full list of HTML void elements: they never get a closing tag,
// so they must not increase the indent.
const VOID_TAGS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input', 'link', 'meta', 'param', 'source', 'track', 'wbr'];

export default function ProductionBeautifier() {
  const [input, setInput] = useState('');
  const [output, setOutput] = useState('');
  const [mode, setMode] = useState('beautify'); // 'beautify' | 'minify'
  const [indentSize, setIndentSize] = useState(2);
  const [stripComments, setStripComments] = useState(false);

  // --- CORE LOGIC START ---
  const processHTML = () => {
    if (!input.trim()) return;
    let code = input;

    // 1. Pre-processing (Comments)
    if (stripComments) {
      code = code.replace(/<!--[\s\S]*?-->/g, '');
    }
    // Trim so the first node starts with "<" and the last one ends with ">"
    code = code.trim();

    // 2. Minification Path
    if (mode === 'minify') {
      const minified = code
        .replace(/\s+/g, ' ') // Collapse whitespace
        .replace(/>\s+</g, '><') // Remove tag gaps
        .trim();
      setOutput(minified);
      return;
    }

    // 3. Beautification Path
    let formatted = '';
    let indent = 0;
    const tab = ' '.repeat(indentSize);

    code.split(/>\s*</).forEach(node => {
      // Decrease indent for closing tags (e.g. /div), never below zero
      if (/^\/\w/.test(node)) indent = Math.max(0, indent - 1);

      // Reconstruct tag with padding
      formatted += tab.repeat(indent) + '<' + node + '>\n';

      // Increase indent for opening tags (that are NOT void/self-closing).
      // Regex checks: starts with a word character, doesn't end with /.
      // The tail is optional so single-letter tags (<p>, <b>) match too.
      const opening = node.match(/^<?(\w+)(?:[^>]*[^/])?$/);
      if (opening && !VOID_TAGS.includes(opening[1].toLowerCase())) {
        indent++;
      }
    });

    // The loop wraps every node in "<" and ">\n", but the first node already
    // starts with "<" and the last already ends with ">", so drop the extras.
    setOutput(formatted.substring(1, formatted.length - 2));
  };
  // --- CORE LOGIC END ---

  // UI Helpers
  const copy = () => navigator.clipboard.writeText(output);
  const clear = () => { setInput(''); setOutput(''); };

  return (
    <div className="max-w-4xl mx-auto p-6 space-y-6 bg-slate-50 border rounded-xl">
      <div className="flex justify-between items-center">
        <h2 className="text-2xl font-bold flex gap-2 items-center text-slate-800">
          <FileCode className="text-blue-600"/> HTML Engine
        </h2>
        {/* Controls */}
        <div className="flex gap-2">
          <select
            className="p-2 rounded border bg-white text-sm"
            value={mode} onChange={e => setMode(e.target.value)}
          >
            <option value="beautify">Beautify Mode</option>
            <option value="minify">Minify Mode</option>
          </select>
          <select
            className="p-2 rounded border bg-white text-sm"
            value={indentSize} onChange={e => setIndentSize(Number(e.target.value))}
            disabled={mode === 'minify'}
          >
            <option value={2}>2 Spaces</option>
            <option value={4}>4 Spaces</option>
          </select>
          <button
            onClick={() => setStripComments(!stripComments)}
            className={`p-2 rounded border text-sm ${stripComments ? 'bg-red-100 text-red-700 border-red-200' : 'bg-white'}`}
          >
            {stripComments ? 'No Comments' : 'Keep Comments'}
          </button>
        </div>
      </div>
      <div className="grid md:grid-cols-2 gap-4">
        {/* Input */}
        <div className="space-y-2">
          <div className="flex justify-between text-xs font-bold text-slate-500 uppercase">
            <span>Input HTML</span>
            <button onClick={clear} className="text-red-500 hover:text-red-700 flex gap-1 items-center">
              <Trash2 size={12}/> Clear
            </button>
          </div>
          <textarea
            className="w-full h-80 p-4 rounded-lg border border-slate-300 font-mono text-xs focus:ring-2 focus:ring-blue-500 outline-none"
            placeholder="Paste messy code..."
            value={input}
            onChange={e => setInput(e.target.value)}
          />
        </div>
        {/* Output */}
        <div className="space-y-2">
          <div className="flex justify-between text-xs font-bold text-slate-500 uppercase">
            <span>Result</span>
            <button onClick={copy} className="text-blue-600 hover:text-blue-800 flex gap-1 items-center">
              <Copy size={12}/> Copy
            </button>
          </div>
          <textarea
            className="w-full h-80 p-4 rounded-lg border border-slate-300 bg-slate-900 text-green-400 font-mono text-xs"
            readOnly
            value={output}
            placeholder="Clean code appears here..."
          />
        </div>
      </div>
      <button
        onClick={processHTML}
        className="w-full py-4 bg-gradient-to-r from-blue-600 to-indigo-600 text-white font-bold rounded-lg hover:shadow-lg transition transform hover:-translate-y-0.5"
      >
        {mode === 'beautify' ? '✨ Beautify HTML' : '📦 Minify HTML'}
      </button>
    </div>
  );
}
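Because the core logic sits inside a React click handler, it is awkward to test on its own. One option, sketched here as an assumption about how you might structure it, is to lift the beautify path into a pure function. The names `beautifyHtml`, `BeautifyOptions`, and `VOID_ELEMENTS` are hypothetical and not part of the FastTools code. The sketch also uses a slightly widened opening-tag pattern so that single-letter tags such as `<p>` indent, and it re-adds brackets per node instead of trimming the result afterwards.

```typescript
interface BeautifyOptions {
  indentSize: number;
  stripComments: boolean;
}

const VOID_ELEMENTS: readonly string[] = [
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'param', 'source', 'track', 'wbr',
];

function beautifyHtml(html: string, options: BeautifyOptions): string {
  let code = html;
  if (options.stripComments) {
    code = code.replace(/<!--[\s\S]*?-->/g, '');
  }
  code = code.trim();

  const tab = ' '.repeat(options.indentSize);
  const nodes = code.split(/>\s*</);
  const lines: string[] = [];
  let indent = 0;

  nodes.forEach((node, i) => {
    // Closing tags step back one level, never below zero.
    if (/^\/\w/.test(node)) indent = Math.max(0, indent - 1);

    // The first and last node still carry one original bracket each.
    const open = i === 0 ? '' : '<';
    const close = i === nodes.length - 1 ? '' : '>';
    lines.push(tab.repeat(indent) + open + node + close);

    // Non-void opening tags push the following nodes one level deeper.
    const opening = /^<?(\w+)(?:[^>]*[^/])?$/.exec(node);
    if (opening !== null && !VOID_ELEMENTS.includes(opening[1].toLowerCase())) {
      indent++;
    }
  });

  return lines.join('\n');
}

const messy = '<div><!-- note --><p>Hi</p><br><img src="a.png"></div>';
console.log(beautifyHtml(messy, { indentSize: 2, stripComments: true }));
// <div>
//   <p>Hi</p>
//   <br>
//   <img src="a.png">
// </div>
```

Inside the component, the click handler could then reduce to a single call such as `setOutput(beautifyHtml(input, { indentSize, stripComments }))`, keeping the React code free of parsing details.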
Because the engine relies only on built-in String methods (split, replace), it is very fast.
It processes 1MB of HTML in milliseconds because it never builds DOM nodes or an AST.
However, if you paste invalid HTML (like missing brackets), the regex split might behave unpredictably. The same goes for content where a literal >< or exact whitespace matters, such as <pre> blocks, inline scripts, or attribute values. This is why we call it a "Beautifier" (visual format only) and not a "Parser" (strict checking).
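If you want to check the speed claim on your own machine, a rough timing sketch like the following may help. It is an informal measurement, not a benchmark harness: the generated input, the variable names, and the use of `Date.now()` are all assumptions made for illustration, and the numbers will vary by device and runtime.

```typescript
// Same two-step minify logic as in the article.
function minify(html: string): string {
  return html
    .replace(/\s+/g, ' ')
    .replace(/>\s+</g, '><')
    .trim();
}

// Build roughly 1 MB of repetitive, indented HTML.
const row = '<li class="item">\n    <a href="#">Link</a>\n</li>\n';
const bigInput = '<ul>\n' + row.repeat(Math.ceil(1000000 / row.length)) + '</ul>';

const startedAt = Date.now();
const minified = minify(bigInput);
const elapsedMs = Date.now() - startedAt;

console.log(
  `${bigInput.length} chars in, ${minified.length} chars out, ${elapsedMs} ms`
);
```

Running it several times and ignoring the first run gives a fairer picture, since the JavaScript engine may still be warming up on the first call.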
Developer Tools & Resource Experts
FastTools is dedicated to curating high-quality content and resources that empower developers. With nearly 5 years of hands-on development experience, our team rigorously evaluates every tool and API we recommend, ensuring you get only the most reliable and effective solutions for your projects.