Spaces:
Running
Running
| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>SupraLabs | Hidden Topology: Depth vs. Width Scaling for SLMs</title> | |
| <script src="https://cdn.jsdelivr.net/npm/chart.js"></script> | |
| <style> | |
| :root { | |
| --bg: #0f0f0f; | |
| --surface: #1a1a1a; | |
| --border: #333; | |
| --text: #e0e0e0; | |
| --accent: #536bfe; /* Supra Blue */ | |
| --muted: #888; | |
| --success: #00e676; | |
| --warning: #ffb300; | |
| --font-mono: 'JetBrains Mono', 'Fira Code', monospace; | |
| } | |
| * { margin: 0; padding: 0; box-sizing: border-box; } | |
| body { | |
| background-color: var(--bg); | |
| color: var(--text); | |
| font-family: 'Inter', -apple-system, sans-serif; | |
| line-height: 1.6; | |
| padding: 2rem; | |
| } | |
| code, pre, .mono { font-family: var(--font-mono); } | |
| .container { max-width: 1000px; margin: 0 auto; } | |
| header { | |
| border-bottom: 2px solid var(--border); | |
| padding-bottom: 2rem; | |
| margin-bottom: 3rem; | |
| display: flex; | |
| justify-content: space-between; | |
| align-items: flex-end; | |
| } | |
| .logo-area a { text-decoration: none; color: inherit; } | |
| .logo-area h1 { | |
| font-size: 1.2rem; | |
| text-transform: uppercase; | |
| letter-spacing: 2px; | |
| color: var(--accent); | |
| line-height: 1; | |
| display: flex; | |
| align-items: center; | |
| gap: 10px; | |
| } | |
| nav a { | |
| color: var(--text); | |
| text-decoration: none; | |
| margin-left: 1.5rem; | |
| font-size: 0.9rem; | |
| border-bottom: 1px solid transparent; | |
| } | |
| nav a:hover { border-bottom: 1px solid var(--accent); } | |
| .hero { margin-bottom: 4rem; } | |
| .hero h2 { font-size: 3rem; line-height: 1.1; margin-bottom: 1.5rem; font-weight: 800; } | |
| .hero p { font-size: 1.2rem; color: var(--muted); max-width: 750px; } | |
| .section-label { | |
| display: block; | |
| font-family: var(--font-mono); | |
| color: var(--accent); | |
| font-size: 0.8rem; | |
| margin-top: 3rem; | |
| margin-bottom: 1rem; | |
| text-transform: uppercase; | |
| } | |
| .card { background: var(--surface); border: 1px solid var(--border); padding: 2.5rem; margin-bottom: 2rem; } | |
| h3 { font-size: 1.6rem; margin-bottom: 1rem; font-weight: 700; } | |
| p { margin-bottom: 1rem; color: #ccc; } | |
| ul { margin-left: 1.5rem; margin-bottom: 1.5rem; } | |
| li { margin-bottom: 0.5rem; } | |
| .winner-badge { color: var(--success); font-weight: bold; font-family: var(--font-mono); } | |
| .table-container { overflow-x: auto; margin: 2rem 0; border: 1px solid var(--border); } | |
| table { width: 100%; border-collapse: collapse; text-align: left; font-size: 0.95rem; } | |
| th, td { padding: 1rem; border-bottom: 1px solid var(--border); } | |
| th { background-color: rgba(26, 26, 26, 0.8); font-family: var(--font-mono); color: var(--accent); font-size: 0.85rem; text-transform: uppercase; } | |
| tr:hover { background-color: rgba(83, 107, 254, 0.05); } | |
| .highlight-row { border-left: 4px solid var(--success); background-color: rgba(0, 230, 118, 0.02); } | |
| .chart-box { background: var(--surface); border: 1px solid var(--border); padding: 2rem; margin-bottom: 2rem; } | |
| .stats-grid { | |
| display: grid; | |
| grid-template-columns: 1fr 1fr 1fr; | |
| gap: 1rem; | |
| margin-top: 4rem; | |
| border-top: 1px solid var(--border); | |
| padding-top: 2rem; | |
| } | |
| .stat-box { padding: 1rem; border-left: 2px solid var(--accent); } | |
| .stat-box small { display: block; color: var(--muted); font-family: var(--font-mono); } | |
| footer { margin-top: 6rem; padding-bottom: 2rem; font-size: 0.8rem; color: var(--muted); text-align: center; } | |
| @media (max-width: 768px) { | |
| .hero h2 { font-size: 2rem; } | |
| header { flex-direction: column; align-items: flex-start; gap: 1rem; } | |
| nav a { margin-left: 0; margin-right: 1rem; } | |
| .stats-grid { grid-template-columns: 1fr; } | |
| } | |
| </style> | |
| </head> | |
| <body> | |
| <div class="container"> | |
| <header> | |
| <div class="logo-area" style="font-size: 1.5em;"> | |
| <a href="index.html"><h1><img src="./image.png" style="height: 2em" alt="Logo"> SupraLabs_</h1></a> | |
| </div> | |
| <nav> | |
| <a href="#summary">Core Learnings</a> | |
| <a href="#benchmarks">Topology Matrix</a> | |
| <a href="#charts">Visualizations</a> | |
| <a href="https://huggingface.co/SupraLabs" target="_blank">HuggingFace</a> | |
| </nav> | |
| </header> | |
| <section class="hero"> | |
| <h2>Experiment #4:<br>Hidden Topology — Depth vs. Width</h2> | |
| <p>An isolated geometric sweep allocating a uniform parameter budget (~5M) to explore structural limits. We stressed tested deep sequential layers against shallow parallel computing layers over <strong>50,000,000 unique tokens</strong> using local bfloat16 hardware accelerator engines.</p> | |
| </section> | |
| <span class="section-label" id="summary">// Structural_Discoveries_&_Compute_Autobahns</span> | |
| <div class="card"> | |
| <h3>Shallow & Wide Monopolizes the Megabyte Scale</h3> | |
| <p>Traditional deep-network assumptions state that depth unlocks highly abstract multi-step logic circuits. Our empirical architectural investigation completely reverses this paradigm for Small Language Models (SLMs) under 10M parameters:</p> | |
| <ul> | |
| <li><strong>The Bottleneck of Extreme Depth:</strong> Forcing a 4.2M model into a 12-layer deep structure shrinks internal tracking states down to an informational choke point. The layers are too narrow to retain text variations simultaneously.</li> | |
| <li><strong>Massive Parallel Compute Highways:</strong> Truncating the architecture to 3 layers while expanding layer dimensions (Hidden: 256) establishes optimal neuronal real-estate. The weights map raw token dependencies instantly in parallel graphs.</li> | |
| <li><strong>Hardware Acceleration Exploitation:</strong> Because sequential deep layer dependencies are cut, GPU execution pipelines run unfettered. Model througput spikes by 91%, delivering optimal performance profiles at minimal energy footprint.</li> | |
| </ul> | |
| </div> | |
| <span class="section-label" id="benchmarks">// Topology_Evaluation_Data</span> | |
| <div class="card" style="padding: 1.5rem;"> | |
| <h3>Controlled Architectural Sweeps</h3> | |
| <p>Pretrained via optimized hyperparameter constants. Perplexity (PPL) and Lower Pretrain Loss track the precision of language acquisition.</p> | |
| <div class="table-container"> | |
| <table> | |
| <thead> | |
| <tr> | |
| <th>Structural Metric</th> | |
| <th>Exp 1: Deep & Narrow</th> | |
| <th>Exp 2: Balanced Baseline</th> | |
| <th style="color: var(--success)">Exp 3: Shallow & Wide (🏆 Win)</th> | |
| </tr> | |
| </thead> | |
| <tbody> | |
| <tr> | |
| <td class="mono">Layer Configuration</td> | |
| <td>12 Layers (Hidden: 128)</td> | |
| <td>6 Layers (Hidden: 192)</td> | |
| <td style="color: var(--success)">3 Layers (Hidden: 256)</td> | |
| </tr> | |
| </tr> | |
| <tr> | |
| <td class="mono">Active Model Parameters</td> | |
| <td>4,197,504</td> | |
| <td>5,114,304</td> | |
| <td>5,244,672</td> | |
| </tr> | |
| <tr> | |
| <td class="mono">Pretrain Step Loss (↓)</td> | |
| <td>4.539</td> | |
| <td>4.345</td> | |
| <td style="color: var(--success); font-weight: bold;">4.188</td> | |
| </tr> | |
| <tr> | |
| <td class="mono">Pretrain Train Loss (↓)</td> | |
| <td>5.567</td> | |
| <td>5.302</td> | |
| <td style="color: var(--success);">5.093</td> | |
| </tr> | |
| <tr> | |
| <td class="mono">ARC-Easy Zero-Shot (↑)</td> | |
| <td>29.17%</td> | |
| <td>29.63%</td> | |
| <td class="winner-badge">29.97%</td> | |
| </tr> | |
| <tr> | |
| <td class="mono">Wikitext Word PPL (↓)</td> | |
| <td>817.8844</td> | |
| <td>585.5899</td> | |
| <td class="winner-badge">418.6314</td> | |
| </tr> | |
| <tr> | |
| <td class="mono">Compute Throughput (⚡)</td> | |
| <td>218.7 samples/sec</td> | |
| <td>282.9 samples/sec</td> | |
| <td class="winner-badge">417.9 samples/sec</td> | |
| </tr> | |
| <tr class="highlight-row"> | |
| <td style="font-weight: bold;">EFFICIENCY MATRIX</td> | |
| <td style="color: var(--warning)">Severe Structural Bottleneck</td> | |
| <td>Intermediate Compression</td> | |
| <td style="color: var(--success); font-weight: bold;">SOTA TOPOLOGY SWEETSPOT</td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| </div> | |
| </div> | |
| <span class="section-label" id="charts">// Visualizing_Structural_Performance</span> | |
| <div class="chart-box"> | |
| <h3>Downstream Perplexity Collapse vs. Topology</h3> | |
| <div style="position: relative; height:350px; width:100%"> | |
| <canvas id="topologyChart"></canvas> | |
| </div> | |
| </div> | |
| <div class="chart-box"> | |
| <h3>The Hardware Advantage: Active Processing Speeds</h3> | |
| <p style="font-size: 0.85rem; color: var(--muted); margin-bottom: 1.5rem;">Fewer structural steps cut matrix synchronization gaps, unlocking maximum execution parallelization on local Tensor Core engines.</p> | |
| <div style="position: relative; height:350px; width:100%"> | |
| <canvas id="throughputChart"></canvas> | |
| </div> | |
| </div> | |
| <section class="stats-grid" id="hardware"> | |
| <div class="stat-box"> | |
| <small>COMPUTE TENSOR ENGINE</small> | |
| <strong>Local bfloat16 Hardware Run</strong> | |
| </div> | |
| <div class="stat-box"> | |
| <small>CONSTANT SEARCH ENGINE</small> | |
| <strong>Optuna Peak LR 0.001178</strong> | |
| </div> | |
| <div class="stat-box"> | |
| <small>OPTIMAL LAYOUT PROFILE</small> | |
| <strong>3-Layer Wide Configuration</strong> | |
| </div> | |
| </section> | |
| <footer> | |
| <p>© 2026 SupraLabs. High performance. Small footprints. Proudly open-source.</p> | |
| </footer> | |
| </div> | |
| <script> | |
| // Topology Metrics Chart | |
| const ctxTopo = document.getElementById('topologyChart').getContext('2d'); | |
| new Chart(ctxTopo, { | |
| type: 'bar', | |
| data: { | |
| labels: ['Deep & Narrow (12L)', 'Balanced (6L)', 'Shallow & Wide (3L)'], | |
| datasets: [ | |
| { | |
| label: 'Wikitext Word Perplexity (Lower = Better)', | |
| data: [817.88, 585.58, 418.63], | |
| backgroundColor: ['rgba(255, 179, 0, 0.2)', 'rgba(83, 107, 254, 0.2)', 'rgba(0, 230, 118, 0.3)'], | |
| borderColor: ['#ffb300', '#536bfe', '#00e676'], | |
| borderWidth: 2 | |
| } | |
| ] | |
| }, | |
| options: { | |
| responsive: true, | |
| maintainAspectRatio: false, | |
| plugins: { legend: { labels: { color: '#bbb' } } }, | |
| scales: { | |
| y: { grid: { color: '#222' }, ticks: { color: '#888' } }, | |
| x: { grid: { display: false }, ticks: { color: '#aaa' } } | |
| } | |
| } | |
| }); | |
| // Throughput Chart | |
| const ctxSpeed = document.getElementById('throughputChart').getContext('2d'); | |
| new Chart(ctxSpeed, { | |
| type: 'line', | |
| data: { | |
| labels: ['Deep & Narrow', 'Balanced Baseline', 'Shallow & Wide'], | |
| datasets: [{ | |
| label: 'Pretrain Throughput (samples/sec)', | |
| data: [218.7, 282.9, 417.9], | |
| borderColor: '#536bfe', | |
| backgroundColor: 'rgba(83, 107, 254, 0.1)', | |
| fill: true, | |
| tension: 0.2, | |
| borderWidth: 3 | |
| }] | |
| }, | |
| options: { | |
| responsive: true, | |
| maintainAspectRatio: false, | |
| plugins: { legend: { labels: { color: '#bbb' } } }, | |
| scales: { | |
| y: { grid: { color: '#222' }, ticks: { color: '#888' } }, | |
| x: { grid: { color: '#222' }, ticks: { color: '#aaa' } } | |
| } | |
| } | |
| }); | |
| </script> | |
| </body> | |
| </html> |