Incremental Static Regeneration (ISR) blends the performance of pre-built static content with the flexibility of runtime generation. Jekyll excels at build-time static generation but has no built-in ISR support. By combining Cloudflare Workers with KV storage, we can implement a stale-while-revalidate pattern that serves cached content immediately and refreshes it in the background. This technical guide explores the architecture and implementation of a custom ISR system for Jekyll that serves cache hits from the edge with low latency while keeping content fresh through background regeneration.
The ISR architecture for Jekyll requires multiple cache layers and intelligent routing logic. At its core, the system must distinguish between build-time generated content and runtime-regenerated content while maintaining consistent URL structures and caching headers. The architecture comprises three main layers: the edge cache (Cloudflare CDN), the ISR logic layer (Workers), and the origin storage (GitHub Pages).
Each request flows through a deterministic routing system that checks cache freshness, determines revalidation needs, and serves appropriate content versions. The system maintains a content versioning schema where each page is associated with a content hash and timestamp. When a request arrives, the Worker checks if a fresh cached version exists. If stale but valid content is available, it's served immediately while triggering asynchronous revalidation. For completely missing content, the system falls back to the Jekyll origin while generating a new ISR version.
// Architecture Flow:
// 1. Request → Cloudflare Edge
// 2. Worker checks KV for page metadata
// 3. IF fresh_cache_exists → serve immediately
// 4. ELSE IF stale_cache_exists → serve stale + trigger revalidate
// 5. ELSE → fetch from origin + cache new version
// 6. Background: revalidate stale content → update KV + cache
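The three outcomes in the flow above (fresh, stale, missing) can be captured in one pure function. The following is a minimal sketch rather than part of the original implementation: `classifyCacheState`, the `maxStaleSeconds` cutoff, and the default values are hypothetical names introduced for illustration, while `lastValidated` and `regenerateAfter` follow the metadata fields used later in this guide.

```javascript
// Hypothetical helper: decide how a request should be served.
// `entry` has the shape { value, metadata }, as returned by KV getWithMetadata.
function classifyCacheState(entry, now = Date.now(), maxStaleSeconds = 86400) {
  if (!entry || entry.value == null || !entry.metadata) return 'MISS';

  const { lastValidated, regenerateAfter = 3600 } = entry.metadata;
  if (typeof lastValidated !== 'number') return 'MISS';

  const ageSeconds = (now - lastValidated) / 1000;
  if (ageSeconds <= regenerateAfter) return 'HIT';   // fresh: serve as-is
  if (ageSeconds <= maxStaleSeconds) return 'STALE'; // serve now, revalidate in background
  return 'MISS';                                     // too old to serve: go to origin
}
```

Keeping the decision free of I/O makes it straightforward to unit test outside the Workers runtime.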
The Cloudflare Worker serves as the ISR engine, intercepting all requests and applying the regeneration logic. The implementation requires careful handling of response streaming, error boundaries, and cache coordination.
Here's the core Worker implementation for ISR routing:
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);
    const cacheKey = generateCacheKey(url);
    // Look up the cached page and its metadata in KV
    const { value: cachedHtml, metadata } = await env.ISR_KV.getWithMetadata(cacheKey);
    // Entries without metadata cannot be dated, so treat them as stale
    const isStale = !metadata || isContentStale(metadata);
    if (cachedHtml && !isStale) {
      return new Response(cachedHtml, {
        headers: { 'X-ISR': 'HIT', 'Content-Type': 'text/html' }
      });
    }
    if (cachedHtml && isStale) {
      // Serve stale content while revalidating in background
      ctx.waitUntil(revalidateContent(url, env));
      return new Response(cachedHtml, {
        headers: { 'X-ISR': 'STALE', 'Content-Type': 'text/html' }
      });
    }
    // Cache miss - fetch from origin and cache
    return handleCacheMiss(request, url, env, ctx);
  }
};
async function revalidateContent(url, env) {
  try {
    const originResponse = await fetch(url);
    if (originResponse.ok) {
      const content = await originResponse.text();
      // generateContentHash is asynchronous, so it must be awaited
      const hash = await generateContentHash(content);
      await env.ISR_KV.put(
        generateCacheKey(url),
        content,
        {
          metadata: {
            lastValidated: Date.now(),
            contentHash: hash
          },
          expirationTtl: 86400 // 24 hours
        }
      );
    }
  } catch (error) {
    // Keep serving the previous version if the origin fetch or KV write fails
    console.error('Revalidation failed:', error);
  }
}
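The Worker above calls `generateCacheKey` and `handleCacheMiss` without defining them. One possible shape for both is sketched below. The key scheme (`isr::` plus a normalized pathname), the decision to cache only HTML, and the `X-ISR: MISS` header value are assumptions made for illustration, not details confirmed by the original implementation.

```javascript
// Hypothetical key scheme: one KV key per normalized pathname.
function generateCacheKey(url) {
  let path = new URL(url).pathname;
  // Treat /docs/ and /docs/index.html as the same page.
  if (path.endsWith('/index.html')) {
    path = path.slice(0, -'index.html'.length);
  }
  return `isr::${path}`;
}

// Cache miss: fetch from the origin, return the page, and store it in the background.
async function handleCacheMiss(request, url, env, ctx) {
  const originResponse = await fetch(request);
  const contentType = originResponse.headers.get('Content-Type') || '';

  // Pass errors and non-HTML assets through untouched; only HTML is cached.
  if (!originResponse.ok || !contentType.includes('text/html')) {
    return originResponse;
  }

  const html = await originResponse.text();
  ctx.waitUntil(
    env.ISR_KV.put(generateCacheKey(url), html, {
      metadata: { lastValidated: Date.now() },
      expirationTtl: 86400, // same 24-hour TTL used during revalidation
    })
  );

  return new Response(html, {
    status: originResponse.status,
    headers: { 'X-ISR': 'MISS', 'Content-Type': contentType },
  });
}
```

Writing to KV inside `ctx.waitUntil` keeps the storage write off the response path, so the first visitor is not delayed by it.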
Cloudflare KV provides the persistent storage layer for ISR metadata and content versioning. Each cached page requires careful metadata management to track freshness and content integrity.
The KV schema design must balance storage efficiency with quick retrieval. Each cache entry contains the rendered HTML content and metadata including validation timestamp, content hash, and regeneration frequency settings. The metadata enables intelligent cache invalidation based on both time-based and content-based triggers.
// KV schema design (illustrative shape, not executable code):
{
  // Keyed by pathname only, so a request can look the entry up directly;
  // the content hash lives in the metadata rather than in the key
  key: `isr::${pathname}`,
  value: renderedHTML,
  metadata: {
    createdAt: timestamp,
    lastValidated: timestamp,
    contentHash: 'sha256-hash',
    regenerateAfter: 3600, // seconds
    priority: 'high|medium|low',
    dependencies: ['/api/data', '/_data/config.yml']
  }
}
// Content hashing implementation
// Returns a Promise that resolves to a hex string, so callers must await it
async function generateContentHash(content) {
  const data = new TextEncoder().encode(content);
  const digest = await crypto.subtle.digest('SHA-256', data);
  return Array.from(new Uint8Array(digest))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}
The revalidation logic determines when and how content should be regenerated. The system implements multiple revalidation strategies: time-based TTL, content-based hashing, and dependency-triggered invalidation.
Time-based revalidation uses configurable TTLs per content type. Blog posts might revalidate every 24 hours, while product pages might refresh every hour. Content-based revalidation compares hashes between cached and origin content, only updating when changes are detected. Dependency tracking allows pages to be invalidated when their data sources change, such as when Jekyll data files are updated.
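The per-content-type TTLs described above could be expressed as a small lookup table. This is a sketch under stated assumptions: the `/blog/` and `/products/` path prefixes, the `TTL_RULES` table, the `DEFAULT_TTL` value, and the `regenerateAfterFor` helper are all hypothetical; only the 24-hour and one-hour intervals come from the text.

```javascript
// Hypothetical TTL table, in seconds. The first matching prefix wins.
const TTL_RULES = [
  { prefix: '/products/', regenerateAfter: 3600 }, // product pages: hourly
  { prefix: '/blog/', regenerateAfter: 86400 },    // blog posts: daily
];
const DEFAULT_TTL = 3600; // assumed fallback for everything else

// Return the regeneration interval to store in a page's metadata.
function regenerateAfterFor(pathname) {
  const rule = TTL_RULES.find(r => pathname.startsWith(r.prefix));
  return rule ? rule.regenerateAfter : DEFAULT_TTL;
}
```

The returned value would be written into the `regenerateAfter` metadata field whenever a page is cached.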
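// Advanced revalidation with multiple strategies
// fetchContentHash and checkDependencies are helpers defined elsewhere
async function shouldRevalidate(url, metadata, env) {
  // Entries without metadata cannot be evaluated, so always revalidate them
  if (!metadata) {
    return { reason: 'missing_metadata', priority: 'high' };
  }
  // Time-based revalidation
  const timeElapsed = Date.now() - metadata.lastValidated;
  if (timeElapsed > metadata.regenerateAfter * 1000) {
    return { reason: 'ttl_expired', priority: 'high' };
  }
  // Content-based revalidation
  const currentHash = await fetchContentHash(url);
  if (currentHash !== metadata.contentHash) {
    return { reason: 'content_changed', priority: 'critical' };
  }
  // Dependency-based revalidation
  const depsChanged = await checkDependencies(metadata.dependencies);
  if (depsChanged) {
    return { reason: 'dependencies_updated', priority: 'medium' };
  }
  return null;
}
// Background revalidation queue
// env and ctx are not globals in a Worker, so the caller passes them in
async function processRevalidationQueue(env, ctx) {
  const staleKeys = await env.ISR_KV.list({
    prefix: 'isr::',
    limit: 100
  });
  for (const key of staleKeys.keys) {
    // list() returns each key's name and metadata; urlFromCacheKey is a
    // helper (the inverse of generateCacheKey) defined elsewhere
    const url = urlFromCacheKey(key.name);
    if (await shouldRevalidate(url, key.metadata, env)) {
      ctx.waitUntil(revalidateContentByKey(key.name, env));
    }
  }
}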
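A single `list()` call returns at most one page of keys, so a queue that should cover every cached page has to follow the cursor. The sketch below shows that loop. `listAllKeys` is a hypothetical helper name; `list_complete` and `cursor` are the fields returned by the KV `list()` API.

```javascript
// Collect every key under a prefix by following the cursor across pages.
async function listAllKeys(kv, prefix) {
  const all = [];
  let cursor;
  do {
    const page = await kv.list({ prefix, cursor, limit: 1000 });
    all.push(...page.keys);
    // A cursor is only meaningful while more pages remain.
    cursor = page.list_complete ? undefined : page.cursor;
  } while (cursor);
  return all;
}
```

For large sites it may be preferable to process each page as it arrives instead of collecting all keys first, since a single Worker invocation has limited time and memory.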
Jekyll must be configured to work with the ISR system through content hashing and build metadata generation. This involves creating a post-build process that generates content manifests and hash files.
Implement a Jekyll plugin that generates content hashes during build and creates a manifest file mapping URLs to their content hashes. This manifest enables the ISR system to detect content changes without fetching entire pages.
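# _plugins/isr_generator.rb
require 'digest'
require 'json'
require 'time'
Jekyll::Hooks.register :site, :post_write do |site|
  manifest = {}
  site.pages.each do |page|
    # Resolve the written file inside the build directory; directory-style
    # URLs such as /about/ resolve to their index.html
    path = page.destination(site.dest)
    next unless File.file?(path) # Skip anything that was not written as a file
    content = File.read(path)
    hash = Digest::SHA256.hexdigest(content)
    manifest[page.url] = {
      hash: hash,
      generated: Time.now.iso8601,
      dependencies: extract_dependencies(page)
    }
  end
  File.write(File.join(site.dest, 'isr-manifest.json'), JSON.pretty_generate(manifest))
end
def extract_dependencies(page)
  deps = []
  # Extract data file dependencies from page content; only the first segment
  # after site.data names the file (site.data.nav.items -> _data/nav.yml)
  page.content.to_s.scan(/site\.data\.(\w+)/).each do |match|
    deps << "_data/#{match[0]}.yml"
  end
  deps.uniq
end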
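On the Worker side, the manifest can stand in for a full page fetch when checking whether content changed. The following is a rough sketch: `loadManifest` and `hasContentChanged` are hypothetical helpers, and it assumes the manifest is published at `/isr-manifest.json` on the origin with the shape the plugin above produces (`{ "/about.html": { hash, generated, dependencies } }`).

```javascript
// Fetch and parse the build manifest from the origin (assumed location).
async function loadManifest(origin) {
  const res = await fetch(`${origin}/isr-manifest.json`);
  if (!res.ok) throw new Error(`Manifest fetch failed: ${res.status}`);
  return res.json();
}

// Compare the hash from the latest build against the hash stored with the cached page.
function hasContentChanged(manifest, pathname, cachedHash) {
  const entry = manifest[pathname];
  if (!entry) return true; // page unknown to the latest build: treat as changed
  return entry.hash !== cachedHash;
}
```

This comparison only works if the Worker hashes the same bytes that Jekyll wrote to disk, so any transformation applied between origin and cache would need to be accounted for.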
Monitoring ISR performance requires custom metrics tracking cache hit rates, revalidation success, and latency impacts. Implement comprehensive logging and analytics to optimize ISR configuration.
Use Workers analytics to track cache performance metrics:
// Enhanced response with analytics
// request, ctx, and startTime are passed in from the fetch handler,
// since they are not available as globals
function createISRResponse(content, cacheStatus, request, ctx, startTime) {
  const headers = {
    'Content-Type': 'text/html',
    'X-ISR-Status': cacheStatus,
    'X-ISR-Cache-Hit': cacheStatus === 'HIT' ? '1' : '0'
  };
  // Log analytics
  const analytics = {
    url: request.url,
    cacheStatus: cacheStatus,
    responseTime: Date.now() - startTime,
    contentLength: content.length, // character count, not bytes
    userAgent: request.headers.get('user-agent')
  };
  ctx.waitUntil(logAnalytics(analytics));
  return new Response(content, { headers });
}
// Cache efficiency analysis
// Assumes a hitCount field is maintained in each entry's metadata;
// a single list() call covers only the first page of keys
async function generateCacheReport(env) {
  const { keys } = await env.ISR_KV.list({ prefix: 'isr::' });
  let hits = 0, stale = 0, misses = 0;
  for (const key of keys) {
    const metadata = key.metadata || {}; // keys may have no metadata
    if (metadata.hitCount > 0) {
      hits++;
    } else if (metadata.lastValidated < Date.now() - 3600000) {
      stale++;
    } else {
      misses++;
    }
  }
  const total = keys.length;
  // Avoid dividing by zero when the cache is empty
  const pct = count => (total === 0 ? 0 : (count / total) * 100);
  return {
    total,
    hitRate: pct(hits),
    staleRate: pct(stale),
    efficiency: pct(hits + stale)
  };
}
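The report above counts stored keys, which is not the same as counting requests. If the `X-ISR-Status` values are logged per request, a request-level hit rate could be derived as sketched below. `summarizeStatuses` is a hypothetical helper, and it assumes the logged statuses are the strings `HIT`, `STALE`, and `MISS`.

```javascript
// Tally per-request cache statuses and convert them to percentages.
function summarizeStatuses(statuses) {
  const counts = { HIT: 0, STALE: 0, MISS: 0 };
  for (const status of statuses) {
    // Ignore anything that is not one of the three known statuses.
    if (Object.hasOwn(counts, status)) counts[status]++;
  }
  const total = counts.HIT + counts.STALE + counts.MISS;
  const pct = n => (total === 0 ? 0 : (n / total) * 100);
  return {
    total,
    hitRate: pct(counts.HIT),
    staleRate: pct(counts.STALE),
    missRate: pct(counts.MISS),
  };
}
```

For example, `summarizeStatuses(['HIT', 'HIT', 'STALE', 'MISS'])` reports a 50% hit rate over four requests.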
By implementing this ISR system, Jekyll sites gain dynamic regeneration capabilities while aiming for sub-100ms response times on cached pages. The architecture targets high cache hit rates (99% or more for popular content) while ensuring freshness through background revalidation; actual figures depend on traffic patterns and TTL settings and should be confirmed with the monitoring described above. This implementation bridges the gap between static generation and dynamic content for high-traffic Jekyll sites.