Understanding Inter-Arrival Time - The Hidden Performance Metric

When I evaluate the performance of an AI chatbot, the metrics I look at first are the time to receive the first chunk of data after a message is sent, the number of chunks received, and the total time taken to receive the full response. However, none of these captures the time intervals between chunks, and leaving that out is as significant an oversight as ignoring latency.

Consider watching the same movie twice, with the same total running time. Played at a steady 24fps, it appears smooth and immersive. If the frame rate fluctuates randomly between 5 and 60fps, it becomes janky and even irritating.

Consistency matters!

Inter-Arrival Time is simply the time gap between receiving consecutive chunks of data.

For example:

Chunk 1 arrives at: 0ms
Chunk 2 arrives at: 45ms    → Inter-arrival: 45ms
Chunk 3 arrives at: 90ms    → Inter-arrival: 45ms
Chunk 4 arrives at: 180ms   → Inter-arrival: 90ms
Chunk 5 arrives at: 220ms   → Inter-arrival: 40ms

Average inter-arrival time: (45 + 45 + 90 + 40) / 4 = 55ms
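
The arithmetic above can be sketched in a few lines of JavaScript. The helper names `interArrivalTimes` and `average` are made up for this illustration and are not part of any library.

```javascript
// Arrival timestamps (ms) from the example above.
const arrivals = [0, 45, 90, 180, 220];

// Gap between each chunk and the one before it.
function interArrivalTimes(timestamps) {
    const gaps = [];
    for (let i = 1; i < timestamps.length; i++) {
        gaps.push(timestamps[i] - timestamps[i - 1]);
    }
    return gaps;
}

// Arithmetic mean; returns NaN for an empty list.
function average(values) {
    return values.reduce((a, b) => a + b, 0) / values.length;
}

const exampleGaps = interArrivalTimes(arrivals);
console.log(exampleGaps);          // [45, 45, 90, 40]
console.log(average(exampleGaps)); // 55
```

Note that five chunks produce only four gaps, which is why the sum is divided by 4 rather than 5.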

Why Inter-Arrival Time is Critical

Perceived Performance

  1. Inconsistent (Bad):
Gaps: 10ms, 500ms, 20ms, 400ms, 15ms, 600ms
Average: 257ms
Feeling: Stuttery, unpredictable, frustrating
  2. Consistent (Good):
Gaps: 250ms, 260ms, 255ms, 250ms, 265ms, 240ms
Average: 253ms
Feeling: Smooth, predictable, pleasant

In both cases the average is more or less the same, but the user experience is completely different.
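
One way to put a number on that difference is the standard deviation of the gaps. The sketch below is only an illustration: `mean` and `stdDev` are hypothetical helper names, and it uses the population formula (dividing by n). The two averages land within a few milliseconds of each other, while the spread differs by more than an order of magnitude.

```javascript
const inconsistent = [10, 500, 20, 400, 15, 600];
const consistent = [250, 260, 255, 250, 265, 240];

// Arithmetic mean of a list of gaps (ms).
function mean(values) {
    return values.reduce((a, b) => a + b, 0) / values.length;
}

// Population standard deviation: how far gaps typically sit from the mean.
function stdDev(values) {
    const m = mean(values);
    const variance = values.reduce((sum, v) => sum + (v - m) ** 2, 0) / values.length;
    return Math.sqrt(variance);
}

for (const [label, gaps] of [['Inconsistent', inconsistent], ['Consistent', consistent]]) {
    console.log(`${label}: mean ${mean(gaps).toFixed(1)}ms, std dev ${stdDev(gaps).toFixed(1)}ms`);
}
```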

Network Quality Indicator

  1. Stable connection:
Inter-arrival times: 40ms, 42ms, 38ms, 41ms, 40ms
Standard deviation: ~1.5ms
Meaning: Great connection!
  2. Poor connection:
Inter-arrival times: 20ms, 200ms, 30ms, 500ms, 25ms
Standard deviation: ~200ms
Meaning: Packet loss, congestion, or throttling

Server Health

  1. Healthy server:
Consistent gaps: 30-50ms throughout
Meaning: Server processing steadily
  2. Struggling server:
Increasing gaps: 30ms → 50ms → 100ms → 200ms
Meaning: Server under load, getting slower
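
A rough way to tell these two patterns apart is to fit a straight line through the gaps and look at its slope. This is a minimal sketch, not a production detector: `gapTrend` is a hypothetical helper, the healthy sample values are made up within the 30-50ms range, and where to draw the line for "too steep" is left to you.

```javascript
// Least-squares slope of gap (ms) against chunk index.
// A clearly positive slope means gaps are growing over the stream.
function gapTrend(gaps) {
    const n = gaps.length;
    if (n < 2) return 0; // not enough points to fit a line
    const meanX = (n - 1) / 2;
    const meanY = gaps.reduce((a, b) => a + b, 0) / n;
    let covariance = 0;
    let varianceX = 0;
    gaps.forEach((gap, i) => {
        covariance += (i - meanX) * (gap - meanY);
        varianceX += (i - meanX) ** 2;
    });
    return covariance / varianceX;
}

const healthy = [30, 50, 40, 35, 45];   // illustrative values, not measured
const struggling = [30, 50, 100, 200];  // from the example above

console.log('Healthy slope (ms per chunk):', gapTrend(healthy).toFixed(1));
console.log('Struggling slope (ms per chunk):', gapTrend(struggling).toFixed(1));
```

A slope near zero means the gaps bounce around a steady level, while a large positive slope means each chunk is taking noticeably longer than the one before it.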

Measuring Inter-Arrival Time in Console

Paste the script below into your Chrome Developer Tools console:

(function() {
    const originalFetch = window.fetch;

    // Reads a cloned stream in the background and logs gap statistics.
    async function monitorStream(stream) {
        console.log('📊 Monitoring inter-arrival times...');

        const reader = stream.getReader();
        const times = [];
        let lastTime = null;
        let chunkCount = 0;

        while (true) {
            const {done} = await reader.read();
            if (done) break;

            const now = performance.now();
            if (lastTime !== null) { // Skip first chunk (no previous time)
                times.push(now - lastTime);
            }

            chunkCount++;
            lastTime = now;
        }

        // A gap needs at least two chunks
        if (times.length === 0) {
            console.log('Received', chunkCount, 'chunk(s); no gaps to analyze.');
            return;
        }

        // Calculate statistics
        const avg = times.reduce((a,b) => a+b, 0) / times.length;
        const min = Math.min(...times);
        const max = Math.max(...times);
        const sorted = [...times].sort((a,b) => a-b);
        const mid = Math.floor(sorted.length / 2);
        // Average the two middle values when the count is even
        const median = sorted.length % 2 ? sorted[mid] : (sorted[mid-1] + sorted[mid]) / 2;

        // Calculate (population) standard deviation
        const variance = times.reduce((sum, val) =>
            sum + Math.pow(val - avg, 2), 0) / times.length;
        const stdDev = Math.sqrt(variance);

        console.log('📈 Inter-Arrival Analysis:');
        console.log('  Total chunks:', chunkCount);
        console.log('  Average gap:', avg.toFixed(2) + 'ms');
        console.log('  Median gap:', median.toFixed(2) + 'ms');
        console.log('  Min gap:', min.toFixed(2) + 'ms');
        console.log('  Max gap:', max.toFixed(2) + 'ms');
        console.log('  Std deviation:', stdDev.toFixed(2) + 'ms');
        console.log('  Consistency:', stdDev < 50 ? '✅ Excellent' :
                    stdDev < 100 ? '⚠️ Good' : '❌ Poor');

        // Show distribution
        console.log('\n📊 Distribution:');
        const buckets = [0, 25, 50, 100, 200, 500, Infinity];
        const labels = ['0-25ms', '25-50ms', '50-100ms', '100-200ms', '200-500ms', '>500ms'];

        labels.forEach((label, i) => {
            const count = times.filter(t =>
                t >= buckets[i] && t < buckets[i+1]
            ).length;
            // 20-character bar: filled part is this bucket's share of all gaps
            const filled = Math.round(count / times.length * 20);
            const bar = '█'.repeat(filled) + '░'.repeat(20 - filled);
            console.log(`  ${label.padEnd(12)} ${bar} ${count}`);
        });

        // Store for later analysis
        window.lastInterArrivalTimes = times;
    }

    window.fetch = async function(...args) {
        const response = await originalFetch.apply(this, args);
        const contentType = response.headers.get('Content-Type');

        if (response.body && contentType?.includes('text/event-stream')) {
            // Not awaited, so the page gets its response right away
            // instead of waiting for the whole stream to finish.
            monitorStream(response.clone().body).catch(err =>
                console.warn('Inter-arrival monitoring failed:', err));
        }

        return response;
    };
})();
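
Because the script keeps the raw gaps in `window.lastInterArrivalTimes`, you can keep digging after the stream ends. As one possible follow-up, the sketch below computes nearest-rank percentiles. `percentile` is a hypothetical helper written for this example, not something the browser provides, and the fallback sample values are only there so the snippet runs outside a browser.

```javascript
// Nearest-rank percentile: the smallest gap that at least p% of gaps do not exceed.
function percentile(values, p) {
    if (values.length === 0) return NaN;
    const sorted = [...values].sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
}

// In the browser console, use the gaps stored by the monitoring script.
// Outside a browser (or before any stream has finished) fall back to sample data.
const stored = typeof window !== 'undefined' ? window.lastInterArrivalTimes : undefined;
const storedGaps = stored ?? [45, 45, 90, 40];

console.log('p50:', percentile(storedGaps, 50) + 'ms');
console.log('p95:', percentile(storedGaps, 95) + 'ms');
```

A high percentile such as p95 is useful next to the average because it exposes the occasional long stall that an average hides.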

Below is an example output from a good experience I had today:

📊 Monitoring inter-arrival times...
📈 Inter-Arrival Analysis:
  Total chunks: 245
  Average gap: 42.15ms
  Median gap: 38.50ms
  Min gap: 12.30ms
  Max gap: 156.80ms
  Std deviation: 23.45ms
  Consistency: ✅ Excellent

📊 Distribution:
  0-25ms       ███░░░░░░░░░░░░░░░░░ 38
  25-50ms      ████████████████░░░░ 156
  50-100ms     ███░░░░░░░░░░░░░░░░░ 42
  100-200ms    ░░░░░░░░░░░░░░░░░░░░ 8
  200-500ms    ░░░░░░░░░░░░░░░░░░░░ 1
  >500ms       ░░░░░░░░░░░░░░░░░░░░ 0

For the same question, I used a different chatbot and the experience was quite different:

📊 Monitoring inter-arrival times...
📈 Inter-Arrival Analysis:
  Total chunks: 53
  Average gap: 159.14ms
  Median gap: 68.80ms
  Min gap: 0.40ms
  Max gap: 1744.90ms
  Std deviation: 263.13ms
  Consistency: ❌ Poor
 
📊 Distribution:
  0-25ms       ███████░░░░░░░░░░░░░ 17
  25-50ms      █░░░░░░░░░░░░░░░░░░░ 2
  50-100ms     ███░░░░░░░░░░░░░░░░░ 9
  100-200ms    ███░░░░░░░░░░░░░░░░░ 7
  200-500ms    ██████░░░░░░░░░░░░░░ 15
  >500ms       ███░░░░░░░░░░░░░░░░░ 2

Inter-arrival time, and in particular its variation (commonly called jitter), is a significant factor in why certain AI systems are perceived as faster than others. It serves as a diagnostic tool for identifying connection issues and helps determine which AI model is optimized for providing a seamless user experience.

For developers, this information provides insights into server performance characteristics, identifies potential optimization opportunities, and enables the measurement of user experience quality.