Schlemiel the Painter's Algorithm (in JavaScript)

Shlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. "That's pretty good!" says his boss, "you're a fast worker!" and pays him a kopeck.

The next day Shlemiel only gets 150 yards done. "Well, that's not nearly as good as yesterday, but you're still a fast worker. 150 yards is respectable," and pays him a kopeck.

The next day Shlemiel paints 30 yards of the road. "Only 30!" shouts his boss. "That's unacceptable! On the first day you did ten times that much work! What's going on?"

"I can't help it," says Shlemiel. "Every day I get farther and farther away from the paint can!"

(Joel on Software, Joel Spolsky)

In Korea, this problem is often referred to as the Russian Painter Algorithm(러시아 페인트공 알고리즘). In the Korean edition, the translator simplified "Shlemiel" to just "painter." "Shlemiel" is a Yiddish term used to describe a clumsy or unlucky person prone to making mistakes or causing problems for themselves—something often seen in Yiddish humor. Given the cultural nuances, directly translating "Shlemiel" may have been challenging, so it seems the translator chose a simpler interpretation.

Additionally, because the story mentions kopecks, a currency used in Russia and former Soviet countries, I believe this might have influenced the naming of the problem as the "Russian Painter Algorithm" in Korea.

Anyway, Let's dive in this!

Schlemiel the Painter's Algorithm

Schlemiel the Painter's Algorithm describes routines that seem efficient for small tasks but become significantly slower as the workload grows due to unnecessary repetitive operations at a lower level. Joel Spolsky explains this problem with the strcat function in C

void strcat( char* dest, char* src )
{
    while (*dest) dest++;
    while (*dest++ = *src++);
}

char string[1000];
string[0] = '\0';
strcpy(string, "one");
strcat(string, ", two");
strcat(string, ", three");
strcat(string, ", four");

After the first strcat, the string becomes "one, two,", then "one, two, three," and so on. The issue with strcat is that every time it's called, it has to scan the entire string from the beginning to find the null terminator (\0), which marks the end of the string.

In C, strings are stored as null-terminated character arrays. This means that the function must iterate over each character, starting from the first, until it finds the null terminator. As the string grows, this repeated scanning becomes increasingly inefficient, leading to performance degradation.

Here's an improved version using mystrcat:

char* mystrcat( char* dest, char* src )
{
    while (*dest) dest++;
    while (*dest++ = *src++);
    return --dest;  // Return pointer to the end of the new string
}

Unlike strcat, mystrcat does not need to scan the entire string every time it appends a new one. This ensures that the performance remains linear (O(n)) when concatenating multiple strings, avoiding the quadratic (O(n^2)) slowdown caused by strcat.

char bigString[1000];  // Allocate 1000 bytes
char *p = bigString;
bigString[0] = '\0';   // Initialize the string
p = mystrcat(p, "one, ");
p = mystrcat(p, "two, ");
p = mystrcat(p, "three, ");
p = mystrcat(p, "four ");

mystrcat appends strings while tracking the end of the current string with the p pointer, maintaining efficient string concatenation.

In javascript

When concatenating strings in JavaScript, there's no need for direct low-level memory manipulation like in C. However, it still involves creating new string objects and copying existing strings internally.

Strings in JavaScript are immutable, meaning once a string is created, it cannot be changed. So when concatenating two strings, JavaScript creates a new string object and copies the existing strings into this new object, appending the new string. For example, when using the + operator to concatenate strings, the JavaScript engine copies the two existing strings, then creates a new combined string. As the string grows, more memory is consumed due to copying.

Garbage Collection

After creating the new string, the old strings (if no longer referenced) become candidates for garbage collection. While you might think frequent garbage collection would impact performance, JavaScript engines perform garbage collection asynchronously, and many modern engines, such as V8, are optimized to reduce performance issues, using techniques like idle-time collection, where garbage collection happens during CPU idle time.

Here's an example of string concatenation with performance issues:

let result = ''
for (let i = 0; i < 10000; i++) {
  result += 'text' // Appending to the string
}
console.log(result)

In this example, every time the += operator is used, JavaScript creates a new string by copying the existing string and appending the new one. This results in performance degradation as the string grows, exemplifying Schlemiel the Painter's Algorithm.

Optimized Approach Using Arrays:

let parts = []
for (let i = 0; i < 10000; i++) {
  parts.push('text')
}
let result = parts.join('') // Join the array into a string
console.log(result)

In this optimized approach, strings are stored in an array and concatenated all at once using join(). This significantly improves performance because no intermediate strings are created during each iteration of the loop.

Another Inefficient Example:

let arr = []
for (let i = 0; i < 10000; i++) {
  arr = [...arr, i] // Creating a new array and copying the old one
}
console.log(arr.length)

This code, similar to the string concatenation example, is inefficient. The spread operator (...arr) copies all elements of the existing array into a new array on each iteration. As the array grows, this results in repeated memory allocations and copying, making it highly inefficient.

Optimized Solution Using push():

let arr = []
for (let i = 0; i < 10000; i++) {
  arr.push(i) // Efficiently adding elements to the array
}
console.log(arr.length)

In contrast, using arr.push(i) appends elements directly to the end of the array. This approach avoids copying the entire array during each iteration, making it much more efficient. When necessary, JavaScript dynamically expands the array's memory allocation to accommodate new elements, typically increasing its size by a factor (e.g., 1.5 or 2), depending on the engine.

Time Complexity: push() operates in O(1) time complexity for each operation. Only when the array needs to expand does a more expensive O(n) copying occur, but this happens infrequently.