Optimization Problem 1.3: String Permutations

This next problem also comes from Cracking the Coding Interview.

Problem 3: Given two strings, write a method to decide if one is a permutation of the other.

Easy Solution

The expected solution here is, of course, straightforward as this is meant as an easy interview question. For each string, create a map from each character in the string to the number of times the character appears. Then, compare the maps.

Naturally, the AI autocomplete gives an incomplete solution.

// Solution3A uses a straightforward solution using character counts.
// Autocompleted solution.
func Solution3A(left, right string) bool {
  leftCounts := make(map[rune]int)
  for _, r := range left {
    leftCounts[r]++
  }

  rightCounts := make(map[rune]int)
  for _, r := range right {
    rightCounts[r]++
  }

  // Autocomplete begins here.
  for r, lc := range leftCounts {
    rc, ok := rightCounts[r]
    if !ok || lc != rc {
      return false
    }
  }

  return true
}

While at first glance it isn’t obvious what is wrong, a careful check reveals that this code does not handle the case where characters present in right are absent from left: the final loop only inspects the keys of leftCounts, so any extra characters in right go unnoticed. We can verify this bug with tests:

func TestSolution3(t *testing.T) {
  tt := []Problem3TestCase{
    {name: "empty", left: "", right: "", want: true},
    {name: "left empty", left: "", right: "a", want: false},
    {name: "right empty", left: "a", right: "", want: false},
  }

  for _, solution := range solution3s {
    for _, tc := range tt {
      if got := solution.F(tc.left, tc.right); got != tc.want {
        t.Errorf("%s(%q, %q) = %v, want %v", solution.Name, tc.left, tc.right,
          got, tc.want)
      }
    }
  }
}

The fix is to ensure this can’t happen by checking the character sets.

// If the lengths of the character counts are unequal, then a character is
// missing from one string.  
if len(leftCounts) != len(rightCounts) {  
  return false  
}
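Folding that check into the counting code gives a corrected version. Here is a self-contained sketch (the name Solution3AFixed is mine; the original keeps these pieces in separate snippets):

```go
package main

import "fmt"

// Solution3AFixed counts the characters of each string and compares the
// resulting maps, first rejecting any pair whose key sets differ in size.
func Solution3AFixed(left, right string) bool {
	leftCounts := make(map[rune]int)
	for _, r := range left {
		leftCounts[r]++
	}

	rightCounts := make(map[rune]int)
	for _, r := range right {
		rightCounts[r]++
	}

	// If the key-set sizes differ, some character appears in only one string.
	if len(leftCounts) != len(rightCounts) {
		return false
	}

	for r, lc := range leftCounts {
		if rightCounts[r] != lc {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(Solution3AFixed("abc", "cba")) // true
	fmt.Println(Solution3AFixed("", "a"))      // false
}
```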

Verifying that the key sets are the same size, combined with checking that every count in leftCounts matches the corresponding count in rightCounts, is sufficient to show the strings are permutations; a symmetric check over rightCounts is unnecessary. For example, with left = "aab" and right = "abc", the key sets have sizes 2 and 3, so the size check catches the mismatch immediately.

There’s an even faster possibility, though.

// Strings of different lengths cannot have the same counts of characters.  
if len(left) != len(right) {  
  return false  
}

We can exit early when strings are of unequal lengths. This covers every case the key-set size check covers: if the lengths are equal but right contains a character missing from left, some other character must appear more often in left than in right, and the count comparison catches that. It is also faster, and it exits before we create the maps, so it is strictly better.

Benchmarking

But … how much better can we do? As with any optimization problem, we should define what “normal” input looks like before committing to this. As with Problem 1, we’ll make the initial simplifying assumption that all characters are ASCII.

We need to decide how we’re going to benchmark this.

It isn’t interesting to benchmark strings of unequal lengths, since that case is just a length check and there’s nothing to optimize, so our benchmarks will use strings of equal length.

The case of nearly equal strings will look significantly different from the case where the strings are completely different. So we should have one benchmark for strings we know are permutations, and one where they aren’t (or at least, where it is unlikely).

Map iteration in Go is random, so sometimes nearly equal strings will still be identified as such immediately, and for others it may take until the last iteration. We can’t really do anything about this so long as solutions involve maps, but it is a source of randomness that we should be aware of.

As we’ve seen, autocomplete does tend to get things reasonably right. In this case autocomplete did successfully figure out that I wanted to generate slices of data, incorporating functions I’d written in the same package (RandomString and RandomASCII). It just – amusingly – didn’t create a source of randomness, instead invalidly passing nil.

lengths := []int{1, 2, 5, 10, 20, 50, 100}
// Begin autocomplete
data := make([]LengthData, len(lengths))
for j, length := range lengths {
  data[j] = LengthData{length: length, data: make([]string, n)}
  for i := 0; i < n; i++ {
    data[j].data[i] = RandomString(nil, length, RandomASCII)
  }
}

The naive solutions are actually surprisingly slow:

BenchmarkS3Different/Solution3A2/1-16         10935601           100.3 ns/op
BenchmarkS3Different/Solution3A2/2-16          9316386           121.7 ns/op
BenchmarkS3Different/Solution3A2/5-16          5554606           209.3 ns/op
BenchmarkS3Different/Solution3A2/10-16         1032976          1103 ns/op
BenchmarkS3Different/Solution3A2/20-16          431102          2719 ns/op
BenchmarkS3Different/Solution3A2/50-16          185488          6323 ns/op
BenchmarkS3Different/Solution3A2/100-16          84954         13117 ns/op
BenchmarkS3Permutation/Solution3A2/1-16       10739696           105.8 ns/op
BenchmarkS3Permutation/Solution3A2/2-16        8149945           147.3 ns/op
BenchmarkS3Permutation/Solution3A2/5-16        3648794           321.4 ns/op
BenchmarkS3Permutation/Solution3A2/10-16        919902          1307 ns/op
BenchmarkS3Permutation/Solution3A2/20-16        379916          3124 ns/op
BenchmarkS3Permutation/Solution3A2/50-16        142298          7322 ns/op
BenchmarkS3Permutation/Solution3A2/100-16        78962         14562 ns/op

We’re taking around 100 ns to process each character, which is about 10 MB/s. So this solution could easily be a performance bottleneck in a system.

There’s also a few surprising properties the benchmarks reveal:

  1. There is a huge jump from 5 character strings to 10. This is likely due to needing to reallocate the array underlying the map when it grows beyond 8 keys.
  2. There isn’t a huge difference between checking strings which are permutations and strings which are not. The largest difference is at strings of length 5, about a 50% increase; it is otherwise roughly 5%-20%.
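The bucket-growth explanation in point 1 can be probed with testing.AllocsPerRun. This is a rough sketch, not a claim about the author’s setup; exact allocation counts vary across Go versions, but crossing 8 keys should force extra bucket allocations:

```go
package main

import (
	"fmt"
	"testing"
)

// allocsForN reports the average number of heap allocations needed to
// build a map with n distinct rune keys.
func allocsForN(n int) float64 {
	return testing.AllocsPerRun(100, func() {
		m := make(map[rune]int)
		for i := 0; i < n; i++ {
			m[rune('a'+i)]++
		}
	})
}

func main() {
	// A map holding more than 8 keys must grow past its initial bucket.
	fmt.Println("5 keys:", allocsForN(5))
	fmt.Println("10 keys:", allocsForN(10))
}
```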

The Array Solution

Given the massive success of arrays for Problem 1, does a similar optimization here also work?

func Solution3B(left, right string) bool {
  if len(left) != len(right) {
    return false
  }

  counts := make([]int, 128)
  for _, r := range left {
    // Keep track of character counts.
    counts[r]++
  }
  for _, r := range right {
    // If we see a character and there isn't at least one unaccounted for,
    // we can exit immediately.
    if counts[r] < 1 {
      return false
    }
    // One instance of this character in left is now accounted for in right.
    counts[r]--
  }

  return true
}

Instead of tracking two sets of keys and counts in separate maps, we track counts in a single array large enough for any 7-bit ASCII character. This has the advantage of not needing maps at all, and requires a single 1 KiB memory allocation (128 eight-byte ints).

For allocations smaller than a page of memory (normally 4 KiB), what really matters is the number of times you request memory rather than the amount requested. Further, we’re only referencing a single heap-allocated object – the counts slice – rather than two maps.

These results are much better – from 5x faster for small strings to about 100x faster for long strings.

BenchmarkS3Different/Solution3B/1-16         63929632            18.74 ns/op
BenchmarkS3Different/Solution3B/2-16         59162731            19.68 ns/op
BenchmarkS3Different/Solution3B/5-16         51334929            22.14 ns/op
BenchmarkS3Different/Solution3B/10-16        39485048            27.82 ns/op
BenchmarkS3Different/Solution3B/20-16        29531452            37.35 ns/op
BenchmarkS3Different/Solution3B/50-16        14044939            82.91 ns/op
BenchmarkS3Different/Solution3B/100-16        8684762           136.2 ns/op
BenchmarkS3Permutation/Solution3B/1-16       61517580            18.67 ns/op
BenchmarkS3Permutation/Solution3B/2-16       58104522            20.36 ns/op
BenchmarkS3Permutation/Solution3B/5-16       44546454            26.14 ns/op
BenchmarkS3Permutation/Solution3B/10-16      30614641            35.46 ns/op
BenchmarkS3Permutation/Solution3B/20-16      19843489            55.30 ns/op
BenchmarkS3Permutation/Solution3B/50-16       8272191           142.4 ns/op
BenchmarkS3Permutation/Solution3B/100-16      4962151           243.1 ns/op

These results work out to around 1-2.5 ns per character for the longer strings, a processing speed significantly faster than network transfer or disk reads.

I tried some other possible optimizations – such as using a raw array instead of a slice for counts – but they didn’t cause any noticeable improvement. This is unsurprising, as we allocate the same amount of memory in all cases. The CPU profile also doesn’t suggest any obvious avenues of improvement.
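For reference, the raw-array variant probably looked something like the following (a reconstruction, not the author’s exact code): the slice is swapped for a fixed [128]int, and as noted above it performs about the same.

```go
package main

import "fmt"

// Solution3BArray is Solution3B with a fixed-size array instead of a slice.
func Solution3BArray(left, right string) bool {
	if len(left) != len(right) {
		return false
	}

	var counts [128]int // zero-initialized; one slot per 7-bit ASCII character
	for _, r := range left {
		counts[r]++
	}
	for _, r := range right {
		// A character in right with no unmatched occurrence in left
		// means the strings are not permutations.
		if counts[r] < 1 {
			return false
		}
		counts[r]--
	}

	return true
}

func main() {
	fmt.Println(Solution3BArray("listen", "silent")) // true
	fmt.Println(Solution3BArray("aabb", "abcc"))     // false
}
```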

There aren’t obvious interesting changes to make without moving to Unicode, so this seems like a good place to stop.
