I’m preparing to go back on the software job market, and to sharpen my skills I’m going back through books I initially worked through as a novice. I’ve got over a decade more experience now, and so I want to approach these problems as a seasoned developer – to try and find something more interesting than just solutions to problems that could be uncovered in half an hour.
To begin, I’m starting with Cracking the Coding Interview, by Gayle Laakmann McDowell.
“Problem 1.1 Implement an algorithm to determine if a string has all unique characters. What if you cannot use additional data structures?”
The first part of this question is straightforward to do with a hash map. In fact, it’s so straightforward that generative AI can get you most of the way towards the answer.

// Solution1A is a straightforward attempt to solve the problem with a hashmap.
func Solution1A(s string) bool {
	seen := make(map[rune]bool)
	for _, r := range s {
		if seen[r] {
			return false
		}
		seen[r] = true
	}
	return true
}
Benchmarking
This wouldn’t be a fun exploration of the problem if we didn’t include some benchmarks. Suppose this uniqueness-detecting algorithm were in a location critical to performance. How well do we do, and can we do any better?
So we need some test data. Ideally test data for benchmarking reflects actual call patterns. Since the input data are strings, the main properties of the input that matter for performance are:
- the distribution of string lengths,
- the set of possible characters, and
- the distribution of characters.
Possible length distributions:
- All the same
- Uniform in a range
- Poisson distribution (English word lengths appear to have this distribution)
Possible character sets:
- All-caps letters (26 possibilities)
- Upper and lowercase letters (52 possibilities)
- All printable ASCII (95 possibilities)
- All Unicode characters as of Unicode 16.0 (155,063 possibilities)
Possible character distributions:
- Uniform distribution
- Pareto distribution (similar to actual text?)
Like with any good optimization problem, different choices here may result in different optimal solutions. Some may even completely eliminate certain solutions.
To start, we’ll go with:
- All strings are of length 12
- All characters are printable ASCII
- All characters are equally likely
Twelve is a nice number of characters to begin with, as approximately half of all strings will have a repeated character. Using the birthday problem formula, the probability that all twelve characters are distinct is:
       95!
------------------ ~= 0.484
(95-12)! x 95^12
For the benchmark we’ll generate 1,024 random strings, and cycle through evaluating them.
const (
	n     = 1 << 10
	nMask = n - 1
)

func BenchmarkSolution1(b *testing.B) {
	rng := rand.New(rand.NewSource(time.Now().UnixNano()))
	randomStrings := make([]string, n)
	for i := 0; i < n; i++ {
		randomStrings[i] = RandomString(rng, 12, RandomASCII)
	}
	for solution, f := range solutions {
		b.Run(solution, func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				f(randomStrings[i&nMask])
			}
		})
	}
}
Hash Map Solution
As it turns out, while our initial solution works, it’s pretty slow for what it does:
BenchmarkSolution1/Solution1A
BenchmarkSolution1/Solution1A-64 1958344 602.7 ns/op
So … why is it slow? For this, we turn to Go’s handy builtin CPU profiler, which go test can enable while running the benchmark:
go test ./chapter1/ -run=NONE -bench=BenchmarkSolution1/Solution1A -cpuprofile=cpu.out

The big things to notice here are:
- Only 89% of the time is Solution1A running (chapter1.Solution1A)
- 68% of the time is creating new entries in the map (runtime.mapassign_fast32)
- 16% of the time is accessing map entries (runtime.mapaccess1_fast32)
- 2% of the time is unavoidable runtime calls (runtime.rand)
Note that the call to runtime.rand is not related to the benchmark setup code – you can see this for yourself: remove the random string generation, and the call is still present in the profile.
So in total, 84%/89%, or about 94%, of our time is spent dealing with the hash map. Even if we made all the other parts of our logic free – taking zero time – we’d save at best about 36ns out of over 600ns. If there is an improvement to be had, it’s either in how we’re using the hash map or in not using a hash map at all.
For better ways of using the hash map, there are two easy things to try: shrink the keys, or preallocate the map so it never needs to grow. Shrinking the keys from runes to bytes may save time in memory allocation and in the hash function itself. Preallocating the map at size 12 (which it should never grow beyond, as all strings have exactly 12 characters) saves the cost of growing it. Implemented:
// Solution1A1 is a slight modification to Solution1A where runes are reduced to
// bytes as all are assumed to be ASCII.
func Solution1A1(s string) bool {
	seen := make(map[byte]bool)
	for _, r := range s {
		if seen[byte(r)] {
			return false
		}
		seen[byte(r)] = true
	}
	return true
}

// Solution1A2 is a slight modification to Solution1A where maps are initialized
// to size 12 to avoid needing to reallocate the underlying array.
func Solution1A2(s string) bool {
	seen := make(map[rune]bool, 12)
	for _, r := range s {
		if seen[r] {
			return false
		}
		seen[r] = true
	}
	return true
}
Trying these with our benchmark:
BenchmarkSolution1/Solution1A-64 2068266 617.7 ns/op
BenchmarkSolution1/Solution1A1-64 1332189 939.7 ns/op
BenchmarkSolution1/Solution1A2-64 2088864 572.0 ns/op
We can see that there is a notable regression from converting runes to bytes, and a slight improvement from avoiding reallocating the map. (I tried other preallocation sizes, such as 11, and saw no further improvement.)
There could be other improvements using Go’s native hash map, but I’m not aware of them.
Array Solution
Notice that we have a fairly small number of possible characters – 95 – and that their byte values range from 32 to 126, all under 128. Thus, if we initialize a bit array of length 128, we can record in it whether we’ve seen the character corresponding to each value.
// Solution1B is like the above but uses a bit array.
func Solution1B(s string) bool {
	seen := make([]bool, 1<<7)
	for _, r := range s {
		if seen[r] {
			return false
		}
		seen[r] = true
	}
	return true
}
We can immediately see a drastic improvement:
BenchmarkSolution1/Solution1A-64 1892926 626.3 ns/op
BenchmarkSolution1/Solution1A1-64 1231104 948.8 ns/op
BenchmarkSolution1/Solution1A2-64 2023808 573.0 ns/op
BenchmarkSolution1/Solution1B-64 82965651 13.96 ns/op
Our bit array is about 45x faster than our hash map solution. Looking at the CPU profile, we can see that nearly all of the time is spent in the solution logic (0.94s of 0.96s) – any further gains will have to come from inside that function.

But what gains are even possible?
One possibility is avoiding the heap allocation incurred by creating a slice on each call, by using a fixed-size array instead.
// Solution1B1 is like the above but uses an array directly instead of a slice.
func Solution1B1(s string) bool {
	var seen [1 << 7]bool
	for _, r := range s {
		if seen[r] {
			return false
		}
		seen[r] = true
	}
	return true
}
This results in a slight (but reliable) slowdown:
BenchmarkSolution1/Solution1B-64 89289081 13.75 ns/op
BenchmarkSolution1/Solution1B1-64 84424354 13.90 ns/op
Another possibility is to avoid allocating the slice with each call, instead copying empty data into it.
var (
	seenC  = make([]bool, 1<<7)
	seenC2 = make([]bool, 1<<7)
)

// Solution1B2 is like the above but reuses package-level slices, resetting
// seenC on each call by copying from the always-zero seenC2.
func Solution1B2(s string) bool {
	copy(seenC, seenC2)
	for _, r := range s {
		if seenC[r] {
			return false
		}
		seenC[r] = true
	}
	return true
}
This option also results in a slowdown, larger than before:
BenchmarkSolution1/Solution1B-64 89082355 13.53 ns/op
BenchmarkSolution1/Solution1B1-64 88801987 13.77 ns/op
BenchmarkSolution1/Solution1B2-64 76817073 15.82 ns/op
Alternatively, we could even hard code the number of iterations (since we know every string is of length 12). The idea here is that it avoids a quirk of how Go iterates over strings.
// Solution1B3 is like 1B but iterates a static number of times.
func Solution1B3(s string) bool {
	seen := make([]bool, 1<<7)
	for i := 0; i < 12; i++ {
		r := s[i]
		if seen[r] {
			return false
		}
		seen[r] = true
	}
	return true
}
This does result in a noticeable speedup, at the cost of making the code more brittle. It isn’t obvious to me why. One suspect is that range over a string decodes UTF-8 to produce the (index, rune) pair – even for pure-ASCII input – which is more work than simple byte indexing; or it could have to do with how Go checks whether iteration is done.
BenchmarkSolution1/Solution1B-64 85514907 14.02 ns/op
BenchmarkSolution1/Solution1B1-64 83047393 14.11 ns/op
BenchmarkSolution1/Solution1B2-64 74333500 16.27 ns/op
BenchmarkSolution1/Solution1B3-64 97826607 12.03 ns/op
However, we don’t actually have to make the code brittle to realize the same gain.
// Solution1B4 is like 1B but uses the string length to set the number of iterations.
func Solution1B4(s string) bool {
	seen := make([]bool, 1<<7)
	for i := 0; i < len(s); i++ {
		r := s[i]
		if seen[r] {
			return false
		}
		seen[r] = true
	}
	return true
}
In fact, this is even faster than the hard coded length. This is less surprising than it seems: len(s) isn’t recalculated each iteration – a Go string carries its length in its header – and looping to len(s) lets the compiler prove that s[i] is always in bounds, eliminating the bounds check. With the constant 12, the compiler can’t prove the string is that long, so the check stays.
BenchmarkSolution1/Solution1B-64 86110658 14.15 ns/op
BenchmarkSolution1/Solution1B1-64 86076312 14.01 ns/op
BenchmarkSolution1/Solution1B2-64 72450708 15.67 ns/op
BenchmarkSolution1/Solution1B3-64 99203356 11.65 ns/op
BenchmarkSolution1/Solution1B4-64 104699756 11.10 ns/op
That’s about as far as I want to take the bit array solution. What about the other part of the question – the part I’ve ignored until now?
No Additional Data Structures
“What if you cannot use additional data structures?”
For Go in particular, this is tricky as strings are immutable. This means that a natural solution – to sort the characters of the string and then check for duplicates – isn’t possible.
If you iterate through a string with range, you get the 4-byte rune type rather than the 1-byte byte type you get by indexing. Here it doesn’t matter, as for now we’re assuming all characters are ASCII – which means we can just index the string, even though that would be problematic if the system we’re designing for might ever encounter non-ASCII text.
// Solution1C avoids using other data structures by directly comparing pairs of characters.
func Solution1C(s string) bool {
	for i := 0; i < len(s); i++ {
		for j := i + 1; j < len(s); j++ {
			if s[i] == s[j] {
				return false
			}
		}
	}
	return true
}
Surprisingly this solution isn’t terrible compared to the other solutions:
BenchmarkSolution1/Solution1B4-64 103588170 11.66 ns/op
BenchmarkSolution1/Solution1C-64 39652282 28.48 ns/op
In fact, this solution *always* beats the hash map solution, despite technically having O(n^2) complexity. At 100-character strings:
BenchmarkSolution1/Solution1A2-64 1279503 940.2 ns/op
BenchmarkSolution1/Solution1B4-64 84820020 13.66 ns/op
BenchmarkSolution1/Solution1C-64 31667222 37.01 ns/op
At 1,000-character strings:
BenchmarkSolution1/Solution1A2-64 1247774 958.0 ns/op
BenchmarkSolution1/Solution1B4-64 79115478 15.28 ns/op
BenchmarkSolution1/Solution1C-64 31556229 38.59 ns/op
This is because of a fundamental limit of our problem – there are only 95 possible characters, so any string longer than 95 characters is guaranteed to contain a duplicate among its first 96. Despite being O(n^2) in theory, the pairwise comparison exits early and its work is effectively bounded.
Pareto Distribution
Suppose characters occur with frequencies following a Pareto distribution. There’s a lot of debate on how to model letter frequencies properly, so here I’ll pick parameters that seem to fit a letter distribution chart. What matters more is seeing what happens when a small number of characters are chosen much more often than the rest, while a long tail of characters remains in play. If you’re interested in replicating this, I’ve gone with X_m=3.0 and alpha=0.45, then offset the results by -3. The fit isn’t too awful, though it has a longer tail than actual letter frequencies – which means any measurements here are likely to be overestimates.

Perhaps unsurprisingly, all solutions perform better in this case. Given that some characters are much more common, there’s a significant likelihood of a common character appearing as the first or second character, and then again later in the string – so generally the logic allowing the algorithms to exit early triggers sooner than if characters are distributed uniformly.
BenchmarkSolution1/Solution1A2-64 2115301 563.4 ns/op
BenchmarkSolution1/Solution1B4-64 106364142 11.02 ns/op
BenchmarkSolution1/Solution1C-64 38778128 29.50 ns/op
BenchmarkSolution1P/Solution1A2-64 3728235 327.9 ns/op
BenchmarkSolution1P/Solution1B4-64 193359403 5.491 ns/op
BenchmarkSolution1P/Solution1C-64 226532450 5.123 ns/op
Surprisingly, the apparently O(n^2) solution is the fastest of the three when given data that better approximates actual text! This should reinforce two points:
- the time-complexity of an algorithm should not be the sole determiner of which algorithm is best, and
- measuring, especially with simulations of real world data, can change which algorithm is best for an application.
In Context
So we’re evaluating these strings for repeated characters, but where do these strings come from? At 5 or 11 ns/op, how likely is it that this check is the bottleneck?
Let’s hand wave away any sort of parsing (e.g. JSON parsing is very slow) – we’re either getting this as a raw byte stream from a connection, or we’re reading raw text from a file on disk.
Consider the slower option – uniformly distributed ASCII characters in strings of length 12 – since that gives a nice 1 byte/ns processed. A quick conversion shows this is equivalent to about 1 GB/s. For comparison, a mid-range SSD reads at about 500 MB/s, and a gigabit connection delivers only 125 MB/s. At those input rates, the uniqueness check is not a significant bottleneck.
Considering the hash map solution – even in the more lenient case of the non-uniform distribution, we get 12 bytes / 328 ns, which is about 36.6 MB/s. This is slow enough that it could reasonably be a bottleneck if almost nothing else is happening in the application.
Conclusion
There was a lot here, just looking at this problem from the perspective of Go. The O(n^2) solution being the best for normal-looking data surprised me – it strikes me as the kind of finding that’s interesting to bring up in an interview setting.
I doubt there’s more to be gained from this particular perspective on the problem. Next time I may consider other languages (like Python or Rust), or consider what happens if the characters could be any valid Unicode 16.0 character. Python and Rust may allow trying the sorting approach, and Unicode would likely invalidate the byte array solution. But validating that is what benchmarking is for!