I was working on a project that used Google spreadsheets to store data. The catch was that the API wrapper referred to the columns in the spreadsheet by number. However, columns in spreadsheets are named with letters. Once Z
is reached the next column is prefixed with an A
so it appears as AA
. This is base 26.
In order to make the conversion of number to column completely dynamic I had to convert the number to its base 26 letter value. It is also helpful to be able to reverse the operation whenever desired. This means taking the string AA
for instance and converting it into 27
.
Originally I wrote this code in Ruby, however I wanted to port it to JavaScript.
Generating a Char Range
If you’ve ever worked with Ruby ranges you know how easy it is to generate a character range in Ruby. It is also just as simple to convert that range into an array.
# Ruby
alpha = ('a'..'z').to_a
Not so much in JavaScript. There is no native way to generate a char range in JavaScript. So the first step was to generate an array of all the characters from a
to z
.
While JavaScript doesn’t have native ranges, it does have some easy to work with unicode functions. Specifically the string.charCodeAt
method.
This bit of code works because the unicode integer representations of characters are in alphabetical and numerical order.
// JavaScript
function charRange(start, stop) {
var result = [];
// get all chars from starting char
// to ending char
var i = start.charCodeAt(0),
last = stop.charCodeAt(0) + 1;
for (i; i < last; i++) {
result.push(String.fromCharCode(i));
}
return result;
}
Now we have a function that can produce an array like this:
['a', 'b', 'c', 'd', ... 'z']
This gives us all the tools we need to do base 26 conversions.
Base 10 Integer to Base 26 String
Now comes the fun part (I bet you were waiting for the fun part). Let’s take that char array and use it to convert an integer like 334123303
into abcdefg
. Why? Because you never know when you might need to do that.
First we need the array of alphabetic chars to work with.
var alpha = charRange('a', 'z');
We’ll also need a variable to which we’ll concatenate our various letters.
var result = '';
Now it’s time to convert a given string to an integer. For our purposes we’ll assume the string is all lowercase alphabetical characters.
Let’s think about this strategically. We need to think in base 26. To quickly illustrate what that means, let’s think in binary (base 2). The number 2
is represented by 10
. This is because what we consider to be the tens position now has a value of 2
for each increment. If it were base 26 then each increment of the tens position would have a value of 26
. However, there aren’t 26 different numbers to represent all of those values in the ones place.
The solution to this problem is to use letters in place of numbers. There are 26 letters so it works perfectly. This means 3
will now be c
(this is assuming a
is 1
).
If we were finding the value of each position in base 10 we’d need to divide by 10
until we get 0
, each time taking the remainder as the value of that position.
Example:
var n = 1234;
var quotient = Math.floor(n / 10);
//=> 123
var remainder = n % 10;
//=> 4
We repeat this process and eventually we’ll get the value of each position, 4
, 3
, 2
, 1
. However we want letters not numbers and we’re using base 26 not base 10. Luckily the approach doesn’t change much. Here is the function that will convert an integer to a base 26 string:
function toString26(num) {
var alpha = charRange('a', 'z');
var result = '';
// no letters for 0 or less
if (num < 1) {
return result;
}
var quotient = num,
remainder;
// until we have a 0 quotient
while (quotient !== 0) {
// compensate for 0 based array
var decremented = quotient - 1;
// divide by 26
quotient = Math.floor(decremented / 26);
// get remainder
remainder = decremented % 26;
// prepend the letter at index of remainder
result = alpha[remainder] + result;
}
return result;
}
The key take away here is the use of the quotient and remainder. The quotient represents what place we’re in. The remainder represents the value of that place.
When we reach a quotient of 0
we know we’re done. No more places to divide by 26
.
We compensate for the array being zero indexed by subtracting 1
from the quotient each iteration. So 26
will become 25
and return a value of z
at index 25
of the array.
As you saw above, we get the ones place first, tens place next, and etc. So we prepend the value to our result string to get the correct order.
That’s it! This approach is also fast.Indexing into an array is very fast, O(1)
. In toString26
we’re dividing in our iteration so we’re chunking large numbersdown by a big amount. Each time we iterate the new number we’re dividing is the quotient
from the last iteration. The time complexity comes out to be O(log n)
. It’s worth noting that the base of the log
is 26
and makes this faster than a base of 2
. But it is a constant and therefore left out.
Base 26 String to Base 10 Integer
Next let’s convert that integer back into a base 26 string. We’ll need a similar group of variables to work with at the beginning. An alphabet array and a result integer to accumulate the sum.
var alpha = charRange('a', 'z');
var result = 0;
Next we’ll need to iterate over each of the characters of the string. Each character represents a position in a base 26 number. This means the last position represents n
multiplied by 26
to the power of it’s string index 0
. This will boil down to n * 1
because any number to the power 0
is 1
. Next we’ll get this equation n * 26
, and then n * (26^2)
, and etc.
We’ll compensate for the zero indexed array in this case by incrementing position. But we must do it AFTER we retrieve the letter from the alpha
array.
The trick to this bit of code is to iterate backwards over the string with i
, but still have a forward iterator j
to set the power of 26
by which we multiply the position.
Here is the code:
function toInt26(str) {
var alpha = charRange('a', 'z');
var result = 0;
// make sure we have a usable string
str = str.toLowerCase();
str = str.replace(/[^a-z]/g, '');
// we're incrementing j and decrementing i
var j = 0;
for (var i = str.length - 1; i > -1; i--) {
// get letters in reverse
var char = str[i];
// get index in alpha and compensate for
// 0 based array
var position = alpha.indexOf(char);
position++;
// the power kinda like the 10's or 100's
// etc... position of the letter
// when j is 0 it's 1s
// when j is 1 it's 10s
// etc...
var power = Math.pow(26, j);
// add the power and index to result
result += power * position;
j++;
}
return result;
}
The time complexity of toInt26
is O(n)
where n
is the length of the string passed to the function. Keep in mind that while O(n)
is not a fast time complexity in this context it is completely acceptable and the function itself is very fast. A string of the entire alphabet produces an integer value of 256094574536617744129141650397448476
. The function above would only need 26
iterations to produce that number. While time complexities of O(n)
can seem slow on the surface it is important to realize how they are applied to truly weigh their effectiveness.
Also O(n)
may be the time complexity in reference to the length of the string. However the underlying time complexity for toInt26
is really O(log n)
. This is because we are really iterating over a string representation of a number. Effectively, by analyzing each character in reverse we are performing a similar operation to toString26
which has a time complexity of O(log n)
. It depends on how you view the problem.
Test cases
Here are a few test cases to try out our new base 26 conversion functions:
var str,
num;
num = toInt26('a');
str = toString26(num);
console.log(str, num);
//=> a 1
num = toInt26('z');
str = toString26(num);
console.log(str, num);
//=> z 26
num = toInt26('aa');
str = toString26(num);
console.log(str, num);
//=> aa 27
num = toInt26('abc');
str = toString26(num);
console.log(str, num);
//=> abc 731
num = toInt26('abcdefg');
str = toString26(num);
console.log(str, num);
//=> abcdefg 334123303
num = toInt26('abcdefghijklm');
str = toString26(num);
console.log(str, num);
//=> abcdefghijklm 103215959525275440
NOTE: It has recently come to my attention that the JavaScript versions of these functions seem to stop working correctly around 'abcdefghijklmn'
. This has to do with JavaScript not supporting very large integers. Under the hood, integers that are larger than the max integer value are converted to scientific notation which breaks the function.
A solution to this problem may appear later, but is a good talking point for the moment.
BROKEN CODE:
num = toInt26('abcdefghijklmn');
console.log(num);
//=> 2683614947657161700
str = toString26(num);
num = toInt26(str);
console.log(str, num);
//=> abcdefghijkmcc 2683614947657161700 // <<<< WRONG!!!!
Here’s the kicker, the Ruby version of this code does NOT suffer from this side effect. This is providing that the to_s26
method is defined on Bignum
and Fixnum
.
Here is the Ruby code:
module Base26
ALPHA = ('a'..'z').to_a
module Number
def to_s26
# initialize an empty string
str = ''
# return that string if
# value is not at least 1
# which would be 'a'
return str if self < 1
# initialize quotient to this value
quotient = self
# until we get to 0
until quotient.zero?
# get the result and remainder
# of (q - 1) / 26
quotient, remainder = (quotient - 1).divmod(26)
# prepent that str with
# the value at the index of
# the remainder
str.prepend(ALPHA[remainder])
end
str
end
end
end
class Fixnum
include Base26
include Base26::Number
end
class Bignum
include Base26
include Base26::Number
end
class String
include Base26
def to_i26
# initialize the return value to 0
result = 0
# ensure we only have lowercase alpha chars
downcase!
gsub!(/[^a-z]/, '')
# iterate backwards over string
(1..length).each do |index|
# get char at index
char = self[-index]
# get numeric position in alpha
position = ALPHA.index(char)
# get the value of 26
# to the power of the current index
power = 26**(index - 1)
# compensate for array being 0 indexed
position += 1
# add to result
result += power * position
end
result
end
end
if __FILE__ == $0
p({
:original => 'foobar',
:to_i26 => 'foobar'.to_i26,
:to_s26 => 'foobar'.to_i26.to_s26
})
p({
:original => 1234567890,
:to_s26 => 1234567890.to_s26,
:to_i26 => 1234567890.to_s26.to_i26
})
p({
:original => 'abcdefghijklmnopqrstuvwxyz',
:to_i26 => 'abcdefghijklmnopqrstuvwxyz'.to_i26,
:to_s26 => 'abcdefghijklmnopqrstuvwxyz'.to_i26.to_s26
})
end
Yet another reason why Ruby is just awesome.
Conclusion
Math problems like this are interesting to implement in various languages. Each language provides a different interface to alter strings and work with various sizes of numbers.
I found that implementing this algorithm in various languages gave me a deeper understanding of it’s solution. It is also a great way to get familiar with the features and limitations of a language. I hope you enjoyed and learned something interesting.