Tuesday, 22 April 2014

Discussing Coding Interview Questions from Google

No. 13 - First Character Appearing Only Once


Problem: Implement a function to find the first character in a string which only appears once.
For example: It returns ‘b’ when the input is “abaccdeff”.

Analysis: Our native solution for this problem may be scanning the input string from its beginning to end. We compare the current scanned character with every one behind it. If there is no duplication after it, it is a character appearing once. Since it compares each character with O(n) ones behind it, the overall time complexity is O(n2) if there are n characters in a string.

In order to get the numbers of occurrence times of each character in a string, a data container is needed. It is required to get and update the occurrence time of each character in a string, so the data container is used to project a character to a number. Hash tables fulfill this kind of requirement. We can implement a hash table, in which keys are characters and values are their occurrence times in a string.

It is necessary to scan strings twice: When a character is visited, we increase the corresponding occurrence time in the hash table during the first scanning. In second round of scanning, whenever a character is visited we also check its occurrence time in the hash table. The first character with occurrence time 1 is the required output.

Hash tables are complex, and they are not implemented in the C++ standard template library. Therefore, we have to implement one by ourselves.

Characters have 8 bits, so there only 256 variances. We can create an array with 255 numbers, in which indexes are ASCII values of all characters, and numbers are their occurrence times in a string. That is to say, we have a hash table whose size if 256, with ASCII values of characters as keys.

It is time for programming after we get a clear solution. The following are some sample code:

char FirstNotRepeatingChar(char* pString)
{
    if(pString == NULL)
        return '\0';

    const int tableSize = 256;
    unsigned int hashTable[tableSize];
    for(unsigned int i = 0; i<tableSize; ++ i)
        hashTable[i] = 0;

    char* pHashKey = pString;
    while(*(pHashKey) != '\0')
        hashTable[*(pHashKey++)] ++;

    pHashKey = pString;
    while(*pHashKey != '\0')
    {
        if(hashTable[*pHashKey] == 1)
            return *pHashKey;

        pHashKey++;
    }

    return '\0';
}

In the code above, it costs O(1) time to increase the occurrence time for each character. The time complexity for the first scanning is O(n) if the length of string is n. It takes O(1) time to get the occurrence time for each character, so it costs O(n) time for the second scanning. Therefore, the overall time it costs is O(n).

In the meantime, an array with 256 numbers is created, whose size is 1K. Since the size of array is constant, the space complexity of this algorithm is O(1).

No comments:

Post a Comment