Thursday 30 January 2014

New Game Source Code: Star Pusher (Sokoban clone)

Here’s a Sokoban (“box pusher”) clone called Star Pusher. I’ve used the graphics from the Planet Cute collection. You’ll need Python (2 or 3) and Pygame installed to run it. Just download and unzip the files to the same directory. It comes with 201 levels from 
(UPDATE: I’ve updated the code so that it works on Linux. It was due to a \r\n newline issue. I’ve tested the game in Ubuntu and it works.)
The source code is designed to be readable so that new programmers can understand and modify it without much effort.

Virtual Caesar Cipher Wheel program

The Caesar Cipher Wheel is a paper cutout that can be used to perform encryption and decryption in the Caesar Cipher. However, if you don’t have a printer but do have Python and Pygame installed, you can use this Caesar Cipher Wheel program to rotate a virtual cipher disk instead.
There is also an online JavaScript Virtual Cipher Wheel.
You will need Python (2 or 3) and Pygame installed to run the program. Or, if you are running Windows, you can download the win32 binary.

scrape macy's deals using beautiful soup

scrape macy's deals using beautiful soup

Let me show you a tiny real example on how to use the bs4 (beautiful soup version 4) module of Python. Say we want to collect information about the hot deals from macy's. The URL is here. Well, you can see all the info in one page and copy-paste, but that's not our purpose. First you have to get the content of the page using the cute requests module.
import requests

url = 'http://bit.ly/19zWmQT'
r = requests.get(url)
html_content = r.text
Now start cooking the soup:
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content)
Now look at the html code (page source code) of the url. You will see that the offers are in a list (li) that has 'offer' as a css class name (and some other class names). So you can write the code in the following way:
offer_list = soup('li', 'offer')
Or you can write:
offer_list = soup.find_all('li', 'offer')
Another way to write this is:
offer_list = soup.select('li.offer')
Now run this loop:
for offer in offer_list:
    title = offer.find('h3').text
    url = offer.find('a')['href']
    description = offer.find('span').text
    promo_code = offer.find('span', class_='promo-code').text
    promo_date = offer.find('span', class_='end-date').text
    print title, url, description, promo_date, promo_code
You are done! :) 

Heap Sort Implementation in Python

Here I share my Python implementation of heap sort algorithm. Heap sort is O(n log n) sorting algorithm though quicksort is used more frequently. You should learn heap sort to implement priority queue or at least learn the internals of priority queue. :)

In the Python code, I directly followed the heap sort algorithm presented in Introduction to Algorithms by CLRS.
# heapsort

def max_heapify(li, i, heap_size):
  l = i * 2 #left(i)
  r = l + 1 #right(i)
  if l <= heap_size and li[l] > li[i]:
    largest = l
  else:
    largest = i
  if r <= heap_size and li[r] > li[largest]:
    largest = r
  if i != largest:
    li[i], li[largest] = li[largest], li[i]
    max_heapify(li, largest, heap_size)

def build_max_heap(li, heap_size):
  for i in range(heap_size/2 - 1, 0, -1):
    max_heapify(li, i, heap_size) 

def heap_sort(li, heap_size):
  build_max_heap(li, heap_size)
  for i in range(len(li[1:]), 1, -1):
    li[1], li[i] = li[i], li[1]
    heap_size = heap_size - 1
    max_heapify(li, 1, heap_size)

def main():
  # we shall consider the list from element 1, not 0
  li = [0, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
  heap_size = len(li[1:])
  heap_sort(li, heap_size)
  print li[1:]

if __name__ == "__main__":
  main()

Calendar Federation with an Exchange Hybrid

What is Exchange federation?

There are a number of federation types utilised in Microsoft technologies and specifically Office 365.  Federation ultimately is the creation of a “trust relationship” between two organisations so that information can be shared between them.
Similar to Active Directory Federation Services sharing and security tokens and identity information, Lync federation allowing separate organisations to see each other’s presence, setting up a federation between Exchange environments allows for calendar Free/Busy information to be exchanged (pardon the pun) between them.
Ultimately this is useful for people in different organisations to see each other’s calendar availability and schedule a meeting without having to correspond a great deal beforehand. We’ve all been in that situation where scheduling a meeting with more than one person in a different organisation – it’s not easy. Exchange calendar federation fixes that!

What it’s used for and how it works

While a hybrid effectively joins Exchange Server to Exchange Online to appear as a single environment for a single business – it is important to remember that these are separate environments and that the federation is simply providing a mechanism for information to flow simply and easily between them.
Exchange federation is also used to allow separate businesses to share Free/Busy information between each other. 

How does Exchange federation work in a hybrid environment

As described earlier, Exchange federation is used to exchange information between two environments. Prior to Exchange Server 2010 if a company wanted to share calendar information with another company, it had to go through a series of steps to set it up. One of these steps was to exchange account information for a service account which would be used to retrieve the requested information. Because this is not always desirable, Microsoft developed a service called the “Microsoft Federation Gateway”, hereafter referred to as the MFG.
The MFG acts as an authentication broker between both environments and explains where the term “Exchange federation” comes from. Requests from one organization to the other are “authenticated” through the MFG, and therefore these requests are federated. It does not matter if an organization wants to federate with a remote organization or with its hybrid counterpart in Office 365, the principle and how it works is the same. In fact, Exchange treats its hybrid counterpart as if it would be a remote organization which – to be technically correct – it is.
In order to be able to use this free MFG service from Microsoft, one has to setup a ‘trust’ with the MFG. Usually, this is something which has to be done manually. But in case of a hybrid deployment, the Hybrid Configuration Wizard automatically takes care of this.
Next to having a trust with the MFG, an organization has to setup a “relationship” with the other organization (or hybrid counterpart) by means of creating an Organization Relationship. This organization relationship is an object in Exchange which provides Exchange with more information on what URI to contact the other Exchange organization and on what information can be exchange between both environments.
In a hybrid deployment both the Exchange on-premises environment and Exchange Online have an organization relationship which look similar to what you see in the following image:
 
The on-premises organization has an organization relationship with information which points to Office 365 and the Exchange online organization has information about the on-premises deployment.
Now, let’s have a high-level look at how Exchange uses all these components to query Free/Busy information in a hybrid deployment. Take a look at the following image:
 
  1. User ‘Loryan’ requests Free/Busy information for Michael’s mailbox (michael@contoso.com). This request is made at the on-premises Exchange server.
  2. The Exchange server will lookup michael@contoso.com and find a mail-enabled user object. This object has a targetAddress attribute which points to Michael’s mailbox in Office 365 (michael@contoso.mail.onmicrosoft.com).
  3. The Exchange server will now lookup its organizational relationships and verify if it has one for contoso.mail.onmicrosoft.com
  4. Now, the Exchange server will contact the Microsoft Federation Gateway and request a authentication token for contoso.mail.onmicrosoft.com
  5. The MFG sends back a token because contoso.com has a trust relationship with the MFG.
  6. The on-premises Exchange server now uses the information it obtained from the Organization Relationship to do an Autodiscover request for contoso.mail.onmicrosoft.com in order to retrieve the remote Exchange Web Services endpoint it should connect to. It then uses the address it received to send the Free/Busy request to.
  7. Exchange online will first check whether the MFG token which the Exchange on-premises has sent across with is valid before accepting the Free/Busy Request.
  8. Exchange online will also verify its organization relationship so that it knows what information it is allowed to return
  9. Exchange online queries Michael’s mailbox for the Free/Busy information
  10. Exchange online sends back the Free/Busy information to the on-premises organization
  11. The on-premises Exchange server sends back the request Free/Busy information to Loryan
As you can see, there are quite some ‘moving parts’ in requesting a remote user’s Free/Busy information. The same process is applied when an Exchange Online user queries the Free/Busy information for an on-premises user.

 

External organisational Free/Busy constraints in Exchange hybrid scenario

Now that we know how Exchange queries Free/Busy information, let’s have a look at the following scenario.
Contoso and Paradyne wish to share Free/Busy information with one another. In order to do so, they decide to setup everything that is necessary to make this work using Exchange Federation. Paradyne, however, is also enrolled in a hybrid deployment. In this particular scenario, Contoso users will not be able to request Free/Busy information for user in Paradyne’s organization if they are Exchange online users.
Take a look at the following scenario in which Loryan – working for Contoso – tries requesting Free/Busy information for Michael whose mailbox is hosted in the Exchange Online tenant of Paradyne:
 

  1. User ‘Loryan’ requests Free/Busy information for Michael’s mailbox (michael@paradyne.com). This request is made at the on-premises Exchange server.
  2. Given that the Exchange server isn’t authoritative or otherwise configured for paradyne.com it will lookup whether it has an organization relationship for that domain.
  3. Now, the Exchange server will contact the Microsoft Federation Gateway and request a authentication token for paradyne.com
  4. The MFG sends back a token because there’s a trust relationship with the MFG.
  5. The on-premises Exchange server now uses the information it obtained from the Organization Relationship to do an Autodiscover request for paradyne.com in order to retrieve the remote Exchange Web Services endpoint it should connect to. It then uses the address it received to send the Free/Busy request to.
  6. The Paradyne Exchange server will first check whether the MFG token which the Exchange on-premises has sent across is valid before accepting the Free/Busy Request.
  7. The Paradyne Exchange server will also verify its organization relationship so that it knows what information it is allowed to return.
  8. The Exchange server now looks up the recipient Michael@paradyne.com and discovers that object has a targetAddress stamped to it, which points to the cloud.
  9. The Paradyne Exchange server now sends back that information to the Exchange server at Contoso.
There’s no 10th bullet. The Contoso Exchange server now receives back information which it doesn’t know how to handle. In fact, even if Exchange would know how to handle the information (which isn’t Free/Busy information but rather a redirect), there would still be an issue because the Contoso Exchange organization doesn’t have an organization relationship with the Exchange Online tenant of Paradyne. So, as a result, this scenario is sort-of broken.

Workaround

With Office 365 becoming more and more popular, it’s more likely you will encounter such a scenario sooner rather than later. As long as the remote organization is an Exchange Online-only deployment, everything will work fine. In the case where the remote organization has a hybrid deployment, there’s one thing you can do: create (additional) separate Organization Relationships to and from the remote organization’s Exchange Online tenant. This allows you to bypass the limitation we discussed earlier. However, in order to query those user’s Free/Busy information, you would also need to use their target addresses, e.g.michael@paradyne.mail.onmicrosoft.com instead of their regular email address (michael@paradyne.com).
This workaround is far from perfect, but will at least allow you to query Free/Busy information for the remote organization’s Exchange Online users. The fact that you will need to know what users are hosted in Exchange online and what their targetAddresses are, is just something you’ll have to learn with until Exchange gets updated with the code required to handle these ‘redirects’.

Sunday 26 January 2014

How does one 'Get that job at Google'!

                How does one 'Get that job at Google'!

Following post is inspired from Steve Yegge's legendary post on prepping for Google Interviews.
                                 
Coding Knowledge C/C++ and Java are the preferred programming languages for Google Interviewers. You must know at least one of them really well. You will be expected to write code in the phone screen interviews and in the onsite interviews as well.


Recommended books for CS interviews: 

Recommended websites for coding practice: InterviewStreetTopcoder


Big-O This should be the starting point in preparing for an algorithmic interview. You must not struggle with basic complexity analysis, as it will guarantee not being hired. You should be familiar and understand the O, Θ and Ω notations. I recommend reading section on complexity analysis from Data Structures and Algorithms book.



Sorting You should be able to write algorithms O(n*lgn) like QuickSort and MergeSort with ease. Compare and understand the best, worst and average case complexities. I found this table on wiki to be very handy; it lists important properties of all sorting algorithms. Don’t neglect the basic O(n^2) algorithms like Bubble sort or Insertion sort, since other algorithms improve over these. Interviews are more about improving a basic idea, sorting algorithms will help with this process.



Hash Tables When in doubt, think of hash tables. They are useful in most of the problems and frequently help us improve the time complexity of some problems by caching results.


Trees Go through basic tree construction, traversal and manipulation algorithms. You should be able to implement algorithms based on binary search trees. You should be familiar with balanced trees although you are not expected to write code for them in the interview: AVL trees, Red-Black trees, Trie, n-ary trees etcThorough knowledge about inorder, postorder and preorder traversals is necessary, because we can solve many tree problems by doing simple modifications to one of these traversals.



Graphs 

  • Graphs are a very important concept in Computer Science. Practice the three basic representation of graphs (objects and pointers, matrix, and adjacency list) and familiarize yourself with their pros & cons. 
  • There is not much time during the interview so you should not expect something very complex. However, basic graph traversal algorithms (DFS and BFS) are a must, you should implement them in all basic representations. 
  • You should be able to implement the Dijkstra or Floyd-Warshall algorithms as well as minimum spanning tree algorithms (Kruskal and Prim). Learn about topological sorting, since it is surprisingly very useful in many ordering problems.




Dynamic Programming This is probably the most important subject as the implementations are small. You should be able to implement 2-3 dynamic algorithms during a 35-40 minutes time. As you’ll check the resources on this blog or on the web, you’ll find that you should expect at least one dynamic programming question per interview.



Operating Systems Learn about processes, threads and concurrency issues. Know about mutexes, semaphores, monitors and how they work. Understand what deadlock and livelock are and how to avoid them. Learn about context switching, scheduling etc.



Mathematics You should familiarize yourself with counting, combinatorics and probability.



Google's publications Read up Google's path-breaking publications listed below if you have time.


I hope you find the above article useful :) Enjoy!

Strand Life Sciences - Algorithm Test



            

                    Strand Life Sciences - Algorithm Test

                           


Duration: 1 hour
Type: Pen-and-paper based


1.  void star(int i) {
if (i > 1) {
star(i/2);
star(i/2);
}
cout << "hello" << endl;
  }

  int main() {
star(5);
  }
How many times does "hello" gets printed?

2. Rank functions (3/2)^n, 2^nn^3, n! in order of increasing Big-O

3. Find maximum number of partitions in which the 2D-plane is divided by n lines. Extend this argument to V-shaped figures instead of lines. Assume the V-shaped figures extend infinitely

4. Given 2 strings, check if they are anagrams

5. Find all permutations of a string

6. Find maximum sum contiguous sub-sequence in an array of +ve and -ve numbers