Showing posts with label Interview Question. Show all posts
Showing posts with label Interview Question. Show all posts

May 03, 2011

Full Question:
You are given a the source to a application which is crashing when run. After running it 10 times in a debugger, you find it never crashes in the same place. The application is single threaded, and uses only the C standard library. What programming errors could be causing this crash? How would you test each one?

In most case, it should come from incorrect use of memory. If the codes is never changed while using debugger, time of random numbers are problematic. For example, for index of array, it might be using time, or perhaps random numbers. Or it does nasty things with memory. So 'out of bounds of array' might be shown.

Other possible answers can exist. For example,

There might be multiple recursions in the code which are causing a stack overflow, cause "Out Of Memory".

A probable cause would be an uninitialized pointer/variable being used. Or it is possible that pointer variables have wrong value.

Write a Regular Expression Which Matches an Email Address

Full Question:
Write a regular expression which matches an email address.

In computing, a regular expression, also referred to as regex or regexp, provides a concise and flexible means for matching strings of text, such as particular characters, words, or patterns of characters. A regular expression is written in a formal language that can be interpreted by a regular expression processor, a program that either serves as a parser generator or examines text and identifies parts that match the provided specification.

Following Table shows basic meta-character for regular expression.

MetacharacterDescription
.Matches any single character (many applications exclude newlines, and exactly which characters are considered newlines is flavor, character encoding, and platform specific, but it is safe to assume that the line feed character is included). Within POSIX bracket expressions, the dot character matches a literal dot. For example,a.c matches "abc", etc., but [a.c] matches only "a", ".", or "c".
[ ]A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", or "z", as does [a-cx-z].
The - character is treated as a literal character if it is the last or the first (after the ^) character within the brackets: [abc-][-abc]. Note that backslash escapes are not allowed. The ] character can be included in a bracket expression if it is the first (after the ^) character: []abc].
[^ ]Matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than "a", "b", or "c". [^a-z] matches any single character that is not a lowercase letter from "a" to "z". As above, literal characters and ranges can be mixed.
^Matches the starting position within the string. In line-based tools, it matches the starting position of any line.
$Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line.
BRE: \( \)
ERE: ( )
Defines a marked subexpression. The string matched within the parentheses can be recalled later (see the next entry, \n). A marked subexpression is also called a block or capturing group.
\nMatches what the nth marked subexpression matched, where n is a digit from 1 to 9. This construct is theoretically irregular and was not adopted in the POSIX ERE syntax. Some tools allow referencing more than nine capturing groups.
*Matches the preceding element zero or more times. For example, ab*c matches "ac", "abc", "abbbc", etc. [xyz]* matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on. \(ab\)* matches "", "ab", "abab", "ababab", and so on.
BRE: \{m,n\}
ERE: {m,n}
Matches the preceding element at least m and not more than n times. For example, a\{3,5\} matches only "aaa", "aaaa", and "aaaaa". This is not found in a few older instances of regular expressions.

Additionally, next three metacharacters are used to simplify regular expression.

MetacharacterDescription
?Matches the preceding element zero or one time. For example, ba? matches "b" or "ba".
+Matches the preceding element one or more times. For example, ba+ matches "ba", "baa", "baaa", and so on.
|The choice (aka alternation or set union) operator matches either the expression before or the expression after the operator. For example, abc|def matches "abc" or "def".

So, if username of Email address is allowed to use alphabets, digits, and some symbols, then a possible regular expressions is:
[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}
Assuming that [word] means combination of alphabets and allowed symbols, then
\word+@\word+\.[\word]{2,4}

If interviewer ask a particular format of Email address, you can modify/change this. For example, only digit is allowed for username and it should be ended com or net, then
\digit+@\word+.(com|net)

How are cookies passed in the HTTP protocol?

Full Question:
How are cookies passed in the HTTP protocol?


A cookie, also known as a web cookie, browser cookie, and HTTP cookie, is a piece of text stored on a user's computer by their web browser. A cookie can be used for authentication, storing site preferences, shopping cart contents, the identifier for a server-based session, or anything else that can be accomplished through storing text data.

A cookie consists of one or more name-value pairs containing bits of information, which may be encrypted for information privacy and data security purposes. The cookie is sent as a field in the header of the HTTP response by a web server to a web browser and then sent back unchanged by the browser each time it accesses that server.

Cookies may be set by the server with or without an expiration date. Cookies without an expiration date exist until the browser terminates, while cookies with an expiration date may be stored by the browser until the expiration date passes. Users may also manually delete cookies in order to save space or to address privacy issues.

As text, cookies are not executable. Because they are not executed, they cannot replicate themselves and are not viruses. However, due to the browser mechanism to set and read cookies, they can be used as spyware (see zombie cookie and evercookie for more details). Anti-spyware products may warn users about some cookies because cookies can be used to track computer activity—a privacy concern, later causing possible malware.

Most modern browsers allow users to decide whether to accept cookies, and the time frame to keep them, but rejecting cookies makes some websites unusable.

May 02, 2011

Describe the Algorithm for a Depth-First Graph Traversal.

Full Question:
Describe the algorithm for a depth-first graph traversal.

Graph traversal refers to the problem of visiting all the nodes in a graph in a particular manner. Tree traversal is a special case of graph traversal. In contrast to tree traversal, in general graph traversal, each node may have to be visited more than once, and a root-like node that connects to all other nodes might not exist.

Depth-First Search (DFS)
Depth-first search (DFS) is an algorithm for traversing or searching a tree, tree structure, or graph. One starts at the root (selecting some node as the root in the graph case) and explores as far as possible along each branch before backtracking.

Formally, DFS is an uninformed search that progresses by expanding the first child node of the search tree that appears and thus going deeper and deeper until a goal node is found, or until it hits a node that has no children. Then the search backtracks, returning to the most recent node it hasn't finished exploring. In a non-recursive implementation, all freshly expanded nodes are added to a stack for exploration.

*must see Breadth-First Search (BFS) to compare and understanding graph traversal deeply. Interviewer may ask BFS after DFS.

Below pseudo code shows DFS algorithm.
algorithm DFT(G,v):
label v as visited
for all edges e in G.incidentEdges(v) do
if edge e is unvisited then
w ← G.opposite(v,e)
if vertex w is visited then
label e as a discovery edge
recursively call DFT(G,w)
else
label e as a back edge

Not considering edges, the algorithm can be simple.
algorithm  DFT(x)
visit(x); x<-visited;
FOR each y such that (x,y) is an edge DO
IF y != visited THEN
DFT(y)
Full Question:
Write a C program which measures the speed of a context switch on a UNIX/Linux system.Context

Context switching roughly happens when either:

  • User process enters the kernel via system call or a trap (e.g. page fault) and requested data (e.g. file contents) is not yet available, so the kernel puts said user process into sleep state and switches to another runnable process.
  • Kernel detects that given user process consumed its full time quanta (this happens in code invoked from timer interrupt.)
  • Data becomes available for higher current priority process that is presently sleeping (this happens from code invoked from/around IO interrupts.)

The switch itself is one-way, so the best we can do in userland (I assume that's what you are asking) is to measure sort of an RTT, from our process to another and back. The other process also takes time to do its work. We can of course make two or more processes cooperate on this, but the thing is that the kernel doesn't guarantee that one of our processes is going to be picked next.

Below code which shows a possible solution is from stackoverflow.

// Compile with g++ latencybench.cc -o latencybench -lboost_thread-mt
// Should also work on MSVC and other platforms supported by Boost.

#include <boost/format.hpp>
#include <boost/thread/thread.hpp>
#include <boost/date_time.hpp>
#include <algorithm>
#include <cstdlib>
#include <csignal>

volatile bool m_quit = false;

extern "C" void sighandler(int) {
    m_quit = true;
}

std::string num(unsigned val) {
    if (val == 1) return "one occurrence";
    return boost::lexical_cast<std::string>(val) + " occurrences";
}

int main(int argc, char** argv) {
    using namespace boost::posix_time;
    std::signal(SIGINT, sighandler);
    std::signal(SIGTERM, sighandler);
    time_duration duration = milliseconds(10);
    if (argc > 1) {
        try {
            if (argc != 2) throw 1;
            unsigned ms = boost::lexical_cast<unsigned>(argv[1]);
            if (ms > 1000) throw 2;
            duration = milliseconds(ms);
        } catch (...) {
            std::cerr << "Usage: " << argv[0] << " milliseconds" << std::endl;
            return EXIT_FAILURE;
        }
    }
    typedef std::map<long, unsigned> Durations;
    Durations durations;
    unsigned samples = 0, wrongsamples = 0;
    unsigned max = 0;
    long last = -1;
    std::cout << "Measuring actual sleep delays when requesting " << duration.total_milliseconds() << " ms: (Ctrl+C when done)" << std::endl;
    ptime begin = boost::get_system_time();
    while (!m_quit) {
        ptime start = boost::get_system_time();
        boost::this_thread::sleep(start + duration);
        long actual = (boost::get_system_time() - start).total_milliseconds();
        ++samples;
        unsigned num = ++durations[actual];
        if (actual != last) {
            std::cout << "\r  " << actual << " ms " << std::flush;
            last = actual;
        }
        if (actual != duration.total_milliseconds()) {
            ++wrongsamples;
            if (num > max) max = num;
            std::cout << "spike at " << start - begin << std::endl;
            last = -1;
        }
    }
    if (samples == 0) return 0;
    std::cout << "\rTotal measurement duration:  " << boost::get_system_time() - begin << "\n";
    std::cout << "Number of samples collected: " << samples << "\n";
    std::cout << "Incorrect delay count:       " << wrongsamples << boost::format(" (%.2f %%)") % (100.0 * wrongsamples / samples) << "\n\n";
    std::cout << "Histogram of actual delays:\n\n";
    unsigned correctsamples = samples - wrongsamples;
    const unsigned line = 60;
    double scale = 1.0;
    char ch = '+';
    if (max > line) {
        scale = double(line) / max;
        ch = '*';
    }
    double correctscale = 1.0;
    if (correctsamples > line) correctscale = double(line) / correctsamples;
    for (Durations::const_iterator it = durations.begin(); it != durations.end(); ++it) {
        std::string bar;
        if (it->first == duration.total_milliseconds()) bar = std::string(correctscale * it->second, '>');
        else bar = std::string(scale * it->second, ch);
        std::cout << boost::format("%5d ms | %s %d") % it->first % bar % it->second << std::endl;
    }
    std::cout << "\n";
    std::string indent(30, ' ');
    std::cout << indent << "+-- Legend ----------------------------------\n";
    std::cout << indent << "|  >  " << num(1.0 / correctscale) << " (of " << duration.total_milliseconds() << " ms delay)\n";
    if (wrongsamples > 0) std::cout << indent << "|  " << ch << "  " << num(1.0 / scale) << " (of any other delay)\n";
}

Explain the Significance of "Dead Beef"

Full Question:
Explain the significance of "dead beef".

Dead Beef. It does not mean a real dead beef. Don't say I'm so sad of hearing about it.

0xDEADBEEF ("dead beef") is frequently used to indicate a software crash or deadlock in embedded systems. It is used by IBM RS/6000 systems, Mac OS on 32-bit PowerPC processors and the Commodore Amiga as a magic debug value. On Sun Microsystems' Solaris, it marks freed kernel memory. On OpenVMS running on Alpha processors, DEAD_BEEF can be seen by pressing CTRL-T. The DEC Alpha SRM console has a background process that traps memory errors, identified by PS as "BeefEater waiting on 0xdeadbeef".

DEADBEEF is a hexadecimal value that has was used in debugging back in the mainframe/assembly days because it was easy to see when marking and finding specific memory in pages of hex dumps. Most computer science graduates have seen this at least in their assembly language classes in college and that's why they expect software engineers to know it.

What is the Difference Between a Mutex and a Semaphore?

Full Question:
What is the difference between a mutex and a semaphore? Which one would you use to protect access to an increment operation?

  • Mutex is typically used to serialize access to a common resource while a semaphore is a number of concurrent accesses.
  • Mutex is like a semaphore with a count of one.
  • Mutex only allows a single thread to have access while semaphores can be concurrently signaled by any thread or process.
  • Semaphores are ideal for synchronization and often used for event notification and mutual exclusion while mutex is only applied for mutual exclusion.
  • Both mutexes and semaphores are used to control access to a shared resource – most often in multithreading scenarios. A mutex is used when only one thread or process is allowed to access a resource and a semaphore is used when only a certain set limit of threads or processes can access the shared resource. Essentially a mutex is a semaphore where the limit is set to 1.
  • Which one would I use to protect access to an increment operation? In the general scenario I would use a mutex. But, in some programming languages support incremental function or methods for this kind of situation, for example Interlocked.Increment in C#.