• Working through UNP


    As I wrote previously, I’ve been working through Richard Stevens’ Unix Network Programming (3rd Ed) Vol 1. It covers the basics of the Sockets API in UNIX and similar OSs.

    Unfortunately, I’ve mostly been able to devote time to this only on weekends. That means slow progress, because UNP goes into a lot of depth about everything. The depth is a good thing, but it also means that I have to reread and review the last few pages every time I pick up where I left off, which doesn’t help when I’m trying to understand concepts thoroughly. So what’s the solution? I’ll try to do a bit at least 3-4 days a week from now on.

    UNP is an amazingly detailed book – and as one of my colleagues said – “If you read that book properly, Stevens makes sure that there’s nothing left for you to know on the topic”. I agree.

    Stevens wrote his own wrapper functions over the common socket functions and used them in all the code examples in the book. The wrappers handle error codes and portability issues (like IPv4/IPv6). They are declared in a header, unp.h (available in the back of the book as well as online at http://www.unpbook.com/src.html).

    Some reviewers of the book gripe about this and say that this is an obstacle to learning the actual functions. But I think that there was no other way to do it without littering every example snippet with code for portability and error handling. The wrapper strategy makes it easier to follow the examples, and at the same time – as I found out – it makes you write those wrappers yourself. True, you can just include the unp.h header as you try the examples, but then you’ll never know what those functions are doing. I’ve found that creating my own header and writing the functions as I come across them, after looking at the book’s source code, works great. Most of them will end up identical to those in unp.h.

    I’m pushing the examples I’m trying out to GitHub – it’s a scratchpad, so not everything might compile.

    I’ve added a generic startserver function to my header – this takes a pointer to a function as an argument. The generic function sets up a server socket (bind/listen/accept), forks a child when a client connects, and calls the function that was passed in, abstracting out the actual serving part. The function pointer syntax was not hard to figure out – I’d read Peter van der Linden’s algorithm for unscrambling C declarations last week. Interesting how things add up!

  • Falling through


    When I was studying for my CS degree, the programming language we were taught was C. To be honest, I didn’t code much C in college – just what was necessary to scrape through the lab assignments. I did stay up one night writing Tetris in C with 2 close friends for an internal contest, but such occasions were rare. Pointers? Grokked them, but I suspect it was a very shaky understanding. For the rest of my years there, it was C for all lab projects. I didn’t dislike C, but I did find it tough for a first language. So as soon as I got a chance, I set out to do some side projects in Java (at least I thought I was writing Java). That was the end of my relationship with C, because my first (and current) job post-college was and still is in a Java shop.


    Fast forward 8 years. The products I’ve worked on till now have been in Java. Serverside Java, Infrastructure Java, Applications Java. An appserver, a desktop app, a SaaS app. Abstractions, APIs, SPIs, Specifications, Javadocs, Objects talking to each other – it’s a satisfying, pleasant world if you like high level order.


    I made occasional forays into Python and Ruby for some side projects and scripting.


    But lately, I’ve been wanting to code in a really low level language. Yes, like C. Let me rephrase that. I’ve been wanting to write code where I can be as close to the OS as possible, learn bit twiddling hacks, make low level system calls. I’m not sure what this desire was born out of. Maybe 8 years of Java. Maybe wanting low level control over what I was doing.

    So I’ve ploughed through K&R the last couple of months. Bought Deep C Secrets. Regrokked pointers – properly, this time. Had a brief period of not knowing where to go after this, which resolved itself in the form of Unix Network Programming (Richard Stevens). I’m working through Vol 1 right now. It’s slow going at times, but I’m loving every moment. I’ve fallen through – the abstractions, the bytecode, the layers of objects – and reached the ocean floor.

    <In the background, the Toccata ends, and the Fugue begins.>

  • Everybody’s Recommending


    Is Google the only one who has possibly accumulated a lot of data on your online activities?

    Think again.

    Most of us use one of these -

    1. Facebook
    2. Twitter
    3. ShareThis
    4. Technorati/Digg/et al

    There’s a common aspect to all these networks/tools – all of them can potentially collect data about the online preferences of their users. So – do they? Some of them do.

    Online preferences are links that you visit, which translates to things that you are interested in. This kind of data can be used to build up a profile of the user.

    Think about it -

    1.  Facebook knows what you share on facebook.com, knows what you “Like” among others’ shared links, and now with OpenGraph knows what you “Like” on sites that have the Facebook Like button.

    2.  Twitter knows what links you share, and now with t.co – its own shortening service – it will know what links shared by others you clicked on (read “interested in”). From a Twitter blog post -

    routing links through this service will eventually contribute to the metrics behind our Promoted Tweets platform and provide an important quality signal for our Resonance algorithm—the way we determine if a Tweet is relevant and interesting to users

    3. ShareThis – if you’re logged into ShareThis, it knows what you shared.

    Links you share and visit provide a picture, albeit incomplete, of your online preferences.

    The question is, how are these tools and services planning to use this data?

    If you know what someone likes, you can recommend stuff to that person. A lot of sites do this already. These recommendations are based on multiple parameters. E.g. Amazon’s recommendation system – which does a great job – uses collaborative filtering. Simply put, it uses data from your past purchases, ratings and I-Own-This history and from other users whose history is similar to yours. The more history you have on Amazon, the better your recommendations get.
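To make the idea concrete, here is a minimal sketch of collaborative filtering in its simplest user-based form. It is not Amazon's actual system: the class name, the user names, the placeholder items "book-x" and "book-y", and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: recommends items owned by users whose history overlaps yours.
class RecommenderSketch {

    // Jaccard similarity: number of shared items divided by number of distinct items.
    static double similarity(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return (double) common.size() / union.size();
    }

    // Scores every item the user does not own by the similarity of the users who do own it.
    static List<String> recommend(String user, Map<String, Set<String>> history) {
        Set<String> mine = history.get(user);
        if (mine == null) {
            return new ArrayList<>();
        }
        Map<String, Double> scores = new TreeMap<>();
        for (Map.Entry<String, Set<String>> other : history.entrySet()) {
            if (other.getKey().equals(user)) {
                continue;
            }
            double sim = similarity(mine, other.getValue());
            if (sim == 0.0) {
                continue; // nothing in common, so this user's history is ignored
            }
            for (String item : other.getValue()) {
                if (!mine.contains(item)) {
                    scores.merge(item, sim, Double::sum);
                }
            }
        }
        List<String> items = new ArrayList<>(scores.keySet());
        items.sort((x, y) -> Double.compare(scores.get(y), scores.get(x)));
        return items;
    }

    // Made-up purchase histories, purely for the demo.
    static Map<String, Set<String>> sampleHistory() {
        Map<String, Set<String>> history = new HashMap<>();
        history.put("alice", new HashSet<>(Arrays.asList("K&R", "Deep C Secrets")));
        history.put("bob", new HashSet<>(Arrays.asList("K&R", "Deep C Secrets", "UNP")));
        history.put("carol", new HashSet<>(Arrays.asList("K&R", "book-x")));
        history.put("dave", new HashSet<>(Arrays.asList("book-y")));
        return history;
    }

    public static void main(String[] args) {
        // bob overlaps most with alice, so his extra item ranks first.
        System.out.println(recommend("alice", sampleHistory())); // prints [UNP, book-x]
    }
}
```

The more history a user has, the more overlaps there are to score against, which is one way to read the remark that recommendations improve with more history.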

    Building a content recommendation system seems to be an obvious step once you have a data mountain of your users’ likes. And this is what these sites seem to be doing but to achieve different ends.

    E.g. Facebook – see slide #29 of http://www.slideshare.net/CMSummit/ms-internet-trends060710final. This has not happened yet, but what’s stopping it, considering what Mark Zuckerberg said earlier this year?

    Twitter has recommendation plans – http://groups.google.com/group/twitter-development-talk/browse_thread/thread/14d5474c13ed84aa?pli=1

    ShareThis already has behavioural advertising in the works with its segmentation technology.

    The bottom line is – some of these services are going to use it to improve the end user’s experience – and will do so within the boundaries of their privacy policies. The rest – we don’t know.

  • Instance Initializers in Java


    Take a look at this simple code

    Code Snippet 1

    public class Init {
        {
            System.out.println("In the beginning was the command line");
        }
    
        public Init()
        {
            System.out.println("Created an instance");
        }
    
        public static void main(String[] args)
        {
            Init init = new Init();
        }
    }
    

    What do you think the output is? It’s this -

        In the beginning was the command line
        Created an instance
    

    The ‘hanging’ braces at the start of the class definition are instance initializers. Most of us are more familiar with static initializers -

    Code Snippet 2

    static
    {
        //Do stuff
    }
    

    Instance initializers (II) are not seen often in everyday Java code, so they might seem odd at first. They are executed every time an instance of the class is created, before the statements in the constructor body are executed. (See The Java Language Specification, Third Edition, section 8.6.)
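To contrast the two kinds of initializer, here is a small sketch that counts how often each one runs. The class and field names are made up for this illustration.

```java
// Hypothetical class: counts runs of the static and the instance initializer.
class InitCounter {
    static int staticRuns = 0;
    static int instanceRuns = 0;

    // Static initializer: runs once, when the class is initialized.
    static {
        staticRuns++;
    }

    // Instance initializer: runs each time an instance is created.
    {
        instanceRuns++;
    }

    public static void main(String[] args) {
        new InitCounter();
        new InitCounter();
        new InitCounter();
        System.out.println("static initializer runs: " + staticRuns);     // 1
        System.out.println("instance initializer runs: " + instanceRuns); // 3
    }
}
```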

    One use of IIs is in a class with multiple constructors: code that must run whenever an instance is created can go in an initializer, instead of being repeated in (or called from) every constructor.
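A minimal sketch of the multiple-constructor case, using a hypothetical Connection class (the class, its fields and the log are assumptions for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class with two constructors that share one instance initializer.
class Connection {
    // Records what ran, in order, so the sequence can be inspected.
    static final List<String> LOG = new ArrayList<>();

    private final String host;
    private final int port;

    // Runs for every new Connection, whichever constructor is used.
    {
        LOG.add("initializer");
    }

    Connection() {
        this.host = "localhost";
        this.port = 80;
        LOG.add("Connection()");
    }

    Connection(String host, int port) {
        this.host = host;
        this.port = port;
        LOG.add("Connection(String, int)");
    }

    public static void main(String[] args) {
        new Connection();
        new Connection("example.org", 8080);
        // prints [initializer, Connection(), initializer, Connection(String, int)]
        System.out.println(LOG);
    }
}
```

One caveat: if a constructor delegates to another one with this(...), the initializer still runs only once for that instance.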
    Another one which has become popular is populating collections during declaration, in the style of Ruby or Python single-line initializers -

    Code Snippet 3

    private Set<String> names = new HashSet<String>() {
        {
            add("Rigel");
            add("Vega");
            add("Antares");
        }
    };
    

    I first encountered IIs through this idiom, while reading somebody’s blog.
    What is actually happening here?

    1. An anonymous inner class is created.
    2. An instance initializer block is added to the anon inner class.
    3. Objects are added to the instance of that class when the names variable is initialized.
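The steps above can be checked with a little reflection. This is a sketch with made-up class and method names; it only inspects the runtime class of the set.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: shows that the "double brace" idiom creates a subclass of HashSet.
class DoubleBraceDemo {

    static Set<String> buildNames() {
        return new HashSet<String>() {
            {
                add("Rigel");
                add("Vega");
                add("Antares");
            }
        };
    }

    public static void main(String[] args) {
        Set<String> names = buildNames();
        // The runtime class is an anonymous subclass, not HashSet itself.
        System.out.println(names.getClass() == HashSet.class);                 // false
        System.out.println(names.getClass().getSuperclass() == HashSet.class); // true
        System.out.println(names.getClass().isAnonymousClass());               // true
        System.out.println(names.size());                                      // 3
    }
}
```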

    Now take this scenario
    Code Snippet 4

    public class WrongUsage {
    
        private Set<String> names;
    
        {
            add("pleiades");
        }
    
        public WrongUsage()
        {
            names = new HashSet<String>();
        }
    
        public void add(String name)
        {
            names.add(name);
        }
    }
    

    Based on what we have seen above, the instance initializer runs before the constructor body, so add() is called while names is still null. Creating a WrongUsage instance therefore throws a NullPointerException.
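One possible fix, sketched below with hypothetical class names, is to initialize the field where it is declared. Field initializers and instance initializers run in the order they appear in the source, so a declaration placed above the block is assigned first.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo comparing the failing order with a working one.
class InitFixDemo {
    public static void main(String[] args) {
        try {
            new WrongOrder();
        } catch (NullPointerException e) {
            System.out.println("WrongOrder threw a NullPointerException");
        }
        System.out.println("RightOrder size: " + new RightOrder().size()); // 1
    }
}

// Same shape as the failing snippet: the field is only assigned in the constructor body.
class WrongOrder {
    private Set<String> names;

    {
        add("pleiades"); // names is still null at this point
    }

    WrongOrder() {
        names = new HashSet<String>();
    }

    void add(String name) {
        names.add(name);
    }
}

// The field initializer appears before the block, so it runs first.
class RightOrder {
    private Set<String> names = new HashSet<String>();

    {
        add("pleiades");
    }

    void add(String name) {
        names.add(name);
    }

    int size() {
        return names.size();
    }
}
```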
    Let’s take another case – similar to the above but involving inheritance.

    Code Snippet 5

    import java.util.HashSet;
    import java.util.Set;

    public class MyHashSet extends HashSet<String> {
        {
            add("pleiades");
            System.out.println("Added");
        }
    
        public MyHashSet()
        {
            super();
            System.out.println("After calling super");
        }
    
        public static void main(String[] args)
        {
            Set<String> set = new MyHashSet();
        }
    }
    

    This runs, with the output being

        Added
        After calling super
    

    In this case, add() relies on the HashMap that HashSet keeps internally, which is created in the HashSet constructor. Since add() succeeds, the instance initializer must run after the superclass constructor but before the body of the class’s own constructor. (The explicit super() call is redundant here; the compiler inserts it anyway.)

    So the sequence is

    1. Superclass initialization (this includes superclass instance initializers and constructor)
    2. Current class’s Instance initializers
    3. Current class’s Constructor

    This is why the code in Code Snippet 3 does not throw an NPE – because it’s a case of inheritance (the anon inner class is a subclass of HashSet)
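The whole sequence can be traced with a small sketch. The class names here are invented for the illustration; each initializer and constructor just records that it ran.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: records the order in which initializers and constructors run.
class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    // Creates one subclass instance and returns the recorded sequence.
    static List<String> run() {
        LOG.clear();
        new OrderDerived();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        for (String step : run()) {
            System.out.println(step);
        }
    }
}

class OrderBase {
    {
        InitOrderDemo.LOG.add("superclass initializer");
    }

    OrderBase() {
        InitOrderDemo.LOG.add("superclass constructor");
    }
}

class OrderDerived extends OrderBase {
    {
        InitOrderDemo.LOG.add("subclass initializer");
    }

    OrderDerived() {
        // The implicit super() call has already completed by this point.
        InitOrderDemo.LOG.add("subclass constructor");
    }
}
```

Running main prints the four steps in this order: superclass initializer, superclass constructor, subclass initializer, subclass constructor.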
