On Websites & APIs

A few weeks back, I attended Startup Weekend in Israel. Startup Weekend is a gathering of people of all sorts – coders, designers, marketers and the like – that join forces for one intensive weekend to create something out of nothing. While most groups spent most of their time discussing business plans and polishing presentations (which was a disappointment for some of the more talented developers in the bunch), our team spent almost all of our time developing a new internet service.

What’s In A Service?

Developing a new service in 48 hours is not a simple task, especially for a group of ten people who have only just met and have widely varying skill sets. We wanted to take one commercial area that we felt was badly served by existing sites and revamp it, creating, in two days, the open source seed that could later be used to take over the category. In Israel, the worst-served segment is classified ads, so that is what we were aiming for.

Here’s a list of what we had in mind:

  • Insert a new item (for example, a car or a cellphone)
  • Different displays and properties for each kind of classified ad (cars are different than cellphones)
  • Filter many items (according to properties of the item)
  • Search
  • User registration and login (via existing services, such as Facebook Connect)

To make things more difficult, we had ambitious goals about creating the front end of the service as well:

  • Develop not only a website, but also mobile applications (mainly, iPhone and Android).
  • Translate to several languages.
  • Test several completely different design concepts and user interfaces. We really wanted to make a radically better website. Aside from the graphic design, we had several ideas about comments and Facebook integration that were pretty cool. We also had ideas about mixing UI concepts from price comparison sites with those of classified ads sites.
  • Use real ads, taken from competing sites (which, in Israel, is probably legal since classified ads are considered too utilitarian to be protected by copyright law).

The Open Website API

Building several clients for the service required extreme separation of responsibilities. Most existing frameworks (it’s slightly better in Ruby on Rails than in Django) are built on three layers: the data model, the controller or view, and the template.

The data model defines what data objects are used in the system. We’ve built a simple generic model for a classified ad, and a set of meta-models that define the properties expected for each type of item.
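To make the idea concrete, here is a rough Python sketch of a generic ad plus a meta-model. It is an illustration only, not the code from that weekend: the names `AdType`, `Ad` and `validate`, and the property lists, are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AdType:
    """Meta-model: names the properties expected for one kind of item."""
    name: str
    required: dict[str, type]  # property name -> expected Python type


@dataclass
class Ad:
    """Generic classified ad; item-specific data lives in `properties`."""
    ad_type: AdType
    title: str
    properties: dict = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the ad is valid."""
        errors = []
        for prop, expected in self.ad_type.required.items():
            if prop not in self.properties:
                errors.append(f"missing property: {prop}")
            elif not isinstance(self.properties[prop], expected):
                errors.append(f"wrong type for property: {prop}")
        return errors


# Hypothetical item types; the property lists are made up.
CAR = AdType("car", {"make": str, "year": int, "mileage_km": int})
PHONE = AdType("cellphone", {"brand": str, "storage_gb": int})

ad = Ad(CAR, "Family car, one owner", {"make": "Mazda", "year": 2006})
print(ad.validate())  # ['missing property: mileage_km']
```

Adding a new kind of item then means adding one more `AdType`, with no change to `Ad` itself.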

The controller or view is responsible for fetching the data required for an action. For example, in order to create a new ad, I need the list of expected properties for this type of item (a car). The controller is responsible for all the heavy lifting (such as fetching data and validating input). We created the controllers for our main actions quite simply.

The template defines the page layout. It is responsible for rendering the way the website looks (the HTML). This is what we found most cumbersome. First, each template is tied to a URL. This meant that we couldn’t have two HTML clients without splitting a lot of code. Then, the mobile client needed an API, which required a third branch of controllers and templates to render the objects for the iPhone.

This just wouldn’t do. Instead, we decided to create the API once, and use it for everything else. That meant that on the server side we only had to develop the API, and not even a single HTML page. Then, we could write as many clients as we wanted, each with a completely different flow, look and feel. The web clients were just HTML with some Ajax, which didn’t have to be on the server. The iPhone app was just as simple to develop. No code was duplicated. I was surprised at how fast we iterated on ideas.
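Here is a toy Python sketch of that split: one API function on the "server", and two unrelated clients that consume the same JSON. Everything in it is hypothetical (the `ADS` data, `api_list_ads`, and both client functions); a real client would fetch the JSON over HTTP rather than call a function.

```python
import json

# Stand-in for the data layer; a real service would query a database.
ADS = [
    {"id": 1, "type": "car", "title": "Family car", "price": 30000},
    {"id": 2, "type": "cellphone", "title": "Used phone", "price": 400},
]


def api_list_ads(ad_type=None):
    """The only server-side code: return matching ads as a JSON string."""
    items = [ad for ad in ADS if ad_type is None or ad["type"] == ad_type]
    return json.dumps({"ads": items})


def web_client(response):
    """One client: render the API response as an HTML list."""
    ads = json.loads(response)["ads"]
    rows = "".join(f"<li>{ad['title']} ({ad['price']})</li>" for ad in ads)
    return f"<ul>{rows}</ul>"


def mobile_client(response):
    """Another client: same data, a completely different presentation."""
    ads = json.loads(response)["ads"]
    return [f"{ad['title']} - {ad['price']}" for ad in ads]


response = api_list_ads("car")
print(web_client(response))     # <ul><li>Family car (30000)</li></ul>
print(mobile_client(response))  # ['Family car - 30000']
```

Neither client knows anything about the other, and the server knows nothing about either.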

What’s it good for:

  • Complete API – by default, there’s an API for everything. There isn’t anything on the website that can’t be easily implemented on any other client. It makes the website very open for developers, without requiring any special treatment.
  • Complete separation of functionality from design – the server side was responsible for authentication, data validation and simple access paths to the data. The client is responsible for the flow, and user interface. Each client can be completely different, tailored for the device it is used on (think: web vs. mobile). There were hardly any limitations on what a client could do, because the API was so basic.
  • DRY code – write once, use everywhere
  • Third party friendly – having an API is important for another reason. Think of the Twitter API and how it helped create the Twitter ecosystem (something they are fighting today). For an open source website, this is only an advantage. You want as many people as possible plugging in and creating something new on top of it. Over time, the best ideas will merge, and the community will benefit from the competition, while not wasting resources duplicating the data layer.

Cons:

  • Slower loading times – pre-rendered HTML will always be faster to load. Do people care that Gmail takes a few seconds to load, every time you open it? Not really, because it’s so useful. And with smart client side caching and some clever Ajax pre-loading, you can cut this time down significantly.
  • Command & Control issues – does the project include a client? If so, which one? How do you choose which client is “official”, or best? If not, does it mean you need two packages to install the website? How do you manage a list of clients? Where’s the data? Is it free and portable?
  • Security and tampering – when you have a very open API, you are vulnerable. There’s a fine line between being open, and being so open that you endanger the data integrity.

Rapid Development, Distributed Development

There are two special cases where this method can show itself to be especially useful.

The Lean Startup:

A startup in its earliest stages is an organization trying to build an unknown solution for an unknown problem. It is a team of people, trying to find both a business problem and a solution to that problem. This is called the product/market fit. The lean startup mentality dictates that the early stages of the startup should focus on learning.

Having an open website API is a great way to learn. First, it’s a great way for A/B testing and iteration of ideas. Second, there’s a real chance for serendipity. Your users will create clients for themselves, thus telling you what they need, and why they love your service.

The Open Source Website:

We’ve all heard about open source code projects, but an open source website is a much rarer creature. I talked a bit last year about the reasons why it’s so difficult. The open website API liberates the project in several ways. The biggest pain point of the open source website is the data. A website without data is useless. By having the data in one central place, there’s great opportunity for innovation on the client side, with several open source clients developed with ease. Otherwise, you wouldn’t be able to fork the website’s look and feel without copying all the data as well (think: Wikipedia).

I’d love to hear what you think. What other pros and cons are there? Would you want to see more open source websites?

How To Read Code

Scientifically Tested Code Reading Skills

I’ve run across a paper [pdf] researching how we read code. The article describes an eye tracking device that can identify what people were focusing on when they were given a piece of code. Each subject was given six snippets of code, each with a built-in error, and was asked to analyze the code and find the error. Here’s a sample from the article:

On the left, a small snippet of code that should sum the numbers from 1 to a given maximum. On the right, the eye movement of the subject as he reads the code. Notice how the eye first scans the function headers, then briefly scans the function’s body and lastly focuses on where the problem is most likely to appear in real code (the loop).

Reading Code is Like Reading the Talmud

This reminded me of a very old post I read on Joel Spolsky’s blog. Joel quotes Seth Gordon, who says:

The following Talmud-reading tactics are, I think, also useful for code-reading:

  1. Work in pairs, thinking out loud to one another.
  2. Argue. If your partner says “this means X”, and you either don’t understand why or you have another opinion, demand an explanation.
  3. Sometimes, when dealing with a chunk of text, it’s easier to figure out the middle *after* you understand what’s on both ends. Therefore, if a fragment of text has you stumped, try skipping over it and seeing if you can come back to it later. (But you still have to come back to it eventually.)
  4. Read the text both “inside” and “outside”. An inside reading translates the text into English (or whatever your native language is) phrase-by-phrase; an outside reading translates a larger chunk into an idiomatic paragraph. If you only read inside, you can miss the forest for the trees; if you only read outside, you can fool yourself by making broad guesses and not verifying them with details.

Seth describes two types of code. Some code is easier to read and understand first, and is the basis for the rest of the program (Type I). The second type of code is dependent on other code, and so is easier to understand in context (Type II). This fits with what I look for when I need to learn a new piece of code:

  1. Class and interface dependency: look for the base classes, which depend on no other class. These, usually, are small and simple objects, with the purpose of holding a little bit of state. They may do some calculations, but are mostly built around an internal storage and a bunch of get-set functions. These are classes of Type I. Skim through them now.
    After you are familiar with the basic building blocks, you can look for the manager classes. Most of the time there will be a bunch of very big, hairy classes that are the core engine of the software. These are classes of Type II. Make sure you understand, from the function names, what each of these managers is supposed to do. After that, you can delve in. Manager objects usually only make sense when they interact with each other, so you have to have an idea about how all of them are designed before going into any of them too deeply. (Tip: The class called Manager is not always the manager. The class called from main() usually is.)
  2. Assume a lot: Assume that function and variable names matter. If you had written the code, what would GetNextIndex do? Assume it does exactly that. If a function has a name that sounds obvious enough, skip it. You’ll get back to it if you ever need to use it. Why waste your time on something you don’t need right now? There’s only so much detail you can keep in mind at this point. If you dig too deep, you’ll start forgetting things you’ve already read. Focus on purpose and structure, not on implementation.
  3. Learn by debugging: Unit tests matter, but real code matters more. If you really want to understand how the code works, the fastest way is to use it. Fire up your IDE and write a quick and dirty demo of what you wanted to use the code for. Chances are your code is bust. I’ll be surprised if you can get it to work on the first go. That’s OK. Now, start debugging. You’ll learn the inner workings of the code (and its bugs) much faster, and you’ll only focus on the parts that matter to your task.

Writing Readable Code Matters

  • Function names matter: we’ve known this for a long time, but now we know that we scan function names first. With a good name, you don’t need to focus on the implementation all that much.
  • Variable names matter: again, nothing new. Notice, though, how the eye goes back to the variable definition and assignments. We constantly look for reminders of the state of the variable.
  • Loops are the source of all evil: well, that’s not entirely true, but they are worse than all other statements. Notice how figuring out the loop is the most time consuming bit of the code. It’s where most pitfalls are (Oh, the index should be smaller OR EQUAL. Right. I forgot). Commenting can help some, but just keeping the conditions simple and readable is key. Break complexity down around the loop.
  • Short code is easier to process, so get your refactoring tools out, everybody, it’s time to split these functions.
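The paper’s snippet isn’t reproduced here, so as a stand-in, here is my own Python sketch of the same kind of off-by-one loop bug, next to the fix. The function names are made up for illustration.

```python
def sum_to_buggy(maximum):
    """Meant to sum 1..maximum, but the loop stops one short."""
    total = 0
    i = 1
    while i < maximum:  # bug: should be <=
        total += i
        i += 1
    return total


def sum_to(maximum):
    """Sum the integers from 1 to maximum, inclusive."""
    total = 0
    i = 1
    while i <= maximum:
        total += i
        i += 1
    return total


print(sum_to_buggy(5))  # 10
print(sum_to(5))        # 15
```

The two functions differ by a single character, and it sits exactly where the eye spends most of its time: the loop condition.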

My favorite list of tips for readable code is Uncle Bob’s Clean Code: A Handbook of Agile Software Craftsmanship. Even if on some occasions it gets tedious and overzealous, it is still a great read for the practical coder, dealing with many issues of writing readable code and refactoring.

Scientific Side-note

The research certainly doesn’t prove much about the way we read code, as it was only tested on 5 (that’s right – five) people. To me it seems like more could have been done.

I mean, going to the lengths of creating a really awesome eye tracking device, carefully calibrated so the machine can tell which line of code is being inspected at any moment, and then testing it on only 5 people? Is getting more people into the test room so much hard work? I’d love to be a test subject and learn about my code reading habits. Next time, pick me.

Learn How To Cook Code

Life is a lot like jazz… it’s best when you improvise…

—George Gershwin

Jamie Oliver is the jazz player of cooks. If you have ever seen his show, you know he never measures anything. He throws a bunch of ingredients very rapidly into a pot. He puts the pot in the oven, and he gets a great meal. Does he know exactly how much salt he has put in? Sure. Just enough. But how many grams? He doesn’t have the faintest idea. He shouldn’t. Cooking, for Jamie Oliver, is an art form, an act of improvisation on a familiar theme.

Jamie Oliver is passionate about food and health. He is appalled by the obesity epidemic, and he attributes it to the lost art of cooking. The relationship between what we eat and who we are is inseparable. In his passionate TED Talk, Jamie presents us with a mother of two. She can’t cook. She isn’t able to find her way in the kitchen enough to make even the most basic meals. Her children are eating too much fat and too much sugar, the main ingredients in cheap, ready-made fast food.

If we are what we eat, we should learn to cook. We need to take control over our lives.

How To Become A Jazz Artist In The Kitchen

Learning any new skill is basically the same. I learned to cook only a couple of years ago. I’m still a lazy cook, but I can get by if I have to. It’s a simple, four-step system:

  1. Lose The Fear:  The thing that prevents you from cooking is fear. You have never done this before, so you think you can’t. Before you start, you need to lose this fear. Remember, you don’t have to study twelve years at the Sorbonne to make an omelet. Imagine the worst that could happen:
    The food will not be tasty. You will burn it. You will throw it away and order take-away.
    Hey, you were planning to take-away anyway, so why not try to cook first? You’ve got nothing to lose.
  2. Start Simple: Buy some fresh vegetables. Cut them. Go online and find a salad dressing that requires only three ingredients. Mix the three ingredients and put them on the vegetables. Congratulations. You have just cooked your first healthy dinner.
  3. Add complexity: Tomorrow, a tasty toast with cheese, a tomato and some ketchup. The day after, an egg. After a week, try baking a cake. In three weeks you will have created a dozen different dishes. Some you will like. Others – not so much. It doesn’t matter. You are now able to follow instructions in the kitchen. You are almost a cook, way better off than when you started.
  4. Improvise: This is where learning becomes mastering. After you feel more comfortable with recipes, you can start improvising on what you know. You’ve been eating your entire life. You know about tastes. Take one of the dishes you have created in the last month, and improvise on it to create something new. Remember step one: what’s the worst that could happen? Not much. Go on. Try it. If you fail, try something different next time. You are on the way to being a cook.

Why You Should Learn To Code

Most people today feel about technology the same way as the woman in Jamie’s talk felt about the kitchen. They don’t understand it, and they don’t trust it. They buy technology but feel helpless whenever anything goes wrong.

I see people feeling helpless about technology every day. They don’t understand how the computer works, and they are afraid they’ll terribly mess something up, so whenever there’s a problem, they run for the hills (or the nearest geek).

We have computers everywhere. Our laptops and our cellphones are obvious, but there are computers in our TVs, refrigerators and cars as well. We will soon have computers in our shoes and clothes and in our blood. It is imperative to have at least basic understanding of computation and code.

Our relationship with technology is as necessary and complex as our relationship with food. Learning how to code, even some basic stuff, will help people feel in control over their environment. Technology is not magic.

Code Jazz

The steps in learning to code are the same as learning to cook, the same as learning how to play. Lose the fear, start simple, add complexity and improvise. You may never be a jazz player, a chef or a chief programmer. That’s OK. It’s not about mastering the art. It’s about becoming self reliant. It’s about being able to do, on your own, things you now have to get help for. It’s about control.

Start today. You have nothing to lose, and everything to gain. Here’s something to get you started:

Hey, it's dad. How do I print the flowchart?

My Week Of Work

It played out like an old film-noir story. It was Saturday night. I was hanging out with a couple of friends when she called. She had been crying, I could tell, even though she tried to hide it.

“I’m sorry,” she said, “I have been crying, but try not to notice it.”

“I didn’t notice,” I lied. I’m a gentleman like that.

“You have to help me,” she said, “you are my only hope.”

As I walked out into the rainy night, I thought about what I’d gotten myself into. The mist gathered playfully around my feet. A gust of cold wind. Lightning followed by a thunderclap. But I was lost in thought, not noticing the cold. I’d taken a job I’d never done before. I wondered if I would be able to finish it in time. By the time I reached home, I had a game plan in place. I sent her an email, asking if the plan was acceptable, and then crashed into bed.

Just Do It

She runs a design studio, and she had closed a deal with a small company to design and deploy their website. This is a simple, almost static website, with just a few special pages here and there. The problem was that her previous “programmer”, hours before the deadline, admitted he had no idea what he was doing, and quit. She managed to get an extension from the customer – one week.

I have some experience with WordPress, as it runs this blog, but I never learned the internals (how to write a plugin, a theme or anything PHP related). I also have some experience in web development with Django, which is an awesome framework for building web applications in Python, and some CSS skills. These are things I picked up along the way, but I’m not a professional web developer, not even close.

This is not rocket science, but still, the time was short, and there was plenty I didn’t know how to do. Under these conditions I like to employ a very aggressive problem solving methodology I call “Just Do It”. Here’s the methodology in a nutshell:

  1. Do It!
  2. Repeat until success.

Repeat Until Success

Most people fear failure more than they love success. They prefer not leaving their comfort zone. In fact, they are more loyal to their job description than they are to their workplace (also known as the “I’ve been a DBA for 12 years at 6 different companies” syndrome). When they need to learn something new, they take a few months to study the subject, read books and get mentally ready for the job. They need time to adjust their comfort zone. If any of these people had been asked to help my friend, they would have said “I’d love to, but I don’t think I can do it”.

Other people just get at it. They hack and slash at the problem until they solve it. First attempt? Almost always a failure. However, instead of giving in to despair, they try again. And again. And again. Each time they learn a bit more about the problem, about the technology, about best practices. After a short while, they get it done. It won’t be pretty. In six months’ time, if they ever see that code again, they’ll bow their heads in shame. But there, in the moment, the problem is solved, they’ve learned something new and they are poised for a successful future.

The important thing is to repeat, repeat, repeat, until you win. Each time, try a different approach. It doesn’t matter which one, as long as you try something. This is how I did it:

Day 1 – Reaching Out

The first day was the easiest. I installed a fresh WordPress for development purposes. I downloaded an empty skeleton theme I could use as scaffolding for the design of the site. I put content into the site and set up the header, the pages and the navigation menu. These things are fairly simple and are supported pretty well by WordPress. By midday I had a basic site in place. Then I started working on the more advanced features.

I found a WordPress plugin for the image galleries, but it didn’t really do what I wanted. The theme I was using had some defaults it was printing for each page (comments, written-by lines, etc.), which I couldn’t disable. The sitemap plugin was all wrong. The WordPress documentation really isn’t as good as it should have been.

This is what reaching out is all about. You go into the world and bring back what existing knowledge you can find.

By 2 a.m. I had to call it quits for the day. I had found a great many existing samples, but none of them were what I was looking for.

In the middle of the night, when all you want to do is go to sleep, do it. Don’t think: “I’ll just finish this, and then I’ll go to sleep”. No. Leave it all and go to sleep right now. You are in a hurry to finish, and you are tired, and the combination is making you stupid. You will not learn anything new at 2 a.m., believe me.

Day 2 – Tampering and The Rule Of Minimal Change

The next morning I decided to try to use what I had gathered the previous day. I opened up an editor and started looking at code. With code in front of you, generic how-to questions become specific what-is questions. Yesterday I wanted to add a script to the site. Today, I learned what the “wp_register_script” function does. I spent the morning trying different modifications to the code until I got the results I wanted.

By noon, I was a different person. The day started with a feeling of frustration, as the previous day I couldn’t find what I needed. Now I was no longer dependent on existing solutions. I could change them to fit my needs. It was liberating and motivating. Things started falling into place relatively quickly.

How quickly? So quickly that everything I did was a mess. I call this the rule of minimal change:

Start with something that works in one way and change the minimal amount of code, to make it work in a different way.

This technique includes swapping the return value just before the return, overriding a single function somewhere along the call stack or commenting out blocks of code. There is no change too ugly or too short. Just get it to work in the way you intended with the least tampering. Since you haven’t read or understood most of the code, changing too many things will break it, and tracing the bug will be much harder.
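The real work was in PHP inside WordPress, but the idea carries over to any language. Here is a small Python sketch of “overriding a single function”. The `Gallery` class is a made-up stand-in for third-party code, not an actual plugin.

```python
class Gallery:
    """Stand-in for third-party code we have not fully read."""

    def caption(self, image):
        return f"{image} (posted by admin)"

    def render(self, images):
        return [self.caption(image) for image in images]


class QuietGallery(Gallery):
    """The minimal change: override one method, leave the rest alone."""

    def caption(self, image):
        return image  # drop the byline, touch nothing else


print(Gallery().render(["a.jpg"]))       # ['a.jpg (posted by admin)']
print(QuietGallery().render(["a.jpg"]))  # ['a.jpg']
```

Because `render` is untouched, anything that goes wrong afterwards can only be traced to the one overridden method.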

By 2 a.m. I decided it’s time for bed, because, well, it was getting late and I was already ahead of schedule.

Day 3 – Paying the Technical Debt

There’s a saying: There’s no such thing as a free lunch. And boy, do I wish this wasn’t true. When you work as messily as I did on day 2, you have incurred a lot of technical debt that you need to repay. Technical debt is defined as follows:

Technical Debt is a wonderful metaphor developed by Ward Cunningham to help us think about this problem. In this metaphor, doing things the quick and dirty way sets us up with a technical debt, which is similar to a financial debt. Like a financial debt, the technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice. We can choose to continue paying the interest, or we can pay down the principal by refactoring the quick and dirty design into the better design. Although it costs to pay down the principal, we gain by reduced interest payments in the future.

The site was working as expected, more or less. The functionality was programmed in, in any case. Now there was the matter of cleaning up and applying the design. This day was spent cleaning up after myself. It was a long, tedious and boring day. Hardly any visible progress was made, but the site was more stable and robust.

After 16 hours of work I was done. Or so I thought.

Day 4 – (Im)perfection Has Its Price

It Ain’t Over Until the Fat Lady Sings – the customer has to say it is finished for it to be finished. You can’t always get away with good enough. Even if I’ve worked long hours, or my sofa cushions are shaped like my rear end, if the client isn’t happy I’m not done. I am not a professional web developer, so I expected there would be things I missed.

Designers are perfectionists. They seek perfect beauty and aesthetics, and of course, they want the result to look exactly like the PDF they’ve sent you. Me? I love quick & dirty & be-done-with. I got an email that morning with a list of minute pixel sized changes.

So I cleaned some more, and fixed some more. After an even more tedious day, I got it to the point where I was allowed to deploy the site.

Takeaway

This isn’t my best work, but it’s amongst the fastest. I’ve employed this sprint-learning methodology a few times before, and I’ve always exceeded my expectations in the short term. In the long term, the technical debt has to be repaid. It is sometimes worth throwing everything away and redoing it, but it is always better to learn through doing than through reading books.

So, when in doubt, just do it.

For a Man With a Nail, Everything is a Hammer

A good software engineer is a practical purist. He loves the purity of code and methodology, but he lives in the practicality of shipping a product. Sometimes I’m faced with a nasty piece of functionality, and I know that if I could just have a few weeks, I’d be able to gracefully build a beautiful and complex system that will be maintainable and work. But I only have a day, so I solve it with duct-tape and spit. I know I’ll feel guilty about it for the rest of the week. And then I’ll feel bad about it again, when we find bugs in the code. And we will.

A Unit Test A Day Keeps The Doctor Away

Unit tests are small automatic tests for little pieces of functionality. Good unit tests will help you catch bugs close to the time you wrote them. That’s great because fresh bugs are easier to fix. However, there has been a surge of unit-test-fanatics, who claim this is the one-true-way of software development. They shun any other QA methodology. Tests, they say, are more important than code. So we’ve seen the rise of TDD, BDD and I-don’t-know-what-else-DD, and they are going forward at such a speed that there is no way for the rest of us to catch up.

I don’t like writing unit tests. They bore me, and I hate writing boring code. Jeff Atwood seems to get both sides of the argument:

Even if you only agree with a quarter of the items on that list– and I’d say at least half of them are true in my experience– that is a huge step forward for software developers. You’ll get no argument from me on the overall importance of unit tests. I’ve increasingly come to believe that unit tests are so important that they should be a first-class language construct.

However, I think the test-first dogmatists tend to be a little too religious for their own good. Asking developers to fundamentally change the way they approach writing software overnight is asking a lot. Particularly if those developers have yet to write their first unit test. I don’t think any software development shop is ready for test-first development until they’ve adopted unit testing as a standard methodology on every software project they undertake. Excessive religious fervor could sour them on the entire concept of unit testing.

That is why I’ve developed the Stolen-Tests-Development practice.

Stolen Tests Development (STD)

The last few weeks, I’ve been writing a database middleware. This product can parse SQL and then do some logic before passing the command to the database. Building an extensive test suite for a product like this is almost impossible. SQL is one of the most complex, free-form, high-level languages there are.

My test cases are filled with SQL queries and expected outcomes, but there is a limit to my mental capacity to produce these long and boring lists of tests. Soon enough I felt like driving an icepick through my eye.

Select * from a1

Select col1 from a1

Select col1, col2 from a1

Select * from a1 alias

Select * from a1, a2 where a1.col1 = a2.col1

Select monkey_name from mad_table where drive_number is not null

This can drive a monkey mad.
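For the curious, here is roughly what such a hand-written list looks like when organized as a table of (query, expected outcome) pairs. This is a Python sketch only: `referenced_tables` is a deliberately naive, hypothetical stand-in for the middleware’s parser, and the expected values are mine.

```python
import re


def referenced_tables(query):
    """Naive stand-in for a real SQL parser: table names after FROM."""
    match = re.search(r"\bfrom\s+(.+?)(?:\s+where\b|$)", query, re.IGNORECASE)
    if not match:
        return []
    # Each comma-separated entry is "table" or "table alias"; keep the table.
    return [entry.split()[0] for entry in match.group(1).split(",")]


CASES = [
    ("Select * from a1", ["a1"]),
    ("Select col1, col2 from a1", ["a1"]),
    ("Select * from a1 alias", ["a1"]),
    ("Select * from a1, a2 where a1.col1 = a2.col1", ["a1", "a2"]),
]

for query, expected in CASES:
    assert referenced_tables(query) == expected, query
print("all cases pass")
```

Every new SQL feature means more rows in `CASES`, typed by hand, which is exactly the part that gets old fast.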

So I started thinking. How can I test my program extensively while writing a minimal number of tests? Steal really good tests. Django is an extremely popular web development framework. I love it. One of the things that make Django so incredibly awesome is that the core development team really understands the importance of good tests and documentation for their product. This means you can use the bleeding-edge development version of Django in production and it’ll probably work fine.

One of the features of Django is the ORM, the object-relational mapper, which takes care of the persistence of the objects in a database. What I’m doing is providing Django with a DB back-end that connects to my middleware instead of MySQL. Now I can run the entire Django test suite against my software. If this works, I have successfully tested my entire product (integration included), with Django. I feel very Django-certified.
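In Django terms, swapping the back-end comes down to the `ENGINE` entry of the `DATABASES` setting, which takes the dotted path of a backend package. The fragment below is a sketch under assumptions: `middleware_backend` is a hypothetical package name, and the database name, host and port are placeholders.

```python
# settings.py fragment (sketch). "middleware_backend" is a hypothetical
# package name; Django loads the backend from the dotted path in ENGINE.
DATABASES = {
    "default": {
        "ENGINE": "middleware_backend",  # instead of "django.db.backends.mysql"
        "NAME": "test_db",               # placeholder database name
        "HOST": "127.0.0.1",             # placeholder host
        "PORT": "3307",                  # assumed port where the middleware listens
    }
}
```

With settings like these, Django’s own test runner can be pointed at the middleware rather than at MySQL directly.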

Then, the only tests I have to write are the ones regarding the core logic of my middleware.

I bet there are also good test suites for MySQL and Rails. Anyone else I can get my STDs from?

I don’t like writing unit tests. They bore me.