Documenting projects as you write them

December 12th, 2011

In Feb 2009 I started converting a PHP application over to Python and a framework. Today, I have finished all of the functional code and am at the point where I need to write a number of interfaces to the message daemon (currently a Perl script I wrote after ripping apart a C/libace daemon we had written).

The code worked under the prior frameworks, but now that I’ve moved to Pyramid I’m having to figure out why I ever used an __init__ with 13 arguments. Of course, everything is a wrapper around a wrapper around a single call, and nothing is documented beyond some sparse comments. Encrypted RPC payloads are sent to the daemon – oops, I also changed the host and key I’m testing from.

Yes, I actually am using RPC, in production, the right way.

Total Physical Source Lines of Code (SLOC) = 5,154

The penultimate 3% has added almost 200 lines of code. I suspect the last 2% adding the interfaces will add another 100 or so lines. Had I written better internal documentation, getting pulled away from the project for weeks or months at a time would have resulted in less ramp-up time when sitting back down to code. There were a few times where it would take me a few hours just to get up to speed with something I had written 18 months ago because I didn’t know what my original intentions were, or, what problem I was fixing.

Original PHP/Smarty project:

Total Physical Source Lines of Code (SLOC) = 45,040

In 2009, when I started this project, the test code I wrote showed roughly a 10:1 reduction in the codebase (the two SLOC figures above work out to about 8.7:1). It isn’t a truly fair comparison: the new code does more, has much better validation checks, and adds a number of features.

It’s been a long road and I’ve faced a number of challenges along the way. Since Coderetreat, I’ve tried to write tests as I write code. That is a habit I’ll have to reinforce. I don’t know that I’ll actually do Test Driven Development, but I can see more test code being written during development, rather than sitting down with the project after it is done and writing test code. Additionally, I’m going to use Sphinx even for internal documentation.

People might question why I went with TurboGears, then Pylons, and ended up with Pyramid, but at the time I evaluated a number of frameworks, Django’s ORM wasn’t powerful enough for some of the things I needed to do and I knew I needed to use SQLAlchemy. While Django and SQLAlchemy could be used together at the time, I felt TurboGears was a closer match. As it turns out, Pyramid is just about perfect for me. Light enough that it doesn’t get in the way, heavy enough that it contains the hooks that I need to get things done.

If I wrote a framework, and I have considered it, Pyramid is fairly close to what I would end up with.

Lesson learned… document.

Today is going to be a very frustrating day wiring up stuff to classmethods that have very little documentation and buried __init__ blocks. Yes, I’ll be documenting things today.
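As a rough illustration of the kind of internal documentation that Sphinx can pick up, here is a minimal sketch of a class documented with Sphinx-style field lists. The class `MessageClient`, its parameters, and the `from_settings` classmethod are hypothetical names invented for this example; they are not the project's real code.

```python
class MessageClient:
    """Hypothetical client for the message daemon (illustrative only).

    :param host: hostname of the daemon
    :param key: shared key used to encrypt RPC payloads
    :param timeout: seconds to wait for a reply
    """

    def __init__(self, host, key, timeout=30):
        self.host = host
        self.key = key
        self.timeout = timeout

    @classmethod
    def from_settings(cls, settings):
        """Build a client from a settings mapping.

        Recording *why* the classmethod exists is the point: it spares
        callers from passing a long list of positional arguments.

        :param settings: dict with ``host``, ``key`` and optional ``timeout``
        :returns: a configured :class:`MessageClient`
        """
        return cls(settings["host"], settings["key"], settings.get("timeout", 30))


# Example call with made-up values.
client = MessageClient.from_settings({"host": "localhost", "key": "secret"})
```

The docstrings stay readable in the source, and Sphinx's autodoc extension can render the same text into browsable pages.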

What is a startup?

December 12th, 2011

All this talk about startups, and I often wonder if people really understand what a startup is.

A bakery is not a startup. A consulting company is not a startup. Anything that requires multiplying the number of people in the business in order to scale is not a startup.

A startup is a company that has an idea where doubling income does not require doubling staff. It is a business where scaling to add another 1000 clients requires very little additional hardware.

A software company is a startup. After writing that first copy, selling 1000 more has only slight incremental costs. Selling 10000 more requires only slightly more resources than that. A subscription web site is a startup. The difference between 10 paying subscribers and 1000 paying subscribers in terms of the labor to produce the site is minimal.

If you start a business and you are primarily responsible for earning the income through your direct efforts, you are an employee, not a business. A consultant is not running a business, s/he is a contracted employee with many employers. A software developer that sells his product and can take three days off without materially affecting his income is a business.

Google Groups Captcha 404

December 8th, 2011

The other day I was reading a thread on google.groups and wanted to add the user to my Google+ circles as we work on a number of projects that are somewhat related. A search of his name came up with too many results to be helpful, so, I figured I would try searching by his email address.

I mistyped the first captcha:

Properly typed the second captcha:

and received a 404 page:

It is completely repeatable and I’ve tested it numerous times. You can of course go back to the original page, click the … and get a new captcha, but, make sure you solve it on the first try! Note that the topic is also set to “” on the 2nd captcha.

Now if only there was a place to report bugs on google.groups. A fifteen minute search of the sparse FAQ didn’t turn up anything.

Global Day of Coderetreat

December 4th, 2011

Yesterday I participated in the Global Day of Coderetreat. Coderetreat is an event inspired by Corey Haines who spent time traveling around the country teaching groups fundamentals of software development – asking for just enough money to make it to the next city, accommodations on someone’s couch, etc.

It is a one-day, intensive pair programming exercise focused on Test Driven Development. CoderetreatMiami was organized by Tom Ordonez and Carlos Ordonez from Aeronautic Investments, Inc. and facilitated by Bryce Kerley and Michael Feathers.

After a brief intro, the rules for Conway’s Game of Life were explained to us. Basically, you have a matrix and look through each live node: if the node has two or three living neighbors, it lives; otherwise it dies. Then you look at all of the dead nodes, and any dead node with exactly three living neighbors comes alive.
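The rules above can be sketched in a few lines of Python. This is only an illustration, not any of the code written at the event: it assumes the live cells are kept as a set of (row, col) tuples on an unbounded grid, and the function name `step` is made up for the example.

```python
from collections import Counter


def step(live):
    """Return the next generation for a set of (row, col) live cells."""
    # Count, for every cell that touches a live cell, how many live neighbors it has.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive next turn with exactly three live neighbors,
    # or with two live neighbors if it is already alive.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A horizontal "blinker" flips to a vertical one after a single step.
blinker = {(1, 0), (1, 1), (1, 2)}
print(sorted(step(blinker)))  # [(0, 1), (1, 1), (2, 1)]
```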

Then, we were told to choose a partner to pair with and start writing code. We chose Python, wrote a quick library and some functions, and did some test code to make sure our function was working as it should. We chose tuples stored in a list. After 40 minutes, we had started working on unit tests when we were told that time was up, delete your code…

What? People were a little shocked. We just spent 45 minutes writing the code, delete it? Can’t save it, can’t work on it later, can’t save it to a repo, etc. Delete it.

rm -rf life/

After a few minutes of discussion, we’re told to pair up with another person, someone we haven’t paired with, and do it again. This time I paired up with a Drupal/JavaScript guy and we proceeded to write the Game of Life again. We didn’t get as far because my JavaScript coding isn’t as strong as my Python/Perl coding, but we did have some tests written and had some functionality. Time’s up, delete your code. Again?!?

I then paired up with another person and we used Python. We changed our strategy a bit and decided to do a bounding box check for the alive portion to eliminate having to walk the large grid. Time’s up, delete your code.
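A bounding box check of this kind might look roughly like the following. It is a sketch under assumptions, not the code from the session: live cells are (row, col) tuples in a list, and the box is padded by one cell on each side because a dead cell can only come alive next to a live one. The name `bounding_box` is made up for the example.

```python
def bounding_box(live):
    """Return (min_row, max_row, min_col, max_col) padded by one cell."""
    rows = [r for r, _ in live]
    cols = [c for _, c in live]
    return min(rows) - 1, max(rows) + 1, min(cols) - 1, max(cols) + 1


live = [(1, 0), (1, 1), (1, 2)]
top, bottom, left, right = bounding_box(live)

# Only cells inside the padded box need to be checked on the next pass.
candidates = [
    (r, c)
    for r in range(top, bottom + 1)
    for c in range(left, right + 1)
]
```

For the three live cells above, that leaves a 3 by 5 block of 15 candidate cells to examine instead of the whole grid.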

We took a brief break, followed by Michael Feathers showing us Test Driven Development in Ruby. He started from an empty function and a test that defined what the expected output should be. Run the test script: failure. Fix this, test: failure. Fix another bug, test: different results, still a failure. Fix the code, test passed. Then we looked at a more detailed example of a (in his words) badly written Game of Life, and he showed us a few iterations of the testing.

Time to pair up again and we’re off and running. Perl this time, with an additional condition: try to write it without IF statements. I’ve missed something because I can’t quite remember passing arrays of arrays and strings in Perl, so I take a few minutes to write some test code to remember that @{$blah} gets me what I need, and we’re off and running. Boolean and binary anding gets us pretty close to not needing ifs. We decide our test case code can have ifs, but not our game functions. Again, we write a test that has a few cells populated and write the check_alive function. We get through that, start to write the check_dead routine, and bam. Time’s up. Delete the code? Yes…
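The idea of replacing if statements with boolean and bitwise operators can be sketched as follows. The session used Perl; this version is in Python and is an assumption about the general approach, not a reconstruction of that code. The names `count_neighbors` and `next_state` and the set-of-tuples storage are made up for the example.

```python
def count_neighbors(live, r, c):
    """Count live neighbors of (r, c) without any if statement."""
    # Membership tests are booleans, and booleans sum as 0 or 1.
    around = sum(
        (r + dr, c + dc) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
    )
    # The sum above includes the cell itself, so subtract it back out.
    return around - ((r, c) in live)


def next_state(alive, neighbors):
    """Return True when the cell is alive in the next generation."""
    # Birth or survival with three neighbors, survival with two.
    return (neighbors == 3) | (alive & (neighbors == 2))


live = {(1, 0), (1, 1), (1, 2)}
print(next_state((1, 1) in live, count_neighbors(live, 1, 1)))  # True
```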

Another partner and this time it is PHP. While I am comfortable with PHP, we’re told, no two dimensional arrays. After some internal debate, we decide that using a transform on a one dimensional array is really just using a two dimensional array and we settle on some tricky column math and three loops to test the adjacent cells. After a bit of coding, we start writing some test functions to test check_alive, run into an Out of Bounds error because one of our test points is on the edge (a case which we talked about, but, didn’t code for), time’s up, delete the code.
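One way to do the column math on a one dimensional array, including the edge case that caused the Out of Bounds error, is sketched below. The session used PHP and three loops; this Python version uses two clamped loops and is only an illustration. The grid size, the 0/1 cell values, and the name `neighbors_1d` are assumptions made for the example.

```python
WIDTH, HEIGHT = 4, 3  # hypothetical grid size for the example


def neighbors_1d(cells, index, width=WIDTH, height=HEIGHT):
    """Count live neighbors of cells[index] in a row-major flat list of 0/1."""
    row, col = divmod(index, width)
    total = 0
    # Clamp the ranges so cells on an edge never index outside the grid.
    for r in range(max(row - 1, 0), min(row + 2, height)):
        for c in range(max(col - 1, 0), min(col + 2, width)):
            total += cells[r * width + c]
    # The loops include the cell itself, so remove it from the count.
    return total - cells[index]


cells = [
    1, 1, 0, 0,
    1, 0, 0, 0,
    0, 0, 0, 0,
]
print(neighbors_1d(cells, 0))  # corner cell: 2
print(neighbors_1d(cells, 5))  # interior cell: 3
```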

On my last pairing, I am paired with someone I know. I’ve gotten the game working twice, he’s gotten it working once. He’s running a language called Processing which has a really simple IDE and ability to run the code, and, graphically display our matrix. Prior to this, all of our development has been testing code and looking at True/False tests or lists of strings to make sure they are equivalent, etc. We write our code really quickly but run into a problem with Processing’s storage of global arrays, so, we have to do a little trickery to swap arrays before the draw function. At the end of 45 minutes, we’re very close, it iterates once, goes to a blank screen, then displays the start screen again. We know it is something with the array copy (and Don ends up solving it later), 45 minutes is up… you can keep this code if you want to work on it.

Normally, they do one more session where you are paired up with your original partner to make a final attempt, but, we ran a little short on time. After a recap, we’re asked to stand up and give a brief Introduction, What we learned, What surprised us and What we’ll do differently in the future.

For me, I’ve often focused on unit tests well after the code has been written. While I don’t think I can easily change that on a number of projects, I think I will try some Test Driven Development for new code and some other projects.

All in all, it was a great experience. I met a lot of great people, learned some new coding techniques and learned a rather intriguing method of teaching. Pair programming is good, but for learning to code, iterating over the same problem six times in a rather intensive environment showed me other people’s thought processes in how they attacked the problem.

Many of the people looked at the problem in a much different way. One group used functional programming and maps; my first attempt used tuples and a list (as did a second attempt, which tried to solve the dead check by looking at the maximum bounding box rather than doing a complete traversal of the matrix). When we were asked to try it without if statements, and again when we couldn’t use a two dimensional array, people’s thought processes changed.

While there is always more than one way to solve a problem, seeing the different ways people approached the same problem, even after a number of iterations was intriguing.

Pictures from CoderetreatMiami.

Quite a fun event, highly recommended if you get the chance to attend one.

Python coding standards for imports

December 2nd, 2011

Recently I’ve been refactoring a lot of code and I’m seeing a few trends that I find easier to read.

With imports, I prefer putting each import on its own line. It takes up more screen space, but when looking at a commit diff, I can see what was added or removed much more easily than a string changed in the middle of a line (though more of the diff tools are showing inline differences).

However, what I started to do recently was do imports like:

from module import blah, \
                   blah2, \
                   blah3

I keep them in alphabetical order, and I find that reading them this way removes the ‘wall of text’ effect.

Original imports:

from module import blah
from module import blah2
from module import blah3

I find that the new method I’m using allows me to more easily see that two imports came from the same module.

Another possibility as mentioned by Chris McDonough is:

from module import (blah,
                    blah2,
                    blah3)
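For a version that runs as written, here is a small sketch of the parenthesized form using real names from the standard library's os.path module in place of the placeholders module, blah, blah2 and blah3. The path components are made up for the example. PEP 8 also prefers implied line continuation inside parentheses over backslashes, which is one more argument for this form.

```python
from os.path import (basename,
                     dirname,
                     join)

# Build a path from made-up components, then take it apart again.
path = join("project", "docs", "index.rst")
print(dirname(path), basename(path))
```

Adding or removing a name still changes only one line in a diff, and no backslash can be broken by stray trailing whitespace.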