
Saturday, 7 August 2010

Computer Adaptive Testing and the GMAT

Back around Christmas, I had a few weeks free and decided to prepare for the GMAT exam, a "computer-adaptive" standardized test required (or at least accepted) by business schools around the world. With university application deadlines looming, most of the testing sessions were already fully booked, but I managed to find a mid-January test date just a few hours' drive away.

I'll start by saying that I finished university eight years ago, so—ignoring a few intense weeks of German classes—it's been a while since I "studied". And it was about half my lifetime ago that I last wrote a standardized test: you know, with those sealed envelopes, bar-coded stickers, big machine-readable answer papers, and detailed instruction books reminding you to use a #2 pencil and "choose the BEST answer". If, like me, you haven't written one of these in a while, you may be surprised by how much has changed.

The GMAT, like a number of other admissions tests, is now administered exclusively by computer and the test centres even have palm and iris scanners, which are used any time you enter or leave the room. Unlike most computer-based tests, though, which stray little from the well-worn paths of their pencil-and-paper siblings, the GMAT uses a computer-adaptive process undoubtedly conceived by a singularly sinister set of scholars and statisticians. This process is complex and has a number of interesting implications but basically it works like this: when you get an answer wrong, the questions get easier; when you get an answer right, they get harder. The theory is that by adjusting the test to your ability, the computer is able to rate you more precisely against people of about the same level.
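The adaptive loop described above can be sketched in a few lines. This is a toy illustration of the idea only, not GMAC's actual algorithm (which is based on much more sophisticated statistics): difficulty rises after a correct answer, falls after a wrong one, and the adjustments shrink as the test homes in on the taker's level.

```python
def adaptive_test(answer_fn, num_questions=10):
    """Toy adaptive test: difficulty rises on a correct answer, falls on a
    wrong one. The step size shrinks so the estimate converges."""
    difficulty = 0.5   # start mid-range (0 = easiest, 1 = hardest)
    step = 0.25        # how far to move after each answer
    for _ in range(num_questions):
        correct = answer_fn(difficulty)
        difficulty += step if correct else -step
        difficulty = min(1.0, max(0.0, difficulty))  # stay in range
        step *= 0.8    # smaller adjustments as the test converges
    return difficulty  # final difficulty ~ estimated ability

# A simulated test-taker who can handle anything up to difficulty 0.7:
estimate = adaptive_test(lambda d: d <= 0.7)
```

After ten questions the final difficulty sits close to the simulated taker's true level of 0.7, which is the whole point: everyone ends up facing questions right at the edge of their ability.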

The material on the test is not really that hard. It's no walk in the park either, but it mostly limits itself to stuff you learned (then unfortunately forgot) in high school. Everything else about the GMAT, though, seems designed to maximize stress:

  • The test is long. Nearly four hours long. There are three sections, each 60-75 minutes long, with a short break between them.

  • The test is timed. Ok, what test doesn’t have a time limit? But this one has a clock counting down in the corner of your screen, taunting you to pick up the pace. Worse than that: you'll probably actually need all the time because the questions keep getting harder as you get them right, remember? The challenge on this test is not usually solving the problems, but rather solving them in time.

  • The breaks are timed. Again, not surprising, I guess. But your break is only eight minutes long and, if you’re not back, the next section simply starts running without you! Inevitably you spend the entire break worrying about how many minutes you have left. Since you need to scan your palm and iris at the beginning and end of each break, your trip to the bathroom is not going to be leisurely.

  • Erasable notepads. No pens or paper are allowed, presumably so you can't smuggle out questions. Try working out math problems quickly on a laminated card with a dry-erase marker.

  • You can't skip a question. Remember that the questions get harder as you get them right and easier as you get them wrong. This means that the next question you see is largely determined by how you answer the current one. The computer needs an answer to the question, so you can’t skip it.

  • You can't go back. Similarly, since your current position is determined by your earlier answers, you can't go back and change them. So if you're used to finishing early and then checking over your work, you'd better start unlearning that habit.

  • You don't know the difficulty level of the questions. Is the test feeling easy because you really know your stuff or are you simply earning yourself easier questions by choosing a lot of wrong answers? The only saving grace here is that you're so busy madly answering questions that you don't have many brain cycles left to worry about this.

  • Some of the questions are "experimental". About 25% of the test is made up of new questions being tested on you to determine their difficulty level, but of course you don’t know which. That's right: that really hard question you just spent 5 minutes working on because you were sure you could solve it... doesn't count!

  • You are heavily penalized for not finishing. Right, ok, so you have a fixed time, you can't skip or come back, and you can’t predict the difficulty of the remaining questions. But if you want a decent score, you still need to pace yourself to answer all of them. Remember that countdown clock? You have about two minutes per question–so keep an eye on that average time! Oh, and the clock counts down but of course the question numbers go up, so you’d better get real quick at subtracting your time from 75 (you’ll be working out your average question time every few questions).

  • Data Sufficiency questions. These nasty little buggers are, I think, unique to the GMAT. Given a math problem and two statements, you are asked whether the problem can be solved using either statement alone, both statements together, or neither. You don't need to work out the answer to the problem, but you do need to partially solve it several times with different information and keep each attempt separate in your mind. Don't think that sounds tricky? Try searching for "sample gmat data sufficiency questions" and try a few. I think I got only about a quarter of these right on my first practice test.
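The pacing arithmetic from the countdown-clock bullet above is easy to get wrong under pressure, which is exactly the problem. Here's the calculation you end up doing in your head every few questions, assuming (for illustration) a 75-minute section with 37 questions:

```python
def minutes_per_remaining_question(clock_minutes_left, question_number,
                                   total_questions=37):
    """Pace check: average minutes you can still afford to spend on each
    remaining question, given the on-screen countdown clock."""
    # The current question is still unanswered, so it counts as remaining.
    remaining = total_questions - (question_number - 1)
    return clock_minutes_left / remaining

# Starting out: 75 minutes for 37 questions, about two minutes each.
pace_at_start = minutes_per_remaining_question(75, 1)

# Half the clock gone but only on question 15: you're behind pace now.
pace_later = minutes_per_remaining_question(37.5, 15)
```

Doing this division in your head, repeatedly, while also solving the actual problems, is part of what makes the test so draining.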

You have to admire the brilliantly evil minds that came up with this thing. The experience for the test-taker is four hours of pure, non-stop stress. At least that was my experience: my brain literally didn’t stop whirring. The adaptive process pushes everyone to their limit, challenging them to keep their feet under them and ensuring that they're sweating right until the end.

The test designers have really optimized the experience around their own needs: the test is easy for them to grade, minimizes cheating, allows new questions to be evaluated automatically, and measures something in a pretty precise, consistent way. I'm not entirely certain what it measures, but I'm pretty confident that people who are generally smarter, better organized, faster to learn and adapt, and better at dealing with stress will obtain a better result.

As a company selling a product, it might seem odd that GMAC (the company that runs the test) can get away with optimizing the test for their own needs. But, although it may appear that the test is the product you’re buying, I think what you’re really buying is the report that is sent to the universities. The cost of this report just happens to be $250 + study time + four hours of stress. If GMAC had competitors, they might be forced to optimize for the test-taker but, as a virtual monopoly, the motivation just isn’t there.

The challenge with the GMAT, I think, is really learning an entirely new test-taking strategy. I used a couple of books (Cracking the GMAT, by Princeton Review and The Official Guide for GMAT Review) to first understand the test and the differences in approach that were required and then to practice as many questions as possible of the specific types that appear. Doing computer-based practice exams is, of course, also essential given that what you’re learning is the test-taking strategy more than the material.

I emerged from the exam feeling absolutely drained but energized by the rush of tackling something so intense and coming out on top. In some ways it was fun but I have no intention of rewriting it any time soon. :)

Friday, 2 July 2010

Seaside 3 "Release Candidate"

You could say it's been a long time coming.

Seaside 3.0 began ambitiously and grew from there. We began (at least I did) with the goal of cleaning up the architecture, revisiting each aspect and asking what could be simplified, clarified, or standardized. As functional layers were teased apart, suddenly pieces became unloadable and a repackaging effort got under way. From this we realized we could make the process of porting Seaside much less painful. Along the way, we lowered response times and reduced memory usage, added 10x the number of unit tests (1467 at last count), standardized code and improved code documentation, added jQuery support, and, oh, did you hear there's a book?

The result? This release runs leaner, on at least six Smalltalk platforms and is, I think, easier to learn, easier to use, and easier to extend. Seaside 3.0 is the best platform out there for developing complex, server-side web applications. Is it perfect? No, but I'll come to that part in a moment. It is the result of literally thousands of hours of work by a small group of people across all six platforms. But this release also exists only due to the generosity of Seaside users who tried it, filed bugs against it, submitted patches for it, and eventually deployed it.

Deployed it?! Yeah, you see, not only have all the commercial vendors chosen to ship our alphas and betas, but our users have also used them to put national-scale commercial projects into production. I alluded last month to a conference session I attended, in which somebody made the statement that
The best way to kill a product is to publicly announce a rewrite. Customers will immediately avoid investing in the "old" system like the plague, starving the product of all its revenue and eventually killing it.
It was a shocking moment as I realized we'd attempted just that. At first we justified the long release cycle because we were "doing a major rewrite"; then we just had "a lot more work to do". Eventually there were "just too many bugs" and things "just weren't stable enough". And, finally, once we realized we desperately needed to release and move forward, we just ran out of steam (no quotes there—we really did).

I still think the original architectural work needed doing and I'm really happy about where we ended up, but here's what I've learned:
  • When your wonderful, dedicated users start putting your code into production, they're telling you it's ready to be released. Listen to them.
  • We don't have the manpower to carry out the kind of QA process that goes along with a Development, Alpha, Beta, RC, Final release process.
  • We need to figure out how to get more users actively involved in the project. This could be by writing code but probably more importantly by writing documentation, improving usability, building releases, managing the website, doing graphical design, or something else entirely. The small core team simply can't handle it all.
Trying to apply these lessons over the past month, I asked for help from a few people (thank you!) and we closed some final bugs, ran through the functional tests, developed a brand new welcome screen, and managed to bundle everything up. We're releasing this today as 3.0RC.

We're not planning a standard multi-RC process. The "Release Candidate" simply signifies that you all have one last chance to download it, try it, and let us know about any major explosions before we do a final release, hopefully at the end of the month. From there we'll be reverting to a simpler process, using frequent point releases to fix bugs. 3.1 will have a smaller, better defined scope and a shorter cycle. I have some ideas but before we start thinking about that, we all need a breather.

I also have some ideas about the challenges that potential contributors to the project may face. But I'd like to hear your thoughts and experiences. So, if you have any suggestions or you'd like to help but something is stopping you, send me an email or (better yet if you're there) pull me aside at Camp Smalltalk London or ESUG and tell me about it.

Ok, ok. You've waited long enough—thank you. Here's the 3.0RC one-click image, based on Pharo 1.1 RC3 and Grease 1.0RC (just the image here). Dale has promised an updated Metacello definition soon. Enjoy!

Sunday, 6 June 2010

Berlin, product management, and Smalltalk events

Beach bars, cuba libres, bircher müsli. I'd forgotten how classically German these things are but it only takes being away for a few months to make them stand out again.

Thanks to the official un-organizers of Product Camp Berlin, yesterday was a very successful day of discussions and networking. Some interesting points for me were:
  • Kill a feature every day. That way people get used to the process and don't scream so loudly when support for features and platforms needs to be removed. This reminds me of the concepts of constant refactoring and non-ownership in software development, which help ensure that people are similarly used to code going away.
  • The problem may be your pricing model. When products (in startups particularly) begin to flounder, there may be nothing wrong with the product itself. Sometimes a simple tweak of the pricing model can be the most effective solution.
  • The best way to unofficially kill a product is to publicly announce a "rewrite". Customers will avoid investing in the old system like the plague, rapidly starving the product of all its revenue.
  • It sounds like there are some interesting products on the way from Nokia.
  • This is my second conference since I actively started using twitter — it was not as well used this time but I still really like the technology for this sort of use case: it's great to see what you're missing, share your thoughts, and catch up with people after the event is over.
The weather was gorgeous in Berlin but has turned foul in southern Germany today. No big deal though as I've been slogging away indoors at my presentation for epicenter in Dublin on Thursday. I'm getting close with my slides and looking forward to the event but, before I can get that checked off my list, it's off to Stuttgart tomorrow evening for the VASt Forum.

[update: I've been offered 10 discount tickets for epicenter to give away; details here if you'd like to come see me in Dublin this week.]

For those wanting to attend the Camp Smalltalk London event on July 16-18, make sure you head over and sign up now. It's looking like we're going to fill up even our expanded capacity. If it's full by the time you get there, add yourself to the waiting list and we'll see what we can do.

Sunday, 24 January 2010

Facebook time

I was poking through some of Seth Godin's eBook What Matters Now this afternoon (apparently, in my case, it didn't matter until 6 weeks later). I like this message from Howard Mann:
There are tens of thousands of businesses making many millions a year in profits that still haven’t ever heard of twitter, blogs or facebook. Are they all wrong? Have they missed out or is the joke really on us? They do business through personal relationships, by delivering great customer service and it’s working for them.
How much time are you spending with your customers?

Wednesday, 16 December 2009

Training in Boston

So I'm most of the way through this jaunt down the US East Coast and have yet to post even a single update (unless you count the occasional tweet). I know, I know... what can I say? I've been busy. My first attempt was overly rambling so I'm going to focus on one aspect here and follow up with a few more posts over the next couple of weeks.

The main reason for the trip was a Product Management seminar led by Steve Johnson of Pragmatic Marketing—and I definitely recommend the course to anyone who's interested in this stuff. One thing I found interesting: in North America smalltalk usually means asking, "so what do you do?"; well at a seminar made up of 30 people who all do the same thing, that gets replaced with, "so where do you work?". Fun watching the puzzled looks on people's faces as they stared at the blank line below the name on this independent consultant's name tag. :)

The main focus of the course is on guiding product development through market problems and on grounding those problems in real data instead of hunches and "wouldn't it be cool if...?". I'm interested in Product Management from two angles: first, as a possible career direction and, second, in its applicability to open source projects, such as Seaside.

In past jobs, I've found myself naturally trying to fill an institutional void. I've been the one asking, "Are you sure the students want an on-campus version of Facebook? I kind of suspect they just want to use Facebook...". Actual demand for what we were doing, the exact problems we were trying to solve, and even the development costs have all been more-or-less-hand-wavy things. How do you know what to implement if you don't know what problem you're solving and for whom? Or, to look at it another way, if you develop without that knowledge, how do you know anyone will find the result valuable? It was revealing for me when I first learned there are people who make a living doing these things I found rewarding.

The applicability to open source is an interesting issue. On the one hand, it is almost intuitively obvious that most of the same factors apply. A project that meets a market need will succeed while one that does not will fail. A project that knows who its users are can be more effectively marketed; one that does not will succeed only through chance or an inefficient shotgun approach. What I'm not sure of yet is what is different: is it the formulas, the costs of the resources, or maybe their units of measurement? Or do we need to tweak one or more of the definitions? As a random example, Product Management makes a distinction between users and buyers of a product; what's the correct mapping for these concepts in open source? I'm still pondering all this... more to come.

Before I leave off, I should mention that the Hilton DoubleTree in Bedford is one of my best hotel experiences in recent memory. Everything was efficient and painless. The room was roomy, modern, and spotlessly clean. The internet was fast and free. And the (three!) extra pillows I tossed on the floor were left there neatly for my entire stay instead of being put back on the bed. They even insisted on comping a meal I had in the restaurant which was, admittedly, slow in arriving but not to the point I was concerned about it. So, I don't know why you'd be in suburban Boston, but if you are, go stay at the DoubleTree.