Lessons Learned from Seeds 1st Prototype Metrics

  • Seeds 1st Prototype

    Seeds is the Flash game we’re currently developing. As our debut title, it’s very near and dear to us so we want to make it the best it can be! So far, metrics have been of great help to us; they’re a critical tool we use to improve the player’s gameplay experience.

    In a nutshell, our game creation process looks a little something like this:

    1. Come up with an idea
    2. Try it out
    3. Playtest the hell out of it to see what works and what doesn’t
    4. Repeat

    Since almost all of our players are spread out across the globe, we can’t look over their shoulders while they play, so we needed a way to observe them from a distance. This is why we added a simple metrics system to our games, which tracks a few key pieces of information and uploads them to our servers. This way we were free to spread our prototype across the internet and still get a good idea of where players were succeeding or having trouble.

    The Process

    The Prototype

    In Seeds, you play as a little kid blowing a bunch of seeds into the wind to replant a garden you destroyed.

    Since it’s such an integral part of the game, the first thing we thought we needed to get right was the initial blowing gesture. This is what the 1st prototype focused on.

    You can try out the prototype that was used to collect the data right here: http://www.funstormgames.com/seeds/prototypes/1/

    Our goal was to create a blowing gesture that:

    • Is easy to learn but hard to master
    • Gives a fair score that is proportional to how well the player thinks they did
    • Gives a variety of scores (hard to get the same score twice in a row)

    To test how well we achieved those goals, we collected…

    Metrics Database View

    The Data

    This post is based on the data uploaded from 108 plays by 20 players on the 11th and 12th of May 2011. Players came to us from 2 gaming-related forums we posted on (NeoGAF & Flixel).

    We tracked and collected a total of 5 pieces of information:

    • A unique ID for each player
    • The 3 (secret) factors that determine how many points are awarded
    • Points awarded

    We wanted the metrics to be as unobtrusive for the player as possible. For the tech-savvy: from within ActionScript, we append the variables to the URL of a PHP script and load it in the background without displaying it. The script uses the GET method to grab the variables from the modified URL and inserts them into a MySQL database. Not the most elegant solution, but it was extremely quick to create and gives us all the info we need without annoying the player.
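
    For readers who want to see the shape of this trick, here is a rough sketch in Python rather than our actual ActionScript and PHP. The endpoint URL and the parameter names (player, f1, f2, f3, points) are made up for illustration. The first function builds the modified URL the game would load in the background, and the second shows how a GET handler could read the values back out before storing them.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and parameter names; the real ones are not given in the post.
METRICS_ENDPOINT = "https://example.com/metrics.php"


def build_metrics_url(player_id, factors, points, endpoint=METRICS_ENDPOINT):
    """Return the endpoint URL with one play's five values in the query string."""
    if len(factors) != 3:
        raise ValueError("expected exactly 3 scoring factors")
    params = {
        "player": player_id,   # unique ID for the player
        "f1": factors[0],      # the 3 factors that determine the score
        "f2": factors[1],
        "f3": factors[2],
        "points": points,      # points awarded for the round
    }
    return endpoint + "?" + urlencode(params)


def read_metrics_url(url):
    """Read the values back out of the query string, as a GET handler would."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; each key appears once here.
    return {key: values[0] for key, values in query.items()}
```

    Everything arrives on the server as a string, so a real handler would still need to validate and convert the values before inserting them into the database.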

    Results

    First, the good news: 90% of players chose to play more than 1 round and the average number of rounds played was 6.3! Nobody was forcing these random people from the internet to play another round, so the fact that they did made us think that at the very least our idea has potential.

    Number of Rounds Played Per Person
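
    Numbers like these fall straight out of the per-play rows. A minimal sketch, assuming each uploaded play is reduced to just its player ID (the function name and the sample IDs in the usage note are invented for illustration):

```python
from collections import Counter


def rounds_summary(player_ids):
    """Summarise rounds played, given one player ID per uploaded play.

    Returns (share of players with more than 1 round, average rounds per player).
    """
    rounds = Counter(player_ids)  # player ID -> number of rounds played
    players = len(rounds)
    if players == 0:
        raise ValueError("no plays recorded")
    repeat_share = sum(1 for count in rounds.values() if count > 1) / players
    average_rounds = sum(rounds.values()) / players
    return repeat_share, average_rounds
```

    Calling rounds_summary(["a", "a", "a", "b", "c", "c"]) on this toy input reports that two of the three players came back for another round and that the average is 2 rounds per player.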

    That’s great, but we had clear goals going into this, so how did we do? Well, that’s the not-so-good news.

    1. The blowing gesture should be easy to learn but hard to master

    Once you know what you’re doing, it should be easy to score at least 50 points in the prototype. Therefore we defined a player who has learned the gesture as someone who consistently scores 50 points or more.

    Looking at the data, we could see there was a lot we had to improve here. Only 55% of players managed to learn the gesture according to our definition. In the final version, we would like this figure to be near 100%.

    Examples of Players Learning Our Gesture
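
    The definition above can be turned into a small check. This is only a sketch: the 50-point threshold comes from our definition, but reading "consistently" as 3 qualifying rounds in a row is an assumption made for this example, as are the function names.

```python
LEARNED_THRESHOLD = 50  # points, from the definition above
REQUIRED_STREAK = 3     # assumption: "consistently" = 3 qualifying rounds in a row


def has_learned(scores, threshold=LEARNED_THRESHOLD, streak=REQUIRED_STREAK):
    """True if the scores (in play order) contain `streak` consecutive rounds
    at or above `threshold`."""
    run = 0
    for score in scores:
        run = run + 1 if score >= threshold else 0
        if run >= streak:
            return True
    return False


def learned_share(scores_by_player):
    """Fraction of players who learned the gesture under the rule above."""
    if not scores_by_player:
        raise ValueError("no players recorded")
    learned = sum(1 for scores in scores_by_player.values() if has_learned(scores))
    return learned / len(scores_by_player)
```

    A different reading of "consistently" (a longer streak, or a share of all rounds) would only change the has_learned rule; the per-player share is computed the same way.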

    What was even more troubling is that some players who did learn the gesture had difficulty maintaining a good score. A player mastering the gesture should show a clear pattern of improvement, but 37% of the time, players scored worse in a round than in their previous one – yikes!

    Change in Player Score Per Round
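
    For anyone wanting to compute a "scored worse than last time" figure on their own data, here is a minimal sketch. It assumes scores are grouped per player in the order they were played, and that only consecutive rounds of the same player are compared; the function name is made up.

```python
def regression_share(scores_by_player):
    """Fraction of round-to-round transitions in which the score went down.

    scores_by_player maps a player ID to that player's scores in play order.
    """
    worse = 0
    transitions = 0
    for scores in scores_by_player.values():
        # Pair each round with the same player's next round.
        for previous, current in zip(scores, scores[1:]):
            transitions += 1
            if current < previous:
                worse += 1
    if transitions == 0:
        raise ValueError("need at least one player with 2 or more rounds")
    return worse / transitions
```

    Players with a single round contribute nothing here, since they have no previous score to compare against.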

    The highest score recorded was 860, out of a maximum of 1000. We thought the results were pretty clear – this gesture is too complicated and too hard!

    Because of these results, we decided to:

    • Reduce the number of things the computer looks at when rating a blow, so that the criteria are less strict and a wider range of gestures receive positive ratings.
    • Make it easier to get a decent score by being more lenient in the way we award points.

    They say a good designer doesn’t just know how to add features but also when to remove them, and we think this is a case where reducing the complexity of the system made for a better experience.

    2. The blowing gesture should give the player a score proportional to how well they think they did

    There are few things more frustrating when playing a game than thinking you nailed something, just to have the game tell you: “that sucked!” Conversely, you know when you screwed up, and if the game rates you highly anyway, it makes it look like a dumb machine.

    To test this, we wanted to ask players to rate themselves and then show them the score the game assigned, so we could measure how closely the machine’s ratings matched the players’. However, when we tried out this idea, we realized the game didn’t yet have enough visual or audio feedback for players to be able to judge themselves. There was nothing to let players know whether a gesture was good or bad, and without context the score was meaningless to them.

    This is still something we think is important to get right and test, so we shelved the idea for a future date when we’ve been able to integrate more feedback systems.

    3. The blowing gesture should give a variety of scores

    This was probably the easiest goal to get right. We could see that we succeeded here because the numbers were all over the place!
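
    "Hard to get the same score twice in a row" can also be put into a number instead of eyeballing it. A rough sketch, again assuming per-player score lists in play order (the function name is invented for this example):

```python
def immediate_repeat_share(scores_by_player):
    """Fraction of consecutive round pairs in which a player got the exact
    same score twice in a row. Lower means more variety."""
    repeats = 0
    pairs = 0
    for scores in scores_by_player.values():
        for previous, current in zip(scores, scores[1:]):
            pairs += 1
            if current == previous:
                repeats += 1
    if pairs == 0:
        raise ValueError("need at least one player with 2 or more rounds")
    return repeats / pairs
```

    A value close to 0 would back up the impression that scores rarely repeat.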

    Other Lessons Learned

    • The importance of starting metrics early – if we hadn’t done this, how long would we have gone on thinking everything was all right?
    • The importance of sending out your prototype to a small group to find the most critical errors first. We had just a couple of friends try it before sending it out to the larger group, and those few found some obvious big mistakes that we hadn’t been able to see. Once they were identified, we were able to fix them quickly, and as a result we think we collected much more meaningful data from the prototype.
    • Seeing so many people struggle to grasp the basics of your game is a fantastic motivator. As the results started to pour in, we got to work on improvements right away and stayed up late implementing them.


    May 21st, 2011 | Wolfgang | 2 Comments

About The Author

Wolfgang Graebner

I'm half of Funstorm - I do the programming and business stuff.

2 Responses and Counting...

  • test84 05.21.2011

    Lovely, It would be lovely if you would post a blog post about how you put data into SQL database with flixel.

  • I’ve actually started using http://www.playtomic.com/ for this since. Their servers work really well and they have some super neat features – way more powerful and easier to set up than my ghetto solution!
