Friday 15 May 2015

PyCon 2015

Last month I was lucky enough to attend PyCon 2015, the largest gathering of the Python developer community, held in Montreal, Canada.  I've been wanting to write about it for a while, but sadly uni exams and project deadlines have been keeping me busy!

I was able to attend thanks to the very generous financial sponsorship offered by the conference, in particular I received a grant that covered the cost of my flight through PyLadies (who also organise a meetup I've been attending for a while, and how I discovered that this grant was available).  I think the generosity of the financial aid package shows a commitment from the Python Software Foundation (the organisers) to improving the diversity of conference attendees.  I believe this year was the first time that the female contingent of the conference reached about 1/3 in terms of conference attendees AND presenters, which is an amazing achievement.  Some fellow London PyLadies were also there and presented posters on Clojure and a Raspberry Pi game that emulates "Simon"!

It goes without saying that I met a lot of cool people and learnt tons about Python and programming in general.  I had no idea that so many companies used Python (Yelp, Eventbrite, Dropbox, LinkedIn, Survey Monkey to mention but a few "household" names....there was also a host of data science companies I had not heard of doing really exciting work), it was fascinating talking to them about how they used the language.  Montreal is also a lovely city to walk around (other parts in Quebec and Ontario were fun to visit too - check out my travel blog to see some photos).

Palais des Congres, a pretty cool conference venue!


Conference Sessions that were awesome


Keynotes

Conference attendees were treated to several inspirational keynote speeches at the start of every conference day.  Python creator, Guido Van Rossum spoke about how we wanted everyone to fully migrate to Python 3 within 5 years (a show of hands in the room suggested the majority were still using Python 2.7 exclusively) and he wanted at least 2 women on the core development team (currently there are none) by the end of the year, even if he has to train them himself!  Gabrielle Coleman gave a very interesting talk on hacker activism and the research she had done into the Anonymous group - her book, Coding Freedom on the "ethics and aesthetics" of hacking is definitely on my to-read-list.  Catherine Bracy, (director of Code for America -an organisation that hosts meetups for people to hack code that helps public sector services,  she also previously worked on Obama's re-election campaign), gave a very uplifting talk on how developers can contribute a lot to their local community, as many public sector services are still in the dark ages when it comes to their use of tech.

My favourite keynote came from Jacob Kaplan-Moss, one of the creators of Django (this is how he is often described, but he really downplays his involvement).  His talk was about how the "talent myth" drives people away from tech - the fact that you have to be a "ninja"/"rockstar" programmer to succeed (phrases that are often seen on job specs). By definition of the word, "average", the vast majority of people are average programmers, but this is something that is perfectly acceptable (he in fact said several times in his talk "I am a mediocre programmer" and he doesn't understand the hype surrounding the fact he supposedly created Django) - other careers don't expect everyone to be the best at what they do.  He believes that in tech, the expectation for everyone who works in the industry to be the next whizz kid or to do many side projects in their spare time is intimidating (although most people work on side projects because they are passionate about it, not because they are pressured to do so) and off-putting for people who are considering learning how to code.  The keynote was received with a well-deserved standing ovation, which shows that the message really resonated with the majority in the audience.



Favourite Talks:

You can click here to see just what a varied and action-packed schedule the conference had and what a difficult choice it was to go to just one talk out of the 4-5 that ran concurrently during the main conference days.  Luckily, all talks were recorded and anyone can check them out online at this link - I have a list of over 10  to catch up on that I am still working through!  I recently watched a fantastic talk from the website that I wasn't able to catch in person, called "Facts and Myths about Python Names and Values", going through what happens when you assign variables and why sometimes the behaviour you see isn't what you expect, particularly with mutable (e.g. lists) versus immutable (e.g. strings) objects.  "Usability Testing on the Cheap" was also a very interesting non-language specific talk on effective methods of collecting feedback and designing products around user experience.

From the talks I attended at the conference, I really enjoyed "Type Python, press enter. What happens?" - this was about what your computer does "under the hood" when you type python into the command line, and how the terminal deals with commands like Ctrl+C and Ctrl+D to end the program.

"Virtualenv for New Pythonistas" was also both very interesting and highly useful - it was about why it makes sense to set up a virtual environment when you are coding, not just in case you break your machine, but also if you are working on multiple projects on the same machine that use different versions of things/setups.  I had seen people use virtualenv at the code dojos and hackathons, but never fully understood why - this talk answered questions for me that I had felt too embarrassed to ask!

Another talk I really enjoyed was "Data Science in Advertising", given by a data scientist from Yelp (a website that provides reviews of local services, mainly restaurants).  This was not python-specific, but more about the challenges that Yelp faces in deciding how much to charge businesses for advertising space on their website, versus making their website helpful and non-spammy for their users - e.g. if I search for Korean restaurants near me, how should they prioritise the order in which search results are displayed?  As a user, I want to see the ones nearest to me with the best reviews, but further away/less well-reviewed restaurants may be willing to pay Yelp more to be higher up on the search results.  Yelp as a company want to maximise profits, but they also want to retain their users by providing the most useful information.  This talk gave an interesting insight into some of these issues, and also how they look at seasonal/event-based search trends to give local businesses more information (e.g. when there is a football game going on, searches for "pubs" are likely to be higher).  You can check out www.yelp.co.uk/trends to see the search trends by category and location.


Other Personal Highlights:


Coding Challenges: 

Booz Allen Hamilton (a management consultancy) ran a "capture the flag" event where there are several problem solving puzzles worth different points based on the difficulty and you are given 30minutes to score as many points as possible.  They ran a competition twice a day throughout the conference, with the highest scoring participant in each session winning a Galaxy Tablet, so I was very annoyed that I only found out about it when they were running the last session!  I managed to place a decent third though!  Thumbtack (a US consumer service company) were one of the few event sponsors that were only giving away swag upon successful completion of a mini coding challenge (see picture below) - I used the itertools library to iteratively create all the permutations and check whether they reached the solution and then proudly claimed my Thumbtack tshirt, sunglasses and beer mug.

Getting a free copy of the Python Cookbook from the O'Reilly stand

I just happened to be in the right place at the right time, and was one of the 25 lucky people to get the latest copy of the Python Cookbook, signed by the author and Python community legend David Beazley.

Meeting the BDFL

Before I attended this conference I had not even heard of this acronym - it stands for Benevolent Dictator For Life - I learnt about this acronym while at breakfast, chatting with some people I just met.  They were using it to refer to Guido Van Rossum, the creator of the Python language.  The next day, (again at breakfast), the BDFL himself happened to be hanging out at a nearby buffet station (it even says BDFL on his lanyard) so I seized my opportunity for a selfie.  He is a pretty chilled guy, and it's actually very easy to spot him during the conference, he is often milling around!

Sponsor Workshops

Free tutorials/workshops are given a day or two before the conference officially starts.  I went to Heroku 101 and Effective Code Review with Code Climate - both were very interesting and gave me a bit more insight into the processes involved with working as a professional developer.  Like many other Pythonistas I came across during the conference, the speakers were very friendly and approachable, encouraging people with any questions big or small to get in touch after the workshop.  In general, the conference gave me a lot of exposure to things/issues related with working as a developer that I don't get to see on my uni course  -I had never heard of continuous integration servers, barely understood what QA (Quality Assurance) and TDD (Test Driven Development) involved and that IDES other than IDLE was available (lol).

GPG Key Signing Party

GPG stands for Gnu Privacy Guard and is an open-source cryptographic software that is used to encrypt messages/data files.  Hopefully I will have time to cover this more in depth in a different blog post, but in short, your GPG Key is a way for people who want to send messages to you, to encrypt them for your eyes only and it is linked to your email address, and a name that you provide.  However, linking a GPG key to an email address still does not authenticate that you are who you say you are - e.g. I could make an email account called mark_zuckerberg_rocks@gmail.com (perhaps this is already taken) and link it to a GPG key, but unfortunately that doesn't mean that I am the creator of facebook.  One way to authenticate that an email address and corresponding GPG key belongs to the person's name is for others to "sign" the key, sort of like giving their stamp of approval.  One geeky way to get many people to sign your GPG key is to attend a key signing party, where other GPG afficianados turn up with their official ID documents (their actual passports/driving licences etc), show you that it matches the name that their GPG key is registered under and you use your own key to "sign" their key, which is then registered in a server.  When other people are curious about the authenticity of this person's key, they can see how many people have signed it to get an idea of how likely it is that they are who they say they are (and also how many key signing parties they have been to!).  I learnt so much about encryption and enjoyed the key signing event so much (felt like a spy!), that I might actually try to organise one myself at a meetup.  For more information about key signing, see the very helpful page written by the person who organised the key signing party at PyCon.


Conference Top Tips:


1. Pack Light:

You will be bombarded with free t-shirts, tote bags, umbrellas, books, usb extensions, stickers, rubix cubes, foam models of dragons.....there is a lot of opportunity to pick up cool swag!

2. Introduce yourself to a stranger:

The Python community is full of friendly and approachable people.  There was such a positive buzz at the conference because everyone was so excited and happy to be there (or was that just the endless supply of free coffee?).  I met so many interesting people, even if it's just striking up a short conversation in the lunch queue.

3. Apply to give a poster session/ lightning talk / a full-on talk:

It is a myth that to give a talk at a conference, you have to be the world expert on the topic.  All you need is enthusiasm for the subject and the public speaking skills to deliver the message.  Many interesting talks were given by people who thought of a topic that they themselves wanted to find out more about,  and researched more thoroughly after finding out that their proposed topic had been selected as a talk!  Neither does the talk have to delve into the intricate technicalities of coding - talks are tagged with Novice/Intermediate/Expert background knowledge assumed.  By presenting a poster or a talk, I think you are giving something back to the community and it's a great way to put yourself out there!

4. If you can't afford to go, apply for a grant:

There's no harm in taking a few minutes to fill in a form, write about how much you would love to attend, what you hope to learn.  I did it for PyCon back in late 2014, without any expectation of receiving a grant, so it was a pleasant surprise when in January I received an email telling me I'd gotten it!  I believe EuroPython 2015 (INSERT LINK) also have a financial aid scheme, and I've heard that generally other conferences do sometimes have the ability to award travel grants, especially for students or those who are doing a poster session or presenting a talk.

Sunday 15 February 2015

Programming and Pizza Perks

Java is one of the most commonly used programming languages in the world and derives its name from the fact that programmers are apparently addicted to coffee.  While I have not seen much evidence of this (maybe because the labs at Imperial forbid you to bring in hot drinks), I can say that ever since I started programming, my consumption of pizza has gone up exponentially.  In fact, I'm sure I can produce a graph that charts the progression in my programming skills with weekly pizza consumption that would indicate a significant level of correlation - surely there must be causation in there somewhere!

Almost every tech related meetup or event I have attended is catered with pizza.  It is convenient and easy to eat, the vast majority of people like it, you can have great vegetarian choices and toppings to satisfy meatlovers alike, it is probably quite cheap and doesn’t taste too bad when slightly lukewarm either.  There is already a programming language called Pizza but it definitely ought to be more mainstream!

In an alternative post to one about programming, I thought I would write a review of the best takeaway pizza that I’ve had at programming meetups! I would also like to give a shout out to Women Who Code and PyLadies for bucking the trend and providing delicious non-pizza catering (as well as desserts!) at their most recent meetup!

Basilico 


The codebar meetup is particularly fond of ordering from Basilico, partly because they are careful to cater to gluten-free and vegan diets.  This is how I got to try vegan cheese for the first (and last) time (I promise I was not stealing food from vegans - the vegan Margherita pizza slices I snaffled were unloved and would have otherwise been destined for the bin). For anyone who has not yet had the pleasure, it is an interesting shade of highlighter yellow, has the same consistency as very cheap cheddar and the only similar taste to real cheese is general savouriness.  Luckily, Basilico’s normal pizzas are otherwise delicious - their sundries tomatoes can be slightly acidic and too tangy, but their white pizzas  are delicious, the slice sizes and topping distribution are just right.  They also do delicious and very healthy juices and salads (again, kudos to codebar for ordering these extras!).

Dominos


The classic pizza delivery chain has improved a lot in recent years - they have great variety in both their meat and veg toppings.  I especially like their Vegi Supreme and Vegi Volcano pizzas - they have the right idea with their meat pizza topping combinations, but the quality of the meat can't really compete with the more gourmet meat options offered by Basilico, Lupa and Franco Manca.  Although the base is not too sophisticated, and the cheese is just a tad on the rubbery and greasy side, you know that you are going to get a satisfying meal.  Dominos also do something unique with their pizzas by supplying a dip  - die hard pizza afficianados may argue that a good crust and high quality toppings need no extra dressing, but I say having a garlic or barbecue dip can only be a good thing!

Franco Manca - BEST BASE AND BEST MEAT


It is almost worth ordering their pizza just for the delicious sourdough base, which apparently takes 20 hours to rise and has its roots in a sourdough starter that is 500 years old?!.  The quality of ingredients used for the toppings (especially the sausage and cheese) is amazing, although a few more vegetables would be nice.  The reason why Franco Manca is not my top choice is because of the topping distribution - the coverage is far too sparse and they cut their slices very unevenly so some slices are basically margheritas.  Quite often the toppings are not firmly attached and roll off easily when you are trying to eat and mingle at the same time!

Lupa - BEST ALL ROUNDER


In terms of taste, Pizza Lupa is my favourite and I have thoroughly enjoyed all of the toppings I have tried.  I especially like their goat's cheese, sausage and roast vegetable toppings (not all on one pizza!), but to be honest their wide range of pizzas are all top notch. The base is inconveniently floppy but the taste more than makes up for this, and I find their pizzas are especially resilient to falls in temperature - the cheese doesn't go rubbery when cold!  One thing I would say is that they need to cut their slices A LOT more - they are absolutely huge!!!  I'm one of those people who loves trying a little slice of lots of pizzas, but one slice of their large pizzas (circa 17 inch) is not too far off half of a 12 inch pizza!  Also, Pizza Lupa if you are reading this - although it doesn't affect the quality of your produce, I would suggest you hire a better web developer/designer!

Pizza Hut


There’s not much to say really -  their pizzas are solid but unspectacular, heir range of toppings satisfy most palates, but are not particularly imaginative and in the absence of special offers, they are not great value for money.  I get the impression that the company is more focused on their restaurants, where they emphasise that they are also great at doing salads and pastas (remember when they changed their name to Pasta Hut for a while?).    One especially noticeable difference with other takeaway pizzas is that the base is relatively more rigid, which improves ease/tidiness while eating!

Voodoo Ray’s - BEST WEBSITE


I have only had Voodoo Ray's once at a tech meetup and they are a solid contender, with interesting and slightly unusual topping combinations.  They are better known for their Dalston branch and van which pops up at various food markets, where they specialise in selling pizza by the slice rather than in whole units - a concept that I'm very supportive of.  Some online reviews do suggest they are on the pricey side though, so perhaps that's why not many meetups order from them. However, their very colourful and eye-catching website (I guess they are based in Dalston after all!) is well worth a visit (and bookmarking for future web design ideas!)

Other Pizza that I paid for myself and have thoroughly enjoyed:


Monday 19 January 2015

Seven Deadly Programming Sins

Now that mid-January has passed, is it too late in the year to greet people with "Happy New Year" or think about New Year's resolutions?  On the latter, I like to think that it is never too late to turn over a new leaf (or at least have intentions of doing so).  Given that this time of year is often associated with detoxing our minds and bodies, I thought I would do a blog post about programming sins (very loosely related to the original Catholic sins!) that we might like to cast aside in 2015.

Pride

Disclaimer: Despite being a proud geek, I'm not actually a trekkie - Star Wars all the way!

Often considered to be the most serious of the sins and the source of all other sins, this is no less true in the world of programming.  Overconfidence or underestimating the needs of the problem at hand can have disastrous consequences later on, as you realise you have not taken into account all of the potential special cases/situations possible!  It can be all too tempting to get stuck in straight away without reading all of the specification in detail and drawing out a plan of action.  Taking a humble step back and a few minutes to model a problem with pen and paper, can save hours of potential refactoring further down the line (see a previous blog post on how to problem solve).

Checking your code before hitting the Enter key is also vital - a commonly given example is when using the command line to delete files on Unix terminals:  "rm *.doc -f " means remove (rm) all files that end with the .doc file extension and don't bother asking if I'm sure I want to do this (-f).  However "rm * .doc -f" (note the space between * and .doc) means remove ALL FILES, then remove any files that are called .doc, and oh yeah, don't bother checking whether this is what I really want to do!!

Overconfidence can also lead to forgetting to save your code (or commit changes) frequently!  Computers don't crash these days, so this is never a problem, right?!

Sloth


A lack of attention to detail or leaving it up to the compiler or your code editor to detect errors can be the root of many debugging frustrations! Whether or not you include that one little character can make all the difference and could come back and torture you for the next hour as you attempt to figure out why your code isn't working (a particularly bad state to be in when under exam conditions, as I recently discovered!!!).  From time to time, I'm sure all C programmers have been caught out at one time or other by one of these bugs...

  • while(x=y) This statement will always evaluate to true, as the variable x is being assigned a value y.  If you want to compare whether the value of x equals y, you need that all-important second equality sign!
  • for(x=0; x<100; x++);  That extra semi-colon tells the compiler that this command ends here, so it means that the for loop just increments x 100 times without actually stepping into the code that you have written in the body of the loop!
  • Passing input parameters by value rather than by reference: If you want the function to make alterations to your input parameters, rather than just making a copy of it, you need to remember to add an '&' before the parameter in the function definition!

Vanity

Ok, it's really hard to find a funny picture on vanity!

A friend once told me that they have heard it being said that "premature optimisation is the root of all evil".  Related to pride, a desire to make your code look elegant or "clever" combined with failing to understand the requirements of the specification can lead to misery, frustration and obfuscation.

I was also once the victim of vanity in a different way - after finishing writing a simple Python program, I thought I would try to tidy up the code a bit to make it look prettier - spacing out functions and comments a bit more...all the while forgetting that whitespace actually matters in Python!!! Needless to say, it took me quite a long time to work out why my program was no longer working, but it was a tough lesson learnt in how spacing affects code in Python!

Gluttony (for memory!)


Getting the hang of memory management is an integral part of C++ programming.  Computer Science students dread the "Segmentation Fault" error when doing their coursework on manipulating pointers, caused by their program trying to access areas of memory outside of the allowed bounds of their program, or an infinitely recursing program that uses up all the available memory space while running.  Using uninitialised variables can also lead to confusion - consider this code:
int x; x++; cout<<x;  No value for x is declared when the variable is created, but x is likely to hold some value - whatever previously happened to be at the memory location the computer has assigned x to, so printing/accessing/using the value of x will not fail - instead it will just be gibberish.

None of these issues are caught at compile time, but become apparent at run time.  Of course, failing to free up dynamically allocated memory is also a cardinal sin, although many modern computers "forgive" this by automatically freeing up any memory allocated on the heap when the program exits.  This issue can be avoided all together if you program in a language like Java, where unlike C++, memory management is taken care of for the programmer....this has always made me slightly suspicious of the language, as if it were leading the coder into temptation...

Repetition 


If you have lots of repeated sections in your code, there is almost certainly a way to redesign it more efficiently.  Let the computer do the repetitive work - that's what it's there for!

Obfuscation

Perhaps nobody deliberately goes out of their way to make their code difficult to understand (unless of course you are entering into a Code Obfuscation Competition), but it is worth taking the time to either comment your code or dedicate that time to making your code clearer for the sake of the sanity of your fellow programmers.  Sometimes when we have been given coursework, I often think there is clearly one obvious and mainstream way to solve the problem - this is simply not true.  The variety of ways in which people tackle problems is vast (and in itself very interesting, as well as something that is great to learn from).  At the same time, it is difficult for someone else to put themselves in your shoes, see the world from your perspective and adopt your way of thinking.

Interdependency 

I think this is the least serious of the sins I have discussed here, mostly because sometimes it is unavoidable.  Ideally, there should be as little dependency between functions or between classes as possible, with each section of code being allocated a certain task that it can carry out on its own.  Not only is this more efficient, but it should also enable speedy refactoring of your program if the need ever arises. However, sometimes you just can't avoid needing to have two classes that simultaneously rely on each other - in that scenario, all you can do is make sure you have your class declarations the right way round - I have found myself referring to this Stack Overflow link more than a few times!


This is by no means a comprehensive list of programming sins, and some may have noticed that of the seven original deadly sins , I did not incorporate Wrath, Lust and Envy, although these are emotions that I have often experienced while doing coursework (i.e. "WHY DOESN'T THIS CODE COMPILE?! GRRR, sometimes followed by asking a fellow student "How did you get it work on your machine?!").

Saturday 8 November 2014

A Hitchhiker's Guide to Hackathons

This summer, I went travelling in Australia and New Zealand, and did some pretty crazy stuff - helicopter rides, bungy jumping, white water rafting, skydiving...it was scary at times, but I survived and loved the adrenaline rush.  I guess our bodies reward us with the adrenaline rush because we're about to do something that looks like it might kill us, and we are happy not just because we stayed alive, but also because we feel like we have achieved something by conquering the fear.

Although not quite as extreme, I had the same feelings of trepidation, excitement, stress and sense of accomplishment when attending a hackathon for the first time.  Last month I went to CodessHack and Hasbro Play/Hack, click on the links to find out more about these events or go to the bottom of this post for a bit more blurb on what they involved and what my team(s) built!

As a hitchhiker's guide to hackathons for the newbs, let's start with the basics...

What is this hackathon business?

It's usually a one or two day event where a large group of people come together, form teams and work on creative projects for a particular theme/brief to try and create something new and exciting in a limited amount of time.  For example, at the Codess hackathon we had to build a web app, mobile app or game to illustrate or visualize the impacts of climate change.  And the Hasbro involved creating a new game or adapting existing ones - it didn't have to be something digital, in fact quite a few people made some pretty cool prototypes out of cardboard!

Do you have to be a programmer to enter?

No - there are some super-serious highly-competitive hackathons out there that are more suited to experienced coders, but the ones I have been to warmly welcomed newbs and it was still possible to make a meaningful contribution to the team.  I helped out with the project management side of things, and you can create a lot of cool looking stuff with minimal programming experience by adapting existing templates to your needs!  With the help excellent documentation for Javascript libraries, HTML5, and other things like Google Charts, Google Maps etc, you can make a nice little web app.  Knowing about the existence of these resources and what is or isn't possible, rather than deep programming knowledge is enough to get you started on your project.

So how do these things work?

The general format is that a theme and a project brief is sent out to participants beforehand - you are encouraged to read a little bit about the background and brainstorm some ideas, but any actual coding doesn't take place until the day of the event.  The hackathon usually kicks off with some mingling, followed by a series of pitches-  this is when people who have a particular idea that they want to work on, are given 1 minute to describe their idea to all the other participants.  After everyone has pitched their ideas, you then go and scope out a team that you might want to join.  Once everyone is in a team, the hacking begins!  Several hours (and coffees) later, the frantic coding ends and each team gives a 3 minute presentation on their idea for the judges to deliberate over.


Hackathon Vital Prep:

Sleep as much as possible the night before - you have a pretty intense day ahead of you

Read the preparatory materials - familiarity with the project brief will save on vital hacking time and give you a better idea of what kind of projects you would like to work on

Tell your loved ones you are going "off the grid" - not only does this sound cool and make out like you are doing some top secret undercover operation, but it also means your boyfriend isn't annoyed when you haven't texted him to say goodnight (even though you aren't actually going to sleep until 4am).



Bring toothpaste and toothbrush (very important due to the vast amounts of sweets and coffee that will be consumed) and spare clothes, sleeping bag, pillow (and if possible, a yoga mat as a mattress) if you are staying overnight.  I felt very apprehensive about staying the night - I decided to do it in the end to try and get the "full hackathon experience" and it was kinda cool!  It was not the most comfortable night's sleep, but I got enough shuteye.  Lots of people did go home for sleep though, and as long as everyone on your team knows that's the plan, then it's totally fine so there's no need to feel pressured into staying overnight!

Come with an open mind, ideas are not necessary.  There will be a LOT of other people pitching their ideas, so if you don't have a fully formed killer app idea in your head, it's fine!  In fact, if you pitch an idea, you should basically try to lead a team and bring it to fruition - this could be a little overwhelming at your first hackathon!


During the Hackathon:

Keep calm! You will be working with people you only just met, people who are passionate about their ideas that might be different to your own, deadlines will creep closer at a seemingly exponential rate, code will fail to compile, your laptop may freeze, you might have to evacuate the building due to someone burning their toast and end up standing outside Tower Bridge in only your pyjamas... Rise above this and remain composed - somehow, things will all work out in the end.

Fantastic Four or Fabulous Five - it seems to me like 4-5 people teams are generally speaking, the optimum sizes.  Two or three-man teams have been really successful at the hackathons I've been to, but you have to have some serious skillz to compensate for reduced manpower.  Teams with more than 5 people can find it hard to agree on things, and it gets increasingly difficult to make sure everyone is on the same page and working towards the same goal as team sizes get larger.

Cool and Simple - a simple idea that serves a specific purpose, pitched in a very cool and jazzy way (e.g. using Moovly instead of Powerpoint!) is better than an idea that aims to meet all aspects of the project brief in one go.

The Demo is your only objective - I think this picture sums it up quite well:
At the end of the day you have 3 minutes to pitch what you have been working on for the past 12 hours.  Therefore the old saying that done is better than perfect really cannot be overemphasised!  It's called a hackathon for a reason. You won't be able to demo a fully functional product and you will not have time to share the trials and tribulations you have gone through.  What you want to show is an exciting-looking prototype that is enough to give people an idea of what the full thing would look like if you had more time to build it.

Chocolate is your friend - Coffee is good, but I find white chocolate mice are best for keeping me going!  How do I avoid crashing after a sugar high? - maintain the consumption of chocolate at an even rate throughout the hackathon and everything will be just fine! (Disclaimer: this does not constitute medical advice).

Best of luck and keep on hacking....  :)

Stuff that I worked on

Check out my github page for the gory details (warning: it might be a bit messy!)

CodessHack: Codess International Women’s Hackathon Oct 2014:  Climate Change Data Challenge for the Nature Conservancy.  Created a web app to visualise the forecasted evolution of rainfall and temperature by country and continent in 10 hours using Python and Google Charts.  Click on this link to see an animated graph - the individual circles are countries, colour coded by continent.

Hasbro Play/Hack: Created a prototype of an online game aimed at girls with Science, Technology, Engineering and Maths themes, using Google Maps API and Kinetic JS.  You can find the github repository here.

Sunday 5 October 2014

Collatz Sequence: Euler 14 Problem Walkthrough in Python

Have you heard the statistic that only 1 in 5 developers is female?  Many people in the world of tech have certainly taken notice of this and there is an abundance of meetups, initiatives, networking groups and email lists to encourage more women to get into coding (I mentioned one called Codebar in a previous blog post).

I went to one of these meetups recently called Pyladies (many thanks to them and Mozilla for organising a lovely evening!) and that's where I came across the problem set that I'm going to tackle in this post (as with previous posts, you can find all my code uploaded at https://github.com/ttz21).

The following iterative sequence is defined for the set of positive integers:

n -> n/2 (n is even)
n -> 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:
13->40->20->10->5->16->8->4->2->1

13: 9
40:8
20:7

It can be seen that this sequence (starting at 13 and finishing at 1) contains
10 terms. Although it has not been proved yet (Collatz Problem), it is thought
that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

In case you are not familiar with Project Euler, it's a series of mathsy programming problem sets which you can work through to help develop your coding skills and this problem is number 14 of Project Euler.  



After talking about the recent Python conference and other very interesting-sounding Python-related meetups, as well as munching through some pizza, we got down to business.  It was pretty obvious that we could solve this thing using the brute force method and go through all the numbers between 1 and 1million to find the answer, but this would not be the most efficient solution by any stretch of the imagination!  Still, we were curious to know how long the brute force method would take...

Solution Attempt #1: Brute Force

The first line of attack is to write code that implements the "if the number is even, do this..., but if it's odd, do that..." rule : 

def use_rule(n):
    if n%2 ==0:
        n /= 2
    else:
        n = 3*n+1

    return n

Simple enough!  And next, we need a process that will keep using this rule until the number collapses (or should I say.. "collatzes"!...ok that was a terrible joke, but jokes about the Collatz sequence are very limited) to 1.  Then finally, we just ask the computer to do this a million times, and keep a track of which number gives us the longest chain.  The parameter highest is the largest number we want to iterate to, and so the range of numbers we calculate over is from 2 to highest+1 (Python's "for" loops start from the first number in range() and goes up to, but not including the second number).

def brute_max_chain(highest):
    max_count=0
    max_value=1
    for i in range (2, highest+1):
        latest = i
        chain_ length =0
        while(latest != 1):
             latest =use_rule(latest)
             chain_ length +=1

        if(chain_ length >max_count):
            max_count=chain_length
            max_value=i
    return (max_value, max_count)

On my embarrassingly old Windows 7 laptop, this brute force method took almost 90 seconds (using Python's time functions as a stopwatch).  Since the meetup, I have gotten my hands on a shiny new 2.8Ghz 16GB RAM Macbook Pro and with this laptop it takes just under 30 seconds!  Who needs to write efficient code when you can just buy a better machine?!

Whichever machine is used to run the code, the answer for the number with the longest Collatz sequence is the same: 837799 (with a chain length of 524 - the way I have calculated the chain lengths is the number of steps taken to get to 2, so the chain length is 1 less than the way the problem description above counts the chain).

Can we improve on this?



Solution Attempt #2: The dictionary method

When trying to think of a way to improve on the brute force method, one route that came immediately to mind was storing information on the chain length for the other numbers we encounter during the chain calculation.  In the problem description, the example given calculates the chain for the number 13, but while we are doing this series of calculations, we are also discovering what the chain length is for 40, 20, 10 etc etc.  Therefore, there is no longer any need to do the Collatz iteration rule over the other numbers that we experience along the way while we are doing the calculations with the number 13.   Similarly, for any future calculations that we do for higher numbers, if their chains ever reach 13, we now know that they are 8 steps away from collapsing to 1 and there is no need to carry on with the iteration.

Those were the thoughts running through my head while trying to improve on the brute force method.  Bear in mind I had not touched Python for quite a few months, having been concentrating on doing C++ pre-course reading for my Masters, and given I was at a Python meetup,  I was trying hard to come up with a "Pythonic" solution.  I vaguely remembered from my Udacity courses that dictionaries are a data structure unique to Python - they are a set of key:value pairs, where the keys are unique and they do pretty much what it says on the tin.  You can look up a key in the dictionary and it returns the value associated with that key.  They are a very nifty, highly intuitive way of storing and fetching data.  

The method I coded using a dictionary is not as clear as it could be, but I shall try to explain my thinking!  We start off with an empty dictionary called known_chains  - this is where we will store numbers (the dictionary keys) and their corresponding chain lengths (values associated with those keys).  

Similarly with the brute force method, we put in a "for" loop that goes from 2 to highest+1.  Remember that each time we use the Collatz rule on a number, we create a chain of numbers and during the process, we also discover the chain length for those numbers.  I have used another dictionary called  new_knowns to track the chain length for the numbers in the current chain, and we can add the new_knowns dictionary to known_chains once the sequence collapses to 1.

Because we now have a dictionary of known_chains, if the Collatz rule leads us to a number for which we already know the chain length (i.e. if latest in known_chains), we can use that knowledge instead of carrying on iterating with the use_rule() function.  Otherwise, if the latest number is not in the dictionary of known_chains, then we carry on using the use_rule() function (and incrementing the chain length values for the numbers in new_knowns) or until we hit 1 or a number that is in known_chains.

def dict_max_chain(highest):
    known_chains = dict()
    max_count=0
    max_value=1
    for i in range(2,highest+1):
        new_knowns=dict()
        latest = i
        while(latest != 1):
            if latest in known_chains:  
                for entry in new_knowns:
                    new_knowns[entry] += known_chains[latest]
                latest=1
            else:
                new_knowns[latest]=0;
                latest =use_rule(latest)
                for entry in new_knowns:
                    new_knowns[entry] += 1

        for entry in new_knowns:
            known_chains[entry] = new_knowns[entry]
            if new_knowns[entry] > max_count:
                max_count= new_knowns[entry]
                max_value = entry
     
    return (max_value, max_count)

For example, let's say we are calculating the chain length for the number 13 for the first time.  An entry is made into the new_knowns dictionary of 13:0 as we start counting.  If we apply the Collatz rule once, we get the number 40 - let's say we have not yet calculated the chain length for 40 - because we have used the rule once, we increment the dictionary value for 13:1 and we also add 40:0 to the new_knowns dictionary.  Next, we apply the rule once more and hit 20 - again, let's say that this is not already in the dictionary of known_chains, and so we enter 20:0 to the new_knowns dictionary while incrementing the values for the previous numbers in the chain (i.e. we now have 13:2 and 40:1).  

This time when we apply the rule (and increment chain length values...i.e..13:3, 40:2 and 20:1), we end up with 10.  This should definitely already be in the dictionary of known_chains (as 13>10) with a corresponding value of 5.  Therefore we add 5 to all the values in the new_knowns dictionary (i.e. 13:8, 40:7, 20:6) and we can end the while loop, as we no longer need to do any calculations relating to the chain that started with the number 13.  All that's left for us to do now is add these new_knowns to the known_chains dictionary and while doing so, check if any of the chain lengths exceed what we currently have recorded as the maximum chain length count.

Phew! That was a long explanation.  Thankfully it didn't take nearly as long for the computer to do this calculation - my Macbook clocked it at just under 5 seconds.  That's an improvement on the brute force method by a factor of 6!  But surely we can do even better...?

Solution Attempt #3: The list method

So Python dictionaries are super easy to use, but it turns out that they tend to be more useful when the keys are unordered, e.g. if you were adding words in no particular order.  For our purposes, we are filling in chain lengths corresponding to the numbers from 2 to 1,000,000  and although we are not discovering chain lengths in numerical order,  numbers do inherently have an order!!! (Unless you are in a black hole or something crazy like that...then you probably wouldn't care that much about the Collatz sequence). 

When you want to store data in an ordered way, we can use a list in Python (sort of like an array in other programming languages).  Instead of starting out with an empty dictionary, you can start out with a list that has 1million entries of the number zero (I've kept the name known_chains in the code below).  That sounds like a pretty long list, but it still uses less memory than a dictionary with a million entries.  The ith entry in the list can then correspond to the chain length for the number i - if the ith entry is zero, we know that we have not yet calculated the chain length and as we discover chain lengths, we can add them to the list using an expression that follows the format: list[i] = chain_length.  

Another reason why the dictionary method is slower and uses up more memory is that it stores chain lengths for numbers greater than 1,000,000.  We are only interested in finding the chain length for numbers below a million, but while calculating a chain, there is no restriction on the size of numbers in the chain.  If we are using a list with a pre-defined size however, we cannot add entries for an index, i that is greater than this pre-defined size (i.e. 1,000,000).  This is why there are extra if conditions in this code that tests whether a number is higher than a million.

The code otherwise works in a very similar way to the dictionary method.  Instead of having a new_chains dictionary, we store the numbers in the current chain in a list called current_chain_numbers.  As we calculate numbers along the chain, we append them to this list and we keep going until we hit 1 or a number that we already know the chain length for.  When that happens, we go through the numbers we have in current_chain_number and add our new knowledge of chain lengths to known_chains, which also depends on the position of each number in the current_chain_number list.

def list_max_chain(highest):
    known_chains = [0]*(highest+1)
    known_chains[2]=1
    max_count=0
    max_value=1
    for i in range(2,highest+1):
        current_chain_numbers = []
        latest = i
        while(latest != 1):
            if latest > highest or known_chains[latest]==0:  
                current_chain_numbers.append(latest)
                latest=use_rule(latest)
            else: 
                for entry in current_chain_numbers:
                    if entry <= highest:
                        known_chains[entry] += known_chains[latest]+(len(current_chain_numbers) - current_chain_numbers.index(entry))
                        if known_chains[entry]>max_count:
                            max_count=known_chains[entry]
                            max_value=entry
                latest=1

     
    return (max_value, max_count)

Let's use the 13 chain again as an example, i.e. currently in the code below i=13, we have an empty list of current_chain_numbers,  known_chains[13] is still at zero and latest has been set to 13.  Therefore in the while statement, we follow the first if branch and append 13 to the current_chain_numbers list and use the Collatz rule on 13.  This means that latest is now set to 40 and we go round the while loop for a second time and again, follow the if branch (assuming known_chains[40]==0), adding 40 to current_chain_numbers and using the Collatz rule to set latest to 20.  Doing this whole process one more time means that current_chain_numbers will become [13, 40, 20] and latest=10.  However, the next time we go through the while statement, let's say we have an entry for known_chains[10] , so we now follow the else branch.  Because all the numbers we have in current_chain_numbers are below a million, we need to add the chain lengths for all of them to known_chains.  current_chain_numbers is a list with 3 entries - the chain length of the first number in the list is going to be known_chains[10]+3, the length for the second number will be known_chains[10]+2 and the length for the third number will be known_chains[10]+1.  Accounting for the number's position in current_chain_numbers is why we add the term: len(current_chain_numbers) - current_chain_numbers.index(entry).

And of course, as we are filling in these new chain lengths, we want to check whether they are higher than what we currently have recorded as the maximum chain length.

Don't ask me why, but Python can interpret lists and do things with lists much faster than with dictionaries.  This method takes around 1.7 seconds, an improvement on the dictionary method by a factor almost 3x!

But bear with me, because we can still shave off a bit more from the clock!

Solution Attempt #4: The recursive method

After writing the list method, I have to admit, I was feeling pretty smug with my sub-2 seconds solution.  However, as with many things, someone else on the internet is bound to have come up with an even better method and that's when I came across the idea of using a recursive way to calculate chain lengths.  Recursion is when a function references itself (sort of like how a Russian doll contains a version of itself, or kind of like how Ryan Gosling and Macaulay Culkin wear tshirts of each other).

Recursion....sort of.
I don't need an excuse to use pictures of Ryan Gosling in my blog anyway!
With a recursive solution, you have to have a base input so that the solution knows when to stop referencing itself.  That's why this problem is perfect for a recursive solution combined with the list method of storing previously discovered chain lengths.  We take a number, n and we keep adding 1 to the chain length (stored at known_chains[n]) and run the recursive_method() function again (after applying the Collatz rule).  We keep doing this until we hit a number whose chain we have already discovered (i.e. where known_chains[n]>0)- that number is our base input that stops the recursion.  

Similarly with the list method, because we have a list of a pre-defined size, we have to make allowances for numbers along the chain to be higher than a million.

highest = 1000000
known_chains = [0] * (highest+1)
known_chains[2] = 1

def recursive_method(n):
    if n < highest:
        if known_chains[n]: #i.e. known_chains[n]!=0
           return known_chains[n]
        elif n%2: #if the number is odd
            known_chains[n] = 1 + recursive_method(3*n + 1)
        else:
            known_chains[n] = 1 + recursive_method(n/2)
        return known_chains[n]
    elif n%2:
        return 1 + recursive_method(3*n + 1)
    else:
        return 1 + recursive_method(n/2)

max_count = 0
max_value = 1

for i in range(2, highest+1):
    chain_length = recursive_method(i)
    if chain_length > max_count:
        max_count = chain_length
        max_value = i

My Macbook Pro can get this solution in just under 1 second, so we've come a long way since the brute force method! There are things you can do to tweak the running time by writing the code a bit more succinctly and I think there are packages you can install to speed up simple calculations like these, but I'm satisfied with the clock stopping at sub 1-second.  And now for a well earned cup of tea...

Wednesday 24 September 2014

The Game of Life Solution Walkthrough

Dear blog, it has been an awfully long time since my last post, I hope you are in a forgiving mood.  During the past two months, I have actually been blogging elsewhere, but you will always be my first and most important blog site.  The reason for the absence is that in mid-July I started my one-year sabbatical from my finance job in order to do my Computer Science Masters (see my Back To School post from March) and I used the summer to go travelling in Singapore, Australia and New Zealand.  I used this blog as a photo diary of my adventures so that my parents could check that I was still alive.
Maybe I should have done a career change to become a beach bum instead....
While on my travels, I really lived life on the edge - I went on helicopters, mountain and glacier hikes, caving, white water rafting, scuba diving, bungy jumping and sky diving, but throughout my trip my laptop and university pre-course reading were never far away.  The textbook I have been working through is called Problem Solving with C++ by Walter Savitch...many people I met on my trip thought it was weird that I had brought work with me, but I had some looooong coach journeys to sit through and so far I have enjoyed working through the "homework" problems at the end of each chapter.

This blog post is a walkthrough of my solution to one of these problem sets, and what better way to resurrect my blog than with an article about the Game of Life.  Before we get into the meat of the solution, I just want to spend a bit of time comparing and contrasting C++ with Python (the programming language I had previously had most experience with).  When people who have some programming knowledge ask me what language my Masters course will use, they generally make the following face, when I tell them it will be taught in C++:


People then go on to say things like "If you can do it in C++, you can do it in any language".  I'm not sure how C++ acquired this formidable reputation, but so far, I have to say that Python was definitely easier to learn and seems to be more flexible.  As you will see with my C++ solution to the Game of Life, at the top of the code, there are some header statements  - when you are a newb you just have to accept this as given and not get too bogged down with the details of why they are there.  Also, with C++, when you use variables in your program, you have to define the variable's type in advance (e.g. whether it will be an integer, or a character etc), whereas Python sort of allows variable types to be defined as and when they are created.  Similarly, when you want to run programs you have written in C++, they must be compiled first (compiling just means the computer checks through the code you have written in your editor and converts into code that your machine can understand), but again, Python cobbles it all together while the program is being run.  The compilation stage is often useful as it brings up any errors before your program is being executed, and you do get a super satisfying feeling whenever your program compiles correctly the first time!

Ok, I've used this meme before - forgive me, I'm a Futurama fanatic


Most of the time, it doesn't compile and for me, more often than not, it's because of these darn semicolons you have to put at the end of every statement in C++!  After being used to coding in Python, it takes a while to get used to them...  My final gripe with C++ is Microsoft Visual Studio Express - ok, so it's not really related to the language itself, but this is possibly one of the least intuitive and most beginner-unfriendly development environments EVER!  Thankfully a new Macbook Pro is on its way to me and I will hopefully never have to deal with Visual Studio again!

Another way to illustrate the differences between Python and C++ is the different philosophies behind the two languages.  The bullet points below are taken from Wikipedia, and I think the way they are written gives you a real feel of the conceptual differences between them as well as an idea of what it is like to code with them.

Python:
  • Beautiful is better than ugly
  • Explicit is better than implicit
  • Simple is better than complex
  • Complex is better than complicated
  • Readability counts
C++:
  • It must be driven by actual problems and its features should be useful immediately in real world programs.
  • Every feature should be implementable (with a reasonably obvious way to do so).
  • Programmers should be free to pick their own programming style, and that style should be fully supported by C++.
  • Allowing a useful feature is more important than preventing every possible misuse of C++.
  • It should provide facilities for organising programs into well-defined separate parts, and provide facilities for combining separately developed parts.
  • No implicit violations of the type system (but allow explicit violations that have been explicitly asked for by the programmer).
  • Make user created types have equal support and performance to built in types.
  • Any features that you do not use you do not pay for (e.g. in performance).
  • There should be no language beneath C++ (except assembly language).
  • C++ should work alongside other pre-existing programming languages, rather than being part of its own separate and incompatible programming environment.
  • If what the programmer wants to do is unknown, allow the programmer to specify (provide manual control).
This all sounds like a big rant against C++, but as with all new things, I've learnt to adapt and get used to it, so there really is no bad feeling between me and C++.  Also, I'm sure it has many advantages over other languages that I'm yet to understand, but hope to discover during my Masters course!

Phew! That was a long detour, but now we've gotten that out of the way, let's get onto the meat of the Game of Life solution! The following description is taken from Chapter 7,  Programming Project 13 of the Savitch textbook and you can also read a more detailed and very interesting Wikipedia article (complete with cool gif animations!):

The mathematician John Horton Conway invented the "Game of Life". Though not a "game" in any traditional sense, it provides interesting behaviour that is specified with only a few rules.  This Project asks you to write a program that allows you to specify and initial configuration.  The program follows the rules of LIFE to show the continuing behaviour of the configuration.

LIFE is an organism that lives in a discrete, two-dimensional world...This world is an array with each cell capable of holding one LIFE cell.  Generations mark the passing of time.  Each generation brings births and deaths to the LIFE community:

  • We define each cell to have 8 neighbour cells...(directly above, below, to the right, to the left, diagonally above to the right and left, and diagonally below to the right and left).
  • If an occupied cell has 0 or 1 neighours, it dies of loneliness (awwww....). If an occupied cell has more than three neighbours, it dies of overcrowding (anyone who has been on the Central Line will sympathise with this).
  • If an empty cell has exactly 3 occupied neighbour cells, there is a birth of a new cell to replace the empty cell (hmm, talk about a love triangle..?! this Conway dude can't have been much of a biologist)
  • Births and deaths are instantaneous and occur at the changes of generation.  A cell dying for whatever reason may help cause birth, but a newborn cell cannot resurrect a cell that is dying, nor will a cell's death prevent the death of another, say, by reducing the local population.

So given these rules and a starting pattern of live cells on a two dimensional grid, with every generation the pattern of the live cells evolves.

I have put my code on my github page: https://github.com/ttz21/Game-of-Life

Using the framework for problem solving I outlined in my TicTacToe Solution walkthrough, I'm going to start off with the all-important Step 0: Don't Panic.  So after taking deep breath, let's consider the following....

1. What are the inputs?

That's simple - the initial configuration of live cells on a board.  In my solution I have used a 20 x 20 array, because it roughly suits the default size of the screen that pops up when the code is run.  Of course, we could have also written the program to let the user specify the size of the board.  We can declare the size of this board as a constant right at the beginning, so in future we can easily alter it's size without having to go through the entire code:

const int WIDTH = 20, HEIGHT = 20;  (it's customary to give constants names in uppercase, the const means the program can't alter these variables, and the int declares them as integer variables)

I have used an array of integers to track live (1) versus dead (0) cells, though this could also be accomplished using an array of booleans (true/false).  For this solution, the initial configuration has to be input manually (no, I didn't sit and write out all the curly brackets and "0," x400, I wrote a program that did it for me!) - I have put in a little star pulsar pattern that repeats itself every third generation, to give us something pretty to look at when the program is complete!  The code below declares a two-dimensional array of integers named "board", with HEIGHT+2 rows and WIDTH+2 columns (the reason for the +2 will become clear later...)

int board[HEIGHT + 2][WIDTH + 2] = {
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
};

2. What are the outputs?

We want to display the board to the screen in response to a user prompt (e.g. pressing the Enter key to bring up the next generation).  Rather than showing 1s and 0s, it would be more visually appealing to show asterisks for live cells and just a blank for dead ones.  The function display_board below uses board as an input and uses a nested loop (i.e. loop within a loop) to go through all the rows and columns to print "*" to a screen when the ith row and jth column of the board has an entry of 1.

In C++, when functions are defined, you must declare whether the function returns a value or variable.  The display_board does not return anything, as it simply prints stuff to the user's screen so we declare it as a "void" function.  The "for" loops can be interpreted as: starting with an integer, i at 1, increment i by 1 each time (i++) while i is smaller than HEIGHT+1.  "cout <<" means output the statements that follow to the screen and "endl" simply puts the cursor onto a new line.

void display_board(int board[HEIGHT+2][WIDTH+2])
{
for (int i = 1; i < HEIGHT+1; i++)
{
for (int j = 1; j < WIDTH+1; j++)
{
if (board[i][j] == 1)
cout << "*";
else
cout << " ";
}
cout << endl;
}
}

3. Solve the problem!

To work out what the pattern of the new generation of cells looks like, we need to look at the neighbouring cells of every cell in the current generation.  The problem brief specifies that every cell has 8 neighbours - however, this is not strictly true, because the cells at the very edges of our board will have fewer than 8 neighbours.  In order to get around this, we actually use a 22x22 size board to help us calculate each generation's births and deaths (i.e. giving our 20x20 board a border with the width of 1 cell), but when we are choosing what to display to the user, we only iterate over the inner 20x20 board, and reset the "shell" to dead cells every period (because they technically don't exist and are not part of the game).

We need to know what the previous generation's board looks like in order to calculate the new generation, but after this has been done, we can discard the old board. This means we need to have two boards to work with at one time, as well as a function to copy the board that we use for our calculations onto the board used by the display_board function to show the latest generation.

The new_generation function below takes our board as an input and creates a temporary board of the same size.  This temporary board is then filled with 1s and 0s according to the rules of the Game of Life using the configuration of our input board.  When this has been done, our board with the initial configuration is overwritten with the temporary board using the copy_board function.


void new_generation(int board[HEIGHT + 2][WIDTH + 2])
{
int temp_board[HEIGHT+2][WIDTH+2];
int neighbours;

for (int i = 0; i < HEIGHT+2; i++)
{
for (int j = 0; j < WIDTH + 2; j++)
{
if (i == 0 || j == 0 || i == HEIGHT + 2 || j == WIDTH + 2)
temp_board[i][j] = 0;
else
{
//counting alive neighbouring cells
neighbours = board[i - 1][j - 1] + board[i - 1][j] + board[i - 1][j + 1]
                                   + board[i][j - 1] + board[i][j + 1]
                        + board[i + 1][j - 1] + board[i + 1][j] + board[i + 1][j + 1];
 
if (board[i][j] == 1)
{
if (neighbours < 2 || neighbours > 3)
temp_board[i][j] = 0; //cell dies due to loneliness or overcrowding
else
temp_board[i][j] = 1;  //don't strictly need to write this line, but I have added it for extra clarity
}

else
{
if (neighbours == 3)
temp_board[i][j] = 1; //birth of a new cell
else
temp_board[i][j] = 0;
}
}
}
}

copy_board(temp_board, board);


}

void copy_board(int board1[HEIGHT + 2][WIDTH + 2], int board2[HEIGHT + 2][WIDTH + 2])
{
for (int i = 0; i < HEIGHT + 2; i++)
{
for (int j = 0; j < WIDTH + 2; j++)
{
board2[i][j] = board1[i][j];
}
}
}

Once we have the display_board, new_generation and copy_board functions, all we need to do is implement them in the main function of the program!  The code below has a "while" loop that keeps repeats as long as the user hits the Enter key. The program displays the board, then reads the user's input using the command cin.get(repeat) and then calculates the next generation of LIFE.  cin is like the opposite of cout - it reads in a user's input from the keyboard and we are asking it to "get" one character (which I have called repeat).  '\n' is the way we represent the "Enter" character in C++.

int main()
{
char repeat;

do
{
display_board(board);
cin.get(repeat);
new_generation(board);

}while (repeat == '\n');

return 0;
}

For the initial configuration I entered, the displayed output should look a little something like an asterisk version of this:
Ta da!

I don't want the solution to get too complicated in this blog post, but from the subsequent chapters I have read in my textbook, I could have implemented LIFE as a class and represented the board as a dynamic array.  There are also interesting extensions to this problem to find static displays or cool evolving patterns, but I'll leave that to the mathematicians out there...