Just after starting at Mango, I decided to start learning Italian. I have always been interested in learning languages and I was really keen to go back to Italy, so I thought it would be something fun to do outside of work. It turned out to have a much greater impact on my work than I expected – and not just because of project work based in Italy.
In the early stages of my learning, I read “Fluent in 3 Months” by Benny Lewis. If you have never heard the name before, Benny Lewis is a polyglot (someone who speaks multiple languages) who left school unable to speak anything other than English. After six months living in Spain and failing to learn the language, he switched his approach. In a matter of weeks, he was speaking Spanish with natives. Now he can speak a dozen languages to varying degrees. And this really got me thinking. Could his approach be applied to R and Python? Could we get more people engaged in the languages more quickly?
The “Language Hacking” Approach
To start, let’s consider the approach that Benny Lewis advocates for learning languages.
Part of the approach is to speak the language from the start: not a few days or weeks in, but on the first day. And to continue to speak the language every day. It’s a simple but powerful idea. You arrange a conversation with a native speaker of the language and prepare a few sentences so that you can hold a basic exchange. It doesn’t have to be long; it can literally be three or four sentences, but you are using the language from the start.
Combined with speaking from the start is the idea of “language hacking”. This is really what makes the technique powerful: the idea that you don’t need to know everything to be able to use a language. Think about that conversation you are having on day one. You won’t know how to conjugate verbs, or all the rules of sentence structure, or all the vocabulary, but you can certainly use a phrase book to find out how to ask “how are you?” and “what is your name?”, and how to respond “I am well” and “my name is Aimee”.
The fundamental concept of the approach that Benny Lewis proposes is to learn what you need to communicate right away. Don’t spend months learning grammar and rules and hope that this will be enough to get by. Just start to speak.
There are of course challenges to this approach, and the biggest is the knowledge that you will make mistakes. Fear of making a mistake is typically the main blocker to language learning. The reality is that nothing bad will happen if you try and get it wrong. Generally, the people you are talking to will helpfully point you in the right direction and you will learn from it. Once you get over this fear you can very quickly learn a language.
Language Hacking for R and Python
At the time that I read Benny Lewis’ book, I was just starting to teach more and I was interested in whether it would be possible to teach R (and Python) this way. But what does language hacking mean for programming for data science?
The answer is simple: it means the same thing. If you want to “hack” learning R and Python for data science, you focus on learning the code that you need to do what you need to do. Don’t worry about the details of programming, put aside the ins and outs of functional or object-oriented programming, and forget the technical language. Just focus on getting things done.
For data scientists that typically means starting with a basic workflow. Your first “conversation” will typically be more along the lines of loading some data and generating some summaries. Let’s think about that example for a moment.
Suppose that I am going to read the iris data from a csv file and find the mean sepal length. How much code does that take? Three or four lines. Do we need to spend hours, or possibly days, studying the rules of the language first, or can we simply jump in with those lines? We can, and should, jump straight in and teach those three or four lines right at the start. Put yourself in the shoes of the learner. If after just minutes of learning you can see a result that is meaningful and useful, the chances are you will keep going and you will want to learn more. What’s more, you will start to experiment with that code. You will see how you can change the code to find the mean of a different column, or maybe you will think about finding other summaries. You are no longer just learning rules of a language to be implemented; you are actively living the language.
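As a rough illustration of that first “conversation”, here is a minimal Python sketch using pandas. It is only one way of writing it, and it makes some assumptions: the column names (such as `sepal_length`) are my own choice, and a handful of sample rows held in a string stand in for a real file. With an actual csv file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# A few sample rows standing in for an iris csv file (an assumption for
# illustration; with a real file, pass its path to pd.read_csv instead).
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
"""

# The "conversation" itself: load the data, look at it, summarise one column.
iris = pd.read_csv(io.StringIO(csv_text))
print(iris.head())
mean_sepal_length = iris["sepal_length"].mean()
print(mean_sepal_length)
```

Swapping `"sepal_length"` for another column name, or `.mean()` for `.median()`, is exactly the kind of experiment a learner can try straight away.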
You can very quickly build these “conversations” up to include grouping, performing common manipulation tasks and creating visualisations. In no time at all, you will be doing analytics, modelling and machine learning.
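To give a flavour of how the conversation might grow, the sketch below adds a grouped summary and a simple filter using pandas. As before, the sample rows and column names are assumptions made for illustration, not a fixed recipe.

```python
import io

import pandas as pd

# Sample rows standing in for an iris csv file (illustrative assumption).
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
"""

iris = pd.read_csv(io.StringIO(csv_text))

# Grouping: mean sepal length for each species.
species_means = iris.groupby("species")["sepal_length"].mean()
print(species_means)

# A common manipulation task: keep only the rows with longer sepals.
long_sepals = iris[iris["sepal_length"] > 6.0]
print(long_sepals)
```

Each new step is one or two lines added to code the learner already understands, which is what keeps the conversation going.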
At Mango, we switched our R training to this approach around three years ago and we haven’t looked back. Our trainers no longer teach programming the traditional way. They are all taught to teach the hacking approach, and they all come back from teaching with the same success stories. It took a little longer to convince our Python team that they should make the same change, but it is now the approach we take with all of our training. From a personal perspective, after years of avoiding Python because I didn’t want to spend weeks learning to program, I was the first tester for the Python version of our training. Now I am comfortable running some of my common analyses in Python. I still make mistakes, and it takes me a bit longer than when I write R, but I have finally got the confidence to consider Python as a solution alongside R, and I can talk more confidently to my Python-using colleagues.
In practical terms, I would strongly recommend focusing on the tidyverse for R and pandas for Python, with seaborn for graphics. These packages have been designed to make the tasks that we perform regularly with data easy and accessible, so if we are trying to hack our approach to learning and be able to use the languages quickly, why would we use anything else?
But What About Grammar?
You can get a long way in a language without the need to learn lots of grammar. Think about how you learned your native language. I don’t remember being taught grammar when I started to speak, but I could still communicate effectively. My friends are not actively teaching their preschool-aged children grammar, but those children can communicate, and whilst it is not always in the best way, they can get their message across. Eventually, though, to really master a language, you do need to get to grips with the grammar.
So, to those of you who are passionate about the detail of R or Python, who like the “best” way to do things, who want to promote programming paradigms and philosophies: don’t worry. There is still a place for this. It just doesn’t come first, and it isn’t necessary for everyone.
If all I want to do with Python is import data and run some analytics, then I really don’t need to worry about more than what I have achieved through language hacking. If I want to master it and be able to produce tools that are used by a wider community, then I do need to know more. The good news is that this is much easier for a programming language than a spoken one. There are rarely exceptions to rules for a start, and you don’t have to learn an endless stream of tenses!
But we can do more to make even this accessible to learners. We can help them to understand the practical applications. We can focus on immediate needs rather than eventualities. We can provide constructive feedback that helps learners to develop their skills.
By making even the detail of a language interesting and accessible, we ultimately end up with greater numbers of people who can speak the language and contribute to its success. But we must start with practical code that achieves a specific goal and leave the grammar for later.