Interview with Joe Cheng


Background

What’s your background?

I’m a software engineer through and through. I majored in Management Science, but I’ve been working professionally as a programmer since my freshman year in college (1996, the start of the dot-com boom). I started out as a web developer and then, once I graduated and realized I wanted to spend my career writing software, focused on becoming a generalist software engineer. As a self-taught programmer, it took until about 2008 for my imposter syndrome to completely disappear.

I’ve spent most of my career at Boston-area startups, and I’ve enjoyed it tremendously. I’ve had the privilege of working with many wonderful people, and released a lot of software that I’m proud of (or at least, that I was proud of at the time!). In 2006, the startup I worked for was acquired by Microsoft, which meant a relocation to the Seattle area. In late 2009 I left Microsoft to join RStudio, but chose to remain in Seattle.

I’ve worked on a variety of software in my career: web, desktop, front end, back end. I take special pleasure in writing parsers and multithreaded code (generally not at the same time though!).

Tell us about your first experience with R

My very first experience with R was my first day at RStudio. JJ had been working on RStudio for a few months already, and the first feature he assigned to me was syntax highlighting of R code for the source editor. So before I ever wrote a line of R code, I was reading The R Language Definition and got intimately familiar with the grammar.

How did you come to work at RStudio?

JJ Allaire (RStudio’s founder) and I go back a ways. I was a web development intern at his first company, Allaire, which jump-started my software development career; and I also worked at his second startup, Onfolio. So I’ve basically been working for JJ on and off since 1997.

At the time that JJ was thinking about R, I was working at Microsoft, itching to get back into a startup. As soon as JJ convinced himself that a web-based IDE for R was technically feasible, he brought me in to help him build it, and I started in September 2009. Unlike JJ, I wasn’t interested in statistics at all. But the technical challenge of building a web-based IDE was alluring, and I was not going to turn down the chance to work with JJ directly.

In those days, it was not at all clear to us whether we could build a sustainable business writing tools for R users. Luckily, JJ didn’t care—he wanted to make the IDE a reality whether we ever saw a dollar or not, and he was willing to invest both his own time and my salary to make that happen. Obviously, the best case scenario was to build a robust business, so we could hire more people to write more good software.

What does your role as CTO at RStudio involve?

Actually, the majority of my job is working on Shiny and leading the Shiny team. Mostly what I do is try to move Shiny forward. I still code, but not nearly as much as I would like. On any given day, I might be writing docs, investigating trickier bugs, explaining parts of the code to other team members, reviewing PRs, prioritizing feature lists, planning releases, checking on people’s status, and reporting our team’s status to other people. I also speak at conferences a few times a year, and those presentations usually take me a really pathologically long time to prepare.

The title of CTO doesn’t define my responsibilities, but instead is more of an acknowledgement that as the longest tenured RStudio employee, and having been intimately involved in the creation of many of our products (RStudio IDE, Shiny, Shiny Server, and Connect), I should have a seat at the table when we make major company decisions. I do take part in a lot of technical discussions and decisions outside of my team, but the same could be said for a lot of other experienced technical folks around RStudio: JJ, Hadley, Jonathan McPherson, Aron Atkins, Tareef Kawaf, and on and on.

Going forward, I’m hoping to find a way to clear my schedule so I can get down to writing a book about Shiny. I’m astonished that people like Hadley, Garrett, and Yihui can write whole books while still doing their jobs; it takes all my concentration to write well and I find it extremely taxing, though ultimately satisfying.

Shiny Origins

What led you to create the Shiny framework?

From pretty early on, JJ and I received feedback from potential users that they wanted the ability to create interactive applets and reports using R. The first person who asked us for this was Danny Kaplan at Macalester College (who was also the first beta tester of RStudio). At the time, he was having grad students learn Java so they could build in-browser applets to help students explore statistical concepts. He implored us to make it possible to build those applets in R instead.

JJ and I both thought the idea was really appealing, but we were 100% focused on RStudio IDE at the time. I told JJ I thought we should do it someday, but only if we could come up with a really great API for doing so. Having spent years specializing in UI programming for both desktop and web, I really did not want to subject R users to that highly specialized black art.

The hard part of web UI programming, to me, was not HTML and JavaScript. Learning those just required time. Rather, it was the explosion of state management spaghetti code that inevitably occurred when creating even moderately complicated UIs, regardless of language. In my experience, it was possible to create complicated UIs without exponentially increasing code complexity, but it required experience, discipline, and a bit of luck. Only the very best teams could pull it off.

In April 2012, the Meteor JavaScript framework was announced on Hacker News. The Meteor screencast evoked for me the old Arthur C. Clarke quote, “Any sufficiently advanced technology is indistinguishable from magic.” Despite having just built a state-of-the-art app in RStudio IDE, I could not conceive of how their framework’s UI layer could be so interactive with so little state management code. I couldn’t stop thinking about that mystery, so a couple of weeks later, I took advantage of a long plane flight to delve into the Meteor source code. I eventually ended up at a tiny JavaScript file, deps.js, and the light bulb went on. It was an incredibly elegant little hack that enabled a whole new style of UI programming.

It took a few more months before I made the connection that Meteor-style reactivity could be used on the server side to create a high-level app framework for R.

How long did it take to come up with?

It took a couple of years between Danny asking us for a web framework, and the conception of Shiny, during which I spent zero time consciously thinking about it. But the Meteor reactivity implementation must have worked its way into my subconscious. On the last morning of useR! 2012, I woke up and literally the first thought in my mind was the architecture of Shiny: a simple, semantic HTML vocabulary for specifying inputs and outputs; a reactive programming library on the server side for specifying those outputs using pure R code; and some JavaScript and WebSocket plumbing to tie everything together automatically. JJ added the final piece, which was specifying the HTML itself using R.

Were there any unexpected challenges in that first version?

All of the ideas turned out to be really surprisingly easy to implement. Those first few months, JJ and I made progress at an almost absurd pace. During that period, I almost couldn’t type fast enough to get the ideas out of my head and into code. Reactive programming was this fantastically powerful and general technique, but once you knew about the little hack from deps.js, actually implementing it was dead simple.

Work on Shiny officially started on June 20, 2012. The first prototype of Shiny was actually written in Ruby (as I barely knew R at the time), just to prove the architecture. It took a day and a half to go from zero to a working little Shiny.rb app. (You can see the state of the repo on that day here. Looking at server.rb and www/index.html, you can clearly see that the core ideas in Shiny were present back then.)

The biggest challenge in those early days was the lack of a truly robust web server package for R. We needed not only a traditional HTTP server, but also support for WebSockets, which was not even an IETF-approved standard at that time. We started out building Shiny on top of the websockets package by Bryan Lewis, an early friend of the company. I’m not sure what had compelled Bryan to write the package in the first place, but by the time we adopted it, he had moved on and was looking to transfer the maintainership to someone else. I gratefully accepted the responsibility. But soon after we shipped the first versions of Shiny, it became clear to me that we couldn’t keep going with the websockets package, as it was trivially vulnerable to denial-of-service attacks and I couldn’t fix it without starting over from scratch. Shiny was already on CRAN and interest in it was growing quickly, so I felt tremendous pressure to get us onto a stable foundation. The result was a six week, hair-on-fire sprint to create httpuv, which I published to CRAN in March of 2013.

Besides that, the biggest challenges were API design and writing good docs. The former was especially challenging because I had so little R experience at the time, which made it hard to design APIs that would feel idiomatic to R users (to the degree that a reactive web framework could feel idiomatic!). So the addition of Hadley and Winston Chang to the company in late 2012 was a huge help, and led to significant changes in the API. Writing good docs, on the other hand, is just hard. It’s so much easier to build a web framework than to teach people how to use it effectively. We made a big push for that initial release, but it was years before the documentation even began to catch up to fully describe what we had built (and in some areas, still hasn’t).

Did the reception to Shiny surprise you?

It really did. I knew we had created something that was technically interesting, demoed especially well, and served a need that R users would find interesting. What I didn’t know was whether the community could get their heads around reactive programming, or rather, whether they’d be willing to invest the time necessary to get their heads around it. I was shocked to find how eager people were to jump in and invest. Within the first month we were already getting really surprisingly sophisticated questions from people we’d never met.

Have you been surprised by the ways in which it’s been used?

On one hand, yes, constantly. A lot of the features we’ve added over the years were inspired by brave users who managed to shoehorn Shiny into a scenario that we had not designed for. And every time I teach a training workshop, at least one person will ask me to look at some bug in their app that they haven’t been able to figure out, and then demo some mind-blowing thing.

But at another level, I’m not that surprised that people have built surprising things with Shiny, if that makes sense. Shiny provides a pretty general set of capabilities, in that it gives you a way to create user interfaces and a way to make them interactive. So there was always the expectation that if R users were sufficiently motivated and invested, they could build really cool things that we had never thought of, and that’s exactly what happened.

The present

Will async become the default for Shiny?

No, not a chance! I think of async as raising the ceiling on Shiny’s potential scalability, but most apps shouldn’t need to use it. But I hope most users will feel good knowing it’s there in case they ever do need it.

In terms of products, we see a lot of people using RStudio Connect. What’s going on with that right now?

For those who aren’t familiar, RStudio Connect is our answer to on-premises publishing and sharing of the reports and apps you create in R. First, you can use it to deploy Shiny apps to your on-prem server without leaving RStudio—it’s just like publishing to ShinyApps.io. Second, it’s an extremely powerful R Markdown publishing server: you author .Rmd docs in RStudio as usual, but then you can one-click publish your project to Connect. Once on Connect, your report can be re-rendered on a schedule, run with user-specified parameters, automatically emailed to your colleagues, and more.

One of the recent focuses for the Connect team has been expanding the types of projects you can publish, beyond Shiny and R Markdown. The last release added support for deploying Plumber APIs (web service endpoints written in R) and TensorFlow deep learning models.

Another feature that’s under development is a programmatic API for the Connect server itself. This will let you programmatically execute tasks that previously needed to be performed through Connect’s user interface. This is an important feature for enterprises, who often want to integrate Connect with their existing systems.

There’s plenty more to come, but I’ve been sworn to secrecy!

At what point do you think it becomes useful for an R user to know some JavaScript when working with Shiny?

A lot of R users seem to come to JavaScript through d3, and that’s a totally understandable motivation. Personally, I think any R users who seriously want to get into bespoke visualization should consider JavaScript as their second programming language. That said, a lot of Shiny users have built pretty sophisticated apps without directly writing a line of JavaScript (the shinyjs package helps bridge the gap).

I would encourage R users who have JavaScript skills to look for opportunities to package up JavaScript code into a friendly R package, so that R users who don’t (yet) know JavaScript can take full advantage of your work. The htmlwidgets package is the most popular way of doing this and is ideal for wrapping JS-based visualizations.

The future

How does the future look for Shiny? Can you share any of your plans?

We’ve just come off a big 18 months of work focused on making Shiny easier to deploy in production settings: regression testing with shinytest, load testing with shinyloadtest, and a new mechanism for scaling with async. In the near term, we’ll be following up with a new plot caching feature that can dramatically speed up certain classes of apps, and a ground-up rewrite of the reactivity visualizer that will finally deliver on the promise that the original implementation (?showReactLog) only hinted at.

We have some plans for the rest of the year, but we’re not ready to talk about them just yet, sorry!

Do you have an overall roadmap for Shiny and is there anything you can tell us about that?

We’ve always been much more reactive than proactive in our planning for Shiny. We almost didn’t have a choice about it in the early years, when every month we were learning so much about how people wanted to use Shiny and the problems they were encountering. That’s not to say that we don’t have a long backlog of features, fixes, documentation, and examples we’d love to tackle; just that we traditionally don’t commit to anything until we start working on it, in case it’s preempted by something we decide is more important.

I suspect we will need to adopt a formal roadmap someday soon. Both the Shiny team and RStudio as a company have grown so much that the lightweight processes I’ve insisted on in the past have started to break down.

And the one we all really want to know the answer to…

Why did you call it Shiny?

It’s from the late and lamented sci-fi series Firefly; in the show, they casually toss that word around to mean “cool”. I just liked the sound of it, and thought it’d make a good name for an open source library, but not for RStudio as we tended to use mostly straightforward, literal names in those days (“RStudio”, “R Markdown”, “RPubs”).

When the time came to create the GitHub repo for our new R web framework project, I intended to call it something bland (not “RWeb”, but similar). But something strange happened. The new repo page on GitHub has a little prompt that suggests a random name for your repo, and to my delight, this time it said “Need inspiration? How about shiny-octocat.” I took that as a sign, named the repo Shiny, and despite some moments of doubt, it ultimately stuck.

Hmmm, I wonder if it’s too late to rename the shinytest package “gorram”.