How did you first get started in AI?
I became interested in machine learning around my third year of undergraduate at the University of Washington. I was majoring in mathematics and computer science, and machine learning seemed like a nice union between the two. I got involved in some research with Emily Fox and helped publish a paper on variational methods for hidden Markov models, which I thoroughly enjoyed.
What are you currently working on at Landing AI?
Right now I’m working on Landing Light, our defect detection software. A lot of manufacturing companies have to employ hundreds of people to look at defective parts all day. So we’re trying to make it possible for a camera to look at a part and automatically label it ok or defective. It’s typically not enough to run a basic CNN (Convolutional Neural Network) and call it good. One of the most challenging parts of our work is that we don’t have a lot of training data, because our clients can only provide a small number of defective images (for good reason). Consequently, we have to leverage prior knowledge or knowledge from the good images to help the model learn.
Something I only learned as an AI practitioner was that you have to think about how the model will work in the real-world situation you’re building it for; that context can change the entire end result that you’re shooting for. For example, one of our manufacturing clients may define an item with one dent as still ok, while an item with five dents is defective. We have to figure out how to embed these rules in the network while maintaining a low number of false negatives, and we have to keep the false positives down so the model still produces a cost benefit for the client.
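As a rough illustration of the dent rule described above, here is a minimal Python sketch. It assumes a hypothetical detector that outputs a dent count per part and a hypothetical cutoff of five dents (the interview only says one dent is ok and five is defective). The dent counts and ground-truth labels are made up for the example.

```python
# Hypothetical cutoff: parts with this many dents or more are labeled defective.
# The exact boundary between "ok" and "defective" is an assumption.
DEFECT_THRESHOLD = 5


def label_part(dent_count, threshold=DEFECT_THRESHOLD):
    """Apply the client's rule to a detected dent count."""
    return "defective" if dent_count >= threshold else "ok"


def error_counts(predicted, actual):
    """Return (false_negatives, false_positives), treating 'defective' as positive."""
    false_neg = sum(
        1 for p, a in zip(predicted, actual) if p == "ok" and a == "defective"
    )
    false_pos = sum(
        1 for p, a in zip(predicted, actual) if p == "defective" and a == "ok"
    )
    return false_neg, false_pos


# Made-up dent counts from a hypothetical detector, with made-up true labels.
# The fourth part is truly defective, but the detector only found 4 dents.
detected_dents = [0, 1, 5, 4, 7, 2]
true_labels = ["ok", "ok", "defective", "defective", "defective", "ok"]

predictions = [label_part(n) for n in detected_dents]
false_negatives, false_positives = error_counts(predictions, true_labels)
print(predictions)
print("false negatives:", false_negatives, "false positives:", false_positives)
```

In this toy data, one defective part slips through as ok because the detector undercounted its dents. That is the kind of false negative the rule has to be tuned against, without pushing false positives so high that the cost benefit disappears.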
Take us through your typical workday.
I try to start out every day reading a research paper. I always learn about new techniques or existing techniques, some of which help with our current projects. The entire Landing AI team tries to read at least 2-3 papers a week, either on our own or in reading groups.
When I get to the office, I’ll go over emails and Slack messages, then examine the results of experiments run the previous night. Project syncs with the team are next – we go over all the experiments we’ve run and discuss whether each one confirmed or refuted a hypothesis. For example, a hypothesis might be that “changing this piece of the architecture will improve performance for this issue.” Then we use these conclusions to plan the next set of hypotheses we want to test. Successful experiments go through code review and are merged into the main development branch.
The rest of the day is spent coding up new experiments, doing local testing, and doing interviews or candidate reviews. Since we’re growing fast, hiring is one of the most important things I work on. Bringing in even one new team member translates into a huge productivity increase for the company. After dinner, I like to end the day reading a new research paper or a book. I’m currently reading Crossing the Chasm.
What tech stack do you use?
Vim: I use Vim as my editor. Since I do a lot of work on remote servers, it’s nice to be able to quickly edit files in Vim. I use gruvbox for my color scheme. I don’t use too many plugins, but a few are lightline, nerdtree, vim-fugitive, and ale for linting. I know a lot of people like to use Jupyter notebooks, but I prefer being closer to the command line. I also like to use ipdb to debug my own software or learn about someone else’s program.
Tmux: Tmux + vim is an amazing combination. I always work in a tmux session. I usually have one session open for editing a file and another session open for running or debugging the program. It’s also nice for creating shared sessions on EC2 instances so you can code with your team members.
iTerm2/zsh: I use this for my terminal. I use Oh My Zsh to manage my zsh config (I tried to make it as close to fish shell as possible but I don’t use fish because I ran into a lot of issues).
AWS: I run a lot of ML jobs on EC2. I also use S3 a lot for storage.
Docker: Docker is great if I want to quickly develop or test something in Linux or another environment. It’s also nice to be able to download other people’s Docker images with the environments already set up to run different code. Dealing with environment issues is one of the more time-consuming tasks when programming in industry.
What did you do before working in AI and how does it factor into your work now?
In my first job at Bias Intelligence, I helped build software for supply chain optimization. I got a lot of exposure to different optimization techniques (more on the engineering side) and to the different tech stacks used to build the software, such as Celery and AWS for distributed computing. We also looked into a modified version of the early GraphLab engine (now Turi) and CUDA for using the GPU.
In my second job at PitchBook, I built a recommender system to help customers navigate the company’s internal text data. Later, I helped build a system for scraping key pieces of text data from the internet.
My experiences at Bias Intelligence and PitchBook taught me some important lessons about working in the AI industry:
Communicating with your customer and understanding their needs is important. If you don’t do this properly, no one will use your AI model and it won’t be useful.
You can never spend too much time on data. Dealing with data is by far the most time-consuming task in an AI project. You do everything from thoroughly analyzing data to creating pipelines to forming your features correctly. Building the actual AI model is only a small part of creating a fully working AI product that provides actual benefit.
How do you keep learning? How do you keep up on recent AI trends?
I usually read Hacker News, which is more related to general CS, but there will occasionally be an ML-related post. I used to read the ML subreddit, but now I mostly prefer following my favorite researchers on Twitter, such as David Ha (@hardmaru), Durk Kingma (@dpkingma), Soumith Chintala (@soumithchintala), Ian Goodfellow (@goodfellow_ian), and Andrej Karpathy (@karpathy). Landing AI holds a lot of internal reading groups, so I get a lot of recommendations from colleagues at work and from friends in school working on their CS PhDs or in CS labs.
What AI research/trend are you most excited about?
I’m not sure how popular it still is, but I’m very into Alex Graves’s work on neural Turing machines and later the differentiable neural computer.
Why did you choose to work in industry vs. academia?
I almost stayed in academia, but I decided that I wanted to help build a product that provided some real observable value to people. I was also interested in expanding my knowledge and learning new skills outside of machine learning and computer science. While it’s certainly possible to do these things in academia, I think it’s easier to move around and work across projects and departments in industry.
What advice do you have for people trying to break into AI?
Make sure to take the time to learn the basics: calculus, linear algebra, Intro to Machine Learning classes (take a look at the Stanford CS229 syllabus), standard clustering algorithms (k-means, GMM), algorithms (EM), supervised methods (logistic regression, SVMs, tree methods), dimensionality reduction (PCA), the bias-variance tradeoff, and regularization.
Read lots of papers and books; it’s a very efficient way to expand your knowledge. I would recommend “Pattern Recognition and Machine Learning” by Christopher Bishop. I would also emphasize working on your software development skills. As an MLE (machine learning engineer), you still write a lot of code. Being a good software developer can help you become a more efficient and effective MLE.