How did you first get started in AI?
My first exposure to machine learning was at the University of Tehran, where I was studying electrical engineering. The course that intrigued me most was on pattern recognition. One instance I vividly remember was when the professor, while teaching support vector machines, explained that they could be used as classifiers in a search engine like Google's. Examples like that got me excited about how profound an impact machine learning can have on our everyday lives, so I decided to study it further. Later, during my PhD in electrical and computer engineering, I researched kernel methods for manifold learning, as well as applications of Gaussian graphical models to anomaly detection in sparse, high-dimensional data.
What are you currently working on at Microsoft AI & Research?
I’m working on a general-purpose deep reinforcement learning platform with applications in robotics, industrial control and calibration, energy optimization, and autonomous systems. I started this work at Bonsai, an AI startup that was acquired by Microsoft in 2018. The platform is built around a proprietary programming language that abstracts away the low-level complexities of machine learning algorithms, providing an interface for what we call “machine teaching”: users train an RL model to complete a desired task in a simulation through SDKs. Right now I’m building the AI engine for the platform. The engine consists of scalable RL algorithms for real-world production as well as machine teaching abstractions and algorithms. On the first front, I’ve written software that consumes parallel streams of data from simulators and supports distributed model training. I’ve also been designing and developing the machine teaching methodologies in the AI engine, including novel methods; for example, I’m researching hierarchical RL for decomposing problems using a multi-level hierarchy of concepts.
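The platform itself is proprietary, but the general pattern of collecting parallel streams of experience from several simulators and pooling them for a learner can be sketched in plain Python. Everything below (the toy simulator dynamics, the `run_episode` and `collect_parallel` names, the trivial policy) is an illustrative assumption, not the actual engine:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def run_episode(sim_id, policy, horizon=10):
    """Roll out one episode in a toy simulator and return its transitions."""
    rng = random.Random(sim_id)  # per-simulator seed for reproducibility
    state = 0.0
    transitions = []
    for _ in range(horizon):
        action = policy(state)
        next_state = state + action + rng.uniform(-0.1, 0.1)  # toy dynamics
        reward = -abs(next_state)  # reward the agent for staying near zero
        transitions.append((state, action, reward, next_state))
        state = next_state
    return transitions

def collect_parallel(policy, n_sims=4):
    """Gather experience from several simulators concurrently."""
    with ThreadPoolExecutor(max_workers=n_sims) as pool:
        episodes = list(pool.map(lambda i: run_episode(i, policy), range(n_sims)))
    # Flatten all episodes into one buffer for the (omitted) training step.
    return [t for ep in episodes for t in ep]

buffer = collect_parallel(policy=lambda s: -0.5 * s)
print(len(buffer))  # 4 simulators x 10 steps = 40 transitions
```

In a production system each simulator would be a separate process or machine and the learner would update the policy asynchronously, but the fan-out/fan-in structure is the same.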
Take us through your typical workday.
Every day starts with a standup, which helps me clarify what I want to accomplish that day. Then coffee, individual contribution work, a catered lunch at my desk, and more work until the task is finished, with another coffee in between. What I work on depends on the stage of the product cycle. On product development tasks, I do a lot of software engineering: coding, writing documents, reviewing code, preparing demos, etc. On research tasks, I read papers, study code bases, prototype, run experiments, tune models, evaluate performance, dockerize the project for sharing and reproducibility, discuss results, prepare presentations, and write papers.
How is your job as an Applied Scientist different from a Machine Learning Engineer or Machine Learning Researcher?
ML Researchers or Research Scientists develop new models, write academic papers, and sometimes conduct fundamental research. For instance, they might advance the state of the art in machine translation or invent new deep learning techniques or architectures. Applied ML Scientists, on the other hand, may not invent new algorithms or theories; their job is to find innovative solutions to real-world problems by picking the most powerful and applicable ML technique for a given scenario. For instance, they might improve a feed ranking system with VAEs, or a recommendation engine with RL. ML Engineers often take prototypes of new algorithms and models and figure out how to operate them at scale in production. These roles can differ a lot depending on the company or team; in this classification, my role is a mix of Applied ML Scientist and ML Engineer.
What tech stack do you use to do your ML work?
- Python: The main language for both prototyping and production code. I use pdb as a debugger.
- TensorFlow: Deep learning framework for tensor computations and model building.
- Ray: High-performance distributed execution framework for large-scale ML/RL applications.
- Jupyter Notebook: Data analysis, experiments, and sharing results. Also Azure VMs for scaling up experiments.
- Docker: Engine for building and managing containers in our product infrastructure. It also helps a lot to use Docker images with the environment already set up for ML/RL applications: recreating the environment needed to reproduce experimental results can be very problematic, and Docker solves that.
- Azure DevOps: Product development process, sprint planning and board management, Git repositories, code reviews and pull requests.
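As an illustration of the reproducibility point about Docker above, a minimal image for an ML experiment might be defined like this. The file names (`requirements.txt`, `train.py`) and base image tag are assumptions for the sketch, not details from my actual setup:

```dockerfile
# Minimal sketch of a reproducible ML experiment image (illustrative only).
FROM python:3.10-slim

WORKDIR /app

# Pin dependencies in requirements.txt so the environment is reproducible.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the experiment code and run it by default.
COPY . .
CMD ["python", "train.py"]
```

Anyone can then rebuild the exact environment with `docker build` and rerun the experiment with `docker run`, rather than reassembling dependencies by hand.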
What did you do before working in AI and how does it factor into your work now?
Before starting my PhD at Northeastern University, I worked a series of image processing and computer vision jobs at the University of Tehran, the University of Iowa, and Eigen, a medical imaging company. They relied on classical computer vision and signal processing techniques, as we were still in the pre-deep-learning era! The fundamentals weren’t that different, though, and many of those ideas ultimately found their way into deep learning.
How do you keep learning? How do you keep up on recent AI trends?
I usually learn fastest when I challenge myself to take on a project I’m initially not very comfortable with. I also read papers and blog posts from companies like Facebook, Google, and Uber. For an up-to-date summary of ML trends and industry best practices, I like to attend industry conferences such as MLConf, which is held twice a year.
What AI research/trend are you most excited about?
One major trend that I’m excited about is more powerful representation learning: the ability to capture useful information in dense learned representations. These techniques have been applied in NLP, reinforcement learning, and computer vision, and have gotten much better over the past few years. A particularly powerful subset is unsupervised representation learning (GANs, VAEs, etc.). Another trend, related to my current work, is the rise of AutoML. I believe AutoML can be a game changer, making progress in ML and AI much faster and more democratized.
Why did you choose to work in industry vs. academia?
I believe the tech industry is where I can have the most impact in bridging the gap between theory and practice. I also enjoy building products and seeing the impact of them firsthand.
What advice do you have for people trying to break into AI?
First, do your research and make sure you have good reasons for breaking into AI, especially if it means transitioning to a new career. Once you’ve made up your mind, take the time to learn classical machine learning theory in addition to deep learning. It won’t happen overnight; you need to keep studying, reading, and practicing to develop intuition about the core concepts. Understanding the theory will help you set realistic expectations of what’s possible with machine learning. Beyond that, brush up on your general software engineering and system design skills.